sorting uniquely by message-id

sorting uniquely by message-id

Michael Richardson-5

Due to the way that IETF lists and aliases work, I often get two, three,
sometimes four copies of an email.  I attach an mh-e view of a thread below.
I really want to just keep the copy that went through the list, because
that's the one that went into the public archive, and replies will work
better on that version.
But sometimes some of the messages get copied to me only, and I can't lose them.

I can see which is which mostly, because I've added "List-Id" to my scan
output that mh-e shows me.

I started thinking that I could do something with:
  sortm -textfield message-id:
 
but then I realized I don't want to change the order of the folder.
Instead, I am doing:

  scan -format '%5(msg),%{message-id},%<(addr{List-Id})%>' +inbox

and then processing the output with Perl to do the right thing....
I think that I ought to use a separator other than "," that can't appear in any
of the relevant fields.  ASCII defined this "FS" character (ASCII 29 or
something), and I wonder why we don't use it more often...  I wonder if the
format spec ought to make this easily accessible?

Before:
 142 -09/16 ["Rob Wilton \(r]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 156 %09/16   [To:"Rob Wilton ]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 164 %09/17     [Carsten Bormann]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 175 %09/17     ["Rob Wilton (rw]                  RE: [Asdf] Robert Wilton's Block
 176 %09/17   ["Rob Wilton (rw]                  RE: [Asdf] Robert Wilton's Block
 177 %09/17     <"Rob Wilton \(r>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 178 %09/17     [Carsten Bormann]                  Re: [Asdf] Robert Wilton's Block
 179 %09/17       <Carsten Bormann>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 187 %09/17       [David Kemp     ]                  Re: [Asdf] Robert Wilton's Block
 188 %09/17         <David Kemp     >     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 200 %09/17         [Carsten Bormann]                  Re: [Asdf] Robert Wilton's Block
 201 %09/17           <Carsten Bormann>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 180 %09/17   <"Rob Wilton (rw>                  RE: [Asdf] Robert Wilton's Block
 181 %09/17     <"Rob Wilton \(r>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 184+%09/17     [Carsten Bormann]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 185 %09/17       <Carsten Bormann>                  Re: [Asdf] Robert Wilton's Block
 189 %09/17       ["Rob Wilton (rw]                  RE: [Asdf] Robert Wilton's Block
 190 %09/17         <"Rob Wilton \(r>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 193 %09/17         [Alexander Pelov]                  Re: [Asdf] Robert Wilton's Block
 194 %09/17           <Alexander Pelov>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 196 %09/17         [Barry Leiba    ]                  Re: [Asdf] Robert Wilton's Block
 197 %09/17           <Barry Leiba    >                  Re: [Asdf] Robert Wilton's Block
 198 %09/17           ["Rob Wilton \(r]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
 199 %09/17             <"Rob Wilton (rw>                  RE: [Asdf] Robert Wilton's Block
 327 %09/18           [Carsten Bormann]                  Re: [Asdf] Robert Wilton's Block

After:

 142 -09/16 ["Rob Wilton \(r]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-ietf-as
 156 %09/16   [To:"Rob Wilton ]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-ietf-
 164 %09/17     [Carsten Bormann]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-iet
 175 %09/17     ["Rob Wilton (rw]                  RE: [Asdf] Robert Wilton's Block on charter-iet
 177 %09/17   ["Rob Wilton \(r]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-ietf-
 179 %09/17     [Carsten Bormann]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-iet
 188 %09/17       [David Kemp     ]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-i
 201 %09/17         [Carsten Bormann]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter
 181 %09/17   <"Rob Wilton \(r>     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-ietf-
 185 %09/17     [Carsten Bormann]                  Re: [Asdf] Robert Wilton's Block on charter-iet
 190 %09/17       ["Rob Wilton \(r]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter-i
 194 %09/17         [Alexander Pelov]     asdf.ietf.or Re: [Asdf] Robert Wilton's Block on charter
 197 %09/17         [Barry Leiba    ]                  Re: [Asdf] Robert Wilton's Block on charter
 199 %09/17           ["Rob Wilton (rw]                  RE: [Asdf] Robert Wilton's Block on chart
 327 %09/18           [Carsten Bormann]                  Re: [Asdf] Robert Wilton's Block on chart

The script:

#!/usr/bin/perl
use strict;
use warnings;

my %msglist;    # message-id => List-Id of the copy being kept
my %msgnum;     # message-id => message number of the copy seen first
my @todelete;   # message numbers to hand to rmm

open(MESSAGES, "scan -format '%5(msg),%{message-id},%<(addr{List-Id})%>' +inbox |")
    or die "cannot run scan: $!\n";
while(<MESSAGES>) {
    chomp;
    my($msgnum, $msgid, $listid) = split(/,/);

    #print STDERR "processing: $_\n";
    #print STDERR " -> $msgnum, $msgid, $listid\n";

    if(defined($msgnum{$msgid})) { # it is a duplicate, see which one to keep.
        #print "$msgnum seen $msgid before at ${msgnum{$msgid}}\n";

        if(defined($msglist{$msgid})) { # keep the other one, delete this one!
            #print "Keeping other one: $msgnum{$msgid} not $msgnum\n";
            push(@todelete, $msgnum);   # was "@todelete << $msgnum", which appends nothing in Perl
        } else {
            #print "Keeping  this one: $msgnum not $msgnum{$msgid}\n";
            # delete the other one, keep this one.
            $msglist{$msgid} = $listid;
            push(@todelete, $msgnum{$msgid});
        }
    } else {
        $msgnum{$msgid} = $msgnum;
    }
}
close(MESSAGES);

# Print the command rather than run it, so it can be checked first.
print "rmm +inbox ", join(" ", @todelete), "\n";


--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        | network architect  [
]     [hidden email]  http://www.sandelman.ca/        |   ruby on rails    [
       


Re: sorting uniquely by message-id

kat
Dear Michael,

You may take inspiration from my dupl.
https://krabben.twilightparadox.com/fossil/nmh-contrib/artifact/ff6fb524b2bedb96

It doesn't work for you as-is, as it doesn't choose the right copy.

Also, you should add "-width 0" to the scan.

I avail myself of this opportunity to renew to you the assurances
of my highest consideration.

Krullen Van De Trap


Re: sorting uniquely by message-id

Ralph Corderoy
In reply to this post by Michael Richardson-5
Hi Michael,

> I really want to just keep the copy that went through the list,
> because that's the one that went into the public archive, and replies
> will work better on that version.  But, sometimes some of the messages
> get copied to me only, and can't lose them.

So for all emails with the same message-id field, keep just the one with
a list-id field but if none of them have a list-id field then keep all
of them.
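
That rule can be sketched without Perl.  The following is only a minimal
sketch, not anything from nmh itself: it assumes the scan output has already
been reduced to one record per message, with message number, message-id and
list-id separated by the ASCII unit separator (US, octal 037).  The function
name dups_to_remove is made up for illustration, and skipping records with an
empty message-id is an added assumption.

```shell
#!/bin/sh
# Sketch under the assumptions stated above; not part of nmh.
US=$(printf '\037')

# dups_to_remove: read "msgnum<US>message-id<US>list-id" records on stdin
# and print the message numbers that the rule says can be removed.
dups_to_remove() {
    awk -F "$US" '
        {
            num[NR] = $1
            mid[NR] = $2
            # Remember the first copy of each message-id that carries a list-id.
            if ($2 != "" && $3 != "" && !($2 in keep))
                keep[$2] = $1
        }
        END {
            for (i = 1; i <= NR; i++)
                # Only message-ids that have a list copy lose their other copies.
                if ((mid[i] in keep) && keep[mid[i]] != num[i])
                    print num[i]
        }'
}

# Message 1 came through a list, 2 is a direct copy of it, 3 has no list copy.
printf '1\037m\037li@foo\n2\037m\037\n3\037n\037\n' | dups_to_remove   # prints 2
```

The printed numbers could then be handed to rmm after checking them by eye.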

> I realized I don't want to change the order of the folder.  Instead,
> I am doing:
>
>   scan -format '%5(msg),%{message-id},%<(addr{List-Id})%>' +inbox
>
> and then process this with Perl to do the right thing....
>
>     if(defined($msgnum{$msgid})) {
>         if(defined($msglist{$msgid})) {
>             push(@todelete, $msgnum);
>         } else {
>             $msglist{$msgid} = $listid;
>             push(@todelete, $msgnum{$msgid});
>         }
>     } else {
>         $msgnum{$msgid} = $msgnum;
>     }

Doesn't that add empty $listids to %msglist?  Consider this input.

    1,m,li@foo  $msgnum{'m'} = 1
    2,m         $msglist{'m'} = ''; push(@todelete, 1);

>   156 %09/16   To:"Rob Wilton      asdf.ietf.or Re: [Asdf] Robert Wilton's Block
>   164 %09/17     Carsten Bormann     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
>   175 %09/17     "Rob Wilton                      RE: [Asdf] Robert Wilton's Block
> - 176 %09/17   "Rob Wilton                      RE: [Asdf] Robert Wilton's Block
>   177 %09/17   "Rob Wilton         asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 178 %09/17     Carsten Bormann                  Re: [Asdf] Robert Wilton's Block
>   179 %09/17     Carsten Bormann     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 187 %09/17       David Kemp                       Re: [Asdf] Robert Wilton's Block
>   188 %09/17       David Kemp          asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 200 %09/17         Carsten Bormann                  Re: [Asdf] Robert Wilton's Block
>   201 %09/17         Carsten Bormann     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 180 %09/17   "Rob Wilton                      RE: [Asdf] Robert Wilton's Block
>   181 %09/17   "Rob Wilton         asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 184 %09/17     Carsten Bormann     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
>   185 %09/17     Carsten Bormann                  Re: [Asdf] Robert Wilton's Block
> - 189 %09/17       "Rob Wilton                      RE: [Asdf] Robert Wilton's Block
>   190 %09/17       "Rob Wilton         asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 193 %09/17         Alexander Pelov                  Re: [Asdf] Robert Wilton's Block
>   194 %09/17         Alexander Pelov     asdf.ietf.or Re: [Asdf] Robert Wilton's Block
> - 196 %09/17         Barry Leiba                      Re: [Asdf] Robert Wilton's Block
>   197 %09/17         Barry Leiba                      Re: [Asdf] Robert Wilton's Block
> - 198 %09/17           "Rob Wilton         asdf.ietf.or Re: [Asdf] Robert Wilton's Block
>   199 %09/17           "Rob Wilton                      RE: [Asdf] Robert Wilton's Block
>   327 %09/18           Carsten Bormann                  Re: [Asdf] Robert Wilton's Block

Given

    $ grep -i sequence-negation ~/.mh_profile
    sequence-negation: not

something like this might do.

    mark -seq dup -del all
    pick -seq li --list-id 'z*'
    scan -forma '%{message-id}' li |
    while read -r mi; do
        pick -nozero -seq dup --message-id "$mi" notli
    done

> I think that I ought to use something other than , that can't appear
> in any of the relevant fields.  ASCII defined this "FS" character
> (ASCII 29 or something), and I wonder why we don't use it more
> often...  i wonder if the format spec ought to make this easily
> accessible?

US should be used for the lowest level, working up through RS and GS to
FS last; ascii(7).  Here's US.

    $ fmttest -forma '%{from}'$'\c_''%{date}' -width 0 . | sed -n l
    Michael Richardson <[hidden email]>\037Fri, 18 Sep 2020 16:36:15 -0\
    400$
    $
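
For reference, the four separators sit at the end of the ASCII control
range: FS is 28, GS is 29, RS is 30 and US is 31 (octal 034 to 037).  A small
sketch of using US between fields from the shell follows.  The helper names
join_us and field_us are invented for the example and are not part of nmh.

```shell
#!/bin/sh
# ASCII information separators, highest level first (see ascii(7)):
#   FS = 28 (octal 034), GS = 29 (035), RS = 30 (036), US = 31 (037)
US=$(printf '\037')

# join_us: print the arguments as one record with US between the fields.
join_us() {
    first=1
    for field in "$@"; do
        if [ "$first" -eq 1 ]; then first=0; else printf '%s' "$US"; fi
        printf '%s' "$field"
    done
    printf '\n'
}

# field_us N: print field N of each US-separated record read from stdin.
field_us() {
    awk -F "$US" -v n="$1" '{ print $n }'
}

# A comma inside a field no longer matters, because US is the only separator.
join_us '142' '<id,with,commas@example.org>' 'asdf.ietf.org' | field_us 2
# prints <id,with,commas@example.org>
```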

--
Cheers, Ralph.


Re: sorting uniquely by message-id

spaceman
In reply to this post by Michael Richardson-5
Michael Richardson wrote:
>
> Due to the way that IETF lists and aliases work, I often get two, three
> sometimes four copies of an email.  I attach a mh-e view of a thread below.
> I really want to just keep the copy that went through the list, because
> that's the one that went into the public archive, and replies will work
> better on that version.
>

I might be completely off base here, but wouldn’t something like
formail (part of procmail) solve the problem of duplicates, instead of
trying to hack together a system in Perl?

If you need to take account of both message-id and list-id it might not
help, although you could run it on a per-list basis with a different
cache (i.e. list of message-ids) for each list.

Regards,
spaceman


Re: sorting uniquely by message-id

e40 / envoy510
Yeah, I use procmail for this:

# From Ahmon on 2/8/19
#   This creates a little database of already-seen Message-Id and will not
#   redeliver a message with an already-seen Message-Id.
:0 Wh: msgid.lock
| formail -D 8192 ~/mail/msgid.cache
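
Roughly, the idea behind such a cache can be sketched in plain shell as
below.  This only illustrates the "seen before" test and is not how formail
implements -D (formail bounds the size of its cache, which this sketch does
not).  The function name seen_before and the one-id-per-line cache file are
assumptions made for the example.

```shell
#!/bin/sh
# seen_before CACHE MSGID
# Succeeds if MSGID is already recorded in CACHE; otherwise records it and fails.
seen_before() {
    cache=$1
    msgid=$2
    touch "$cache"
    # -F: fixed string, -x: whole line must match, -q: no output.
    if grep -Fqx -e "$msgid" "$cache"; then
        return 0
    fi
    printf '%s\n' "$msgid" >> "$cache"
    return 1
}

cache=$(mktemp)
for id in '<a@example.org>' '<b@example.org>' '<a@example.org>'; do
    if seen_before "$cache" "$id"; then
        echo "duplicate: $id"
    else
        echo "new: $id"
    fi
done
rm -f "$cache"
```

With a test like this, whichever copy arrives first is the one that is kept,
whether or not it came through the list.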


On Sat, Sep 19, 2020 at 6:07 AM spaceman <[hidden email]> wrote:
> Michael Richardson wrote:
> >
> > Due to the way that IETF lists and aliases work, I often get two, three
> > sometimes four copies of an email.  I attach a mh-e view of a thread below.
> > I really want to just keep the copy that went through the list, because
> > that's the one that went into the public archive, and replies will work
> > better on that version.
> >
>
> I might be completely off base here, but wouldn’t something like
> formail (part of procmail) solve the problem of duplicates, instead of
> trying to hack together a system in Perl?
>
> If you need to take account of both message-id and list-id it might not
> help, although you could run it on a per-list basis with a different
> cache (i.e. list of message-ids) for each list.
>
> Regards,
> spaceman


Re: sorting uniquely by message-id

kat
In reply to this post by spaceman
Indeed spaceman, other utilities can deduplicate emails, but we can do it
with nmh too.

Michael, your selection preference is good, so I have updated dupl to work
the way you described. Please find attached "dupl", the utility of
interest, and "dedupl-list", a demonstration of the utility.

The files are also in this repository.
https://krabben.twilightparadox.com/fossil/nmh-contrib/

I should note, dupl is vulnerable to an attack by which one chooses
a bizarre message-id with a space or pipe to cause dupl to mark a
message as duplicate when it truly is not a duplicate. I figured this
was not an issue because there is already the simpler attack of sending
a message with the exact same message-id.

I respectfully remain your servant,
Krullen Van De Trap

dupl (407 bytes) Download Attachment
dedupl-list (483 bytes) Download Attachment

Re: sorting uniquely by message-id

Michael Richardson-5
In reply to this post by spaceman

spaceman <[hidden email]> wrote:
    > Michael Richardson wrote:
    >>
    >> Due to the way that IETF lists and aliases work, I often get two, three
    >> sometimes four copies of an email.  I attach a mh-e view of a thread below.
    >> I really want to just keep the copy that went through the list, because
    >> that's the one that went into the public archive, and replies will work
    >> better on that version.
    >>

    > I might be completely off base here, but wouldn’t something like
    > formail (part of procmail) solve the problem of duplicates, instead of
    > trying to hack together a system in Perl?

Yes, assuming that it could operate over multiple days receiving email from a
variety of directions, and which arrive possibly through different mailboxes.
The direct emails can arrive long before the mailing list copy, or could be
delayed by spam/graylisting.

Generally, I wouldn't want to run this all the time, which procmail would force.

--
]               Never tell me the odds!                 | ipv6 mesh networks [
]   Michael Richardson, Sandelman Software Works        |    IoT architect   [
]     [hidden email]  http://www.sandelman.ca/        |   ruby on rails    [



Re: sorting uniquely by message-id

Ralph Corderoy
Hi Michael,

> The direct emails can arrive long before the mailing list copy

I doubt procmail is up to the job.

I send one email with a message-ID field.

- You may have zero or more copies arrive with no list-ID field.
- You may have zero or more copies arrive with list-ID field X.
- You may have zero or more copies arrive with list-ID field Y.
- And so on depending how many distinct lists send you the same
  message-ID email.

You can get multiple copies from the same list, as I do, because of a
re-try after a transmission error.

All those emails may arrive in any order.

--
Cheers, Ralph.


Re: sorting uniquely by message-id

Ralph Corderoy
In reply to this post by kat
Hi Krullen,

> sort -k 3 | sort -k 2

I don't think that does what you intend.
Anything achieved by the first sort is discarded by the second.
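
To spell that out: the second sort re-orders every line by its own key, so
the ordering produced by the first is lost.  Giving a single sort both keys
is the usual fix.  A sketch, assuming blank-separated fields with the
message-id in field 2 and the list-id in field 3 (the field layout and the
function name are guesses for illustration, not taken from dupl):

```shell
#!/bin/sh
# sort_by_id_then_list: order records by field 2, breaking ties on field 3.
# "-k 2,2" limits the key to field 2 alone; a bare "-k 2" runs to end of line.
sort_by_id_then_list() {
    LC_ALL=C sort -k 2,2 -k 3,3
}

printf '%s\n' '1 m2 b' '2 m1 z' '3 m1 a' | sort_by_id_then_list
# prints:
#   3 m1 a
#   2 m1 z
#   1 m2 b
```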

--
Cheers, Ralph.


Re: sorting uniquely by message-id

kat
Dear Ralph,

> > sort -k 3 | sort -k 2
>
> I don't think that does what you intend.
> Anything achieved by the first sort is discarded by the second.

Indeed it does not.

Attached is a better version. I switched it also to use the unit
separator (number 31, like number 29 but for fields of a record).

So now the unnecessarily complex attack is to send a unit separator,
and the simple attack is still to send the exact same message-id.
I suppose you can check SPF, &c to mitigate either attack.

Respectfully yours,
Krullen Van De Trap

dupl (542 bytes) Download Attachment
dedupl-list (585 bytes) Download Attachment

Re: sorting uniquely by message-id

Wolfgang Denk
Dear Krullen Van De Trap,

In message <[hidden email]> you wrote:

>
> > > sort -k 3 | sort -k 2
> >
> > I don't think that does what you intend.
> > Anything achieved by the first sort is discarded by the second.
>
> Indeed it does not.
>
> Attached is a better version. I switched it also to use the unit
> separator (number 31, like number 29 but for fields of a record).
>
> So now the unnecessarily complex attack is to send a unit separator,
> and the simple attack is still to send the exact same message-id.
> I suppose you can check SPF, &c to mitigate either attack.

This does not work for me either.

It appears to have problems with messages that have no message-id
field at all (like those being archived through a Fcc: header).  In
this case, $key will look like this for me: '|[hidden email]' and
all such messages will be flagged as dupes.


Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [hidden email]
What we think, or what we know, or what we believe, is in the end, of
little consequence. The only thing of consequence is what we do.
                                                        - John Ruskin


Re: sorting uniquely by message-id

kat
Brilliant, Wolfgang, I have fixed this too. See attached or linked.

https://krabben.twilightparadox.com/fossil/nmh-contrib/finfo?name=bin/dupl
https://krabben.twilightparadox.com/fossil/nmh-contrib/finfo?name=example/nmh-contrib/dedupl-list

With collegial regards,
Krullen Van De Trap

dupl (574 bytes) Download Attachment
dedupl-list (654 bytes) Download Attachment

Re: sorting uniquely by message-id

Ralph Corderoy
In reply to this post by Wolfgang Denk
Hi Wolfgang,

> It appears to have problems with messages that have no message-id
> field at all (like those being archived through a Fcc: header).

send(1) has a -msgid switch to add a message-ID field before processing the
FCC field rather than assume something downstream will add one.  That way,
you've a record of the message ID for passing on to others later.

--
Cheers, Ralph.


Re: sorting uniquely by message-id

Wolfgang Denk
Dear Ralph,

In message <[hidden email]> you wrote:
>
> send(1) has a -msgid to add a message-ID field before processing the FCC
> field rather than assume something downstream will add one.  That way,
> you've a record of the message ID for passing on to others later.

Good to know - thanks (guess I should do more RTFMing ;-)


Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [hidden email]
Reader, suppose you were an idiot. And suppose you were a  member  of
Congress. But I repeat myself.                           - Mark Twain


Re: sorting uniquely by message-id

Wolfgang Denk
In reply to this post by kat
Dear Krullen,

In message <[hidden email]> you wrote:
>
> Brilliant, Wolfgang, I have fixed this too. See attached or linked.

Thanks - close, but no cigar ;-)


It still fails in a number of cases because "pick -search 'message-id:.*'"
searches everywhere in the message, including the body; I have a few
messages (as mentioned before, without their own message ID as they
were archived using a Fcc: header), which have a Message-ID: line
embedded in the body.  Then this message is considered identical to
the one containing this message-id in the header...

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [hidden email]
There are bugs and then there are bugs.  And then there are bugs.
                                                    - Karl Lehenbauer


Re: sorting uniquely by message-id

Paul Fox-3
wolfgang wrote:
 > Dear Krullen,
 >
 > In message <[hidden email]> you wrote:
 > >
 > > Brilliant, Wolfgang, I have fixed this too. See attached or linked.
 >
 > Thanks - close, but no cigar ;-)
 >
 >
 > It still fails in a number of cases because "pick -search 'message-id:.*'"
 > searches everywhere in the message, including the body; I have a few
 > messages (as mentioned before, without their own message ID as they
 > were archived using a Fcc: header), which have a Message-ID: line
 > embedded in the body.  Then this message is considered identical to
 > the one containing this message-id in the header...

The pick man page doesn't present this quite as clearly as it could,
but when it refers to "--component", "component" means "any header
field name".  So I think what you want is:  "pick --message-id .*".
(The double hyphen is important.)
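
The difference between matching a header field and searching the whole
message can be illustrated outside nmh.  The sketch below is not how pick
works internally: it simply treats everything before the first blank line as
the header, and the function names are invented for the example.

```shell
#!/bin/sh
# header_has_msgid FILE: true only if a Message-ID field is in the header,
# that is, before the first blank line of the message.
header_has_msgid() {
    sed '/^$/q' "$1" | grep -iq '^message-id:'
}

# anywhere_has_msgid FILE: true if such a line appears anywhere, body included.
anywhere_has_msgid() {
    grep -iq '^message-id:' "$1"
}

# A message with no Message-ID of its own that quotes one in the body.
msg=$(mktemp)
printf '%s\n' 'From: someone@example.org' 'Subject: archived copy' '' \
    'Message-ID: <quoted@example.org>' > "$msg"

if anywhere_has_msgid "$msg"; then
    echo "whole-message search: match"
fi
if header_has_msgid "$msg"; then
    echo "header-only search: match"
else
    echo "header-only search: no match"
fi
rm -f "$msg"
```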

paul
=----------------------
paul fox, [hidden email] (arlington, ma, where it's 51.8 degrees)



Re: sorting uniquely by message-id

Wolfgang Denk
Dear Paul,

In message <[hidden email]> you wrote:

>
>  > It still fails in a number of cases because "pick -search 'message-id:.*'"
>  > searches everywhere in the message, including the body; I have a few
>  > messages (as mentioned before, without their own message ID as they
>  > were archived using a Fcc: header), which have a Message-ID: line
>  > embedded in the body.  Then this message is considered identical to
>  > the one containing this message-id in the header...
>
> The pick man page doesn't present this quite as clearly as it could,
> but when it refers to "--component", "component" means "any header
> field name".  So I think what you want is:  "pick --message-id .*".
> (The double hyphen is important.)

Indeed, this fixes the problem (of course we have to escape the '.*' to
prevent the shell from expanding it).
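
What the shell does to an unquoted .* can be seen with a small sketch.
show_args is a made-up stand-in that only prints its arguments, so no pick is
actually run, and the temporary directory with one dotfile is there just to
give the pattern something to match.

```shell
#!/bin/sh
# show_args: print each argument on its own line, as a command would receive it.
show_args() {
    printf '%s\n' "$@"
}

dir=$(mktemp -d)
touch "$dir/.hidden"
(
    cd "$dir" || exit 1
    # Unquoted: the shell replaces .* with the matching file names first.
    show_args pick --message-id .*
    # Quoted: the pattern reaches the command unchanged.
    show_args pick --message-id '.*'
)
rm -rf "$dir"
```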

Thanks!

Best regards,

Wolfgang Denk

--
DENX Software Engineering GmbH,      Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: [hidden email]
It's all Klatchian to me.
        - Terry Pratchett & Stephen Briggs, _The Discworld Companion_


Re: sorting uniquely by message-id

Ralph Corderoy
Hi,

Wolfgang wrote:
> > The pick man page doesn't present this quite as clearly as it could,
> > but when it refers to "--component", "component" means "any header
> > field name".  So I think what you want is:  "pick --message-id .*".
> > (The double hyphen is important.)
>
> Indeed, this fixes the problem (of course we have to escape the '.*'
> to prevent the shell from expanding it).

My suggestion used ‘--message-id 'z*'’.  Its approach uses MH sequences,
each a fast set implemented using a bitmap, with a loop iterating for
each distinct message-ID field.

    https://lists.nongnu.org/archive/html/nmh-workers/2020-09/msg00008.html

--
Cheers, Ralph.


Re: sorting uniquely by message-id

kat
In reply to this post by Wolfgang Denk
Wolfgang Denk writes:
> It still fails in a number of cases because "pick -search 'message-id:.*'"
> searches everywhere in the message, including the body; I have a few
> messages (as mentioned before, without their own message ID as they
> were archived using a Fcc: header), which have a Message-ID: line
> embedded in the body.  Then this message is considered identical to
> the one containing this message-id in the header...

Dear Wolfgang,

I don't remember why I didn't use --message-id. Anyway, this patch
seems to work. It is also updated in the fossil repository.

12c12
< scan -width 0 -format "$format" $(pick -search 'message-id:.*' "$@") |
---
> scan -width 0 -format "$format" $(pick --message-id '' "$@") |

If nobody else gets to it, I may update the manual page to say that these
really are different, since this is unclear in the manual.

Your obedient servant,
Krullen Van De Trap