[nmh-workers] To/cc decode or not to/cc decode

Conrad Hughes
Upgraded to Debian 10/buster and nmh 1.7.1-4 recently: very swish.

Took me a while to pin down a double appearance of the message context
(a legacy mhl.headers MessageName entry, in the end), but while doing
that I noticed that the default (at least under Debian) To: and cc:
lines in mhl.headers are just empty, implicitly not decoding their
content, while the From: line does extra work:

  From:formatfield="%(unquote(decode{text}))"

.. now actually I'm seeing unreadable garbage in my To: lines quite
frequently, so adding the same format tweak to To: and cc: seems to make
sense.  Any good reason not to do this, or why this isn't the default
for To: and cc: as well as From:?
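
For readers who have not met it, the "garbage" here is typically RFC 2047
encoded-words.  A minimal Python sketch of the same kind of decoding, done
outside nmh; the name and address are made up, and this uses Python's
email.header module rather than anything from mhl (it also skips the
unquote step):

```python
from email.header import decode_header, make_header


def decode_field(raw: str) -> str:
    """Decode any RFC 2047 encoded-words in a header field body."""
    return str(make_header(decode_header(raw)))


# Hypothetical To: body as it might appear when left undecoded.
raw_to = "=?UTF-8?Q?J=C3=BCrgen_M=C3=BCller?= <juergen@example.org>"
decoded_to = decode_field(raw_to)
print(decoded_to)
```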

Conrad

--
nmh-workers
https://lists.nongnu.org/mailman/listinfo/nmh-workers

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
>Took me a while to pin down a double appearance of the message context
>(a legacy mhl.headers MessageName entry, in the end), but while doing
>that I noticed that the default (at least under Debian) To: and cc:
>lines in mhl.headers are just empty, implicitly not decoding their
>content, while the From: line does extra work:
>
>  From:formatfield="%(unquote(decode{text}))"
>
>.. now actually I'm seeing unreadable garbage in my To: lines quite
>frequently, so adding the same format tweak to To: and cc: seems to make
>sense.  Any good reason not to do this, or why this isn't the default
>for To: and cc: as well as From:?

I was curious, so I went and looked.  It seems we never did this because
... we never did it.

It's been that way from the beginning of nmh.  RFC 2047 decoding was
never in stock MH, but it was added somewhere along the way by Richard
Coleman during his initial nmh work.  That's where the "decode" option
was added to mhl (stock MH had "formatfield").  When those examples were
added to the default mhl format files, they were just added to the
From and Subject components.  I do not know why they were not added
to To: and cc: as well; I suspect the issue was that encoded text was
relatively infrequently seen in those headers.

I cannot think of a reason to not add them.  I think we should add them
to the defaults; any objections?

--Ken

Re: To/cc decode or not to/cc decode

David Levine-3
In reply to this post by Conrad Hughes
Conrad writes:

> I noticed that the default (at least under Debian) To: and cc:
> lines in mhl.headers are just empty, implicitly not decoding their
> content,

That is the current default in nmh.

> while the From: line does extra work:
>
>   From:formatfield="%(unquote(decode{text}))"

> .. now actually I'm seeing unreadable garbage in my To: lines quite
> frequently, so adding the same format tweak to To: and cc: seems to make
> sense.  Any good reason not to do this, or why this isn't the default
> for To: and cc: as well as From:?

I can't tell from the repo why that's the default.  I agree that it
should be changed.  Any objection to the following change?

diff --git a/etc/mhl.headers b/etc/mhl.headers
index 27a08d71..f017a518 100644
--- a/etc/mhl.headers
+++ b/etc/mhl.headers
@@ -11,2 +11,2 @@ Date:formatfield="%<(nodate{text})%{text}%|%(pretty{text})%>"
-To:
-cc:
+To:formatfield="%(unquote(decode{text}))"
+cc:formatfield="%(unquote(decode{text}))"

David

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
>I can't tell from the repo why that's the default.  I agree that it
>should be changed.  Any objection to the following change?

LGTM

--Ken

Re: To/cc decode or not to/cc decode

Ralph Corderoy
In reply to this post by David Levine-3
Hi David,

> I can't tell from the repo why that's the default.

My guess, having read mhl(1), is that the CPU cost was weighed against
the likelihood of the To or CC fields needing decoding.

> Any objection to the following change?

No.  Though it still leaves Resent-{From,To,CC} fields encoded.
They could be added too, rather than left as Extras.  Or, if mhl knows
what fields contain addresses, I forget, then perhaps a method of
globally saying all address fields should be decoded and unquoted?
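
A rough illustration of the "decode every address field" idea, written in
Python rather than as an mhl feature.  The ADDRESS_FIELDS tuple is an
assumption made for this sketch; the thread leaves open whether mhl itself
knows which fields contain addresses, and the unquote step is omitted:

```python
from email import message_from_string
from email.header import decode_header, make_header

# Assumed set of address-bearing fields, chosen for illustration only.
ADDRESS_FIELDS = ("From", "To", "Cc", "Resent-From", "Resent-To", "Resent-Cc")


def decoded_address_fields(raw_message):
    """Map each address field present to a list of its decoded values."""
    msg = message_from_string(raw_message)
    result = {}
    for name in ADDRESS_FIELDS:
        values = msg.get_all(name, [])
        if values:
            result[name] = [str(make_header(decode_header(v))) for v in values]
    return result


# Made-up message; Subject is deliberately not an address field.
SAMPLE = (
    "From: =?UTF-8?Q?Ren=C3=A9?= <rene@example.org>\n"
    "To: =?ISO-8859-1?Q?Andr=E9?= <andre@example.org>\n"
    "Subject: test\n"
    "\n"
    "body\n"
)

for field, values in decoded_address_fields(SAMPLE).items():
    print(field, values)
```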

--
Cheers, Ralph.

Re: To/cc decode or not to/cc decode

David Levine-3
Ralph writes:

> Though it still leaves Resent-{From,To,CC} fields encoded.
> They could be added too, rather than left as Extras.

Done.  I added those Resent- components, thanks.

> Or, if mhl knows what fields contain addresses, I forget,
> then perhaps a method of globally saying all address
> fields should be decoded and unquoted?

That would be a nice enhancement.

David

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
>> Or, if mhl knows what fields contain addresses, I forget,
>> then perhaps a method of globally saying all address
>> fields should be decoded and unquoted?
>
>That would be a nice enhancement.

My vague plan is that when nmh is changed to have a "real" internal
API, in the normal course of things you will automatically get
the decoded form of header fields, so you won't need to specify things
like this in your mhl file (unless someone can think of a reason
you WOULDN'T want to do that).

--Ken

Re: To/cc decode or not to/cc decode

Ralph Corderoy
Hi Ken,

> > > Or, if mhl knows what fields contain addresses, I forget, then
> > > perhaps a method of globally saying all address fields should be
> > > decoded and unquoted?
> >
> > That would be a nice enhancement.
>
> My vague plan is that when nmh is changed to have a "real" internal
> API, in the normal course of things you will automatically get
> the decoded form of header fields, so you won't need to specify things
> like this in your mhl file (unless someone can think of a reason you
> WOULDN'T want to do that).

Because MH lets us arrange things how we want so if decoding was done by
default we'd need a means to get to the raw original version instead,
e.g. I want to scan or pick them based on the raw value.  Given decoding
isn't lossless, this suggests both the original and decoded would need
to be kept.
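
The "not lossless" point can be seen in a small Python sketch (using
Python's email.header module, not nmh code): several distinct raw
encodings collapse to the same decoded text, so the raw form cannot be
recovered from the decoded one.  The sample strings are made up:

```python
from email.header import decode_header, make_header


def decode_field(raw):
    """Decode any RFC 2047 encoded-words in a header field body."""
    return str(make_header(decode_header(raw)))


# Three different raw encodings of the same word.
raw_forms = [
    "=?UTF-8?Q?Caf=C3=A9?=",
    "=?ISO-8859-1?Q?Caf=E9?=",
    "=?UTF-8?B?Q2Fmw6k=?=",
]

# A set keeps only the distinct decoded results.
decoded_forms = {decode_field(raw) for raw in raw_forms}
print(decoded_forms)
```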

Instead, and rather than apply the negative un-decode, my feeling is the
decode should be explicit.  We can provide default configurations that
apply it for new users, much as we are now.

--
Cheers, Ralph.

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
>Because MH lets us arrange things how we want so if decoding was done by
>default we'd need a means to get to the raw original version instead,
>e.g. I want to scan or pick them based on the raw value.  Given decoding
>isn't lossless, this suggests both the original and decoded would need
>to be kept.
>
>Instead, and rather than apply the negative un-decode, my feeling is the
>decode should be explicit.  We can provide default configurations that
>apply it for new users, much as we are now.

Arrrrrgggghhhh.

I can VAGUELY imagine a situation where you might want to search
undecoded header fields, but I have to believe it is NOT the normal
behavior (also, I think a tool to search undecoded header fields already
exists and it's called grep).  How we deal with character set conversion
is going to be "interesting" but I think it is solvable, even for those
few cranks who don't want to switch to a UTF-8 locale.

And it seems one of the lessons from this whole thing (and it's not just
this, it KEEPS COMING UP AGAIN AND AGAIN) is that people don't update
their config files because they put them in place 20 years ago and
they worked fine, so changing the default templates isn't as helpful as
one would think.  Clearly the default should be decoded header fields
unless a user explicitly asks for them undecoded.

--Ken

Re: To/cc decode or not to/cc decode

Ralph Corderoy
Hi Ken,

> > Because MH lets us arrange things how we want so if decoding was
> > done by default we'd need a means to get to the raw original version
> > instead, e.g. I want to scan or pick them based on the raw value.
> > Given decoding isn't lossless, this suggests both the original and
> > decoded would need to be kept.
>
> I can VAGUELY imagine a situation where you might want to search
> undecoded header fields, but I have to believe it is NOT the normal
> behavior

I didn't suggest it's the normal behaviour, even for me.  :-)

> (also, I think a tool to search undecoded header fields already exists
> and it's called grep).

That doesn't work since fields can be folded across multiple lines.  And
I said scan and pick, so not just search.
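
To make the multi-line point concrete, here is a small Python sketch
(an illustration only, not how pick works): a line-oriented match misses
an address on a continuation line, while matching after unfolding finds
it.  The helper names and the sample header are made up, and the input is
assumed to be the header block only:

```python
import re


def unfold_headers(header_text):
    """Join continuation lines (starting with space or tab) onto their field."""
    fields = []
    for line in header_text.splitlines():
        if line[:1] in (" ", "\t") and fields:
            # Unfolding removes only the line break, keeping the whitespace.
            fields[-1] += line
        else:
            fields.append(line)
    return fields


def raw_field_matches(header_text, name, pattern):
    """True if any unfolded field called `name` matches `pattern`, undecoded."""
    prefix = name.lower() + ":"
    return any(
        field.lower().startswith(prefix) and re.search(pattern, field)
        for field in unfold_headers(header_text)
    )


HEADERS = (
    "To: alice@example.org,\n"
    "\tbob@example.org\n"
    "Subject: hello\n"
)

# Line-by-line, grep-style: no single line has both "To:" and "bob".
print(re.search(r"^To:.*bob", HEADERS, re.MULTILINE))
# After unfolding, the To: field contains both addresses.
print(raw_field_matches(HEADERS, "To", r"bob@example\.org"))
```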

> > Instead, and rather than apply the negative un-decode, my feeling is
> > the decode should be explicit.  We can provide default
> > configurations that apply it for new users, much as we are now.
>
> people don't update their config files because they put them in
> place 20 years ago and they worked fine

That includes me.  When I've time, I need to clear them all out and
start from scratch to see what's needed and the best way to solve
problems, using a new installation in a second account as a guide.

> so changing the default templates isn't as helpful as one would think.

It's a good form of documentation.  In particular, I think it would be
helpful to spell out to users how they can get a ‘pristine’ set of
.mh_profile, etc., and point an environment variable or two at it so
it's used instead of their own from last millennium, left unharmed.

> Clearly the default should be decoded header fields unless a user
> explicitly asks for them undecoded.

I don't see how that's compatible with access to the lossless raw ones
unless both are kept around.

I may have suggested this before, but shipping a folder of test emails
that a user can ‘scan +/usr/share/nmh/...’, etc., as a test of their
configuration, along with pre-formatted ‘golden’ output would be an easy
way for them to check they've updated to the latest bell and whistle.
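
The comparison against ‘golden’ output could be as small as the Python
sketch below.  How the actual output is produced (for example by running
scan on the test folder) is left out, and the sample lines are made-up
stand-ins rather than real nmh output:

```python
import difflib


def diff_against_golden(actual, golden):
    """Return unified-diff lines; an empty list means the output matches."""
    return list(
        difflib.unified_diff(
            golden.splitlines(),
            actual.splitlines(),
            fromfile="golden",
            tofile="actual",
            lineterm="",
        )
    )


# Hypothetical scan-like lines: one decoded, one from a stale format file.
golden = "   1  07/15 René               test message\n"
stale = "   1  07/15 =?UTF-8?Q?Ren=C3=  test message\n"

for line in diff_against_golden(stale, golden):
    print(line)
```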

--
Cheers, Ralph.

Re: To/cc decode or not to/cc decode

Conrad Hughes
Perhaps we could all collaborate on collecting old .mh_profiles (maybe
also mhl.replys etc.?), piping 'em all through "sort -u" and looking to
see what was obsolete, then adding a check-mh-upgrade script which flags
obsolete stuff and gets invoked whenever the user's .mh_profile has a
missing or obsolete "mh-version: 1.7" line?

Is there stuff that would be obviously detectable in this way?

C.

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
In reply to this post by Ralph Corderoy
>> people don't update their config files because they put them in
>> place 20 years ago and they worked fine
>
>That includes me.  When I've time, I need to clear them all out and
>start from scratch to see what's needed and the best way to solve
>problems, using a new installation in a second account as a guide.

Right, but ... the fact that you haven't YET done that I think just
bolsters my point.

It's like when someone complains, "Oh, I've been seeing this garbage in
my scan output for years now, will nmh ever deal with it??" and we all
point out, "Hey, it's LITERALLY been dealing with that since the first
release of nmh" and it turns out the problem is they put in place their
own custom scan format back in the 80s and haven't updated it since.
If this just happened once, fine, but it keeps happening and that sure
suggests to me that we're approaching this wrong.

>> so changing the default templates isn't as helpful as one would think.
>
>It's a good form of documentation.  In particular, I think it would be
>helpful to spell out to users how they can get a ‘pristine’ set of
>.mh_profile, etc., and point an environment variable or two at it so
>it's used instead of their own from last millennium, left unharmed.

Oh, I agree that having the templates as an example is helpful as
documentation; we've tried to make some of the default format files
have a bunch of comments explaining what is going on.  It just doesn't
help the user who changed their files a decade or two ago and hasn't
had the time or inclination to update things.

>> Clearly the default should be decoded header fields unless a user
>> explicitly asks for them undecoded.
>
>I don't see how that's compatible with access to the lossless raw ones
>unless both are kept around.

Well, "kept around" is sort-of vague, especially when we're talking about
something as vague as a hypothetical future API which hasn't really been
designed yet :-)

But, here are my thoughts in that regard.  It seems like most USERS of
header fields want the decoded, "unfolded" field.  So obviously any
API should provide that.  I think when you specify a component field
in a mh-format file by default it should give you the decoded component.

But there are cases when you want the undecoded fields.  The most
obvious one I can think of is when generating replies; for various reasons
when you are inserting email addresses into the reply draft the most
reliable thing is to use the "raw" address without any decoding.
I'm sure there are others that I am not thinking of.  But I wouldn't
worry about the features (or lack thereof) of some future undefined
and unimplemented API; if we need to change things that would be easy
to do!  Because, you know, it hasn't been written yet :-)

And I hope it goes without saying that this wouldn't be touching the
on-disk format of messages.

>I may have suggested this before, but shipping a folder of test emails
>that a user can ‘scan +/usr/share/nmh/...’, etc., as a test of their
>configuration, along with pre-formatted ‘golden’ output would be an easy
>way for them to check they've updated to the latest bell and whistle.

That might be useful, although the classic problem of letting people
know that such a thing exists is always an issue.

--Ken

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
In reply to this post by Conrad Hughes
>Perhaps we could all collaborate on collecting old .mh_profiles (maybe
>also mhl.replys etc.?), piping 'em all through "sort -u" and looking to
>see what was obsolete, then adding a check-mh-upgrade script which flags
>obsolete stuff and gets invoked whenever the user's .mh_profile has a
>missing or obsolete "mh-version: 1.7" line?

Weeellll ... a lot of the stuff is in mh-format files, which would be
hard to write an external parser for.  And while a check-mh-upgrade
script might not be a bad idea, the problems I see are a) figuring out
how to let people know it exists, and b) getting people to run it.
Those two problems make me not want to spend my time writing it.  But
if someone else wants to write it, please do so!

--Ken

Re: To/cc decode or not to/cc decode

Conrad Hughes
Ken> And while a check-mh-upgrade
Ken> script might not be a bad idea, the problems I see are a) figuring
Ken> out how to let people know it exists, and b) getting people to run
Ken> it.

My thought was that it got invoked if .mh_profile didn't have the
"mh-version" line.  Given your comment about this being spread out among
lots of files, perhaps we could simplify to a uniform process: all files
are versioned, and all generate the same message on being read if
there's been a relevant version change — something like this:

  File <name> was created while using an old version of MH; a good way
  of checking whether you really need it is to back it up, remove it,
  and try this command again.  If it does do stuff that you still want,
  then add (or update) "; nmh-version=1.7" as the first line, and
  compare it with the default version in /etc/nmh/foofile to see what's
  new.

For efficiency the /etc/nmh files could be versioned too, and the
version check only triggered when their version is bumped.  That way, if
release 1.7 did nothing to change /etc/nmh/mhl.reply, that would stay at
version 1.6 and the out-of-date warning wouldn't be triggered if the
user's mhl.reply was also at 1.6.  Does require a partial parse of the
/etc files too, which could be annoying; perhaps the last-changed
version could be in the code instead.
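
As a sketch of that last variant, with the last-changed versions held in
code: the Python below is only an illustration of the check, not nmh
code.  The LAST_CHANGED table, its version numbers, and the function
names are hypothetical; the "; nmh-version=" first-line convention is
the one proposed above:

```python
import re

# Hypothetical table: the release in which each default file last changed.
LAST_CHANGED = {"mhl.reply": (1, 6), "mhl.headers": (1, 7)}

_VERSION_LINE = re.compile(r"^;\s*nmh-version=(\d+)\.(\d+)\s*$")


def file_version(text):
    """Return (major, minor) from a leading '; nmh-version=X.Y' line, or None."""
    lines = text.splitlines()
    match = _VERSION_LINE.match(lines[0]) if lines else None
    return (int(match.group(1)), int(match.group(2))) if match else None


def needs_warning(filename, text):
    """Warn when the user's file is unversioned or older than the last change."""
    last_changed = LAST_CHANGED.get(filename)
    if last_changed is None:
        return False  # no default known for this file, nothing to compare
    version = file_version(text)
    return version is None or version < last_changed


user_file = "; nmh-version=1.6\nTo:\ncc:\n"
print(needs_warning("mhl.reply", user_file))    # default unchanged since 1.6
print(needs_warning("mhl.headers", user_file))  # default changed after 1.6
```

Keeping the table in code avoids parsing the /etc files, at the cost of
having to remember to update it whenever a default file changes.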

Hm.  Do you have a prioritised list of jobs you'd most like done?

C.

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
>My thought was that it got invoked if .mh_profile didn't have the
>"mh-version" line.  Given your comment about this being spread out among
>lots of files, perhaps we could simplify to a uniform process: all files
>are versioned, and all generate the same message on being read if
>there's been a relevant version change — something like this:

Weeeelll ... that's not a bad idea, actually.  It might be kind of how
we do the NEWS file now.  Although that tends to show up more often
than I want sometimes.  I'm not completely sure how that works for format
files; we'd need to think about that.  I guess a comment at the beginning
would work.

>For efficiency the /etc/nmh files could be versioned too, and the
>version check only triggered when their version is bumped.  That way, if
>release 1.7 did nothing to change /etc/nmh/mhl.reply, that would stay at
>version 1.6 and the out-of-date warning wouldn't be triggered if the
>user's mhl.reply was also at 1.6.  Does require a partial parse of the
>/etc files too, which could be annoying; perhaps the last-changed
>version could be in the code instead.

That might be worth doing as well.

>Hm.  Do you have a prioritised list of jobs you'd most like done?

Well, here's my "wish list" for 1.8:

https://lists.gnu.org/archive/html/nmh-workers/2019-05/msg00000.html

--Ken

Re: To/cc decode or not to/cc decode

Ralph Corderoy
In reply to this post by Ken Hornstein-2
Hi Ken,

> > I may have suggested this before, but shipping a folder of test
> > emails that a user can ‘scan +/usr/share/nmh/...’, etc., as a test
> > of their configuration, along with pre-formatted ‘golden’ output
> > would be an easy way for them to check they've updated to the latest
> > bell and whistle.
>
> That might be useful, although the classic problem of letting people
> know that such a thing exists is always an issue.

There's David's context-version check, the FAQ, and this mailing-list's
page and appended signature.

I've been thinking a bit more about this.  A folder of test emails that
can be tested by an installed command available to the user.  That can
either use what it thinks is a good modern configuration so the user can
see what's possible in theory, or the user's configuration so they can
test their set up on a variety of emails.  The former could be built on
the fly based on what commands are installed, e.g. lynx(1) v. links(1).

--
Cheers, Ralph.

Re: To/cc decode or not to/cc decode

Ken Hornstein-2
>I've been thinking a bit more about this.  A folder of test emails that
>can be tested by an installed command available to the user.  That can
>either use what it thinks is a good modern configuration so the user can
>see what's possible in theory, or the user's configuration so they can
>test their set up on a variety of emails.  The former could be built on
>the fly based on what commands are installed, e.g. lynx(1) v. links(1).

Hm, well ... I think notifying the user of that might be tough.  But ...
we DO have the problem where users say, "Hey, this email doesn't look
right" and we first have to figure out what the email contains to try
to determine what is going wrong.  A set of emails that we know the content
of sure would help in debugging that.  I think that would be worthwhile
to put together if you wanted to do that.  It might even help a few
long-time MH/nmh users on this mailing list who somehow manage to keep
munging 8-bit characters in their replies (you know who you are!).

--Ken
