Re: encoding of message-ids
author W. Trevor King <wking@tremily.us>
Wed, 24 Feb 2016 17:15:24 +0000 (09:15 -0800)
committer W. Trevor King <wking@tremily.us>
Sat, 20 Aug 2016 23:21:11 +0000 (16:21 -0700)
5b/6e6fa4eb1e60210e77a3fef48098764c564e5e [new file with mode: 0644]

diff --git a/5b/6e6fa4eb1e60210e77a3fef48098764c564e5e b/5b/6e6fa4eb1e60210e77a3fef48098764c564e5e
new file mode 100644
index 0000000..f3449f1
--- /dev/null
+++ b/5b/6e6fa4eb1e60210e77a3fef48098764c564e5e
@@ -0,0 +1,158 @@
+Return-Path: <wking@tremily.us>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by arlo.cworth.org (Postfix) with ESMTP id E72966DE0B4F\r
+ for <notmuch@notmuchmail.org>; Wed, 24 Feb 2016 09:23:43 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at cworth.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.099\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.099 tagged_above=-999 required=5 tests=[AWL=0.213,\r
+  DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,\r
+ RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.211, SPF_PASS=-0.001]\r
+ autolearn=disabled\r
+Received: from arlo.cworth.org ([127.0.0.1])\r
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id RFzXC0ldVQ2X for <notmuch@notmuchmail.org>;\r
+ Wed, 24 Feb 2016 09:23:39 -0800 (PST)\r
+X-Greylist: delayed 490 seconds by postgrey-1.35 at arlo;\r
+ Wed, 24 Feb 2016 09:23:39 PST\r
+Received: from resqmta-ch2-12v.sys.comcast.net\r
+ (resqmta-ch2-12v.sys.comcast.net [69.252.207.44])\r
+ by arlo.cworth.org (Postfix) with ESMTPS id A45F36DE0AC2\r
+ for <notmuch@notmuchmail.org>; Wed, 24 Feb 2016 09:23:39 -0800 (PST)\r
+Received: from resomta-ch2-13v.sys.comcast.net ([69.252.207.109])\r
+ by resqmta-ch2-12v.sys.comcast.net with comcast\r
+ id NHDG1s0062N9P4d01HFSvV; Wed, 24 Feb 2016 17:15:26 +0000\r
+Received: from mail.tremily.us ([73.221.72.168])\r
+ by resomta-ch2-13v.sys.comcast.net with comcast\r
+ id NHFQ1s00a3dr3C901HFRsS; Wed, 24 Feb 2016 17:15:26 +0000\r
+Received: by mail.tremily.us (Postfix, from userid 1000)\r
+ id B5FE01BD02A1; Wed, 24 Feb 2016 09:15:24 -0800 (PST)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tremily.us; s=odin;\r
+ t=1456334124; bh=Vkx3ZOGyNSxOIPbT2HJQ6UZW4hHixJsKxezLJ6PCdao=;\r
+ h=Date:From:To:Cc:Subject:References:In-Reply-To;\r
+ b=n4ZOTdIF4OSRTy3KKLiB3DWZ3YR0leR+UWrVs6bnNcJyhpq6fMrkpxfFH5An2m039\r
+ 6XAjlzeFVfR2ZllJ/LiBtvWm/hvXPf2M3Y9awB+4IoIoPYu7hc5tEBTj6hLvluRHZv\r
+ K9NPN26vpUtSl41T/3Sw61/7t2fAo/anxH65upVs=\r
+Date: Wed, 24 Feb 2016 09:15:24 -0800\r
+From: "W. Trevor King" <wking@tremily.us>\r
+To: David Bremner <david@tethera.net>\r
+Cc: Daniel Kahn Gillmor <dkg@fifthhorseman.net>, notmuch@notmuchmail.org\r
+Subject: Re: encoding of message-ids\r
+Message-ID: <20160224171524.GS4265@odin.tremily.us>\r
+References: <87si0svnim.fsf@zancas.localnet>\r
+ <87ziv0iimt.fsf@alice.fifthhorseman.net>\r
+ <877fi3v4t6.fsf@zancas.localnet>\r
+MIME-Version: 1.0\r
+Content-Type: multipart/signed; micalg=pgp-sha1;\r
+ protocol="application/pgp-signature"; boundary="Djp5PRGHu2Cmyd8M"\r
+Content-Disposition: inline\r
+In-Reply-To: <877fi3v4t6.fsf@zancas.localnet>\r
+OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;\r
+ url=http://tremily.us/pubkey.txt\r
+User-Agent: Mutt/1.5.23 (2014-03-12)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;\r
+ s=q20140121; t=1456334126;\r
+ bh=FH2fuPKxpALNCdxB9yFY2XIPPEO6VETMGjI0e9JU2Rk=;\r
+ h=Received:Received:Received:Date:From:To:Subject:Message-ID:\r
+ MIME-Version:Content-Type;\r
+ b=aplBt8g2OvDTDmfayW2v6qm3VdL/1L+RYh2X4tmVnJCYm4M9T9B3XHr3JL7LSi+Jc\r
+ xDdMVBDVZb3HNRzzu/baSVMppAzKAsRUObG5hN5KOj+/+sQmFdi43felFOjzokAkAU\r
+ CgBpD7KuLR07fmJKH3BF/jPybceWzUJboBmrCBUj1J52Zkg6pQ2gGHbz9eDd0m5tgh\r
+ /+ht6Ah+8yYetXILk7G4TvmbSRqG8lNjLEzrfMvMzfear+a68m6RbTafwJJ1+ajMSg\r
+ 354pJT8cvrTPupkaa49CP9qz2h2yHSQY14LbZVtmti3B/xIhp7ctt7MIHbkFfp8K7b\r
+ 1mG24aKL0ysWg==\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.20\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <https://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch/>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <https://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Wed, 24 Feb 2016 17:23:44 -0000\r
+\r
+\r
+--Djp5PRGHu2Cmyd8M\r
+Content-Type: text/plain; charset=utf-8\r
+Content-Disposition: inline\r
+Content-Transfer-Encoding: quoted-printable\r
+\r
+On Wed, Feb 17, 2016 at 09:34:29AM -0400, David Bremner wrote:\r
+> Daniel Kahn Gillmor writes:\r
+> > That said, RFC 2047 suggests that its encodings are only relevant\r
+> > in places where a "text" token would be used.  Message-ID (and\r
+> > References and In-Reply-To) are intended to only contain\r
+> > dot-atom-text tokens.  So probably it would be more correct to\r
+> > avoid applying it to these specific fields.\r
+> >\r
+> > i dunno that it's a big deal though, given the analysis above.\r
+>\r
+> I guess there are two separate issues. One is the (mildly bogus)\r
+> application of RFC2047 decoding to message-ids. The other is\r
+> the coercion into utf8 from whatever wacky 8bit encoding some\r
+> creative person might use in a message-id.\r
+\r
+It looks like there's already an =E2=80=9Cimplicit encodings are complicate=\r
+d=E2=80=9D\r
+RFC discussing this issue [1].  RFC 6532 overrides (among other\r
+things) the atext behind message-id [2,3] for message/global messages.\r
+Other related RFCs cover internationalized domain names [4] and\r
+internationalized email addresses [5].  I think we should:\r
+\r
+* Store message IDs as NFKC UTF-8 in notmuch (do we already do this?).\r
+* For message/global messages:\r
+  * Convert headers to Unicode using UTF-8 (per RFC 6532).\r
+* For non-message/global messages:\r
+  * Ignore any RFC 2047 =3D? encoding or RFC 5890 xn-- encoding that may\r
+    be present.\r
+  * Convert to Unicode by percent-encoding [6] (e.g. =E2=80=98=C3=BC%=E2=80=\r
+=99 represented\r
+    as the three UTF-8 bytes =E2=80=98\xc3\xbc\x25=E2=80=99 would be repres=\r
+ented by\r
+    the Unicode =E2=80=98%C3%BC%25=E2=80=99).\r
+\r
+Cheers,\r
+Trevor\r
+\r
+[1]: https://tools.ietf.org/html/rfc6055\r
+[2]: https://tools.ietf.org/html/rfc5322#section-3.6.4\r
+[3]: https://tools.ietf.org/html/rfc5322#section-3.2.3\r
+[4]: https://tools.ietf.org/html/rfc5890\r
+[5]: https://tools.ietf.org/html/rfc6530\r
+[6]: https://tools.ietf.org/html/rfc3986#section-2\r
+[7]: https://tools.ietf.org/html/rfc2606#section-2\r
+\r
+--=20\r
+This email may be signed or encrypted with GnuPG (http://www.gnupg.org).\r
+For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy\r
+\r
+--Djp5PRGHu2Cmyd8M\r
+Content-Type: application/pgp-signature; name="signature.asc"\r
+Content-Description: OpenPGP digital signature\r
+\r
+-----BEGIN PGP SIGNATURE-----\r
+Version: GnuPG v2\r
+\r
+iQIcBAEBAgAGBQJWzeUpAAoJEAPqygegUbGsJ9AP/Rac5YlnUmrsnWtyftmNbj4f\r
+PDZyTHxDptFBK+brg39phzkC3wjrwuAiodP/l7j5LT0EqTywYY0pANKwf2oMhu0j\r
+iC4gxY321U3JrlbP/L/YjJKqbGIoT5f+X65GDklQ1K8bvGjbOEbnTF2hFzzNs3rh\r
+60fqreZ/sSUVw39c3TyBL/FMTu0SFe2gqPqZl0o2IqSK/MClxwzQxQgmDXVPHZOP\r
+9M+D92yvDAQ2Eoxvdj5Yv6k0CPNN3zXZOEpLrjwS6gN9VrYQEFwPzIrKcU9UBOM1\r
+b0WAE/E7C1KNnb5WbBnGljQl8Pu2A0r//ER3j8CMj2/9+Ll0d4iSupr9erGrCG4m\r
+nnkN691yQqX4wlYtIl2+KCySOLhuO8BZRBB6QS3aIb6Bbke380I7Ajua/Wym+LVY\r
+9sWH/HHaQPNWJizbAjRn4dYFcVfz8IhO6UtqDa881bXvlCWKmCsJP2fQQpgAAjza\r
+AbUn7WJgGzGANi8AJ1j7OEO4XdTi6sXyg2oVKF0eMVL8InJvXLRL7JTWRxBnnEuh\r
+IeHsGsKRyQqpy+IcbhatP6ikJCwPYUtRyiaVabGsXWVs8rzXjyMOCO48e4MQNCLs\r
+BMm7xazRrskbqM/XvcespGl5SKhwPnoXW2fmDa1WyzYPQdUz8X4eCRD10WM4w1Vf\r
+dEv7ZsBvXaAA3d/hoTqC\r
+=1gr7\r
+-----END PGP SIGNATURE-----\r
+\r
+--Djp5PRGHu2Cmyd8M--\r
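
The conversion rules proposed in the message above could be sketched roughly as follows. This is only an illustration under stated assumptions, not notmuch code: the names `percent_encode_bytes` and `message_id_to_unicode` and the `is_message_global` flag are hypothetical, and leaving printable ASCII other than `%` unescaped is an assumption, since the message only shows that non-ASCII bytes and `%` itself get escaped.

```python
import unicodedata


def percent_encode_bytes(raw: bytes) -> str:
    """Percent-encode (RFC 3986 style) every byte that is not printable
    ASCII, plus '%' itself, so the mapping stays reversible.

    Assumption: other printable ASCII bytes (0x21-0x7E) pass through as-is.
    """
    out = []
    for byte in raw:
        if 0x21 <= byte <= 0x7E and byte != 0x25:
            out.append(chr(byte))
        else:
            out.append("%{:02X}".format(byte))
    return "".join(out)


def message_id_to_unicode(raw: bytes, is_message_global: bool) -> str:
    """Turn raw message-id bytes into the string to store.

    Hypothetical helper: 'is_message_global' stands in for however the
    caller detects a message/global (RFC 6532) message.
    """
    if is_message_global:
        # RFC 6532 headers are UTF-8, so decode them directly.
        text = raw.decode("utf-8")
    else:
        # No RFC 2047 or xn-- decoding is attempted; such sequences are
        # kept literally and only unsafe bytes are escaped.
        text = percent_encode_bytes(raw)
    # Store in NFKC form, as the first bullet of the proposal suggests.
    return unicodedata.normalize("NFKC", text)
```

With these assumptions, the example from the message (the three bytes `\xc3\xbc\x25` in a non-message/global message) comes out as `%C3%BC%25`, while an RFC 2047 style token such as `=?utf-8?q?x?=` is left untouched.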