Re: [PATCH] nmbug: Allow Unicode tags and IDs in Python 2
author W. Trevor King <wking@tremily.us>
Tue, 16 Feb 2016 17:56:42 +0000 (09:56 -0800)
committer W. Trevor King <wking@tremily.us>
Sat, 20 Aug 2016 23:21:08 +0000 (16:21 -0700)
af/56082e5ca3436bfef99978ae28d1b8980e5765 [new file with mode: 0644]

diff --git a/af/56082e5ca3436bfef99978ae28d1b8980e5765 b/af/56082e5ca3436bfef99978ae28d1b8980e5765
new file mode 100644 (file)
index 0000000..68d45b7
--- /dev/null
@@ -0,0 +1,160 @@
+Return-Path: <wking@tremily.us>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by arlo.cworth.org (Postfix) with ESMTP id 60AEA6DE0FC5\r
+ for <notmuch@notmuchmail.org>; Tue, 16 Feb 2016 09:56:49 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at cworth.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 0.006\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=0.006 tagged_above=-999 required=5 tests=[AWL=0.107, \r
+ DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,\r
+ RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=disabled\r
+Received: from arlo.cworth.org ([127.0.0.1])\r
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id yVnHj204P_0S for <notmuch@notmuchmail.org>;\r
+ Tue, 16 Feb 2016 09:56:46 -0800 (PST)\r
+Received: from resqmta-po-07v.sys.comcast.net (resqmta-po-07v.sys.comcast.net\r
+ [96.114.154.166])\r
+ by arlo.cworth.org (Postfix) with ESMTPS id 493176DE0274\r
+ for <notmuch@notmuchmail.org>; Tue, 16 Feb 2016 09:56:45 -0800 (PST)\r
+Received: from resomta-po-02v.sys.comcast.net ([96.114.154.226])\r
+ by resqmta-po-07v.sys.comcast.net with comcast\r
+ id K5w71s0034tLnxL015wk9g; Tue, 16 Feb 2016 17:56:44 +0000\r
+Received: from mail.tremily.us ([73.221.72.168])\r
+ by resomta-po-02v.sys.comcast.net with comcast\r
+ id K5wj1s0013dr3C9015wj9s; Tue, 16 Feb 2016 17:56:43 +0000\r
+Received: by mail.tremily.us (Postfix, from userid 1000)\r
+ id 584F31BB68B5; Tue, 16 Feb 2016 09:56:42 -0800 (PST)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tremily.us; s=odin;\r
+ t=1455645402; bh=TEP8Sw9j07KEJgNbC1J165nSutgHUfgYtmUFJc8IFxU=;\r
+ h=Date:From:To:Cc:Subject:References:In-Reply-To;\r
+ b=rj/Iby0ky+nNCncfscnlHBxaOFRhavBB80gpk2yzPCMl9rkPWU0bd4dkxmjk9SjZ+\r
+ saRZ5mkwMrcGpUQxpDdXFWQPwrlAE4+B44OyNT+GX8+/VnCQCNsTR0ZRF+D0Td9dE1\r
+ nidhiz4wb5I4JyUsWRi7bhIjGTUKK8WamHGRsacM=\r
+Date: Tue, 16 Feb 2016 09:56:42 -0800\r
+From: "W. Trevor King" <wking@tremily.us>\r
+To: David Bremner <david@tethera.net>\r
+Cc: notmuch@notmuchmail.org\r
+Subject: Re: [PATCH] nmbug: Allow Unicode tags and IDs in Python 2\r
+Message-ID: <20160216175641.GN4265@odin.tremily.us>\r
+References:\r
+ <e287050a10ce1d2120db996d2d200f610370a44e.1455513965.git.wking@tremily.us>\r
+ <87lh6kvmbc.fsf@zancas.localnet>\r
+MIME-Version: 1.0\r
+Content-Type: multipart/signed; micalg=pgp-sha1;\r
+ protocol="application/pgp-signature"; boundary="b1CVx77D595wdcW8"\r
+Content-Disposition: inline\r
+In-Reply-To: <87lh6kvmbc.fsf@zancas.localnet>\r
+OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;\r
+ url=http://tremily.us/pubkey.txt\r
+User-Agent: Mutt/1.5.23 (2014-03-12)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;\r
+ s=q20140121; t=1455645404;\r
+ bh=GNHISTXPs64gA/u2kwNoMXs4UYForBVSixUbFcETY60=;\r
+ h=Received:Received:Received:Date:From:To:Subject:Message-ID:\r
+ MIME-Version:Content-Type;\r
+ b=GNWDE72k34WFrwBeh149DZ/SYogshv7ydzZ0DM0gC9XO72mBtbwWl0QQK7vLtKLvx\r
+ GUiVzdGDCYuMgEkOuKqab6Qz1r2T0psLO8hDZqpAJNr1qeLr3Xpkrr7hZRbeJp03+O\r
+ 2eI28dzwpWdXmJqBI1dppjUD5lUAiH0PAHwgPjBOHgROTuhxYf6CKFRN+jPjZ1g82F\r
+ 6yMgtfOkeaBuJMJdiCkCYYev1WKu3dTzyXL4vsrXY4vB5jVnMVDlToGdv+S/rCEwZt\r
+ BKrXw1N+gtwQaJibVmj+5/4bkY4tReXnAbzX/KK7SnAzWdpxoqC4WWO1AxjizmBx0B\r
+ yeiYueV7krl4g==\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.20\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <https://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch/>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <https://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Tue, 16 Feb 2016 17:56:49 -0000\r
+\r
+\r
+--b1CVx77D595wdcW8\r
+Content-Type: text/plain; charset=utf-8\r
+Content-Disposition: inline\r
+Content-Transfer-Encoding: quoted-printable\r
+\r
+On Tue, Feb 16, 2016 at 09:04:07AM -0400, David Bremner wrote:\r
+> W. Trevor King writes:\r
+> > Coercing to UTF-8 (regardless of locale) gives us consistent tag\r
+> > IDs for sharing between users.\r
+>=20\r
+> I'm not sure what "tag IDs" are. Do you mean message-ids here? or "tags\r
+> and IDs"?\r
+\r
+Yeah.  I'll fix that in v2.\r
+\r
+> At first I thought there might be problems with non-utf8 message-ids,\r
+> but that turns out not to be the case [1].  It seems like it would take\r
+> a fairly heroic effort to get non-UTF8 tags into the database (perhaps\r
+> by calling the library interface with bad strings?) so we can probably\r
+> ignore this case. It might be good to document the limitation though,\r
+> since AFAIK, dump and restore can roundtrip any old crap.\r
+\r
+How about in a NEWS entry in v2 of this series, and then echoing that\r
+NEWS entry in the notmuch-dtags (or whatever) man page once I work up\r
+that series?\r
+\r
+> > The 'isnumeric' check identifies Unicode instances in both Python\r
+> > 2 [9] and Python 3 [10].\r
+>=20\r
+> I still haven't really tried to understand this part, but probably\r
+> it deserves inline documentation.\r
+\r
+It's just =E2=80=9Cif you have a Unicode instance (str in Python 3, unicode=\r
+ in\r
+Python 2), convert it to bytes (bytes in Python 3, str in Python 2).\r
+Only Unicode instances will have an =E2=80=98isnumeric=E2=80=99 method, so =\r
+it's a\r
+convenient marker for switching that logic.  I'll add a =E2=80=9Cconvert fr=\r
+om\r
+Unicode if necessary=E2=80=9D comment to v2.\r
+\r
+> > I haven't checked the other commands for issues with Unicode IDs\r
+> > or tags.  It's possible that in addition to this explicit encoding\r
+> > to UTF-8, we'll also want explicit decoding from UTF-8 when\r
+> > reading from Git trees (for 'nmbug checkout' and 'nmbug status').\r
+>=20\r
+> Yes, this seems to be a problem, with the patch applied I can\r
+> commit, but the same utf-8 message-id causes problems.\r
+\r
+Ugh.  Thanks for checking.  I'll try to fix all the places where this\r
+would have an impact in v2 of this series.\r
+\r
+Cheers,\r
+Trevor\r
+\r
+--=20\r
+This email may be signed or encrypted with GnuPG (http://www.gnupg.org).\r
+For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy\r
+\r
+--b1CVx77D595wdcW8\r
+Content-Type: application/pgp-signature; name="signature.asc"\r
+Content-Description: OpenPGP digital signature\r
+\r
+-----BEGIN PGP SIGNATURE-----\r
+Version: GnuPG v2\r
+\r
+iQIcBAEBAgAGBQJWw2LXAAoJEAPqygegUbGsRC8P/2Ao8dkv4CzkN95Fr6k5t/yR\r
+3Y231DolowEHfS11uHE7JwD0XuQrM7K/Z8orm692Q8QaoNb6iAuf3KvtIoJKUWZB\r
+wJc4Fhv8MDNzeyNmCtLMGZw3u/Cm5OTL513c154qUtcrkO6WoQWknKQqzNLViamr\r
+5zWzD/w0tW8dZHJWWntOHyz73mbC0E7Ib4UoEUJ06BWJKCel1qv8TumtsDr0sh+e\r
+F7mNMyZwnE85LHOePBPkwNedtjOq4fQ9xfma5moN3rZU3owYXkJUXcG8TqCzm8X4\r
+kJaSUZv8B9ig9uyS7yBN9Gp7B4EMuKrI8k2PM/sDv9p/IncHsGZPCqf/skeoBHUg\r
+bnoget1HjpOiTjEau4gB+DPGrxEOCHbzt50USM12/6vIRVhQmWKZcV8kLA5jtA+V\r
+91dEagZjMWavxXUr07E3YP4dzo3PH/vsPCLA5aaJVpbiIEIq/xm/J7QHkVTro66E\r
+sCpag5SjFDk3lkN4cvmDBWRF/VzT58qbQ+NM1nMg4Ydfiu+mmXZSe907EnBweRQd\r
+ffzoQpN7rubP4QLpVrVAr/kHB6sXYNJOMSn7SS4Dul5bLQOwk7LeqZZEQjA6B8Cb\r
+MeSXwIfJuoY+rnSdeaTFjvtF5c6Ri85ptQMpPnShA3u66ym9k8gDOINQOXsr/Max\r
+Sdg9K0czco6hXUfIcZN5\r
+=bRRC\r
+-----END PGP SIGNATURE-----\r
+\r
+--b1CVx77D595wdcW8--\r
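
The conversion described in the reply above (encode Unicode instances to
UTF-8 bytes, using the presence of an 'isnumeric' method as the marker)
might look roughly like the following Python sketch.  The helper names
to_bytes and to_text are hypothetical and are not taken from nmbug.  The
decoding helper only illustrates the "explicit decoding from UTF-8" idea
raised for reading from Git trees; it is an assumption about how that
could be done, not the actual v2 change.

```python
# -*- coding: utf-8 -*-
# Minimal sketch, assuming UTF-8 is used regardless of locale.
# 'to_bytes' and 'to_text' are hypothetical names, not nmbug functions.


def to_bytes(value, encoding='utf-8'):
    """Return 'value' as bytes, converting from Unicode if necessary."""
    # Only Unicode instances (str in Python 3, unicode in Python 2)
    # have an 'isnumeric' method; byte strings do not.
    if hasattr(value, 'isnumeric'):
        return value.encode(encoding)
    return value


def to_text(value, encoding='utf-8'):
    """Return 'value' as Unicode text, decoding from bytes if necessary."""
    if hasattr(value, 'isnumeric'):
        return value  # already a Unicode instance
    return value.decode(encoding)


if __name__ == '__main__':
    tag = u'caf\xe9'  # a tag containing a non-ASCII character
    encoded = to_bytes(tag)
    # Byte strings pass through unchanged, so the helper is idempotent.
    assert to_bytes(encoded) == encoded
    # Decoding the encoded form gives back the original text.
    assert to_text(encoded) == tag
```

Because both helpers leave values of the target type untouched, they can
be applied at input and output boundaries without first checking which
Python version is running.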