--- /dev/null
+Return-Path: <david@tethera.net>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by arlo.cworth.org (Postfix) with ESMTP id 440C26DE141B\r
+ for <notmuch@notmuchmail.org>; Tue, 16 Feb 2016 05:04:12 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at cworth.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.307\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.307 tagged_above=-999 required=5 tests=[AWL=0.244,\r
+ RP_MATCHES_RCVD=-0.55, SPF_PASS=-0.001] autolearn=disabled\r
+Received: from arlo.cworth.org ([127.0.0.1])\r
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id s3vrzFIj2p0b for <notmuch@notmuchmail.org>;\r
+ Tue, 16 Feb 2016 05:04:10 -0800 (PST)\r
+Received: from fethera.tethera.net (fethera.tethera.net [198.245.60.197])\r
+ by arlo.cworth.org (Postfix) with ESMTPS id 1669C6DE02C9\r
+ for <notmuch@notmuchmail.org>; Tue, 16 Feb 2016 05:04:09 -0800 (PST)\r
+Received: from remotemail by fethera.tethera.net with local (Exim 4.84)\r
+ (envelope-from <david@tethera.net>)\r
+ id 1aVfHq-0002cb-Ri; Tue, 16 Feb 2016 08:03:26 -0500\r
+Received: (nullmailer pid 25980 invoked by uid 1000);\r
+ Tue, 16 Feb 2016 13:04:07 -0000\r
+From: David Bremner <david@tethera.net>\r
+To: "W. Trevor King" <wking@tremily.us>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH] nmbug: Allow Unicode tags and IDs in Python 2\r
+In-Reply-To:\r
+ <e287050a10ce1d2120db996d2d200f610370a44e.1455513965.git.wking@tremily.us>\r
+References:\r
+ <e287050a10ce1d2120db996d2d200f610370a44e.1455513965.git.wking@tremily.us>\r
+User-Agent: Notmuch/0.21+26~g9404723 (http://notmuchmail.org) Emacs/24.5.1\r
+ (x86_64-pc-linux-gnu)\r
+Date: Tue, 16 Feb 2016 09:04:07 -0400\r
+Message-ID: <87lh6kvmbc.fsf@zancas.localnet>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=utf-8\r
+Content-Transfer-Encoding: quoted-printable\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.20\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <https://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch/>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <https://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Tue, 16 Feb 2016 13:04:12 -0000\r
+\r
+"W. Trevor King" <wking@tremily.us> writes:\r
+\r
+> Avoid a UnicodeWarning and broken pipe on 'nmbug commit' in Python 2\r
+> when a tag or message ID contains non-ASCII characters [1].\r
+>\r
+> There are a number of Python bugs associated with this behavior\r
+> [2,3,4,5,6]. There's also some useful background in [8]. [3] led to\r
+> the currently working Python 3 implementation, which encodes to UTF-8\r
+> by default and has 'encoding' and 'errors' arguments [7]. This commit\r
+> follows that approach in a way that's compatible with both Python 2\r
+> and Python 3. Coercing to UTF-8 (regardless of locale) gives us\r
+> consistent tag IDs for sharing between users.\r
+\r
+I'm not sure what "tag IDs" are. Do you mean message-ids here, or "tags\r
+and IDs"?\r
+\r
+At first I thought there might be problems with non-UTF-8 message-ids,\r
+but that turns out not to be the case [1]. It seems like it would take\r
+a fairly heroic effort to get non-UTF-8 tags into the database (perhaps\r
+by calling the library interface with bad strings?) so we can probably\r
+ignore this case. It might be good to document the limitation though,\r
+since AFAIK, dump and restore can roundtrip any old crap.\r
+\r
+\r
+>\r
+> The 'isnumeric' check identifies Unicode instances in both Python 2\r
+> [9] and Python 3 [10].\r
+>\r
+\r
+I still haven't really tried to understand this part, but it probably\r
+deserves inline documentation.\r
+\r
+> ---\r
+> I haven't checked the other commands for issues with Unicode IDs or\r
+> tags. It's possible that in addition to this explicit encoding to\r
+> UTF-8, we'll also want explicit decoding from UTF-8 when reading from\r
+> Git trees (for 'nmbug checkout' and 'nmbug status').\r
+\r
+Yes, this seems to be a problem. With the patch applied I can commit,\r
+but the same UTF-8 message-id still causes problems in 'nmbug status':\r
+\r
+bremner@zancas:~/software/upstream/notmuch$ ./devel/nmbug/nmbug status\r
+U D1B4DEBCAFFC4A05A4D4349A6EC5C9D8@=C3=83=C3=82=C3=83=C3=82=C2=A5=C3=83=C3=\r
+=82=C2=B0=C3=83=C3=82=C2=A3=C3=83=C3=82=C2=A5=C3=83=C3=82=C2=A9-=C3=83=C3=\r
+=82=C3=83=C3=82 unread\r
+A D1B4DEBCAFFC4A05A4D4349A6EC5C9D8@=C3=83=C3=83=C2=A5=C3=83=C2=B0=C3=83=C2=\r
+=A3=C3=83=C2=A5=C3=83=C2=A9-=C3=83=C3=83 unread\r
+\r
+bremner@zancas:~/software/upstream/notmuch$ delve -a -1 ~/Maildir/.notmuch/=\r
+xapian | grep D1B4DEBCAFFC4A05A4D4349A6EC5C9D8\r
+QD1B4DEBCAFFC4A05A4D4349A6EC5C9D8@=C3=91=C3=A5=C3=B0=C3=A3=C3=A5=C3=A9-=C3=\r
+=8F=C3=8A\r
+\r
+[1]: id:87si0svnim.fsf@zancas.localnet\r