From: "W. Trevor King" <wking@tremily.us>
To: notmuch@notmuchmail.org
Subject: [PATCH] nmbug: Allow Unicode tags and IDs in Python 2
Date: Sun, 14 Feb 2016 21:30:11 -0800

Avoid a UnicodeWarning and broken pipe on 'nmbug commit' in Python 2
when a tag or message ID contains non-ASCII characters [1].

There are a number of Python bugs associated with this behavior
[2,3,4,5,6].  There's also some useful background in [8].  [3] led to
the currently working Python 3 implementation, which encodes to UTF-8
by default and has 'encoding' and 'errors' arguments [7].  This commit
follows that approach in a way that's compatible with both Python 2
and Python 3.  Coercing to UTF-8 (regardless of locale) gives us
consistent tag IDs for sharing between users.
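
As a rough illustration of the approach described above, here is a
minimal Python 3 sketch that encodes text to UTF-8 before quoting and
then lowercases the hex escapes.  The names hex_quote and HEX_ESCAPE
are hypothetical stand-ins for this example (they are not the script's
own _hex_quote and _HEX_ESCAPE_REGEX), and the Python 2 import path is
left out.

```python
import re
from urllib.parse import quote

# Hypothetical stand-in for the script's escape-matching regex.
HEX_ESCAPE = re.compile('%[0-9A-F]{2}')


def hex_quote(string, safe='+@=:,', encoding='utf-8', errors='strict'):
    """Percent-encode `string` as bytes, using lowercase hex digits."""
    if isinstance(string, str):
        # Encode explicitly so the result does not depend on the locale.
        string = string.encode(encoding, errors)
    uppercase_escapes = quote(string, safe)
    return HEX_ESCAPE.sub(
        lambda match: match.group(0).lower(),
        uppercase_escapes)


print(hex_quote('abc def'))
print(hex_quote('caf\u00e9'))
```

With UTF-8 the non-ASCII character is written as two escaped bytes, so
every user should end up with the same quoted name for the same tag.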

The 'isnumeric' check identifies Unicode instances in both Python 2
[9] and Python 3 [10].
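
A small sketch of that check, shown for Python 3 only.  The helper
name is_unicode_text is hypothetical and exists just for this example;
the patch itself uses hasattr() inline.

```python
def is_unicode_text(value):
    """Return True for text strings, which carry an isnumeric method."""
    # Python 3 str (and Python 2 unicode) define isnumeric; byte
    # strings do not, so the attribute works as a duck-typing test.
    return hasattr(value, 'isnumeric')


print(is_unicode_text('tag'))
print(is_unicode_text(b'tag'))
```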

[1]: id:87twlbv5vj.fsf@zancas.localnet
     http://thread.gmane.org/gmane.mail.notmuch.general/21855/focus=21862
     Subject: Re: problems with nmbug and empty prefix (UnicodeWarning and broken pipe)
     Date: Sun, 14 Feb 2016 08:22:24 -0400
[2]: http://bugs.python.org/issue2637
[3]: http://bugs.python.org/issue3300
[4]: http://bugs.python.org/issue22231
[5]: http://bugs.python.org/issue23885
[6]: http://bugs.python.org/issue1712522
[7]: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote
[8]: https://mail.python.org/pipermail/python-dev/2006-July/067335.html
[9]: https://docs.python.org/2/library/stdtypes.html#unicode.isnumeric
[10]: https://docs.python.org/3/library/stdtypes.html#str.isnumeric

I haven't checked the other commands for issues with Unicode IDs or
tags.  It's possible that in addition to this explicit encoding to
UTF-8, we'll also want explicit decoding from UTF-8 when reading from
Git trees (for 'nmbug checkout' and 'nmbug status').
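
If that decoding turns out to be needed, it might look roughly like
the Python 3 sketch below.  This is only an assumption about one
possible shape: hex_unquote is a hypothetical helper that is not part
of this patch, and I haven't tried it against 'nmbug checkout' or
'nmbug status'.

```python
from urllib.parse import unquote_to_bytes


def hex_unquote(quoted, encoding='utf-8', errors='strict'):
    """Reverse percent-encoding, then decode the bytes as text."""
    # unquote_to_bytes accepts both uppercase and lowercase hex digits.
    return unquote_to_bytes(quoted).decode(encoding, errors)


print(hex_unquote('abc%20def'))
print(hex_unquote('caf%c3%a9'))
```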

 devel/nmbug/nmbug | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/devel/nmbug/nmbug b/devel/nmbug/nmbug
index 81f582c..284d374 100755
--- a/devel/nmbug/nmbug
+++ b/devel/nmbug/nmbug
 #!/usr/bin/env python
-# Copyright (c) 2011-2014 David Bremner <david@tethera.net>
+# Copyright (c) 2011-2016 David Bremner <david@tethera.net>
 # W. Trevor King <wking@tremily.us>
 # This program is free software: you can redistribute it and/or modify
@@ -95,7 +95,7 @@ except AttributeError: # Python < 3.2
     _tempfile.TemporaryDirectory = _TemporaryDirectory
-def _hex_quote(string, safe='+@=:,'):
+def _hex_quote(string, safe='+@=:,', encoding='utf-8', errors='strict'):
     quote('abc def') -> 'abc%20def'.
@@ -103,6 +103,15 @@ def _hex_quote(string, safe='+@=:,'):
     addition to letters, digits, and '_.-') and lowercase hex digits
     (e.g. '%3a' instead of '%3A').
+    if hasattr(string, 'isnumeric'):
+        string = string.encode(encoding, errors)
+    if hasattr(safe, 'isnumeric'):
+        safe_bytes = safe.encode(encoding, errors)
+        if len(safe_bytes) != len(safe):
+            raise ValueError(
+                'some safe characters are encoded as multiple bytes '
+                '({!r} -> {!r})'.format(safe, safe_bytes))
+        safe = safe_bytes
     uppercase_escapes = _quote(string, safe)
     return _HEX_ESCAPE_REGEX.sub(
         lambda match: match.group(0).lower(),