1 Return-Path: <wking@tremily.us>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by arlo.cworth.org (Postfix) with ESMTP id E61966DE1AB3
\r
6 for <notmuch@notmuchmail.org>; Sun, 14 Feb 2016 14:33:58 -0800 (PST)
\r
7 X-Virus-Scanned: Debian amavisd-new at cworth.org
\r
11 X-Spam-Status: No, score=0.007 tagged_above=-999 required=5 tests=[AWL=0.108,
\r
12 DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
\r
13 RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=disabled
\r
14 Received: from arlo.cworth.org ([127.0.0.1])
\r
15 by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
\r
16 with ESMTP id tfctEmjFlFY3 for <notmuch@notmuchmail.org>;
\r
17 Sun, 14 Feb 2016 14:33:55 -0800 (PST)
\r
18 Received: from resqmta-po-05v.sys.comcast.net (resqmta-po-05v.sys.comcast.net
\r
20 by arlo.cworth.org (Postfix) with ESMTPS id 9FA766DE1A2F
\r
21 for <notmuch@notmuchmail.org>; Sun, 14 Feb 2016 14:33:55 -0800 (PST)
\r
22 Received: from resomta-po-06v.sys.comcast.net ([96.114.154.230])
\r
23 by resqmta-po-05v.sys.comcast.net with comcast
\r
24 id JNZj1s0054yXVJQ01NZtMf; Sun, 14 Feb 2016 22:33:53 +0000
\r
25 Received: from mail.tremily.us ([73.221.72.168])
\r
26 by resomta-po-06v.sys.comcast.net with comcast
\r
27 id JNZs1s00G3dr3C901NZsTU; Sun, 14 Feb 2016 22:33:53 +0000
\r
28 Received: by mail.tremily.us (Postfix, from userid 1000)
\r
29 id E81831BB253C; Sun, 14 Feb 2016 14:33:51 -0800 (PST)
\r
30 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tremily.us; s=odin;
\r
31 t=1455489232; bh=uWzmLoX+dWgJdQCzrLX0ENn+nZLZ4e4b+5NVJPXu3VA=;
\r
32 h=Date:From:To:Cc:Subject:References:In-Reply-To;
\r
33 b=N7yWOTsH6zgbQZE/vTtwsZdERLq+QqLGSa68m93pijeTSPZKjBORVUUHcCFl1sAot
\r
34 5hU2g0uMwtc1D8AoMB4Dwc54n5GlxOH6X6lwrp8cIFrGID9R3+Oj29j60bsnfDbrI5
\r
35 tzDFo0cws1Al6C+l+nbcOCpqpQ2OLyfizWP5NX/4=
\r
36 Date: Sun, 14 Feb 2016 14:33:51 -0800
\r
37 From: "W. Trevor King" <wking@tremily.us>
\r
38 To: David Bremner <david@tethera.net>
\r
39 Cc: notmuch@notmuchmail.org
\r
40 Subject: Re: problems with nmbug and empty prefix (UnicodeWarning and broken
\r
42 Message-ID: <20160214223351.GE4265@odin.tremily.us>
\r
43 References: <87oabko293.fsf@zancas.localnet>
\r
44 <20160213223357.GC4265@odin.tremily.us>
\r
45 <87ziv4813v.fsf@zancas.localnet>
\r
46 <20160214063132.GD4265@odin.tremily.us>
\r
47 <87twlbv5vj.fsf@zancas.localnet>
\r
49 Content-Type: multipart/signed; micalg=pgp-sha1;
\r
50 protocol="application/pgp-signature"; boundary="PHDeMLmKefytWajp"
\r
51 Content-Disposition: inline
\r
52 In-Reply-To: <87twlbv5vj.fsf@zancas.localnet>
\r
53 OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;
\r
54 url=http://tremily.us/pubkey.txt
\r
55 User-Agent: Mutt/1.5.23 (2014-03-12)
\r
56 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;
\r
57 s=q20140121; t=1455489233;
\r
58 bh=AmI96bcDE9KQbbJ1Spxj1vrVdkaZ/rAdiaT/BDouZKg=;
\r
59 h=Received:Received:Received:Date:From:To:Subject:Message-ID:
\r
60 MIME-Version:Content-Type;
\r
61 b=qCdo3bB4sHo0UkCWw8A/iMXFuEz7wbLknaL0b2Qz+6sGZfElCsZnbXGR9u60c226J
\r
62 ZFuRQMP10G1hFo8lqx8WmKCPEoz+BhbGDT3WNfotg4NiWmNaBx7YHxIWpjJeBnPpBn
\r
63 MSQ6Q2C9RwXW+bpIcDveWKX25iGOoLJpfY11A/8WTzyBkJFTwborueQz+C74vtuIiz
\r
64 qKhJs7VO3mp75sWtdDIKD9MvthX2jBa/CGUgVx5bTbrr4Jnjn79qERP5CNzw8aS0dm
\r
65 E0KpVziITyezKeXL0b6Mj54ADCSN5WyBZDcqlfUYV8X7OyAeh/w4BNQucI7ag/d6Tt
\r
67 X-BeenThere: notmuch@notmuchmail.org
\r
68 X-Mailman-Version: 2.1.20
\r
70 List-Id: "Use and development of the notmuch mail system."
\r
71 <notmuch.notmuchmail.org>
\r
72 List-Unsubscribe: <https://notmuchmail.org/mailman/options/notmuch>,
\r
73 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
74 List-Archive: <http://notmuchmail.org/pipermail/notmuch/>
\r
75 List-Post: <mailto:notmuch@notmuchmail.org>
\r
76 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
77 List-Subscribe: <https://notmuchmail.org/mailman/listinfo/notmuch>,
\r
78 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
79 X-List-Received-Date: Sun, 14 Feb 2016 22:33:59 -0000
\r
83 Content-Type: text/plain; charset=utf-8
\r
84 Content-Disposition: inline
\r
85 Content-Transfer-Encoding: quoted-printable
\r
87 On Sun, Feb 14, 2016 at 08:22:24AM -0400, David Bremner wrote:
\r
88 > W. Trevor King writes:
\r
89 > > for tag in tags:
\r
90 > > _LOG.debug('building a quoted path for {!r} / {!r}'.format(id, ta=
\r
92 > > path = 'tags/{id}/{tag}'.format(
\r
93 > > id=_hex_quote(string=id), tag=_hex_quote(string=tag))
\r
94 > > yield '{mode} {hash}\t{path}\n'.format(mode=mode, hash=hash, =
\r
98 > I think the problem is not a bad tag, but a bad message-id. The last
\r
99 > line of output before the UnicodeWarning and the broken pipe is
\r
101 > building a quoted path for u'D1B4DEBCAFFC4A05A4D4349A6EC5C9D8@\xd1\xe5\xf=
\r
102 0\xe3\xe5\xe9-\xcf\xca' / u'unread'
\r
104 $ ln -s nmbug nmbug.py
\r
105 $ python2 -W error -c "import nmbug; nmbug._hex_quote(u'D1B4DEBCAFFC4A05A=
\r
106 4D4349A6EC5C9D8@\xd1\xe5\xf0\xe3\xe5\xe9-\xcf\xca')"
\r
107 Traceback (most recent call last):
\r
108 File "<string>", line 1, in <module>
\r
109 File "nmbug.py", line 106, in _hex_quote
\r
110 uppercase_escapes = _quote(string, safe)
\r
111 File "/usr/lib64/python2.7/urllib.py", line 1303, in quote
\r
112 return ''.join(map(quoter, s))
\r
113 UnicodeWarning: Unicode equal comparison failed to convert both arguments=
\r
114 to Unicode - interpreting them as being unequal
\r
116 The problem seems to be having Unicode characters in either quote argument:
\r
118 $ python2 -W error -c "import urllib; urllib.quote(u'D1B4DEBCAFFC4A05A4D4=
\r
119 349A6EC5C9D8@\xd1\xe5\xf0\xe3\xe5\xe9-\xcf\xca')"
\r
121 UnicodeWarning: Unicode equal comparison failed to convert both arguments=
\r
122 to Unicode - interpreting them as being unequal
\r
123 $ python2 -W error -c "import urllib; urllib.quote(u'D1B4DEBCAFFC4A05A4D4=
\r
124 349A6EC5C9D8@\xd1\xe5\xf0\xe3\xe5\xe9-\xcf\xca', u'+@=:,')"
\r
126 UnicodeWarning: Unicode equal comparison failed to convert both arguments=
\r
127 to Unicode - interpreting them as being unequal
\r
128 $ python2 -W error -c "import urllib; urllib.quote(u'D1B4DEBCAFFC4A05A4D4=
\r
129 349A6EC5C9D8@\xd1\xe5\xf0\xe3\xe5\xe9-\xcf\xca'.encode('utf-8'), u'+@=:,'=
\r
132 UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 33: =
\r
133 ordinal not in range(128)
\r
134 $ python2 -W error -c "import urllib; print(urllib.quote(u'D1B4DEBCAFFC4A=
\r
135 05A4D4349A6EC5C9D8@\xd1\xe5\xf0\xe3\xe5\xe9-\xcf\xca'.encode('utf-8'), u'+@=
\r
136 =:,'.encode('utf-8')))"
\r
137 D1B4DEBCAFFC4A05A4D4349A6EC5C9D8@%C3%91%C3%A5%C3%B0%C3%A3%C3%A5%C3%A9-%C3=
\r
140 Related Python issues [1,2,3,4,5]. [2] led to the currently working
\r
141 Python 3 implementation, which encodes to UTF-8 by default and has an
\r
142 ‘encoding’ option [6]. There's some useful background in [=
\r
144 compatibility with Python 3, I suggest patching _hex_quote to take an
\r
145 encoding option, defaulting to UTF-8, and encoding both strings that
\r
146 are passed to _quote. We should probably raise a ValueError if the
\r
147 length of the encoded safe characters doesn't match the length of the
\r
148 Unicode safe characters, because the caller will probably not expect
\r
149 the byte-level quoting that would cause. Python 3 covers that by
\r
150 restricting the safe characters to ASCII [6], although passing
\r
151 non-ASCII characters with safe doesn't seem to raise an exception:
\r
153 $ python3 -c "from urllib.parse import quote; print(quote('\u0091', '\u00=
\r
156 $ python3 -c "from urllib.parse import quote; print(quote('\u203b', '\u20=
\r
160 Anyhow, I'll file a patch adding UTF-8 encoding so Python 2 works like
\r
166 [1]: http://bugs.python.org/issue2637
\r
167 [2]: http://bugs.python.org/issue3300
\r
168 [3]: http://bugs.python.org/issue22231
\r
169 [4]: http://bugs.python.org/issue23885
\r
170 [5]: http://bugs.python.org/issue1712522
\r
171 [6]: https://docs.python.org/3/library/urllib.parse.html#urllib.parse.quote
\r
172 [7]: https://mail.python.org/pipermail/python-dev/2006-July/067335.html
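
The patch suggested above could look roughly like the following. This
is only a sketch, not the actual nmbug change: the name
hex_quote_sketch and the error message are made up for illustration,
the default safe characters are taken from the commands earlier in
this message, and anything the real _hex_quote does after calling
_quote is left out.

```python
try:  # Python 3
    from urllib.parse import quote as _quote
except ImportError:  # Python 2
    from urllib import quote as _quote


def hex_quote_sketch(string, safe=u'+@=:,', encoding='utf-8'):
    """Percent-quote `string` after encoding it and `safe` to bytes.

    Hypothetical stand-in for a patched _hex_quote; only the encoding
    step discussed in this thread is shown.
    """
    encoded_string = string.encode(encoding)
    encoded_safe = safe.encode(encoding)
    # A safe character that needs more than one byte would end up being
    # matched byte by byte, which the caller is unlikely to expect.
    if len(encoded_safe) != len(safe):
        raise ValueError(
            'safe characters must encode to one byte each: {!r}'.format(safe))
    # Both arguments are now byte strings, so Python 2 has nothing left
    # to compare between unicode and str objects.
    return _quote(encoded_string, encoded_safe)
```

With the message ID from the debug output, hex_quote_sketch returns a
string that starts with the same D1B4...@%C3%91%C3%A5 escapes as the
last python2 command above, while a safe argument such as u'\u203b'
raises ValueError instead of being quoted byte by byte.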
\r
175 This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
\r
176 For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
\r
179 Content-Type: application/pgp-signature; name="signature.asc"
\r
180 Content-Description: OpenPGP digital signature
\r
182 -----BEGIN PGP SIGNATURE-----
\r
185 iQIcBAEBAgAGBQJWwQDNAAoJEAPqygegUbGspqsP/ikbYEb5/7+y7uncDLolCWGN
\r
186 25t1gJdZNdAhqmHaUbSqRHKc8yOUkr/MmmNG2RFcKZQaSwwweg3pELAyFiuYljHl
\r
187 n+0da6aizwsguDubMwJes288FkOjSvVeVkrAXQgtMmDCYJXB5/FvHgU9cBBJAqvs
\r
188 FQFZhhrwt4RX9m851ksaomCxJvNW8/zHBQNoVVAEZI6jp2NRYmBriC1xOUlLZ8iF
\r
189 AAkFMisnFnosFH8xbLEN/7qVXRov8LQFF9w7dHqAxFcZu9ML6Byl44Ha2LTfUC1F
\r
190 SNQ+uSD0NaGDhpTYSMG1OE/ODdlQKs8ah5erzq6D1E1CdxyMSDRUYvkFUmGOzd3b
\r
191 v0FfTzLwE9SoEtzu7CP2TvPGyGmqfIaF1y7HwAKCfgl+wDM5ZvO2CtZXcfsqCTOv
\r
192 QySwNZT1aZse6zX3x0utSEyqRoLtqD5DUXFRPr4IiCnhU80/Jdvy+H1OyJmSW/GV
\r
193 1JUI7tu4AuAgXVuOXGDhkSvCyklFKiJB9Tau4giXD2/l318wlqoHYDPl/LpRFl7t
\r
194 jm5GgPhJ9gxYlGdTunWZRVAV97GsRjEGdERYbL86yGBsj5FayM6PG517/b8ZrJdm
\r
195 TN5onwoRpt2YFT41ORAgJa7yC6khHnPbYKnpEZ9sjyUQLg0AXdUKJoevEi2V9PMF
\r
196 afi06G05r5RwO2ocMNvI
\r
198 -----END PGP SIGNATURE-----
\r
200 --PHDeMLmKefytWajp--
\r