1 Return-Path: <moritz@tarn-vedra.de>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by olra.theworths.org (Postfix) with ESMTP id 3EBA6431FBC
\r
6 for <notmuch@notmuchmail.org>; Tue, 12 Aug 2014 14:48:04 -0700 (PDT)
\r
7 X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
\r
11 X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
\r
12 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
\r
13 Received: from olra.theworths.org ([127.0.0.1])
\r
14 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
\r
15 with ESMTP id oejNArLt4CJB for <notmuch@notmuchmail.org>;
\r
16 Tue, 12 Aug 2014 14:47:53 -0700 (PDT)
\r
17 Received: from mail-we0-f169.google.com (mail-we0-f169.google.com
\r
18 [74.125.82.169]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
\r
19 (No client certificate requested)
\r
20 by olra.theworths.org (Postfix) with ESMTPS id A9C44431FAF
\r
21 for <notmuch@notmuchmail.org>; Tue, 12 Aug 2014 14:47:53 -0700 (PDT)
\r
22 Received: by mail-we0-f169.google.com with SMTP id u56so10625935wes.0
\r
23 for <notmuch@notmuchmail.org>; Tue, 12 Aug 2014 14:47:52 -0700 (PDT)
\r
24 X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
\r
25 d=1e100.net; s=20130820;
\r
26 h=x-gm-message-state:from:to:cc:subject:in-reply-to:references
\r
27 :user-agent:date:message-id:mime-version:content-type;
\r
28 bh=aGEFSJVm6ula0Y0B1JNnKKT/y7aCOaHZvA2Fmfu1mJI=;
\r
29 b=Y+ujb5GV/0ZaE3J3B+XbtTzEazTK3KmwEEkFNeVQt9P1TTtWt05RW7Dy1KQeWTRyvq
\r
30 m/AGuA3hfeTahPZI6kXPcxF+qTrZv+1oyLCL91BwpcRf8sPCMsj0kI2P6zlM7ytBfnSo
\r
31 RUwaHXK7IWtcfZAehmr+ipWjgEBY0Uk8DVP4jrAHuCb9QYZoVtM626luTFXw0w+I9yO3
\r
32 ylPfuZDSQawCS4sIf4kkjz45JRHLjvYtnMuEtjdqD/kSSFCosl+BQKCCrlrtPG4zoyo9
\r
33 h71ihwArxGbjU+8WNGtaXdWK1xNbTHTZvikme1bj8qwN/0Fv8q5vfdmi8R+NNypKW5n0
\r
36 ALoCoQnD8z+qfz4QVERVjL6bWY7qwQz+T3OsFAkDry1O4u2eyxkcSfBvoNhoqvO+9G9ue6raFIzv
\r
37 X-Received: by 10.180.72.146 with SMTP id d18mr102161wiv.53.1407880071158;
\r
38 Tue, 12 Aug 2014 14:47:51 -0700 (PDT)
\r
39 Received: from moritz-x230 (p3E9BBDA6.dip0.t-ipconnect.de. [62.155.189.166])
\r
40 by mx.google.com with ESMTPSA id w1sm60559662wiz.14.2014.08.12.14.47.49
\r
41 for <multiple recipients>
\r
42 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
\r
43 Tue, 12 Aug 2014 14:47:49 -0700 (PDT)
\r
44 From: Moritz Ulrich <moritz@tarn-vedra.de>
\r
45 To: "Austin T. Clements" <aclements@csail.mit.edu>
\r
46 Subject: Re: `notmuch-escape-boolean-term': Broken for non-ascii characters
\r
48 <20140812103300.Horde.O1lIjfCL-Lh8XGn65RO2Cg1@webmail.csail.mit.edu>
\r
49 References: <874mxiu5hj.fsf@tarn-vedra.de>
\r
50 <20140812103300.Horde.O1lIjfCL-Lh8XGn65RO2Cg1@webmail.csail.mit.edu>
\r
51 User-Agent: Notmuch/0.18.1 (http://notmuchmail.org) Emacs/24.3.1
\r
52 (x86_64-unknown-linux-gnu)
\r
53 Date: Tue, 12 Aug 2014 23:47:42 +0200
\r
54 Message-ID: <874mxhbcsh.fsf@tarn-vedra.de>
\r
56 Content-Type: multipart/signed; boundary="=-=-=";
\r
57 micalg=pgp-sha256; protocol="application/pgp-signature"
\r
58 X-Mailman-Approved-At: Tue, 12 Aug 2014 22:26:53 -0700
\r
59 Cc: notmuch@notmuchmail.org
\r
60 X-BeenThere: notmuch@notmuchmail.org
\r
61 X-Mailman-Version: 2.1.13
\r
63 List-Id: "Use and development of the notmuch mail system."
\r
64 <notmuch.notmuchmail.org>
\r
65 List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
\r
66 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
67 List-Archive: <http://notmuchmail.org/pipermail/notmuch>
\r
68 List-Post: <mailto:notmuch@notmuchmail.org>
\r
69 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
70 List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
\r
71 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
72 X-List-Received-Date: Tue, 12 Aug 2014 21:48:04 -0000
\r
75 Content-Type: text/plain; charset=utf-8
\r
76 Content-Transfer-Encoding: quoted-printable
\r
78 "Austin T. Clements" <aclements@csail.mit.edu> writes:
\r
80 > Quoting Moritz Ulrich <moritz@tarn-vedra.de>:
\r
83 >> I recently adopted notmuch as my primary way to read mail, so thank you
\r
84 >> for this great tool!
\r
86 >> Unfortunately, I ran into a problem on the Emacs side of the project
\r
87 >> when used in a non-ascii environment:
\r
89 >> Having a tag named 'uni-köln', the tag:-completion doesn't work.
\r
91 >> This is caused by `notmuch-escape-boolean-term' erroneously escaping the
\r
94 >> (notmuch-escape-boolean-term "uni-köln") => "\"uni-köln\""
\r
96 >> This is caused by `string-match' with the following erroneously matching
\r
99 >> (string-match "[^!#-'*-~]" "uni-köln") => 5
\r
100 >> (string-match "[^!#-'*-~]" "uni-koln") => nil
\r
102 >> I'm not exactly sure how to tackle this - the Regexp was crafted to match
\r
103 >> (, ), " if I understand it correctly. A simple way would be just adding
\r
104 >> more characters as a sort-of whitelist. A nicer solution would be
\r
105 >> converting it from [^...] to [...] to explicitly mark letters that need
\r
108 > notmuch-escape-boolean-term used to use a blacklist, but we switched
\r
109 > to a whitelist because Xapian's own parser has changed over the years
\r
110 > in its handling of non-ASCII characters and invalidated our blacklist.
\r
111 > Ultimately it seemed much safer to go with a whitelist. Quoting
\r
112 > "uni-köln" isn't erroneous, it's just conservative.
\r
114 > Could you explain in more detail what's broken? I tried adding the
\r
115 > tag uni-köln to a message in Emacs, then hitting "s" to start a search,
\r
117 > then "tag:<TAB>" and that tag (surrounded by quotes) was one of the
\r
118 > completion options. Upon completing to that tag, the search worked
\r
121 > Are you objecting to the unnecessary (but legal) quotes in the
\r
122 > completion? We might be able to include Unicode word characters in
\r
123 > the quoting whitelist, though that seems like a spot fix (probably a
\r
124 > fairly broad one, so maybe that's fine) and might be tricky because of
\r
125 > Emacs' somewhat weird Unicode regexp support (using [[:alpha:]] might
\r
126 > Just Work, but we'd have to be careful of the active syntax table).
\r
127 > Or tab completion could recognize that, say, tag:uni doesn't require
\r
128 > quoting, but still expand it to tag:"uni-köln".
\r
130 Thanks for explaining the reason for the whitelist approach. Knowing
\r
131 this is quite helpful.
\r
133 I can't really explain why, but I just didn't notice tag:"uni-köln" in
\r
134 the tag-completion - I think my expectations for finding it as
\r
135 tag:uni-köln must have blinded me.
\r
137 While it isn't erroneous, it's highly unintuitive to quote tags like
\r
138 this. I can understand that a much more permissive whitelist could cause
\r
139 other problems which are harder to track down, so maybe it's possible to
\r
140 make the behavior configurable (e.g. by using a `defvar' for the regex).
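
A minimal sketch of what the `defvar' idea could look like. Everything
prefixed with `my/' is a hypothetical name made up for illustration and is
not part of notmuch; the quoting logic is an assumption based only on the
behavior shown in this thread (wrap the term in double quotes when it is
empty or contains a character outside the whitelist, and double any
embedded quotes). The default regexp is the one quoted above.

```elisp
;; Hypothetical sketch: none of these `my/' names exist in notmuch.

(defvar my/notmuch-boolean-term-unsafe-regexp "[^!#-'*-~]"
  "Regexp matching any character that forces a boolean term to be quoted.
The default is the conservative ASCII-only pattern discussed in this
thread: it matches anything except printable ASCII other than
space, \", ( and ).")

(defun my/notmuch-escape-boolean-term (term)
  "Return TERM, wrapped in double quotes if it needs quoting.
TERM is quoted when it is empty or contains a character matched by
`my/notmuch-boolean-term-unsafe-regexp'.  Embedded double quotes are
doubled inside the quoted form."
  (save-match-data
    (let ((case-fold-search nil))
      (if (or (equal term "")
              (string-match my/notmuch-boolean-term-unsafe-regexp term))
          ;; Quote the term, doubling any embedded double quotes.
          (concat "\"" (replace-regexp-in-string "\"" "\"\"" term t t) "\"")
        ;; Every character is on the whitelist: pass through unchanged.
        term))))
```

With the default value the function stays conservative; a user who
accepts the risk could bind the variable to a broader whitelist, for
example one that adds `[:alpha:]':

```elisp
(my/notmuch-escape-boolean-term "uni-koln")   ; => "uni-koln"
(my/notmuch-escape-boolean-term "uni-köln")   ; => "\"uni-köln\""

(let ((my/notmuch-boolean-term-unsafe-regexp "[^!#-'*-~[:alpha:]]"))
  (my/notmuch-escape-boolean-term "uni-köln")) ; => "uni-köln"
```

Whether `[:alpha:]' covers every letter a given setup needs may depend on
the Emacs version and, as noted above, on the active syntax table, so the
broader pattern is only an example and not a recommendation.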
\r
146 Content-Type: application/pgp-signature
\r
148 -----BEGIN PGP SIGNATURE-----
\r
151 iQIcBAEBCAAGBQJT6ouDAAoJEKnhzHnsv6QyYJkP/Rdf5grt5sz/hxDS6QehollQ
\r
152 kzAWNlmPulxNWPPTGbfBqUOKSynNJipaMtiout1x8rMEnFpw+lgWGtTy8Zxz4s1U
\r
153 5xBIp3v3IH98Imm/bLS7P8rDU7ExI6RITI9829nyLVZTMftyN0EmE36qKAwA+nDv
\r
154 z+71wD7tRODxy2bgvKoZJfyisIfemfb3UthhlS71fzjqlo44hqkZg1GKFRtMpDCm
\r
155 vNAArH5VqxY5ooQ7Omtgv57PGNQReg7uFwbnC65t40b1QAbUpDF6h639BJjIM36o
\r
156 zItU0d6OsmBwKb7IhIX2npev/yDq4hDHJAHeYxqK+/WCRNIQUK1kmsUTB++xzFUP
\r
157 ECP8fr1N1yUR2mo7DniY/FP/T9GvKGVUTiCWg5xiID25LLAfVyIFfWS+M6jusOrR
\r
158 G54NypRJR9hWuCgoZFz2qbRZu4sP6S2umTe9Efji7Lha4YDZgf9m6MPtXbEKGLPU
\r
159 YdlIdnPg12RQvMOHLlpfhSK9w1ZGUty+7xxbdKT04NsQ3N4VmSyHC6J079zmaSMz
\r
160 FnjFLAEiyqkWa0op4FJQHopb/R6rRPw97055ULDpB1Bwa5Bssa3nq74JwfsFIdGt
\r
161 j00jIaQMp0aABvCXjHUPDXakFzvq2ID2RBrlzybkHgTt9FA29MMlB1NJA/XyKEBz
\r
162 yxbvd9c8DMHUDWNAksUj
\r
164 -----END PGP SIGNATURE-----
\r