1 Return-Path: <m.walters@qmul.ac.uk>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by olra.theworths.org (Postfix) with ESMTP id 2322A431FBC
\r
6 for <notmuch@notmuchmail.org>; Sun, 23 Dec 2012 18:34:44 -0800 (PST)
\r
7 X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
\r
11 X-Spam-Status: No, score=-1.098 tagged_above=-999 required=5
\r
12 tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001,
\r
13 NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled
\r
14 Received: from olra.theworths.org ([127.0.0.1])
\r
15 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
\r
16 with ESMTP id tfdxGNHbv73t for <notmuch@notmuchmail.org>;
\r
17 Sun, 23 Dec 2012 18:34:43 -0800 (PST)
\r
18 Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6])
\r
19 (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
\r
20 (No client certificate requested)
\r
21 by olra.theworths.org (Postfix) with ESMTPS id A4975431FAF
\r
22 for <notmuch@notmuchmail.org>; Sun, 23 Dec 2012 18:34:42 -0800 (PST)
\r
23 Received: from smtp.qmul.ac.uk ([138.37.6.40])
\r
24 by mail2.qmul.ac.uk with esmtp (Exim 4.71)
\r
25 (envelope-from <m.walters@qmul.ac.uk>)
\r
26 id 1TmxsA-0003Ts-Dk; Mon, 24 Dec 2012 02:34:36 +0000
\r
27 Received: from 93-97-24-31.zone5.bethere.co.uk ([93.97.24.31] helo=localhost)
\r
28 by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.69)
\r
29 (envelope-from <m.walters@qmul.ac.uk>)
\r
30 id 1Tmxs9-0003hM-T4; Mon, 24 Dec 2012 02:34:34 +0000
\r
31 From: Mark Walters <markwalters1009@gmail.com>
\r
32 To: david@tethera.net, notmuch@notmuchmail.org
\r
33 Subject: Re: v9 of batch tagging
\r
34 In-Reply-To: <1356313183-9266-1-git-send-email-david@tethera.net>
\r
35 References: <1356313183-9266-1-git-send-email-david@tethera.net>
\r
36 User-Agent: Notmuch/0.14+236~g1d0044f (http://notmuchmail.org) Emacs/23.4.1
\r
37 (x86_64-pc-linux-gnu)
\r
38 Date: Mon, 24 Dec 2012 02:34:33 +0000
\r
39 Message-ID: <8738yw2n5y.fsf@qmul.ac.uk>
\r
41 Content-Type: text/plain; charset=us-ascii
\r
42 X-Sender-Host-Address: 93.97.24.31
\r
43 X-QM-SPAM-Info: Sender has good ham record. :)
\r
44 X-QM-Body-MD5: 0364f6ddfdb86c7c92c1825bff0e5610 (of first 20000 bytes)
\r
45 X-SpamAssassin-Score: -1.8
\r
46 X-SpamAssassin-SpamBar: -
\r
47 X-SpamAssassin-Report: The QM spam filters have analysed this message to
\r
49 spam. We require at least 5.0 points to mark a message as spam.
\r
50 This message scored -1.8 points.
\r
51 Summary of the scoring:
\r
52 * -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/,
\r
54 * [138.37.6.40 listed in list.dnswl.org]
\r
55 * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail
\r
56 provider * (markwalters1009[at]gmail.com)
\r
57 * -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay
\r
59 * 0.5 AWL AWL: From: address is in the auto white-list
\r
60 X-QM-Scan-Virus: ClamAV says the message is clean
\r
61 X-BeenThere: notmuch@notmuchmail.org
\r
62 X-Mailman-Version: 2.1.13
\r
64 List-Id: "Use and development of the notmuch mail system."
\r
65 <notmuch.notmuchmail.org>
\r
66 List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
\r
67 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
68 List-Archive: <http://notmuchmail.org/pipermail/notmuch>
\r
69 List-Post: <mailto:notmuch@notmuchmail.org>
\r
70 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
71 List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
\r
72 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
73 X-List-Received-Date: Mon, 24 Dec 2012 02:34:44 -0000
\r
76 On Mon, 24 Dec 2012, david@tethera.net wrote:
\r
79 > id:1356095307-22895-1-git-send-email-david@tethera.net
\r
81 > The main changes since v8 are the rebasing against the notmuch-restore
\r
82 > fixes in master, and the rewrite of the query (pre)-processing
\r
83 > unhex_and_quote. This incorporates the changes of
\r
85 > id:1356231570-28232-1-git-send-email-david@tethera.net
\r
87 > and now handles '()' (cf. id:87a9t5p4dz.fsf@qmul.ac.uk)
\r
92 > | Finally, I don't know if a query can contain a : without being a
\r
93 > | prefix query. If it can that could end up being misquoted.
\r
96 > This is pretty easy to work around by encoding that :. I think unless
\r
97 > it is a problem in practice I prefer not to keep an explicity list of
\r
98 > prefixes here; recognizing prefixes should really be a service from
\r
101 I am quite happy with this.
\r
103 > I dropped two patches (strnspn and hex_invariant), but picked up a new
\r
104 > strtok variation. Probably the name strtok_len2 could be improved
\r
105 > (and I see there is a typo in the patch subject).
\r
107 > [Patch v9 05/17] util/string-util: add a new string tokenized
\r
110 Patches 5 and 6 look good to me.
\r
112 > Finally I added a test for the new parenthesis handling.
\r
114 My recollection is that dump prints the messages unsorted: does this
\r
115 mean that we could get unstable results for these tests (eg with
\r
116 different Xapian versions)?
\r
123 > [Patch v9 17/17] test/tagging: add test for handling of parens
\r
127 > Fixup wise, the tests needed to be adjusted a bit for () being delimiters,
\r
128 > and the man page as well.
\r
130 > I added the fclose in id:87wqw9hf9a.fsf@oiva.home.nikula.org
\r
132 > And I modified the return value per id:87zk15hi7f.fsf@oiva.home.nikula.org
\r
134 > Here is the interdiff for unhex_and_quote:
\r
136 > commit 67c6aee87db5c7da25529e1c0feb64e422abb4b7
\r
137 > Author: David Bremner <bremner@unb.ca>
\r
138 > Date: Sat Dec 22 22:49:02 2012 -0400
\r
140 > simplify unhex_and_quote, support parens
\r
142 > the overgeneral definition of a prefix can be replaced by lower case
\r
143 > alphabetic, and still work fine with current notmuch query syntax.
\r
145 > use () as delimiters in unhex_and_quote, preserve delimiters
\r
147 > diff --git a/tag-util.c b/tag-util.c
\r
148 > index 6f62fe6..91f3603 100644
\r
151 > @@ -56,6 +56,21 @@ illegal_tag (const char *tag, notmuch_bool_t remove)
\r
155 > +/* Factor out the boilerplate to append a token to the query string.
\r
156 > + * For use in unhex_and_quote */
\r
158 > +static tag_parse_status_t
\r
159 > +append_tok (const char *tok, size_t tok_len,
\r
160 > + const char *line_for_error, char **query_string)
\r
163 > + *query_string = talloc_strndup_append_buffer (*query_string, tok, tok_len);
\r
164 > + if (*query_string == NULL)
\r
165 > + return line_error (TAG_PARSE_OUT_OF_MEMORY, line_for_error, "aborting");
\r
167 > + return TAG_PARSE_SUCCESS;
\r
170 > /* Input is a hex encoded string, presumed to be a query for Xapian.
\r
172 > * Space delimited tokens are decoded and quoted, with '*' and prefixes
\r
173 > @@ -67,45 +82,41 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
\r
175 > char *tok = encoded;
\r
176 > size_t tok_len = 0;
\r
177 > + size_t delim_len = 0;
\r
178 > char *buf = NULL;
\r
179 > size_t buf_len = 0;
\r
180 > tag_parse_status_t ret = TAG_PARSE_SUCCESS;
\r
182 > *query_string = talloc_strdup (ctx, "");
\r
184 > - while ((tok = strtok_len (tok + tok_len, " ", &tok_len)) != NULL) {
\r
185 > + while ((tok = strtok_len2 (tok + tok_len + delim_len, " ()",
\r
186 > + &tok_len, &delim_len)) != NULL) {
\r
188 > size_t prefix_len;
\r
189 > char delim = *(tok + tok_len);
\r
191 > - *(tok + tok_len++) = '\0';
\r
192 > + *(tok + tok_len) = '\0';
\r
194 > - prefix_len = hex_invariant (tok, tok_len);
\r
195 > + /* The following matches a superset of prefixes currently
\r
196 > + * used by notmuch */
\r
197 > + prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");
\r
199 > - if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
\r
200 > + if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {
\r
202 > /* pass some things through without quoting or decoding.
\r
203 > * Note for '*' this is mandatory.
\r
206 > - if (! (*query_string = talloc_asprintf_append_buffer (
\r
207 > - *query_string, "%s%c", tok, delim))) {
\r
209 > - ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
\r
210 > - line_for_error, "aborting");
\r
213 > + ret = append_tok (tok, tok_len, line_for_error, query_string);
\r
214 > + if (ret) goto DONE;
\r
217 > /* potential prefix: one for ':', then something after */
\r
218 > - if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
\r
219 > - if (! (*query_string = talloc_strndup_append (*query_string,
\r
221 > - prefix_len + 1))) {
\r
222 > - ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
\r
223 > - line_for_error, "aborting");
\r
226 > + if ((tok_len - prefix_len >= 2) && *(tok + prefix_len) == ':') {
\r
227 > + ret = append_tok (tok, prefix_len + 1,
\r
228 > + line_for_error, query_string);
\r
229 > + if (ret) goto DONE;
\r
231 > tok += prefix_len + 1;
\r
232 > tok_len -= prefix_len + 1;
\r
234 > @@ -122,13 +133,15 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
\r
238 > - if (! (*query_string = talloc_asprintf_append_buffer (
\r
239 > - *query_string, "%s%c", buf, delim))) {
\r
240 > - ret = line_error (TAG_PARSE_OUT_OF_MEMORY,
\r
241 > - line_for_error, "aborting");
\r
244 > + ret = append_tok (buf, buf_len, line_for_error, query_string);
\r
245 > + if (ret) goto DONE;
\r
247 > + /* restore the string */
\r
248 > + *(tok + tok_len) = delim;
\r
250 > + /* copy any delimiters */
\r
251 > + ret = append_tok (tok + tok_len, delim_len, line_for_error, query_string);
\r
252 > + if (ret) goto DONE;
\r
257 > _______________________________________________
\r
258 > notmuch mailing list
\r
259 > notmuch@notmuchmail.org
\r
260 > http://notmuchmail.org/mailman/listinfo/notmuch
\r