Return-Path: <amdragon@mit.edu>
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
 by olra.theworths.org (Postfix) with ESMTP id 6CE17431FB6
 for <notmuch@notmuchmail.org>; Tue, 25 Dec 2012 19:49:03 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id Z-+Z449FvnOD for <notmuch@notmuchmail.org>;
 Tue, 25 Dec 2012 19:48:58 -0800 (PST)
Received: from dmz-mailsec-scanner-5.mit.edu (DMZ-MAILSEC-SCANNER-5.MIT.EDU
 by olra.theworths.org (Postfix) with ESMTP id 95D45431FAF
 for <notmuch@notmuchmail.org>; Tue, 25 Dec 2012 19:48:58 -0800 (PST)
X-AuditID: 12074422-b7f616d000000e7c-18-50da73a9b9d2
Received: from mailhub-auth-4.mit.edu ( [18.7.62.39])
 by dmz-mailsec-scanner-5.mit.edu (Symantec Messaging Gateway) with SMTP
 id D1.4B.03708.9A37AD05; Tue, 25 Dec 2012 22:48:57 -0500 (EST)
Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
 by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id qBQ3muYt019978;
 Tue, 25 Dec 2012 22:48:56 -0500
Received: from drake.dyndns.org (c-76-21-105-205.hsd1.ca.comcast.net
 [76.21.105.205]) (authenticated bits=0)
 (User authenticated as amdragon@ATHENA.MIT.EDU)
 by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id qBQ3mrZR013180
 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT);
 Tue, 25 Dec 2012 22:48:55 -0500 (EST)
Received: from amthrax by drake.dyndns.org with local (Exim 4.77)
 (envelope-from <amdragon@mit.edu>)
 id 1Tnhz8-0002yu-HR; Tue, 25 Dec 2012 22:48:50 -0500
From: Austin Clements <amdragon@MIT.EDU>
To: notmuch@notmuchmail.org
Subject: [PATCH v2 0/5] Use Xapian query syntax for batch-tag dump/restore
Date: Tue, 25 Dec 2012 22:48:38 -0500
Message-Id: <1356493723-11085-1-git-send-email-amdragon@mit.edu>
X-Mailer: git-send-email 1.7.10.4
X-Brightmail-Tracker:
 H4sIAAAAAAAAA+NgFtrGIsWRmVeSWpSXmKPExsUixG6nrruy+FaAweTDZhY3WrsZLZqmO1us
 nstjcf3mTGYHFo+ds+6ye9y6/5rd49mqW8weWw69Zw5gieKySUnNySxLLdK3S+DKuPL5OUvB
 Q92KGa3TWBoY9yh3MXJySAiYSLx8+ZwZwhaTuHBvPRuILSSwj1HiwU6NLkYuIHsDo8TUr3fZ
 IJyLTBLzpm1nhXDmMkqsP3aQEaSFTUBDYtv+5WC2iIC0xM67s1lBbGYBR4kzr9vA4sICXhJn
 ViwAmsTBwSKgKvHqvyBImFfAQWLmgUtQVyhKdD+bwDaBkXcBI8MqRtmU3Crd3MTMnOLUZN3i
 5MS8vNQiXVO93MwSvdSU0k2M4OBxUdrB+POg0iFGAQ5GJR7eDd9vBgixJpYVV+YeYpTkYFIS
 5d1ecCtAiC8pP6UyI7E4I76oNCe1+BCjBAezkgiv80egct6UxMqq1KJ8mJQ0B4uSOO+1lJv+
 QgLpiSWp2ampBalFMFkZDg4lCd7NRUBDBYtS01Mr0jJzShDSTBycIMN5gIbfBKnhLS5IzC3O
 TIfIn2JUlBLnvQCSEABJZJTmwfXCovsVozjQK8K8e0CqeICJAa77FdBgJqDBsXw3QAaXJCKk
 pBoYZ89tuXEkKeE0778ivlLzC/9cRS/KR988Of/Srtf7AwXf107VerxWREAlhGHFAa6TN+5s
 FDuw/e/BLVNNql5ItTp+mXelyjl21xOx3MK+/I279csT3ireWcUg8X5LMdstqwdK9i28fCWb
 Tl5cbX/SquNr5+aOxka/3p82bDZe/myWf+4as15PV2Ipzkg01GIuKk4EAGT71V3JAgAA
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
 <notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Wed, 26 Dec 2012 03:49:04 -0000

id:1356415076-5692-1-git-send-email-amdragon@mit.edu

In addition to incorporating all of David's suggestions, this reworks
the boolean term parsing so it only handles the subset of quoting
syntax used by make_boolean_term (which also happens to be all that we
described in the man page for the format). The diff from v1 is below.
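
To make the quoting subset concrete, here is a minimal standalone
sketch of the round trip, assuming the rules visible in the diff: a
term is wrapped in double quotes if it contains whitespace or control
characters, ')', '"', or non-ASCII bytes, and internal quotes are
doubled. The names quote_term and dequote_term are hypothetical, and
the sketch uses malloc rather than talloc; it is an illustration, not
the code in util/string-util.c.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: quote `term` if it contains anything that would
 * end or confuse an unquoted boolean term. Returns a malloc'd string,
 * or NULL on allocation failure. */
char *
quote_term (const char *term)
{
    const char *in;
    int need_quoting = 0;
    size_t quotes = 0, len;
    char *out, *p;

    assert (term != NULL);
    for (in = term; *in; in++) {
        unsigned char c = (unsigned char) *in;
        if (c <= ' ' || c == ')' || c == '"' || c > 127)
            need_quoting = 1;
        if (c == '"')
            quotes++;
    }

    len = strlen (term);
    if (!need_quoting) {
        out = malloc (len + 1);
        if (out)
            memcpy (out, term, len + 1);
        return out;
    }

    /* Room for the term, one extra byte per internal quote, the two
     * surrounding quotes, and the terminating NUL. */
    out = malloc (len + quotes + 3);
    if (!out)
        return NULL;
    p = out;
    *p++ = '"';
    for (in = term; *in; in++) {
        if (*in == '"')
            *p++ = '"';         /* double internal quotes */
        *p++ = *in;
    }
    *p++ = '"';
    *p = '\0';
    return out;
}

/* Hypothetical helper: reverse quote_term. Returns a malloc'd term, or
 * NULL on a parse error (unterminated quote or trailing text) or on
 * allocation failure. */
char *
dequote_term (const char *str)
{
    const char *pos;
    char *out, *w;
    int closed = 0;
    size_t len;

    assert (str != NULL);
    len = strlen (str);
    out = malloc (len + 1);
    if (!out)
        return NULL;

    if (*str != '"') {
        /* Unquoted: the term ends at whitespace or ')', so finding
         * either means there is trailing text. */
        for (pos = str; *pos; pos++) {
            unsigned char c = (unsigned char) *pos;
            if (c <= ' ' || c == ')') {
                free (out);
                return NULL;
            }
        }
        memcpy (out, str, len + 1);
        return out;
    }

    /* Quoted: find the closing quote and un-double internal quotes. */
    pos = str + 1;
    w = out;
    while (*pos) {
        if (*pos == '"') {
            pos++;
            if (*pos != '"') {
                closed = 1;     /* found the closing quote */
                break;
            }
        }
        *w++ = *pos++;
    }
    *w = '\0';

    if (!closed || *pos) {
        free (out);
        return NULL;
    }
    return out;
}
```

Under these assumptions, quoting the term a "b" yields "a ""b""" and
de-quoting that string gives back the original term, while an
unterminated quote or text after the closing quote is reported as a
parse error.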
\r
79 diff --git a/man/man1/notmuch-restore.1 b/man/man1/notmuch-restore.1
\r
80 index 6bba628..78fef52 100644
\r
81 --- a/man/man1/notmuch-restore.1
\r
82 +++ b/man/man1/notmuch-restore.1
\r
83 @@ -57,10 +57,8 @@ sup calls them).
\r
86 dump format is intended to more robust against malformed message-ids
\r
87 -and tags containing whitespace or non-\fBascii\fR(7) characters. This
\r
88 -format hex-escapes all characters those outside of a small character
\r
89 -set, intended to be suitable for e.g. pathnames in most UNIX-like
\r
91 +and tags containing whitespace or non-\fBascii\fR(7) characters. See
\r
92 +\fBnotmuch-dump\fR(1) for details on this format.
\r
94 .B "notmuch restore"
\r
95 updates the maildir flags according to tag changes if the
\r
96 diff --git a/test/dump-restore b/test/dump-restore
\r
97 index aecc393..f9ae5b3 100755
\r
98 --- a/test/dump-restore
\r
99 +++ b/test/dump-restore
\r
100 @@ -200,6 +200,8 @@ a
\r
101 # the next non-comment line should report an an empty tag error for
\r
102 # batch tagging, but not for restore
\r
103 + +e -- id:20091117232137.GA7669@griffis1.net
\r
104 +# valid id, but warning about missing message
\r
105 ++e id:missing_message_id
\r
108 cat <<EOF > EXPECTED
\r
109 @@ -211,6 +213,7 @@ Warning: no query string after -- [+c +d --]
\r
110 Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
\r
111 Warning: cannot parse query: id:"
\r
112 Warning: not an id query: tag:abc
\r
113 +Warning: cannot apply tags to missing message: missing_message_id
\r
116 test_expect_equal_file EXPECTED OUTPUT
\r
117 diff --git a/test/random-corpus.c b/test/random-corpus.c
\r
118 index d0e3e8f..8b7748e 100644
\r
119 --- a/test/random-corpus.c
\r
120 +++ b/test/random-corpus.c
\r
121 @@ -96,9 +96,9 @@ random_utf8_string (void *ctx, size_t char_count)
\r
122 buf = talloc_realloc (ctx, buf, gchar, buf_size);
\r
125 - randomchar = random_unichar ();
\r
126 - if (randomchar == '\n')
\r
127 - randomchar = 'x';
\r
129 + randomchar = random_unichar ();
\r
130 + } while (randomchar == '\n');
\r
132 written = g_unichar_to_utf8 (randomchar, buf + offset);
\r
134 diff --git a/util/string-util.c b/util/string-util.c
\r
135 index eaa6c99..db01b4b 100644
\r
136 --- a/util/string-util.c
\r
137 +++ b/util/string-util.c
\r
138 @@ -43,9 +43,11 @@ make_boolean_term (void *ctx, const char *prefix, const char *term,
\r
140 int need_quoting = 0;
\r
142 - /* Do we need quoting? */
\r
143 + /* Do we need quoting? To be paranoid, we quote anything
\r
144 + * containing a quote, even though it only matters at the
\r
145 + * beginning, and anything containing non-ASCII text. */
\r
146 for (in = term; *in && !need_quoting; in++)
\r
147 - if (*in <= ' ' || *in == ')' || *in == '"')
\r
148 + if (*in <= ' ' || *in == ')' || *in == '"' || (unsigned char)*in > 127)
\r
152 @@ -95,21 +97,6 @@ make_boolean_term (void *ctx, const char *prefix, const char *term,
\r
157 -consume_double_quote (const char **str)
\r
159 - if (**str == '"') {
\r
162 - } else if (strncmp(*str, "\xe2\x80\x9c", 3) == 0 || /* UTF8 0x201c */
\r
163 - strncmp(*str, "\xe2\x80\x9d", 3) == 0) { /* UTF8 0x201d */
\r
172 parse_boolean_term (void *ctx, const char *str,
\r
173 char **prefix_out, char **term_out)
\r
174 @@ -123,28 +110,31 @@ parse_boolean_term (void *ctx, const char *str,
\r
175 *prefix_out = talloc_strndup (ctx, str, pos - str);
\r
178 - /* Implement Xapian's boolean term de-quoting. This is a nearly
\r
179 - * direct translation of QueryParser::Internal::parse_query. */
\r
180 - pos = *term_out = talloc_strdup (ctx, pos);
\r
181 - if (consume_double_quote (&pos)) {
\r
182 - char *out = talloc_strdup (ctx, pos);
\r
183 - pos = *term_out = out;
\r
186 - /* Premature end of string */
\r
188 - } else if (*pos == '"') {
\r
189 - if (*++pos != '"')
\r
190 + /* Implement de-quoting compatible with make_boolean_term. */
\r
191 + if (*pos == '"') {
\r
192 + char *out = talloc_strdup (ctx, pos + 1);
\r
194 + /* Find the closing quote and un-double doubled internal
\r
196 + for (pos = *term_out = out; *pos; ) {
\r
197 + if (*pos == '"') {
\r
199 + if (*pos != '"') {
\r
200 + /* Found the closing quote. */
\r
203 - } else if (consume_double_quote (&pos)) {
\r
210 + /* Did the term terminate without a closing quote or is there
\r
211 + * trailing text after the closing quote? */
\r
212 + if (!closed || *pos)
\r
216 + *term_out = talloc_strdup (ctx, pos);
\r
217 + /* Check for text after the boolean term. */
\r
218 while (*pos > ' ' && *pos != ')')
\r
221 diff --git a/util/string-util.h b/util/string-util.h
\r
222 index e4e4c42..aff2d65 100644
\r
223 --- a/util/string-util.h
\r
224 +++ b/util/string-util.h
\r
225 @@ -28,9 +28,9 @@ char *strtok_len (char *s, const char *delim, size_t *len);
\r
226 int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
\r
227 char **buf, size_t *len);
\r
229 -/* Parse a boolean term query, returning the prefix in *prefix_out and
\r
230 - * the term in *term_out. *prefix_out and *term_out will be talloc'd
\r
231 - * with context ctx.
\r
232 +/* Parse a boolean term query produced by make_boolean_term, returning
\r
233 + * the prefix in *prefix_out and the term in *term_out. *prefix_out
\r
234 + * and *term_out will be talloc'd with context ctx.
\r
236 * Return: 0 on success, non-zero on parse error (including trailing
\r