1 Return-Path: <amdragon@mit.edu>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by olra.theworths.org (Postfix) with ESMTP id A62E1429E26
\r
6 for <notmuch@notmuchmail.org>; Wed, 9 Nov 2011 05:37:48 -0800 (PST)
\r
7 X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
\r
11 X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
\r
12 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
\r
13 Received: from olra.theworths.org ([127.0.0.1])
\r
14 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
\r
15 with ESMTP id bZcJ0LOkaKsv for <notmuch@notmuchmail.org>;
\r
16 Wed, 9 Nov 2011 05:37:44 -0800 (PST)
\r
17 Received: from dmz-mailsec-scanner-2.mit.edu (DMZ-MAILSEC-SCANNER-2.MIT.EDU
\r
19 by olra.theworths.org (Postfix) with ESMTP id B3D94431FD0
\r
20 for <notmuch@notmuchmail.org>; Wed, 9 Nov 2011 05:37:44 -0800 (PST)
\r
21 X-AuditID: 1209190d-b7f726d0000008d1-0c-4eba822714a4
\r
22 Received: from mailhub-auth-4.mit.edu ( [18.7.62.39])
\r
23 by dmz-mailsec-scanner-2.mit.edu (Symantec Messaging Gateway) with SMTP
\r
24 id FE.05.02257.7228ABE4; Wed, 9 Nov 2011 08:37:43 -0500 (EST)
\r
25 Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
\r
26 by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id pA9DbgqX024428;
\r
27 Wed, 9 Nov 2011 08:37:43 -0500
\r
28 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
\r
29 (authenticated bits=0)
\r
30 (User authenticated as amdragon@ATHENA.MIT.EDU)
\r
31 by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id pA9DbfNU018625
\r
32 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT);
\r
33 Wed, 9 Nov 2011 08:37:42 -0500 (EST)
\r
34 Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77)
\r
35 (envelope-from <amdragon@mit.edu>)
\r
36 id 1RO8Ny-0007KU-0D; Wed, 09 Nov 2011 08:40:14 -0500
\r
37 Date: Wed, 9 Nov 2011 08:40:13 -0500
\r
38 From: Austin Clements <amdragon@MIT.EDU>
\r
39 To: Jani Nikula <jani@nikula.org>
\r
40 Subject: Re: [PATCH] tag: Automatically limit to messages whose tags will
\r
42 Message-ID: <20111109134013.GK2658@mit.edu>
\r
43 References: <1320724523-23568-1-git-send-email-amdragon@mit.edu>
\r
44 <87ty6d1y5x.fsf@nikula.org>
\r
46 Content-Type: text/plain; charset=us-ascii
\r
47 Content-Disposition: inline
\r
48 In-Reply-To: <87ty6d1y5x.fsf@nikula.org>
\r
49 User-Agent: Mutt/1.5.21 (2010-09-15)
\r
64 Cc: notmuch@notmuchmail.org
\r
65 X-BeenThere: notmuch@notmuchmail.org
\r
66 X-Mailman-Version: 2.1.13
\r
68 List-Id: "Use and development of the notmuch mail system."
\r
69 <notmuch.notmuchmail.org>
\r
70 List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
\r
71 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
72 List-Archive: <http://notmuchmail.org/pipermail/notmuch>
\r
73 List-Post: <mailto:notmuch@notmuchmail.org>
\r
74 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
75 List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
\r
76 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
77 X-List-Received-Date: Wed, 09 Nov 2011 13:37:48 -0000
\r
79 Quoth Jani Nikula on Nov 09 at 8:46 am:
\r
81 > FWIW, I reviewed this and didn't find any obvious problems. A few
\r
82 > nitpicks below, though.
\r
87 > On Mon, 7 Nov 2011 22:55:23 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
\r
88 > > This optimizes the user's tagging query to exclude messages that won't
\r
89 > > be affected by the tagging operation, saving computation and IO for
\r
90 > > redundant tagging operations.
\r
93 > > notmuch tag +notmuch to:notmuch@notmuchmail.org
\r
94 > > will now use the query
\r
95 > > ( to:notmuch@notmuchmail.org ) and (not tag:"notmuch")
\r
97 > > In the past, we've often suggested that people do this exact
\r
98 > > transformation by hand for slow tagging operations. This makes that unnecessary.
\r
101 > > I was about to implement this optimization in my initial tagging
\r
102 > > script, but then I figured, why not just do it in notmuch so we can
\r
103 > > stop telling people to do this by hand?
\r
105 > > NEWS | 9 ++++++
\r
106 > > notmuch-tag.c | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++
\r
107 > > 2 files changed, 85 insertions(+), 0 deletions(-)
\r
109 > > diff --git a/NEWS b/NEWS
\r
110 > > index e00452a..9ca5e0c 100644
\r
113 > > @@ -16,6 +16,15 @@ Add search terms to "notmuch dump"
\r
114 > > search/show/tag. The output file argument of dump is deprecated in
\r
115 > > favour of using stdout.
\r
120 > > +Automatic tag query optimization
\r
122 > > + "notmuch tag" now automatically optimizes the user's query to
\r
123 > > + exclude messages whose tags won't change. In the past, we've
\r
124 > > + suggested that people do this by hand; this is no longer necessary.
\r
126 > > Notmuch 0.9 (2011-10-01)
\r
127 > > ========================
\r
129 > > diff --git a/notmuch-tag.c b/notmuch-tag.c
\r
130 > > index dded39e..62c4bf1 100644
\r
131 > > --- a/notmuch-tag.c
\r
132 > > +++ b/notmuch-tag.c
\r
133 > > @@ -30,6 +30,76 @@ handle_sigint (unused (int sig))
\r
134 > > interrupted = 1;
\r
138 > > +_escape_tag (char *buf, const char *tag)
\r
140 > > + const char *in = tag;
\r
141 > > + char *out = buf;
\r
142 > > + /* Boolean terms surrounded by double quotes can contain any
\r
143 > > + * character. Double quotes are quoted by doubling them. */
\r
144 > > + *(out++) = '"';
\r
145 > > + while (*in) {
\r
146 > > + if (*in == '"')
\r
147 > > + *(out++) = '"';
\r
148 > > + *(out++) = *(in++);
\r
150 > > + *(out++) = '"';
\r
152 > The parentheses are unnecessary for *p++.
\r
154 Removed. I put these in out of paranoia, but I suppose it wouldn't be
\r
155 an lvalue if it parsed differently.
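
For reference, a minimal standalone sketch of the escaping discussed above, written without the extra parentheses. Postfix ++ binds tighter than unary *, so *out++ already means *(out++). The names escape_tag and escaped_size are hypothetical stand-ins for illustration, not the code from the patch, and the caller is assumed to supply a large enough buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: worst-case size in bytes of the escaped form of
 * tag. Every character may be a double quote and so may be doubled
 * (strlen * 2), plus two surrounding quotes and the terminating NUL. */
static size_t
escaped_size (const char *tag)
{
    return strlen (tag) * 2 + 3;
}

/* Hypothetical stand-in for the patch's _escape_tag: wrap tag in double
 * quotes and double any embedded double quotes. buf must have room for
 * escaped_size (tag) bytes. Returns buf. */
static char *
escape_tag (char *buf, const char *tag)
{
    const char *in = tag;
    char *out = buf;

    assert (buf != NULL && tag != NULL);

    *out++ = '"';               /* opening quote */
    while (*in) {
        if (*in == '"')
            *out++ = '"';       /* double an embedded quote */
        *out++ = *in++;
    }
    *out++ = '"';               /* closing quote */
    *out = '\0';
    return buf;
}
```

With this sketch, a tag consisting of two double quotes hits the worst case exactly: the escaped form is six quote characters plus the NUL, which is the 2 * 2 + 3 = 7 bytes that escaped_size reports.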
\r
162 > > +_optimize_tag_query (void *ctx, const char *orig_query_string, char *argv[],
\r
163 > > + int *add_tags, int add_tags_count,
\r
164 > > + int *remove_tags, int remove_tags_count)
\r
166 > > + /* This is subtler than it looks. Xapian ignores the '-' operator
\r
167 > > + * at the beginning of both queries and parenthesized groups and,
\r
168 > > + * furthermore, the presence of a '-' operator at the beginning of
\r
169 > > + * a group can inhibit parsing of the previous operator. Hence,
\r
170 > > + * the user-provided query MUST appear first, but it is safe to
\r
171 > > + * parenthesize and the exclusion part of the query must not use
\r
172 > > + * the '-' operator (though the NOT operator is fine). */
\r
174 > > + char *escaped, *query_string;
\r
175 > > + const char *join = "";
\r
177 > > + unsigned int max_tag_len = 0;
\r
179 > > + /* Allocate a buffer for escaping tags. */
\r
180 > > + for (i = 0; i < add_tags_count; i++)
\r
181 > > + if (strlen (argv[add_tags[i]] + 1) > max_tag_len)
\r
182 > > + max_tag_len = strlen (argv[add_tags[i]] + 1);
\r
183 > > + for (i = 0; i < remove_tags_count; i++)
\r
184 > > + if (strlen (argv[remove_tags[i]] + 1) > max_tag_len)
\r
185 > > + max_tag_len = strlen (argv[remove_tags[i]] + 1);
\r
186 > > + escaped = talloc_array(ctx, char, max_tag_len * 2 + 3);
\r
188 > Perhaps a comment here or above _escape_tag() explaining the worst case
\r
189 > memory consumption of strlen(tag) * 2 + 3 for a tag of "s would be in order.
\r
194 > It's unrelated, but looking at the above also made me check something
\r
195 > I've suspected before: notmuch allows you to have empty or zero length
\r
196 > tags "", which is probably not intentional.
\r
198 > There's no check for talloc failures here or below. But then there are
\r
199 > few checks for that in the cli in general. *shrug*.
\r
201 It's unfortunate that error handling obscures C code so much. But
\r
202 there's no sense in not handling errors, so I fixed this.
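
To illustrate one way of keeping the error handling from obscuring the logic, here is a rough standalone sketch that computes an upper bound on the query length first, so that only a single allocation needs checking. It uses malloc in place of talloc so that it compiles by itself. The function names and signature are hypothetical, and this is not the revised patch. It also assumes the caller passes at least one tag, as the tag command requires.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Append src at *end and advance *end to the new terminating NUL. */
static void
append (char **end, const char *src)
{
    size_t len = strlen (src);
    memcpy (*end, src, len + 1);
    *end += len;
}

/* Append tag as a quoted boolean term, doubling embedded quotes. */
static void
append_quoted (char **end, const char *tag)
{
    char *out = *end;
    *out++ = '"';
    for (; *tag; tag++) {
        if (*tag == '"')
            *out++ = '"';
        *out++ = *tag;
    }
    *out++ = '"';
    *out = '\0';
    *end = out;
}

/* Hypothetical sketch: build "( orig ) and (not tag:... or tag:...)".
 * Returns a malloc'd string the caller must free, or NULL if the
 * allocation fails. */
static char *
optimize_tag_query (const char *orig,
                    const char *const *add, size_t n_add,
                    const char *const *rem, size_t n_rem)
{
    size_t size, i;
    const char *join = "";
    char *query, *end;

    assert (orig != NULL);

    /* Upper bound: fixed punctuation, the original query, and the
     * worst case of every tag character being a doubled quote. */
    size = strlen (orig) + sizeof ("(  ) and ()");
    for (i = 0; i < n_add; i++)
        size += sizeof (" or not tag:") + strlen (add[i]) * 2 + 2;
    for (i = 0; i < n_rem; i++)
        size += sizeof (" or tag:") + strlen (rem[i]) * 2 + 2;

    query = malloc (size);
    if (query == NULL)
        return NULL;            /* the only failure point */
    end = query;
    *end = '\0';

    if (strcmp (orig, "*") == 0) {
        append (&end, "(");
    } else {
        append (&end, "( ");
        append (&end, orig);
        append (&end, " ) and (");
    }
    for (i = 0; i < n_add; i++) {
        append (&end, join);
        append (&end, "not tag:");
        append_quoted (&end, add[i]);
        join = " or ";
    }
    for (i = 0; i < n_rem; i++) {
        append (&end, join);
        append (&end, "tag:");
        append_quoted (&end, rem[i]);
        join = " or ";
    }
    append (&end, ")");
    return query;
}
```

Called with the query to:notmuch@notmuchmail.org and the single added tag notmuch, this sketch returns ( to:notmuch@notmuchmail.org ) and (not tag:"notmuch"), matching the example in the commit message. The trade-off is a second pass over the tags to size the buffer, in exchange for one NULL check instead of one per append.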
\r
205 > > + /* Build the new query string */
\r
206 > > + if (strcmp (orig_query_string, "*") == 0)
\r
207 > > + query_string = talloc_strdup (ctx, "(");
\r
209 > > + query_string = talloc_asprintf (ctx, "( %s ) and (", orig_query_string);
\r
211 > > + for (i = 0; i < add_tags_count; i++) {
\r
212 > > + query_string = talloc_asprintf_append_buffer (
\r
213 > > + query_string, "%snot tag:%s", join,
\r
214 > > + _escape_tag (escaped, argv[add_tags[i]] + 1));
\r
215 > > + join = " or ";
\r
217 > > + for (i = 0; i < remove_tags_count; i++) {
\r
218 > > + query_string = talloc_asprintf_append_buffer (
\r
219 > > + query_string, "%stag:%s", join,
\r
220 > > + _escape_tag (escaped, argv[remove_tags[i]] + 1));
\r
221 > > + join = " or ";
\r
224 > > + query_string = talloc_strdup_append_buffer (query_string, ")");
\r
226 > > + talloc_free (escaped);
\r
227 > > + return query_string;
\r
231 > > notmuch_tag_command (void *ctx, unused (int argc), unused (char *argv[]))
\r
233 > > @@ -93,6 +163,12 @@ notmuch_tag_command (void *ctx, unused (int argc), unused (char *argv[]))
\r
237 > > + /* Optimize the query so it excludes messages that already have
\r
238 > > + * the specified set of tags. */
\r
239 > > + query_string = _optimize_tag_query (ctx, query_string, argv,
\r
240 > > + add_tags, add_tags_count,
\r
241 > > + remove_tags, remove_tags_count);
\r
243 > > config = notmuch_config_open (ctx, NULL, NULL);
\r
244 > > if (config == NULL)
\r