--- /dev/null
+Return-Path: <m.walters@qmul.ac.uk>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id 14435431FB6\r
+ for <notmuch@notmuchmail.org>; Mon, 31 Dec 2012 04:41:38 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -1.098\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-1.098 tagged_above=-999 required=5\r
+ tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001,\r
+ NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id gg92KVyxIuYK for <notmuch@notmuchmail.org>;\r
+ Mon, 31 Dec 2012 04:41:37 -0800 (PST)\r
+Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6])\r
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\r
+ (No client certificate requested)\r
+ by olra.theworths.org (Postfix) with ESMTPS id 02686431FAF\r
+ for <notmuch@notmuchmail.org>; Mon, 31 Dec 2012 04:41:37 -0800 (PST)\r
+Received: from smtp.qmul.ac.uk ([138.37.6.40])\r
+ by mail2.qmul.ac.uk with esmtp (Exim 4.71)\r
+ (envelope-from <m.walters@qmul.ac.uk>)\r
+ id 1TpegO-0006m8-DK; Mon, 31 Dec 2012 12:41:34 +0000\r
+Received: from 188.31.19.240.threembb.co.uk ([188.31.19.240] helo=localhost)\r
+ by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.69)\r
+ (envelope-from <m.walters@qmul.ac.uk>)\r
+ id 1TpegN-0000SS-2S; Mon, 31 Dec 2012 12:41:32 +0000\r
+From: Mark Walters <markwalters1009@gmail.com>\r
+To: Austin Clements <amdragon@MIT.EDU>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH v4 4/5] dump/restore: Use Xapian queries for batch-tag\r
+ format\r
+In-Reply-To: <1356936162-2589-5-git-send-email-amdragon@mit.edu>\r
+References: <1356936162-2589-1-git-send-email-amdragon@mit.edu>\r
+ <1356936162-2589-5-git-send-email-amdragon@mit.edu>\r
+User-Agent: Notmuch/0.14+236~g1d0044f (http://notmuchmail.org) Emacs/23.4.1\r
+ (x86_64-pc-linux-gnu)\r
+Date: Mon, 31 Dec 2012 12:41:37 +0000\r
+Message-ID: <87zk0utmv2.fsf@qmul.ac.uk>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=us-ascii\r
+X-Sender-Host-Address: 188.31.19.240\r
+X-QM-SPAM-Info: Sender has good ham record. :)\r
+X-QM-Body-MD5: 7c47a787259cd84677ab306e72d8c75a (of first 20000 bytes)\r
+X-SpamAssassin-Score: -1.8\r
+X-SpamAssassin-SpamBar: -\r
+X-SpamAssassin-Report: The QM spam filters have analysed this message to\r
+ determine if it is\r
+ spam. We require at least 5.0 points to mark a message as spam.\r
+ This message scored -1.8 points.\r
+ Summary of the scoring: \r
+ * -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/,\r
+ * medium trust\r
+ * [138.37.6.40 listed in list.dnswl.org]\r
+ * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail\r
+ provider * (markwalters1009[at]gmail.com)\r
+ * -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay\r
+ * domain\r
+ * 0.5 AWL AWL: From: address is in the auto white-list\r
+X-QM-Scan-Virus: ClamAV says the message is clean\r
+Cc: tomi.ollila@iki.fi\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Mon, 31 Dec 2012 12:41:38 -0000\r
+\r
+On Mon, 31 Dec 2012, Austin Clements <amdragon@MIT.EDU> wrote:\r
+> This switches the new batch-tag format away from using a home-grown\r
+> hex-encoding scheme for message IDs in the dump to simply using Xapian\r
+> queries with Xapian quoting syntax.\r
+>\r
+> This has a variety of advantages beyond presenting a cleaner and more\r
+> consistent interface. Foremost is that it will dramatically simplify\r
+> the quoting for batch tagging, which shares the same input format.\r
+> While the hex-encoding is no better or worse for the simple ID queries\r
+> used by dump/restore, it becomes onerous for general-purpose queries\r
+> used in batch tagging. It also better handles strange cases like\r
+> "id:foo and bar", since this is no longer syntactically valid.\r
+\r
+This series as a whole looks good to me modulo the allocation query for\r
+parse_boolean_term and a couple of trivial points below.\r
+\r
+> ---\r
+> notmuch-dump.c | 9 +++++----\r
+> notmuch-restore.c | 22 ++++++++++------------\r
+> tag-util.c | 6 ------\r
+> test/dump-restore | 14 ++++++++------\r
+> 4 files changed, 23 insertions(+), 28 deletions(-)\r
+>\r
+> diff --git a/notmuch-dump.c b/notmuch-dump.c\r
+> index 29d79da..bf01a39 100644\r
+> --- a/notmuch-dump.c\r
+> +++ b/notmuch-dump.c\r
+> @@ -20,6 +20,7 @@\r
+> \r
+> #include "notmuch-client.h"\r
+> #include "dump-restore-private.h"\r
+> +#include "string-util.h"\r
+> \r
+> int\r
+> notmuch_dump_command (unused (void *ctx), int argc, char *argv[])\r
+> @@ -141,13 +142,13 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])\r
+> fprintf (stderr, "Error: cannot dump message id containing line break: %s\n", message_id);\r
+> return 1;\r
+> }\r
+> - if (hex_encode (notmuch, message_id,\r
+> - &buffer, &buffer_size) != HEX_SUCCESS) {\r
+> - fprintf (stderr, "Error: failed to hex-encode msg-id %s\n",\r
+> + if (make_boolean_term (notmuch, "id", message_id,\r
+> + &buffer, &buffer_size)) {\r
+> + fprintf (stderr, "Error: failed to quote message id %s\n",\r
+> message_id);\r
+> return 1;\r
+> }\r
+> - fprintf (output, " -- id:%s\n", buffer);\r
+> + fprintf (output, " -- %s\n", buffer);\r
+> }\r
+> \r
+> notmuch_message_destroy (message);\r
+> diff --git a/notmuch-restore.c b/notmuch-restore.c\r
+> index 9ed9b51..77a4c27 100644\r
+> --- a/notmuch-restore.c\r
+> +++ b/notmuch-restore.c\r
+> @@ -207,7 +207,7 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])\r
+> INTERNAL_ERROR ("compile time constant regex failed.");\r
+> \r
+> do {\r
+> - char *query_string;\r
+> + char *query_string, *prefix, *term;\r
+> \r
+> if (line_ctx != NULL)\r
+> talloc_free (line_ctx);\r
+> @@ -220,19 +220,17 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])\r
+> &query_string, tag_ops);\r
+> \r
+> if (ret == 0) {\r
+> - if (strncmp ("id:", query_string, 3) != 0) {\r
+> - fprintf (stderr, "Warning: unsupported query: %s\n", query_string);\r
+> + ret = parse_boolean_term (line_ctx, query_string,\r
+> + &prefix, &term);\r
+> + if (ret) {\r
+> + fprintf (stderr, "Warning: cannot parse query: %s\n",\r
+> + query_string);\r
+> + continue;\r
+> + } else if (strcmp ("id", prefix) != 0) {\r
+> + fprintf (stderr, "Warning: not an id query: %s\n", query_string);\r
+> continue;\r
+\r
+I think it would be worth adding "skipping" or something similar to this\r
+fprintf, as it may not be obvious whether we warn but tag anyway or warn\r
+and skip. Perhaps add it to the previous warning too, although there it\r
+is obvious that we can't do anything but skip.\r
+\r
+> }\r
+> - /* delete id: from front of string; tag_message\r
+> - * expects a raw message-id.\r
+> - *\r
+> - * XXX: Note that query string id:foo and bar will be\r
+> - * interpreted as a message id "foo and bar". This\r
+> - * should eventually be fixed to give a better error\r
+> - * message.\r
+> - */\r
+> - query_string = query_string + 3;\r
+> + query_string = term;\r
+> }\r
+> }\r
+> \r
+> diff --git a/tag-util.c b/tag-util.c\r
+> index 705b7ba..e4e5dda 100644\r
+> --- a/tag-util.c\r
+> +++ b/tag-util.c\r
+> @@ -124,12 +124,6 @@ parse_tag_line (void *ctx, char *line,\r
+> }\r
+> \r
+> /* tok now points to the query string */\r
+> - if (hex_decode_inplace (tok) != HEX_SUCCESS) {\r
+> - ret = line_error (TAG_PARSE_INVALID, line_for_error,\r
+> - "hex decoding of query %s failed", tok);\r
+> - goto DONE;\r
+> - }\r
+> -\r
+> *query_string = tok;\r
+> \r
+> DONE:\r
+> diff --git a/test/dump-restore b/test/dump-restore\r
+> index 6a989b6..f9ae5b3 100755\r
+> --- a/test/dump-restore\r
+> +++ b/test/dump-restore\r
+> @@ -195,23 +195,25 @@ a\r
+> \r
+> # the previous line was blank; also no yelling please\r
+> +%zz -- id:whatever\r
+> -+e +f id:%yy\r
+> ++e +f id:"\r
+> ++e +f tag:abc\r
+\r
+It might be worth adding some more test lines here to exercise the\r
+various paths in parse_boolean_term, along the lines of:\r
++e -- id:some)stuff\r
++e -- id:some"stuff\r
++e -- id:some stuff\r
++e -- id:"a_message_id_with""_a_quote"\r
++e -- id:"a message id with spaces"\r
+\r
+One other thing noticeable from the errors: most of them are very\r
+informative, but the parse_boolean_term one is relatively uninformative.\r
+It just says we cannot parse the query, even though we know rather more\r
+about what went wrong (trailing text, no closing quote, illegal\r
+character in an unquoted id, etc.). I am happy with it as it is, but\r
+perhaps David Bremner might like to comment?\r
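+\r
+To make the point about error detail concrete, here is a minimal,\r
+self-contained sketch in plain C. It is not the parse_boolean_term from\r
+this series: the names parse_term_sketch, term_status_t and\r
+term_strerror are hypothetical, it uses malloc rather than talloc, and\r
+its rules for unquoted terms (whitespace ends the term; a quote or\r
+parenthesis is rejected) are assumptions. It only shows how each failure\r
+could carry its own status.\r
+\r

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes: one per failure the reviewer lists. */
typedef enum {
    TERM_OK = 0,
    TERM_NO_PREFIX,       /* no "prefix:" in front of the term */
    TERM_UNCLOSED_QUOTE,  /* quoted term without a closing quote */
    TERM_TRAILING_TEXT,   /* extra text after the term */
    TERM_BAD_CHAR,        /* assumed-illegal character in an unquoted term */
    TERM_NO_MEMORY
} term_status_t;

/* Map a status to a message suitable for a warning. */
const char *
term_strerror (term_status_t status)
{
    switch (status) {
    case TERM_OK:             return "no error";
    case TERM_NO_PREFIX:      return "missing prefix";
    case TERM_UNCLOSED_QUOTE: return "no closing quote";
    case TERM_TRAILING_TEXT:  return "trailing text after term";
    case TERM_BAD_CHAR:       return "illegal character in unquoted term";
    case TERM_NO_MEMORY:      return "out of memory";
    }
    return "unknown error";
}

/* Split "prefix:term" into newly malloc'd strings. A quoted term uses
 * "" for a literal quote. On failure nothing is allocated for the caller. */
term_status_t
parse_term_sketch (const char *str, char **prefix_out, char **term_out)
{
    const char *colon, *pos;
    char *term, *out, *prefix;
    term_status_t status = TERM_OK;

    *prefix_out = *term_out = NULL;

    while (*str == ' ' || *str == '\t')
        str++;
    colon = strchr (str, ':');
    if (colon == NULL || colon == str)
        return TERM_NO_PREFIX;

    pos = colon + 1;
    /* The decoded term is never longer than the remaining input. */
    term = malloc (strlen (pos) + 1);
    if (term == NULL)
        return TERM_NO_MEMORY;
    out = term;

    if (*pos == '"') {
        int closed = 0;

        pos++;
        while (*pos) {
            if (*pos == '"') {
                pos++;
                if (*pos != '"') {
                    closed = 1;   /* a lone quote closes the term */
                    break;
                }
                /* doubled quote: fall through and copy one quote */
            }
            *out++ = *pos++;
        }
        if (! closed)
            status = TERM_UNCLOSED_QUOTE;
    } else {
        while (*pos && *pos != ' ' && *pos != '\t') {
            if (*pos == '"' || *pos == '(' || *pos == ')') {
                status = TERM_BAD_CHAR;
                break;
            }
            *out++ = *pos++;
        }
    }

    if (status == TERM_OK) {
        while (*pos == ' ' || *pos == '\t')
            pos++;
        if (*pos)
            status = TERM_TRAILING_TEXT;
    }
    if (status != TERM_OK) {
        free (term);
        return status;
    }
    *out = '\0';

    prefix = malloc ((size_t) (colon - str) + 1);
    if (prefix == NULL) {
        free (term);
        return TERM_NO_MEMORY;
    }
    memcpy (prefix, str, (size_t) (colon - str));
    prefix[colon - str] = '\0';

    *prefix_out = prefix;
    *term_out = term;
    return TERM_OK;
}
```

+\r
+With these assumed rules, id:some)stuff and id:some"stuff report an\r
+illegal character, id:some stuff reports trailing text, id:" reports a\r
+missing closing quote, and the two quoted examples above parse cleanly.\r
+A caller could then append term_strerror (status) to the warning.\r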
+\r
+Best wishes\r
+\r
+Mark\r
+\r
+\r
+\r
+\r
+> # the next non-comment line should report an an empty tag error for\r
+> # batch tagging, but not for restore\r
+> + +e -- id:20091117232137.GA7669@griffis1.net\r
+> -# highlight the sketchy id parsing; this should be last\r
+> -+g -- id:foo and bar\r
+> +# valid id, but warning about missing message\r
+> ++e id:missing_message_id\r
+> EOF\r
+> \r
+> cat <<EOF > EXPECTED\r
+> -Warning: unsupported query: a\r
+> +Warning: cannot parse query: a\r
+> Warning: no query string [+0]\r
+> Warning: no query string [+a +b]\r
+> Warning: missing query string [+a +b ]\r
+> Warning: no query string after -- [+c +d --]\r
+> Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]\r
+> -Warning: hex decoding of query id:%yy failed [+e +f id:%yy]\r
+> -Warning: cannot apply tags to missing message: foo and bar\r
+> +Warning: cannot parse query: id:"\r
+> +Warning: not an id query: tag:abc\r
+> +Warning: cannot apply tags to missing message: missing_message_id\r
+> EOF\r
+> \r
+> test_expect_equal_file EXPECTED OUTPUT\r
+> -- \r
+> 1.7.10.4\r