From: Austin Clements
To: notmuch@notmuchmail.org
Cc: tomi.ollila@iki.fi
Subject: [PATCH v5 0/6] Use Xapian query syntax for batch-tag dump/restore
Date: Sun, 6 Jan 2013 15:22:36 -0500
Message-Id: <1357503762-28759-1-git-send-email-amdragon@mit.edu>

This obsoletes

id:1356936162-2589-1-git-send-email-amdragon@mit.edu

v5 should address all of the comments on v4 except those I
specifically replied to (via the ML or IRC). It also adds a new patch
at the beginning that makes missing message IDs non-fatal in restore,
like they were in 0.14. This patch can be pushed separately; it's in
this series because later tests rely on it.

The diff from v4 follows.

diff --git a/notmuch-dump.c b/notmuch-dump.c
index bf01a39..a3244e0 100644
--- a/notmuch-dump.c
+++ b/notmuch-dump.c
@@ -103,6 +103,18 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
         message = notmuch_messages_get (messages);
         message_id = notmuch_message_get_message_id (message);
 
+        if (output_format == DUMP_FORMAT_BATCH_TAG &&
+            strchr (message_id, '\n')) {
+            /* This will produce a line break in the output, which
+             * would be difficult to handle in tools. However, it's
+             * also impossible to produce an email containing a line
+             * break in a message ID because of unfolding, so we can
+             * safely disallow it. */
+            fprintf (stderr, "Warning: skipping message id containing line break: \"%s\"\n", message_id);
+            notmuch_message_destroy (message);
+            continue;
+        }
+
         if (output_format == DUMP_FORMAT_SUP) {
             fprintf (output, "%s (", message_id);
         }
@@ -133,19 +145,10 @@ notmuch_dump_command (unused (void *ctx), int argc, char *argv[])
         if (output_format == DUMP_FORMAT_SUP) {
             fputs (")\n", output);
         } else {
-            if (strchr (message_id, '\n')) {
-                /* This will produce a line break in the output, which
-                 * would be difficult to handle in tools. However,
-                 * it's also impossible to produce an email containing
-                 * a line break in a message ID because of unfolding,
-                 * so we can safely disallow it. */
-                fprintf (stderr, "Error: cannot dump message id containing line break: %s\n", message_id);
-                return 1;
-            }
             if (make_boolean_term (notmuch, "id", message_id,
                                    &buffer, &buffer_size)) {
-                fprintf (stderr, "Error: failed to quote message id %s\n",
-                         message_id);
+                fprintf (stderr, "Error quoting message id %s: %s\n",
+                         message_id, strerror (errno));
                 return 1;
             }
             fprintf (output, " -- %s\n", buffer);
diff --git a/notmuch-restore.c b/notmuch-restore.c
index 77a4c27..81d4d98 100644
--- a/notmuch-restore.c
+++ b/notmuch-restore.c
@@ -26,7 +26,8 @@ static regex_t regex;
 /* Non-zero return indicates an error in retrieving the message,
- * or in applying the tags.
+ * or in applying the tags. Missing messages are reported, but not
+ * considered errors.
  */
 static int
 tag_message (unused (void *ctx),
@@ -40,13 +41,17 @@ tag_message (unused (void *ctx),
     int ret = 0;
 
     status = notmuch_database_find_message (notmuch, message_id, &message);
-    if (status || message == NULL) {
-        fprintf (stderr, "Warning: cannot apply tags to %smessage: %s\n",
-                 message ? "" : "missing ", message_id);
-        if (status)
-            fprintf (stderr, "%s\n", notmuch_status_to_string (status));
+    if (status) {
+        fprintf (stderr, "Error applying tags to message %s: %s\n",
+                 message_id, notmuch_status_to_string (status));
         return 1;
     }
+    if (message == NULL) {
+        fprintf (stderr, "Warning: cannot apply tags to missing message: %s\n",
+                 message_id);
+        /* We consider this a non-fatal error. */
+        return 0;
+    }
 
     /* In order to detect missing messages, this check/optimization is
      * intentionally done *after* first finding the message. */
@@ -222,12 +227,17 @@ notmuch_restore_command (unused (void *ctx), int argc, char *argv[])
             if (ret == 0) {
                 ret = parse_boolean_term (line_ctx, query_string,
                                           &prefix, &term);
-                if (ret) {
-                    fprintf (stderr, "Warning: cannot parse query: %s\n",
-                             query_string);
+                if (ret && errno == EINVAL) {
+                    fprintf (stderr, "Warning: cannot parse query: %s (skipping)\n", query_string);
                     continue;
+                } else if (ret) {
+                    /* This is more fatal (e.g., out of memory) */
+                    fprintf (stderr, "Error parsing query: %s\n",
+                             strerror (errno));
+                    ret = 1;
+                    break;
                 } else if (strcmp ("id", prefix) != 0) {
-                    fprintf (stderr, "Warning: not an id query: %s\n", query_string);
+                    fprintf (stderr, "Warning: not an id query: %s (skipping)\n", query_string);
                     continue;
                 }
                 query_string = term;
diff --git a/test/dump-restore b/test/dump-restore
index f9ae5b3..f076c12 100755
--- a/test/dump-restore
+++ b/test/dump-restore
@@ -202,18 +202,32 @@ a
 +
 +e -- id:20091117232137.GA7669@griffis1.net
 # valid id, but warning about missing message
 +e id:missing_message_id
+# exercise parser
++e -- id:some)stuff
++e -- id:some stuff
++e -- id:some"stuff
++e -- id:"a_message_id_with""_a_quote"
++e -- id:"a message id with spaces"
++e -- id:an_id_with_leading_and_trailing_ws \
+
 EOF
 cat <<EOF > EXPECTED
-Warning: cannot parse query: a
+Warning: cannot parse query: a (skipping)
 Warning: no query string [+0]
 Warning: no query string [+a +b]
 Warning: missing query string [+a +b ]
 Warning: no query string after -- [+c +d --]
 Warning: hex decoding of tag %zz failed [+%zz -- id:whatever]
-Warning: cannot parse query: id:"
-Warning: not an id query: tag:abc
+Warning: cannot parse query: id:" (skipping)
+Warning: not an id query: tag:abc (skipping)
 Warning: cannot apply tags to missing message: missing_message_id
+Warning: cannot parse query: id:some)stuff (skipping)
+Warning: cannot parse query: id:some stuff (skipping)
+Warning: cannot apply tags to missing message: some"stuff
+Warning: cannot apply tags to missing message: a_message_id_with"_a_quote
+Warning: cannot apply tags to missing message: a message id with spaces
+Warning: cannot apply tags to missing message: an_id_with_leading_and_trailing_ws
 EOF
 test_expect_equal_file EXPECTED OUTPUT
diff --git a/util/string-util.c b/util/string-util.c
index 52c7781..aba9aa8 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -23,6 +23,7 @@
 #include "talloc.h"
 
 #include
+#include <errno.h>
 
 char *
 strtok_len (char *s, const char *delim, size_t *len)
@@ -36,6 +37,12 @@ strtok_len (char *s, const char *delim, size_t *len)
     return *len ? s : NULL;
 }
 
+static int
+is_unquoted_terminator (unsigned char c)
+{
+    return c == 0 || c <= ' ' || c == ')';
+}
+
 int
 make_boolean_term (void *ctx, const char *prefix, const char *term,
                    char **buf, size_t *len)
@@ -49,7 +56,8 @@ make_boolean_term (void *ctx, const char *prefix, const char *term,
      * containing a quote, even though it only matters at the
      * beginning, and anything containing non-ASCII text. */
     for (in = term; *in && !need_quoting; in++)
-        if (*in <= ' ' || *in == ')' || *in == '"' || (unsigned char)*in > 127)
+        if (is_unquoted_terminator (*in) || *in == '"'
+            || (unsigned char)*in > 127)
             need_quoting = 1;
 
     if (need_quoting)
@@ -67,8 +75,10 @@ make_boolean_term (void *ctx, const char *prefix, const char *term,
         *buf = talloc_realloc (ctx, *buf, char, *len);
     }
 
-    if (! *buf)
-        return 1;
+    if (! *buf) {
+        errno = ENOMEM;
+        return -1;
+    }
 
     out = *buf;
 
@@ -102,7 +112,7 @@ make_boolean_term (void *ctx, const char *prefix, const char *term,
 static const char*
 skip_space (const char *str)
 {
-    while (*str && isspace (*str))
+    while (*str && isspace ((unsigned char) *str))
         ++str;
     return str;
 }
@@ -111,6 +121,7 @@ int
 parse_boolean_term (void *ctx, const char *str,
                     char **prefix_out, char **term_out)
 {
+    int err = EINVAL;
     *prefix_out = *term_out = NULL;
 
     /* Parse prefix */
@@ -119,12 +130,20 @@ parse_boolean_term (void *ctx, const char *str,
     if (! pos)
         goto FAIL;
     *prefix_out = talloc_strndup (ctx, str, pos - str);
+    if (! *prefix_out) {
+        err = ENOMEM;
+        goto FAIL;
+    }
     ++pos;
 
     /* Implement de-quoting compatible with make_boolean_term. */
     if (*pos == '"') {
         char *out = talloc_array (ctx, char, strlen (pos));
         int closed = 0;
+        if (! out) {
+            err = ENOMEM;
+            goto FAIL;
+        }
         *term_out = out;
         /* Skip the opening quote, find the closing quote, and
          * un-double doubled internal quotes. */
@@ -148,18 +167,25 @@ parse_boolean_term (void *ctx, const char *str,
     } else {
         const char *start = pos;
         /* Check for text after the boolean term. */
-        while (*pos > ' ' && *pos != ')')
+        while (! is_unquoted_terminator (*pos))
             ++pos;
-        if (*skip_space (pos))
+        if (*skip_space (pos)) {
+            err = EINVAL;
             goto FAIL;
+        }
         /* No trailing text; dup the string so the caller can free
          * it. */
         *term_out = talloc_strndup (ctx, start, pos - start);
+        if (! *term_out) {
+            err = ENOMEM;
+            goto FAIL;
+        }
     }
     return 0;
 
   FAIL:
     talloc_free (*prefix_out);
     talloc_free (*term_out);
-    return 1;
+    errno = err;
+    return -1;
 }
diff --git a/util/string-util.h b/util/string-util.h
index 8b9fe50..0194607 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -28,7 +28,8 @@ char *strtok_len (char *s, const char *delim, size_t *len);
  * can be parsed by parse_boolean_term.
  *
  * Output is into buf; it may be talloc_realloced.
- * Return: 0 on success, non-zero on memory allocation failure.
+ * Return: 0 on success, -1 on error. errno will be set to ENOMEM if
+ * there is an allocation failure.
  */
 int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
                        char **buf, size_t *len);
@@ -42,7 +43,8 @@ int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
  * of the quoting styles supported by Xapian (and hence notmuch).
  * *prefix_out and *term_out will be talloc'd with context ctx.
  *
- * Return: 0 on success, -1 on error. errno will be set to EINVAL if
+ * Return: 0 on success, -1 on error. errno will be set to EINVAL if
+ * there is a parse error or ENOMEM if there is an allocation failure.
  */
 int
 parse_boolean_term (void *ctx, const char *str,
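
For anyone who wants to poke at the new error convention outside the
tree, here is a rough standalone sketch of it. It is not the code from
this series: parse_term_sketch, dup_n and check_query_sketch are
hypothetical names, it uses malloc rather than talloc, and it handles
only unquoted terms. What it is meant to show is the contract from
string-util.h (0 on success, -1 with errno set to EINVAL or ENOMEM)
and the restore-side policy of skipping on EINVAL while treating
anything else as fatal.

```c
#include <ctype.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* NUL, control characters, space and ')' end an unquoted term. */
static int
is_unquoted_terminator (unsigned char c)
{
    return c <= ' ' || c == ')';
}

/* Hypothetical helper: malloc'd copy of the first n bytes of s. */
static char *
dup_n (const char *s, size_t n)
{
    char *out = malloc (n + 1);
    if (out) {
        memcpy (out, s, n);
        out[n] = '\0';
    }
    return out;
}

/* Hypothetical, simplified stand-in for parse_boolean_term: unquoted
 * terms only, malloc instead of talloc.  Returns 0 on success, or -1
 * with errno set to EINVAL (parse error) or ENOMEM. */
static int
parse_term_sketch (const char *str, char **prefix_out, char **term_out)
{
    int err = EINVAL;
    const char *pos, *start;

    *prefix_out = *term_out = NULL;

    pos = strchr (str, ':');
    if (! pos)
        goto FAIL;
    *prefix_out = dup_n (str, pos - str);
    if (! *prefix_out) {
        err = ENOMEM;
        goto FAIL;
    }

    start = ++pos;
    while (! is_unquoted_terminator ((unsigned char) *pos))
        ++pos;
    *term_out = dup_n (start, pos - start);
    if (! *term_out) {
        err = ENOMEM;
        goto FAIL;
    }

    /* Only whitespace may follow the term. */
    while (*pos && isspace ((unsigned char) *pos))
        ++pos;
    if (*pos)
        goto FAIL;
    return 0;

  FAIL:
    free (*prefix_out);
    free (*term_out);
    *prefix_out = *term_out = NULL;
    errno = err;
    return -1;
}

/* Caller-side policy modelled on the restore loop: 0 = usable id
 * query, 1 = skipped with a warning, -1 = fatal error. */
int
check_query_sketch (const char *query)
{
    char *prefix, *term;
    int ret = 0;

    if (parse_term_sketch (query, &prefix, &term)) {
        if (errno == EINVAL) {
            fprintf (stderr, "Warning: cannot parse query: %s (skipping)\n",
                     query);
            return 1;
        }
        fprintf (stderr, "Error parsing query: %s\n", strerror (errno));
        return -1;
    }
    if (strcmp ("id", prefix) != 0) {
        fprintf (stderr, "Warning: not an id query: %s (skipping)\n", query);
        ret = 1;
    }
    free (prefix);
    free (term);
    return ret;
}
```

Calling check_query_sketch on the unquoted inputs from the new test
cases gives the same skip/accept split as the expected output above;
the quoted forms are outside what this sketch covers.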