From: Austin Clements
To: notmuch@notmuchmail.org
Subject: [PATCH 1/5] util: Factor out boolean term quoting routine
Date: Tue, 25 Dec 2012 00:57:52 -0500
Message-Id: <1356415076-5692-2-git-send-email-amdragon@mit.edu>
X-Mailer: git-send-email 1.7.10.4
In-Reply-To: <1356415076-5692-1-git-send-email-amdragon@mit.edu>
References: <1356415076-5692-1-git-send-email-amdragon@mit.edu>
List-Id: "Use and development of the notmuch mail system."

From: Austin Clements

This is now a generic boolean term quoting function.  It performs
minimal quoting to produce user-friendly queries.

This could live in tag-util as well, but it is really nothing specific
to tags (although the conventions are specific to Xapian).

The API is changed from "caller-allocates" to "readline-like".  The
scan for max tag length is pushed down into the quoting routine.
Furthermore, this now combines the term prefix with the quoted term;
arguably this is just as easy to do in the caller, but this will
nicely parallel the boolean term parsing function to be introduced
shortly.

This is an amalgamation of code written by David Bremner and myself.
---
 notmuch-tag.c      |   48 ++++++++++++----------------------------
 util/string-util.c |   62 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 util/string-util.h |    9 ++++++++
 3 files changed, 85 insertions(+), 34 deletions(-)

diff --git a/notmuch-tag.c b/notmuch-tag.c
index 88d559b..fc9d43a 100644
--- a/notmuch-tag.c
+++ b/notmuch-tag.c
@@ -19,6 +19,7 @@
  */
 
 #include "notmuch-client.h"
+#include "string-util.h"
 
 static volatile sig_atomic_t interrupted;
 
@@ -35,25 +36,6 @@ handle_sigint (unused (int sig))
     interrupted = 1;
 }
 
-static char *
-_escape_tag (char *buf, const char *tag)
-{
-    const char *in = tag;
-    char *out = buf;
-
-    /* Boolean terms surrounded by double quotes can contain any
-     * character.  Double quotes are quoted by doubling them. */
-    *out++ = '"';
-    while (*in) {
-        if (*in == '"')
-            *out++ = '"';
-        *out++ = *in++;
-    }
-    *out++ = '"';
-    *out = 0;
-    return buf;
-}
-
 typedef struct {
     const char *tag;
     notmuch_bool_t remove;
@@ -71,25 +53,16 @@ _optimize_tag_query (void *ctx, const char *orig_query_string,
      * parenthesize and the exclusion part of the query must not use
      * the '-' operator (though the NOT operator is fine). */
 
-    char *escaped, *query_string;
+    char *escaped = NULL;
+    size_t escaped_len = 0;
+    char *query_string;
     const char *join = "";
-    int i;
-    unsigned int max_tag_len = 0;
+    size_t i;
 
     /* Don't optimize if there are no tag changes. */
     if (tag_ops[0].tag == NULL)
         return talloc_strdup (ctx, orig_query_string);
 
-    /* Allocate a buffer for escaping tags.  This is large enough to
-     * hold a fully escaped tag with every character doubled plus
-     * enclosing quotes and a NUL. */
-    for (i = 0; tag_ops[i].tag; i++)
-        if (strlen (tag_ops[i].tag) > max_tag_len)
-            max_tag_len = strlen (tag_ops[i].tag);
-    escaped = talloc_array (ctx, char, max_tag_len * 2 + 3);
-    if (! escaped)
-        return NULL;
-
     /* Build the new query string */
     if (strcmp (orig_query_string, "*") == 0)
         query_string = talloc_strdup (ctx, "(");
@@ -97,10 +70,17 @@ _optimize_tag_query (void *ctx, const char *orig_query_string,
         query_string = talloc_asprintf (ctx, "( %s ) and (", orig_query_string);
 
     for (i = 0; tag_ops[i].tag && query_string; i++) {
+        /* XXX in case of OOM, query_string will be deallocated when
+         * ctx is, which might be at shutdown */
+        if (make_boolean_term (ctx,
+                               "tag", tag_ops[i].tag,
+                               &escaped, &escaped_len))
+            return NULL;
+
         query_string = talloc_asprintf_append_buffer (
-            query_string, "%s%stag:%s", join,
+            query_string, "%s%s%s", join,
             tag_ops[i].remove ? "" : "not ",
-            _escape_tag (escaped, tag_ops[i].tag));
+            escaped);
         join = " or ";
     }
 
diff --git a/util/string-util.c b/util/string-util.c
index 44f8cd3..161a4dd 100644
--- a/util/string-util.c
+++ b/util/string-util.c
@@ -20,6 +20,7 @@
 
 
 #include "string-util.h"
+#include "talloc.h"
 
 char *
 strtok_len (char *s, const char *delim, size_t *len)
@@ -32,3 +33,64 @@ strtok_len (char *s, const char *delim, size_t *len)
 
     return *len ? s : NULL;
 }
+
+int
+make_boolean_term (void *ctx, const char *prefix, const char *term,
+                   char **buf, size_t *len)
+{
+    const char *in;
+    char *out;
+    size_t needed = 3;
+    int need_quoting = 0;
+
+    /* Do we need quoting? */
+    for (in = term; *in && !need_quoting; in++)
+        if (*in <= ' ' || *in == ')' || *in == '"')
+            need_quoting = 1;
+
+    if (need_quoting)
+        for (in = term; *in; in++)
+            needed += (*in == '"') ? 2 : 1;
+    else
+        needed = strlen (term) + 1;
+
+    /* Reserve space for the prefix */
+    if (prefix)
+        needed += strlen (prefix) + 1;
+
+    if ((*buf == NULL) || (needed > *len)) {
+        *len = 2 * needed;
+        *buf = talloc_realloc (ctx, *buf, char, *len);
+    }
+
+    if (! *buf)
+        return 1;
+
+    out = *buf;
+
+    /* Copy in the prefix */
+    if (prefix) {
+        strcpy (out, prefix);
+        out += strlen (prefix);
+        *out++ = ':';
+    }
+
+    if (! need_quoting) {
+        strcpy (out, term);
+        return 0;
+    }
+
+    /* Quote term by enclosing it in double quotes and doubling any
+     * internal double quotes. */
+    *out++ = '"';
+    in = term;
+    while (*in) {
+        if (*in == '"')
+            *out++ = '"';
+        *out++ = *in++;
+    }
+    *out++ = '"';
+    *out = '\0';
+
+    return 0;
+}
diff --git a/util/string-util.h b/util/string-util.h
index ac7676c..7475e2c 100644
--- a/util/string-util.h
+++ b/util/string-util.h
@@ -19,4 +19,13 @@
 
 char *strtok_len (char *s, const char *delim, size_t *len);
 
+/* Construct a boolean term query with the specified prefix (e.g.,
+ * "id") and search term, quoting term as necessary.
+ *
+ * Output is into buf; it may be talloc_realloced.
+ * Return: 0 on success, non-zero on memory allocation failure.
+ */
+int make_boolean_term (void *talloc_ctx, const char *prefix, const char *term,
+                       char **buf, size_t *len);
+
 #endif
-- 
1.7.10.4
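
For anyone who wants to try the quoting rule and the "readline-like" buffer protocol from the commit message without building notmuch, here is a rough standalone sketch. It is not the code from the patch: sketch_boolean_term is a hypothetical name, and plain realloc stands in for talloc_realloc, so there is no talloc context argument and the caller frees the buffer with free. The quoting test is the same expression the patch uses.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for make_boolean_term.  Assumption: realloc
 * replaces talloc_realloc, so the caller owns *buf and must free it.
 * Returns 0 on success, 1 on allocation failure. */
int
sketch_boolean_term (const char *prefix, const char *term,
                     char **buf, size_t *len)
{
    const char *in;
    char *out;
    size_t needed;
    int need_quoting = 0;

    assert (term != NULL && buf != NULL && len != NULL);

    /* Same test as the patch: quote on space/control characters,
     * ')' or '"'. */
    for (in = term; *in && ! need_quoting; in++)
        if (*in <= ' ' || *in == ')' || *in == '"')
            need_quoting = 1;

    needed = strlen (term) + 1;            /* term plus NUL */
    if (need_quoting) {
        needed += 2;                       /* enclosing quotes */
        for (in = term; *in; in++)
            if (*in == '"')
                needed++;                  /* inner quotes are doubled */
    }
    if (prefix)
        needed += strlen (prefix) + 1;     /* prefix plus ':' */

    /* Grow the caller's buffer only when it is missing or too small;
     * over-allocate so later calls can often reuse it. */
    if (*buf == NULL || needed > *len) {
        char *grown = realloc (*buf, 2 * needed);
        if (! grown)
            return 1;
        *buf = grown;
        *len = 2 * needed;
    }

    out = *buf;
    if (prefix) {
        strcpy (out, prefix);
        out += strlen (prefix);
        *out++ = ':';
    }

    if (! need_quoting) {
        strcpy (out, term);
        return 0;
    }

    *out++ = '"';
    for (in = term; *in; in++) {
        if (*in == '"')
            *out++ = '"';
        *out++ = *in;
    }
    *out++ = '"';
    *out = '\0';
    return 0;
}
```

Starting from a NULL buffer and a zero length, a call with prefix "tag" and term "inbox" leaves tag:inbox in the buffer, and a later call with term "to do" reuses the same buffer and leaves tag:"to do". This is the pattern the loop in _optimize_tag_query relies on: one buffer, initialised once, handed back to the routine on every iteration.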