From: Jani Nikula
To: david@tethera.net, notmuch@notmuchmail.org
Cc: David Bremner
Subject: Re: [Patch v7 04/14] notmuch-tag: factor out double quoting routine
Date: Sun, 16 Dec 2012 00:20:42 +0200
Message-ID: <87zk1fot39.fsf@nikula.org>
In-Reply-To: <1355492062-7546-5-git-send-email-david@tethera.net>
References: <1355492062-7546-1-git-send-email-david@tethera.net>
 <1355492062-7546-5-git-send-email-david@tethera.net>

On Fri, 14 Dec 2012, david@tethera.net wrote:
> From: David Bremner
>
> This could live in tag-util as well, but it is really nothing specific
> to tags (although the conventions are arguably specific to Xapian).
>
> The API is changed from "caller-allocates" to "readline-like". The scan for
> max tag length is pushed down into the double quoting routine.
> ---
>  notmuch-tag.c      |   50 ++++++++++++++++----------------------------------
>  util/string-util.c |   34 ++++++++++++++++++++++++++++++++++
>  util/string-util.h |    8 ++++++++
>  3 files changed, 58 insertions(+), 34 deletions(-)
>
> diff --git a/notmuch-tag.c b/notmuch-tag.c
> index 0965ee7..13f2268 100644
> --- a/notmuch-tag.c
> +++ b/notmuch-tag.c
> @@ -20,6 +20,7 @@
>
>  #include "notmuch-client.h"
>  #include "tag-util.h"
> +#include "string-util.h"
>
>  static volatile sig_atomic_t interrupted;
>
> @@ -37,25 +38,6 @@ handle_sigint (unused (int sig))
>  }
>
>  static char *
> -_escape_tag (char *buf, const char *tag)
> -{
> -    const char *in = tag;
> -    char *out = buf;
> -
> -    /* Boolean terms surrounded by double quotes can contain any
> -     * character. Double quotes are quoted by doubling them. */
> -    *out++ = '"';
> -    while (*in) {
> -        if (*in == '"')
> -            *out++ = '"';
> -        *out++ = *in++;
> -    }
> -    *out++ = '"';
> -    *out = 0;
> -    return buf;
> -}
> -
> -static char *
>  _optimize_tag_query (void *ctx, const char *orig_query_string,
>                       const tag_op_list_t *list)
>  {
> @@ -67,44 +49,44 @@ _optimize_tag_query (void *ctx, const char *orig_query_string,
>       * parenthesize and the exclusion part of the query must not use
>       * the '-' operator (though the NOT operator is fine). */
>
> -    char *escaped, *query_string;
> +    char *escaped = NULL;
> +    size_t escaped_len = 0;
> +    char *query_string;
>      const char *join = "";
>      size_t i;
> -    unsigned int max_tag_len = 0;
>
>      /* Don't optimize if there are no tag changes. */
>      if (tag_op_list_size (list) == 0)
>          return talloc_strdup (ctx, orig_query_string);
>
> -    /* Allocate a buffer for escaping tags. This is large enough to
> -     * hold a fully escaped tag with every character doubled plus
> -     * enclosing quotes and a NUL. */
> -    for (i = 0; i < tag_op_list_size (list); i++)
> -        if (strlen (tag_op_list_tag (list, i)) > max_tag_len)
> -            max_tag_len = strlen (tag_op_list_tag (list, i));
> -
> -    escaped = talloc_array (ctx, char, max_tag_len * 2 + 3);
> -    if (! escaped)
> -        return NULL;
> -
>      /* Build the new query string */
>      if (strcmp (orig_query_string, "*") == 0)
>          query_string = talloc_strdup (ctx, "(");
>      else
>          query_string = talloc_asprintf (ctx, "( %s ) and (", orig_query_string);
>
> +
> +    /* Boolean terms surrounded by double quotes can contain any
> +     * character. Double quotes are quoted by doubling them. */
> +
>      for (i = 0; i < tag_op_list_size (list) && query_string; i++) {
> +        double_quote_str (ctx,
> +                          tag_op_list_tag (list, i),
> +                          &escaped, &escaped_len);

Check return value?

> +
>          query_string = talloc_asprintf_append_buffer (
>              query_string, "%s%stag:%s", join,
>              tag_op_list_isremove (list, i) ? "" : "not ",
> -            _escape_tag (escaped, tag_op_list_tag (list, i)));
> +            escaped);
>          join = " or ";
>      }
>
>      if (query_string)
>          query_string = talloc_strdup_append_buffer (query_string, ")");
>
> -    talloc_free (escaped);
> +    if (escaped)
> +        talloc_free (escaped);
> +
>      return query_string;
>  }
>
> diff --git a/util/string-util.c b/util/string-util.c
> index 44f8cd3..ea7c25b 100644
> --- a/util/string-util.c
> +++ b/util/string-util.c
> @@ -20,6 +20,7 @@
>
>
>  #include "string-util.h"
> +#include "talloc.h"
>
>  char *
>  strtok_len (char *s, const char *delim, size_t *len)
> @@ -32,3 +33,36 @@ strtok_len (char *s, const char *delim, size_t *len)
>
>      return *len ? s : NULL;
>  }
> +
> +
> +int
> +double_quote_str (void *ctx, const char *str,
> +                  char **buf, size_t *len)
> +{
> +    const char *in;
> +    char *out;
> +    size_t needed = 3;
> +
> +    for (in = str; *in; in++)
> +        needed += (*in == '"') ? 2 : 1;
> +
> +    if (needed > *len)
> +        *buf = talloc_realloc (ctx, *buf, char, 2*needed);

You fail to set *len to 2*needed, leading to doing realloc every time.
Also, I think you should follow the getline pattern like you did in
hex_encode: if *buf == NULL, the input value of *len is ignored.

BR,
Jani.

> +
> +    if (! *buf)
> +        return 1;
> +
> +    out = *buf;
> +
> +    *out++ = '"';
> +    in = str;
> +    while (*in) {
> +        if (*in == '"')
> +            *out++ = '"';
> +        *out++ = *in++;
> +    }
> +    *out++ = '"';
> +    *out = 0;
> +
> +    return 0;
> +}
> diff --git a/util/string-util.h b/util/string-util.h
> index ac7676c..b593bc7 100644
> --- a/util/string-util.h
> +++ b/util/string-util.h
> @@ -19,4 +19,12 @@
>
>  char *strtok_len (char *s, const char *delim, size_t *len);
>
> +/* Copy str to dest, surrounding with double quotes.
> + * Any internal double-quotes are doubled, i.e. a"b -> "a""b"
> + *
> + * Output is into buf; it may be talloc_realloced
> + * return 0 on success, non-zero on failure.
> + */
> +int double_quote_str (void *talloc_ctx, const char *str,
> +                      char **buf, size_t *len);
>  #endif
> --
> 1.7.10.4
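
The two review points on double_quote_str (record the new capacity in
*len, and ignore the incoming *len when *buf is NULL) could be sketched
roughly as below. This is only an illustration under stated assumptions:
it uses plain realloc() in place of talloc_realloc() so that it compiles
without talloc, it drops the ctx argument for the same reason, and the
name double_quote_str_sketch is hypothetical, not part of the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for double_quote_str from the patch.
 * Assumption: realloc() replaces talloc_realloc(), so there is no ctx.
 * Returns 0 on success, non-zero on allocation failure. */
static int
double_quote_str_sketch (const char *str, char **buf, size_t *len)
{
    const char *in;
    char *out;
    size_t needed = 3;          /* two enclosing quotes plus the NUL */

    assert (str && buf && len);

    /* Each '"' in the input is doubled, everything else is copied. */
    for (in = str; *in; in++)
        needed += (*in == '"') ? 2 : 1;

    /* getline pattern: with a NULL buffer the incoming *len is ignored. */
    if (*buf == NULL)
        *len = 0;

    if (needed > *len) {
        char *grown = realloc (*buf, 2 * needed);
        if (! grown)
            return 1;           /* *buf and *len are left unchanged */
        *buf = grown;
        *len = 2 * needed;      /* record capacity so later calls reuse it */
    }

    out = *buf;
    *out++ = '"';
    for (in = str; *in; in++) {
        if (*in == '"')
            *out++ = '"';
        *out++ = *in;
    }
    *out++ = '"';
    *out = 0;

    return 0;
}
```

A caller would start with buf = NULL, pass &buf and &len on every
iteration, check the return value before using buf, and free the buffer
once after the loop. Because *len now tracks the real capacity, a second
call with a shorter or equally long tag should reuse the same allocation
instead of reallocating.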