From: Mark Walters
To: david@tethera.net, notmuch@notmuchmail.org
Subject: Re: [PATCH] simplify unhex_and_quote
In-Reply-To: <1356231570-28232-1-git-send-email-david@tethera.net>
References: <87txrdhd7g.fsf@oiva.home.nikula.org> <1356231570-28232-1-git-send-email-david@tethera.net>
Date: Sun, 23 Dec 2012 08:19:20 +0000
Message-ID: <87a9t5p4dz.fsf@qmul.ac.uk>
Cc: David Bremner

On Sun, 23 Dec 2012, david@tethera.net wrote:
> From: David Bremner
>
> the overgeneral definition of a prefix can be replaced by lower case
> alphabetic, and still work fine with current notmuch query syntax.
>
> token_len++ is moved to the end, and we restore the delimiter just so
> we can leave the string as we found it.
> ---
>
> As always, Jani has a keen eye for muddle. Except he's wrong about
> tok_len - prefix_len, and Mark and I are right. Hopefully ;).
>
> Restoring the delimiter at the end might be pointless (since the rest
> of the input line is modified), but it is one less surprise for somebody
> repurposing the function.

I am now worried about a side bit of Xapian syntax: in particular, what
about brackets? I think we could have

    (tag:inbox or tag:tag%20with%20spaces)

in which case the first token is (tag:inbox, which does not match the
prefix test.
Additionally, the third token is tag:tag%20with%20spaces), which
presumably gets quoted to tag:"tag with spaces)", and I am guessing
Xapian treats this differently from having the bracket after the closing
quote.

Finally, I don't know if a query can contain a : without being a prefix
query. If it can, that could end up being misquoted.

One possible way round the first problem might be to require the Xapian
syntax to be space-separated from the rest, but that does mean we are
diverging from the command line syntax.

(I am not very familiar with Xapian syntax, nor with quite where this
function is used, so I may be worrying about nothing.)

Best wishes

Mark

>
> Patches 5 and 6 can be ignored now.
>  tag-util.c |   12 ++++++++----
>  1 file changed, 8 insertions(+), 4 deletions(-)
>
> diff --git a/tag-util.c b/tag-util.c
> index b0a846b..ee28512 100644
> --- a/tag-util.c
> +++ b/tag-util.c
> @@ -78,11 +78,13 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
>          size_t prefix_len;
>          char delim = *(tok + tok_len);
>
> -        *(tok + tok_len++) = '\0';
> +        *(tok + tok_len) = '\0';
>
> -        prefix_len = hex_invariant (tok, tok_len);
> +        /* The following matches a superset of prefixes currently
> +         * used by notmuch */
> +        prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");
>
> -        if ((strcmp (tok, "*") == 0) || prefix_len >= tok_len - 1) {
> +        if ((strcmp (tok, "*") == 0) || prefix_len == tok_len) {
>
>              /* pass some things through without quoting or decoding.
>               * Note for '*' this is mandatory.
> @@ -98,7 +100,7 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
>
>          } else {
>              /* potential prefix: one for ':', then something after */
> -            if ((tok_len - prefix_len > 2) && *(tok + prefix_len) == ':') {
> +            if ((tok_len - prefix_len >= 2) && *(tok + prefix_len) == ':') {
>                  if (! (*query_string = talloc_strndup_append (*query_string,
>                                                                 tok,
>                                                                 prefix_len + 1))) {
> @@ -129,6 +131,8 @@ unhex_and_quote (void *ctx, char *encoded, const char *line_for_error,
>                      goto DONE;
>                  }
>              }
> +        /* restore the string and skip delimiter */
> +        *(tok + tok_len++) = delim;
>      }
>
>    DONE:
> --
> 1.7.10.4
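
To make the bracket concern from the reply concrete, here is a minimal
sketch that applies the patch's prefix tests to single, already-split
tokens. The names classify_token and enum token_kind are hypothetical and
do not appear in tag-util.c; the hex decoding and quoting done by the real
unhex_and_quote are left out, so this only shows which branch a token
would take.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of one space-delimited token.  These
 * names are illustrative only; they are not part of notmuch. */
enum token_kind {
    TOKEN_PASS_THROUGH, /* "*" or purely lower-case, e.g. "or" */
    TOKEN_PREFIXED,     /* lower-case prefix, ':', then a value */
    TOKEN_NO_PREFIX     /* no prefix recognised by the tests */
};

/* Apply the same tests as the patched code to an isolated token.
 * On TOKEN_PREFIXED, *value points at the text after the ':';
 * otherwise *value is set to NULL. */
static enum token_kind
classify_token (const char *tok, const char **value)
{
    size_t tok_len = strlen (tok);
    /* Same character set as the strspn call in the patch. */
    size_t prefix_len = strspn (tok, "abcdefghijklmnopqrstuvwxyz");

    *value = NULL;

    if (strcmp (tok, "*") == 0 || prefix_len == tok_len)
        return TOKEN_PASS_THROUGH;

    /* potential prefix: one for ':', then something after */
    if (tok_len - prefix_len >= 2 && tok[prefix_len] == ':') {
        *value = tok + prefix_len + 1;
        return TOKEN_PREFIXED;
    }

    return TOKEN_NO_PREFIX;
}
```

Under these assumptions, "(tag:inbox" is classified as TOKEN_NO_PREFIX
because strspn stops at the opening bracket, while for
"tag:tag%20with%20spaces)" the value handed on for decoding and quoting
still carries the trailing ")". Any token of the form word:something is
classified as TOKEN_PREFIXED whether or not "word" is a real prefix, which
is the third worry in the reply.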