Return-Path:
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id 03602431FB6
    for ; Thu, 17 Oct 2013 07:17:17 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.7
X-Spam-Level:
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
    tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
    by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id KYd1PwNmmAC1 for ;
    Thu, 17 Oct 2013 07:17:04 -0700 (PDT)
Received: from mail-ee0-f53.google.com (mail-ee0-f53.google.com [74.125.83.53])
    (using TLSv1 with cipher RC4-SHA (128/128 bits))
    (No client certificate requested)
    by olra.theworths.org (Postfix) with ESMTPS id 5B6F8431FAE
    for ; Thu, 17 Oct 2013 07:17:04 -0700 (PDT)
Received: by mail-ee0-f53.google.com with SMTP id t10so1084212eei.26
    for ; Thu, 17 Oct 2013 07:17:03 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=1e100.net; s=20130820;
    h=x-gm-message-state:from:to:cc:subject:in-reply-to:references
    :user-agent:date:message-id:mime-version:content-type;
    bh=WKr5XBJedjXYyxITlEMnrox6FTKqHUBfGmPY/ius0OM=;
    b=lKqoaDn55h0+l05N7t32zkdCYEYan9g3DTnrptEIxvM8lpZmXsiB0fYB6zAMuABXWp
    a7VdYxZSmM7eZ4g9fPGvrNczbOkIBJQCGE9GKaT3X5g+v0+w55tVvFfAb4EPK/2/uzIa
    OWBwJ08VUmBpVZD3KOLUzfICYCj/TEHwpR/0ZunPUPY2015DLkdv2uwLnK2ps6B/uFjR
    5V2eYnsEX6ynFLOTGtHomKHGdO0EzXxi1/PUfg4Z3wGhPRyJtn2w/hQv6BzBoqwGJaQh
    VljZufdOTLnLrPXgxRD1HFou4Xn5tagdnGY03nUKhXpQron0FgzqR3erbClrA6N8Z6yD
    dQcw==
X-Gm-Message-State: ALoCoQmHKhgz5k2OUiZtYNy7G8mOlni60+Cs4qtil87c0SoDsqSxkMQwBxfL6pBQ6H6KJ2mEEhzd
X-Received: by 10.15.98.9 with SMTP id bi9mr3986193eeb.67.1382019423188;
    Thu, 17 Oct 2013 07:17:03 -0700 (PDT)
Received: from localhost (dsl-hkibrasgw2-58c36f-91.dhcp.inet.fi.
    [88.195.111.91])
    by mx.google.com with ESMTPSA id a1sm193403452eem.1.1969.12.31.16.00.00
    (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
    Thu, 17 Oct 2013 07:17:01 -0700 (PDT)
From: Jani Nikula
To: "Alexey I. Froloff" , notmuch@notmuchmail.org
Subject: Re: [PATCH] lib: Add a new prefix "list" to the search-terms syntax
In-Reply-To: <1365549369-12776-1-git-send-email-raorn@raorn.name>
References: <20130409083010.GA27675@raorn.name>
    <1365549369-12776-1-git-send-email-raorn@raorn.name>
User-Agent: Notmuch/0.16+97~g6878b0b (http://notmuchmail.org) Emacs/24.3.1
    (x86_64-pc-linux-gnu)
Date: Thu, 17 Oct 2013 17:17:00 +0300
Message-ID: <87bo2ougmb.fsf@nikula.org>
MIME-Version: 1.0
Content-Type: text/plain
Cc: "Alexey I. Froloff"
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Thu, 17 Oct 2013 14:17:17 -0000

On Wed, 10 Apr 2013, "Alexey I. Froloff" wrote:
> From: "Alexey I. Froloff"
>
> Add support for indexing and searching the message's List-Id header.
> This is useful when matching all the messages belonging to a particular
> mailing list.

There's an issue with our duplicate message-id handling that is likely
to cause confusion with List-Id: searches.

If you receive several duplicates of the same message (judged by the
message-id), only the first one of them gets indexed, and the rest are
ignored. This means that for messages you receive both directly and
through a list, it will be arbitrary whether the List-Id: gets indexed
or not. Therefore a list: search might not return all the messages you'd
expect.

BR,
Jani.

> Rework of the patch by Pablo Oliveira
>
> Differences from original patch:
>
> The whole list ID indexed as boolean term, not split by words.
> List description is not indexed at all.
>
> Thanks to ojwb and amdragon from irc://irc.freenode.net/notmuch
>
> Signed-off-by: Alexey I. Froloff
> ---
>  lib/database.cc                 |  1 +
>  lib/index.cc                    | 45 ++++++++++++++++++++++++++++++++++++++++-
>  man/man7/notmuch-search-terms.7 |  8 ++++++++
>  3 files changed, 53 insertions(+), 1 deletion(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index 91d4329..6313913 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -203,6 +203,7 @@ static prefix_t BOOLEAN_PREFIX_INTERNAL[] = {
>  };
>
>  static prefix_t BOOLEAN_PREFIX_EXTERNAL[] = {
> +    { "list", "XLIST"},
>      { "thread", "G" },
>      { "tag", "K" },
>      { "is", "K" },
> diff --git a/lib/index.cc b/lib/index.cc
> index a2edd6d..8b97ec3 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -304,6 +304,46 @@ _index_address_list (notmuch_message_t *message,
>      }
>  }
>
> +static void
> +_index_list_id (notmuch_message_t *message,
> +                const char *list_id_header)
> +{
> +    const char *begin_list_id, *end_list_id;
> +
> +    if (list_id_header == NULL)
> +        return;
> +
> +    /* RFC2919 says that the list-id is found at the end of the header
> +     * and enclosed between angle brackets. If we cannot find a
> +     * matching pair of brackets containing at least one character,
> +     * we ignore the list id header. */
> +    begin_list_id = strrchr (list_id_header, '<');
> +    if (!begin_list_id) {
> +        fprintf (stderr, "Warning: Not indexing mailformed List-Id tag.\n");
> +        return;
> +    }
> +
> +    end_list_id = strrchr(begin_list_id, '>');
> +    if (!end_list_id || (end_list_id - begin_list_id < 2)) {
> +        fprintf (stderr, "Warning: Not indexing mailformed List-Id tag.\n");
> +        return;
> +    }
> +
> +    void *local = talloc_new (message);
> +
> +    /* We extract the list id between the angle brackets */
> +    const char *list_id = talloc_strndup (local, begin_list_id + 1,
> +                                          end_list_id - begin_list_id - 1);
> +
> +    /* _notmuch_message_add_term() may return
> +     * NOTMUCH_PRIVATE_STATUS_TERM_TOO_LONG here. We can't fix it, but
> +     * this is not a reason to exit with error... */
> +    if (_notmuch_message_add_term (message, "list", list_id))
> +        fprintf (stderr, "Warning: Not indexing List-Id: <%s>\n", list_id);
> +
> +    talloc_free (local);
> +}
> +
>  /* Callback to generate terms for each mime part of a message. */
>  static void
>  _index_mime_part (notmuch_message_t *message,
> @@ -432,7 +472,7 @@ _notmuch_message_index_file (notmuch_message_t *message,
>      GMimeMessage *mime_message = NULL;
>      InternetAddressList *addresses;
>      FILE *file = NULL;
> -    const char *from, *subject;
> +    const char *from, *subject, *list_id;
>      notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
>      static int initialized = 0;
>      char from_buf[5];
> @@ -500,6 +540,9 @@ mboxes is deprecated and may be removed in the future.\n", filename);
>      subject = g_mime_message_get_subject (mime_message);
>      _notmuch_message_gen_terms (message, "subject", subject);
>
> +    list_id = g_mime_object_get_header (GMIME_OBJECT (mime_message), "List-Id");
> +    _index_list_id (message, list_id);
> +
>      _index_mime_part (message, g_mime_message_get_mime_part (mime_message));
>
>    DONE:
> diff --git a/man/man7/notmuch-search-terms.7 b/man/man7/notmuch-search-terms.7
> index eb417ba..9cae107 100644
> --- a/man/man7/notmuch-search-terms.7
> +++ b/man/man7/notmuch-search-terms.7
> @@ -52,6 +52,8 @@ terms to match against specific portions of an email, (where
>
>      thread:
>
> +    list:
> +
>      folder:
>
>      date:..
> @@ -100,6 +102,12 @@ thread ID values can be seen in the first column of output from
>  .B "notmuch search"
>
>  The
> +.BR list: ,
> +is used to match mailing list ID of an email message \- contents of the
> +List\-Id: header without the '<', '>' delimiters or decoded list
> +description.
> +
> +The
>  .B folder:
>  prefix can be used to search for email message files that are
>  contained within particular directories within the mail store. Only
> --
> 1.8.1.4
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
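
For anyone reading the thread without a notmuch tree at hand, the
bracket-matching step of the quoted _index_list_id() can be tried out
in isolation. What follows is only a rough sketch under a few
assumptions: extract_list_id() is a hypothetical helper name, not a
notmuch function; it uses malloc() where the patch uses talloc; and it
returns NULL where the patch prints a warning. It does not touch the
database, so the duplicate message-id caveat above still applies: the
extraction only ever runs for whichever copy of a message gets indexed.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of notmuch): return a newly allocated
 * copy of the text between the last '<' in the header value and the
 * last '>' that follows it, or NULL if there is no such pair with at
 * least one character in between. The caller must free() the result. */
static char *
extract_list_id (const char *header)
{
    const char *begin, *end;
    size_t len;
    char *id;

    if (header == NULL)
        return NULL;

    /* The list id is expected at the end of the header value, so look
     * for the last opening bracket. */
    begin = strrchr (header, '<');
    if (begin == NULL)
        return NULL;

    /* Search for the closing bracket only from the opening one on. */
    end = strrchr (begin, '>');
    if (end == NULL || end - begin < 2)
        return NULL;

    /* Copy the characters strictly between the two brackets. */
    len = (size_t) (end - begin - 1);
    id = malloc (len + 1);
    if (id == NULL)
        return NULL;

    memcpy (id, begin + 1, len);
    id[len] = '\0';

    return id;
}
```

With a made-up header value such as "Example list
<example-list.example.org>", the helper yields the string
example-list.example.org. A value with no brackets, or with an empty
pair "<>", yields NULL, which corresponds to the cases where the patch
skips indexing.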