Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id A8104429E54 for ; Mon, 23 Jan 2012 18:45:57 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id fsT7AA8+VDIP for ; Mon, 23 Jan 2012 18:45:57 -0800 (PST) Received: from dmz-mailsec-scanner-5.mit.edu (DMZ-MAILSEC-SCANNER-5.MIT.EDU [18.7.68.34]) by olra.theworths.org (Postfix) with ESMTP id D0986429E21 for ; Mon, 23 Jan 2012 18:45:56 -0800 (PST) X-AuditID: 12074422-b7fd66d0000008f9-6a-4f1e1b620d8d Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) by dmz-mailsec-scanner-5.mit.edu (Symantec Messaging Gateway) with SMTP id 3B.DD.02297.26B1E1F4; Mon, 23 Jan 2012 21:45:54 -0500 (EST) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id q0O2jrWj008982; Mon, 23 Jan 2012 21:45:54 -0500 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q0O2jqOI011949 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Mon, 23 Jan 2012 21:45:53 -0500 (EST) Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77) (envelope-from ) id 1RpWNt-0004Sg-VU; Mon, 23 Jan 2012 21:45:22 -0500 Date: Mon, 23 Jan 2012 21:45:21 -0500 From: Austin Clements To: Mark Walters Subject: Re: [RFC PATCH 2/4] Add NOTMUCH_MESSAGE_FLAG_EXCLUDED flag Message-ID: <20120124024521.GY16740@mit.edu> References: <20120124011609.GX16740@mit.edu> <1327367923-18228-2-git-send-email-markwalters1009@gmail.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1327367923-18228-2-git-send-email-markwalters1009@gmail.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFmpileLIzCtJLcpLzFFi42IR4hRV1k2SlvM3eL9Tw2L1XB6L6zdnMjsw eeycdZfd49mqW8wBTFFcNimpOZllqUX6dglcGTv+bGAq2KlYcWvBc6YGxhbpLkZODgkBE4mf d96zQdhiEhfurQeyuTiEBPYxSizrnMoK4WxglJh29QeUc5JJYvOiq1BlSxgltn3dww7SzyKg KrFr2RZWEJtNQENi2/7ljCC2iICOxO1DC8BqmAWkJb79bmYCsYUFnCWeN30Fq+cFqmncvA3I 5gAamiNxvjcOIiwocXLmExaIVi2JG/9eMoGUgIxZ/o8DJMwp4CWxbMllZhBbVEBFYsrJbWwT GIVmIemehaR7FkL3AkbmVYyyKblVurmJmTnFqcm6xcmJeXmpRbqmermZJXqpKaWbGEFBze6i tIPx50GlQ4wCHIxKPLwSM2X9hVgTy4orcw8xSnIwKYnyxknJ+QvxJeWnVGYkFmfEF5XmpBYf YpTgYFYS4VU7B1TOm5JYWZValA+TkuZgURLnVdd65yckkJ5YkpqdmlqQWgSTleHgUJLgnQcy VLAoNT21Ii0zpwQhzcTBCTKcB2j4DZAa3uKCxNzizHSI/ClGRSlx3uUgCQGQREZpHlwvLOm8 YhQHekWY9zpIFQ8wYcF1vwIazAQ0mCNPCmRwSSJCSqqBUbj5tefWnROWbHF6t22rVIusnvYb lsqHLUJBa3d1/25dJft7FtucfYcrJZSSk5UvB4uk6968aSpXujeoZFrOEkuuo92LuQU/HfrX kZHy8MyC13Mz2G+fcbBYPD/hx6ar31hXsHc4x5TfLbZq2Vb1eQnf7A5RX3eZmd9yztdMr1sc +8Ds5vJ8diWW4oxEQy3mouJEADVg1bQVAwAA Cc: notmuch@notmuchmail.org X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 24 Jan 2012 02:45:57 -0000 The overall structure of this series looks great. There's obviously a lot of clean up to do, but I'll reply with a few high-level comments. Quoth Mark Walters on Jan 24 at 1:18 am: > Form excluded doc_ids set and use that to exclude messages. > Should be no functional change. > > --- > lib/notmuch-private.h | 1 + > lib/query.cc | 28 ++++++++++++++++++++++++++-- > 2 files changed, 27 insertions(+), 2 deletions(-) > > diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h > index 7bf153e..e791bb0 100644 > --- a/lib/notmuch-private.h > +++ b/lib/notmuch-private.h > @@ -401,6 +401,7 @@ typedef struct _notmuch_message_list { > */ > struct visible _notmuch_messages { > notmuch_bool_t is_of_list_type; > + notmuch_doc_id_set_t *excluded_doc_ids; > notmuch_message_node_t *iterator; > }; > > diff --git a/lib/query.cc b/lib/query.cc > index c25b301..92fa834 100644 > --- a/lib/query.cc > +++ b/lib/query.cc > @@ -57,6 +57,11 @@ struct visible _notmuch_threads { > notmuch_doc_id_set_t match_set; > }; > > +static notmuch_bool_t > +_notmuch_doc_id_set_init (void *ctx, > + notmuch_doc_id_set_t *doc_ids, > + GArray *arr); > + > notmuch_query_t * > notmuch_query_create (notmuch_database_t *notmuch, > const char *query_string) > @@ -173,6 +178,7 @@ notmuch_query_search_messages (notmuch_query_t *query) > "mail")); > Xapian::Query string_query, final_query, exclude_query; > Xapian::MSet mset; > + Xapian::MSetIterator iterator; > unsigned int flags = (Xapian::QueryParser::FLAG_BOOLEAN | > Xapian::QueryParser::FLAG_PHRASE | > Xapian::QueryParser::FLAG_LOVEHATE | > @@ -193,8 +199,21 @@ notmuch_query_search_messages (notmuch_query_t *query) > > exclude_query = _notmuch_exclude_tags (query, final_query); > > - final_query = Xapian::Query (Xapian::Query::OP_AND_NOT, > - final_query, exclude_query); > + enquire.set_weighting_scheme (Xapian::BoolWeight()); > + enquire.set_query (exclude_query); > + > + mset = enquire.get_mset (0, notmuch->xapian_db->get_doccount ()); > + > + GArray *excluded_doc_ids = g_array_new (FALSE, FALSE, sizeof (unsigned int)); > + > + for (iterator = mset.begin (); iterator != mset.end (); iterator++) > + { > + unsigned int doc_id = *iterator; > + g_array_append_val (excluded_doc_ids, doc_id); > + } > + messages->base.excluded_doc_ids = talloc (query, _notmuch_doc_id_set); > + _notmuch_doc_id_set_init (query, messages->base.excluded_doc_ids, > + excluded_doc_ids); This might be inefficient for message-only queries, since it will fetch *all* excluded docids. This highlights a basic difference between message and thread search: thread search can return messages that don't match the original query and hence needs to know all potentially excluded messages, while message search can only return messages that match the original query. It's entirely possible this doesn't matter because Xapian probably still needs to fetch the full posting lists of the excluded terms, but it would be worth doing a quick/hacky benchmark to verify this, with enough excluded messages to make the cost non-trivial. If it does matter, you could pass in a flag indicating if the exclude query should be limited by the original query or not. Or you could do the limited exclude query in notmuch_query_search_messages and a separate open-ended exclude query in notmuch_query_search_threads. > > enquire.set_weighting_scheme (Xapian::BoolWeight()); > > @@ -294,6 +313,11 @@ _notmuch_mset_messages_move_to_next (notmuch_messages_t *messages) > mset_messages = (notmuch_mset_messages_t *) messages; > > mset_messages->iterator++; > + > + while ((mset_messages->iterator != mset_messages->iterator_end) && > + (_notmuch_doc_id_set_contains (messages->excluded_doc_ids, > + *mset_messages->iterator))) > + mset_messages->iterator++; This seemed a little weird, since you remove it in the next patch. Is this just to keep the tests happy? (If so, it would be worth mentioning in the commit message; other reviewers will definitely have the same question.) > } > > static notmuch_bool_t