1 Return-Path: <amdragon@mit.edu>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by olra.theworths.org (Postfix) with ESMTP id 1D13C40DEFB
\r
6 for <notmuch@notmuchmail.org>; Wed, 17 Nov 2010 23:38:42 -0800 (PST)
\r
7 X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
\r
11 X-Spam-Status: No, score=-1.9 tagged_above=-999 required=5
\r
12 tests=[BAYES_00=-1.9] autolearn=ham
\r
13 Received: from olra.theworths.org ([127.0.0.1])
\r
14 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
\r
15 with ESMTP id BRXgZfIOyN53 for <notmuch@notmuchmail.org>;
\r
16 Wed, 17 Nov 2010 23:38:32 -0800 (PST)
\r
17 Received: from dmz-mailsec-scanner-8.mit.edu (DMZ-MAILSEC-SCANNER-8.MIT.EDU
\r
19 by olra.theworths.org (Postfix) with ESMTP id F1CD540DEF6
\r
20 for <notmuch@notmuchmail.org>; Wed, 17 Nov 2010 23:38:31 -0800 (PST)
\r
21 X-AuditID: 12074425-b7c98ae000000a04-14-4ce4d7f7a7e5
\r
22 Received: from mailhub-auth-4.mit.edu ( [18.7.62.39])
\r
23 by dmz-mailsec-scanner-8.mit.edu (Symantec Brightmail Gateway) with
\r
24 SMTP id 09.B7.02564.7F7D4EC4; Thu, 18 Nov 2010 02:38:31 -0500 (EST)
\r
25 Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])
\r
26 by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id oAI7cUnf002393
\r
27 for <notmuch@notmuchmail.org>; Thu, 18 Nov 2010 02:38:31 -0500
\r
28 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
\r
29 (authenticated bits=0)
\r
30 (User authenticated as amdragon@ATHENA.MIT.EDU)
\r
31 by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id oAI7cTEI022928
\r
32 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT)
\r
33 for <notmuch@notmuchmail.org>; Thu, 18 Nov 2010 02:38:30 -0500 (EST)
\r
34 Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.72)
\r
35 (envelope-from <amdragon@mit.edu>) id 1PIz4f-0002Zr-1F
\r
36 for notmuch@notmuchmail.org; Thu, 18 Nov 2010 02:38:29 -0500
\r
37 Date: Thu, 18 Nov 2010 02:38:29 -0500
\r
38 From: Austin Clements <amdragon@MIT.EDU>
\r
39 To: notmuch@notmuchmail.org
\r
40 Subject: Re: [PATCH 3/4] Optimize thread search using matched docid sets.
\r
41 Message-ID: <20101118073828.GD2439@mit.edu>
\r
42 References: <20101117192826.GU2439@mit.edu>
\r
44 Content-Type: text/plain; charset=us-ascii
\r
45 Content-Disposition: inline
\r
46 In-Reply-To: <20101117192826.GU2439@mit.edu>
\r
47 User-Agent: Mutt/1.5.20 (2009-06-14)
\r
48 X-Brightmail-Tracker: AAAAAA==
\r
49 X-BeenThere: notmuch@notmuchmail.org
\r
50 X-Mailman-Version: 2.1.13
\r
52 List-Id: "Use and development of the notmuch mail system."
\r
53 <notmuch.notmuchmail.org>
\r
54 List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
\r
55 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
56 List-Archive: <http://notmuchmail.org/pipermail/notmuch>
\r
57 List-Post: <mailto:notmuch@notmuchmail.org>
\r
58 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
59 List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
\r
60 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
61 X-List-Received-Date: Thu, 18 Nov 2010 07:38:42 -0000
\r
63 Currently this code uses a bitmap indexed by docid as a simple, fast
\r
64 set structure. This is quite memory-efficient if the docid space is
\r
65 dense, even if the largest docid is quite large. Is there a danger
\r
66 that the docid space will be large and sparse? Is it worth replacing
\r
67 this with a smarter bit set structure?
\r
69 Quoth myself on Nov 17 at 2:28 pm:
\r
70 > This reduces thread search's 1+2t Xapian queries (where t is the
\r
71 > number of matched threads) to 1+t queries and constructs exactly one
\r
72 > notmuch_message_t for each message instead of 2 to 3.
\r
73 > notmuch_query_search_threads eagerly fetches the docids of all
\r
74 > messages matching the user query instead of lazily constructing
\r
75 > message objects and fetching thread ID's from term lists.
\r
76 > _notmuch_thread_create takes a seed docid and the set of all matched
\r
77 > docids and uses a single Xapian query to expand this docid to its
\r
78 > containing thread, using the matched docid set to determine which
\r
79 > messages in the thread match the user query instead of using a second
\r
82 > As a side effect, this fixes author order so authors are always sorted
\r
83 > by first occurrence in each thread. This breaks two emacs tests that
\r
84 > hard-code the old, buggy author order.
\r
86 > This reduces the amount of time required to load my inbox from 4.523
\r
87 > seconds to 3.025 seconds (1.5X faster).
\r