Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 56EE2431FB5 for ; Wed, 8 Dec 2010 13:58:48 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: 0 X-Spam-Level: X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none] autolearn=unavailable Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id vWHQiZBRt-e7 for ; Wed, 8 Dec 2010 13:58:48 -0800 (PST) Received: from dmz-mailsec-scanner-6.mit.edu (DMZ-MAILSEC-SCANNER-6.MIT.EDU [18.7.68.35]) by olra.theworths.org (Postfix) with ESMTP id 29DFF431FB6 for ; Wed, 8 Dec 2010 13:58:48 -0800 (PST) X-AuditID: 12074423-b7bd0ae000000a00-92-4cffff970d62 Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) by dmz-mailsec-scanner-6.mit.edu (Symantec Brightmail Gateway) with SMTP id 92.47.02560.79FFFFC4; Wed, 8 Dec 2010 16:58:47 -0500 (EST) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id oB8Lwk2e000460; Wed, 8 Dec 2010 16:58:46 -0500 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id oB8LwieZ018638 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Wed, 8 Dec 2010 16:58:45 -0500 (EST) Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.72) (envelope-from ) id 1PQS28-0001fb-D8; Wed, 08 Dec 2010 16:58:44 -0500 Date: Wed, 8 Dec 2010 16:58:44 -0500 From: Austin Clements To: Carl Worth Subject: Re: [PATCH 3/4] Optimize thread search using matched docid sets. Message-ID: <20101208215844.GS2447@mit.edu> References: <20101117192826.GU2439@mit.edu> <874oap5aek.fsf@yoom.home.cworth.org> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <874oap5aek.fsf@yoom.home.cworth.org> User-Agent: Mutt/1.5.20 (2009-06-14) X-Brightmail-Tracker: AAAAAA== Cc: notmuch@notmuchmail.org X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Dec 2010 21:58:48 -0000 Quoth Carl Worth on Dec 07 at 5:19 pm: > On Wed, 17 Nov 2010 14:28:26 -0500, Austin Clements wrote: > > This reduces thread search's 1+2t Xapian queries (where t is the > > number of matched threads) to 1+t queries and constructs exactly one > > notmuch_message_t for each message instead of 2 to 3. > > Fantastic stuff, Austin! > > I've merged this now, (sorry it took me a while to get to it). > > One of the reasons I didn't merge it immediately is that I wanted to > ensure that I understood the original author-ordering bug. Basically, > I'm inherently uncomfortable with a performance optimization that fixes > a bug as a side effect, (unless we understand that very well). > > So what I pushed actually adds the bug fix first, so that the > performance optimization makes no change at all to the test suite. That > feels better to me, (even though it simply demonstrated conclusively > that the bug was in a piece of code that was eliminated by the > optimization). Ah, good. You are less lazy than I. > Anyway, in a quick reading of the code, the only little thing I saw was: > > > + size_t count = (bound + sizeof (doc_ids->bitmap[0]) - 1) / > > + sizeof (doc_ids->bitmap[0]); > > Which would look better to my eyes with a 1 factored out of the > division: > > size_t count = 1 + (bound - 1) / sizeof (doc_ids->bitmap[0]); > > And the repeated use of "sizeof (doc_ids->bitmap[0])" could maybe do > with a macro for better legibility. Though it would be an evil macro if > it didn't accept an argument, and it wouldn't be much shorter if it > did. So maybe it's fine as-is. I found what I think is a cleaner way to write that bit of code. A small patch is forthcoming. > Thanks for the optimization. Now all we need is a little notmuch > benchmark so that I can be sure not to regress any performance work with > my sloppy coding! Now that this is in (and I have a temporary respite from TA duties), I'm going to finish up and send out my other ~1.7X improvement, just to get it out of my queue. Then I'll look at making a performance regression suite. Were you thinking of some standard set of timed operations wrapped in a little script that can tell you if you've made things worse, or something more elaborate? Thanks for pushing these patches!