Date: Wed, 8 Dec 2010 16:58:44 -0500
From: Austin Clements <amdragon@MIT.EDU>
To: Carl Worth <cworth@cworth.org>
Subject: Re: [PATCH 3/4] Optimize thread search using matched docid sets.
Message-ID: <20101208215844.GS2447@mit.edu>
References: <20101117192826.GU2439@mit.edu>
	<874oap5aek.fsf@yoom.home.cworth.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <874oap5aek.fsf@yoom.home.cworth.org>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: notmuch@notmuchmail.org
Precedence: list

Quoth Carl Worth on Dec 07 at  5:19 pm:
> On Wed, 17 Nov 2010 14:28:26 -0500, Austin Clements <amdragon@MIT.EDU> wrote:
> > This reduces thread search's 1+2t Xapian queries (where t is the
> > number of matched threads) to 1+t queries and constructs exactly one
> > notmuch_message_t for each message instead of 2 to 3.
> 
> Fantastic stuff, Austin!
> 
> I've merged this now, (sorry it took me a while to get to it).
> 
> One of the reasons I didn't merge it immediately is that I wanted to
> ensure that I understood the original author-ordering bug. Basically,
> I'm inherently uncomfortable with a performance optimization that fixes
> a bug as a side effect, (unless we understand that very well).
> 
> So what I pushed actually adds the bug fix first, so that the
> performance optimization makes no change at all to the test suite. That
> feels better to me, (even though it simply demonstrated conclusively
> that the bug was in a piece of code that was eliminated by the
> optimization).

Ah, good.  You are less lazy than I.

> Anyway, in a quick reading of the code, the only little thing I saw was:
> 
> > +    size_t count = (bound + sizeof (doc_ids->bitmap[0]) - 1) /
> > +	sizeof (doc_ids->bitmap[0]);
> 
> Which would look better to my eyes with a 1 factored out of the
> division:
> 
> 	size_t count = 1 + (bound - 1) / sizeof (doc_ids->bitmap[0]);
> 
> And the repeated use of "sizeof (doc_ids->bitmap[0])" could maybe do
> with a macro for better legibility. Though it would be an evil macro if
> it didn't accept an argument, and it wouldn't be much shorter if it
> did. So maybe it's fine as-is.

I found what I think is a cleaner way to write that bit of code.  A
small patch is forthcoming.

> Thanks for the optimization. Now all we need is a little notmuch
> benchmark so that I can be sure not to regress any performance work with
> my sloppy coding!

Now that this is in (and I have a temporary respite from TA duties),
I'm going to finish up and send out my other ~1.7X improvement, just
to get it out of my queue.  Then I'll look at making a performance
regression suite.  Were you thinking of some standard set of timed
operations wrapped in a little script that can tell you if you've made
things worse, or something more elaborate?

Thanks for pushing these patches!