From: Carl Worth
To: Austin Clements
Subject: Re: Xapian locking errors with custom query parser
In-Reply-To: <20110311024730.GA31011@mit.edu>
References: <87d3nhe3g9.fsf@steelpick.2x.cz> <87lj0m8ki5.fsf@yoom.home.cworth.org> <20110311024730.GA31011@mit.edu>
User-Agent: Notmuch/0.5 (http://notmuchmail.org) Emacs/23.2.1 (i486-pc-linux-gnu)
Date: Thu, 10 Mar 2011 21:26:04 -0800
Message-ID: <8762rq8byr.fsf@yoom.home.cworth.org>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha1; protocol="application/pgp-signature"
Cc: notmuch@notmuchmail.org
List-Id: "Use and development of the notmuch mail system."

--=-=-=

On Thu, 10 Mar 2011 21:47:30 -0500, Austin Clements wrote:
> Yes, qparser-3 is ready for you, and has this fix folded in to it (see
> id:20110202050336.GB28537@mit.edu).

Thanks. I've finally had a chance to start looking at this.

The first thing that caught my eye was this question:

> +/* XXX notmuch currently registers "tag" as an exclusive boolean
> + * prefix, which means queries like "tag:x tag:y" will return messages
> + * with tag x OR tag y.  Is this intentional? */

This isn't "intentional" in the sense that it is desired, no. Our
documentation for the search syntax says:

    In addition to individual terms, multiple terms can be
    combined with Boolean operators (and, or, not, etc.). Each
    term in the query will be implicitly connected by a logical
    AND if no explicit operator is provided, (except that terms
    with a common prefix will be implicitly combined with OR until
    we get Xapian defect #402 fixed).

So, when I originally wrote this code, the add_boolean_prefix function
didn't have the "exclusive" parameter that it has now. So that's
something to fix.

The next thing I notice is quite a lot of concern in the testing for
whether things were precisely Xapian compatible or not. I have two
different opinions about this:

1. For "new" search features (ADJ, NEAR, etc.) I do not have a strong
   interest in compatibility with Xapian.

   I was very careful when I wrote the documentation for the notmuch
   search syntax to only document features that I had used and tested,
   and that I was sure I wanted. (I was already thinking forward to
   perhaps writing a custom query parser at some point.)

   So you should really use our existing documentation as the guide.
   Please implement and test what it says.

   Beyond that, if you want to add additional features not mentioned
   in our documentation, then feel free to, and there's no good reason
   not to be Xapian compatible. But I also don't think there's a strong
   reason that we have to be compatible.

   Of course, for any new features here I would also like to see the
   documentation be updated.

2. For term splitting I do have a strong interest in Xapian
   compatibility.

   The difference here is that we aren't doing our own indexing, but
   instead relying on Xapian to do that for us, and we have also never
   carefully documented how the term splitting happens.

   What I want here is that if a user grabs a chunk of text from an
   email (say, "x#y") and searches for it, notmuch will find emails
   that actually contain that text. So if the indexer and the query
   parser disagree about something like this, then notmuch can break
   badly.

   I don't know how well notmuch currently meets that requirement, but
   I've been trusting in consistent term-splitting in the indexer and
   query-parser to help with this.

   So the frequent comments about incompatibility along these lines in
   your patches make me nervous. Can you enlighten me more about the
   compatibility differences in this area, and how things might break
   here?

> Interesting.  I could see this being useful for decluttering
> superseded review branches, though that would require renaming
> superseded branches, which always causes a mess.

Deleting any superseded for-cworth branch would never cause me any
problem. If you had other consumers of your branches that wouldn't be
as happy with branch names disappearing, then you might want to just
let them have another name outside the "for-cworth" space.

Anyway, it's just one idea for helping me get some more information
from git.
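As a side note on the "tag:x tag:y" question earlier in this message,
the following is a minimal Python model, not Xapian or notmuch code, of
how filters that share a prefix could be combined. It assumes the
behaviour described in the Xapian API documentation for the "exclusive"
parameter of add_boolean_prefix: same-prefix filters are OR-ed when the
prefix is exclusive and AND-ed otherwise. The names matches, search,
messages and exclusive_prefixes are hypothetical and exist only for
this illustration.

```python
from collections import defaultdict

# Hypothetical model, not Xapian code: a message is a set of
# "prefix:value" terms, and a query is a list of such filter terms
# with no explicit operators between them.

def matches(message_terms, filters, exclusive_prefixes):
    """Return True if the message satisfies the implicitly combined filters."""
    # Group the filter terms by their prefix ("tag", "from", ...).
    groups = defaultdict(list)
    for term in filters:
        prefix, _, _value = term.partition(":")
        groups[prefix].append(term)

    for prefix, terms in groups.items():
        hits = [t in message_terms for t in terms]
        if prefix in exclusive_prefixes:
            # Assumed "exclusive" behaviour: same-prefix filters are OR-ed.
            ok = any(hits)
        else:
            # Assumed non-exclusive behaviour: same-prefix filters are AND-ed.
            ok = all(hits)
        if not ok:
            return False
    # Groups with different prefixes are AND-ed with each other.
    return True


# Three sample messages, invented for the example.
messages = {
    "m1": {"tag:x"},
    "m2": {"tag:y"},
    "m3": {"tag:x", "tag:y"},
}


def search(filters, exclusive_prefixes):
    """Return the sorted ids of all sample messages matching the filters."""
    return sorted(mid for mid, terms in messages.items()
                  if matches(terms, filters, exclusive_prefixes))


print(search(["tag:x", "tag:y"], exclusive_prefixes={"tag"}))  # ['m1', 'm2', 'm3']
print(search(["tag:x", "tag:y"], exclusive_prefixes=set()))    # ['m3']
```

With "tag" treated as exclusive, all three sample messages match. Without
it, only the message carrying both tags does, which corresponds to the
implicit AND that the search documentation states as the general rule.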

-Carl

-- 
carl.d.worth@intel.com

--=-=-=
Content-Type: application/pgp-signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)

iD8DBQFNebJs6JDdNq8qSWgRAqzHAJ9b5R9tAFYaoOLg3nNUSzrzsuCfdgCgjDuz
VkPEm9Osy6+wz3mF9T7lv+A=
=2nE4
-----END PGP SIGNATURE-----
--=-=-=--