Date: Wed, 29 Jan 2014 15:46:09 -0500
From: Austin Clements
To: Carl Worth
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH 0/5] lib: make folder: prefix literal
Message-ID: <20140129204608.GE4375@mit.edu>
In-Reply-To: <87k3dir3ci.fsf@yoom.home.cworth.org>

Quoth Carl Worth on Jan 29 at 11:32 am:
> Jani Nikula writes:
> > Unfortunately, I haven't had the time to experiment with this. But it
> > bugs me that the probabilistic folder: prefix has stemming and it's case
> > insensitive. It's possible to work around the stemming with the anchors
> > you suggest or by quoting, but is there a way to have case sensitive
> > probabilistic prefixes?
>
> The stemming and case insensitivity just has to do with which terms are
> shoved into the database, (you have to add extra terms to get these
> features). If we're getting those features for folder now, (and I agree
> that we don't want them), it's because we're calling some Xapian
> convenience function along the lines of "create a bunch of terms for
> this chunk of text".
>
> The fix for that is to do the simple thing and simply break the path at
> each '/' and add a term for each component. Then these problems all go
> away.

I think you're assuming we have much more control over this than we
do. It's true that we're using Xapian::TermGenerator for this, which
is what strips case and stems terms (and removes any punctuation like
$ or ^), but Xapian's current query parser only gives us two options
for a prefix: either don't parse them at all (boolean terms), or parse
them using TermGenerator (probabilistic terms). We can index these
terms however we want, but there's simply no hook into the query
parser that would let us split the query at each '/' at search time.

> So fixes for this should not require switching from a probabilistic to a
> Boolean prefix.
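
To make the indexing-versus-query-parsing gap discussed above concrete, here is a rough, self-contained sketch in plain Python. It does not call Xapian at all. The `XFOLDER:` prefix, the function names, and the sample path are placeholders invented for illustration, and `normalized_terms` is only a crude stand-in for TermGenerator-style parsing (it lowercases and drops punctuation, but does not model stemming).

```python
import re

# Placeholder term prefix for illustration; not claimed to be notmuch's real one.
PREFIX = "XFOLDER:"


def literal_terms(path):
    """Index side: one exact term per path component, case and punctuation kept."""
    return [PREFIX + part for part in path.split("/") if part]


def normalized_terms(text):
    """Crude stand-in for term-generator-style parsing: split on
    non-alphanumerics and lowercase. Real stemming is not modelled."""
    return [PREFIX + word.lower() for word in re.findall(r"[A-Za-z0-9]+", text)]


# Index time: we are free to split the path at '/' ourselves.
indexed = set(literal_terms("Lists/Notmuch$"))

# Search time, boolean-style prefix: the value is taken verbatim, never split.
boolean_query = {PREFIX + "Lists/Notmuch$"}

# Search time, probabilistic-style prefix: the value goes through normalization.
probabilistic_query = set(normalized_terms("Lists/Notmuch$"))

print(sorted(indexed))
print(indexed & boolean_query)
print(indexed & probabilistic_query)
```

Under these assumptions both intersections come out empty: the per-component literal terms written at index time are matched by neither the unparsed boolean-style value nor the lowercased, punctuation-stripped probabilistic-style terms. That is the mismatch the reply describes, where the index side can be changed freely but the query side has no hook for splitting at each '/'.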