From: Jani Nikula
To: Carl Worth, Austin Clements, notmuch@notmuchmail.org
Subject: Re: [PATCH 0/5] lib: make folder: prefix literal
In-Reply-To: <874n4rvcvo.fsf@yoom.home.cworth.org>
References: <87y525m649.fsf@awakening.csail.mit.edu> <87r47wfltb.fsf@nikula.org> <87iot8f4vg.fsf@nikula.org> <874n4rvcvo.fsf@yoom.home.cworth.org>
Date: Wed, 29 Jan 2014 21:05:30 +0200
Message-ID: <874n4mfw1x.fsf@nikula.org>

On Sun, 26 Jan 2014, Carl Worth wrote:
> Jani Nikula writes:
>> Here's a thought. With boolean prefix folder:, we can devise a scheme
>> where the folder: query defines what is to be matched.
>
> I like the idea, but I tried to infer the rules from the examples, and I
> failed. It looks like there are two new symbols, "/" and "/." but I
> couldn't decipher the exact semantics of each.
>
> I think a proposal like this should not re-use the '/' symbol as we
> already have that as a path divider. (See rsync for lots of user
> confusion with a significant trailing '/').
>
> I propose a similar, but slightly different approach, where we add two
> additional symbols:
>
>     '^'    Matches the beginning of a path
>
>     '$'    Matches the end of a path
>
> [Obviously, I chose these symbols from regular expressions. I would be
> OK with alternate symbols, ('$' seems like it might be problematic in
> the shell, but perhaps not too much if it's always at the end of a
> phrase.)]
>
> This way, one could search for:
>
>     folder:foo            Works like "folder:" historically
>
>     folder:^full/path$    Works like Jani's proposal
>
>     folder:^path/prefix   Satisfies Tomi's use case, (as well as anyone
>                           who doesn't want to have to specify or
>                           distinguish between "/cur" or "/new".
>
> Any extra '/' at the beginning or end of a search string, (such as
> "folder:^/full/path/$") would not change the semantics.
>
> Further, I think we can implement this with less database bloat by
> leaving "folder" as probabilistic and simply indexing two new terms to
> indicate the beginning of the path and the end of the path.
>
> Finally, we could also extend the scheme to other things like subject:
> to allow for an exact subject search like:
>
>     "subject:^lib: make folder: prefix literal$"
>
> It was with an eye toward something like this that I chose to make
> folder: probabilistic in the first place. (I probably would have indexed
> things appropriately in the first place as well, but at the time doing
> the necessary query parsing for '^' and '$' seemed daunting).

Unfortunately, I haven't had the time to experiment with this. But it
bugs me that the probabilistic folder: prefix has stemming and it's case
insensitive. It's possible to work around the stemming with the anchors
you suggest or by quoting, but is there a way to have case sensitive
probabilistic prefixes?

BR,
Jani.
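
For illustration, the anchor semantics quoted above could be modelled
roughly as follows. This is a toy sketch in Python, not notmuch or Xapian
code: the sentinel terms "<BOP>" and "<EOP>" and all function names are
made up for the example, and it assumes a folder: query matches when its
terms occur as a contiguous phrase among the indexed path components. It
is deliberately case sensitive and does no stemming, so it says nothing
about how Xapian would treat a probabilistic prefix in those respects.

```python
BEGIN = "<BOP>"  # hypothetical sentinel term marking the beginning of a path
END = "<EOP>"    # hypothetical sentinel term marking the end of a path


def index_terms(path):
    """Split a folder path into components, wrapped in the two sentinel terms."""
    parts = [p for p in path.split("/") if p]
    return [BEGIN] + parts + [END]


def query_terms(query):
    """Turn a folder: search string into the phrase of terms it must match."""
    anchored_start = query.startswith("^")
    anchored_end = query.endswith("$")
    core = query[1:] if anchored_start else query
    core = core[:-1] if anchored_end else core
    # Dropping empty components means an extra '/' at either end of the
    # search string does not change the semantics.
    parts = [p for p in core.split("/") if p]
    return ([BEGIN] if anchored_start else []) + parts + ([END] if anchored_end else [])


def folder_matches(query, path):
    """True if the query's terms occur as a contiguous phrase in the path's terms."""
    needle = query_terms(query)
    haystack = index_terms(path)
    n = len(needle)
    return any(haystack[i:i + n] == needle for i in range(len(haystack) - n + 1))
```

In this model, folder_matches("^path/prefix", "path/prefix/new") holds
without naming "/cur" or "/new", while folder_matches("^full/path$",
"full/path/cur") does not, because the end anchor has to follow "path"
directly.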