Return-Path:
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
 by olra.theworths.org (Postfix) with ESMTP id 19567431FBF
 for ; Sat, 1 Feb 2014 06:54:29 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.7
X-Spam-Level:
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id Gkz3wK1sj3ui for ; Sat, 1 Feb 2014 06:54:25 -0800 (PST)
Received: from mail-ee0-f46.google.com (mail-ee0-f46.google.com [74.125.83.46])
 (using TLSv1 with cipher RC4-SHA (128/128 bits))
 (No client certificate requested)
 by olra.theworths.org (Postfix) with ESMTPS id 14F28431FBD
 for ; Sat, 1 Feb 2014 06:54:24 -0800 (PST)
Received: by mail-ee0-f46.google.com with SMTP id c13so2754427eek.5
 for ; Sat, 01 Feb 2014 06:54:23 -0800 (PST)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:from:to:cc:subject:in-reply-to:references
 :user-agent:date:message-id:mime-version:content-type;
 bh=hOpNcrd+p+Y+5ZMi3qG1FeBvJQSXKRiclwMdusszEH0=;
 b=QL5jkEp0lcNBTGoSw06qsX/Ws1UdfR0SsHCgMlAqmI/h0S2nnIv6JOLIvXoWZJlia2
 zDIPrjvlPFid9GUzrpoUfMOV/lzHVnXvZ7uFlxD5WZgtVu2/Z8iUOY/G8fJ9dBz9diAB
 NXPtWhhYkF7lS7RtweYm8MdGiu3aAvvudsEgkpiCqCIO7O/iDeA+0Ln3ICEtbDruidGN
 A2zk/pu3dqtey3Gn2VN6HIszVU1twBubHRrOGJ7OQgYVrjaA1o7Eq4bzhQ4vMkMeoyS7
 wFtkhUg6UjGVxjYRkN1wdCXcNBqFlDzPhddky5kp0nRyVRYmvO3b16YeOyRNQT7fAJIx
 5Nbg==
X-Gm-Message-State: ALoCoQn07yRBTY5bTI1U0uSJUUzobMV7kaa45tmCNKaGrrKPbcl5NrsBIwedA19yysgNj8T99s93
X-Received: by 10.14.2.193 with SMTP id 41mr21090234eef.55.1391266462369;
 Sat, 01 Feb 2014 06:54:22 -0800 (PST)
Received: from localhost (dsl-hkibrasgw2-58c36f-91.dhcp.inet.fi.
 [88.195.111.91])
 by mx.google.com with ESMTPSA id g1sm50286382eet.6.2014.02.01.06.54.20
 for (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
 Sat, 01 Feb 2014 06:54:21 -0800 (PST)
From: Jani Nikula
To: Austin Clements
Subject: Re: [PATCH 0/5] lib: make folder: prefix literal
In-Reply-To: <20140130220234.GI4375@mit.edu>
References: <87y525m649.fsf@awakening.csail.mit.edu>
 <87r47wfltb.fsf@nikula.org> <87iot8f4vg.fsf@nikula.org>
 <20140130220234.GI4375@mit.edu>
User-Agent: Notmuch/0.17+44~ge3b4cd9 (http://notmuchmail.org) Emacs/24.3.1
 (x86_64-pc-linux-gnu)
Date: Sat, 01 Feb 2014 16:54:19 +0200
Message-ID: <87fvo2yjc4.fsf@nikula.org>
MIME-Version: 1.0
Content-Type: text/plain
Cc: notmuch@notmuchmail.org
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sat, 01 Feb 2014 14:54:29 -0000

On Fri, 31 Jan 2014, Austin Clements wrote:
> What if we introduce two prefixes, say folder: and path: (maybe dir:?)
> to address both use cases, each as naturally as possible? Both would
> be boolean prefixes because of the limitations of probabilistic
> prefixes, but we could take advantage of Jani's idea of generating
> several boolean terms.

Agreed. On to details:

> folder: could work the way I suggested (simply the path to the file,
> with {cur,new} stripped off).

What if the file is not in a folder named cur/new? I suggest indexing
the folder as-is, if only for some backwards compatibility.

What if not all of the cur/new/tmp folders exist? I suggest ignoring
that and only looking at the path to the file being indexed. This is
the simplest to implement, and since it does not matter whether the
sibling directories come and go, it is also unsurprising.

For top level cur/new, index the empty string "".

> path: would support file system search
> uses. These seem more varied, but I think fall into exact match and
> recursive match. Since I don't have this use case, I can't have any
> strong opinions about syntax, but I'll throw out an idea: many shells
> support "**" for recursive path matching and people are already quite
> familiar with glob patterns for paths, so why not simply adopt this?
> In other words, when adding the path "a/b/cur/x:2," add path: terms
> "a/b/cur" and "a/b/**" and "a/**" and "**".

Since folder: would cover the cur/new cases, I suggest the
non-recursive variant of the path: prefix be the exact filesystem
folder name as-is (with the top level being the empty string ""). I
presume this is what you meant too.

I kind of like the "/**" suffix for recursive, but there are two small
wrinkles: 1) it needs quoting on the command line (unlike my original
suggestion of just a "/" suffix), and 2) what should the top level
recursive search be? path:"**" or path:"/**" or path:"./**"? I guess
the first one is the most obvious?

So here's what my original suggestions would become:

>> Here's a thought. With boolean prefix folder:, we can devise a scheme
>> where the folder: query defines what is to be matched.
>>
>> For example:
>>
>> folder:foo match files in foo, foo/new, and foo/cur.

-> folder:foo

>> folder:foo/ match all files in all subdirectories under foo (this
>> would handle Tomi's use case), including foo/new and foo/cur.

-> path:"foo/**"

>> folder:foo/. match in foo only, and specifically not in foo/cur or foo/new.

-> path:foo

>> folder:foo/new match in foo/new, and specifically not in foo/cur (this
>> allows distinguishing between messages in cur and new).

-> path:foo/new

>> folder:/ match everything.

-> path:"**"

>> folder:/. match in top level maildir only.

-> path:""

>> folder:"" match in top level maildir, including cur/new.

-> folder:""

I'd like these details to be ironed out and agreed on before I send
the next version.

BR,
Jani.
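
For illustration only, the term generation discussed above can be
sketched in Python. This is not notmuch code: the helper names
folder_term and path_terms are hypothetical, and paths are assumed to
be relative to the top level maildir. The sketch follows the quoted
"a/b/cur/x:2," example literally, so it emits a "/**" term for each
proper ancestor directory plus a bare "**" for the top level. Whether
the file's own directory should also get a "/**" term is not settled in
this thread.

```python
import posixpath


def folder_term(relpath):
    """Return the folder: term for a message file (hypothetical helper).

    The term is the file's directory with one trailing cur/ or new/
    component stripped; any other directory is indexed as-is.
    """
    directory = posixpath.dirname(relpath)
    head, tail = posixpath.split(directory)
    if tail in ("cur", "new"):
        return head  # "" for a top level cur/ or new/
    return directory


def path_terms(relpath):
    """Return the path: terms for a message file (hypothetical helper).

    First the exact directory, then "<ancestor>/**" for every proper
    ancestor, and finally "**" so that path:"**" matches everything.
    """
    directory = posixpath.dirname(relpath)
    terms = [directory]
    parent = posixpath.dirname(directory)
    while parent:
        terms.append(parent + "/**")
        parent = posixpath.dirname(parent)
    terms.append("**")
    return terms


if __name__ == "__main__":
    for name in ("a/b/cur/x:2,", "cur/x:2,", "foo/msg"):
        print(repr(name), repr(folder_term(name)), path_terms(name))
```

With this sketch, a file at "a/b/cur/x:2," gets the folder: term "a/b"
and the path: terms "a/b/cur", "a/b/**", "a/**" and "**", matching the
example quoted above.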