From ffef4c3d0d091551118c7918ceb1ed13d1618c42 Mon Sep 17 00:00:00 2001
From: Austin Clements
Date: Fri, 31 Jan 2014 17:02:34 +1900
Subject: [PATCH] Re: [PATCH 0/5] lib: make folder: prefix literal

---
 f5/bf1248a825f3f3217ba159ce9bd438028d296b | 146 ++++++++++++++++++++++
 1 file changed, 146 insertions(+)
 create mode 100644 f5/bf1248a825f3f3217ba159ce9bd438028d296b

diff --git a/f5/bf1248a825f3f3217ba159ce9bd438028d296b b/f5/bf1248a825f3f3217ba159ce9bd438028d296b
new file mode 100644
index 000000000..4c36d9990
--- /dev/null
+++ b/f5/bf1248a825f3f3217ba159ce9bd438028d296b
@@ -0,0 +1,146 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 5FA8F431FB6
+ for ; Thu, 30 Jan 2014 14:02:47 -0800 (PST)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.7
+X-Spam-Level:
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
+ tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id gykcxSB7ztQl for ;
+ Thu, 30 Jan 2014 14:02:41 -0800 (PST)
+Received: from dmz-mailsec-scanner-8.mit.edu (dmz-mailsec-scanner-8.mit.edu
+ [18.7.68.37])
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 66F21431FBD
+ for ; Thu, 30 Jan 2014 14:02:41 -0800 (PST)
+X-AuditID: 12074425-f79906d000000cf9-1b-52eacc005076
+Received: from mailhub-auth-1.mit.edu ( [18.9.21.35])
+ (using TLS with cipher AES256-SHA (256/256 bits))
+ (Client did not present a certificate)
+ by dmz-mailsec-scanner-8.mit.edu (Symantec Messaging Gateway) with SMTP
+ id 9A.BD.03321.00CCAE25; Thu, 30 Jan 2014 17:02:40 -0500 (EST)
+Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11])
+ by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id s0UM2cxg030481;
+ Thu, 30 Jan 2014 17:02:39 -0500
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
+ (authenticated bits=0)
+ (User authenticated as amdragon@ATHENA.MIT.EDU)
+ by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id s0UM2ZtW008034
+ (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT);
+ Thu, 30 Jan 2014 17:02:37 -0500
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80)
+ (envelope-from )
+ id 1W8zgx-0007d3-7K; Thu, 30 Jan 2014 17:02:35 -0500
+Date: Thu, 30 Jan 2014 17:02:34 -0500
+From: Austin Clements
+To: Jani Nikula
+Subject: Re: [PATCH 0/5] lib: make folder: prefix literal
+Message-ID: <20140130220234.GI4375@mit.edu>
+References:
+ <87y525m649.fsf@awakening.csail.mit.edu>
+ <87r47wfltb.fsf@nikula.org> <87iot8f4vg.fsf@nikula.org>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Disposition: inline
+In-Reply-To: <87iot8f4vg.fsf@nikula.org>
+User-Agent: Mutt/1.5.21 (2010-09-15)
+X-Brightmail-Tracker:
+ H4sIAAAAAAAAA+NgFmpmleLIzCtJLcpLzFFi42IR4hRV1mU48yrIYN9eCYum6c4W12/OZHZg
+ 8rh1/zW7x7NVt5gDmKK4bFJSczLLUov07RK4Mh7t3M1WsFSyYvpb3wbGbSJdjJwcEgImEid/
+ T2eHsMUkLtxbz9bFyMUhJDCbSeLihzcsEM5GRolF77awQjinmSRm/9rCDuEsYZRY23eNCaSf
+ RUBVYlnbHVYQm01AQ2Lb/uWMILaIgKLE5pP7wWxmAWmJb7+bweqFBSwlpt6ZAmbzCmhLLLw5
+ H2roTEaJGR0XmCESghInZz5hgWjWkrjx7yVQAwfYoOX/OEDCnEC7Vu1vBysXFVCRmHJyG9sE
+ RqFZSLpnIemehdC9gJF5FaNsSm6Vbm5iZk5xarJucXJiXl5qka6FXm5miV5qSukmRlBYs7uo
+ 7mCccEjpEKMAB6MSD++MtFdBQqyJZcWVuYcYJTmYlER53+0CCvEl5adUZiQWZ8QXleakFh9i
+ lOBgVhLhfd8PlONNSaysSi3Kh0lJc7AoifPe4rAPEhJITyxJzU5NLUgtgsnKcHAoSfB+OQXU
+ KFiUmp5akZaZU4KQZuLgBBnOAzSc6zTI8OKCxNzizHSI/ClGRSlx3h0gzQIgiYzSPLheWNp5
+ xSgO9Iow7z+QKh5gyoLrfgU0mAlosFY52OCSRISUVANjoNWDfbO5DLtOLZMPXGwWwn5MTkv9
+ 7Zy92u8Uzi59kxcn/ueOx/ptW2Y9L394qXan+afM2XzH7zGzKU+4euPWnSMLj77IWyHK6uFw
+ 7GVpUP3Dh+++v/vmKn7wa97lt3GRO63W7NV/c9NXW2/+LdXLRceM5H43cE1c3cFWeVZmtidz
+ bKCDYPxsLSWW4oxEQy3mouJEAGfMYYIWAwAA
+Cc: notmuch@notmuchmail.org
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Thu, 30 Jan 2014 22:02:47 -0000
+
+Quoth Jani Nikula on Jan 25 at 5:38 pm:
+> On Sat, 25 Jan 2014, Jani Nikula wrote:
+> > Perhaps we need to have two prefixes, one of which is the literal
+> > filesystem folder and another which hides the implementation details,
+> > like I mentioned in my mail to Peter [1]. But consider this: my proposed
+> > implementation does cover *all* use cases.
+>
+> Here's a thought. With boolean prefix folder:, we can devise a scheme
+> where the folder: query defines what is to be matched.
+>
+> For example:
+>
+> folder:foo      match files in foo, foo/new, and foo/cur.
+> folder:foo/     match all files in all subdirectories under foo (this
+>                 would handle Tomi's use case), including foo/new and foo/cur.
+> folder:foo/.    match in foo only, and specifically not in foo/cur or foo/new.
+> folder:foo/new  match in foo/new, and specifically not in foo/cur (this
+>                 allows distinguishing between messages in cur and new).
+> folder:/        match everything.
+> folder:/.       match in top level maildir only.
+> folder:""       match in top level maildir, including cur/new.
+>
+> This requires indexing all the path components with suitable
+> suffixes. For example, a file "foo/new/baz" would get terms "/", "foo",
+> "foo/", "foo/new", and "foo/new/.". A file foo/bar would get terms "/",
+> "foo", "foo/", and "foo/.".
+>
+> It's obviously a concern this increases the database size; not sure how
+> it would compare with the current stemmed probabilistic prefix.
+>
+> Opinions on this? This would really cover all use cases, and address
+> Austin's interface and backward compatibility concerns.
+
+I like this idea in general, though I agree with others that the
+specific syntax seems a little wanting. The concept of adding several
+boolean terms seems powerful, and I would be surprised if the extra
+terms had any substantive effect on database size.
+
+However, it seems like this is overloading one prefix for two
+meanings. And I think that's because people want two similar but
+distinct things. Several of us want a simple, natural Maildir-aware
+folder search (the Maildir folder of "a/b/cur/x:2," is "a/b"). Others
+want file system search. It's easy to conflate these because Maildir
+represents folders as directory paths, but maybe they need to be
+treated as distinct things.
+
+What if we introduce two prefixes, say folder: and path: (maybe dir:?)
+to address both use cases, each as naturally as possible? Both would
+be boolean prefixes because of the limitations of probabilistic
+prefixes, but we could take advantage of Jani's idea of generating
+several boolean terms.
+
+folder: could work the way I suggested (simply the path to the file,
+with {cur,new} stripped off). path: would support file system search
+uses. These seem more varied, but I think fall into exact match and
+recursive match. Since I don't have this use case, I can't have any
+strong opinions about syntax, but I'll throw out an idea: many shells
+support "**" for recursive path matching and people are already quite
+familiar with glob patterns for paths, so why not simply adopt this?
+In other words, when adding the path "a/b/cur/x:2," add path: terms
+"a/b/cur" and "a/b/**" and "a/**" and "**".
+
+> BR,
+> Jani.
--
2.26.2
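
For illustration only, the term generation proposed in the reply above
could be sketched roughly as follows. This is not notmuch code: the
function names folder_term and path_terms are hypothetical, paths are
assumed to be relative to the mail root with "/" separators, and the
set of path: terms simply reproduces the list given in the message
(whether "a/b/cur/**" should also be generated is left open there).

```python
import posixpath


def folder_term(relpath):
    """Return the Maildir-aware folder for a message file.

    Hypothetical helper: takes the file's directory and strips one
    trailing "cur" or "new" component, as the message describes.
    """
    directory = posixpath.dirname(relpath)
    head, tail = posixpath.split(directory)
    if tail in ("cur", "new"):
        directory = head
    return directory


def path_terms(relpath):
    """Return the boolean path: terms for a message file.

    Hypothetical helper: one exact-match term for the file's directory,
    then a "<ancestor>/**" term for each ancestor directory, ending with
    a bare "**" for the mail root.
    """
    directory = posixpath.dirname(relpath)
    terms = [directory]  # exact match on the containing directory
    parent = posixpath.dirname(directory)
    while parent:
        terms.append(parent + "/**")  # recursive match below this ancestor
        parent = posixpath.dirname(parent)
    terms.append("**")  # recursive match from the mail root
    return terms


if __name__ == "__main__":
    example = "a/b/cur/x:2,"
    print("folder:", folder_term(example))
    print("path:  ", path_terms(example))
```

With terms like these, a query such as path:a/** would reduce to a
single boolean term lookup rather than any pattern matching at search
time, which is the appeal of generating several terms per file.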