From: Austin Clements
Date: Mon, 6 Jun 2016 20:22:15 +0000 (+2000)
Subject: Re: searching: '*analysis' vs 'reanalysis'
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=e709dc377e6f008eb086314dbb1bf1eb588c804e;p=notmuch-archives.git

Re: searching: '*analysis' vs 'reanalysis'
---

diff --git a/0d/ca391b97c0fb77b9fbe56a3fee06ba5e8c6bfb b/0d/ca391b97c0fb77b9fbe56a3fee06ba5e8c6bfb
new file mode 100644
index 000000000..a24cc4da7
--- /dev/null
+++ b/0d/ca391b97c0fb77b9fbe56a3fee06ba5e8c6bfb
@@ -0,0 +1,80 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by arlo.cworth.org (Postfix) with ESMTP id A27496DE0130
+ for ; Mon, 6 Jun 2016 13:22:32 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at cworth.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.304
+X-Spam-Level:
+X-Spam-Status: No, score=-0.304 tagged_above=-999 required=5
+ tests=[AWL=-0.293, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01]
+ autolearn=disabled
+Received: from arlo.cworth.org ([127.0.0.1])
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id FEl2IQvXzlt2 for ;
+ Mon, 6 Jun 2016 13:22:25 -0700 (PDT)
+Received: from outgoing-tmp.csail.mit.edu (outgoing-tmp.csail.mit.edu
+ [128.30.2.206])
+ by arlo.cworth.org (Postfix) with ESMTP id EA27C6DE00DA
+ for ; Mon, 6 Jun 2016 13:22:24 -0700 (PDT)
+Received: from awakening.a20.io ([104.131.20.129] helo=awakening)
+ by outgoing-tmp.csail.mit.edu with esmtpsa (TLS1.2:RSA_AES_128_CBC_SHA1:128)
+ (Exim 4.82) (envelope-from )
+ id 1bA12T-0004KF-Dy; Mon, 06 Jun 2016 16:22:21 -0400
+Received: from amthrax by awakening with local (Exim 4.86)
+ (envelope-from )
+ id 1bA12S-0005II-Oi; Mon, 06 Jun 2016 16:22:20 -0400
+Date: Mon, 6 Jun 2016 16:22:15 -0400
+From: Austin Clements
+To: Gaute Hope
+Cc: David Bremner , sfischme@uwaterloo.ca,
+ notmuch
+Subject: Re: searching: '*analysis' vs 'reanalysis'
+Message-ID: <20160606202215.GE7854@csail.mit.edu>
+References: <1465196150-astroid-3-33kf2otxir-16915@strange>
+ <87lh2ijxor.fsf@tesseract.cs.unb.ca>
+ <1465217156-astroid-4-8l08w9cils-2318@strange>
+ <877fe2tiy8.fsf@uwaterloo.ca> <878tyins3j.fsf@tesseract.cs.unb.ca>
+ 
+ <1465243657-astroid-0-zfssqjwtff-28912@strange>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Disposition: inline
+In-Reply-To: <1465243657-astroid-0-zfssqjwtff-28912@strange>
+User-Agent: Mutt/1.5.24 (2015-08-30)
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.20
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+ 
+List-Unsubscribe: ,
+ 
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+ 
+X-List-Received-Date: Mon, 06 Jun 2016 20:22:32 -0000
+
+Quoth Gaute Hope on Jun 06 at 8:08 pm:
+> Austin Clements writes on juni 6, 2016 21:20:
+> >
+> >The experiment was specifically for regexp matching subject, but it should
+> >work for any header we store a literal copy of in the database.
+>
+> Does it work for terms in the body of the message?
+
+No. It's not impossible that it could be made to work, but it might be
+slow and unintuitive. It would have to iterate over all of the terms
+in the database and see which ones match the regexp. These are
+available, but I don't know how much time it takes to iterate over all
+of them. It might be okay. It might not.
+
+It could also expand to a very large query if the regexp matches many
+terms, akin to how searching for "a*" can be quite expensive.
+
+And it might not match what you expect. It could only match individual
+terms, so a regexp containing any punctuation (including but not
+limited to a space) simply wouldn't match anything.