Re: [PATCH v3 3/5] Add indexing for the mimetype term
author David Bremner <david@tethera.net>
Sat, 17 Jan 2015 15:21:50 +0000 (16:21 +0100)
committer W. Trevor King <wking@tremily.us>
Sat, 20 Aug 2016 21:47:28 +0000 (14:47 -0700)
86/32211145b86c96fe70fc5ef42182c33f9f13b9 [new file with mode: 0644]

diff --git a/86/32211145b86c96fe70fc5ef42182c33f9f13b9 b/86/32211145b86c96fe70fc5ef42182c33f9f13b9
new file mode 100644
index 0000000..510208b
--- /dev/null
@@ -0,0 +1,113 @@
+Return-Path: <david@tethera.net>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id D8373431FC2\r
+       for <notmuch@notmuchmail.org>; Sat, 17 Jan 2015 07:21:59 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 2.438\r
+X-Spam-Level: **\r
+X-Spam-Status: No, score=2.438 tagged_above=-999 required=5\r
+       tests=[DNS_FROM_AHBL_RHSBL=2.438] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id rDDm1QhRTpCf for <notmuch@notmuchmail.org>;\r
+       Sat, 17 Jan 2015 07:21:56 -0800 (PST)\r
+Received: from yantan.tethera.net (yantan.tethera.net [199.188.72.155])\r
+       (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))\r
+       (No client certificate requested)\r
+       by olra.theworths.org (Postfix) with ESMTPS id A5560431FAF\r
+       for <notmuch@notmuchmail.org>; Sat, 17 Jan 2015 07:21:56 -0800 (PST)\r
+Received: from remotemail by yantan.tethera.net with local (Exim 4.80)\r
+       (envelope-from <david@tethera.net>)\r
+       id 1YCVCF-0007kL-AJ; Sat, 17 Jan 2015 11:21:55 -0400\r
+Received: (nullmailer pid 12523 invoked by uid 1000); Sat, 17 Jan 2015\r
+       15:21:50 -0000\r
+From: David Bremner <david@tethera.net>\r
+To: Todd <todd@electricoding.com>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH v3 3/5] Add indexing for the mimetype term\r
+In-Reply-To: <1421368229-4360-3-git-send-email-todd@electricoding.com>\r
+References: <1421368229-4360-1-git-send-email-todd@electricoding.com>\r
+       <1421368229-4360-3-git-send-email-todd@electricoding.com>\r
+User-Agent: Notmuch/0.19+27~g29ffde4 (http://notmuchmail.org) Emacs/24.4.1\r
+       (x86_64-pc-linux-gnu)\r
+Date: Sat, 17 Jan 2015 16:21:50 +0100\r
+Message-ID: <877fwlbfg1.fsf@maritornes.cs.unb.ca>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sat, 17 Jan 2015 15:22:00 -0000\r
+\r
+Todd <todd@electricoding.com> writes:\r
+\r
+> Adds the indexing and removes the broken test flag\r
+> ---\r
+>  lib/database.cc        |  1 +\r
+>  lib/index.cc           | 10 ++++++++++\r
+>  test/T190-multipart.sh |  4 ----\r
+>  3 files changed, 11 insertions(+), 4 deletions(-)\r
+>\r
+> diff --git a/lib/database.cc b/lib/database.cc\r
+> index 0d2c417..3974e2e 100644\r
+> --- a/lib/database.cc\r
+> +++ b/lib/database.cc\r
+> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {\r
+>      { "from",                       "XFROM" },\r
+>      { "to",                 "XTO" },\r
+>      { "attachment",         "XATTACHMENT" },\r
+> +    { "mimetype",           "XMIMETYPE"},\r
+>      { "subject",            "XSUBJECT"},\r
+>  };\r
+\r
+I think the commit message should articulate why we are indexing this as\r
+a probabilistic prefix, rather than as a boolean prefix. In particular,\r
+this gives people a last chance to complain.\r
+\r
+The reference I know is http://xapian.org/docs/queryparser.html\r
+\r
+If I understand correctly (it would be great if you could test this,\r
+Todd), with a probabilistic prefix,\r
+\r
+   mimetype:pdf\r
+\r
+will match\r
+\r
+application/pdf\r
+image/pdf\r
+application/x-pdf\r
+application/x-ext-pdf\r
+\r
+but not\r
+\r
+application/x-bzpdf\r
+application/x-gzpdf\r
+application/x-xzpdf\r
+\r
+On the whole, this probably does more good than harm.  The downside\r
+of probabilistic prefixes/fields is that they are not "anchored", so\r
+there is no easy way to distinguish\r
+\r
+      application/pdf\r
+\r
+from\r
+\r
+      pdf\r
+      application/x-pdf\r
+\r
+I guess in a perfect world this would also be explained in\r
+notmuch-search-terms(7), but that's pretty much orthogonal to this\r
+series.\r
+\r
+d\r
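
The matching behaviour described in the message can be illustrated with a
small sketch. It is only an approximation: it assumes a probabilistic field
is split into lowercase word terms at every non-alphanumeric character, which
is a simplification and not Xapian's actual term generator. The helper names
split_terms and matches are hypothetical and do not exist in notmuch or
Xapian.

```python
import re


def split_terms(mime_type):
    """Approximate word splitting (an assumption, not Xapian's real rules):
    lowercase the value and break it at every non-alphanumeric character."""
    return [t for t in re.split(r"[^0-9a-z]+", mime_type.lower()) if t]


def matches(query_word, mime_type):
    """A single-word query matches if it equals any term of the value."""
    return query_word.lower() in split_terms(mime_type)


if __name__ == "__main__":
    candidates = [
        "application/pdf",
        "image/pdf",
        "application/x-pdf",
        "application/x-ext-pdf",
        "application/x-bzpdf",
        "application/x-gzpdf",
        "application/x-xzpdf",
        "pdf",
    ]
    for mime_type in candidates:
        print(mime_type, split_terms(mime_type), matches("pdf", mime_type))
```

Under this assumption "pdf" is a term of application/pdf and
application/x-pdf but not of application/x-bzpdf, where the last term is
"bzpdf". It also shows the anchoring problem: a bare "pdf" value and
application/pdf both contain the term "pdf", so a single-word query cannot
tell them apart.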