--- /dev/null
+Return-Path: <todd@electricoding.com>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id A977C431FC2\r
+ for <notmuch@notmuchmail.org>; Sat, 17 Jan 2015 08:41:47 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 2.438\r
+X-Spam-Level: **\r
+X-Spam-Status: No, score=2.438 tagged_above=-999 required=5\r
+ tests=[DNS_FROM_AHBL_RHSBL=2.438] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id LuMEjxPDhkyZ for <notmuch@notmuchmail.org>;\r
+ Sat, 17 Jan 2015 08:41:44 -0800 (PST)\r
+Received: from s75.web-hosting.com (s75.web-hosting.com [198.187.31.9])\r
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\r
+ (No client certificate requested)\r
+ by olra.theworths.org (Postfix) with ESMTPS id 6847F431FAF\r
+ for <notmuch@notmuchmail.org>; Sat, 17 Jan 2015 08:41:44 -0800 (PST)\r
+Received: from user-69-73-37-128.knology.net ([69.73.37.128]:46736\r
+ helo=tz-lab) by server75.web-hosting.com with esmtpsa\r
+ (UNKNOWN:DHE-RSA-AES128-SHA:128) (Exim 4.82) (envelope-from\r
+ <todd@electricoding.com>) id 1YCWRS-001OHm-HT; Sat, 17 Jan 2015 11:41:42\r
+ -0500\r
+From: Todd <todd@electricoding.com>\r
+To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH v3 3/5] Add indexing for the mimetype term\r
+In-Reply-To: <877fwlbfg1.fsf@maritornes.cs.unb.ca>\r
+References: <1421368229-4360-1-git-send-email-todd@electricoding.com>\r
+ <1421368229-4360-3-git-send-email-todd@electricoding.com>\r
+ <877fwlbfg1.fsf@maritornes.cs.unb.ca>\r
+User-Agent: Notmuch/0.19+17~gd8b219d (http://notmuchmail.org) Emacs/24.4.1\r
+ (x86_64-unknown-linux-gnu)\r
+Date: Sat, 17 Jan 2015 10:41:10 -0600\r
+Message-ID: <871tmt5pi1.fsf@electricoding.com>\r
+MIME-Version: 1.0\r
+Content-Type: multipart/signed; boundary="=-=-=";\r
+ micalg=pgp-sha1; protocol="application/pgp-signature"\r
+X-AntiAbuse: This header was added to track abuse,\r
+ please include it with any abuse report\r
+X-AntiAbuse: Primary Hostname - server75.web-hosting.com\r
+X-AntiAbuse: Original Domain - notmuchmail.org\r
+X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]\r
+X-AntiAbuse: Sender Address Domain - electricoding.com\r
+X-Get-Message-Sender-Via: server75.web-hosting.com: authenticated_id:\r
+ todd@electricoding.com\r
+X-Source: \r
+X-Source-Args: \r
+X-Source-Dir: \r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sat, 17 Jan 2015 16:41:47 -0000\r
+\r
+--=-=-=\r
+Content-Type: text/plain\r
+Content-Transfer-Encoding: quoted-printable\r
+\r
+\r
+>>>>> "DB" =3D=3D David Bremner <david@tethera.net> writes:\r
+\r
+ DB> Todd <todd@electricoding.com> writes:\r
+ >> Adds the indexing and removes the broken test flag\r
+ >> ---\r
+ >> lib/database.cc | 1 +\r
+ >> lib/index.cc | 10 ++++++++++\r
+ >> test/T190-multipart.sh | 4 ----\r
+ >> 3 files changed, 11 insertions(+), 4 deletions(-)\r
+ >>\r
+ >> diff --git a/lib/database.cc b/lib/database.cc\r
+ >> index 0d2c417..3974e2e 100644\r
+ >> --- a/lib/database.cc\r
+ >> +++ b/lib/database.cc\r
+ >> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]=3D {\r
+ >> { "from", "XFROM" },\r
+ >> { "to", "XTO" },\r
+ >> { "attachment", "XATTACHMENT" },\r
+ >> + { "mimetype", "XMIMETYPE"},\r
+ >> { "subject", "XSUBJECT"},\r
+ >> };\r
+\r
+ DB> I think the commit message should articulate why we are indexing th=\r
+is as\r
+ DB> a probabilistic prefix, rather than as a boolean prefix. In particu=\r
+lar,\r
+ DB> this gives people a last chance to complain.\r
+\r
+ DB> The reference I know is http://xapian.org/docs/queryparser.html\r
+\r
+ DB> If I understand correctly (it would be great if you could test this\r
+ DB> Todd), with a probabilistic prefix,\r
+\r
+ DB> mimetype:pdf\r
+\r
+ DB> will match\r
+\r
+ DB> application/pdf\r
+ DB> image/pdf\r
+ DB> application/x-pdf\r
+ DB> application/x-ext-pdf\r
+\r
+ DB> but not\r
+\r
+ DB> application/x-bzpdf\r
+ DB> application/x-gzpdf\r
+ DB> application/x-xzpdf\r
+\r
+ I just tested, and it does work this way with your examples. I\r
+ *believe* from reading the docs, that xapian is treating the full\r
+ MIME-type queries as phrase searches anyway due to the embedded\r
+ slashes.\r
+\r
+ From http://xapian.org/docs/queryparser.html:\r
+\r
+ A phrase surrounded with double quotes ("") matches documents\r
+ containing that exact phrase. Hyphenated words are also treated\r
+ as phrases, as are cases such as filenames and email addresses\r
+ (e.g. /etc/passwd or president@whitehouse.gov).\r
+\r
+ I think that we'll get good behavior from the types of queries that\r
+ will typically be performed due to this automatic phrasing.\r
+\r
+\r
+\r
+ DB> On the whole, this is probably more beneficial than bad. The downs=\r
+ide\r
+ DB> of probabilistic prefixes/fields is that they are not "anchored", so\r
+ DB> there is no easy way to distinguish\r
+\r
+ DB> application/pdf\r
+\r
+ DB> from\r
+\r
+ DB> pdf\r
+ DB> application/x-pdf\r
+\r
+ DB> I guess in a perfect world this would also be explained in\r
+ DB> notmuch-search-terms(7), but that's pretty much orthogonal to this\r
+ DB> series.\r
+\r
+ If separate messages with application/pdf and application/x-pdf are\r
+ indexed, then:\r
+=20=20=20=20\r
+ mimetype:application/x-pdf finds only the application/x-pdf\r
+ mimetype:application/pdf finds only the application/pdf\r
+ mimetype:pdf finds both of the messages\r
+\r
+ I am fairly sure that this behaviour is a result of the automatic\r
+ phrasing mentioned above.\r
+\r
+ - Todd\r
+=20=20=20=20\r
+ DB> d\r
+\r
+--=-=-=\r
+Content-Type: application/pgp-signature; name="signature.asc"\r
+\r
+-----BEGIN PGP SIGNATURE-----\r
+Version: GnuPG v1\r
+\r
+iQIcBAEBAgAGBQJUupCnAAoJEEc0ULlfRYDu0f8QAJVtVpA9kQKjBgpTkrieYQnE\r
+ADCWWrIwiI7rU8MyaWD5GqVBPVUdHvYaKCGoQhiirnqvNEk0CrsF4rrDB7UNcSVH\r
+LKV5SDNIBGxw0EsMtukPXz0zgoJfKIWfqWieC97j832fI/2NZHetrs9VEWPHVLzJ\r
+1VnPQpsAFt3dLXw8ff9WjkEZVcj/fbVBvHNZNX+YqY9RdzTRomJP4pqn0S1YKY9o\r
+SohqbLpS7HVh7JFOdPMVyALOqs5dh44n0PJYe7FDazqNwb2w0PqEa2dQnHjGF/0e\r
+8SRUSKCTpvYC9buRfcFmZj5KWGx/vgi9T17etXJYU2Vd/CQNPAZmliZS9gaYKlWt\r
+8YasMJyDDRq79XmiFbJwao47HUig6IFBdgGCMVxzmUZPTlINO8lQyuP/O9DlHVo5\r
+2PK2vf/d07k5VnH6tjukEY6fEMQqQFkXG5JIWw0VLKMbVBG8esFwfpeEx0KdW6Qi\r
+oJfHxjmHMfAug9L/lukHotW7fH3mHZ2RQLWClaqhVBGgeGRfyMJEjnbLVCiZlk/0\r
+0p4TDt5LTVAtopquwCMHpwJG7BA9CMOwGdOJB7hv/OTqVuj3ZSq1JP93jsrV4tO7\r
+azEYOYW/VnrsOoGmsW/K3Hggl2OYej9aYmugTw3fodU9RV+xmfSrvvU/qkKWMSel\r
+oTv39uIcY/R+dmhU8EO1\r
+=5flc\r
+-----END PGP SIGNATURE-----\r
+--=-=-=--\r