From: Todd <todd@electricoding.com>
To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org
Subject: Re: [PATCH v3 3/5] Add indexing for the mimetype term
In-Reply-To: <877fwlbfg1.fsf@maritornes.cs.unb.ca>
References: <1421368229-4360-1-git-send-email-todd@electricoding.com>
 <1421368229-4360-3-git-send-email-todd@electricoding.com>
 <877fwlbfg1.fsf@maritornes.cs.unb.ca>
User-Agent: Notmuch/0.19+17~gd8b219d (http://notmuchmail.org) Emacs/24.4.1
 (x86_64-unknown-linux-gnu)
Date: Sat, 17 Jan 2015 10:41:10 -0600
Message-ID: <871tmt5pi1.fsf@electricoding.com>

>>>>> "DB" == David Bremner <david@tethera.net> writes:

DB> Todd <todd@electricoding.com> writes:
>> Adds the indexing and removes the broken test flag

>> lib/database.cc | 1 +
>> lib/index.cc | 10 ++++++++++
>> test/T190-multipart.sh | 4 ----
>> 3 files changed, 11 insertions(+), 4 deletions(-)

>> diff --git a/lib/database.cc b/lib/database.cc
>> index 0d2c417..3974e2e 100644
>> --- a/lib/database.cc
>> +++ b/lib/database.cc
>> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {
>> { "from", "XFROM" },
>> { "attachment", "XATTACHMENT" },
>> + { "mimetype", "XMIMETYPE"},
>> { "subject", "XSUBJECT"},

DB> I think the commit message should articulate why we are indexing this as
DB> a probabilistic prefix, rather than as a boolean prefix. In particular
DB> this gives people a last chance to complain.

DB> The reference I know is http://xapian.org/docs/queryparser.html

DB> If I understand correctly (it would be great if you could test this
DB> Todd), with a probabilistic prefix,

DB> application/pdf

DB> application/x-pdf
DB> application/x-ext-pdf

DB> application/x-bzpdf
DB> application/x-gzpdf
DB> application/x-xzpdf

I just tested, and it does work this way with your examples. I
*believe*, from reading the docs, that Xapian is treating the full
MIME-type queries as phrase searches anyway due to the embedded

From http://xapian.org/docs/queryparser.html:

    A phrase surrounded with double quotes ("") matches documents
    containing that exact phrase. Hyphenated words are also treated
    as phrases, as are cases such as filenames and email addresses
    (e.g. /etc/passwd or president@whitehouse.gov).

I think that we'll get good behavior from the types of queries that
will typically be performed due to this automatic phrasing.

DB> On the whole, this is probably more beneficial than bad. The downside
DB> of probabilistic prefixes/fields is that they are not "anchored", so
DB> there is no easy way to distinguish

DB> application/pdf

DB> application/x-pdf

DB> I guess in a perfect world this would also be explained in
DB> notmuch-search-terms(7), but that's pretty much orthogonal to this

If separate messages with application/pdf and application/x-pdf are

mimetype:application/x-pdf   finds only the application/x-pdf
mimetype:application/pdf     finds only the application/pdf
mimetype:pdf                 finds both of the messages

I am fairly sure that this behaviour is a result of the automatic
phrasing mentioned above.
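
To make the automatic phrasing concrete, here is a toy C++ model of it.
This is only a sketch under assumptions, not Xapian's or notmuch's code:
it assumes a MIME type is indexed as a run of alphanumeric terms split on
characters such as '/' and '-', and that a query containing those
characters is matched as an unanchored phrase. The names split_terms and
phrase_matches are made up for illustration.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy model only: split text into lowercase alphanumeric terms, roughly
// the way a word-based indexer might see "application/x-pdf".
std::vector<std::string> split_terms(const std::string &text) {
    std::vector<std::string> terms;
    std::string current;
    for (unsigned char c : text) {
        if (std::isalnum(c)) {
            current += static_cast<char>(std::tolower(c));
        } else if (!current.empty()) {
            terms.push_back(current);
            current.clear();
        }
    }
    if (!current.empty())
        terms.push_back(current);
    return terms;
}

// Assumed matching rule: the query's terms must occur consecutively and
// in order somewhere in the indexed terms (a phrase that is not anchored
// to the start or end of the field).
bool phrase_matches(const std::string &indexed, const std::string &query) {
    const std::vector<std::string> doc = split_terms(indexed);
    const std::vector<std::string> q = split_terms(query);
    if (q.empty() || q.size() > doc.size())
        return false;
    for (std::size_t start = 0; start + q.size() <= doc.size(); ++start) {
        bool all_equal = true;
        for (std::size_t i = 0; i < q.size(); ++i) {
            if (doc[start + i] != q[i]) {
                all_equal = false;
                break;
            }
        }
        if (all_equal)
            return true;
    }
    return false;
}
```

Under this model, phrase_matches("application/x-pdf", "application/pdf")
is false because "application" and "pdf" are not adjacent in the indexed
terms, while a query of just "pdf" matches both types. That agrees with
the three results listed above, but it does not show what Xapian does
internally.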