Return-Path: <jani@nikula.org>
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
        by olra.theworths.org (Postfix) with ESMTP id 80B49431FB6
        for <notmuch@notmuchmail.org>; Sat, 4 May 2013 09:25:11 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
        tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
        by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
        with ESMTP id KN1iJ2ohgAsp for <notmuch@notmuchmail.org>;
        Sat, 4 May 2013 09:25:08 -0700 (PDT)
Received: from mail-lb0-f174.google.com (mail-lb0-f174.google.com
        [209.85.217.174]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
        (No client certificate requested)
        by olra.theworths.org (Postfix) with ESMTPS id AB4F7431FAF
        for <notmuch@notmuchmail.org>; Sat, 4 May 2013 09:25:07 -0700 (PDT)
Received: by mail-lb0-f174.google.com with SMTP id r10so2318674lbi.5
        for <notmuch@notmuchmail.org>; Sat, 04 May 2013 09:25:04 -0700 (PDT)
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
        d=google.com; s=20120113;
        h=x-received:from:to:subject:in-reply-to:references:user-agent:date
        :message-id:mime-version:content-type:x-gm-message-state;
        bh=R672KHlbP2lzZVHtqc2ZQ57gzOm4SUtAvJ5jEasxLeA=;
        b=f4T19bNSCNgx3IfBCHaRlgtolA6/E+R3BFEy013Mi3VptxQLHMQjRKPy1zUGhxNkP+
        biB90VW4WwSDIT052kCX08PtPN0vW5VTNfjHPsqhWw70rjJ+vZLgLsYDLbYOPeBC1zjm
        4mhoPYVbIhD4ZQF2Da1T4x04dS7Jcs2LkHFJqWxODBzWu0OrL5067P9FyExqMX6226KM
        nxYjoMi9/Fmm9YRHp6RzNGqCDuUP4d9Tn2nO+lE6Uh5KUf9moI8QZnzNUbUPIsgy9Nl7
        yuVFvXFpIfBRtI65j7zX/4vkXpstWr1S+w71Feubbf6V8KuAT2MOP00OzE16XvqqgHZn
X-Received: by 10.152.8.231 with SMTP id u7mr5758551laa.27.1367684704832;
        Sat, 04 May 2013 09:25:04 -0700 (PDT)
Received: from localhost (dsl-hkibrasgw2-58c376-211.dhcp.inet.fi.
        by mx.google.com with ESMTPSA id l20sm5845965lbv.9.2013.05.04.09.25.03
        for <multiple recipients>
        (version=TLSv1.2 cipher=RC4-SHA bits=128/128);
        Sat, 04 May 2013 09:25:03 -0700 (PDT)
From: Jani Nikula <jani@nikula.org>
To: Aaron Ecay <aaronecay@gmail.com>, notmuch@notmuchmail.org
Subject: Re: [PATCH 2/2] lib/database.cc: change how the parent of a message
In-Reply-To: <1362540709-28765-2-git-send-email-aaronecay@gmail.com>
References: <87ppzfzxuk.fsf@zancas.localnet>
        <1362540709-28765-1-git-send-email-aaronecay@gmail.com>
        <1362540709-28765-2-git-send-email-aaronecay@gmail.com>
User-Agent: Notmuch/0.15.2+87~gc69f540 (http://notmuchmail.org) Emacs/24.3.1
        (x86_64-pc-linux-gnu)
Date: Sat, 04 May 2013 19:24:59 +0300
Message-ID: <87r4hm1zms.fsf@nikula.org>
Content-Type: text/plain
        ALoCoQn1HSv9xROPnMaZdwmP/cOSGDUHSV62N7YI2hWxXs95xfFv4cUbnbcUxRmsHiQBsR0oDDHu
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
        <notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
        <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
        <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Sat, 04 May 2013 16:25:11 -0000

On Wed, 06 Mar 2013, Aaron Ecay <aaronecay@gmail.com> wrote:
> Presently, the code which finds the parent of a message as it is being
> added to the database assumes that the first Message-ID-like substring
> of the In-Reply-To header is the parent Message ID. Some mail clients,
> however, put stuff other than the Message-ID of the parent in the
> In-Reply-To header, such as the email address of the sender of the
> parent. This can fool notmuch.
>
> The updated algorithm prefers the last Message ID in the References
> header. The References header lists messages oldest-first, so the last
> Message ID is the parent (RFC2822, p. 24). The References header is
> also less likely to be in a non-standard
> syntax (http://cr.yp.to/immhf/thread.html,
> http://www.jwz.org/doc/threading.html). In case the References header
> is not to be found, fall back to the old behavior.
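
A minimal sketch of the selection rule described above, in plain C with
no notmuch or glib dependencies. The helper names (next_message_id,
last_reference, choose_parent) are made up for illustration and are not
the functions touched by the patch. The scanner only looks for <...>
tokens, which is much cruder than _parse_message_id, and the fallback
follows the commit message (first ID in In-Reply-To).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy the next "<...>" token from *cursor into a
 * malloc'd string (without the angle brackets) and advance *cursor past
 * it. Returns NULL when no complete token is left. */
static char *
next_message_id (const char **cursor)
{
    const char *open, *close;
    char *id;
    size_t len;

    if (*cursor == NULL)
        return NULL;
    open = strchr (*cursor, '<');
    if (open == NULL)
        return NULL;
    close = strchr (open, '>');
    if (close == NULL)
        return NULL;

    len = (size_t) (close - open - 1);
    id = malloc (len + 1);
    if (id == NULL)
        return NULL;
    memcpy (id, open + 1, len);
    id[len] = '\0';
    *cursor = close + 1;
    return id;
}

/* Return the last ID found in 'header', or NULL if there is none or if
 * it equals 'self_id' (a message must not become its own parent). */
static char *
last_reference (const char *header, const char *self_id)
{
    char *last = NULL, *id;

    while ((id = next_message_id (&header)) != NULL) {
        free (last);
        last = id;
    }
    if (last != NULL && strcmp (last, self_id) == 0) {
        free (last);
        last = NULL;
    }
    return last;
}

/* Prefer the last ID in References; otherwise fall back to the first ID
 * in In-Reply-To. The caller frees the result. */
char *
choose_parent (const char *references, const char *in_reply_to,
               const char *self_id)
{
    char *parent;

    assert (self_id != NULL);
    parent = last_reference (references, self_id);
    if (parent == NULL) {
        parent = next_message_id (&in_reply_to);
        if (parent != NULL && strcmp (parent, self_id) == 0) {
            free (parent);
            parent = NULL;
        }
    }
    return parent;
}
```

With References set to "<a@x> <b@x>" and In-Reply-To set to "<c@x>",
choose_parent picks b@x. With no References header at all it falls back
to c@x.
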
>
> V2 of this patch, incorporating feedback from Jani and (indirectly)
>  lib/database.cc     | 48 +++++++++++++++++++++++++++++++++---------------
>  test/thread-replies |  4 ----
>  2 files changed, 33 insertions(+), 19 deletions(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index 91d4329..52ed618 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -501,8 +501,10 @@ _parse_message_id (void *ctx, const char *message_id, const char **next)
>   * 'message_id' in the result (to avoid mass confusion when a single
>   * message references itself cyclically---and yes, mail messages are
>   * not infrequent in the wild that do this---don't ask me why).
> + * Return the last reference parsed, if it is not equal to message_id.
>  parse_references (void *ctx,
>                    const char *message_id,
>                    GHashTable *hash,
> @@ -511,7 +513,7 @@ parse_references (void *ctx,
>      if (refs == NULL || *refs == '\0')
>          ref = _parse_message_id (ctx, refs, &refs);
> @@ -519,6 +521,17 @@ parse_references (void *ctx,
>          if (ref && strcmp (ref, message_id))
>              g_hash_table_insert (hash, ref, NULL);
> +    /* The return value of this function is used to add a parent
> +     * reference to the database. We should avoid making a message
> +     * its own parent, thus the following check.
> +    if (ref && strcmp(ref, message_id)) {
> @@ -1510,28 +1523,33 @@ _notmuch_database_link_message_to_parents (notmuch_database_t *notmuch,
>      GHashTable *parents = NULL;
>      const char *refs, *in_reply_to, *in_reply_to_message_id;
> +    const char *last_ref_message_id, *this_message_id;
>      GList *l, *keys = NULL;
>      notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
>      parents = g_hash_table_new_full (g_str_hash, g_str_equal,
>                                       _my_talloc_free_for_g_hash, NULL);
> +    this_message_id = notmuch_message_get_message_id (message);
>      refs = notmuch_message_file_get_header (message_file, "references");
> -    parse_references (message, notmuch_message_get_message_id (message),
> -                      parents, refs);
> +    last_ref_message_id = parse_references (message,
> +                                            this_message_id,
> +                                            parents, refs);
>      in_reply_to = notmuch_message_file_get_header (message_file, "in-reply-to");
> -    parse_references (message, notmuch_message_get_message_id (message),
> -                      parents, in_reply_to);
> -    /* Carefully avoid adding any self-referential in-reply-to term. */
> -    in_reply_to_message_id = _parse_message_id (message, in_reply_to, NULL);
> -    if (in_reply_to_message_id &&
> -        strcmp (in_reply_to_message_id,
> -                notmuch_message_get_message_id (message)))
> +    in_reply_to_message_id = parse_references (message,
> +                                               this_message_id,
> +                                               parents, in_reply_to);
> +    /* For the parent of this message, use the last message ID of the
> +     * References header, if available. If not, fall back to the
> +     * first message ID in the In-Reply-To header. */
> +    if (last_ref_message_id) {
> +        _notmuch_message_add_term (message, "replyto",
> +                                   last_ref_message_id);
> +    } else if (in_reply_to_message_id) {
>          _notmuch_message_add_term (message, "replyto",
> -                                   _parse_message_id (message, in_reply_to, NULL));
> +                                   in_reply_to_message_id);
>      keys = g_hash_table_get_keys (parents);
> diff --git a/test/thread-replies b/test/thread-replies
> index a902691..28c2b1f 100755
> --- a/test/thread-replies
> +++ b/test/thread-replies
> @@ -11,7 +11,6 @@ constructed properly, even in the presence of non-RFC-compliant headers'
>  test_begin_subtest "Use References when In-Reply-To is broken"
> -test_subtest_known_broken
>  add_message '[id]="foo@one.com"' \
>  add_message '[in-reply-to]="mumble"' \
> @@ -46,7 +45,6 @@ expected=`echo "$expected" | notmuch_json_show_sanitize`
>  test_expect_equal_json "$output" "$expected"
>  test_begin_subtest "Prefer References to In-Reply-To"
> -test_subtest_known_broken
>  add_message '[id]="foo@two.com"' \
>  add_message '[in-reply-to]="<bar@baz.com>"' \
> @@ -77,7 +75,6 @@ expected=`echo "$expected" | notmuch_json_show_sanitize`
>  test_expect_equal_json "$output" "$expected"
>  test_begin_subtest "Use In-Reply-To when no References"
> -test_subtest_known_broken
>  add_message '[id]="foo@three.com"' \
>      '[subject]="three"'
>  add_message '[in-reply-to]="<foo@three.com>"' \
> @@ -104,7 +101,6 @@ expected=`echo "$expected" | notmuch_json_show_sanitize`
>  test_expect_equal_json "$output" "$expected"
>  test_begin_subtest "Use last Reference"
> -test_subtest_known_broken
>  add_message '[id]="foo@four.com"' \
>      '[subject]="four"'
>  add_message '[id]="bar@four.com"' \