Return-Path: <m.walters@qmul.ac.uk>
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id 30B40431FBD
    for <notmuch@notmuchmail.org>; Sun, 2 Feb 2014 10:23:59 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Status: No, score=-1.098 tagged_above=-999 required=5
    tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001,
    NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
    by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id uMsOckJtkBkL for <notmuch@notmuchmail.org>;
    Sun, 2 Feb 2014 10:23:52 -0800 (PST)
Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6])
    (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by olra.theworths.org (Postfix) with ESMTPS id 88DF4431FBC
    for <notmuch@notmuchmail.org>; Sun, 2 Feb 2014 10:23:52 -0800 (PST)
Received: from smtp.qmul.ac.uk ([138.37.6.40])
    by mail2.qmul.ac.uk with esmtp (Exim 4.71)
    (envelope-from <m.walters@qmul.ac.uk>)
    id 1WA1hu-0001h9-EX; Sun, 02 Feb 2014 18:23:50 +0000
Received: from 93-97-24-31.zone5.bethere.co.uk ([93.97.24.31] helo=localhost)
    by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.71)
    (envelope-from <m.walters@qmul.ac.uk>)
    id 1WA1h0-0006D9-Ps; Sun, 02 Feb 2014 18:22:55 +0000
From: Mark Walters <markwalters1009@gmail.com>
To: Jani Nikula <jani@nikula.org>, notmuch@notmuchmail.org
Subject: Re: [PATCH v2 2/7] cli: refactor reply from guessing
In-Reply-To:
    <c877beb316c1ebab5d5b7224483ce9fac4f69598.1385825425.git.jani@nikula.org>
References: <cover.1385825425.git.jani@nikula.org>
    <c877beb316c1ebab5d5b7224483ce9fac4f69598.1385825425.git.jani@nikula.org>
User-Agent: Notmuch/0.15.2+484~gfb59956 (http://notmuchmail.org) Emacs/23.4.1
    (x86_64-pc-linux-gnu)
Date: Sun, 02 Feb 2014 18:21:28 +0000
Message-ID: <87vbwxz87r.fsf@qmul.ac.uk>
Content-Type: text/plain; charset=us-ascii
X-Sender-Host-Address: 93.97.24.31
X-QM-Geographic: According to ripencc,
    this message was delivered by a machine in Britain (UK) (GB).
X-QM-SPAM-Info: Sender has good ham record. :)
X-QM-Body-MD5: 80499a0cc79b7c6c53c87dbaf1e23162 (of first 20000 bytes)
X-SpamAssassin-Score: 0.0
X-SpamAssassin-SpamBar: /
X-SpamAssassin-Report: The QM spam filters have analysed this message to
    spam. We require at least 5.0 points to mark a message as spam.
    This message scored 0.0 points. Summary of the scoring:
    * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail
    provider * (markwalters1009[at]gmail.com)
    * 0.0 AWL AWL: From: address is in the auto white-list
X-QM-Scan-Virus: ClamAV says the message is clean
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
    <notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
    <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
    <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Sun, 02 Feb 2014 18:23:59 -0000

On Sat, 30 Nov 2013, Jani Nikula <jani@nikula.org> wrote:
> The guess_from_received_header() function had grown quite big. Chop it
> up into smaller functions.
> No functional changes.
> notmuch-reply.c | 178 +++++++++++++++++++++++++++++++++-----------------------
> 1 file changed, 105 insertions(+), 73 deletions(-)
> diff --git a/notmuch-reply.c b/notmuch-reply.c
> index 9d6f843..ca41405 100644
> --- a/notmuch-reply.c
> +++ b/notmuch-reply.c
> @@ -369,78 +369,44 @@ add_recipients_from_message (GMimeMessage *reply,
> + * Look for the user's address in " for <email@add.res>" in the
> + * received headers.
> + * Return the address that was found, if any, and NULL otherwise.
> static const char *
> -guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message)
> +guess_from_received_for (notmuch_config_t *config, const char *received)
> - const char *addr, *received, *by;
> - char *mta,*ptr,*token;
> - char *domain=NULL;
> - char *tld=NULL;
> - const char *delim=". \t";
> - const char *to_headers[] = {
> - "X-Original-To",
> - "Delivered-To",
> - /* sadly, there is no standard way to find out to which email
> - * address a mail was delivered - what is in the headers depends
> - * on the MTAs used along the way. So we are trying a number of
> - * heuristics which hopefully will answer this question.
> - * We only got here if none of the users email addresses are in
> - * the To: or Cc: header. From here we try the following in order:
> - * 1) check for an Envelope-to: header
> - * 2) check for an X-Original-To: header
> - * 3) check for a Delivered-To: header
> - * 4) check for a (for <email@add.res>) clause in Received: headers
> - * 5) check for the domain part of known email addresses in the
> - * 'by' part of Received headers
> - * If none of these work, we give up and return NULL

I like having the logic laid out in a comment as above so would prefer
to see something similar included (that is points 1-6) but I am happy to

> - for (i = 0; i < ARRAY_SIZE (to_headers); i++) {
> - const char *tohdr = notmuch_message_get_header (message, to_headers[i]);
> - /* Note: tohdr potentially contains a list of email addresses. */
> - addr = user_address_in_string (tohdr, config);
> + const char *ptr;
> - /* We get the concatenated Received: headers and search from the
> - * front (last Received: header added) and try to extract from
> - * them indications to which email address this message was
> - * The Received: header is special in our get_header function
> - * and is always concatenated.
> - received = notmuch_message_get_header (message, "received");
> - if (received == NULL)
> + ptr = strstr (received, " for ");
> - /* First we look for a " for <email@add.res>" in the received
> - ptr = strstr (received, " for ");
> + return user_address_in_string (ptr, config);
> - /* Note: ptr potentially contains a list of email addresses. */
> - addr = user_address_in_string (ptr, config);
> - /* Finally, we parse all the " by MTA ..." headers to guess the
> - * email address that this was originally delivered to.
> - * We extract just the MTA here by removing leading whitespace and
> - * assuming that the MTA name ends at the next whitespace.
> - * We test for *(by+4) to be non-'\0' to make sure there's
> - * something there at all - and then assume that the first
> - * whitespace delimited token that follows is the receiving
> - * system in this step of the receive chain
> - while((by = strstr (by, " by ")) != NULL) {
> + * Parse all the " by MTA ..." parts in received headers to guess the
> + * email address that this was originally delivered to.
> + * Extract just the MTA here by removing leading whitespace and
> + * assuming that the MTA name ends at the next whitespace. Test for
> + * *(by+4) to be non-'\0' to make sure there's something there at all
> + * - and then assume that the first whitespace delimited token that
> + * follows is the receiving system in this step of the receive chain.
> + * Return the address that was found, if any, and NULL otherwise.
> +static const char *
> +guess_from_received_by (notmuch_config_t *config, const char *received)
> + const char *addr;
> + const char *by = received;
> + char *domain, *tld, *mta, *ptr, *token;
> + while ((by = strstr (by, " by ")) != NULL) {
> @@ -454,7 +420,7 @@ guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message
> * as domain and tld.
> domain = tld = NULL;
> - while ((ptr = strsep (&token, delim)) != NULL) {
> + while ((ptr = strsep (&token, ". \t")) != NULL) {
> if (*ptr == '\0')
> @@ -462,13 +428,13 @@ guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message
> - /* Recombine domain and tld and look for it among the configured
> - * email addresses.
> - * This time we have a known domain name and nothing else - so
> - * the test is the other way around: we check if this is a
> - * substring of one of the email addresses.
> + /* Recombine domain and tld and look for it among the
> + * configured email addresses. This time we have a known
> + * domain name and nothing else - so the test is the other
> + * way around: we check if this is a substring of one of
> + * the email addresses.
> - *(tld-1) = '.';
> + *(tld - 1) = '.';
> addr = string_in_user_address (domain, config);
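For illustration only, the " by MTA" heuristic in the hunks above can be sketched as a standalone function. This is a minimal sketch, not the patch's code: `by_domain` is a hypothetical helper, it looks at the first " by " clause only (the patch loops over all of them), and it scans backwards for dots instead of using `strsep`. Like the quoted code, it keeps just the last two dot-separated labels of the MTA name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of notmuch): copy the last two
 * dot-separated labels of the host named after the first " by " in
 * `received` into `out`. Returns 1 on success, 0 if no usable host
 * name was found or `out` is too small. */
static int
by_domain (const char *received, char *out, size_t outlen)
{
    assert (received != NULL && out != NULL);

    const char *by = strstr (received, " by ");
    if (by == NULL)
        return 0;

    by += 4;                                  /* skip " by " */
    by += strspn (by, " \t");                 /* skip extra whitespace */
    size_t hostlen = strcspn (by, " \t\r\n"); /* host ends at whitespace */
    if (hostlen == 0)
        return 0;

    /* Walk backwards until the second dot from the end is found; if
     * the host has only one dot, the whole host is used. */
    const char *start = by;
    int dots = 0;
    for (const char *p = by + hostlen; p > by; p--) {
        if (p[-1] == '.' && ++dots == 2) {
            start = p;
            break;
        }
    }
    if (dots == 0)
        return 0;   /* a bare host name has no domain.tld to match */

    size_t len = (size_t) (by + hostlen - start);
    if (len + 1 > outlen)
        return 0;
    memcpy (out, start, len);
    out[len] = '\0';
    return 1;
}
```

Given a clause such as " by olra.theworths.org (Postfix) with ESMTPS", the sketch stores "theworths.org" in `out`. The caller would then check whether that string is a substring of one of the configured addresses, which is what the quoted `string_in_user_address (domain, config)` call does in the patch.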
> @@ -482,6 +448,63 @@ guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message
> + * Get the concatenated Received: headers and search from the front
> + * (last Received: header added) and try to extract from them
> + * indications to which email address this message was delivered.
> + * The Received: header is special in our get_header function and is
> + * always concatenated.
> + * Return the address that was found, if any, and NULL otherwise.
> +static const char *
> +guess_from_received_header (notmuch_config_t *config,
> + notmuch_message_t *message)
> + const char *received, *addr;
> + received = notmuch_message_get_header (message, "received");
> + if (! received)
> + addr = guess_from_received_for (config, received);
> + addr = guess_from_received_by (config, received);
> + * Try to find user's email address in one of the extra To-like
> + * headers, such as Envelope-To, X-Original-To, and
> + * Delivered-To.
> + * Return the address that was found, if any, and NULL otherwise.

I would prefer to replace the "extra To-like headers, such as ..." by
something more explicit: e.g. "extra To-like headers: Envelope-To,
X-Original-To, and Delivered-To (searched in that order)".
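To show why the search order is worth documenting, here is a minimal standalone sketch of an ordered lookup over the three To-like headers. Everything in it is an assumption made for illustration: `struct header`, `get_header`, and `first_to_like_match` are hypothetical names, the message is modelled as a flat name/value table rather than a `notmuch_message_t`, and a plain `strstr` stands in for `user_address_in_string`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define ARRAY_SIZE(a) (sizeof (a) / sizeof ((a)[0]))

/* Hypothetical stand-in for a parsed message: a flat name/value table. */
struct header {
    const char *name;
    const char *value;
};

/* Hypothetical lookup: return the value of the first header called
 * `name`, or NULL when it is absent. Names are compared exactly. */
static const char *
get_header (const struct header *hdrs, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp (hdrs[i].name, name) == 0)
            return hdrs[i].value;
    return NULL;
}

/* Search the To-like headers in a fixed order and return the name of
 * the first one whose value contains `user_addr`, or NULL if none
 * does. The order of `to_headers` is the search order. */
static const char *
first_to_like_match (const struct header *hdrs, size_t n,
                     const char *user_addr)
{
    static const char *const to_headers[] = {
        "Envelope-To",
        "X-Original-To",
        "Delivered-To",
    };

    assert (user_addr != NULL);
    for (size_t i = 0; i < ARRAY_SIZE (to_headers); i++) {
        const char *value = get_header (hdrs, n, to_headers[i]);
        if (value != NULL && strstr (value, user_addr) != NULL)
            return to_headers[i];
    }
    return NULL;
}
```

If a message carries both X-Original-To and Delivered-To with the user's address, the sketch reports X-Original-To regardless of the order in which the headers appear in the message, because the array order decides.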

> +static const char *
> +from_from_to_headers (notmuch_config_t *config, notmuch_message_t *message)

I am not keen on this name, but I am not sure I have a better

> + const char *tohdr, *addr;
> + const char *to_headers[] = {
> + "X-Original-To",
> + "Delivered-To",
> + for (i = 0; i < ARRAY_SIZE (to_headers); i++) {
> + tohdr = notmuch_message_get_header (message, to_headers[i]);
> + /* Note: tohdr potentially contains a list of email addresses. */
> + addr = user_address_in_string (tohdr, config);
> static GMimeMessage *
> create_reply_message(void *ctx,
> notmuch_config_t *config,
> @@ -508,6 +531,15 @@ create_reply_message(void *ctx,
> from_addr = add_recipients_from_message (reply, config,
> message, reply_all);
> + * Sadly, there is no standard way to find out to which email
> + * address a mail was delivered - what is in the headers depends
> + * on the MTAs used along the way. So we are trying a number of
> + * heuristics which hopefully will answer this question.
> + if (from_addr == NULL)
> + from_addr = from_from_to_headers (config, message);
> if (from_addr == NULL)
> from_addr = guess_from_received_header (config, message);
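The chain of `if (from_addr == NULL)` fallbacks quoted above amounts to "try each heuristic in order, first non-NULL answer wins". A minimal standalone sketch of that pattern follows. It is an illustration under stated assumptions, not the patch's code: `guess_fn`, `guess_for`, `guess_by`, and `guess_from` are hypothetical names, a single user address replaces the notmuch configuration, and `guess_by` is cruder than the patch (it matches the address's whole domain after the first " by " rather than the last two labels of each MTA name).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical heuristic signature: return user_addr if the heuristic
 * finds evidence for it in `received`, NULL otherwise. */
typedef const char *(*guess_fn) (const char *received,
                                 const char *user_addr);

/* Look for user_addr anywhere after the first " for " clause. */
static const char *
guess_for (const char *received, const char *user_addr)
{
    const char *ptr = strstr (received, " for ");

    if (ptr != NULL && strstr (ptr, user_addr) != NULL)
        return user_addr;
    return NULL;
}

/* Look for the domain of user_addr anywhere after the first " by "
 * clause. Simplified compared with the patch. */
static const char *
guess_by (const char *received, const char *user_addr)
{
    const char *by = strstr (received, " by ");
    const char *at = strchr (user_addr, '@');

    if (by == NULL || at == NULL || at[1] == '\0')
        return NULL;
    return strstr (by, at + 1) != NULL ? user_addr : NULL;
}

/* Try each heuristic in order; the first non-NULL answer wins. */
static const char *
guess_from (const char *received, const char *user_addr)
{
    static const guess_fn heuristics[] = { guess_for, guess_by };

    assert (received != NULL && user_addr != NULL);
    for (size_t i = 0; i < sizeof heuristics / sizeof heuristics[0]; i++) {
        const char *addr = heuristics[i] (received, user_addr);
        if (addr != NULL)
            return addr;
    }
    return NULL;
}
```

Keeping the heuristics in one array makes the search order visible in a single place, which is the same property the review asks the comments to preserve.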