--- /dev/null
+Return-Path: <m.walters@qmul.ac.uk>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id 30B40431FBD\r
+ for <notmuch@notmuchmail.org>; Sun, 2 Feb 2014 10:23:59 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -1.098\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-1.098 tagged_above=-999 required=5\r
+ tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001,\r
+ NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id uMsOckJtkBkL for <notmuch@notmuchmail.org>;\r
+ Sun, 2 Feb 2014 10:23:52 -0800 (PST)\r
+Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6])\r
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\r
+ (No client certificate requested)\r
+ by olra.theworths.org (Postfix) with ESMTPS id 88DF4431FBC\r
+ for <notmuch@notmuchmail.org>; Sun, 2 Feb 2014 10:23:52 -0800 (PST)\r
+Received: from smtp.qmul.ac.uk ([138.37.6.40])\r
+ by mail2.qmul.ac.uk with esmtp (Exim 4.71)\r
+ (envelope-from <m.walters@qmul.ac.uk>)\r
+ id 1WA1hu-0001h9-EX; Sun, 02 Feb 2014 18:23:50 +0000\r
+Received: from 93-97-24-31.zone5.bethere.co.uk ([93.97.24.31] helo=localhost)\r
+ by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.71)\r
+ (envelope-from <m.walters@qmul.ac.uk>)\r
+ id 1WA1h0-0006D9-Ps; Sun, 02 Feb 2014 18:22:55 +0000\r
+From: Mark Walters <markwalters1009@gmail.com>\r
+To: Jani Nikula <jani@nikula.org>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH v2 2/7] cli: refactor reply from guessing\r
+In-Reply-To:\r
+ <c877beb316c1ebab5d5b7224483ce9fac4f69598.1385825425.git.jani@nikula.org>\r
+References: <cover.1385825425.git.jani@nikula.org>\r
+ <c877beb316c1ebab5d5b7224483ce9fac4f69598.1385825425.git.jani@nikula.org>\r
+User-Agent: Notmuch/0.15.2+484~gfb59956 (http://notmuchmail.org) Emacs/23.4.1\r
+ (x86_64-pc-linux-gnu)\r
+Date: Sun, 02 Feb 2014 18:21:28 +0000\r
+Message-ID: <87vbwxz87r.fsf@qmul.ac.uk>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=us-ascii\r
+X-Sender-Host-Address: 93.97.24.31\r
+X-QM-Geographic: According to ripencc,\r
+ this message was delivered by a machine in Britain (UK) (GB).\r
+X-QM-SPAM-Info: Sender has good ham record. :)\r
+X-QM-Body-MD5: 80499a0cc79b7c6c53c87dbaf1e23162 (of first 20000 bytes)\r
+X-SpamAssassin-Score: 0.0\r
+X-SpamAssassin-SpamBar: /\r
+X-SpamAssassin-Report: The QM spam filters have analysed this message to\r
+ determine if it is\r
+ spam. We require at least 5.0 points to mark a message as spam.\r
+ This message scored 0.0 points. Summary of the scoring: \r
+ * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail\r
+ provider * (markwalters1009[at]gmail.com)\r
+ * 0.0 AWL AWL: From: address is in the auto white-list\r
+X-QM-Scan-Virus: ClamAV says the message is clean\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sun, 02 Feb 2014 18:23:59 -0000\r
+\r
+On Sat, 30 Nov 2013, Jani Nikula <jani@nikula.org> wrote:\r
+> The guess_from_received_header() function had grown quite big. Chop it\r
+> up into smaller functions.\r
+>\r
+> No functional changes.\r
+> ---\r
+> notmuch-reply.c | 178 +++++++++++++++++++++++++++++++++-----------------------\r
+> 1 file changed, 105 insertions(+), 73 deletions(-)\r
+>\r
+> diff --git a/notmuch-reply.c b/notmuch-reply.c\r
+> index 9d6f843..ca41405 100644\r
+> --- a/notmuch-reply.c\r
+> +++ b/notmuch-reply.c\r
+> @@ -369,78 +369,44 @@ add_recipients_from_message (GMimeMessage *reply,\r
+> return from_addr;\r
+> }\r
+> \r
+> +/*\r
+> + * Look for the user's address in " for <email@add.res>" in the\r
+> + * received headers.\r
+> + *\r
+> + * Return the address that was found, if any, and NULL otherwise.\r
+> + */\r
+> static const char *\r
+> -guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message)\r
+> +guess_from_received_for (notmuch_config_t *config, const char *received)\r
+> {\r
+> - const char *addr, *received, *by;\r
+> - char *mta,*ptr,*token;\r
+> - char *domain=NULL;\r
+> - char *tld=NULL;\r
+> - const char *delim=". \t";\r
+> - size_t i;\r
+> -\r
+> - const char *to_headers[] = {\r
+> - "Envelope-to",\r
+> - "X-Original-To",\r
+> - "Delivered-To",\r
+> - };\r
+> -\r
+> - /* sadly, there is no standard way to find out to which email\r
+> - * address a mail was delivered - what is in the headers depends\r
+> - * on the MTAs used along the way. So we are trying a number of\r
+> - * heuristics which hopefully will answer this question.\r
+> -\r
+> - * We only got here if none of the users email addresses are in\r
+> - * the To: or Cc: header. From here we try the following in order:\r
+> - * 1) check for an Envelope-to: header\r
+> - * 2) check for an X-Original-To: header\r
+> - * 3) check for a Delivered-To: header\r
+> - * 4) check for a (for <email@add.res>) clause in Received: headers\r
+> - * 5) check for the domain part of known email addresses in the\r
+> - * 'by' part of Received headers\r
+> - * If none of these work, we give up and return NULL\r
+> - */\r
+\r
+I like having the logic laid out in a comment, as above, so I would\r
+prefer to see something similar included (that is, points 1-5), but I am\r
+happy to be overruled.\r
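For illustration only, here is a minimal sketch of how the ordering from the quoted comment could stay visible next to the code that applies it. The `struct msg` type, the three `try_*` helpers, and `guess_from` are hypothetical stand-ins rather than notmuch functions. Only the order of the heuristics is taken from the comment quoted above.

```c
#include <stddef.h>

/* Hypothetical stand-in for a message: each field holds the address a
 * given heuristic would find, or NULL if that heuristic finds nothing. */
struct msg {
    const char *in_to_headers;   /* Envelope-to, X-Original-To, Delivered-To */
    const char *in_received_for; /* " for <email@add.res>" in Received: */
    const char *in_received_by;  /* known domain in the " by " part of Received: */
};

/* Hypothetical helpers, one per group of heuristics. */
static const char *
try_to_headers (const struct msg *m)
{
    return m->in_to_headers;
}

static const char *
try_received_for (const struct msg *m)
{
    return m->in_received_for;
}

static const char *
try_received_by (const struct msg *m)
{
    return m->in_received_by;
}

/*
 * Heuristics, tried in this order (first match wins):
 *   1) an Envelope-to: header
 *   2) an X-Original-To: header
 *   3) a Delivered-To: header
 *   4) a " for <email@add.res>" clause in Received: headers
 *   5) the domain part of a known address in the " by " part of
 *      Received: headers
 * If none of these match, give up and return NULL.
 */
static const char *
guess_from (const struct msg *m)
{
    const char *addr = try_to_headers (m);      /* steps 1-3 */

    if (addr == NULL)
        addr = try_received_for (m);            /* step 4 */
    if (addr == NULL)
        addr = try_received_by (m);             /* step 5 */

    return addr;
}
```

With only `in_received_for` and `in_received_by` set, `guess_from` returns the `in_received_for` value, because step 4 is tried before step 5.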
+\r
+> - for (i = 0; i < ARRAY_SIZE (to_headers); i++) {\r
+> - const char *tohdr = notmuch_message_get_header (message, to_headers[i]);\r
+> -\r
+> - /* Note: tohdr potentially contains a list of email addresses. */\r
+> - addr = user_address_in_string (tohdr, config);\r
+> - if (addr)\r
+> - return addr;\r
+> - }\r
+> + const char *ptr;\r
+> \r
+> - /* We get the concatenated Received: headers and search from the\r
+> - * front (last Received: header added) and try to extract from\r
+> - * them indications to which email address this message was\r
+> - * delivered.\r
+> - * The Received: header is special in our get_header function\r
+> - * and is always concatenated.\r
+> - */\r
+> - received = notmuch_message_get_header (message, "received");\r
+> - if (received == NULL)\r
+> + ptr = strstr (received, " for ");\r
+> + if (! ptr)\r
+> return NULL;\r
+> \r
+> - /* First we look for a " for <email@add.res>" in the received\r
+> - * header\r
+> - */\r
+> - ptr = strstr (received, " for ");\r
+> + return user_address_in_string (ptr, config);\r
+> +}\r
+> \r
+> - /* Note: ptr potentially contains a list of email addresses. */\r
+> - addr = user_address_in_string (ptr, config);\r
+> - if (addr)\r
+> - return addr;\r
+> -\r
+> - /* Finally, we parse all the " by MTA ..." headers to guess the\r
+> - * email address that this was originally delivered to.\r
+> - * We extract just the MTA here by removing leading whitespace and\r
+> - * assuming that the MTA name ends at the next whitespace.\r
+> - * We test for *(by+4) to be non-'\0' to make sure there's\r
+> - * something there at all - and then assume that the first\r
+> - * whitespace delimited token that follows is the receiving\r
+> - * system in this step of the receive chain\r
+> - */\r
+> - by = received;\r
+> - while((by = strstr (by, " by ")) != NULL) {\r
+> +/*\r
+> + * Parse all the " by MTA ..." parts in received headers to guess the\r
+> + * email address that this was originally delivered to.\r
+> + *\r
+> + * Extract just the MTA here by removing leading whitespace and\r
+> + * assuming that the MTA name ends at the next whitespace. Test for\r
+> + * *(by+4) to be non-'\0' to make sure there's something there at all\r
+> + * - and then assume that the first whitespace delimited token that\r
+> + * follows is the receiving system in this step of the receive chain.\r
+> + *\r
+> + * Return the address that was found, if any, and NULL otherwise.\r
+> + */\r
+> +static const char *\r
+> +guess_from_received_by (notmuch_config_t *config, const char *received)\r
+> +{\r
+> + const char *addr;\r
+> + const char *by = received;\r
+> + char *domain, *tld, *mta, *ptr, *token;\r
+> +\r
+> + while ((by = strstr (by, " by ")) != NULL) {\r
+> by += 4;\r
+> if (*by == '\0')\r
+> break;\r
+> @@ -454,7 +420,7 @@ guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message\r
+> * as domain and tld.\r
+> */\r
+> domain = tld = NULL;\r
+> - while ((ptr = strsep (&token, delim)) != NULL) {\r
+> + while ((ptr = strsep (&token, ". \t")) != NULL) {\r
+> if (*ptr == '\0')\r
+> continue;\r
+> domain = tld;\r
+> @@ -462,13 +428,13 @@ guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message\r
+> }\r
+> \r
+> if (domain) {\r
+> - /* Recombine domain and tld and look for it among the configured\r
+> - * email addresses.\r
+> - * This time we have a known domain name and nothing else - so\r
+> - * the test is the other way around: we check if this is a\r
+> - * substring of one of the email addresses.\r
+> + /* Recombine domain and tld and look for it among the\r
+> + * configured email addresses. This time we have a known\r
+> + * domain name and nothing else - so the test is the other\r
+> + * way around: we check if this is a substring of one of\r
+> + * the email addresses.\r
+> */\r
+> - *(tld-1) = '.';\r
+> + *(tld - 1) = '.';\r
+> \r
+> addr = string_in_user_address (domain, config);\r
+> if (addr) {\r
+> @@ -482,6 +448,63 @@ guess_from_received_header (notmuch_config_t *config, notmuch_message_t *message\r
+> return NULL;\r
+> }\r
+> \r
+> +/*\r
+> + * Get the concatenated Received: headers and search from the front\r
+> + * (last Received: header added) and try to extract from them\r
+> + * indications to which email address this message was delivered.\r
+> + *\r
+> + * The Received: header is special in our get_header function and is\r
+> + * always concatenated.\r
+> + *\r
+> + * Return the address that was found, if any, and NULL otherwise.\r
+> + */\r
+> +static const char *\r
+> +guess_from_received_header (notmuch_config_t *config,\r
+> + notmuch_message_t *message)\r
+> +{\r
+> + const char *received, *addr;\r
+> +\r
+> + received = notmuch_message_get_header (message, "received");\r
+> + if (! received)\r
+> + return NULL;\r
+> +\r
+> + addr = guess_from_received_for (config, received);\r
+> + if (! addr)\r
+> + addr = guess_from_received_by (config, received);\r
+> +\r
+> + return addr;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Try to find user's email address in one of the extra To-like\r
+> + * headers, such as Envelope-To, X-Original-To, and\r
+> + * Delivered-To.\r
+> + *\r
+> + * Return the address that was found, if any, and NULL otherwise.\r
+> + */\r
+\r
+I would prefer to replace the "extra To-like headers, such as ..." with\r
+something more explicit, e.g. "extra To-like headers: Envelope-To,\r
+X-Original-To, and Delivered-To (searched in that order)".\r
+\r
+\r
+> +static const char *\r
+> +from_from_to_headers (notmuch_config_t *config, notmuch_message_t *message)\r
+\r
+I am not keen on this name, but I am not sure I have a better\r
+suggestion.\r
+\r
+Best wishes\r
+\r
+Mark\r
+\r
+> +{\r
+> + size_t i;\r
+> + const char *tohdr, *addr;\r
+> + const char *to_headers[] = {\r
+> + "Envelope-to",\r
+> + "X-Original-To",\r
+> + "Delivered-To",\r
+> + };\r
+> +\r
+> + for (i = 0; i < ARRAY_SIZE (to_headers); i++) {\r
+> + tohdr = notmuch_message_get_header (message, to_headers[i]);\r
+> +\r
+> + /* Note: tohdr potentially contains a list of email addresses. */\r
+> + addr = user_address_in_string (tohdr, config);\r
+> + if (addr)\r
+> + return addr;\r
+> + }\r
+> +\r
+> + return NULL;\r
+> +}\r
+> +\r
+> static GMimeMessage *\r
+> create_reply_message(void *ctx,\r
+> notmuch_config_t *config,\r
+> @@ -508,6 +531,15 @@ create_reply_message(void *ctx,\r
+> from_addr = add_recipients_from_message (reply, config,\r
+> message, reply_all);\r
+> \r
+> + /*\r
+> + * Sadly, there is no standard way to find out to which email\r
+> + * address a mail was delivered - what is in the headers depends\r
+> + * on the MTAs used along the way. So we are trying a number of\r
+> + * heuristics which hopefully will answer this question.\r
+> + */\r
+> + if (from_addr == NULL)\r
+> + from_addr = from_from_to_headers (config, message);\r
+> +\r
+> if (from_addr == NULL)\r
+> from_addr = guess_from_received_header (config, message);\r
+> \r
+> -- \r
+> 1.8.4.2\r