From 4aa08a6fff553b72e8129d0bddc2873822e1862e Mon Sep 17 00:00:00 2001
From: Austin Clements
Date: Mon, 24 Jun 2013 12:59:39 +2000
Subject: [PATCH] Re: header continuation issue in notmuch frontend/alot/pythons email module

---
 48/a41286d2421c031315e8f9f4f7b96d424a29d5 | 185 ++++++++++++++++++++++
 1 file changed, 185 insertions(+)
 create mode 100644 48/a41286d2421c031315e8f9f4f7b96d424a29d5

diff --git a/48/a41286d2421c031315e8f9f4f7b96d424a29d5 b/48/a41286d2421c031315e8f9f4f7b96d424a29d5
new file mode 100644
index 000000000..09cd36ea8
--- /dev/null
+++ b/48/a41286d2421c031315e8f9f4f7b96d424a29d5
@@ -0,0 +1,185 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id DEBFA431FBD
+ for ; Sun, 23 Jun 2013 09:59:55 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.7
+X-Spam-Level:
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
+ tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id jC52PBMVKejM for ;
+ Sun, 23 Jun 2013 09:59:46 -0700 (PDT)
+Received: from dmz-mailsec-scanner-5.mit.edu (dmz-mailsec-scanner-5.mit.edu
+ [18.7.68.34])
+ by olra.theworths.org (Postfix) with ESMTP id AF04A431FAE
+ for ; Sun, 23 Jun 2013 09:59:46 -0700 (PDT)
+X-AuditID: 12074422-b7ef78e000000935-85-51c72981074c
+Received: from mailhub-auth-2.mit.edu ( [18.7.62.36])
+ by dmz-mailsec-scanner-5.mit.edu (Symantec Messaging Gateway) with SMTP
+ id 8F.FA.02357.18927C15; Sun, 23 Jun 2013 12:59:45 -0400 (EDT)
+Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11])
+ by mailhub-auth-2.mit.edu (8.13.8/8.9.2) with ESMTP id r5NGxhZg030538;
+ Sun, 23 Jun 2013 12:59:44 -0400
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
+ (authenticated bits=0)
+ (User authenticated as amdragon@ATHENA.MIT.EDU)
+ by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id r5NGxeMv030643
+ (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT);
+ Sun, 23 Jun 2013 12:59:42 -0400
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80)
+ (envelope-from )
+ id 1Uqndb-0004Kv-Oo; Sun, 23 Jun 2013 12:59:40 -0400
+Date: Sun, 23 Jun 2013 12:59:39 -0400
+From: Austin Clements
+To: Justus Winter <4winter@informatik.uni-hamburg.de>
+Subject: Re: header continuation issue in notmuch frontend/alot/pythons email
+ module
+Message-ID: <20130623165938.GA2214@mit.edu>
+References: <20130623131145.2526.439@thinkbox.jade-hamburg.de>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Disposition: inline
+In-Reply-To: <20130623131145.2526.439@thinkbox.jade-hamburg.de>
+User-Agent: Mutt/1.5.21 (2010-09-15)
+X-Brightmail-Tracker:
+ H4sIAAAAAAAAA+NgFupnleLIzCtJLcpLzFFi42IRYrdT0W3UPB5o0LJH3GJ26w8mi+s3ZzI7
+ MHlMPH+azePZqlvMAUxRXDYpqTmZZalF+nYJXBmH+26wF/xTqdi8YQ9bA+NdmS5GTg4JAROJ
+ /q7ZbBC2mMSFe+uBbC4OIYF9jBJfX3xjgnA2MkrMXLEJyjnNJNF54wQ7hLOEUWLT/JVg/SwC
+ qhJPj91nBLHZBDQktu1fDmaLCJhKbHjwgB3EZhYwkri/YzpzFyMHh7BAmMTsTnuQMK+AtsSZ
+ MzvASoQE7CTWftzLBhEXlDg58wkLRKuWxI1/L5lAWpkFpCWW/+MACXMK2Es8v9/KCmKLCqhI
+ TDm5jW0Co9AsJN2zkHTPQuhewMi8ilE2JbdKNzcxM6c4NVm3ODkxLy+1SNdULzezRC81pXQT
+ IzisXZR2MP48qHSIUYCDUYmHN1P1WKAQa2JZcWXuIUZJDiYlUd6zUscDhfiS8lMqMxKLM+KL
+ SnNSiw8xSnAwK4nwbrgGVM6bklhZlVqUD5OS5mBREucVu7UzUEggPbEkNTs1tSC1CCYrw8Gh
+ JMFrqQE0VLAoNT21Ii0zpwQhzcTBCTKcB2j4GnWgGt7igsTc4sx0iPwpRl2OyWe3vGcUYsnL
+ z0uVEuddrAZUJABSlFGaBzcHlo5eMYoDvSXMuwFkFA8wlcFNegW0hAloyZ7Vh0CWlCQipKQa
+ GJs3HSk6037wGUfJAp1DjdMTuO7ZqRlN5Thmv5HpYuz3JXwpOe0eblUTNyYumnVHc52RI+PJ
+ tQZxncq/2L/9WDb5Yq+V+fFlV7gfxLYsVlCLzNrhl1kUUriojv/x2xu3Lqny62Q0WC4Sv3C8
+ /IXYbdP3JVnpQkvvf4++dHSLJesak6OMKy4zXVFiKc5INNRiLipOBADxp15XIgMAAA==
+Cc: notmuch mailing list
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Sun, 23 Jun 2013 16:59:56 -0000
+
+Quoth Justus Winter on Jun 23 at 3:11 pm:
+> Hi,
+>
+> I recently had a problem replying to a mail written by Thomas Schwinge
+> using an oldish notmuch. Not sure if it has been fixed in more recent
+> versions, but I think notmuch could improve upon its header
+> generation (see below). Problematic part of the mail:
+>
+> ~~~ snip ~~~
+> [...]
+> To: someone@example.org, "line
+>  break" , someoneelse@example.org
+> User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/23.4.1 (i486-pc-linux-gnu)
+> [...]
+> ~~~ snap ~~~
+>
+> http://tools.ietf.org/html/rfc2822#section-2.2.3 says:
+>
+>    Note: Though structured field bodies are defined in such a way that
+>    folding can take place between many of the lexical tokens (and even
+>    within some of the lexical tokens), folding SHOULD be limited to
+>    placing the CRLF at higher-level syntactic breaks. For instance, if
+>    a field body is defined as comma-separated values, it is recommended
+>    that folding occur after the comma separating the structured items in
+>    preference to other places where the field could be folded, even if
+>    it is allowed elsewhere.
+>
+> So notmuch "rfc-SHOULD" place the newlines after the comma.
+>
+> The rfc goes on:
+>
+>    The process of moving from this folded multiple-line representation
+>    of a header field to its single line representation is called
+>    "unfolding". Unfolding is accomplished by simply removing any CRLF
+>    that is immediately followed by WSP. Each header field should be
+>    treated in its unfolded form for further syntactic and semantic
+>    evaluation.
+>
+> My interpretation is that unfolding simply removes any linebreaks
+> first, so the value does not contain any newlines. But Python's email
+> module discriminates quoted and unquoted parts of the value:
+>
+> ~~~ snip ~~~
+> from __future__ import print_function
+> import email
+> from email.utils import getaddresses
+>
+> m = email.message_from_string('''To: "line
+>  break" <linebreak@example.org>, line
+>  break <linebreak@example.org>''')
+> print("m['To'] = ", m['To'])
+> print("getaddresses(m.get_all('To')) = ", getaddresses(m.get_all('To')))
+> ~~~ snap ~~~
+>
+> % python3 test.py
+> m['To'] = "line
+>  break" <linebreak@example.org>, line
+>  break <linebreak@example.org>
+> getaddresses(m.get_all('To')) = [('line\n break', 'linebreak@example.org'), ('line break', 'linebreak@example.org')]
+>
+> I believe that is what's preventing me from replying to the message
+> using alot without sanitizing the To header first. Not really sure who
+> is wrong or right here... any thoughts?
+
+There are at least two bugs here. Regardless of what we RFC-should
+do, that folding *is* permitted by RFC2822, since quoted
+strings can contain folding whitespace:
+
+  http://tools.ietf.org/html/rfc2822#section-3.2.5
+
+For completeness, the full derivation for this "To" header is:
+
+to            = "To:" address-list CRLF
+address-list  = (address *("," address)) / obs-addr-list
+address       = mailbox / group
+mailbox       = name-addr / addr-spec
+name-addr     = [display-name] angle-addr
+display-name  = phrase
+phrase        = 1*word / obs-phrase
+word          = atom / quoted-string
+quoted-string = [CFWS]
+                DQUOTE *([FWS] qcontent) [FWS] DQUOTE
+                [CFWS]
+
+Do you happen to know how the strangely folded "to" header was
+produced for this message? In notmuch-emacs, a user can put whatever
+they want in a message-mode buffer's headers and mm will dutifully
+pass it on to their MTA. We could validate it, but that's a slippery
+slope and I would hope that the MTA itself is validating it (and
+probably more thoroughly than we could).
+
+That said, the first bug here is in Python. As I mentioned above,
+foldable whitespace is allowed in quoted strings. In fact, though the
+standard is rather long-winded about whitespace, if you dig into the
+grammar, you'll find that *all whitespace can be folded* (except in
+the obsolete grammar, which allowed whitespace between the header name
+and the colon, which obviously can't be folded). I'm not sure what
+Python is doing, but I bet it's going to a lot of effort to
+mis-implement something very simple.
+
+There also appears to be a bug in the notmuch CLI's reply command
+where it omits addresses that were folded in the original message. I
+don't know if alot uses the CLI's reply command, so this may or may
+not be related to your specific issue. I haven't dug into this yet,
+other than to confirm that it's the CLI's fault and not
+notmuch-emacs's.
+
+> Justus
-- 
2.26.2
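
The unfolding rule from RFC 2822 section 2.2.3 quoted in the message can be sketched in a few lines of Python. This is only an illustration of the rule, not what notmuch, alot, or Python's email module actually do. The helper name `unfold` is hypothetical, accepting a bare LF as well as CRLF is an assumption, and the sample value reuses the linebreak@example.org address from the test output in the quoted message.

```python
import re
from email.utils import getaddresses

# RFC 2822 section 2.2.3: unfolding removes any CRLF that is immediately
# followed by WSP (space or tab). Accepting a bare LF is an assumption made
# here because Python strings usually carry "\n" rather than "\r\n".
_FOLD = re.compile(r"\r?\n(?=[ \t])")


def unfold(value):
    """Return a header value with its folding line breaks removed.

    Hypothetical helper for illustration; only the line break is dropped,
    the whitespace that follows it is kept, as the RFC describes.
    """
    return _FOLD.sub("", value)


# The folded "To" value from the test script, once inside a quoted string
# and once in an unquoted display name.
folded = (
    '"line\n break" <linebreak@example.org>, '
    'line\n break <linebreak@example.org>'
)

unfolded = unfold(folded)
addresses = getaddresses([unfolded])

print(unfolded)
print(addresses)
```

Unfolding before parsing means the quoted and unquoted display names are handled the same way, which is the behaviour the RFC text describes: the header is evaluated in its unfolded form, so no newline should survive into the parsed display name.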