From: Justus Winter <4winter@informatik.uni-hamburg.de>
Date: Mon, 24 Jun 2013 08:57:10 +0000 (+0200)
Subject: Re: header continuation issue in notmuch frontend/alot/pythons email module
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=998b5b1e4281746ab1b0c328793d79d1be87da57;p=notmuch-archives.git

Re: header continuation issue in notmuch frontend/alot/pythons email module
---

diff --git a/0e/9f77048737ef5c195440be5fc6bb8580e6b472 b/0e/9f77048737ef5c195440be5fc6bb8580e6b472
new file mode 100644
index 000000000..c4d0bab61
--- /dev/null
+++ b/0e/9f77048737ef5c195440be5fc6bb8580e6b472
@@ -0,0 +1,206 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 16C6C431FBD
+ for ; Mon, 24 Jun 2013 01:57:44 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: 0
+X-Spam-Level:
+X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none]
+ autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id d+Fd0wHck8ug for ;
+ Mon, 24 Jun 2013 01:57:34 -0700 (PDT)
+Received: from mail.cryptobitch.de (cryptobitch.de [88.198.7.68])
+ (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 8B5F8431FAF
+ for ; Mon, 24 Jun 2013 01:57:34 -0700 (PDT)
+Received: from mail.jade-hamburg.de (mail.jade-hamburg.de [85.183.11.228])
+ (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits))
+ (No client certificate requested)
+ by mail.cryptobitch.de (Postfix) with ESMTPSA id 0509662C358
+ for ; Mon, 24 Jun 2013 10:57:17 +0200 (CEST)
+Received: by mail.jade-hamburg.de (Postfix, from userid 401)
+ id 812A4DF2A5; Mon, 24 Jun 2013 10:57:15 +0200 (CEST)
+Received: from thinkbox.jade-hamburg.de (unknown
+ [IPv6:2002:55b7:be4:1:216:d3ff:fe3e:5058])
+ (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))
+ (No client certificate requested) (Authenticated sender: teythoon)
+ by mail.jade-hamburg.de (Postfix) with ESMTPSA id 99CADDF29F;
+ Mon, 24 Jun 2013 10:57:11 +0200 (CEST)
+Received: from teythoon by thinkbox.jade-hamburg.de with local (Exim 4.80)
+ (envelope-from )
+ id 1Ur2aE-0008Nc-K6; Mon, 24 Jun 2013 10:57:10 +0200
+Content-Type: text/plain; charset="utf-8"
+MIME-Version: 1.0
+Content-Transfer-Encoding: quoted-printable
+To: Austin Clements ,
+ thomas schwinge ,
+From: Justus Winter <4winter@informatik.uni-hamburg.de>
+In-Reply-To: <20130623165938.GA2214@mit.edu>
+References: <20130623131145.2526.439@thinkbox.jade-hamburg.de>
+ <20130623165938.GA2214@mit.edu>
+Message-ID: <20130624085710.31827.41792@thinkbox.jade-hamburg.de>
+User-Agent: alot/0.3.4
+Subject: Re: header continuation issue in notmuch frontend/alot/pythons email
+ module
+Date: Mon, 24 Jun 2013 10:57:10 +0200
+Cc: notmuch mailing list
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Mon, 24 Jun 2013 08:57:44 -0000
+
+Quoting Austin Clements (2013-06-23 18:59:39)
+> Quoth Justus Winter on Jun 23 at 3:11 pm:
+> > Hi,
+> > =
+
+> > I recently had a problem replying to a mail written by Thomas Schwinge
+> > using an oldish notmuch. Not sure if it has been fixed in more recent
+> > versions, but I think notmuch could improve upon its header
+> > generation (see below). Problematic part of the mail:
+> > =
+
+> > ~~~ snip ~~~
+> > [...]
+> > To: someone@example.org, "line
+> > break" , someoneelse@example.org
+> > User-Agent: Notmuch/0.9-101-g81dad07 (http://notmuchmail.org) Emacs/23.=
+4.1 (i486-pc-linux-gnu)
+> > [...]
+> > ~~~ snap ~~~
+> > =
+
+> > http://tools.ietf.org/html/rfc2822#section-2.2.3 says:
+> > =
+
+> > Note: Though structured field bodies are defined in such a way that
+> > folding can take place between many of the lexical tokens (and even
+> > within some of the lexical tokens), folding SHOULD be limited to
+> > placing the CRLF at higher-level syntactic breaks. For instance, if
+> > a field body is defined as comma-separated values, it is recommended
+> > that folding occur after the comma separating the structured items in
+> > preference to other places where the field could be folded, even if
+> > it is allowed elsewhere.
+> > =
+
+> > So notmuch "rfc-SHOULD" place the newlines after the comma.
+> > =
+
+> > The rfc goes on:
+> > =
+
+> > The process of moving from this folded multiple-line representation
+> > of a header field to its single line representation is called
+> > "unfolding". Unfolding is accomplished by simply removing any CRLF
+> > that is immediately followed by WSP. Each header field should be
+> > treated in its unfolded form for further syntactic and semantic
+> > evaluation.
+> > =
+
+> > My interpretation is that unfolding simply removes any linebreaks
+> > first, so the value does not contain any newlines. But Python's email
+> > module discriminates quoted and unquoted parts of the value:
+> > =
+
+> > ~~~ snip ~~~
+> > from __future__ import print_function
+> > import email
+> > from email.utils import getaddresses
+> > =
+
+> > m =3D email.message_from_string('''To: "line
+> > break" , line
+> > break ''')
+> > print("m['To'] =3D ", m['To'])
+> > print("getaddresses(m.get_all('To')) =3D ", getaddresses(m.get_all('To'=
+)))
+> > ~~~ snap ~~~
+> > =
+
+> > % python3 test.py
+> > m['To'] =3D "line
+> > break" , line
+> > break
+> > getaddresses(m.get_all('To')) =3D [('line\n break', 'linebreak@example=
+.org'), ('line break', 'linebreak@example.org')]
+> > =
+
+> > I believe that is what's preventing me from replying to the message
+> > using alot without sanitizing the To header first. Not really sure who
+> > is wrong or right here... any thoughts?
+> =
+
+> There are at least two bugs here. Regardless of what we RFC-should
+> do, that folding *is* permitted by RFC2822, since quoted
+> strings can contain folding whitespace:
+> =
+
+> http://tools.ietf.org/html/rfc2822#section-3.2.5
+> =
+
+> For completeness, the full derivation for this "To" header is:
+> =
+
+> to =3D "To:" address-list CRLF
+> address-list =3D (address *("," address)) / obs-addr-list
+> address =3D mailbox / group
+> mailbox =3D name-addr / addr-spec
+> name-addr =3D [display-name] angle-addr
+> display-name =3D phrase
+> phrase =3D 1*word / obs-phrase
+> word =3D atom / quoted-string
+> quoted-string =3D [CFWS]
+> DQUOTE *([FWS] qcontent) [FWS] DQUOTE
+> [CFWS]
+> =
+
+> Do you happen to know how the strangely folded "to" header was
+> produced for this message?
+
+No, but Thomas might. Thomas, the problematic message is
+id:877ghpqckb.fsf@kepler.schwinge.homeip.net
+
+> In notmuch-emacs, a user can put whatever
+> they want in a message-mode buffer's headers and mm will dutifully
+> pass it on to their MTA. We could validate it, but that's a slippery
+> slope and I would hope that the MTA itself is validating it (and
+> probably more thoroughly than we could).
+> =
+
+> That said, the first bug here is in Python. As I mentioned above,
+> foldable whitespace is allowed in quoted strings. In fact, though the
+> standard is rather long-winded about whitespace, if you dig into the
+> grammar, you'll find that *all whitespace can be folded* (except in
+> the obsolete grammar, which allowed whitespace between the header name
+> and the colon, which obviously can't be folded). I'm not sure what
+> Python is doing, but I bet it's going to a lot of effort to
+> mis-implement something very simple.
+
+Yes, I'm glad you came to the same conclusion.
+
+> There also appears to be a bug in the notmuch CLI's reply command
+> where it omits addresses that were folded in the original message. I
+> don't know if alot uses the CLI's reply command, so this may or may
+> not be related to your specific issue. I haven't dug into this yet,
+> other than to confirm that it's the CLI's fault and not
+> notmuch-emacs's.
+
+No, alot does not use notmuch's reply command.
+
+Thanks,
+Justus
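
The unfolding rule quoted in the message (remove any line break that is immediately followed by whitespace) can be applied by hand before the header value reaches `getaddresses`. What follows is a minimal sketch of that kind of sanitizing and not what alot or notmuch actually does. The helper name `unfold` is hypothetical, and the angle-bracketed address in `raw_to` is an assumption reconstructed from the output shown in the thread.

```python
import re
from email.utils import getaddresses


def unfold(value):
    """Drop every line break that is directly followed by a space or tab.

    This mirrors the unfolding step described in RFC 2822, section 2.2.3.
    Hypothetical helper for illustration; not part of alot or notmuch.
    """
    return re.sub(r"\r?\n(?=[ \t])", "", value)


# Assumed header value: one display name folded inside a quoted string,
# one folded outside of it, as in the test case from the thread.
raw_to = ('"line\n break" <linebreak@example.org>, '
          'line\n break <linebreak@example.org>')

unfolded = unfold(raw_to)
addresses = getaddresses([unfolded])
print(addresses)
```

After unfolding, neither display name contains a newline, so the quoted and unquoted forms parse to the same pair.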