From: Austin Clements Date: Fri, 27 Jun 2014 19:36:59 +0000 (-0400) Subject: Re: Bug#749890: python3-notmuch: missing header in mbox message -> NullPointerError X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=9ca5eda9c25d79c9148482aa654953e9086e1614;p=notmuch-archives.git Re: Bug#749890: python3-notmuch: missing header in mbox message -> NullPointerError --- diff --git a/1d/caf11cec9d98afe22e5706cd14cff0b012e3d9 b/1d/caf11cec9d98afe22e5706cd14cff0b012e3d9 new file mode 100644 index 000000000..3d6dfe9f8 --- /dev/null +++ b/1d/caf11cec9d98afe22e5706cd14cff0b012e3d9 @@ -0,0 +1,158 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id 23D91431FBD + for ; Fri, 27 Jun 2014 12:37:12 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: -0.7 +X-Spam-Level: +X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 + tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id dV48znJzOCV6 for ; + Fri, 27 Jun 2014 12:37:06 -0700 (PDT) +Received: from dmz-mailsec-scanner-7.mit.edu (dmz-mailsec-scanner-7.mit.edu + [18.7.68.36]) + (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) + (No client certificate requested) + by olra.theworths.org (Postfix) with ESMTPS id 4D821431FAE + for ; Fri, 27 Jun 2014 12:37:06 -0700 (PDT) +X-AuditID: 12074424-f79146d00000067c-c9-53adc7e18bdc +Received: from mailhub-auth-1.mit.edu ( [18.9.21.35]) + (using TLS with cipher AES256-SHA (256/256 bits)) + (Client did not present a certificate) + by dmz-mailsec-scanner-7.mit.edu (Symantec Messaging Gateway) with SMTP + id 02.9A.01660.1E7CDA35; Fri, 27 Jun 2014 15:37:05 -0400 (EDT) +Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) + by 
mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id s5RJb2RE011699; + Fri, 27 Jun 2014 15:37:03 -0400 +Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) + (authenticated bits=0) + (User authenticated as amdragon@ATHENA.MIT.EDU) + by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id s5RJb0R1013147 + (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); + Fri, 27 Jun 2014 15:37:01 -0400 +Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80) + (envelope-from ) + id 1X0bxD-0002cV-Qt; Fri, 27 Jun 2014 15:36:59 -0400 +Date: Fri, 27 Jun 2014 15:36:59 -0400 +From: Austin Clements +To: Jakub Wilk +Subject: Re: Bug#749890: python3-notmuch: missing header in mbox message -> + NullPointerError +Message-ID: <20140627193659.GH4660@mit.edu> +References: <8738ewudra.fsf@zancas.localnet> <20140623201918.GA7346@jwilk.net> + <87ha37fjm3.fsf@zancas.localnet> <20140626213100.GA8930@jwilk.net> + <878uoifj9n.fsf@zancas.localnet> +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +Content-Disposition: inline +In-Reply-To: <878uoifj9n.fsf@zancas.localnet> +User-Agent: Mutt/1.5.21 (2010-09-15) +X-Brightmail-Tracker: + H4sIAAAAAAAAA+NgFlrJKsWRmVeSWpSXmKPExsUixCmqrPvw+Npgg6ZXKhZbXhVa3GjtZrT4 + NP8Qu8X1mzOZHVg8LmydxOLxq20us8ezVbeYPbYces8cwBLFZZOSmpNZllqkb5fAlXH5Wztj + wRuxir0Tm5gaGJcJdTFyckgImEhM3DqJCcIWk7hwbz1bFyMXh5DAbCaJZTuvsEA4Gxkllu89 + zgThnGaSuHn0HTuEs4RR4vrKHhaQfhYBVYnFm1eygdhsAhoS2/YvZwSxRQQUJY4e7GMGsZkF + 4iVWznkDZgsLxEnsX9fH2sXIwcEroC3xZastxMxdjBJTH6xnB6nhFRCUODnzCQtEr5bEjX8v + mUDqmQWkJZb/4wAJcwroSrx78h5spKiAisSUk9vYJjAKzULSPQtJ9yyE7gWMzKsYZVNyq3Rz + EzNzilOTdYuTE/PyUot0zfVyM0v0UlNKNzGCwp/dRWUHY/MhpUOMAhyMSjy8np1rg4VYE8uK + K3MPMUpyMCmJ8i49AhTiS8pPqcxILM6ILyrNSS0+xCjBwawkwiu9AijHm5JYWZValA+TkuZg + URLnfWttFSwkkJ5YkpqdmlqQWgSTleHgUJLgFQLGuZBgUWp6akVaZk4JQpqJgxNkOA/QcEWQ + Gt7igsTc4sx0iPwpRmOOT9eOtTFxPJp0qo1JiCUvPy9VSpxX9BhQqQBIaUZpHtw0WAp7xSgO + 
9Jww7z+QKh5g+oOb9wpoFRPQKvOCVSCrShIRUlINjJ3KPw8dCpL8+/1u2nXfGfs3BRhOKuG9 + KuQvX3c4QWiVzVn5C14ZrL25dx9dif6tyr6zX6zt/8Uww73cHOtOTWjr3eB02oR33Z+DXnpn + 14r/+Lt8qd98/icCjXobmCQVrFXFAuxY7tsnPeBkW8l259oUxbkpJx8atuevOvhYom7TpbK0 + m70OJ5RYijMSDbWYi4oTAZLnIAs8AwAA +Cc: notmuch@notmuchmail.org, 749890@bugs.debian.org +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Fri, 27 Jun 2014 19:37:12 -0000 + +Quoth David Bremner on Jun 27 at 12:45 pm: +> Jakub Wilk writes: +> +> > * David Bremner , 2014-06-26, 18:26: +> >>>0.18.1~rc0-1 is much better, thanks! +> >>> +> >>>I still get NullPointerError for one of my messages, though. :-( The +> >>>message is in the MBOXCL format (where message body size is indicated +> >>>by the Content-Length field), and has lines starting with "From " in +> >>>the message body. I've attached a new test case. +> >> +> >>That message (and at a guess other MBOXCL files) is ignored as a +> >>non-mail file by 0.18.1 "notmuch new". +> > +> > Indeed. +> > +> >>Is this another case of files which were indexed with an older version +> >>of notmuch causing problems with a newer version? +> > +> > Yes, that's what I meant. Sorry for not being clear. +> +> As a point of information, I bisected with the following test script: +> +> #!/usr/bin/env bash +> test_description='"notmuch new" in several variations' +> . ./test-lib.sh +> +> test_begin_subtest "Support single-message mbox with content length (deprecated)" +> cat > "${MAIL_DIR}"/mbox_file2 <<EOF +> From jwilk Fri May 30 14:09:05 2014 +> Subject: Hello world! +> Content-Length: 12 +> Lines: 1 +> +> From world! +> +> EOF +> output=$(NOTMUCH_NEW 2>&1) +> test_expect_equal "$output" \ +> "Added 1 new message to the database." 
+> +> +> test_done +> +> The commit where the behaviour changed to reject MBOXCL files with +> 'From ' in the body was 610f0e09929. This was between 0.14 and 0.15. +> I'd say this was unintentional, although it isn't clear to me yet how +> easy it is to fix. + +Thanks for bisecting this, David. + +Unfortunately, when it comes to mbox, the only winning move is not to +play. + +The reason 610f0e09929 matters here is that it *added* support for +mbox (or, rather, this weird but surprisingly common chimera of +mbox-formatted message files with maildir-formatted file names). +Previously, notmuch assumed *everything* was a maildir-formatted +message file; that is, one message per file. It "worked" for mboxcl +because it had no idea what either mbox or mboxcl was. But it would +choke hard when it encountered a large, multi-message mbox archive +because it would try to index the whole thing as one giant email. In +an effort to avoid this, I added explicit support for single-message +mbox files (to keep the chimerians happy). But at that point we lost: +there simply is no way to reliably and programmatically distinguish +the many variants of mbox (see +http://www.jwz.org/doc/content-length.html for a good discussion of +this). + +So, I'm afraid my best advice is to convert your mboxcl files to +something else. Probably maildir, both because you're storing them in +a maildir (I assume?) and because it's easy: just strip off the first +line. I don't think there's anything notmuch can do to fix this +without breaking something else.
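
The "strip off the first line" conversion suggested above can be sketched roughly as follows. This is a minimal illustration, not part of notmuch: the function names strip_mbox_from_line and convert_in_place are hypothetical, and the sketch assumes each file holds exactly one message whose first line is the mbox "From " separator. It does not try to handle multi-message archives or to undo ">From " quoting used by some other mbox variants.

```python
#!/usr/bin/env python3
"""Strip the leading mbox "From " separator from single-message files.

Hypothetical helper for illustration only; assumes one message per file.
"""
import os
import shutil
import sys
import tempfile


def strip_mbox_from_line(data: bytes) -> bytes:
    """Return data without its first line if that line is an mbox separator.

    The separator starts with "From " (with a space). A "From:" header does
    not match, so a file that is already a plain message is left untouched.
    """
    first_line, _sep, rest = data.partition(b"\n")
    if first_line.startswith(b"From "):
        return rest
    return data


def convert_in_place(path: str) -> bool:
    """Rewrite path without its separator line; return True if it changed."""
    with open(path, "rb") as f:
        original = f.read()
    converted = strip_mbox_from_line(original)
    if converted == original:
        return False
    # Write to a temporary file in the same directory, then rename it over
    # the original so an interrupted run never leaves a half-written message.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as tmp:
        tmp.write(converted)
    shutil.copymode(path, tmp_path)  # keep the original permission bits
    os.replace(tmp_path, path)
    return True


if __name__ == "__main__":
    # Usage (paths are whatever message files you want to convert):
    #   python3 strip_from_line.py FILE [FILE ...]
    for message_path in sys.argv[1:]:
        status = "converted" if convert_in_place(message_path) else "unchanged"
        print(f"{message_path}: {status}")
```

Only the first line is inspected, so a body line such as "From world!" in the test case above is preserved as-is. After converting, the files would presumably need to be picked up again by "notmuch new".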