Return-Path:
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
	by olra.theworths.org (Postfix) with ESMTP id 68062431FBF;
	Sat, 21 Nov 2009 19:12:39 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
Received: from olra.theworths.org ([127.0.0.1])
	by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id ON4On8Rt7+Gv; Sat, 21 Nov 2009 19:12:38 -0800 (PST)
Received: from cworth.org (localhost [127.0.0.1])
	by olra.theworths.org (Postfix) with ESMTP id 00D23431FAE;
	Sat, 21 Nov 2009 19:12:37 -0800 (PST)
From: Carl Worth
To: Mike Hommey
In-Reply-To: <20091121222615.GA4925@glandium.org>
References: <20091120132625.GA19246@glandium.org>
	<87y6m0lxym.fsf@yoom.home.cworth.org>
	<20091120210556.GA25421@glandium.org>
	<20091121222615.GA4925@glandium.org>
Date: Sun, 22 Nov 2009 04:12:26 +0100
Message-ID: <87k4xj2qxx.fsf@yoom.home.cworth.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: notmuch@notmuchmail.org
Subject: Re: [notmuch] Segfault with weird Message-ID
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Sun, 22 Nov 2009 03:12:39 -0000

On Sat, 21 Nov 2009 23:26:15 +0100, Mike Hommey wrote:
> I just was able to reproduce after starting over.

Thanks Mike. I was able to reproduce this as well by eliminating the
spurious blank line I had on the 2nd or 3rd line. (So maybe that managed
to sneak in when you sent me the message.)

> header isn't "", and message_id is correctly filled. I can also confirm
> the exception is thrown from notmuch->xapian_db->add_document.

Yes. We were trying to add a term that is too long for Xapian. I've
fixed this by simply falling back to our existing sha-1 code when a
message ID is long.

Thanks so much for the bug report!

-Carl

commit 5d56e931b99d575dbb0b936d24aae5e9903861ad
Author: Carl Worth
Date:   Sun Nov 22 04:03:49 2009 +0100

    add_message: Use sha-1 in place of overly long message ID.

    Since Xapian has a limit on the maximum length of a term, we have
    to check for that before trying to add the message ID as a term.

    This fixes the bug reported by Mike Hommey here:

        <20091120132625.GA19246@glandium.org>

    I've also constructed 20 files with a range of message ID lengths
    centered around the Xapian term-length limit which I'll use to seed
    a new test suite soon.

diff --git a/lib/database.cc b/lib/database.cc
index 169dc5e..f4a445a 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -892,7 +892,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
 
     const char *date, *header;
     const char *from, *to, *subject;
-    char *message_id;
+    char *message_id = NULL;
 
     if (message_ret)
         *message_ret = NULL;
@@ -937,11 +937,20 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
         header = notmuch_message_file_get_header (message_file, "message-id");
         if (header && *header != '\0') {
             message_id = _parse_message_id (message_file, header, NULL);
+
             /* So the header value isn't RFC-compliant, but it's
              * better than no message-id at all. */
             if (message_id == NULL)
                 message_id = talloc_strdup (message_file, header);
-        } else {
+
+            /* Reject a Message ID that's too long. */
+            if (message_id && strlen (message_id) + 1 > NOTMUCH_TERM_MAX) {
+                talloc_free (message_id);
+                message_id = NULL;
+            }
+        }
+
+        if (message_id == NULL ) {
             /* No message-id at all, let's generate one by taking a
              * hash over the file's contents. */
             char *sha1 = notmuch_sha1_of_file (filename);
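
For readers who want to try the control flow of the patch outside of notmuch, here is a minimal standalone sketch of the same decision: use the header's message ID when it fits within the term limit, otherwise fall back to an ID derived from the file contents. Everything named `example_*` or `EXAMPLE_*`, and the function `choose_message_id`, is hypothetical and not part of notmuch. The limit of 245 is an assumed value for illustration only, since the patch does not show how `NOTMUCH_TERM_MAX` is defined. The hash is a 64-bit FNV-1a stand-in, not SHA-1, chosen only to keep the sketch short; the real code calls `notmuch_sha1_of_file`.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumption: illustrative limit only; the patch uses NOTMUCH_TERM_MAX,
 * whose value is not shown in the diff. */
#define EXAMPLE_TERM_MAX 245

/* Stand-in for notmuch_sha1_of_file: 64-bit FNV-1a over a buffer.
 * This is NOT SHA-1; it only keeps the sketch self-contained. */
static uint64_t
example_hash (const char *data, size_t len)
{
    uint64_t h = 14695981039346656037ULL; /* FNV offset basis */
    size_t i;

    for (i = 0; i < len; i++) {
        h ^= (unsigned char) data[i];
        h *= 1099511628211ULL;            /* FNV prime */
    }
    return h;
}

/* Hypothetical helper mirroring the patched logic.
 * Writes the chosen ID into 'out' and returns 1 if the header's ID was
 * used, or 0 if a content-derived ID was generated instead. */
static int
choose_message_id (const char *header, const char *contents,
                   char *out, size_t out_size)
{
    /* The buffer must be able to hold any ID that passes the check. */
    assert (out_size > EXAMPLE_TERM_MAX);

    /* Same test as the patch, inverted: an ID is rejected when
     * strlen (id) + 1 > limit, so it is accepted when it is <= limit. */
    if (header && *header != '\0' &&
        strlen (header) + 1 <= EXAMPLE_TERM_MAX) {
        snprintf (out, out_size, "%s", header);
        return 1;
    }

    /* Missing, empty, or overly long ID: derive one from the contents. */
    snprintf (out, out_size, "example-hash-%016llx",
              (unsigned long long) example_hash (contents,
                                                 strlen (contents)));
    return 0;
}
```

Calling `choose_message_id` with a short ID copies it through unchanged, while an ID longer than the limit, an empty string, or `NULL` all take the same fallback path. That matches the point of the patch: the `else` branch became a separate `if (message_id == NULL)` check, so "no ID" and "ID rejected as too long" share one code path.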