--- /dev/null
+Return-Path: <jani@nikula.org>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id ED41C431FAF\r
+ for <notmuch@notmuchmail.org>; Mon, 3 Feb 2014 11:52:47 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "References"\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.7\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5\r
+ tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id oH12yEE4QUMx for <notmuch@notmuchmail.org>;\r
+ Mon, 3 Feb 2014 11:52:41 -0800 (PST)\r
+Received: from mail-ea0-f179.google.com (mail-ea0-f179.google.com\r
+ [209.85.215.179]) (using TLSv1 with cipher RC4-SHA (128/128 bits))\r
+ (No client certificate requested)\r
+ by olra.theworths.org (Postfix) with ESMTPS id CB8D9429E2F\r
+ for <notmuch@notmuchmail.org>; Mon, 3 Feb 2014 11:52:15 -0800 (PST)\r
+Received: by mail-ea0-f179.google.com with SMTP id q10so3297157ead.38\r
+ for <notmuch@notmuchmail.org>; Mon, 03 Feb 2014 11:52:14 -0800 (PST)\r
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\r
+ d=1e100.net; s=20130820;\r
+ h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to\r
+ :references:in-reply-to:references;\r
+ bh=j6/yU5yTiE2oITHiaMSWAlokJzHivQmMFo+30O6a6Ng=;\r
+ b=ZieoV/+mYSKPYsY+Acs5GwfGUprgj8H96hai3eXuSoSrsLXILG+UYugsMglqCzjmbE\r
+ 9qK80eONAimH/ZR6pAWQFRhTHKZDCZown6ZxEwjoEojO5JTh5724gl3xForeFpTzSOHD\r
+ /MXIDjdBWr9k0uYOlSLsLeUl76SYy3/Amvu5ja5BJwI1zWMicKXOJsDCEcdBrjGynBDu\r
+ 1gppUkKhhg/FoZy5NmwgaU4vtUTp80tEtGNvVN5mq4mYwRsez2iWxou2hp4H3GjW7s44\r
+ UIXZqx2tqG35eT3MmjxvpoXkChsqpdMjZKvYbZHSR9gtplEa1obFJRDMYJZqlUBIbg3y\r
+ jQ0A==\r
+X-Gm-Message-State:\r
+ ALoCoQm7JhTE+y70/Scwe4sc8hPEhqsxMK+yS9sBvxyN2IwiShDzIOM1MlKV4I+q/s60iaQvayS6\r
+X-Received: by 10.14.182.5 with SMTP id n5mr4549528eem.68.1391457134621;\r
+ Mon, 03 Feb 2014 11:52:14 -0800 (PST)\r
+Received: from localhost (dsl-hkibrasgw2-58c36f-91.dhcp.inet.fi.\r
+ [88.195.111.91])\r
+ by mx.google.com with ESMTPSA id g1sm78793412eet.6.2014.02.03.11.52.12\r
+ for <multiple recipients>\r
+ (version=TLSv1.2 cipher=RC4-SHA bits=128/128);\r
+ Mon, 03 Feb 2014 11:52:13 -0800 (PST)\r
+From: Jani Nikula <jani@nikula.org>\r
+To: notmuch@notmuchmail.org\r
+Subject: [PATCH v3 5/6] lib: replace the header parser with gmime\r
+Date: Mon, 3 Feb 2014 21:51:45 +0200\r
+Message-Id:\r
+ <bdef45dc21c777130a7ae1fa650f172d8c6eaaf4.1391456555.git.jani@nikula.org>\r
+X-Mailer: git-send-email 1.8.5.2\r
+In-Reply-To: <cover.1391456555.git.jani@nikula.org>\r
+References: <cover.1391456555.git.jani@nikula.org>\r
+In-Reply-To: <cover.1391456555.git.jani@nikula.org>\r
+References: <cover.1391456555.git.jani@nikula.org>\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Mon, 03 Feb 2014 19:52:48 -0000\r
+\r
+The notmuch library includes a full blown message header parser. Yet\r
+the same message headers are parsed by gmime during indexing. Switch\r
+to gmime parsing completely.\r
+\r
+These are the main differences between the parsers:\r
+\r
+* Gmime stops header parsing at the first invalid header, and presumes\r
+ the message body starts from there. The current parser is quite\r
+ liberal in accepting broken headers. The change means we will be\r
+ much pickier about accepting invalid messages. (Note that the\r
+ headers after the invalid header would not have been indexed before\r
+ this change anyway, as we use gmime for that.)\r
+\r
+* The current parser converts tabs used in header folding to\r
+ spaces. Gmime preserve the tabs. Due to a broken python library used\r
+ in mailman, there are plenty of mailing lists that produce headers\r
+ with tabs in header folding, and we'll see plenty of tabs. (This\r
+ change has been mitigated in preparatory patches.)\r
+\r
+* For pure header parsing, the current parser is likely faster than\r
+ gmime, which parses the whole message rather than just the\r
+ headers. Since we use gmime for indexing anyway, we can drop an\r
+ extra header parsing round (this is done in a follow-up patch).\r
+\r
+At this step, we only switch the header parsing to gmime.\r
+---\r
+ lib/database.cc | 4 +\r
+ lib/index.cc | 11 --\r
+ lib/message-file.c | 349 ++++++++++++++++----------------------------------\r
+ lib/message.cc | 6 +\r
+ lib/notmuch-private.h | 4 +\r
+ 5 files changed, 125 insertions(+), 249 deletions(-)\r
+\r
+diff --git a/lib/database.cc b/lib/database.cc\r
+index f395061..d1bea88 100644\r
+--- a/lib/database.cc\r
++++ b/lib/database.cc\r
+@@ -1940,6 +1940,10 @@ notmuch_database_add_message (notmuch_database_t *notmuch,\r
+ "to",\r
+ (char *) NULL);\r
+ \r
++ ret = notmuch_message_file_parse (message_file);\r
++ if (ret)\r
++ goto DONE;\r
++\r
+ try {\r
+ /* Before we do any real work, (especially before doing a\r
+ * potential SHA-1 computation on the entire file's contents),\r
+diff --git a/lib/index.cc b/lib/index.cc\r
+index 78c18cf..976e49f 100644\r
+--- a/lib/index.cc\r
++++ b/lib/index.cc\r
+@@ -437,7 +437,6 @@ _notmuch_message_index_file (notmuch_message_t *message,\r
+ static int initialized = 0;\r
+ char from_buf[5];\r
+ bool is_mbox = false;\r
+- static bool mbox_warning = false;\r
+ \r
+ if (! initialized) {\r
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);\r
+@@ -471,16 +470,6 @@ _notmuch_message_index_file (notmuch_message_t *message,\r
+ ret = NOTMUCH_STATUS_FILE_NOT_EMAIL;\r
+ goto DONE;\r
+ }\r
+- /* For historical reasons, we support single-message mboxes,\r
+- * but this behavior is likely to change in the future, so\r
+- * warn. */\r
+- if (!mbox_warning) {\r
+- mbox_warning = true;\r
+- fprintf (stderr, "\\r
+-Warning: %s is an mbox containing a single message,\n\\r
+-likely caused by misconfigured mail delivery. Support for single-message\n\\r
+-mboxes is deprecated and may be removed in the future.\n", filename);\r
+- }\r
+ }\r
+ \r
+ from = g_mime_message_get_sender (mime_message);\r
+diff --git a/lib/message-file.c b/lib/message-file.c\r
+index a2850c2..33f6468 100644\r
+--- a/lib/message-file.c\r
++++ b/lib/message-file.c\r
+@@ -26,30 +26,19 @@\r
+ \r
+ #include <glib.h> /* GHashTable */\r
+ \r
+-typedef struct {\r
+- char *str;\r
+- size_t size;\r
+- size_t len;\r
+-} header_value_closure_t;\r
+-\r
+ struct _notmuch_message_file {\r
+ /* File object */\r
+ FILE *file;\r
++ char *filename;\r
+ \r
+ /* Header storage */\r
+ int restrict_headers;\r
+ GHashTable *headers;\r
+- int broken_headers;\r
+- int good_headers;\r
+- size_t header_size; /* Length of full message header in bytes. */\r
+-\r
+- /* Parsing state */\r
+- char *line;\r
+- size_t line_size;\r
+- header_value_closure_t value;\r
+ \r
+- int parsing_started;\r
+- int parsing_finished;\r
++ GMimeStream *stream;\r
++ GMimeParser *parser;\r
++ GMimeMessage *message;\r
++ notmuch_bool_t parsed;\r
+ };\r
+ \r
+ static int\r
+@@ -76,16 +65,18 @@ strcase_hash (const void *ptr)\r
+ static int\r
+ _notmuch_message_file_destructor (notmuch_message_file_t *message)\r
+ {\r
+- if (message->line)\r
+- free (message->line);\r
+-\r
+- if (message->value.size)\r
+- free (message->value.str);\r
+-\r
+ if (message->headers)\r
+ g_hash_table_destroy (message->headers);\r
+ \r
+- if (message->file)\r
++ if (message->message)\r
++ g_object_unref (message->message);\r
++\r
++ if (message->parser)\r
++ g_object_unref (message->parser);\r
++\r
++ if (message->stream)\r
++ g_object_unref (message->stream);\r
++ else if (message->file) /* GMimeStream takes over the FILE* */\r
+ fclose (message->file);\r
+ \r
+ return 0;\r
+@@ -102,19 +93,21 @@ _notmuch_message_file_open_ctx (void *ctx, const char *filename)\r
+ if (unlikely (message == NULL))\r
+ return NULL;\r
+ \r
++ /* only needed for error messages during parsing */\r
++ message->filename = talloc_strdup (message, filename);\r
++ if (message->filename == NULL)\r
++ goto FAIL;\r
++\r
+ talloc_set_destructor (message, _notmuch_message_file_destructor);\r
+ \r
+ message->file = fopen (filename, "r");\r
+ if (message->file == NULL)\r
+ goto FAIL;\r
+ \r
+- message->headers = g_hash_table_new_full (strcase_hash,\r
+- strcase_equal,\r
+- free,\r
+- g_free);\r
+-\r
+- message->parsing_started = 0;\r
+- message->parsing_finished = 0;\r
++ message->headers = g_hash_table_new_full (strcase_hash, strcase_equal,\r
++ free, g_free);\r
++ if (message->headers == NULL)\r
++ goto FAIL;\r
+ \r
+ return message;\r
+ \r
+@@ -143,7 +136,7 @@ notmuch_message_file_restrict_headersv (notmuch_message_file_t *message,\r
+ {\r
+ char *header;\r
+ \r
+- if (message->parsing_started)\r
++ if (message->parsed)\r
+ INTERNAL_ERROR ("notmuch_message_file_restrict_headers called after parsing has started");\r
+ \r
+ while (1) {\r
+@@ -167,234 +160,114 @@ notmuch_message_file_restrict_headers (notmuch_message_file_t *message, ...)\r
+ notmuch_message_file_restrict_headersv (message, va_headers);\r
+ }\r
+ \r
+-static void\r
+-copy_header_unfolding (header_value_closure_t *value,\r
+- const char *chunk)\r
++/*\r
++ * gmime does not provide access to all Received: headers the way we\r
++ * want, so we'll to use the parser header callback to gather them\r
++ * into a hash table.\r
++ */\r
++static void header_cb (unused(GMimeParser *parser), const char *header,\r
++ const char *value, unused(gint64 offset),\r
++ gpointer user_data)\r
+ {\r
+- char *last;\r
++ notmuch_message_file_t *message = (notmuch_message_file_t *) user_data;\r
++ char *existing = NULL;\r
++ notmuch_bool_t found;\r
+ \r
+- if (chunk == NULL)\r
++ found = g_hash_table_lookup_extended (message->headers, header,\r
++ NULL, (gpointer *) &existing);\r
++ if (! found && message->restrict_headers)\r
+ return;\r
+ \r
+- while (*chunk == ' ' || *chunk == '\t')\r
+- chunk++;\r
+-\r
+- if (value->len + 1 + strlen (chunk) + 1 > value->size) {\r
+- unsigned int new_size = value->size;\r
+- if (value->size == 0)\r
+- new_size = strlen (chunk) + 1;\r
+- else\r
+- while (value->len + 1 + strlen (chunk) + 1 > new_size)\r
+- new_size *= 2;\r
+- value->str = xrealloc (value->str, new_size);\r
+- value->size = new_size;\r
+- }\r
+-\r
+- last = value->str + value->len;\r
+- if (value->len) {\r
+- *last = ' ';\r
+- last++;\r
+- value->len++;\r
+- }\r
+-\r
+- strcpy (last, chunk);\r
+- value->len += strlen (chunk);\r
+-\r
+- last = value->str + value->len - 1;\r
+- if (*last == '\n') {\r
+- *last = '\0';\r
+- value->len--;\r
++ if (existing == NULL) {\r
++ /* No value, add one */\r
++ char *decoded = g_mime_utils_header_decode_text (value);\r
++ g_hash_table_insert (message->headers, xstrdup (header), decoded);\r
++ } else if (strcasecmp (header, "received") == 0) {\r
++ /* Concat all of the Received: headers we encounter. */\r
++ char *combined, *decoded;\r
++ size_t combined_size;\r
++\r
++ decoded = g_mime_utils_header_decode_text (value);\r
++\r
++ combined_size = strlen(existing) + strlen(decoded) + 2;\r
++ combined = g_malloc (combined_size);\r
++ snprintf (combined, combined_size, "%s %s", existing, decoded);\r
++ g_free (decoded);\r
++ g_hash_table_insert (message->headers, xstrdup (header), combined);\r
+ }\r
+ }\r
+ \r
+-/* As a special-case, a value of NULL for header_desired will force\r
+- * the entire header to be parsed if it is not parsed already. This is\r
+- * used by the _notmuch_message_file_get_headers_end function.\r
+- * Another special case is the Received: header. For this header we\r
+- * want to concatenate all instances of the header instead of just\r
+- * hashing the first instance as we use this when analyzing the path\r
+- * the mail has taken from sender to recipient.\r
+- */\r
+-const char *\r
+-notmuch_message_file_get_header (notmuch_message_file_t *message,\r
+- const char *header_desired)\r
++notmuch_status_t\r
++notmuch_message_file_parse (notmuch_message_file_t *message)\r
+ {\r
+- int contains;\r
+- char *header, *decoded_value, *header_sofar, *combined_header;\r
+- const char *s, *colon;\r
+- int match, newhdr, hdrsofar, is_received;\r
+ static int initialized = 0;\r
+-\r
+- is_received = (strcmp(header_desired,"received") == 0);\r
++ char from_buf[5];\r
++ notmuch_bool_t is_mbox = FALSE;\r
++ static notmuch_bool_t mbox_warning = FALSE;\r
+ \r
+ if (! initialized) {\r
+ g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);\r
+ initialized = 1;\r
+ }\r
+ \r
+- message->parsing_started = 1;\r
+-\r
+- if (header_desired == NULL)\r
+- contains = 0;\r
+- else\r
+- contains = g_hash_table_lookup_extended (message->headers,\r
+- header_desired, NULL,\r
+- (gpointer *) &decoded_value);\r
+-\r
+- if (contains && decoded_value)\r
+- return decoded_value;\r
+-\r
+- if (message->parsing_finished)\r
+- return "";\r
+-\r
+-#define NEXT_HEADER_LINE(closure) \\r
+- while (1) { \\r
+- ssize_t bytes_read = getline (&message->line, \\r
+- &message->line_size, \\r
+- message->file); \\r
+- if (bytes_read == -1) { \\r
+- message->parsing_finished = 1; \\r
+- break; \\r
+- } \\r
+- if (*message->line == '\n') { \\r
+- message->parsing_finished = 1; \\r
+- break; \\r
+- } \\r
+- if (closure && \\r
+- (*message->line == ' ' || *message->line == '\t')) \\r
+- { \\r
+- copy_header_unfolding ((closure), message->line); \\r
+- } \\r
+- if (*message->line == ' ' || *message->line == '\t') \\r
+- message->header_size += strlen (message->line); \\r
+- else \\r
+- break; \\r
+- }\r
+-\r
+- if (message->line == NULL)\r
+- NEXT_HEADER_LINE (NULL);\r
+-\r
+- while (1) {\r
+-\r
+- if (message->parsing_finished)\r
+- break;\r
+-\r
+- colon = strchr (message->line, ':');\r
+-\r
+- if (colon == NULL) {\r
+- message->broken_headers++;\r
+- /* A simple heuristic for giving up on things that just\r
+- * don't look like mail messages. */\r
+- if (message->broken_headers >= 10 &&\r
+- message->good_headers < 5)\r
+- {\r
+- message->parsing_finished = 1;\r
+- break;\r
+- }\r
+- NEXT_HEADER_LINE (NULL);\r
+- continue;\r
++ /* Is this mbox? */\r
++ if (fread (from_buf, sizeof (from_buf), 1, message->file) == 1 &&\r
++ strncmp (from_buf, "From ", 5) == 0)\r
++ is_mbox = TRUE;\r
++ rewind (message->file);\r
++\r
++ message->stream = g_mime_stream_file_new (message->file);\r
++ message->parser = g_mime_parser_new_with_stream (message->stream);\r
++ g_mime_parser_set_scan_from (message->parser, is_mbox);\r
++ g_mime_parser_set_header_regex (message->parser, ".*", header_cb,\r
++ (gpointer) message);\r
++\r
++ message->message = g_mime_parser_construct_message (message->parser);\r
++ if (! message->message)\r
++ return NOTMUCH_STATUS_FILE_NOT_EMAIL;\r
++\r
++ if (is_mbox) {\r
++ if (! g_mime_parser_eos (message->parser)) {\r
++ /* This is a multi-message mbox. */\r
++ return NOTMUCH_STATUS_FILE_NOT_EMAIL;\r
+ }\r
+-\r
+- message->header_size += strlen (message->line);\r
+-\r
+- message->good_headers++;\r
+-\r
+- header = xstrndup (message->line, colon - message->line);\r
+-\r
+- if (message->restrict_headers &&\r
+- ! g_hash_table_lookup_extended (message->headers,\r
+- header, NULL, NULL))\r
+- {\r
+- free (header);\r
+- NEXT_HEADER_LINE (NULL);\r
+- continue;\r
+- }\r
+-\r
+- s = colon + 1;\r
+- while (*s == ' ' || *s == '\t')\r
+- s++;\r
+-\r
+- message->value.len = 0;\r
+- copy_header_unfolding (&message->value, s);\r
+-\r
+- NEXT_HEADER_LINE (&message->value);\r
+-\r
+- if (header_desired == NULL)\r
+- match = 0;\r
+- else\r
+- match = (strcasecmp (header, header_desired) == 0);\r
+-\r
+- decoded_value = g_mime_utils_header_decode_text (message->value.str);\r
+- header_sofar = (char *)g_hash_table_lookup (message->headers, header);\r
+- /* we treat the Received: header special - we want to concat ALL of \r
+- * the Received: headers we encounter.\r
+- * for everything else we return the first instance of a header */\r
+- if (strcasecmp(header, "received") == 0) {\r
+- if (header_sofar == NULL) {\r
+- /* first Received: header we encountered; just add it */\r
+- g_hash_table_insert (message->headers, header, decoded_value);\r
+- } else {\r
+- /* we need to add the header to those we already collected */\r
+- newhdr = strlen(decoded_value);\r
+- hdrsofar = strlen(header_sofar);\r
+- combined_header = g_malloc(hdrsofar + newhdr + 2);\r
+- strncpy(combined_header,header_sofar,hdrsofar);\r
+- *(combined_header+hdrsofar) = ' ';\r
+- strncpy(combined_header+hdrsofar+1,decoded_value,newhdr+1);\r
+- g_free (decoded_value);\r
+- g_hash_table_insert (message->headers, header, combined_header);\r
+- }\r
+- } else {\r
+- if (header_sofar == NULL) {\r
+- /* Only insert if we don't have a value for this header, yet. */\r
+- g_hash_table_insert (message->headers, header, decoded_value);\r
+- } else {\r
+- free (header);\r
+- g_free (decoded_value);\r
+- decoded_value = header_sofar;\r
+- }\r
++ /*\r
++ * For historical reasons, we support single-message mboxes,\r
++ * but this behavior is likely to change in the future, so\r
++ * warn.\r
++ */\r
++ if (! mbox_warning) {\r
++ mbox_warning = TRUE;\r
++ fprintf (stderr, "\\r
++Warning: %s is an mbox containing a single message,\n\\r
++likely caused by misconfigured mail delivery. Support for single-message\n\\r
++mboxes is deprecated and may be removed in the future.\n", message->filename);\r
+ }\r
+- /* if we found a match we can bail - unless of course we are\r
+- * collecting all the Received: headers */\r
+- if (match && !is_received)\r
+- return decoded_value;\r
+ }\r
+ \r
+- if (message->parsing_finished) {\r
+- fclose (message->file);\r
+- message->file = NULL;\r
+- }\r
++ message->parsed = TRUE;\r
+ \r
+- if (message->line)\r
+- free (message->line);\r
+- message->line = NULL;\r
++ return NOTMUCH_STATUS_SUCCESS;\r
++}\r
+ \r
+- if (message->value.size) {\r
+- free (message->value.str);\r
+- message->value.str = NULL;\r
+- message->value.size = 0;\r
+- message->value.len = 0;\r
+- }\r
++/* return NULL on errors, empty string for non-existing headers */\r
++const char *\r
++notmuch_message_file_get_header (notmuch_message_file_t *message,\r
++ const char *header)\r
++{\r
++ const char *value = NULL;\r
++ notmuch_bool_t found;\r
+ \r
+- /* For the Received: header we actually might end up here even\r
+- * though we found the header (as we force continued parsing\r
+- * in that case). So let's check if that's the header we were\r
+- * looking for and return the value that we found (if any)\r
+- */\r
+- if (is_received)\r
+- return (char *)g_hash_table_lookup (message->headers, "received");\r
+-\r
+- /* We've parsed all headers and never found the one we're looking\r
+- * for. It's probably just not there, but let's check that we\r
+- * didn't make a mistake preventing us from seeing it. */\r
+- if (message->restrict_headers && header_desired &&\r
+- ! g_hash_table_lookup_extended (message->headers,\r
+- header_desired, NULL, NULL))\r
+- {\r
+- INTERNAL_ERROR ("Attempt to get header \"%s\" which was not\n"\r
+- "included in call to notmuch_message_file_restrict_headers\n",\r
+- header_desired);\r
+- }\r
++ /* the caller shouldn't ask for headers before parsing */\r
++ if (! message->parsed)\r
++ return NULL;\r
++\r
++ found = g_hash_table_lookup_extended (message->headers, header,\r
++ NULL, (gpointer *) &value);\r
++\r
++ /* the caller shouldn't ask for non-restricted headers */\r
++ if (! found && message->restrict_headers)\r
++ return NULL;\r
+ \r
+- return "";\r
++ return value ? value : "";\r
+ }\r
+diff --git a/lib/message.cc b/lib/message.cc\r
+index c91f3a5..9a22d36 100644\r
+--- a/lib/message.cc\r
++++ b/lib/message.cc\r
+@@ -407,6 +407,12 @@ _notmuch_message_ensure_message_file (notmuch_message_t *message)\r
+ return;\r
+ \r
+ message->message_file = _notmuch_message_file_open_ctx (message, filename);\r
++\r
++ /* XXX: better return value handling */\r
++ if (notmuch_message_file_parse (message->message_file)) {\r
++ notmuch_message_file_close (message->message_file);\r
++ message->message_file = NULL;\r
++ }\r
+ }\r
+ \r
+ const char *\r
+diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h\r
+index af185c7..7277df1 100644\r
+--- a/lib/notmuch-private.h\r
++++ b/lib/notmuch-private.h\r
+@@ -345,6 +345,10 @@ notmuch_message_file_open (const char *filename);\r
+ notmuch_message_file_t *\r
+ _notmuch_message_file_open_ctx (void *ctx, const char *filename);\r
+ \r
++/* Parse the message */\r
++notmuch_status_t\r
++notmuch_message_file_parse (notmuch_message_file_t *message);\r
++\r
+ /* Close a notmuch message previously opened with notmuch_message_open. */\r
+ void\r
+ notmuch_message_file_close (notmuch_message_file_t *message);\r
+-- \r
+1.8.5.2\r
+\r