--- /dev/null
+Return-Path: <m.walters@qmul.ac.uk>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by olra.theworths.org (Postfix) with ESMTP id 132F7431FB6\r
+ for <notmuch@notmuchmail.org>; Tue, 21 Oct 2014 16:10:38 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -1.098\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-1.098 tagged_above=-999 required=5\r
+ tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001,\r
+ NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id uV+vVp9B3lMM for <notmuch@notmuchmail.org>;\r
+ Tue, 21 Oct 2014 16:10:30 -0700 (PDT)\r
+Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6])\r
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\r
+ (No client certificate requested)\r
+ by olra.theworths.org (Postfix) with ESMTPS id C5058431FAE\r
+ for <notmuch@notmuchmail.org>; Tue, 21 Oct 2014 16:10:29 -0700 (PDT)\r
+Received: from smtp.qmul.ac.uk ([138.37.6.40])\r
+ by mail2.qmul.ac.uk with esmtp (Exim 4.71)\r
+ (envelope-from <m.walters@qmul.ac.uk>)\r
+ id 1XgiZM-0004l0-3F; Wed, 22 Oct 2014 00:10:26 +0100\r
+Received: from 5751dfa2.skybroadband.com ([87.81.223.162] helo=localhost)\r
+ by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.71)\r
+ (envelope-from <m.walters@qmul.ac.uk>)\r
+ id 1XgiZL-0002hb-E9; Wed, 22 Oct 2014 00:10:23 +0100\r
+From: Mark Walters <markwalters1009@gmail.com>\r
+To: Austin Clements <aclements@csail.mit.edu>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH v2 08/12] lib: Implement ghost-based thread linking\r
+In-Reply-To: <1412637438-4821-9-git-send-email-aclements@csail.mit.edu>\r
+References: <1412637438-4821-1-git-send-email-aclements@csail.mit.edu>\r
+ <1412637438-4821-9-git-send-email-aclements@csail.mit.edu>\r
+User-Agent: Notmuch/0.18.1+86~gef5e66a (http://notmuchmail.org) Emacs/23.4.1\r
+ (x86_64-pc-linux-gnu)\r
+Date: Wed, 22 Oct 2014 00:10:21 +0100\r
+Message-ID: <87zjcphvdu.fsf@qmul.ac.uk>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=us-ascii\r
+X-Sender-Host-Address: 87.81.223.162\r
+X-QM-Geographic: According to ripencc,\r
+ this message was delivered by a machine in Britain (UK) (GB).\r
+X-QM-SPAM-Info: Sender has good ham record. :)\r
+X-QM-Body-MD5: 9724697ff21306866e8fe63340c902a0 (of first 20000 bytes)\r
+X-SpamAssassin-Score: -0.1\r
+X-SpamAssassin-SpamBar: /\r
+X-SpamAssassin-Report: The QM spam filters have analysed this message to\r
+ determine if it is\r
+ spam. We require at least 5.0 points to mark a message as spam.\r
+ This message scored -0.1 points.\r
+ Summary of the scoring: \r
+ * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail\r
+ provider * (markwalters1009[at]gmail.com)\r
+ * -0.1 AWL AWL: From: address is in the auto white-list\r
+X-QM-Scan-Virus: ClamAV says the message is clean\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Tue, 21 Oct 2014 23:10:38 -0000\r
+\r
+On Tue, 07 Oct 2014, Austin Clements <aclements@csail.mit.edu> wrote:\r
+> From: Austin Clements <amdragon@mit.edu>\r
+>\r
+> This updates the thread linking code to use ghost messages instead of\r
+> user metadata to link messages into threads.\r
+>\r
+> In contrast with the old approach, this is actually correct.\r
+> Previously, thread merging updated only the thread IDs of message\r
+> documents, not thread IDs stored in user metadata. As originally\r
+> diagnosed by Mark Walters [1] and as demonstrated by the broken\r
+> T260-thread-order test, this can cause notmuch to fail to link\r
+> messages even though they're in the same thread. In principle the old\r
+> approach could have been fixed by updating the user metadata thread\r
+> IDs as well, but these are not indexed and hence this would have\r
+> required a full scan of all stored thread IDs. Ghost messages solve\r
+> this problem naturally by reusing the exact same thread ID and message\r
+> ID representation and indexing as regular messages.\r
+>\r
+> Furthermore, thanks to this greater symmetry, ghost messages are also\r
+> algorithmically simpler. We continue to support the old user metadata\r
+> format, so this patch can't delete any code, but when we do remove\r
+> support for the old format, several functions can simply be deleted.\r
+>\r
+> [1] id:8738h7kv2q.fsf@qmul.ac.uk\r
+> ---\r
+> lib/database.cc | 86 +++++++++++++++++++++++++++++++++++++++++++++++++--------\r
+> 1 file changed, 75 insertions(+), 11 deletions(-)\r
+>\r
+> diff --git a/lib/database.cc b/lib/database.cc\r
+> index c641bcd..fdcc526 100644\r
+> --- a/lib/database.cc\r
+> +++ b/lib/database.cc\r
+> @@ -1752,6 +1752,12 @@ _get_metadata_thread_id_key (void *ctx, const char *message_id)\r
+> message_id);\r
+> }\r
+> \r
+> +static notmuch_status_t\r
+> +_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,\r
+> + void *ctx,\r
+> + const char *message_id,\r
+> + const char **thread_id_ret);\r
+> +\r
+> /* Find the thread ID to which the message with 'message_id' belongs.\r
+> *\r
+> * Note: 'thread_id_ret' must not be NULL!\r
+> @@ -1760,9 +1766,9 @@ _get_metadata_thread_id_key (void *ctx, const char *message_id)\r
+> *\r
+> * Note: If there is no message in the database with the given\r
+> * 'message_id' then a new thread_id will be allocated for this\r
+> - * message and stored in the database metadata, (where this same\r
+> + * message ID and stored in the database metadata so that the\r
+> * thread ID can be looked up if the message is added to the database\r
+> - * later).\r
+> + * later.\r
+> */\r
+> static notmuch_status_t\r
+> _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,\r
+> @@ -1770,6 +1776,49 @@ _resolve_message_id_to_thread_id (notmuch_database_t *notmuch,\r
+> const char *message_id,\r
+> const char **thread_id_ret)\r
+> {\r
+> + notmuch_private_status_t status;\r
+> + notmuch_message_t *message;\r
+> +\r
+> + if (! (notmuch->features & NOTMUCH_FEATURE_GHOSTS))\r
+> + return _resolve_message_id_to_thread_id_old (notmuch, ctx, message_id,\r
+> + thread_id_ret);\r
+> +\r
+> + /* Look for this message (regular or ghost) */\r
+> + message = _notmuch_message_create_for_message_id (\r
+> + notmuch, message_id, &status);\r
+> + if (status == NOTMUCH_PRIVATE_STATUS_SUCCESS) {\r
+> + /* Message exists */\r
+> + *thread_id_ret = talloc_steal (\r
+> + ctx, notmuch_message_get_thread_id (message));\r
+> + } else if (status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {\r
+> + /* Message did not exist. Give it a fresh thread ID and\r
+> + * populate this message as a ghost message. */\r
+> + *thread_id_ret = talloc_strdup (\r
+> + ctx, _notmuch_database_generate_thread_id (notmuch));\r
+> + if (! *thread_id_ret) {\r
+> + status = NOTMUCH_PRIVATE_STATUS_OUT_OF_MEMORY;\r
+> + } else {\r
+> + status = _notmuch_message_initialize_ghost (message, *thread_id_ret);\r
+> + if (status == 0)\r
+> + /* Commit the new ghost message */\r
+> + _notmuch_message_sync (message);\r
+> + }\r
+> + } else {\r
+> + /* Create failed. Fall through. */\r
+> + }\r
+> +\r
+> + notmuch_message_destroy (message);\r
+> +\r
+> + return COERCE_STATUS (status, "Error creating ghost message");\r
+> +}\r
+> +\r
+> +/* Pre-ghost messages _resolve_message_id_to_thread_id */\r
+> +static notmuch_status_t\r
+> +_resolve_message_id_to_thread_id_old (notmuch_database_t *notmuch,\r
+> + void *ctx,\r
+> + const char *message_id,\r
+> + const char **thread_id_ret)\r
+> +{\r
+> notmuch_status_t status;\r
+> notmuch_message_t *message;\r
+> string thread_id_string;\r
+> @@ -2007,7 +2056,7 @@ _consume_metadata_thread_id (void *ctx, notmuch_database_t *notmuch,\r
+> }\r
+> }\r
+> \r
+> -/* Given a (mostly empty) 'message' and its corresponding\r
+> +/* Given a blank or ghost 'message' and its corresponding\r
+> * 'message_file' link it to existing threads in the database.\r
+> *\r
+> * The first check is in the metadata of the database to see if we\r
+\r
+There is quite a lot of comment below this and it is not clear to me how\r
+much of it applies to the is_ghost case. In particular does the \r
+\r
+ * Finally, we look in the database for existing message that\r
+ * reference 'message'.\r
+\r
+still apply? I would have thought that we would already have done that\r
+in the is_ghost case. \r
+\r
+Of course this also applies to the actual code which seems to call\r
+_notmuch_database_link_message_to_children unconditionally. It might be\r
+worth a comment in any case.\r
+\r
+Best wishes\r
+\r
+Mark\r
+\r
+\r
+> @@ -2035,16 +2084,22 @@ _consume_metadata_thread_id (void *ctx, notmuch_database_t *notmuch,\r
+> static notmuch_status_t\r
+> _notmuch_database_link_message (notmuch_database_t *notmuch,\r
+> notmuch_message_t *message,\r
+> - notmuch_message_file_t *message_file)\r
+> + notmuch_message_file_t *message_file,\r
+> + notmuch_bool_t is_ghost)\r
+> {\r
+> void *local = talloc_new (NULL);\r
+> notmuch_status_t status;\r
+> - const char *thread_id;\r
+> + const char *thread_id = NULL;\r
+> \r
+> /* Check if the message already had a thread ID */\r
+> - thread_id = _consume_metadata_thread_id (local, notmuch, message);\r
+> - if (thread_id)\r
+> - _notmuch_message_add_term (message, "thread", thread_id);\r
+> + if (notmuch->features & NOTMUCH_FEATURE_GHOSTS) {\r
+> + if (is_ghost)\r
+> + thread_id = notmuch_message_get_thread_id (message);\r
+> + } else {\r
+> + thread_id = _consume_metadata_thread_id (local, notmuch, message);\r
+> + if (thread_id)\r
+> + _notmuch_message_add_term (message, "thread", thread_id);\r
+> + }\r
+> \r
+> status = _notmuch_database_link_message_to_parents (notmuch, message,\r
+> message_file,\r
+> @@ -2079,6 +2134,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,\r
+> notmuch_message_t *message = NULL;\r
+> notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS, ret2;\r
+> notmuch_private_status_t private_status;\r
+> + notmuch_bool_t is_ghost = false;\r
+> \r
+> const char *date, *header;\r
+> const char *from, *to, *subject;\r
+> @@ -2171,12 +2227,20 @@ notmuch_database_add_message (notmuch_database_t *notmuch,\r
+> \r
+> _notmuch_message_add_filename (message, filename);\r
+> \r
+> - /* Is this a newly created message object? */\r
+> - if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND) {\r
+> + /* Is this a newly created message object or a ghost\r
+> + * message? We have to be slightly careful: if this is a\r
+> + * blank message, it's not safe to call\r
+> + * notmuch_message_get_flag yet. */\r
+> + if (private_status == NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND ||\r
+> + (is_ghost = notmuch_message_get_flag (\r
+> + message, NOTMUCH_MESSAGE_FLAG_GHOST))) {\r
+> _notmuch_message_add_term (message, "type", "mail");\r
+> + if (is_ghost)\r
+> + /* Convert ghost message to a regular message */\r
+> + _notmuch_message_remove_term (message, "type", "ghost");\r
+> \r
+> ret = _notmuch_database_link_message (notmuch, message,\r
+> - message_file);\r
+> + message_file, is_ghost);\r
+> if (ret)\r
+> goto DONE;\r
+> \r
+> -- \r
+> 2.1.0\r
+>\r
+> _______________________________________________\r
+> notmuch mailing list\r
+> notmuch@notmuchmail.org\r
+> http://notmuchmail.org/mailman/listinfo/notmuch\r