From: Austin Clements Date: Wed, 22 Oct 2014 01:33:00 +0000 (+2000) Subject: Re: [PATCH v2 07/12] lib: Internal support for querying and creating ghost messages X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=d219c53d01fa9a6a921e1c587ef517e3269eb9f2;p=notmuch-archives.git Re: [PATCH v2 07/12] lib: Internal support for querying and creating ghost messages --- diff --git a/2d/5ca56de14210892a6aab7f850bf7217bde83bd b/2d/5ca56de14210892a6aab7f850bf7217bde83bd new file mode 100644 index 000000000..fb028ece8 --- /dev/null +++ b/2d/5ca56de14210892a6aab7f850bf7217bde83bd @@ -0,0 +1,283 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id 18AF0431FB6 + for ; Tue, 21 Oct 2014 18:33:09 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: -2.3 +X-Spam-Level: +X-Spam-Status: No, score=-2.3 tagged_above=-999 required=5 + tests=[RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id N+dcz96voSKY for ; + Tue, 21 Oct 2014 18:33:01 -0700 (PDT) +Received: from outgoing.csail.mit.edu (outgoing.csail.mit.edu [128.30.2.149]) + by olra.theworths.org (Postfix) with ESMTP id 5A730431FAE + for ; Tue, 21 Oct 2014 18:33:01 -0700 (PDT) +Received: from [104.131.20.129] (helo=awakeningjr) + by outgoing.csail.mit.edu with esmtpsa (TLS1.0:RSA_AES_128_CBC_SHA1:16) + (Exim 4.72) (envelope-from ) + id 1XgknM-0004kE-Kp; Tue, 21 Oct 2014 21:33:00 -0400 +Received: from amthrax by awakeningjr with local (Exim 4.84) + (envelope-from ) + id 1XgknM-0000Ah-Ay; Tue, 21 Oct 2014 21:33:00 -0400 +Date: Tue, 21 Oct 2014 21:33:00 -0400 +From: Austin Clements +To: Mark Walters +Subject: Re: [PATCH v2 07/12] lib: Internal support for querying and creating + ghost messages +Message-ID: 
<20141022013300.GD7970@csail.mit.edu> +References: <1412637438-4821-1-git-send-email-aclements@csail.mit.edu> + <1412637438-4821-8-git-send-email-aclements@csail.mit.edu> + <8738ahja72.fsf@qmul.ac.uk> +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +Content-Disposition: inline +In-Reply-To: <8738ahja72.fsf@qmul.ac.uk> +User-Agent: Mutt/1.5.23 (2014-03-12) +Cc: notmuch@notmuchmail.org +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Wed, 22 Oct 2014 01:33:09 -0000 + +Quoth Mark Walters on Oct 22 at 12:05 am: +> +> Hi +> +> I am slowly working my way through this series: only two trivial queries +> so far. +> +> On Tue, 07 Oct 2014, Austin Clements wrote: +> > From: Austin Clements +> > +> > This updates the message abstraction to support ghost messages: it +> > adds a message flag that distinguishes regular messages from ghost +> > messages, and an internal function for initializing a newly created +> > (blank) message as a ghost message. +> > --- +> > lib/message.cc | 52 +++++++++++++++++++++++++++++++++++++++++++++++++-- +> > lib/notmuch-private.h | 4 ++++ +> > lib/notmuch.h | 9 ++++++++- +> > 3 files changed, 62 insertions(+), 3 deletions(-) +> > +> > diff --git a/lib/message.cc b/lib/message.cc +> > index 55d2ff6..a7a13cc 100644 +> > --- a/lib/message.cc +> > +++ b/lib/message.cc +> > @@ -39,6 +39,9 @@ struct visible _notmuch_message { +> > notmuch_message_file_t *message_file; +> > notmuch_message_list_t *replies; +> > unsigned long flags; +> > + /* For flags that are initialized on-demand, lazy_flags indicates +> > + * if each flag has been initialized. 
*/ +> > + unsigned long lazy_flags; +> +> I wonder if valid_flags might be better here as, as far as I can see, +> the reason for these is so we can invalidate a flag more than an +> optimisation (which is what I thought the lazy implied). + +I do think of this as an optimization. If we were to compute the +value of this flag when a message was created (and keep it +up-to-date), there would be no need for lazy_flags. But, unlike the +other flags, computing this is expensive. + +> > +> > Xapian::Document doc; +> > Xapian::termcount termpos; +> > @@ -99,6 +102,7 @@ _notmuch_message_create_for_document (const void *talloc_owner, +> > +> > message->frozen = 0; +> > message->flags = 0; +> > + message->lazy_flags = 0; +> > +> > /* Each of these will be lazily created as needed. */ +> > message->message_id = NULL; +> > @@ -192,7 +196,7 @@ _notmuch_message_create (const void *talloc_owner, +> > * +> > * There is already a document with message ID 'message_id' in the +> > * database. The returned message can be used to query/modify the +> > - * document. +> > + * document. The message may be a ghost message. +> > * +> > * NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND: +> > * +> > @@ -305,6 +309,7 @@ _notmuch_message_ensure_metadata (notmuch_message_t *message) +> > const char *thread_prefix = _find_prefix ("thread"), +> > *tag_prefix = _find_prefix ("tag"), +> > *id_prefix = _find_prefix ("id"), +> > + *type_prefix = _find_prefix ("type"), +> > *filename_prefix = _find_prefix ("file-direntry"), +> > *replyto_prefix = _find_prefix ("replyto"); +> > +> > @@ -337,10 +342,25 @@ _notmuch_message_ensure_metadata (notmuch_message_t *message) +> > message->message_id = +> > _notmuch_message_get_term (message, i, end, id_prefix); +> > +> > + /* Get document type */ +> > + assert (strcmp (id_prefix, type_prefix) < 0); +> > + if (! NOTMUCH_TEST_BIT (message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST)) { +> > + i.skip_to (type_prefix); +> > + /* "T" is the prefix "type" fields. 
See +> > + * BOOLEAN_PREFIX_INTERNAL. */ +> > + if (*i == "Tmail") +> > + NOTMUCH_CLEAR_BIT (&message->flags, NOTMUCH_MESSAGE_FLAG_GHOST); +> > + else if (*i == "Tghost") +> > + NOTMUCH_SET_BIT (&message->flags, NOTMUCH_MESSAGE_FLAG_GHOST); +> > + else +> > + INTERNAL_ERROR ("Message without type term"); +> > + NOTMUCH_SET_BIT (&message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST); +> > + } +> > + +> > /* Get filename list. Here we get only the terms. We lazily +> > * expand them to full file names when needed in +> > * _notmuch_message_ensure_filename_list. */ +> > - assert (strcmp (id_prefix, filename_prefix) < 0); +> > + assert (strcmp (type_prefix, filename_prefix) < 0); +> > if (!message->filename_term_list && !message->filename_list) +> > message->filename_term_list = +> > _notmuch_database_get_terms_with_prefix (message, i, end, +> > @@ -371,6 +391,11 @@ _notmuch_message_invalidate_metadata (notmuch_message_t *message, +> > message->tag_list = NULL; +> > } +> > +> > + if (strcmp ("type", prefix_name) == 0) { +> > + NOTMUCH_CLEAR_BIT (&message->flags, NOTMUCH_MESSAGE_FLAG_GHOST); +> > + NOTMUCH_CLEAR_BIT (&message->lazy_flags, NOTMUCH_MESSAGE_FLAG_GHOST); +> > + } +> > + +> > if (strcmp ("file-direntry", prefix_name) == 0) { +> > talloc_free (message->filename_term_list); +> > talloc_free (message->filename_list); +> > @@ -869,6 +894,10 @@ notmuch_bool_t +> > notmuch_message_get_flag (notmuch_message_t *message, +> > notmuch_message_flag_t flag) +> > { +> > + if (flag == NOTMUCH_MESSAGE_FLAG_GHOST && +> > + ! 
NOTMUCH_TEST_BIT (message->lazy_flags, flag)) +> > + _notmuch_message_ensure_metadata (message); +> > + +> > return NOTMUCH_TEST_BIT (message->flags, flag); +> > } +> > +> > @@ -880,6 +909,7 @@ notmuch_message_set_flag (notmuch_message_t *message, +> > NOTMUCH_SET_BIT (&message->flags, flag); +> > else +> > NOTMUCH_CLEAR_BIT (&message->flags, flag); +> > + NOTMUCH_SET_BIT (&message->lazy_flags, flag); +> > } +> > +> > time_t +> > @@ -989,6 +1019,24 @@ _notmuch_message_delete (notmuch_message_t *message) +> > return NOTMUCH_STATUS_SUCCESS; +> > } +> > +> > +/* Transform a blank message into a ghost message. The caller must +> > + * _notmuch_message_sync the message. */ +> > +notmuch_private_status_t +> > +_notmuch_message_initialize_ghost (notmuch_message_t *message, +> > + const char *thread_id) +> > +{ +> > + notmuch_private_status_t status; +> > + +> > + status = _notmuch_message_add_term (message, "type", "ghost"); +> > + if (status) +> > + return status; +> > + status = _notmuch_message_add_term (message, "thread", thread_id); +> > + if (status) +> > + return status; +> > + +> > + return NOTMUCH_PRIVATE_STATUS_SUCCESS; +> > +} +> > + +> > /* Ensure that 'message' is not holding any file object open. Future +> > * calls to various functions will still automatically open the +> > * message file as needed. 
+> > diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h +> > index 7250291..2f43c1d 100644 +> > --- a/lib/notmuch-private.h +> > +++ b/lib/notmuch-private.h +> > @@ -308,6 +308,10 @@ _notmuch_message_sync (notmuch_message_t *message); +> > notmuch_status_t +> > _notmuch_message_delete (notmuch_message_t *message); +> > +> > +notmuch_private_status_t +> > +_notmuch_message_initialize_ghost (notmuch_message_t *message, +> > + const char *thread_id); +> > + +> > void +> > _notmuch_message_close (notmuch_message_t *message); +> > +> > diff --git a/lib/notmuch.h b/lib/notmuch.h +> > index dae0416..92594b9 100644 +> > --- a/lib/notmuch.h +> > +++ b/lib/notmuch.h +> > @@ -1221,7 +1221,14 @@ notmuch_message_get_filenames (notmuch_message_t *message); +> > */ +> > typedef enum _notmuch_message_flag { +> > NOTMUCH_MESSAGE_FLAG_MATCH, +> > - NOTMUCH_MESSAGE_FLAG_EXCLUDED +> > + NOTMUCH_MESSAGE_FLAG_EXCLUDED, +> > + +> > + /* This message is a "ghost message", meaning it has no filenames +> > + * or content, but we know it exists because it was referenced by +> > + * some other message. A ghost message has only a message ID and +> > + * thread ID. +> > + */ +> +> Can I check here: we are not allowing a ghost message to have any tags? + +Correct, at least for now. + +However, I think it would make *a lot* of sense to be able to pre-seed +ghost messages with tags. nmbug could use this to avoid races with +receiving messages. Distributed tag sync could use it similarly. +Insert could use it to eliminate the nasty races between storing the +message, indexing it, and tagging it. Restore could potentially use +it. When sending messages, we could pre-seed a sent tag for when the +sent message is re-received (though insert may obviate that). I'm +sure there are other uses I haven't thought of. + +This requires some new APIs, since there's currently no way for a +library user to create a ghost message or get at it to tag it. 
It +also slightly complicates notmuch_database_get_all_tags since that +probably shouldn't return tags that are only on ghost messages (I +think if we just collect all the docids in the Tghost posting list and +use that to filter out tag terms, there should be almost no +performance impact). But these are both quite doable things. + +A more complicated question is what we want to do with deleted +messages. Currently we remove them entirely from the database, but we +*could* keep around their tags so if the message reappears (e.g., +there was a transient problem) we can bring back the tags. After +thinking about this a great deal, I concluded we should just continue +deleting them from the database (or, at most, strip the message back +down to its thread ID). If anyone's curious, I can write up my +thoughts on this, but it boils down to complicating the semantics of +initial tagging and dump/restore. + +> Best wishes +> +> Mark +> +> > + NOTMUCH_MESSAGE_FLAG_GHOST, +> > } notmuch_message_flag_t; +> > +> > /** +> > +> > _______________________________________________ +> > notmuch mailing list +> > notmuch@notmuchmail.org +> > http://notmuchmail.org/mailman/listinfo/notmuch