Return-Path:
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id 719F0431FD0
    for ; Tue, 25 Jan 2011 14:42:35 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Flag: NO
X-Spam-Score: -0.698
X-Spam-Level:
X-Spam-Status: No, score=-0.698 tagged_above=-999 required=5
    tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, FREEMAIL_FROM=0.001,
    HTML_MESSAGE=0.001, RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
    by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id zuGEd-dEYTpd for ;
    Tue, 25 Jan 2011 14:42:34 -0800 (PST)
Received: from mail-qy0-f181.google.com (mail-qy0-f181.google.com [209.85.216.181])
    (using TLSv1 with cipher RC4-MD5 (128/128 bits))
    (No client certificate requested)
    by olra.theworths.org (Postfix) with ESMTPS id 33969431FB6
    for ; Tue, 25 Jan 2011 14:42:34 -0800 (PST)
Received: by qyk12 with SMTP id 12so350190qyk.5
    for ; Tue, 25 Jan 2011 14:42:31 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
    d=gmail.com; s=gamma;
    h=domainkey-signature:mime-version:sender:in-reply-to:references:date
    :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
    bh=KUs+RtmeokD9+CDz33kGAvp+xkCpVLUXIaUCaNX/O4A=;
    b=L763dj//C7XFVtzeRTt8G6LL20Rg1aEfZQ7cirQqyjTdctnmfaX/RCy42cOmsSi6dC
    k5ElRzHShkXr6ikw+ivQIKuzwqfXIl3LBzjNiBjMnViYc3LCJ+iA8+dtzXWFniRCNzJm
    ExxkebLXXt8P8L7bS8plf5eA4AjZ1oOxVRIAM=
DomainKey-Signature: a=rsa-sha1; c=nofws;
    d=gmail.com; s=gamma;
    h=mime-version:sender:in-reply-to:references:date
    :x-google-sender-auth:message-id:subject:from:to:cc:content-type;
    b=P7CILKt6GCgMRoaCVRYztoQlRPaYbFI7ouExdDHBgHaxof/T2+cRy+DE53MDucG3nF
    KU8tVhqGrF/o+fXIZlkqy0iAEfqxMxREnEcwkvySGmdv2rSSlOq/2F6t/Js7K4K74Jm+
    HDubIyZ4sVXVl/U+H8xbNNAd7hIrPOwJwzoCA=
MIME-Version: 1.0
Received: by 10.229.217.133 with SMTP id hm5mr5277348qcb.40.1295995350772;
    Tue, 25 Jan 2011 14:42:30 -0800 (PST)
Sender: amdragon@gmail.com
Received: by 10.229.97.143 with HTTP; Tue, 25 Jan 2011 14:42:30 -0800 (PST)
In-Reply-To: <1295603977-14326-3-git-send-email-sojkam1@fel.cvut.cz>
References: <1295603977-14326-1-git-send-email-sojkam1@fel.cvut.cz>
    <1295603977-14326-3-git-send-email-sojkam1@fel.cvut.cz>
Date: Tue, 25 Jan 2011 17:42:30 -0500
X-Google-Sender-Auth: 10buYVdcvusHvj9TBZFPeqs9hF4
Message-ID:
Subject: Re: [PATCH 1/3] new: Do not defer maildir flag synchronization during
    the first run
From: Austin Clements
To: Michal Sojka
Content-Type: multipart/alternative; boundary=0016361e81a2836fbe049ab36e8f
Cc: notmuch@notmuchmail.org
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Tue, 25 Jan 2011 22:42:35 -0000

--0016361e81a2836fbe049ab36e8f
Content-Type: text/plain; charset=ISO-8859-1

Wouldn't this be simpler and more general?

--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -419,12 +419,11 @@ add_files_recursive (notmuch_database_t *notmuch,
        case NOTMUCH_STATUS_SUCCESS:
            state->added_messages++;
            for (tag=state->new_tags; *tag != NULL; tag++)
                notmuch_message_add_tag (message, *tag);
            /* Defer sync of maildir flags until after old filenames
             * are removed in the case of a rename. */
            if (state->synchronize_flags == TRUE)
-               _filename_list_add (state->message_ids_to_sync,
-                                   notmuch_message_get_message_id (message));
+               notmuch_message_maildir_flags_to_tags (message);
            break;
        /* Non-fatal issues (go on to next file) */
        case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:

The idea is that, if notmuch_database_add_message returns
NOTMUCH_STATUS_SUCCESS, then we know this is a new message (and not a
rename or anything complicated) and thus might as well perform the flag
synchronization immediately. If it returns
NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID, then it could be a rename (or
something more complicated), and so we defer the flag synchronization
like usual. This works for any new messages, regardless of whether this
is the initial import or not.

I believe my reasoning is correct. At least, it passes the maildir sync
test cases, so if it isn't correct, then we need more maildir sync tests.

On Fri, Jan 21, 2011 at 4:59 AM, Michal Sojka wrote:

> When notmuch new is run for the first time, it is not necessary to defer
> maildir flags synchronization to later because we already know that no
> files will be removed.
>
> Performing the maildir flag synchronization immediately after the
> message is added to the database has the advantage that the message is
> likely hot in the disk cache so the synchronization is faster.
> Additionally, we also save one database query for each message, which
> must be performed when the operation is deferred.
>
> Without this patch, the first notmuch new of 200k messages (3 GB) took
> 1h and 46m out of which 20m was maildir flags synchronization. With this
> patch, the whole operation took only 1h and 36m.
> ---
>  notmuch-new.c |   36 ++++++++++++++++++++++++++----------
>  1 files changed, 26 insertions(+), 10 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index cdf8513..a2af045 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -420,19 +420,35 @@ add_files_recursive (notmuch_database_t *notmuch,
>             state->added_messages++;
>             for (tag=state->new_tags; *tag != NULL; tag++)
>                 notmuch_message_add_tag (message, *tag);
> -           /* Defer sync of maildir flags until after old filenames
> -            * are removed in the case of a rename. */
> -           if (state->synchronize_flags == TRUE)
> -               _filename_list_add (state->message_ids_to_sync,
> -                                   notmuch_message_get_message_id (message));
> +           if (state->synchronize_flags == TRUE) {
> +               if (!state->total_files) {
> +                   /* Defer sync of maildir flags until after old filenames
> +                    * are removed in the case of a rename. */
> +                   _filename_list_add (state->message_ids_to_sync,
> +                                       notmuch_message_get_message_id (message));
> +               } else {
> +                   /* During the first notmuch new we synchronize
> +                    * flags immediately, while the message is hot in
> +                    * disk cache. */
> +                   notmuch_message_maildir_flags_to_tags (message);
> +               }
> +           }
>             break;
>         /* Non-fatal issues (go on to next file) */
>         case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
> -           /* Defer sync of maildir flags until after old filenames
> -            * are removed in the case of a rename. */
> -           if (state->synchronize_flags == TRUE)
> -               _filename_list_add (state->message_ids_to_sync,
> -                                   notmuch_message_get_message_id (message));
> +           if (state->synchronize_flags == TRUE) {
> +               if (!state->total_files) {
> +                   /* Defer sync of maildir flags until after old filenames
> +                    * are removed in the case of a rename. */
> +                   _filename_list_add (state->message_ids_to_sync,
> +                                       notmuch_message_get_message_id (message));
> +               } else {
> +                   /* During the first notmuch new we synchronize
> +                    * flags immediately, while the message is hot in
> +                    * disk cache. */
> +                   notmuch_message_maildir_flags_to_tags (message);
> +               }
> +           }
>             break;
>         case NOTMUCH_STATUS_FILE_NOT_EMAIL:
>             fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
> --
> 1.7.2.3
>
> _______________________________________________
> notmuch mailing list
> notmuch@notmuchmail.org
> http://notmuchmail.org/mailman/listinfo/notmuch
>

--0016361e81a2836fbe049ab36e8f
Content-Type: text/html; charset=ISO-8859-1
Content-Transfer-Encoding: quoted-printable

Wouldn't this be simpler and more general?

--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -419,12 +419,11 @@ add_files_recursive (notmuch_database_t *notmuch,
        case NOTMUCH_STATUS_SUCCESS:
            state->added_messages++;
            for (tag=state->new_tags; *tag != NULL; tag++)
                notmuch_message_add_tag (message, *tag);
            /* Defer sync of maildir flags until after old filenames
             * are removed in the case of a rename. */
            if (state->synchronize_flags == TRUE)
-               _filename_list_add (state->message_ids_to_sync,
-                                   notmuch_message_get_message_id (message));
+               notmuch_message_maildir_flags_to_tags (message);
            break;
        /* Non-fatal issues (go on to next file) */
        case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:

The idea is that, if notmuch_database_add_message returns
NOTMUCH_STATUS_SUCCESS, then we know this is a new message (and not a
rename or anything complicated) and thus might as well perform the flag
synchronization immediately. If it returns
NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID, then it could be a rename (or
something more complicated), and so we defer the flag synchronization
like usual. This works for any new messages, regardless of whether this
is the initial import or not.

I believe my reasoning is correct. At least, it passes the maildir sync
test cases, so if it isn't correct, then we need more maildir sync tests.
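
To spell the rule out, here is a minimal standalone sketch of the decision
described above. It is only an illustration under stated assumptions, not
code from notmuch-new.c: the sketch_* enums and sketch_choose_sync() are
hypothetical names made up for this example, and only the two status
outcomes discussed here are modelled.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the two notmuch_database_add_message()
 * outcomes discussed in this thread; this is NOT the real
 * notmuch_status_t. */
typedef enum {
    SKETCH_STATUS_SUCCESS,              /* message is new to the database */
    SKETCH_STATUS_DUPLICATE_MESSAGE_ID  /* message ID already known */
} sketch_status_t;

/* What the caller should do about maildir flag synchronization. */
typedef enum {
    SKETCH_SYNC_NONE,       /* flag synchronization is disabled */
    SKETCH_SYNC_IMMEDIATE,  /* sync now, while the file is hot in cache */
    SKETCH_SYNC_DEFERRED    /* queue the message ID and sync later */
} sketch_sync_action_t;

/* Decide when to synchronize flags for a file that was just added.
 * A brand-new message cannot be a rename, so it is safe to sync it
 * right away; a duplicate message ID might be a rename, so it is
 * deferred until old filenames have been removed. */
sketch_sync_action_t
sketch_choose_sync (sketch_status_t status, bool synchronize_flags)
{
    if (! synchronize_flags)
        return SKETCH_SYNC_NONE;

    if (status == SKETCH_STATUS_SUCCESS)
        return SKETCH_SYNC_IMMEDIATE;

    return SKETCH_SYNC_DEFERRED;
}
```

In add_files_recursive, the immediate case would correspond to calling
notmuch_message_maildir_flags_to_tags (message), and the deferred case to
adding the message ID to state->message_ids_to_sync as the existing code
does. Note that the decision does not depend on whether this is the
initial import.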

On Fri, Jan 21, 2011 at 4:59 AM, Michal Sojka <sojkam1@fel.cvut.cz> wrote:
When notmuch new is run for the first time, it is not necessary to defer
maildir flags synchronization to later because we already know that no
files will be removed.

Performing the maildir flag synchronization immediately after the
message is added to the database has the advantage that the message is
likely hot in the disk cache so the synchronization is faster.
Additionally, we also save one database query for each message, which
must be performed when the operation is deferred.

Without this patch, the first notmuch new of 200k messages (3 GB) took
1h and 46m out of which 20m was maildir flags synchronization. With this
patch, the whole operation took only 1h and 36m.
---
 notmuch-new.c |   36 ++++++++++++++++++++++++++----------
 1 files changed, 26 insertions(+), 10 deletions(-)

diff --git a/notmuch-new.c b/notmuch-new.c
index cdf8513..a2af045 100644
--- a/notmuch-new.c
+++ b/notmuch-new.c
@@ -420,19 +420,35 @@ add_files_recursive (notmuch_database_t *notmuch,
            state->added_messages++;
            for (tag=state->new_tags; *tag != NULL; tag++)
                notmuch_message_add_tag (message, *tag);
-           /* Defer sync of maildir flags until after old filenames
-            * are removed in the case of a rename. */
-           if (state->synchronize_flags == TRUE)
-               _filename_list_add (state->message_ids_to_sync,
-                                   notmuch_message_get_message_id (message));
+           if (state->synchronize_flags == TRUE) {
+               if (!state->total_files) {
+                   /* Defer sync of maildir flags until after old filenames
+                    * are removed in the case of a rename. */
+                   _filename_list_add (state->message_ids_to_sync,
+                                       notmuch_message_get_message_id (message));
+               } else {
+                   /* During the first notmuch new we synchronize
+                    * flags immediately, while the message is hot in
+                    * disk cache. */
+                   notmuch_message_maildir_flags_to_tags (message);
+               }
+           }
            break;
        /* Non-fatal issues (go on to next file) */
        case NOTMUCH_STATUS_DUPLICATE_MESSAGE_ID:
-           /* Defer sync of maildir flags until after old filenames
-            * are removed in the case of a rename. */
-           if (state->synchronize_flags == TRUE)
-               _filename_list_add (state->message_ids_to_sync,
-                                   notmuch_message_get_message_id (message));
+           if (state->synchronize_flags == TRUE) {
+               if (!state->total_files) {
+                   /* Defer sync of maildir flags until after old filenames
+                    * are removed in the case of a rename. */
+                   _filename_list_add (state->message_ids_to_sync,
+                                       notmuch_message_get_message_id (message));
+               } else {
+                   /* During the first notmuch new we synchronize
+                    * flags immediately, while the message is hot in
+                    * disk cache. */
+                   notmuch_message_maildir_flags_to_tags (message);
+               }
+           }
            break;
        case NOTMUCH_STATUS_FILE_NOT_EMAIL:
            fprintf (stderr, "Note: Ignoring non-mail file: %s\n",
--
1.7.2.3

_______________________________________________
notmuch mailing list
notmuch@notmuchmail.org
http://notmuchmail.org/mailman/listinfo/notmuch

--0016361e81a2836fbe049ab36e8f--