Date: Sun, 27 Jul 2014 12:42:24 -0400
From: Austin Clements
To: Mark Walters
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH 08/14] lib: Simplify upgrade code using a transaction
Message-ID: <20140727164224.GG13893@mit.edu>
References: <1406433173-19169-1-git-send-email-amdragon@mit.edu>
 <1406433173-19169-9-git-send-email-amdragon@mit.edu>
 <87vbqjywhy.fsf@qmul.ac.uk>
In-Reply-To: <87vbqjywhy.fsf@qmul.ac.uk>
List-Id: "Use and development of the notmuch mail system."

Quoth Mark Walters on Jul 27 at 10:35 am:
>
> Hi
>
> On Sun, 27 Jul 2014, Austin Clements wrote:
> > Previously, the upgrade was organized as two passes -- an upgrade
> > pass, and a separate cleanup pass -- so the database was always in a
> > valid state.
> > This change substantially simplifies this code by
> > performing the upgrade in a transaction and combining both passes in
> > to one. This 1) eliminates a lot of duplicate code between the
> > passes, 2) speeds up the upgrade process, 3) makes progress reporting
> > more accurate, 4) eliminates the potential for stale data if the
> > upgrade is interrupted during the cleanup pass, and 5) makes it easier
> > to reason about the safety of the upgrade code.
>
> I like this but I wonder if it has a side effect: I think with the
> current code the user can interrupt the upgrade (ctrl-c) and continue
> roughly where it left off. This looks like it means the whole upgrade
> needs to be done in one go. Will this be a problem on large mail stores
> (eg rlb with over 1M messages)?
>
> I am not sure what could be done during the interrupted upgrade before
> so maybe this is not a problem.

I haven't tested this hypothesis, but I don't think a partially
completed upgrade would actually help upon restarting the upgrade.
Since the old upgrade process couldn't safely remove terms/data until
the end of the upgrade, if it were interrupted, the next upgrade would
start right back at the beginning and do everything over again. Also,
since the old upgrade code had to update the version number before
removing old terms/data, if it were interrupted during the cleanup
process, the database would be left with cruft that would *never* be
removed.

With features, we actually have a better chance of making partially
completed upgrades useful: we could commit after each individual
feature gets upgraded. Of course, that only helps when an upgrade has
multiple new features to upgrade to, so it may or may not be useful in
practice depending on how quickly we add new features.
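
A rough sketch of the per-feature commit idea, not the actual notmuch
code: FakeDb, FeatureUpgrade, and upgrade_per_feature are hypothetical
stand-ins, and the transaction methods only mimic the general shape of
begin/commit/cancel calls on a transactional database. Each feature is
upgraded in its own transaction, so an interrupted run keeps the
features that were already committed and a later run skips them.

```cpp
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical in-memory stand-in for a transactional database.
// This is NOT the Xapian or notmuch API; it only models commit/cancel.
struct FakeDb {
    unsigned committed_features = 0; // state that survives an interruption
    unsigned pending_features = 0;   // state inside the open transaction
    bool in_txn = false;

    void begin_transaction () {
        assert (! in_txn);
        pending_features = committed_features;
        in_txn = true;
    }
    void commit_transaction () {
        assert (in_txn);
        committed_features = pending_features;
        in_txn = false;
    }
    void cancel_transaction () {
        assert (in_txn);
        in_txn = false; // pending changes are simply dropped
    }
};

// One upgrade step per feature bit (hypothetical type).
struct FeatureUpgrade {
    unsigned bit;                         // feature flag set by this step
    std::function<void (FakeDb &)> apply; // may throw to model an interrupt
};

// Upgrade each missing feature in its own transaction.
// Returns the number of features upgraded by this call.
int upgrade_per_feature (FakeDb &db, const std::vector<FeatureUpgrade> &steps)
{
    int done = 0;
    for (const FeatureUpgrade &step : steps) {
        if (db.committed_features & step.bit)
            continue; // already committed by an earlier, interrupted run

        db.begin_transaction ();
        try {
            step.apply (db);
            db.pending_features |= step.bit;
            db.commit_transaction ();
            done++;
        } catch (...) {
            // Roll back only the feature that was in progress.
            db.cancel_transaction ();
            throw;
        }
    }
    return done;
}
```

With two steps where the second one is interrupted, the first feature's
bit stays committed, and rerunning upgrade_per_feature only redoes the
second step. With a single pending feature this degenerates to the one
big transaction used in the patch.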
> Best wishes
>
> Mark
>
>
> > ---
> >  lib/database.cc | 67 ++++++---------------------------------------------------
> >  1 file changed, 7 insertions(+), 60 deletions(-)
> >
> > diff --git a/lib/database.cc b/lib/database.cc
> > index 03eef3e..0be7180 100644
> > --- a/lib/database.cc
> > +++ b/lib/database.cc
> > @@ -1238,6 +1238,9 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >          timer_is_active = TRUE;
> >      }
> >
> > +    /* Perform the upgrade in a transaction. */
> > +    db->begin_transaction (true);
> > +
> >      /* Before version 1, each message document had its filename in the
> >       * data field. Copy that into the new format by calling
> >       * notmuch_message_add_filename.
> > @@ -1265,6 +1268,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >              filename = _notmuch_message_talloc_copy_data (message);
> >              if (filename && *filename != '\0') {
> >                  _notmuch_message_add_filename (message, filename);
> > +                _notmuch_message_clear_data (message);
> >                  _notmuch_message_sync (message);
> >              }
> >              talloc_free (filename);
> > @@ -1312,6 +1316,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >                                    NOTMUCH_FIND_CREATE, &status);
> >                  notmuch_directory_set_mtime (directory, mtime);
> >                  notmuch_directory_destroy (directory);
> > +
> > +                db->delete_document (*p);
> >              }
> >          }
> >      }
> > @@ -1353,67 +1359,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
> >      notmuch->features |= NOTMUCH_FEATURES_CURRENT;
> >      db->set_metadata ("features", _print_features (local, notmuch->features));
> >      db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
> > -    db->flush ();
> > -
> > -    /* Now that the upgrade is complete we can remove the old data
> > -     * and documents that are no longer needed. */
> > -    if (version < 1) {
> > -        notmuch_query_t *query = notmuch_query_create (notmuch, "");
> > -        notmuch_messages_t *messages;
> > -        notmuch_message_t *message;
> > -        char *filename;
> > -
> > -        for (messages = notmuch_query_search_messages (query);
> > -             notmuch_messages_valid (messages);
> > -             notmuch_messages_move_to_next (messages))
> > -        {
> > -            if (do_progress_notify) {
> > -                progress_notify (closure, (double) count / total);
> > -                do_progress_notify = 0;
> > -            }
> > -
> > -            message = notmuch_messages_get (messages);
> > -
> > -            filename = _notmuch_message_talloc_copy_data (message);
> > -            if (filename && *filename != '\0') {
> > -                _notmuch_message_clear_data (message);
> > -                _notmuch_message_sync (message);
> > -            }
> > -            talloc_free (filename);
> > -
> > -            notmuch_message_destroy (message);
> > -        }
> >
> > -        notmuch_query_destroy (query);
> > -    }
> > -
> > -    if (version < 1) {
> > -        Xapian::TermIterator t, t_end;
> > -
> > -        t_end = notmuch->xapian_db->allterms_end ("XTIMESTAMP");
> > -
> > -        for (t = notmuch->xapian_db->allterms_begin ("XTIMESTAMP");
> > -             t != t_end;
> > -             t++)
> > -        {
> > -            Xapian::PostingIterator p, p_end;
> > -            std::string term = *t;
> > -
> > -            p_end = notmuch->xapian_db->postlist_end (term);
> > -
> > -            for (p = notmuch->xapian_db->postlist_begin (term);
> > -                 p != p_end;
> > -                 p++)
> > -            {
> > -                if (do_progress_notify) {
> > -                    progress_notify (closure, (double) count / total);
> > -                    do_progress_notify = 0;
> > -                }
> > -
> > -                db->delete_document (*p);
> > -            }
> > -        }
> > -    }
> > +    db->commit_transaction ();
> >
> >      if (timer_is_active) {
> >          /* Now stop the timer. */
> >