From: Austin Clements
Date: Sun, 27 Jul 2014 16:42:24 +0000 (+2000)
Subject: Re: [PATCH 08/14] lib: Simplify upgrade code using a transaction
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=f4145350bd6c41f7da18f83ba5440d6b4dbed462;p=notmuch-archives.git

Re: [PATCH 08/14] lib: Simplify upgrade code using a transaction
---

diff --git a/fc/ace1ee0e3e0dad7b16af2d89225739a877b94c b/fc/ace1ee0e3e0dad7b16af2d89225739a877b94c
new file mode 100644
index 000000000..fa69e8293
--- /dev/null
+++ b/fc/ace1ee0e3e0dad7b16af2d89225739a877b94c
@@ -0,0 +1,233 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 14E25431FB6
+ for ; Sun, 27 Jul 2014 09:42:37 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: -2.3
+X-Spam-Level:
+X-Spam-Status: No, score=-2.3 tagged_above=-999 required=5
+ tests=[RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id ImwWbGeucHTM for ;
+ Sun, 27 Jul 2014 09:42:29 -0700 (PDT)
+Received: from dmz-mailsec-scanner-7.mit.edu (dmz-mailsec-scanner-7.mit.edu
+ [18.7.68.36])
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 2E156431FAE
+ for ; Sun, 27 Jul 2014 09:42:29 -0700 (PDT)
+X-AuditID: 12074424-f79146d00000067c-d8-53d52bf43424
+Received: from mailhub-auth-3.mit.edu ( [18.9.21.43])
+ (using TLS with cipher AES256-SHA (256/256 bits))
+ (Client did not present a certificate)
+ by dmz-mailsec-scanner-7.mit.edu (Symantec Messaging Gateway) with SMTP
+ id 4F.A3.01660.4FB25D35; Sun, 27 Jul 2014 12:42:28 -0400 (EDT)
+Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11])
+ by mailhub-auth-3.mit.edu (8.13.8/8.9.2) with ESMTP id s6RGgRmg012752;
+ Sun, 27 Jul 2014 12:42:28 -0400
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
+ (authenticated bits=0)
+ (User authenticated as amdragon@ATHENA.MIT.EDU)
+ by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id s6RGgPqo019190
+ (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT);
+ Sun, 27 Jul 2014 12:42:26 -0400
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80)
+ (envelope-from )
+ id 1XBRWi-00079H-Ri; Sun, 27 Jul 2014 12:42:24 -0400
+Date: Sun, 27 Jul 2014 12:42:24 -0400
+From: Austin Clements
+To: Mark Walters
+Subject: Re: [PATCH 08/14] lib: Simplify upgrade code using a transaction
+Message-ID: <20140727164224.GG13893@mit.edu>
+References: <1406433173-19169-1-git-send-email-amdragon@mit.edu>
+ <1406433173-19169-9-git-send-email-amdragon@mit.edu>
+ <87vbqjywhy.fsf@qmul.ac.uk>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Disposition: inline
+In-Reply-To: <87vbqjywhy.fsf@qmul.ac.uk>
+User-Agent: Mutt/1.5.21 (2010-09-15)
+X-Brightmail-Tracker:
+ H4sIAAAAAAAAA+NgFmpileLIzCtJLcpLzFFi42IR4hTV1v2ifTXYoG++ssXquTwW12/OZHZg
+ 8tg56y67x7NVt5gDmKK4bFJSczLLUov07RK4Mp69yS44olsxee9nlgbG50pdjJwcEgImElfu
+ bmCHsMUkLtxbz9bFyMUhJDCbSeLjwS9QzkZGifkt+5kgnNNMEpN61rJAOEsYJZZeXckC0s8i
+ oCpxd+1kNhCbTUBDYtv+5YwgtoiAjsTtQwvAdjALSEt8+90MNImDQ1jAU+LLoUKQMC9QSc+J
+ 71AzpzJKvH7TxwyREJQ4OfMJC0SvlsSNfy/BekHmLP/HARLmBFp1fO8vsFWiAioSU05uY5vA
+ KDQLSfcsJN2zELoXMDKvYpRNya3SzU3MzClOTdYtTk7My0st0jXXy80s0UtNKd3ECApqdheV
+ HYzNh5QOMQpwMCrx8GawXQkWYk0sK67MPcQoycGkJMqrDYwJIb6k/JTKjMTijPii0pzU4kOM
+ EhzMSiK8hW+BynlTEiurUovyYVLSHCxK4rxvra2ChQTSE0tSs1NTC1KLYLIyHBxKErx/tICG
+ ChalpqdWpGXmlCCkmTg4QYbzAA2frwFUw1tckJhbnJkOkT/FqCglznsSpFkAJJFRmgfXC0s6
+ rxjFgV4R5uUAuZsHmLDgul8BDWYCGszifxlkcEkiQkqqgXGO3e1Aee3sjCVHL8xWjPrUqWrh
+ /kHf2fb0U8dTplN2C1sW6rodU1Mt/JbQpcz8/2Pqj/5zEWZZigai2/a3anrO2FJcuSjRhkU+
+ PONP5sx7J8MjarbufxXkfHSiarDul5QqV5bSz3d19hbeczzKxraq61rC4q4tUfnZPBe3L2ec
+ faz3TdDjeCWW4oxEQy3mouJEAEbJL1YVAwAA
+Cc: notmuch@notmuchmail.org
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Sun, 27 Jul 2014 16:42:37 -0000
+
+Quoth Mark Walters on Jul 27 at 10:35 am:
+>
+> Hi
+>
+> On Sun, 27 Jul 2014, Austin Clements wrote:
+> > Previously, the upgrade was organized as two passes -- an upgrade
+> > pass, and a separate cleanup pass -- so the database was always in a
+> > valid state. This change substantially simplifies this code by
+> > performing the upgrade in a transaction and combining both passes
+> > into one. This 1) eliminates a lot of duplicate code between the
+> > passes, 2) speeds up the upgrade process, 3) makes progress reporting
+> > more accurate, 4) eliminates the potential for stale data if the
+> > upgrade is interrupted during the cleanup pass, and 5) makes it easier
+> > to reason about the safety of the upgrade code.
+>
+> I like this but I wonder if it has a side effect: I think with the
+> current code the user can interrupt the upgrade (ctrl-c) and continue
+> roughly where it left off. This looks like it means the whole upgrade
+> needs to be done in one go. Will this be a problem on large mail stores
+> (e.g. rlb with over 1M messages)?
+>
+> I am not sure what could be done during the interrupted upgrade before
+> so maybe this is not a problem.
+
+I haven't tested this hypothesis, but I don't think a partially
+completed upgrade would actually help upon restarting the upgrade.
+Since the old upgrade process couldn't safely remove terms/data until
+the end of the upgrade, if it were interrupted, the next upgrade would
+start right back at the beginning and do everything over again.
+
+Also, since the old upgrade code had to update the version number
+before removing old terms/data, if it was interrupted during the
+cleanup process the database would be left with cruft that would
+*never* be removed.
+
+With features we actually have a better chance of making partially
+completed upgrades useful: we could commit after each individual
+feature gets upgraded. Of course, that only helps when an upgrade has
+multiple new features to upgrade to, so it may or may not be useful in
+practice depending on how quickly we add new features.
+
+> Best wishes
+>
+> Mark
+>
+>
+> > ---
+> >  lib/database.cc | 67 ++++++--------------------------------------------------
+> >  1 file changed, 7 insertions(+), 60 deletions(-)
+> >
+> > diff --git a/lib/database.cc b/lib/database.cc
+> > index 03eef3e..0be7180 100644
+> > --- a/lib/database.cc
+> > +++ b/lib/database.cc
+> > @@ -1238,6 +1238,9 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
+> >          timer_is_active = TRUE;
+> >      }
+> >
+> > +    /* Perform the upgrade in a transaction. */
+> > +    db->begin_transaction (true);
+> > +
+> >      /* Before version 1, each message document had its filename in the
+> >       * data field. Copy that into the new format by calling
+> >       * notmuch_message_add_filename.
+> > @@ -1265,6 +1268,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
+> >              filename = _notmuch_message_talloc_copy_data (message);
+> >              if (filename && *filename != '\0') {
+> >                  _notmuch_message_add_filename (message, filename);
+> > +                _notmuch_message_clear_data (message);
+> >                  _notmuch_message_sync (message);
+> >              }
+> >              talloc_free (filename);
+> > @@ -1312,6 +1316,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
+> >                          NOTMUCH_FIND_CREATE, &status);
+> >                  notmuch_directory_set_mtime (directory, mtime);
+> >                  notmuch_directory_destroy (directory);
+> > +
+> > +                db->delete_document (*p);
+> >              }
+> >          }
+> >      }
+> > @@ -1353,67 +1359,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
+> >      notmuch->features |= NOTMUCH_FEATURES_CURRENT;
+> >      db->set_metadata ("features", _print_features (local, notmuch->features));
+> >      db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION));
+> > -    db->flush ();
+> > -
+> > -    /* Now that the upgrade is complete we can remove the old data
+> > -     * and documents that are no longer needed. */
+> > -    if (version < 1) {
+> > -        notmuch_query_t *query = notmuch_query_create (notmuch, "");
+> > -        notmuch_messages_t *messages;
+> > -        notmuch_message_t *message;
+> > -        char *filename;
+> > -
+> > -        for (messages = notmuch_query_search_messages (query);
+> > -             notmuch_messages_valid (messages);
+> > -             notmuch_messages_move_to_next (messages))
+> > -        {
+> > -            if (do_progress_notify) {
+> > -                progress_notify (closure, (double) count / total);
+> > -                do_progress_notify = 0;
+> > -            }
+> > -
+> > -            message = notmuch_messages_get (messages);
+> > -
+> > -            filename = _notmuch_message_talloc_copy_data (message);
+> > -            if (filename && *filename != '\0') {
+> > -                _notmuch_message_clear_data (message);
+> > -                _notmuch_message_sync (message);
+> > -            }
+> > -            talloc_free (filename);
+> > -
+> > -            notmuch_message_destroy (message);
+> > -        }
+> >
+> > -        notmuch_query_destroy (query);
+> > -    }
+> > -
+> > -    if (version < 1) {
+> > -        Xapian::TermIterator t, t_end;
+> > -
+> > -        t_end = notmuch->xapian_db->allterms_end ("XTIMESTAMP");
+> > -
+> > -        for (t = notmuch->xapian_db->allterms_begin ("XTIMESTAMP");
+> > -             t != t_end;
+> > -             t++)
+> > -        {
+> > -            Xapian::PostingIterator p, p_end;
+> > -            std::string term = *t;
+> > -
+> > -            p_end = notmuch->xapian_db->postlist_end (term);
+> > -
+> > -            for (p = notmuch->xapian_db->postlist_begin (term);
+> > -                 p != p_end;
+> > -                 p++)
+> > -            {
+> > -                if (do_progress_notify) {
+> > -                    progress_notify (closure, (double) count / total);
+> > -                    do_progress_notify = 0;
+> > -                }
+> > -
+> > -                db->delete_document (*p);
+> > -            }
+> > -        }
+> > -    }
+> > +    db->commit_transaction ();
+> >
+> >      if (timer_is_active) {
+> >          /* Now stop the timer. */
+> >