Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 1D35E431FBC for ; Sun, 27 Jul 2014 02:54:16 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: 0.502 X-Spam-Level: X-Spam-Status: No, score=0.502 tagged_above=-999 required=5 tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001, NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Pp9Dr19U+OZr for ; Sun, 27 Jul 2014 02:54:08 -0700 (PDT) Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by olra.theworths.org (Postfix) with ESMTPS id 97910431FB6 for ; Sun, 27 Jul 2014 02:54:07 -0700 (PDT) Received: from smtp.qmul.ac.uk ([138.37.6.40]) by mail2.qmul.ac.uk with esmtp (Exim 4.71) (envelope-from ) id 1XBL9F-0004Fl-0B; Sun, 27 Jul 2014 10:54:05 +0100 Received: from 94.196.249.126.threembb.co.uk ([94.196.249.126] helo=localhost) by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.71) (envelope-from ) id 1XBL98-0006EF-Eg; Sun, 27 Jul 2014 10:53:44 +0100 From: Mark Walters To: Austin Clements , notmuch@notmuchmail.org Subject: Re: [PATCH 05/14] lib: Database version 3: Introduce fine-grained "features" In-Reply-To: <1406433173-19169-6-git-send-email-amdragon@mit.edu> References: <1406433173-19169-1-git-send-email-amdragon@mit.edu> <1406433173-19169-6-git-send-email-amdragon@mit.edu> User-Agent: Notmuch/0.15.2+615~g78e3a93 (http://notmuchmail.org) Emacs/23.4.1 (x86_64-pc-linux-gnu) Date: Sun, 27 Jul 2014 10:53:25 +0100 Message-ID: <87silnyvoq.fsf@qmul.ac.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Sender-Host-Address: 94.196.249.126 X-QM-Geographic: According to ripencc, this message was delivered by a machine in Britain (UK) (GB). X-QM-SPAM-Info: Sender has good ham record. :) X-QM-Body-MD5: 74279823180b19ebad2f59792d4c5cc2 (of first 20000 bytes) X-SpamAssassin-Score: -0.0 X-SpamAssassin-SpamBar: / X-SpamAssassin-Report: The QM spam filters have analysed this message to determine if it is spam. We require at least 5.0 points to mark a message as spam. This message scored -0.0 points. Summary of the scoring: * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider * (markwalters1009[at]gmail.com) * -0.0 AWL AWL: From: address is in the auto white-list X-QM-Scan-Virus: ClamAV says the message is clean X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sun, 27 Jul 2014 09:54:16 -0000 On Sun, 27 Jul 2014, Austin Clements wrote: > Previously, our database schema was versioned by a single number. > Each database schema change had to occur "atomically" in Notmuch's > development history: before some commit, Notmuch used version N, after > that commit, it used version N+1. Hence, each new schema version > could introduce only one change, the task of developing a schema > change fell on a single person, and it all had to happen and be > perfect in a single commit series. This made introducing a new schema > version hard. We've seen only two schema changes in the history of > Notmuch. I like this series as a whole and I am happy with all the patches that I haven't sent comments on. (Note, however, that I am not very familiar with the database code.) There is one thing which confuses me in this patch: > > This commit introduces database schema version 3; hopefully the last > schema version we'll need for a while. With this version, we switch > from a single version number to "features": a set of named, > independent aspects of the database schema. > > Features should make backwards compatibility easier. For many things, > it should be easy to support databases both with and without a > feature, which will allow us to make upgrades optional and will enable > "unstable" features that can be developed and tested over time. > > Features also make forwards compatibility easier. The features > recorded in a database include "compatibility flags," which can > indicate to an older version of Notmuch when it must support a given > feature to open the database for read or for write. This lets us > replace the old vague "I don't recognize this version, so something > might go wrong, but I promise to try my best" warnings upon opening a > database with an unknown version with precise errors. If a database > is safe to open for read/write despite unknown features, an older > version will know that and issue no message at all. If the database > is not safe to open for read/write because of unknown features, an > older version will know that, too, and can tell the user exactly which > required features it lacks support for. > --- > lib/database-private.h | 56 +++++++++++++++ > lib/database.cc | 189 +++++++++++++++++++++++++++++++++++++++++-------- > 2 files changed, 214 insertions(+), 31 deletions(-) > > diff --git a/lib/database-private.h b/lib/database-private.h > index d3e65fd..323b9fe 100644 > --- a/lib/database-private.h > +++ b/lib/database-private.h > @@ -46,6 +46,11 @@ struct _notmuch_database { > int atomic_nesting; > Xapian::Database *xapian_db; > > + /* Bit mask of features used by this database. Features are > + * named, independent aspects of the database schema. This is a > + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ > + unsigned int features; > + > unsigned int last_doc_id; > uint64_t last_thread_id; > > @@ -55,6 +60,57 @@ struct _notmuch_database { > Xapian::ValueRangeProcessor *date_range_processor; > }; > > +/* Bit masks for _notmuch_database::features. */ > +enum { > + /* If set, file names are stored in "file-direntry" terms. If > + * unset, file names are stored in document data. > + * > + * Introduced: version 1. Implementation support: both for read; > + * required for write. */ > + NOTMUCH_FEATURE_FILE_TERMS = 1 << 0, > + > + /* If set, directory timestamps are stored in documents with > + * XDIRECTORY terms and relative paths. If unset, directory > + * timestamps are stored in documents with XTIMESTAMP terms and > + * absolute paths. > + * > + * Introduced: version 1. Implementation support: required. */ > + NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 << 1, > + > + /* If set, the from, subject, and message-id headers are stored in > + * message document values. If unset, message documents *may* > + * have these values, but if the value is empty, it must be > + * retrieved from the message file. > + * > + * Introduced: optional in version 1, required as of version 3. > + * Implementation support: both. > + */ > + NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 << 2, > + > + /* If set, folder terms are boolean and path terms exist. If > + * unset, folder terms are probabilistic and stemmed and path > + * terms do not exist. > + * > + * Introduced: version 2. Implementation support: required. */ > + NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3, > +}; > + > +/* Prior to database version 3, features were implied by the database > + * version number, so hard-code them for earlier versions. */ > +#define NOTMUCH_FEATURES_V0 (0) > +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \ > + NOTMUCH_FEATURE_DIRECTORY_DOCS) > +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER) > + > +/* Current database features. If any of these are missing from a > + * database, request an upgrade. > + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because > + * upgrade doesn't currently introduce the feature (though brand new > + * databases will have it). */ > +#define NOTMUCH_FEATURES_CURRENT \ > + (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \ > + NOTMUCH_FEATURE_BOOL_FOLDER) > + > /* Return the list of terms from the given iterator matching a prefix. > * The prefix will be stripped from the strings in the returned list. > * The list will be allocated using ctx as the talloc context. > diff --git a/lib/database.cc b/lib/database.cc > index 45c4260..03eef3e 100644 > --- a/lib/database.cc > +++ b/lib/database.cc > @@ -20,6 +20,7 @@ > > #include "database-private.h" > #include "parse-time-vrp.h" > +#include "string-util.h" > > #include > > @@ -42,7 +43,7 @@ typedef struct { > const char *prefix; > } prefix_t; > > -#define NOTMUCH_DATABASE_VERSION 2 > +#define NOTMUCH_DATABASE_VERSION 3 > > #define STRINGIFY(s) _SUB_STRINGIFY(s) > #define _SUB_STRINGIFY(s) #s > @@ -151,6 +152,17 @@ typedef struct { > * changes are made to the database (such as by > * indexing new fields). > * > + * features The set of features supported by this > + * database. This consists of a set of > + * '\n'-separated lines, where each is a feature > + * name, a '\t', and compatibility flags. If the > + * compatibility flags contain 'w', then the > + * opener must support this feature to safely > + * write this database. If the compatibility > + * flags contain 'r', then the opener must > + * support this feature to read this database. > + * Introduced in database version 3. > + * > * last_thread_id The last thread ID generated. This is stored > * as a 16-byte hexadecimal ASCII representation > * of a 64-bit unsigned integer. The first ID > @@ -226,6 +238,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= { > { "subject", "XSUBJECT"}, > }; > > + > const char * > _find_prefix (const char *name) > { > @@ -251,6 +264,25 @@ _find_prefix (const char *name) > return ""; > } > > +static const struct > +{ > + /* NOTMUCH_FEATURE_* value. */ > + unsigned int value; > + /* Feature name as it appears in the database. This name should > + * be appropriate for displaying to the user if an older version > + * of notmuch doesn't support this feature. */ > + const char *name; > + /* Compatibility flags when this feature is declared. */ > + const char *flags; > +} feature_names[] = { > + {NOTMUCH_FEATURE_FILE_TERMS, "file terms", "rw"}, > + {NOTMUCH_FEATURE_DIRECTORY_DOCS, "directory documents", "rw"}, > + /* Header values are not required for reading a database because a > + * reader can just refer to the message file. */ > + {NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES, "from/subject/ID values", "w"}, > + {NOTMUCH_FEATURE_BOOL_FOLDER, "boolean folder terms", "rw"}, > +}; > + > const char * > notmuch_status_to_string (notmuch_status_t status) > { > @@ -591,6 +623,11 @@ notmuch_database_create (const char *path, notmuch_database_t **database) > ¬much); > if (status) > goto DONE; > + > + /* Upgrade doesn't add this feature to existing databases, but new > + * databases have it. */ > + notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES; > + Does this mean that if the user wants this feature he has to dump, delete the existing database, and restore? Is that just because no-one has implemented upgrade for this feature or is it "hard" in some sense? A possibly related question: is there likely to be a case where the user does not want to add/upgrade some feature? Would it be worth having an explicit upgrade command which let the user choose which features to upgrade? This is orthogonal to this series: I am just trying to get a feel for how it will, or could, be used. Best wishes Mark > status = notmuch_database_upgrade (notmuch, NULL, NULL); > if (status) { > notmuch_database_close(notmuch); > @@ -619,6 +656,80 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch) > return NOTMUCH_STATUS_SUCCESS; > } > > +/* Parse a database features string from the given database version. > + * > + * For version < 3, this ignores the features string and returns a > + * hard-coded set of features. > + * > + * If there are unrecognized features that are required to open the > + * database in mode (which should be 'r' or 'w'), return a > + * comma-separated list of unrecognized but required features in > + * *incompat_out, which will be allocated from ctx. > + */ > +static unsigned int > +_parse_features (const void *ctx, const char *features, unsigned int version, > + char mode, char **incompat_out) > +{ > + unsigned int res = 0, namelen, i; > + size_t llen = 0; > + const char *flags; > + > + /* Prior to database version 3, features were implied by the > + * version number. */ > + if (version == 0) > + return NOTMUCH_FEATURES_V0; > + else if (version == 1) > + return NOTMUCH_FEATURES_V1; > + else if (version == 2) > + return NOTMUCH_FEATURES_V2; > + > + /* Parse the features string */ > + while ((features = strtok_len_c (features + llen, "\n", &llen)) != NULL) { > + flags = strchr (features, '\t'); > + if (! flags || flags > features + llen) > + continue; > + namelen = flags - features; > + > + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) { > + if (strlen (feature_names[i].name) == namelen && > + strncmp (feature_names[i].name, features, namelen) == 0) { > + res |= feature_names[i].value; > + break; > + } > + } > + > + if (i == ARRAY_SIZE (feature_names)) { > + /* Unrecognized feature */ > + const char *have = strchr (flags, mode); > + if (have && have < features + llen) { > + /* This feature is required to access this database in > + * 'mode', but we don't understand it. */ > + if (! *incompat_out) > + *incompat_out = talloc_strdup (ctx, ""); > + *incompat_out = talloc_asprintf_append_buffer ( > + *incompat_out, "%s%.*s", **incompat_out ? ", " : "", > + namelen, features); > + } > + } > + } > + > + return res; > +} > + > +static char * > +_print_features (const void *ctx, unsigned int features) > +{ > + unsigned int i; > + char *res = talloc_strdup (ctx, ""); > + > + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) > + if (features & feature_names[i].value) > + res = talloc_asprintf_append_buffer ( > + res, "%s\t%s\n", feature_names[i].name, feature_names[i].flags); > + > + return res; > +} > + > notmuch_status_t > notmuch_database_open (const char *path, > notmuch_database_mode_t mode, > @@ -627,7 +738,7 @@ notmuch_database_open (const char *path, > notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; > void *local = talloc_new (NULL); > notmuch_database_t *notmuch = NULL; > - char *notmuch_path, *xapian_path; > + char *notmuch_path, *xapian_path, *incompat_features; > struct stat st; > int err; > unsigned int i, version; > @@ -686,39 +797,51 @@ notmuch_database_open (const char *path, > if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) { > notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path, > Xapian::DB_CREATE_OR_OPEN); > - version = notmuch_database_get_version (notmuch); > - > - if (version > NOTMUCH_DATABASE_VERSION) { > - fprintf (stderr, > - "Error: Notmuch database at %s\n" > - " has a newer database format version (%u) than supported by this\n" > - " version of notmuch (%u). Refusing to open this database in\n" > - " read-write mode.\n", > - notmuch_path, version, NOTMUCH_DATABASE_VERSION); > - notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; > - notmuch_database_destroy (notmuch); > - notmuch = NULL; > - status = NOTMUCH_STATUS_FILE_ERROR; > - goto DONE; > - } > - > - if (version < NOTMUCH_DATABASE_VERSION) > - notmuch->needs_upgrade = TRUE; > } else { > notmuch->xapian_db = new Xapian::Database (xapian_path); > - version = notmuch_database_get_version (notmuch); > - if (version > NOTMUCH_DATABASE_VERSION) > - { > - fprintf (stderr, > - "Warning: Notmuch database at %s\n" > - " has a newer database format version (%u) than supported by this\n" > - " version of notmuch (%u). Some operations may behave incorrectly,\n" > - " (but the database will not be harmed since it is being opened\n" > - " in read-only mode).\n", > - notmuch_path, version, NOTMUCH_DATABASE_VERSION); > - } > } > > + /* Check version. As of database version 3, we represent > + * changes in terms of features, so assume a version bump > + * means a dramatically incompatible change. */ > + version = notmuch_database_get_version (notmuch); > + if (version > NOTMUCH_DATABASE_VERSION) { > + fprintf (stderr, > + "Error: Notmuch database at %s\n" > + " has a newer database format version (%u) than supported by this\n" > + " version of notmuch (%u).\n", > + notmuch_path, version, NOTMUCH_DATABASE_VERSION); > + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; > + notmuch_database_destroy (notmuch); > + notmuch = NULL; > + status = NOTMUCH_STATUS_FILE_ERROR; > + goto DONE; > + } > + > + /* Check features. */ > + incompat_features = NULL; > + notmuch->features = _parse_features ( > + local, notmuch->xapian_db->get_metadata ("features").c_str (), > + version, mode == NOTMUCH_DATABASE_MODE_READ_WRITE ? 'w' : 'r', > + &incompat_features); > + if (incompat_features) { > + fprintf (stderr, > + "Error: Notmuch database at %s\n" > + " requires features (%s)\n" > + " not supported by this version of notmuch.\n", > + notmuch_path, incompat_features); > + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; > + notmuch_database_destroy (notmuch); > + notmuch = NULL; > + status = NOTMUCH_STATUS_FILE_ERROR; > + goto DONE; > + } > + > + /* Do we want an upgrade? */ > + if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE && > + NOTMUCH_FEATURES_CURRENT & ~notmuch->features) > + notmuch->needs_upgrade = TRUE; > + > notmuch->last_doc_id = notmuch->xapian_db->get_lastdocid (); > last_thread_id = notmuch->xapian_db->get_metadata ("last_thread_id"); > if (last_thread_id.empty ()) { > @@ -1077,6 +1200,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > double progress), > void *closure) > { > + void *local = talloc_new (NULL); > Xapian::WritableDatabase *db; > struct sigaction action; > struct itimerval timerval; > @@ -1226,6 +1350,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > notmuch_query_destroy (query); > } > > + notmuch->features |= NOTMUCH_FEATURES_CURRENT; > + db->set_metadata ("features", _print_features (local, notmuch->features)); > db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION)); > db->flush (); > > @@ -1302,6 +1428,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > sigaction (SIGALRM, &action, NULL); > } > > + talloc_free (local); > return NOTMUCH_STATUS_SUCCESS; > } > > -- > 2.0.0 > > _______________________________________________ > notmuch mailing list > notmuch@notmuchmail.org > http://notmuchmail.org/mailman/listinfo/notmuch