From: Mark Walters Date: Sun, 27 Jul 2014 09:53:25 +0000 (+0100) Subject: Re: [PATCH 05/14] lib: Database version 3: Introduce fine-grained "features" X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=63d8b5436d1e06d1c446a598e11257142f576d41;p=notmuch-archives.git Re: [PATCH 05/14] lib: Database version 3: Introduce fine-grained "features" --- diff --git a/ce/94cd64c5cca99581a557d4abb5db398282330c b/ce/94cd64c5cca99581a557d4abb5db398282330c new file mode 100644 index 000000000..4ddc3bf93 --- /dev/null +++ b/ce/94cd64c5cca99581a557d4abb5db398282330c @@ -0,0 +1,496 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id 1D35E431FBC + for ; Sun, 27 Jul 2014 02:54:16 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: 0.502 +X-Spam-Level: +X-Spam-Status: No, score=0.502 tagged_above=-999 required=5 + tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001, + NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id Pp9Dr19U+OZr for ; + Sun, 27 Jul 2014 02:54:08 -0700 (PDT) +Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6]) + (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) + (No client certificate requested) + by olra.theworths.org (Postfix) with ESMTPS id 97910431FB6 + for ; Sun, 27 Jul 2014 02:54:07 -0700 (PDT) +Received: from smtp.qmul.ac.uk ([138.37.6.40]) + by mail2.qmul.ac.uk with esmtp (Exim 4.71) + (envelope-from ) + id 1XBL9F-0004Fl-0B; Sun, 27 Jul 2014 10:54:05 +0100 +Received: from 94.196.249.126.threembb.co.uk ([94.196.249.126] helo=localhost) + by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.71) + (envelope-from ) + id 1XBL98-0006EF-Eg; Sun, 27 Jul 2014 10:53:44 +0100 +From: Mark Walters +To: Austin Clements , notmuch@notmuchmail.org +Subject: Re: [PATCH 05/14] lib: Database version 3: Introduce fine-grained + "features" +In-Reply-To: <1406433173-19169-6-git-send-email-amdragon@mit.edu> +References: <1406433173-19169-1-git-send-email-amdragon@mit.edu> + <1406433173-19169-6-git-send-email-amdragon@mit.edu> +User-Agent: Notmuch/0.15.2+615~g78e3a93 (http://notmuchmail.org) Emacs/23.4.1 + (x86_64-pc-linux-gnu) +Date: Sun, 27 Jul 2014 10:53:25 +0100 +Message-ID: <87silnyvoq.fsf@qmul.ac.uk> +MIME-Version: 1.0 +Content-Type: text/plain; charset=us-ascii +X-Sender-Host-Address: 94.196.249.126 +X-QM-Geographic: According to ripencc, + this message was delivered by a machine in Britain (UK) (GB). +X-QM-SPAM-Info: Sender has good ham record. :) +X-QM-Body-MD5: 74279823180b19ebad2f59792d4c5cc2 (of first 20000 bytes) +X-SpamAssassin-Score: -0.0 +X-SpamAssassin-SpamBar: / +X-SpamAssassin-Report: The QM spam filters have analysed this message to + determine if it is + spam. We require at least 5.0 points to mark a message as spam. + This message scored -0.0 points. + Summary of the scoring: + * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail + provider * (markwalters1009[at]gmail.com) + * -0.0 AWL AWL: From: address is in the auto white-list +X-QM-Scan-Virus: ClamAV says the message is clean +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Sun, 27 Jul 2014 09:54:16 -0000 + + +On Sun, 27 Jul 2014, Austin Clements wrote: +> Previously, our database schema was versioned by a single number. +> Each database schema change had to occur "atomically" in Notmuch's +> development history: before some commit, Notmuch used version N, after +> that commit, it used version N+1. Hence, each new schema version +> could introduce only one change, the task of developing a schema +> change fell on a single person, and it all had to happen and be +> perfect in a single commit series. This made introducing a new schema +> version hard. We've seen only two schema changes in the history of +> Notmuch. + +I like this series as a whole and I am happy with all the patches that I +haven't sent comments on. (Note, however, that I am not very familiar +with the database code.) + +There is one thing which confuses me in this patch: + +> +> This commit introduces database schema version 3; hopefully the last +> schema version we'll need for a while. With this version, we switch +> from a single version number to "features": a set of named, +> independent aspects of the database schema. +> +> Features should make backwards compatibility easier. For many things, +> it should be easy to support databases both with and without a +> feature, which will allow us to make upgrades optional and will enable +> "unstable" features that can be developed and tested over time. +> +> Features also make forwards compatibility easier. The features +> recorded in a database include "compatibility flags," which can +> indicate to an older version of Notmuch when it must support a given +> feature to open the database for read or for write. This lets us +> replace the old vague "I don't recognize this version, so something +> might go wrong, but I promise to try my best" warnings upon opening a +> database with an unknown version with precise errors. If a database +> is safe to open for read/write despite unknown features, an older +> version will know that and issue no message at all. If the database +> is not safe to open for read/write because of unknown features, an +> older version will know that, too, and can tell the user exactly which +> required features it lacks support for. +> --- +> lib/database-private.h | 56 +++++++++++++++ +> lib/database.cc | 189 +++++++++++++++++++++++++++++++++++++++++-------- +> 2 files changed, 214 insertions(+), 31 deletions(-) +> +> diff --git a/lib/database-private.h b/lib/database-private.h +> index d3e65fd..323b9fe 100644 +> --- a/lib/database-private.h +> +++ b/lib/database-private.h +> @@ -46,6 +46,11 @@ struct _notmuch_database { +> int atomic_nesting; +> Xapian::Database *xapian_db; +> +> + /* Bit mask of features used by this database. Features are +> + * named, independent aspects of the database schema. This is a +> + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ +> + unsigned int features; +> + +> unsigned int last_doc_id; +> uint64_t last_thread_id; +> +> @@ -55,6 +60,57 @@ struct _notmuch_database { +> Xapian::ValueRangeProcessor *date_range_processor; +> }; +> +> +/* Bit masks for _notmuch_database::features. */ +> +enum { +> + /* If set, file names are stored in "file-direntry" terms. If +> + * unset, file names are stored in document data. +> + * +> + * Introduced: version 1. Implementation support: both for read; +> + * required for write. */ +> + NOTMUCH_FEATURE_FILE_TERMS = 1 << 0, +> + +> + /* If set, directory timestamps are stored in documents with +> + * XDIRECTORY terms and relative paths. If unset, directory +> + * timestamps are stored in documents with XTIMESTAMP terms and +> + * absolute paths. +> + * +> + * Introduced: version 1. Implementation support: required. */ +> + NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 << 1, +> + +> + /* If set, the from, subject, and message-id headers are stored in +> + * message document values. If unset, message documents *may* +> + * have these values, but if the value is empty, it must be +> + * retrieved from the message file. +> + * +> + * Introduced: optional in version 1, required as of version 3. +> + * Implementation support: both. +> + */ +> + NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 << 2, +> + +> + /* If set, folder terms are boolean and path terms exist. If +> + * unset, folder terms are probabilistic and stemmed and path +> + * terms do not exist. +> + * +> + * Introduced: version 2. Implementation support: required. */ +> + NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3, +> +}; +> + +> +/* Prior to database version 3, features were implied by the database +> + * version number, so hard-code them for earlier versions. */ +> +#define NOTMUCH_FEATURES_V0 (0) +> +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \ +> + NOTMUCH_FEATURE_DIRECTORY_DOCS) +> +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER) +> + +> +/* Current database features. If any of these are missing from a +> + * database, request an upgrade. +> + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because +> + * upgrade doesn't currently introduce the feature (though brand new +> + * databases will have it). */ +> +#define NOTMUCH_FEATURES_CURRENT \ +> + (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \ +> + NOTMUCH_FEATURE_BOOL_FOLDER) +> + +> /* Return the list of terms from the given iterator matching a prefix. +> * The prefix will be stripped from the strings in the returned list. +> * The list will be allocated using ctx as the talloc context. +> diff --git a/lib/database.cc b/lib/database.cc +> index 45c4260..03eef3e 100644 +> --- a/lib/database.cc +> +++ b/lib/database.cc +> @@ -20,6 +20,7 @@ +> +> #include "database-private.h" +> #include "parse-time-vrp.h" +> +#include "string-util.h" +> +> #include +> +> @@ -42,7 +43,7 @@ typedef struct { +> const char *prefix; +> } prefix_t; +> +> -#define NOTMUCH_DATABASE_VERSION 2 +> +#define NOTMUCH_DATABASE_VERSION 3 +> +> #define STRINGIFY(s) _SUB_STRINGIFY(s) +> #define _SUB_STRINGIFY(s) #s +> @@ -151,6 +152,17 @@ typedef struct { +> * changes are made to the database (such as by +> * indexing new fields). +> * +> + * features The set of features supported by this +> + * database. This consists of a set of +> + * '\n'-separated lines, where each is a feature +> + * name, a '\t', and compatibility flags. If the +> + * compatibility flags contain 'w', then the +> + * opener must support this feature to safely +> + * write this database. If the compatibility +> + * flags contain 'r', then the opener must +> + * support this feature to read this database. +> + * Introduced in database version 3. +> + * +> * last_thread_id The last thread ID generated. This is stored +> * as a 16-byte hexadecimal ASCII representation +> * of a 64-bit unsigned integer. The first ID +> @@ -226,6 +238,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= { +> { "subject", "XSUBJECT"}, +> }; +> +> + +> const char * +> _find_prefix (const char *name) +> { +> @@ -251,6 +264,25 @@ _find_prefix (const char *name) +> return ""; +> } +> +> +static const struct +> +{ +> + /* NOTMUCH_FEATURE_* value. */ +> + unsigned int value; +> + /* Feature name as it appears in the database. This name should +> + * be appropriate for displaying to the user if an older version +> + * of notmuch doesn't support this feature. */ +> + const char *name; +> + /* Compatibility flags when this feature is declared. */ +> + const char *flags; +> +} feature_names[] = { +> + {NOTMUCH_FEATURE_FILE_TERMS, "file terms", "rw"}, +> + {NOTMUCH_FEATURE_DIRECTORY_DOCS, "directory documents", "rw"}, +> + /* Header values are not required for reading a database because a +> + * reader can just refer to the message file. */ +> + {NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES, "from/subject/ID values", "w"}, +> + {NOTMUCH_FEATURE_BOOL_FOLDER, "boolean folder terms", "rw"}, +> +}; +> + +> const char * +> notmuch_status_to_string (notmuch_status_t status) +> { +> @@ -591,6 +623,11 @@ notmuch_database_create (const char *path, notmuch_database_t **database) +> ¬much); +> if (status) +> goto DONE; +> + +> + /* Upgrade doesn't add this feature to existing databases, but new +> + * databases have it. */ +> + notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES; +> + + +Does this mean that if the user wants this feature he has to dump, +delete the existing database, and restore? Is that just because no-one +has implemented upgrade for this feature or is it "hard" in some sense? + +A possibly related question: is there likely to be a case where the user +does not want to add/upgrade some feature? Would it be worth having an +explicit upgrade command which let the user choose which features to +upgrade? This is orthogonal to this series: I am just trying to get a +feel for how it will, or could, be used. + +Best wishes + +Mark + +> status = notmuch_database_upgrade (notmuch, NULL, NULL); +> if (status) { +> notmuch_database_close(notmuch); +> @@ -619,6 +656,80 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch) +> return NOTMUCH_STATUS_SUCCESS; +> } +> +> +/* Parse a database features string from the given database version. +> + * +> + * For version < 3, this ignores the features string and returns a +> + * hard-coded set of features. +> + * +> + * If there are unrecognized features that are required to open the +> + * database in mode (which should be 'r' or 'w'), return a +> + * comma-separated list of unrecognized but required features in +> + * *incompat_out, which will be allocated from ctx. +> + */ +> +static unsigned int +> +_parse_features (const void *ctx, const char *features, unsigned int version, +> + char mode, char **incompat_out) +> +{ +> + unsigned int res = 0, namelen, i; +> + size_t llen = 0; +> + const char *flags; +> + +> + /* Prior to database version 3, features were implied by the +> + * version number. */ +> + if (version == 0) +> + return NOTMUCH_FEATURES_V0; +> + else if (version == 1) +> + return NOTMUCH_FEATURES_V1; +> + else if (version == 2) +> + return NOTMUCH_FEATURES_V2; +> + +> + /* Parse the features string */ +> + while ((features = strtok_len_c (features + llen, "\n", &llen)) != NULL) { +> + flags = strchr (features, '\t'); +> + if (! flags || flags > features + llen) +> + continue; +> + namelen = flags - features; +> + +> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) { +> + if (strlen (feature_names[i].name) == namelen && +> + strncmp (feature_names[i].name, features, namelen) == 0) { +> + res |= feature_names[i].value; +> + break; +> + } +> + } +> + +> + if (i == ARRAY_SIZE (feature_names)) { +> + /* Unrecognized feature */ +> + const char *have = strchr (flags, mode); +> + if (have && have < features + llen) { +> + /* This feature is required to access this database in +> + * 'mode', but we don't understand it. */ +> + if (! *incompat_out) +> + *incompat_out = talloc_strdup (ctx, ""); +> + *incompat_out = talloc_asprintf_append_buffer ( +> + *incompat_out, "%s%.*s", **incompat_out ? ", " : "", +> + namelen, features); +> + } +> + } +> + } +> + +> + return res; +> +} +> + +> +static char * +> +_print_features (const void *ctx, unsigned int features) +> +{ +> + unsigned int i; +> + char *res = talloc_strdup (ctx, ""); +> + +> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) +> + if (features & feature_names[i].value) +> + res = talloc_asprintf_append_buffer ( +> + res, "%s\t%s\n", feature_names[i].name, feature_names[i].flags); +> + +> + return res; +> +} +> + +> notmuch_status_t +> notmuch_database_open (const char *path, +> notmuch_database_mode_t mode, +> @@ -627,7 +738,7 @@ notmuch_database_open (const char *path, +> notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; +> void *local = talloc_new (NULL); +> notmuch_database_t *notmuch = NULL; +> - char *notmuch_path, *xapian_path; +> + char *notmuch_path, *xapian_path, *incompat_features; +> struct stat st; +> int err; +> unsigned int i, version; +> @@ -686,39 +797,51 @@ notmuch_database_open (const char *path, +> if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) { +> notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path, +> Xapian::DB_CREATE_OR_OPEN); +> - version = notmuch_database_get_version (notmuch); +> - +> - if (version > NOTMUCH_DATABASE_VERSION) { +> - fprintf (stderr, +> - "Error: Notmuch database at %s\n" +> - " has a newer database format version (%u) than supported by this\n" +> - " version of notmuch (%u). Refusing to open this database in\n" +> - " read-write mode.\n", +> - notmuch_path, version, NOTMUCH_DATABASE_VERSION); +> - notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; +> - notmuch_database_destroy (notmuch); +> - notmuch = NULL; +> - status = NOTMUCH_STATUS_FILE_ERROR; +> - goto DONE; +> - } +> - +> - if (version < NOTMUCH_DATABASE_VERSION) +> - notmuch->needs_upgrade = TRUE; +> } else { +> notmuch->xapian_db = new Xapian::Database (xapian_path); +> - version = notmuch_database_get_version (notmuch); +> - if (version > NOTMUCH_DATABASE_VERSION) +> - { +> - fprintf (stderr, +> - "Warning: Notmuch database at %s\n" +> - " has a newer database format version (%u) than supported by this\n" +> - " version of notmuch (%u). Some operations may behave incorrectly,\n" +> - " (but the database will not be harmed since it is being opened\n" +> - " in read-only mode).\n", +> - notmuch_path, version, NOTMUCH_DATABASE_VERSION); +> - } +> } +> +> + /* Check version. As of database version 3, we represent +> + * changes in terms of features, so assume a version bump +> + * means a dramatically incompatible change. */ +> + version = notmuch_database_get_version (notmuch); +> + if (version > NOTMUCH_DATABASE_VERSION) { +> + fprintf (stderr, +> + "Error: Notmuch database at %s\n" +> + " has a newer database format version (%u) than supported by this\n" +> + " version of notmuch (%u).\n", +> + notmuch_path, version, NOTMUCH_DATABASE_VERSION); +> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; +> + notmuch_database_destroy (notmuch); +> + notmuch = NULL; +> + status = NOTMUCH_STATUS_FILE_ERROR; +> + goto DONE; +> + } +> + +> + /* Check features. */ +> + incompat_features = NULL; +> + notmuch->features = _parse_features ( +> + local, notmuch->xapian_db->get_metadata ("features").c_str (), +> + version, mode == NOTMUCH_DATABASE_MODE_READ_WRITE ? 'w' : 'r', +> + &incompat_features); +> + if (incompat_features) { +> + fprintf (stderr, +> + "Error: Notmuch database at %s\n" +> + " requires features (%s)\n" +> + " not supported by this version of notmuch.\n", +> + notmuch_path, incompat_features); +> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; +> + notmuch_database_destroy (notmuch); +> + notmuch = NULL; +> + status = NOTMUCH_STATUS_FILE_ERROR; +> + goto DONE; +> + } +> + +> + /* Do we want an upgrade? */ +> + if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE && +> + NOTMUCH_FEATURES_CURRENT & ~notmuch->features) +> + notmuch->needs_upgrade = TRUE; +> + +> notmuch->last_doc_id = notmuch->xapian_db->get_lastdocid (); +> last_thread_id = notmuch->xapian_db->get_metadata ("last_thread_id"); +> if (last_thread_id.empty ()) { +> @@ -1077,6 +1200,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> double progress), +> void *closure) +> { +> + void *local = talloc_new (NULL); +> Xapian::WritableDatabase *db; +> struct sigaction action; +> struct itimerval timerval; +> @@ -1226,6 +1350,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> notmuch_query_destroy (query); +> } +> +> + notmuch->features |= NOTMUCH_FEATURES_CURRENT; +> + db->set_metadata ("features", _print_features (local, notmuch->features)); +> db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION)); +> db->flush (); +> +> @@ -1302,6 +1428,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> sigaction (SIGALRM, &action, NULL); +> } +> +> + talloc_free (local); +> return NOTMUCH_STATUS_SUCCESS; +> } +> +> -- +> 2.0.0 +> +> _______________________________________________ +> notmuch mailing list +> notmuch@notmuchmail.org +> http://notmuchmail.org/mailman/listinfo/notmuch