From: Jani Nikula Date: Sat, 23 Aug 2014 16:02:39 +0000 (+0300) Subject: Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained "features" X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=b4bea89957bbbe2a720bd94e42916eb2df57b1ec;p=notmuch-archives.git Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained "features" --- diff --git a/97/57f3b4739ea3b6500bd57d7d9968248502e45b b/97/57f3b4739ea3b6500bd57d7d9968248502e45b new file mode 100644 index 000000000..2e26b03c1 --- /dev/null +++ b/97/57f3b4739ea3b6500bd57d7d9968248502e45b @@ -0,0 +1,520 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id 3C43D431FB6 + for ; Sat, 23 Aug 2014 09:02:50 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: -0.7 +X-Spam-Level: +X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 + tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id bPYCnr-aBRuP for ; + Sat, 23 Aug 2014 09:02:43 -0700 (PDT) +Received: from mail-wi0-f171.google.com (mail-wi0-f171.google.com + [209.85.212.171]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) + (No client certificate requested) + by olra.theworths.org (Postfix) with ESMTPS id A3418431FAF + for ; Sat, 23 Aug 2014 09:02:42 -0700 (PDT) +Received: by mail-wi0-f171.google.com with SMTP id hi2so844068wib.16 + for ; Sat, 23 Aug 2014 09:02:41 -0700 (PDT) +X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; + d=1e100.net; s=20130820; + h=x-gm-message-state:from:to:subject:in-reply-to:references + :user-agent:date:message-id:mime-version:content-type; + bh=UE6h6AnDwL0D9gvGPUnAfS5fKTsOKdPfacIwzt7V7KQ=; + b=bzetmBlRlQOizY3PhWhAoo3cE+VP6iv+c+q2bn4CVIYwAXAYHpYueHY96rRRHC3qMQ + 3El5ND4wAkUXGBCR8I3vbIhXccGeoXiskKsx/yttIoCQkzljuZQficEf/3gOmZ1r1h17 + W0xB28/GA8DPF3LlC4E9HwuSLYHjxcj89fYbqBqei0lshMUQ5kfWWzvg2eu3OclK9RGI + JnS0tyM1VsOZKHK1otA3ujlNbHOQ+NHYIrzTXfByXa8czh1vlXkROoVp+sNavLVNqusL + HZbjilh5rssqaMux0tJB3nJXU1/LbDG2dVr32gDkpvcqJXAHct4KJj0ZWUkUCEVWVhhn + 5mYA== +X-Gm-Message-State: + ALoCoQnXOVuCL3NVkQr9Oi1AaSAVu6EDqFqIxXBXFKLbjPSirI0noI/6Tun+6U6tXGxn8ZDARVBk +X-Received: by 10.180.21.101 with SMTP id u5mr4647284wie.68.1408809761398; + Sat, 23 Aug 2014 09:02:41 -0700 (PDT) +Received: from localhost (dsl-hkibrasgw2-58c374-75.dhcp.inet.fi. + [88.195.116.75]) by mx.google.com with ESMTPSA id + y10sm10563955wie.18.2014.08.23.09.02.39 for + (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); + Sat, 23 Aug 2014 09:02:40 -0700 (PDT) +From: Jani Nikula +To: Austin Clements , notmuch@notmuchmail.org +Subject: Re: [PATCH v3 04/13] lib: Database version 3: Introduce + fine-grained "features" +In-Reply-To: <1406859003-11561-5-git-send-email-amdragon@mit.edu> +References: <1406859003-11561-1-git-send-email-amdragon@mit.edu> + <1406859003-11561-5-git-send-email-amdragon@mit.edu> +User-Agent: Notmuch/0.18.1+65~g9f0f30f (http://notmuchmail.org) Emacs/24.3.1 + (x86_64-pc-linux-gnu) +Date: Sat, 23 Aug 2014 19:02:39 +0300 +Message-ID: <87r407jisg.fsf@nikula.org> +MIME-Version: 1.0 +Content-Type: text/plain +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Sat, 23 Aug 2014 16:02:50 -0000 + +On Fri, 01 Aug 2014, Austin Clements wrote: +> Previously, our database schema was versioned by a single number. +> Each database schema change had to occur "atomically" in Notmuch's +> development history: before some commit, Notmuch used version N, after +> that commit, it used version N+1. Hence, each new schema version +> could introduce only one change, the task of developing a schema +> change fell on a single person, and it all had to happen and be +> perfect in a single commit series. This made introducing a new schema +> version hard. We've seen only two schema changes in the history of +> Notmuch. +> +> This commit introduces database schema version 3; hopefully the last +> schema version we'll need for a while. With this version, we switch +> from a single version number to "features": a set of named, +> independent aspects of the database schema. +> +> Features should make backwards compatibility easier. For many things, +> it should be easy to support databases both with and without a +> feature, which will allow us to make upgrades optional and will enable +> "unstable" features that can be developed and tested over time. +> +> Features also make forwards compatibility easier. The features +> recorded in a database include "compatibility flags," which can +> indicate to an older version of Notmuch when it must support a given +> feature to open the database for read or for write. This lets us +> replace the old vague "I don't recognize this version, so something +> might go wrong, but I promise to try my best" warnings upon opening a +> database with an unknown version with precise errors. If a database +> is safe to open for read/write despite unknown features, an older +> version will know that and issue no message at all. If the database +> is not safe to open for read/write because of unknown features, an +> older version will know that, too, and can tell the user exactly which +> required features it lacks support for. + +I agree with the change overall; it might be useful to have another set +of eyes on the patch though. + +> --- +> lib/database-private.h | 57 ++++++++++++++- +> lib/database.cc | 190 ++++++++++++++++++++++++++++++++++++++++--------- +> 2 files changed, 213 insertions(+), 34 deletions(-) +> +> diff --git a/lib/database-private.h b/lib/database-private.h +> index d3e65fd..2ffab33 100644 +> --- a/lib/database-private.h +> +++ b/lib/database-private.h +> @@ -41,11 +41,15 @@ struct _notmuch_database { +> +> char *path; +> +> - notmuch_bool_t needs_upgrade; +> notmuch_database_mode_t mode; +> int atomic_nesting; +> Xapian::Database *xapian_db; +> +> + /* Bit mask of features used by this database. Features are +> + * named, independent aspects of the database schema. This is a +> + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ +> + unsigned int features; +> + +> unsigned int last_doc_id; +> uint64_t last_thread_id; +> +> @@ -55,6 +59,57 @@ struct _notmuch_database { +> Xapian::ValueRangeProcessor *date_range_processor; +> }; +> +> +/* Bit masks for _notmuch_database::features. */ +> +enum { +> + /* If set, file names are stored in "file-direntry" terms. If +> + * unset, file names are stored in document data. +> + * +> + * Introduced: version 1. Implementation support: both for read; +> + * required for write. */ + +Maybe I'm dense, but the implementation support comments could be +clearer. + +> + NOTMUCH_FEATURE_FILE_TERMS = 1 << 0, +> + +> + /* If set, directory timestamps are stored in documents with +> + * XDIRECTORY terms and relative paths. If unset, directory +> + * timestamps are stored in documents with XTIMESTAMP terms and +> + * absolute paths. +> + * +> + * Introduced: version 1. Implementation support: required. */ +> + NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 << 1, +> + +> + /* If set, the from, subject, and message-id headers are stored in +> + * message document values. If unset, message documents *may* +> + * have these values, but if the value is empty, it must be +> + * retrieved from the message file. +> + * +> + * Introduced: optional in version 1, required as of version 3. +> + * Implementation support: both. +> + */ +> + NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 << 2, +> + +> + /* If set, folder terms are boolean and path terms exist. If +> + * unset, folder terms are probabilistic and stemmed and path +> + * terms do not exist. +> + * +> + * Introduced: version 2. Implementation support: required. */ +> + NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3, +> +}; +> + +> +/* Prior to database version 3, features were implied by the database +> + * version number, so hard-code them for earlier versions. */ +> +#define NOTMUCH_FEATURES_V0 (0) +> +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \ +> + NOTMUCH_FEATURE_DIRECTORY_DOCS) +> +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER) +> + +> +/* Current database features. If any of these are missing from a +> + * database, request an upgrade. +> + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because +> + * upgrade doesn't currently introduce the feature (though brand new +> + * databases will have it). */ +> +#define NOTMUCH_FEATURES_CURRENT \ +> + (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \ +> + NOTMUCH_FEATURE_BOOL_FOLDER) +> + +> /* Return the list of terms from the given iterator matching a prefix. +> * The prefix will be stripped from the strings in the returned list. +> * The list will be allocated using ctx as the talloc context. +> diff --git a/lib/database.cc b/lib/database.cc +> index 45c4260..29a56db 100644 +> --- a/lib/database.cc +> +++ b/lib/database.cc +> @@ -20,6 +20,7 @@ +> +> #include "database-private.h" +> #include "parse-time-vrp.h" +> +#include "string-util.h" +> +> #include +> +> @@ -42,7 +43,7 @@ typedef struct { +> const char *prefix; +> } prefix_t; +> +> -#define NOTMUCH_DATABASE_VERSION 2 +> +#define NOTMUCH_DATABASE_VERSION 3 +> +> #define STRINGIFY(s) _SUB_STRINGIFY(s) +> #define _SUB_STRINGIFY(s) #s +> @@ -151,6 +152,17 @@ typedef struct { +> * changes are made to the database (such as by +> * indexing new fields). +> * +> + * features The set of features supported by this +> + * database. This consists of a set of +> + * '\n'-separated lines, where each is a feature +> + * name, a '\t', and compatibility flags. If the +> + * compatibility flags contain 'w', then the +> + * opener must support this feature to safely +> + * write this database. If the compatibility +> + * flags contain 'r', then the opener must +> + * support this feature to read this database. +> + * Introduced in database version 3. +> + * +> * last_thread_id The last thread ID generated. This is stored +> * as a 16-byte hexadecimal ASCII representation +> * of a 64-bit unsigned integer. The first ID +> @@ -251,6 +263,25 @@ _find_prefix (const char *name) +> return ""; +> } +> +> +static const struct +> +{ + +Brace should be at the end of the previous line. + +> + /* NOTMUCH_FEATURE_* value. */ +> + unsigned int value; +> + /* Feature name as it appears in the database. This name should +> + * be appropriate for displaying to the user if an older version +> + * of notmuch doesn't support this feature. */ +> + const char *name; +> + /* Compatibility flags when this feature is declared. */ +> + const char *flags; +> +} feature_names[] = { +> + {NOTMUCH_FEATURE_FILE_TERMS, "multiple paths per message", "rw"}, +> + {NOTMUCH_FEATURE_DIRECTORY_DOCS, "relative directory paths", "rw"}, +> + /* Header values are not required for reading a database because a +> + * reader can just refer to the message file. */ +> + {NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES, "from/subject/message-ID in database", "w"}, +> + {NOTMUCH_FEATURE_BOOL_FOLDER, "exact folder:/path: search", "rw"}, + +Spaces missing after the opening braces and before the closing braces. + +> +}; +> + +> const char * +> notmuch_status_to_string (notmuch_status_t status) +> { +> @@ -591,6 +622,11 @@ notmuch_database_create (const char *path, notmuch_database_t **database) +> ¬much); +> if (status) +> goto DONE; +> + +> + /* Upgrade doesn't add this feature to existing databases, but new +> + * databases have it. */ +> + notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES; +> + +> status = notmuch_database_upgrade (notmuch, NULL, NULL); +> if (status) { +> notmuch_database_close(notmuch); +> @@ -619,6 +655,80 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch) +> return NOTMUCH_STATUS_SUCCESS; +> } +> +> +/* Parse a database features string from the given database version. +> + * +> + * For version < 3, this ignores the features string and returns a +> + * hard-coded set of features. +> + * +> + * If there are unrecognized features that are required to open the +> + * database in mode (which should be 'r' or 'w'), return a +> + * comma-separated list of unrecognized but required features in +> + * *incompat_out, which will be allocated from ctx. + +Please describe the actual return value. + +> + */ +> +static unsigned int +> +_parse_features (const void *ctx, const char *features, unsigned int version, +> + char mode, char **incompat_out) +> +{ +> + unsigned int res = 0, namelen, i; +> + size_t llen = 0; +> + const char *flags; +> + +> + /* Prior to database version 3, features were implied by the +> + * version number. */ +> + if (version == 0) +> + return NOTMUCH_FEATURES_V0; +> + else if (version == 1) +> + return NOTMUCH_FEATURES_V1; +> + else if (version == 2) +> + return NOTMUCH_FEATURES_V2; +> + +> + /* Parse the features string */ +> + while ((features = strtok_len_c (features + llen, "\n", &llen)) != NULL) { +> + flags = strchr (features, '\t'); +> + if (! flags || flags > features + llen) +> + continue; +> + namelen = flags - features; +> + +> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) { +> + if (strlen (feature_names[i].name) == namelen && +> + strncmp (feature_names[i].name, features, namelen) == 0) { +> + res |= feature_names[i].value; +> + break; +> + } +> + } +> + +> + if (i == ARRAY_SIZE (feature_names)) { +> + /* Unrecognized feature */ +> + const char *have = strchr (flags, mode); +> + if (have && have < features + llen) { +> + /* This feature is required to access this database in +> + * 'mode', but we don't understand it. */ +> + if (! *incompat_out) +> + *incompat_out = talloc_strdup (ctx, ""); +> + *incompat_out = talloc_asprintf_append_buffer ( +> + *incompat_out, "%s%.*s", **incompat_out ? ", " : "", +> + namelen, features); + +Do we intend the lib user to be able to parse the features? Perhaps we +should use '\n' as separator here too? (Although looks like *currently* +this is only for printing the message from within the lib.) + +> + } +> + } +> + } +> + +> + return res; +> +} +> + +> +static char * +> +_print_features (const void *ctx, unsigned int features) +> +{ +> + unsigned int i; +> + char *res = talloc_strdup (ctx, ""); +> + +> + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) +> + if (features & feature_names[i].value) +> + res = talloc_asprintf_append_buffer ( +> + res, "%s\t%s\n", feature_names[i].name, feature_names[i].flags); +> + +> + return res; +> +} +> + +> notmuch_status_t +> notmuch_database_open (const char *path, +> notmuch_database_mode_t mode, +> @@ -627,7 +737,7 @@ notmuch_database_open (const char *path, +> notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; +> void *local = talloc_new (NULL); +> notmuch_database_t *notmuch = NULL; +> - char *notmuch_path, *xapian_path; +> + char *notmuch_path, *xapian_path, *incompat_features; +> struct stat st; +> int err; +> unsigned int i, version; +> @@ -677,7 +787,6 @@ notmuch_database_open (const char *path, +> if (notmuch->path[strlen (notmuch->path) - 1] == '/') +> notmuch->path[strlen (notmuch->path) - 1] = '\0'; +> +> - notmuch->needs_upgrade = FALSE; +> notmuch->mode = mode; +> notmuch->atomic_nesting = 0; +> try { +> @@ -686,37 +795,44 @@ notmuch_database_open (const char *path, +> if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) { +> notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path, +> Xapian::DB_CREATE_OR_OPEN); +> - version = notmuch_database_get_version (notmuch); +> - +> - if (version > NOTMUCH_DATABASE_VERSION) { +> - fprintf (stderr, +> - "Error: Notmuch database at %s\n" +> - " has a newer database format version (%u) than supported by this\n" +> - " version of notmuch (%u). Refusing to open this database in\n" +> - " read-write mode.\n", +> - notmuch_path, version, NOTMUCH_DATABASE_VERSION); +> - notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; +> - notmuch_database_destroy (notmuch); +> - notmuch = NULL; +> - status = NOTMUCH_STATUS_FILE_ERROR; +> - goto DONE; +> - } +> - +> - if (version < NOTMUCH_DATABASE_VERSION) +> - notmuch->needs_upgrade = TRUE; +> } else { +> notmuch->xapian_db = new Xapian::Database (xapian_path); +> - version = notmuch_database_get_version (notmuch); +> - if (version > NOTMUCH_DATABASE_VERSION) +> - { +> - fprintf (stderr, +> - "Warning: Notmuch database at %s\n" +> - " has a newer database format version (%u) than supported by this\n" +> - " version of notmuch (%u). Some operations may behave incorrectly,\n" +> - " (but the database will not be harmed since it is being opened\n" +> - " in read-only mode).\n", +> - notmuch_path, version, NOTMUCH_DATABASE_VERSION); +> - } +> + } +> + +> + /* Check version. As of database version 3, we represent +> + * changes in terms of features, so assume a version bump +> + * means a dramatically incompatible change. */ +> + version = notmuch_database_get_version (notmuch); +> + if (version > NOTMUCH_DATABASE_VERSION) { +> + fprintf (stderr, +> + "Error: Notmuch database at %s\n" +> + " has a newer database format version (%u) than supported by this\n" +> + " version of notmuch (%u).\n", +> + notmuch_path, version, NOTMUCH_DATABASE_VERSION); +> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; +> + notmuch_database_destroy (notmuch); +> + notmuch = NULL; +> + status = NOTMUCH_STATUS_FILE_ERROR; +> + goto DONE; +> + } +> + +> + /* Check features. */ +> + incompat_features = NULL; +> + notmuch->features = _parse_features ( +> + local, notmuch->xapian_db->get_metadata ("features").c_str (), +> + version, mode == NOTMUCH_DATABASE_MODE_READ_WRITE ? 'w' : 'r', + +Makes me think _parse_features could use notmuch_database_mode_t mode +instead of char mode. *shrug*. + +> + &incompat_features); +> + if (incompat_features) { +> + fprintf (stderr, +> + "Error: Notmuch database at %s\n" +> + " requires features (%s)\n" +> + " not supported by this version of notmuch.\n", +> + notmuch_path, incompat_features); +> + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; +> + notmuch_database_destroy (notmuch); +> + notmuch = NULL; +> + status = NOTMUCH_STATUS_FILE_ERROR; +> + goto DONE; +> } +> +> notmuch->last_doc_id = notmuch->xapian_db->get_lastdocid (); +> @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch) +> notmuch_bool_t +> notmuch_database_needs_upgrade (notmuch_database_t *notmuch) +> { +> - return notmuch->needs_upgrade; +> + return notmuch->mode == NOTMUCH_DATABASE_MODE_READ_WRITE && +> + (NOTMUCH_FEATURES_CURRENT & ~notmuch->features); +> } + +Am I correct that this does not lead to a database upgrade from v2 to v3 +until we actually change the features? + +Aside, why don't we return a suitable status code from +notmuch_database_open when an upgrade is required? + +> static volatile sig_atomic_t do_progress_notify = 0; +> @@ -1077,6 +1194,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> double progress), +> void *closure) +> { +> + void *local = talloc_new (NULL); +> Xapian::WritableDatabase *db; +> struct sigaction action; +> struct itimerval timerval; +> @@ -1114,6 +1232,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> timer_is_active = TRUE; +> } +> +> + /* Set the target features so we write out changes in the desired +> + * format. */ +> + notmuch->features |= NOTMUCH_FEATURES_CURRENT; +> + +> /* Before version 1, each message document had its filename in the +> * data field. Copy that into the new format by calling +> * notmuch_message_add_filename. +> @@ -1226,6 +1348,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> notmuch_query_destroy (query); +> } +> +> + db->set_metadata ("features", _print_features (local, notmuch->features)); +> db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION)); +> db->flush (); +> +> @@ -1302,6 +1425,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, +> sigaction (SIGALRM, &action, NULL); +> } +> +> + talloc_free (local); +> return NOTMUCH_STATUS_SUCCESS; +> } +> +> -- +> 2.0.0 +> +> _______________________________________________ +> notmuch mailing list +> notmuch@notmuchmail.org +> http://notmuchmail.org/mailman/listinfo/notmuch