Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 3C43D431FB6 for ; Sat, 23 Aug 2014 09:02:50 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id bPYCnr-aBRuP for ; Sat, 23 Aug 2014 09:02:43 -0700 (PDT) Received: from mail-wi0-f171.google.com (mail-wi0-f171.google.com [209.85.212.171]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) (No client certificate requested) by olra.theworths.org (Postfix) with ESMTPS id A3418431FAF for ; Sat, 23 Aug 2014 09:02:42 -0700 (PDT) Received: by mail-wi0-f171.google.com with SMTP id hi2so844068wib.16 for ; Sat, 23 Aug 2014 09:02:41 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:subject:in-reply-to:references :user-agent:date:message-id:mime-version:content-type; bh=UE6h6AnDwL0D9gvGPUnAfS5fKTsOKdPfacIwzt7V7KQ=; b=bzetmBlRlQOizY3PhWhAoo3cE+VP6iv+c+q2bn4CVIYwAXAYHpYueHY96rRRHC3qMQ 3El5ND4wAkUXGBCR8I3vbIhXccGeoXiskKsx/yttIoCQkzljuZQficEf/3gOmZ1r1h17 W0xB28/GA8DPF3LlC4E9HwuSLYHjxcj89fYbqBqei0lshMUQ5kfWWzvg2eu3OclK9RGI JnS0tyM1VsOZKHK1otA3ujlNbHOQ+NHYIrzTXfByXa8czh1vlXkROoVp+sNavLVNqusL HZbjilh5rssqaMux0tJB3nJXU1/LbDG2dVr32gDkpvcqJXAHct4KJj0ZWUkUCEVWVhhn 5mYA== X-Gm-Message-State: ALoCoQnXOVuCL3NVkQr9Oi1AaSAVu6EDqFqIxXBXFKLbjPSirI0noI/6Tun+6U6tXGxn8ZDARVBk X-Received: by 10.180.21.101 with SMTP id u5mr4647284wie.68.1408809761398; Sat, 23 Aug 2014 09:02:41 -0700 (PDT) Received: from localhost (dsl-hkibrasgw2-58c374-75.dhcp.inet.fi. [88.195.116.75]) by mx.google.com with ESMTPSA id y10sm10563955wie.18.2014.08.23.09.02.39 for (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 23 Aug 2014 09:02:40 -0700 (PDT) From: Jani Nikula To: Austin Clements , notmuch@notmuchmail.org Subject: Re: [PATCH v3 04/13] lib: Database version 3: Introduce fine-grained "features" In-Reply-To: <1406859003-11561-5-git-send-email-amdragon@mit.edu> References: <1406859003-11561-1-git-send-email-amdragon@mit.edu> <1406859003-11561-5-git-send-email-amdragon@mit.edu> User-Agent: Notmuch/0.18.1+65~g9f0f30f (http://notmuchmail.org) Emacs/24.3.1 (x86_64-pc-linux-gnu) Date: Sat, 23 Aug 2014 19:02:39 +0300 Message-ID: <87r407jisg.fsf@nikula.org> MIME-Version: 1.0 Content-Type: text/plain X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Sat, 23 Aug 2014 16:02:50 -0000 On Fri, 01 Aug 2014, Austin Clements wrote: > Previously, our database schema was versioned by a single number. > Each database schema change had to occur "atomically" in Notmuch's > development history: before some commit, Notmuch used version N, after > that commit, it used version N+1. Hence, each new schema version > could introduce only one change, the task of developing a schema > change fell on a single person, and it all had to happen and be > perfect in a single commit series. This made introducing a new schema > version hard. We've seen only two schema changes in the history of > Notmuch. > > This commit introduces database schema version 3; hopefully the last > schema version we'll need for a while. With this version, we switch > from a single version number to "features": a set of named, > independent aspects of the database schema. > > Features should make backwards compatibility easier. For many things, > it should be easy to support databases both with and without a > feature, which will allow us to make upgrades optional and will enable > "unstable" features that can be developed and tested over time. > > Features also make forwards compatibility easier. The features > recorded in a database include "compatibility flags," which can > indicate to an older version of Notmuch when it must support a given > feature to open the database for read or for write. This lets us > replace the old vague "I don't recognize this version, so something > might go wrong, but I promise to try my best" warnings upon opening a > database with an unknown version with precise errors. If a database > is safe to open for read/write despite unknown features, an older > version will know that and issue no message at all. If the database > is not safe to open for read/write because of unknown features, an > older version will know that, too, and can tell the user exactly which > required features it lacks support for. I agree with the change overall; it might be useful to have another set of eyes on the patch though. > --- > lib/database-private.h | 57 ++++++++++++++- > lib/database.cc | 190 ++++++++++++++++++++++++++++++++++++++++--------- > 2 files changed, 213 insertions(+), 34 deletions(-) > > diff --git a/lib/database-private.h b/lib/database-private.h > index d3e65fd..2ffab33 100644 > --- a/lib/database-private.h > +++ b/lib/database-private.h > @@ -41,11 +41,15 @@ struct _notmuch_database { > > char *path; > > - notmuch_bool_t needs_upgrade; > notmuch_database_mode_t mode; > int atomic_nesting; > Xapian::Database *xapian_db; > > + /* Bit mask of features used by this database. Features are > + * named, independent aspects of the database schema. This is a > + * bitwise-OR of NOTMUCH_FEATURE_* values (below). */ > + unsigned int features; > + > unsigned int last_doc_id; > uint64_t last_thread_id; > > @@ -55,6 +59,57 @@ struct _notmuch_database { > Xapian::ValueRangeProcessor *date_range_processor; > }; > > +/* Bit masks for _notmuch_database::features. */ > +enum { > + /* If set, file names are stored in "file-direntry" terms. If > + * unset, file names are stored in document data. > + * > + * Introduced: version 1. Implementation support: both for read; > + * required for write. */ Maybe I'm dense, but the implementation support comments could be clearer. > + NOTMUCH_FEATURE_FILE_TERMS = 1 << 0, > + > + /* If set, directory timestamps are stored in documents with > + * XDIRECTORY terms and relative paths. If unset, directory > + * timestamps are stored in documents with XTIMESTAMP terms and > + * absolute paths. > + * > + * Introduced: version 1. Implementation support: required. */ > + NOTMUCH_FEATURE_DIRECTORY_DOCS = 1 << 1, > + > + /* If set, the from, subject, and message-id headers are stored in > + * message document values. If unset, message documents *may* > + * have these values, but if the value is empty, it must be > + * retrieved from the message file. > + * > + * Introduced: optional in version 1, required as of version 3. > + * Implementation support: both. > + */ > + NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES = 1 << 2, > + > + /* If set, folder terms are boolean and path terms exist. If > + * unset, folder terms are probabilistic and stemmed and path > + * terms do not exist. > + * > + * Introduced: version 2. Implementation support: required. */ > + NOTMUCH_FEATURE_BOOL_FOLDER = 1 << 3, > +}; > + > +/* Prior to database version 3, features were implied by the database > + * version number, so hard-code them for earlier versions. */ > +#define NOTMUCH_FEATURES_V0 (0) > +#define NOTMUCH_FEATURES_V1 (NOTMUCH_FEATURES_V0 | NOTMUCH_FEATURE_FILE_TERMS | \ > + NOTMUCH_FEATURE_DIRECTORY_DOCS) > +#define NOTMUCH_FEATURES_V2 (NOTMUCH_FEATURES_V1 | NOTMUCH_FEATURE_BOOL_FOLDER) > + > +/* Current database features. If any of these are missing from a > + * database, request an upgrade. > + * NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES is not included because > + * upgrade doesn't currently introduce the feature (though brand new > + * databases will have it). */ > +#define NOTMUCH_FEATURES_CURRENT \ > + (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \ > + NOTMUCH_FEATURE_BOOL_FOLDER) > + > /* Return the list of terms from the given iterator matching a prefix. > * The prefix will be stripped from the strings in the returned list. > * The list will be allocated using ctx as the talloc context. > diff --git a/lib/database.cc b/lib/database.cc > index 45c4260..29a56db 100644 > --- a/lib/database.cc > +++ b/lib/database.cc > @@ -20,6 +20,7 @@ > > #include "database-private.h" > #include "parse-time-vrp.h" > +#include "string-util.h" > > #include > > @@ -42,7 +43,7 @@ typedef struct { > const char *prefix; > } prefix_t; > > -#define NOTMUCH_DATABASE_VERSION 2 > +#define NOTMUCH_DATABASE_VERSION 3 > > #define STRINGIFY(s) _SUB_STRINGIFY(s) > #define _SUB_STRINGIFY(s) #s > @@ -151,6 +152,17 @@ typedef struct { > * changes are made to the database (such as by > * indexing new fields). > * > + * features The set of features supported by this > + * database. This consists of a set of > + * '\n'-separated lines, where each is a feature > + * name, a '\t', and compatibility flags. If the > + * compatibility flags contain 'w', then the > + * opener must support this feature to safely > + * write this database. If the compatibility > + * flags contain 'r', then the opener must > + * support this feature to read this database. > + * Introduced in database version 3. > + * > * last_thread_id The last thread ID generated. This is stored > * as a 16-byte hexadecimal ASCII representation > * of a 64-bit unsigned integer. The first ID > @@ -251,6 +263,25 @@ _find_prefix (const char *name) > return ""; > } > > +static const struct > +{ Brace should be at the end of the previous line. > + /* NOTMUCH_FEATURE_* value. */ > + unsigned int value; > + /* Feature name as it appears in the database. This name should > + * be appropriate for displaying to the user if an older version > + * of notmuch doesn't support this feature. */ > + const char *name; > + /* Compatibility flags when this feature is declared. */ > + const char *flags; > +} feature_names[] = { > + {NOTMUCH_FEATURE_FILE_TERMS, "multiple paths per message", "rw"}, > + {NOTMUCH_FEATURE_DIRECTORY_DOCS, "relative directory paths", "rw"}, > + /* Header values are not required for reading a database because a > + * reader can just refer to the message file. */ > + {NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES, "from/subject/message-ID in database", "w"}, > + {NOTMUCH_FEATURE_BOOL_FOLDER, "exact folder:/path: search", "rw"}, Spaces missing after the opening braces and before the closing braces. > +}; > + > const char * > notmuch_status_to_string (notmuch_status_t status) > { > @@ -591,6 +622,11 @@ notmuch_database_create (const char *path, notmuch_database_t **database) > ¬much); > if (status) > goto DONE; > + > + /* Upgrade doesn't add this feature to existing databases, but new > + * databases have it. */ > + notmuch->features |= NOTMUCH_FEATURE_FROM_SUBJECT_ID_VALUES; > + > status = notmuch_database_upgrade (notmuch, NULL, NULL); > if (status) { > notmuch_database_close(notmuch); > @@ -619,6 +655,80 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch) > return NOTMUCH_STATUS_SUCCESS; > } > > +/* Parse a database features string from the given database version. > + * > + * For version < 3, this ignores the features string and returns a > + * hard-coded set of features. > + * > + * If there are unrecognized features that are required to open the > + * database in mode (which should be 'r' or 'w'), return a > + * comma-separated list of unrecognized but required features in > + * *incompat_out, which will be allocated from ctx. Please describe the actual return value. > + */ > +static unsigned int > +_parse_features (const void *ctx, const char *features, unsigned int version, > + char mode, char **incompat_out) > +{ > + unsigned int res = 0, namelen, i; > + size_t llen = 0; > + const char *flags; > + > + /* Prior to database version 3, features were implied by the > + * version number. */ > + if (version == 0) > + return NOTMUCH_FEATURES_V0; > + else if (version == 1) > + return NOTMUCH_FEATURES_V1; > + else if (version == 2) > + return NOTMUCH_FEATURES_V2; > + > + /* Parse the features string */ > + while ((features = strtok_len_c (features + llen, "\n", &llen)) != NULL) { > + flags = strchr (features, '\t'); > + if (! flags || flags > features + llen) > + continue; > + namelen = flags - features; > + > + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) { > + if (strlen (feature_names[i].name) == namelen && > + strncmp (feature_names[i].name, features, namelen) == 0) { > + res |= feature_names[i].value; > + break; > + } > + } > + > + if (i == ARRAY_SIZE (feature_names)) { > + /* Unrecognized feature */ > + const char *have = strchr (flags, mode); > + if (have && have < features + llen) { > + /* This feature is required to access this database in > + * 'mode', but we don't understand it. */ > + if (! *incompat_out) > + *incompat_out = talloc_strdup (ctx, ""); > + *incompat_out = talloc_asprintf_append_buffer ( > + *incompat_out, "%s%.*s", **incompat_out ? ", " : "", > + namelen, features); Do we intend the lib user to be able to parse the features? Perhaps we should use '\n' as separator here too? (Although looks like *currently* this is only for printing the message from within the lib.) > + } > + } > + } > + > + return res; > +} > + > +static char * > +_print_features (const void *ctx, unsigned int features) > +{ > + unsigned int i; > + char *res = talloc_strdup (ctx, ""); > + > + for (i = 0; i < ARRAY_SIZE (feature_names); ++i) > + if (features & feature_names[i].value) > + res = talloc_asprintf_append_buffer ( > + res, "%s\t%s\n", feature_names[i].name, feature_names[i].flags); > + > + return res; > +} > + > notmuch_status_t > notmuch_database_open (const char *path, > notmuch_database_mode_t mode, > @@ -627,7 +737,7 @@ notmuch_database_open (const char *path, > notmuch_status_t status = NOTMUCH_STATUS_SUCCESS; > void *local = talloc_new (NULL); > notmuch_database_t *notmuch = NULL; > - char *notmuch_path, *xapian_path; > + char *notmuch_path, *xapian_path, *incompat_features; > struct stat st; > int err; > unsigned int i, version; > @@ -677,7 +787,6 @@ notmuch_database_open (const char *path, > if (notmuch->path[strlen (notmuch->path) - 1] == '/') > notmuch->path[strlen (notmuch->path) - 1] = '\0'; > > - notmuch->needs_upgrade = FALSE; > notmuch->mode = mode; > notmuch->atomic_nesting = 0; > try { > @@ -686,37 +795,44 @@ notmuch_database_open (const char *path, > if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) { > notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path, > Xapian::DB_CREATE_OR_OPEN); > - version = notmuch_database_get_version (notmuch); > - > - if (version > NOTMUCH_DATABASE_VERSION) { > - fprintf (stderr, > - "Error: Notmuch database at %s\n" > - " has a newer database format version (%u) than supported by this\n" > - " version of notmuch (%u). Refusing to open this database in\n" > - " read-write mode.\n", > - notmuch_path, version, NOTMUCH_DATABASE_VERSION); > - notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; > - notmuch_database_destroy (notmuch); > - notmuch = NULL; > - status = NOTMUCH_STATUS_FILE_ERROR; > - goto DONE; > - } > - > - if (version < NOTMUCH_DATABASE_VERSION) > - notmuch->needs_upgrade = TRUE; > } else { > notmuch->xapian_db = new Xapian::Database (xapian_path); > - version = notmuch_database_get_version (notmuch); > - if (version > NOTMUCH_DATABASE_VERSION) > - { > - fprintf (stderr, > - "Warning: Notmuch database at %s\n" > - " has a newer database format version (%u) than supported by this\n" > - " version of notmuch (%u). Some operations may behave incorrectly,\n" > - " (but the database will not be harmed since it is being opened\n" > - " in read-only mode).\n", > - notmuch_path, version, NOTMUCH_DATABASE_VERSION); > - } > + } > + > + /* Check version. As of database version 3, we represent > + * changes in terms of features, so assume a version bump > + * means a dramatically incompatible change. */ > + version = notmuch_database_get_version (notmuch); > + if (version > NOTMUCH_DATABASE_VERSION) { > + fprintf (stderr, > + "Error: Notmuch database at %s\n" > + " has a newer database format version (%u) than supported by this\n" > + " version of notmuch (%u).\n", > + notmuch_path, version, NOTMUCH_DATABASE_VERSION); > + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; > + notmuch_database_destroy (notmuch); > + notmuch = NULL; > + status = NOTMUCH_STATUS_FILE_ERROR; > + goto DONE; > + } > + > + /* Check features. */ > + incompat_features = NULL; > + notmuch->features = _parse_features ( > + local, notmuch->xapian_db->get_metadata ("features").c_str (), > + version, mode == NOTMUCH_DATABASE_MODE_READ_WRITE ? 'w' : 'r', Makes me think _parse_features could use notmuch_database_mode_t mode instead of char mode. *shrug*. > + &incompat_features); > + if (incompat_features) { > + fprintf (stderr, > + "Error: Notmuch database at %s\n" > + " requires features (%s)\n" > + " not supported by this version of notmuch.\n", > + notmuch_path, incompat_features); > + notmuch->mode = NOTMUCH_DATABASE_MODE_READ_ONLY; > + notmuch_database_destroy (notmuch); > + notmuch = NULL; > + status = NOTMUCH_STATUS_FILE_ERROR; > + goto DONE; > } > > notmuch->last_doc_id = notmuch->xapian_db->get_lastdocid (); > @@ -1048,7 +1164,8 @@ notmuch_database_get_version (notmuch_database_t *notmuch) > notmuch_bool_t > notmuch_database_needs_upgrade (notmuch_database_t *notmuch) > { > - return notmuch->needs_upgrade; > + return notmuch->mode == NOTMUCH_DATABASE_MODE_READ_WRITE && > + (NOTMUCH_FEATURES_CURRENT & ~notmuch->features); > } Am I correct that this does not lead to a database upgrade from v2 to v3 until we actually change the features? Aside, why don't we return a suitable status code from notmuch_database_open when an upgrade is required? > static volatile sig_atomic_t do_progress_notify = 0; > @@ -1077,6 +1194,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > double progress), > void *closure) > { > + void *local = talloc_new (NULL); > Xapian::WritableDatabase *db; > struct sigaction action; > struct itimerval timerval; > @@ -1114,6 +1232,10 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > timer_is_active = TRUE; > } > > + /* Set the target features so we write out changes in the desired > + * format. */ > + notmuch->features |= NOTMUCH_FEATURES_CURRENT; > + > /* Before version 1, each message document had its filename in the > * data field. Copy that into the new format by calling > * notmuch_message_add_filename. > @@ -1226,6 +1348,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > notmuch_query_destroy (query); > } > > + db->set_metadata ("features", _print_features (local, notmuch->features)); > db->set_metadata ("version", STRINGIFY (NOTMUCH_DATABASE_VERSION)); > db->flush (); > > @@ -1302,6 +1425,7 @@ notmuch_database_upgrade (notmuch_database_t *notmuch, > sigaction (SIGALRM, &action, NULL); > } > > + talloc_free (local); > return NOTMUCH_STATUS_SUCCESS; > } > > -- > 2.0.0 > > _______________________________________________ > notmuch mailing list > notmuch@notmuchmail.org > http://notmuchmail.org/mailman/listinfo/notmuch