From: Austin Clements <aclements@csail.mit.edu>
To: notmuch@notmuchmail.org
Subject: [WIP PATCH 2/4] lib: Add per-message last modification tracking
Date: Mon, 13 Oct 2014 02:20:01 -0400
Message-Id: <1413181203-1676-3-git-send-email-aclements@csail.mit.edu>
X-Mailer: git-send-email 2.1.0
In-Reply-To: <1413181203-1676-1-git-send-email-aclements@csail.mit.edu>
References: <1413181203-1676-1-git-send-email-aclements@csail.mit.edu>

From: Austin Clements <amdragon@mit.edu>

This adds a new document value that stores the revision of the last
modification to message metadata, where the revision number increases
monotonically with each database commit.

An alternative would be to store the wall-clock time of the last
modification of each message.  In principle this is simpler and has
the advantage that any process can determine the current timestamp
without support from libnotmuch.  However, even assuming a computer's
clock never goes backward and ignoring clock skew in networked
environments, this has a fatal flaw.  Xapian uses (optimistic)
snapshot isolation, which means reads can be concurrent with writes.
Given this, consider the following timeline with a write and two read
transactions:

   write  |-X-A--------------|
   read 1       |---B---|
   read 2                      |---|

The write transaction modifies message X and records the wall-clock
time of the modification at A.  The writer hangs around for a while
and later commits its change.  Read 1 is concurrent with the write, so
it doesn't see the change to X.  It does some query and records the
wall-clock time of its results at B.  Read 2 starts after the write
commits and queries for changes since wall-clock time B (say the reads
are performing an incremental backup).  Even though read 1 could not
see the change to X, read 2 is told (correctly, as far as the recorded
timestamps go) that X has not changed since B, the time of the last
read.  In fact, X changed at wall-clock time A, which is before B, but
the change was not visible until *after* wall-clock time B, so read 2
misses the change to X.

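The timeline above can be replayed in a few lines.  This is only a toy
model written for illustration: Snapshot, changed_since and the two
scenario functions are hypothetical names, not libnotmuch or Xapian
API, and a "snapshot" is modeled as a plain copy of the committed map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model only: none of these names exist in libnotmuch or Xapian.
// A snapshot maps a message id to the stamp of its last modification.
using Snapshot = std::map<std::string, long>;

// Ids whose stamp is strictly greater than `since`.
static std::vector<std::string>
changed_since (const Snapshot &snap, long since)
{
    std::vector<std::string> out;
    for (const auto &entry : snap)
        if (entry.second > since)
            out.push_back (entry.first);
    return out;
}

// Wall-clock stamps: the stamp A is chosen when X is modified, but the
// commit only becomes visible after read 1 has recorded B.
static bool
wall_clock_misses_change ()
{
    const long A = 10, B = 20;      // wall-clock times, A < B
    Snapshot committed, pending;
    pending["X"] = A;               // write: modify X, stamp it with A
    Snapshot read1 = committed;     // read 1 starts before the commit
    long checkpoint = B;            // read 1 records "now"
    committed = pending;            // write commits after B
    Snapshot read2 = committed;     // read 2 starts after the commit
    return read1.count ("X") == 0
        && changed_since (read2, checkpoint).empty ();
}

// Revision stamps: the change is stamped with the committed revision
// plus one, and a reader's checkpoint is the revision it could see.
static bool
revision_sees_change ()
{
    long committed_rev = 0;
    Snapshot committed, pending;
    pending["X"] = committed_rev + 1;
    Snapshot read1 = committed;
    long checkpoint = committed_rev; // read 1 saw revision 0
    committed = pending;
    committed_rev += 1;
    Snapshot read2 = committed;
    std::vector<std::string> changed = changed_since (read2, checkpoint);
    return read1.count ("X") == 0
        && changed.size () == 1 && changed[0] == "X";
}
```

Both functions return true: with wall-clock stamps read 2 reports no
changes since B even though read 1 never saw X, while with revision
stamps read 2 reports X.
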
This is tricky to solve in full-blown snapshot isolation, but because
Xapian serializes writes, we can use a simple, monotonically
increasing database revision number.  Furthermore, maintaining this
revision number requires no more IO than a wall-clock time solution
because Xapian already maintains statistics on the upper (and lower)
bound of each value stream.

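As a rough sketch of how such a counter can interact with atomic
sections, the following mirrors the logic this patch adds in
_notmuch_database_new_revision and notmuch_database_end_atomic.
ToyDatabase, new_revision, begin_atomic and end_atomic are stand-in
names for illustration only, and the assumption that only the
outermost end_atomic commits is mine, not something this patch states.

```cpp
#include <cassert>

// Stand-in for the relevant fields of struct _notmuch_database.
struct ToyDatabase {
    unsigned long revision = 0;  // highest committed revision
    int atomic_nesting = 0;      // depth of nested atomic sections
    bool atomic_dirty = false;   // any change in this atomic section?
};

// Hand out the revision number to record for the next change.
static unsigned long
new_revision (ToyDatabase &db)
{
    unsigned long next = db.revision + 1;

    if (db.atomic_nesting)
        db.atomic_dirty = true;  // defer the bump until commit
    else
        db.revision = next;      // standalone change commits at once

    return next;
}

static void
begin_atomic (ToyDatabase &db)
{
    ++db.atomic_nesting;
}

// Assumption: only leaving the outermost section counts as a commit.
static void
end_atomic (ToyDatabase &db)
{
    if (db.atomic_nesting == 1 && db.atomic_dirty) {
        ++db.revision;
        db.atomic_dirty = false;
    }
    --db.atomic_nesting;
}
```

Under these assumptions every change made inside one atomic section is
recorded under the same revision number, and the committed revision
advances exactly once when the section ends.
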
---
 lib/database-private.h | 15 ++++++++++++++-
 lib/database.cc        | 49 +++++++++++++++++++++++++++++++++++++++++++++++--
 lib/message.cc         | 22 ++++++++++++++++++++++
 lib/notmuch-private.h  | 10 +++++++++-
 4 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/lib/database-private.h b/lib/database-private.h
index 15e03cc..465065d 100644
--- a/lib/database-private.h
+++ b/lib/database-private.h
@@ -92,6 +92,12 @@ enum _notmuch_features {
      * Introduced: version 3. */
     NOTMUCH_FEATURE_GHOSTS = 1 << 4,
+    /* If set, messages store the revision number of the last
+     * modification in NOTMUCH_VALUE_LAST_MOD.
+     *
+     * Introduced: version 3. */
+    NOTMUCH_FEATURE_LAST_MOD = 1 << 5,
 };
 /* In C++, a named enum is its own type, so define bitwise operators
@@ -137,6 +143,8 @@ struct _notmuch_database {
     notmuch_database_mode_t mode;
     int atomic_nesting;
+    /* TRUE if changes have been made in this atomic section */
+    notmuch_bool_t atomic_dirty;
     Xapian::Database *xapian_db;
     /* Bit mask of features used by this database.  This is a
@@ -145,6 +153,10 @@ struct _notmuch_database {
     unsigned int last_doc_id;
     uint64_t last_thread_id;
+    /* Highest committed revision number.  Modifications are recorded
+     * under a higher revision number, which can be generated with
+     * _notmuch_database_new_revision. */
+    unsigned long revision;
     Xapian::QueryParser *query_parser;
     Xapian::TermGenerator *term_gen;
@@ -166,7 +178,8 @@ struct _notmuch_database {
  * databases will have it). */
 #define NOTMUCH_FEATURES_CURRENT \
     (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_DIRECTORY_DOCS | \
-     NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS)
+     NOTMUCH_FEATURE_BOOL_FOLDER | NOTMUCH_FEATURE_GHOSTS | \
+     NOTMUCH_FEATURE_LAST_MOD)
 /* Return the list of terms from the given iterator matching a prefix.
  * The prefix will be stripped from the strings in the returned list.
diff --git a/lib/database.cc b/lib/database.cc
index 6e51a72..45d32ab 100644
--- a/lib/database.cc
+++ b/lib/database.cc
@@ -101,6 +101,9 @@ typedef struct {
  *      SUBJECT:        The value of the "Subject" header
+ *      LAST_MOD:       The revision number as of the last tag or
+ *                      filename change.
  * In addition, terms from the content of the message are added with
  * "from", "to", "attachment", and "subject" prefixes for use by the
  * user in searching. Similarly, terms from the path of the mail
@@ -304,6 +307,8 @@ static const struct {
       "exact folder:/path: search", "rw" },
     { NOTMUCH_FEATURE_GHOSTS,
       "mail documents for missing messages", "w"},
+    { NOTMUCH_FEATURE_LAST_MOD,
+      "modification tracking", "w"},
@@ -678,6 +683,23 @@ _notmuch_database_ensure_writable (notmuch_database_t *notmuch)
     return NOTMUCH_STATUS_SUCCESS;
 }
 
+/* Allocate a revision number for the next change. */
+unsigned long
+_notmuch_database_new_revision (notmuch_database_t *notmuch)
+{
+    unsigned long new_revision = notmuch->revision + 1;
+
+    /* If we're in an atomic section, hold off on updating the
+     * committed revision number until we commit the atomic section.
+     */
+    if (notmuch->atomic_nesting)
+        notmuch->atomic_dirty = TRUE;
+    else
+        notmuch->revision = new_revision;
+
+    return new_revision;
+}
+
 /* Parse a database features string from the given database version.
  * Returns the feature bit set.
@@ -817,6 +839,7 @@ notmuch_database_open (const char *path,
     notmuch->atomic_nesting = 0;
         string last_thread_id;
+        string last_mod;
         if (mode == NOTMUCH_DATABASE_MODE_READ_WRITE) {
             notmuch->xapian_db = new Xapian::WritableDatabase (xapian_path,
@@ -875,6 +898,14 @@ notmuch_database_open (const char *path,
                 INTERNAL_ERROR ("Malformed database last_thread_id: %s", str);
+        /* Get current highest revision number. */
+        last_mod = notmuch->xapian_db->get_value_upper_bound (
+            NOTMUCH_VALUE_LAST_MOD);
+        if (last_mod.empty ())
+            notmuch->revision = 0;
+        else
+            notmuch->revision = Xapian::sortable_unserialise (last_mod);
         notmuch->query_parser = new Xapian::QueryParser;
         notmuch->term_gen = new Xapian::TermGenerator;
         notmuch->term_gen->set_stemmer (Xapian::Stem ("english"));
@@ -1266,7 +1297,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
     /* Figure out how much total work we need to do. */
     if (new_features &
-        (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
+        (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER |
+         NOTMUCH_FEATURE_LAST_MOD)) {
         notmuch_query_t *query = notmuch_query_create (notmuch, "");
         total += notmuch_query_count_messages (query);
         notmuch_query_destroy (query);
@@ -1293,7 +1325,8 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
     /* Perform per-message upgrades. */
     if (new_features &
-        (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER)) {
+        (NOTMUCH_FEATURE_FILE_TERMS | NOTMUCH_FEATURE_BOOL_FOLDER |
+         NOTMUCH_FEATURE_LAST_MOD)) {
         notmuch_query_t *query = notmuch_query_create (notmuch, "");
         notmuch_messages_t *messages;
         notmuch_message_t *message;
@@ -1330,6 +1363,13 @@ notmuch_database_upgrade (notmuch_database_t *notmuch,
             if (new_features & NOTMUCH_FEATURE_BOOL_FOLDER)
                 _notmuch_message_upgrade_folder (message);
+            /* Prior to NOTMUCH_FEATURE_LAST_MOD, messages did not
+             * track modification revisions.  Give all messages a
+            if (new_features & NOTMUCH_FEATURE_LAST_MOD)
+                _notmuch_message_upgrade_last_mod (message);
             _notmuch_message_sync (message);
             notmuch_message_destroy (message);
@@ -1512,6 +1552,11 @@ notmuch_database_end_atomic (notmuch_database_t *notmuch)
         return NOTMUCH_STATUS_XAPIAN_EXCEPTION;
+    if (notmuch->atomic_dirty) {
+        ++notmuch->revision;
+        notmuch->atomic_dirty = FALSE;
+    }
     notmuch->atomic_nesting--;
     return NOTMUCH_STATUS_SUCCESS;
diff --git a/lib/message.cc b/lib/message.cc
index cf2fd7c..767f0ab 100644
--- a/lib/message.cc
+++ b/lib/message.cc
@@ -996,6 +996,16 @@ _notmuch_message_set_header_values (notmuch_message_t *message,
     message->modified = TRUE;
 }
 
+/* Upgrade a message to support NOTMUCH_FEATURE_LAST_MOD.  The caller
+ * must call _notmuch_message_sync. */
+void
+_notmuch_message_upgrade_last_mod (notmuch_message_t *message)
+{
+    /* _notmuch_message_sync will update the last modification
+     * revision; we just have to ask it to. */
+    message->modified = TRUE;
+}
+
 /* Synchronize changes made to message->doc out into the database. */
 void
 _notmuch_message_sync (notmuch_message_t *message)
@@ -1008,6 +1018,18 @@ _notmuch_message_sync (notmuch_message_t *message)
     if (! message->modified)
         return;
 
+    /* Update the last modification of this message. */
+    if (message->notmuch->features & NOTMUCH_FEATURE_LAST_MOD)
+        /* sortable_serialise gives a reasonably compact encoding,
+         * which directly translates to reduced IO when scanning the
+         * value stream.  Since it's built for doubles, we only get 53
+         * effective bits, but that's still enough for the database to
+         * last a few centuries at 1 million revisions per second. */
+        message->doc.add_value (NOTMUCH_VALUE_LAST_MOD,
+                                Xapian::sortable_serialise (
+                                    _notmuch_database_new_revision (
+                                        message->notmuch)));
+
     db = static_cast <Xapian::WritableDatabase *> (message->notmuch->xapian_db);
     db->replace_document (message->doc_id, message->doc);
     message->modified = FALSE;
diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
index 2f43c1d..cb85738 100644
--- a/lib/notmuch-private.h
+++ b/lib/notmuch-private.h
@@ -108,7 +108,8 @@ typedef enum {
     NOTMUCH_VALUE_TIMESTAMP = 0,
     NOTMUCH_VALUE_MESSAGE_ID,
     NOTMUCH_VALUE_FROM,
-    NOTMUCH_VALUE_SUBJECT
+    NOTMUCH_VALUE_SUBJECT,
+    NOTMUCH_VALUE_LAST_MOD,
 /* Xapian (with flint backend) complains if we provide a term longer
@@ -191,6 +192,9 @@ _notmuch_message_id_compressed (void *ctx, const char *message_id);
 _notmuch_database_ensure_writable (notmuch_database_t *notmuch);
+unsigned long
+_notmuch_database_new_revision (notmuch_database_t *notmuch);
 _notmuch_database_relative_path (notmuch_database_t *notmuch,
@@ -302,6 +306,10 @@ _notmuch_message_set_header_values (notmuch_message_t *message,
                                     const char *subject);
+void
+_notmuch_message_upgrade_last_mod (notmuch_message_t *message);
 _notmuch_message_sync (notmuch_message_t *message);
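
The comment in _notmuch_message_sync above says that 53 effective bits
last "a few centuries at 1 million revisions per second".  A quick
check of that arithmetic, assuming a 365.25-day year; the names
kMaxExactRevision, years_until_exhausted and survives_double are made
up for this sketch and are not part of notmuch or Xapian.

```cpp
#include <cassert>
#include <cstdint>

// A double has a 53-bit significand, so every integer in [0, 2^53] is
// exactly representable; above that, neighbouring integers collide.
constexpr std::uint64_t kMaxExactRevision = std::uint64_t{1} << 53;

// Years until a counter starting at 0 reaches 2^53 at the given rate.
constexpr double
years_until_exhausted (double revisions_per_second)
{
    return static_cast<double> (kMaxExactRevision)
        / revisions_per_second
        / (365.25 * 24 * 60 * 60);
}

// True if the revision survives a round trip through double unchanged.
inline bool
survives_double (std::uint64_t revision)
{
    return static_cast<std::uint64_t> (static_cast<double> (revision))
        == revision;
}
```

At 1e6 revisions per second this comes to roughly 285 years, which
matches the "few centuries" estimate, and 2^53 + 1 is the first
revision that no longer survives the conversion to double.
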