From: Gaute Hope
Date: Thu, 10 Apr 2014 21:10:20 +0000 (+0200)
Subject: Re: [PATCH] Add configurable changed tag to messages that have been changed on disk
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=5cc9dc5838e8781d165aaf0f1ed0ab91b6a70616;p=notmuch-archives.git

Re: [PATCH] Add configurable changed tag to messages that have been changed on disk
---

diff --git a/4d/e2d319ee84948cffa7a7074c82b8a9d826554e b/4d/e2d319ee84948cffa7a7074c82b8a9d826554e
new file mode 100644
index 000000000..bad39e8f6
--- /dev/null
+++ b/4d/e2d319ee84948cffa7a7074c82b8a9d826554e
@@ -0,0 +1,181 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 13A14431FBC
+ for ; Thu, 10 Apr 2014 14:11:39 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.7
+X-Spam-Level:
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
+ tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id 2cZf5c3w9GcH for ;
+ Thu, 10 Apr 2014 14:11:35 -0700 (PDT)
+Received: from mail-lb0-f171.google.com (mail-lb0-f171.google.com
+ [209.85.217.171]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 1FA18431FAE
+ for ; Thu, 10 Apr 2014 14:11:34 -0700 (PDT)
+Received: by mail-lb0-f171.google.com with SMTP id w7so2777282lbi.2
+ for ; Thu, 10 Apr 2014 14:11:32 -0700 (PDT)
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
+ d=1e100.net; s=20130820;
+ h=x-gm-message-state:content-type:from:to:cc:subject:in-reply-to
+ :references:date:message-id:user-agent:content-transfer-encoding;
+ bh=3SgpQsUytmES8TLpkjCznp8F6qKnc2NslVrnPnrHIfY=;
+ b=VdiM+DrKehrOxZSCMHXI2hnAuYeNJPTPH6Fj3VxPlKlkur3VqVPaZFW9HqFF6UUJjN
+ d+r2nOsEmVLXyfdqfLe6EtVvrR3VWL/sT8euE1np3rx9xMbJ+swkIdzskZDg8+6nGm/h
+ 1U5zzc+l3SQENbylgAAERctm1hVCPmnOlYhLqC2qz+BZi8pnMdL9JopULnVRtTeM8A3H
+ U+gnbUjZo5HDzIkAtp+WP/Z5ZnPxsVvPlwjZ5/x5PO3KOfMWSTyUZA7wn+SCpzTpdaDp
+ yANn7UX+JbjiM5XCaujHIXbOZEslseVY9KY4fYBdZby94vZkIooh9EyjsW9k41tq7fWU
+ KT3w==
+X-Gm-Message-State:
+ ALoCoQmvyobhsQ+Ed0VSyHO5124h52ANmYny6lFbAz4EJibHE2a0j/yz1OFsaV4NLZfj5qc8gxIV
+X-Received: by 10.152.42.164 with SMTP id p4mr13705398lal.5.1397164292100;
+ Thu, 10 Apr 2014 14:11:32 -0700 (PDT)
+Received: from localhost (cD572BF51.dhcp.as2116.net. [81.191.114.213])
+ by mx.google.com with ESMTPSA id g8sm5197206laf.0.2014.04.10.14.11.29
+ for
+ (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
+ Thu, 10 Apr 2014 14:11:30 -0700 (PDT)
+Content-Type: text/plain; charset=UTF-8
+From: Gaute Hope
+To: David Mazieres expires 2014-07-09 PDT
+
+Subject: Re: [PATCH] Add configurable changed tag to messages that have been
+ changed on disk
+In-reply-to: <87wqexnqvb.fsf@ta.scs.stanford.edu>
+References: <1396800683-9164-1-git-send-email-eg@gaute.vetsj.com>
+ <87wqf2gqig.fsf@ta.scs.stanford.edu> <1397140962-sup-6514@qwerzila>
+ <87wqexnqvb.fsf@ta.scs.stanford.edu>
+Date: Thu, 10 Apr 2014 23:10:20 +0200
+Message-Id: <1397163239-sup-5101@qwerzila>
+User-Agent: Sup/git
+Content-Transfer-Encoding: 8bit
+Cc: notmuch
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Thu, 10 Apr 2014 21:11:39 -0000
+
+Excerpts from dm-list-email-notmuch's message of 2014-04-10 17:31:04 +0200:
+> Gaute Hope writes:
+>
+> >> A better approach would be to add a new "modtime" Xapian value that is
+> >> updated whenever the tags or any other terms (such as XFDIRENTRY) are
+> >> added to or deleted from a docid. If it's a Xapian value, rather than a
+> >> term, then modtime will be queryable just like date, allowing multiple
+> >> applications to query all docids modified since the last time they ran.
+> >>
+> >> [... snip]
+> >
+> > This could also solve it, and probably have more uses. I don't quite see
+> > how the opposite problem (for my use case) can be solved by this without
+> > using a 'localchange' tag. This is for tag-to-maildir sync: when a
+> > new tag has been added (e.g. by a user interaction in a client) it needs
+> > to be copied to the maildir; if that is not done in the same go, a
+> > different application won't know whether the change was local or remote.
+> > How did you solve this?
+>
+> Why don't you just set maildir.synchronize_flags=true? When I
+> synchronize mail across machines, I start by concurrently running
+> "notmuch new" on both the local and remote machines, which picks up all
+> the changed maildir flags. Then I synchronize the mail and the tags
+> between the two maildirs. If maildir.synchronize_flags=true, then atomically
+> with setting the new tags I call notmuch_message_tags_to_maildir_flags()
+> to sync the new tags to the maildir.
+
+I am talking about syncing tags to a maildir _folder_, not flags. It
+could be implemented as maildir.synchronize_flags is now, but it would be a
+larger feature which could work in a lot of different ways.
+
+> The maildir flags question seems kind of independent of what we are
+> talking about, which is just having an incremental way of examining the
+> database. Right now, I have to scan everything to find tags that have
+> changed since the last synchronization event. If I had modtime (or
+> really it should be called "ctime", like inode change time), then I
+> could look at only the few messages that changed, and it would probably
+> shave 250 msec off polling new mail for a 100,000-message maildir.
+>
+> Note you can't use the file system ctime/mtime because the file system
+> may have changed since the last time you ran notmuch new.
+
+If you have an unreliable clock or a badly configured system, you
+risk missing changes: if the application's last-run time stamp was set
+in the future and a modtime is set now, the app won't know there has
+been a change. The same happens if the clock is in the past when the
+modtime is set and is then corrected: the app won't know there has
+been a change.
+
+The only way to know is to do a full scan of the entire db. This could
+be very expensive, comparable to initial indexing, for some actions.
+
+You would not necessarily, or reliably, be able to detect this.
+
+With an internal tick this wouldn't be an issue.
+
+> > I would suggest using a Xapian- or Index-time which gets a tick
+> > every time a modification is made to the index.
+>
+> Exactly. It could be a tick, or just the current time of day if your
+> clock does not go backwards. (I'd be willing to do a full scan if the
+> clock ever goes backwards.) The advantage of time is that you don't
+> have to synchronously update some counter.
+>
+> > Atomic operations could operate on the same time in case this
+> > distinction turns out to be useful. Perhaps something like this
+> > already exists in Xapian?
+>
+> I don't think it's important for atomic operations to have the same
+> timestamp. All that's important is that you be able to diff the
+> database against the last time you scanned it.
+
+Yeah, it is not necessary for anything I am planning on doing, but it
+would be a way for other apps to know that a set of changes were done at
+the same time.
+
+> > This way clock skew and clock resolution (lots of operations happening in
+> > the same second, msec or nanosec) problems won't be an issue. The crux
+> > will be to make sure all write-operations trigger a tick on the
+> > indextime.
+>
+> Clock skew is not really an issue. It takes years to amass hundreds of
+> thousands of email messages. So adding 5 minutes of slop is not a big
+> deal--you'll just scan a few messages needlessly.
+
+Yes, but you risk missing changes without knowing. That is an issue for
+my use case.
+
+
+> Making sure the write-operations update the time should be easy. Most
+> or all of the changes are probably funneled through
+> _notmuch_message_sync. Worst case, there are only 9 places in the
+> source code that make use of a Xapian::WritableDatabase, so I'm pretty
+> confident total changes wouldn't be much more than 50 lines of code.
+
+Yes :)
+
+> I would do it myself if there were any kind of indication that such a
+> change could be upstreamed. I brought this up in January, 2011, and
+> didn't get a huge amount of interest in the ctime idea. But I was also
+> a lot less focused on what I needed. Now that I have a working
+> distributed setup and am actually using notmuch for my mail, I have a
+> much better understanding of what is needed.
+
+Would be great if it could be included. I guess a comment from
+one/some of the notmuch-gurus could clarify?
+
+
+- gaute
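
Editorial note: the "internal tick" idea discussed in the message can be sketched roughly as follows. This is only an illustration under stated assumptions, not notmuch or Xapian code: the class `TickIndex` and its methods `add_tag`, `remove_tag`, `current_tick` and `changed_since` are hypothetical names invented for this sketch. Every write bumps a counter and stamps the touched message with it, so a client that remembers the last tick it saw can ask for everything changed since then without depending on the wall clock.

```python
class TickIndex:
    """Toy in-memory index with a change counter (hypothetical, not notmuch)."""

    def __init__(self):
        self._tick = 0          # monotonic counter, bumped on every real write
        self._tags = {}         # message id -> set of tags
        self._changed_at = {}   # message id -> tick of its last modification

    def _touch(self, msg_id):
        # Every write operation must funnel through here to record the tick.
        self._tick += 1
        self._changed_at[msg_id] = self._tick

    def add_tag(self, msg_id, tag):
        tags = self._tags.setdefault(msg_id, set())
        if tag not in tags:
            tags.add(tag)
            self._touch(msg_id)

    def remove_tag(self, msg_id, tag):
        tags = self._tags.get(msg_id, set())
        if tag in tags:
            tags.remove(tag)
            self._touch(msg_id)

    def current_tick(self):
        return self._tick

    def changed_since(self, tick):
        # Messages modified after the given tick, in a stable order.
        return sorted(m for m, t in self._changed_at.items() if t > tick)


index = TickIndex()
index.add_tag("msg-1", "inbox")
index.add_tag("msg-2", "inbox")
last_seen = index.current_tick()      # a sync client would persist this value
index.add_tag("msg-2", "replied")
index.remove_tag("msg-1", "inbox")
print(index.changed_since(last_seen))  # ['msg-1', 'msg-2']
```

Because the counter only ever increases, a client cannot miss a change the way it could with a time stamp taken from a clock that jumps; the cost, as noted in the thread, is that every write has to update the counter.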