Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 224D0431FD0 for ; Mon, 19 Dec 2011 11:47:13 -0800 (PST) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id A6HwCmjIIPXK for ; Mon, 19 Dec 2011 11:47:12 -0800 (PST) Received: from dmz-mailsec-scanner-2.mit.edu (DMZ-MAILSEC-SCANNER-2.MIT.EDU [18.9.25.13]) by olra.theworths.org (Postfix) with ESMTP id 544F0431FB6 for ; Mon, 19 Dec 2011 11:47:12 -0800 (PST) X-AuditID: 1209190d-b7f576d0000008c4-a6-4eef94bfcc1c Received: from mailhub-auth-4.mit.edu ( [18.7.62.39]) by dmz-mailsec-scanner-2.mit.edu (Symantec Messaging Gateway) with SMTP id BD.1B.02244.FB49FEE4; Mon, 19 Dec 2011 14:47:11 -0500 (EST) Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103]) by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id pBJJlAF1017294; Mon, 19 Dec 2011 14:47:11 -0500 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id pBJJl8hV010252 (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT); Mon, 19 Dec 2011 14:47:08 -0500 (EST) Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77) (envelope-from ) id 1RcjC9-0002t5-QF; Mon, 19 Dec 2011 14:48:21 -0500 Date: Mon, 19 Dec 2011 14:48:21 -0500 From: Austin Clements To: David Edmondson Subject: Re: [PATCH 0/5] Store message modification times in the DB Message-ID: <20111219194821.GA10376@mit.edu> References: <1323796305-28789-1-git-send-email-schnouki@schnouki.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFprCKsWRmVeSWpSXmKPExsUixG6nrrt/yns/g58z9Cz23dnCZHH95kxm i339/g7MHrue/2XyeLbqFrPHlFlz2QOYo7hsUlJzMstSi/TtErgyGhdeYCr4J11x67hUA+Nc sS5GTg4JAROJBxtPs0DYYhIX7q1n62Lk4hAS2McocW1WA1hCSGADo8T+KWUQiZNMEpv3nGOC SCxhlLhzxBTEZhFQlfi4YiIriM0moCGxbf9yRhBbREBR4v+3FewgNrOAg8TKk71gtrCAs8Tm nzuB6jk4eAV0JF70SECMLJfY/akDrIRXQFDi5MwnLBCtWhI3/r1kAilnFpCWWP6PA8TkFLCR WLXdE6RCVEBFYsrJbWwTGIVmIWmehaR5FkLzAkbmVYyyKblVurmJmTnFqcm6xcmJeXmpRbpG ermZJXqpKaWbGEEBzinJu4Px3UGlQ4wCHIxKPLwrm977CbEmlhVX5h5ilORgUhLltZoMFOJL yk+pzEgszogvKs1JLT7EKMHBrCTC++PFWz8h3pTEyqrUonyYlDQHi5I4r6rWOz8hgfTEktTs 1NSC1CKYrAwHh5IEry4wkoUEi1LTUyvSMnNKENJMHJwgw3mAhmuC1PAWFyTmFmemQ+RPMepy 7P38/QyjEEtefl6qlDivKUiRAEhRRmke3BxYYnrFKA70ljDEOh5gUoOb9ApoCRPQkr7ANyBL ShIRUsDkk/ntnM+KLRMeB/14seG+UOpUhVU/tif9vqpl7WU3PaE2u2KpdYnH3Q+Jj3Qeal3X SJNTSlAzb7D97hOd9vzeGn35U3tzn4es32BT/LHyoKLk5zqbdSrFKj9iNQNe3LSeLtAWUPUx oeKbt4jp36kVP6wUjt/1yAm/pqdpc2hFiWT8QrU18kHnlFiKMxINtZiLihMBGxHPVicDAAA= Cc: notmuch@notmuchmail.org X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 19 Dec 2011 19:47:13 -0000 Quoth David Edmondson on Dec 19 at 4:34 pm: > On Tue, 13 Dec 2011 18:11:40 +0100, Thomas Jost wrote: > > This is a patch series I've been working on for some time in order to be > > able to sync my tags on several computers. I'm posting it now, but > > please consider it as a RFC rather than something that is ready to be > > pushed. > > > > The basic idea is to the last time each message was modified, i.e. "the > > message was added to the DB", "a tag was added" or "a tag was removed". > > Thomas, this is interesting. Do you have a (back of the envelope?) > design for how you will use this information to implement tag sync? > > My gut feeling is that we need a log of when a change occurred rather > than the last modification time, but I haven't really thought that all > through properly. Here are sketches for two sync algorithms with different properties. I haven't proven these to be correct, but I believe they are. In both, R is the remote host and L is the local host. They're both one-way (they only update tags on L), but should be symmetrically stable. == Two-way "merge" from host R to host L == Per-host state: - last_mtime: Map from remote hosts to last sync mtime new_mtime = last_mtime[R] For msgid, mtime, tags in messages on host R with mtime >= last_mtime[R]: If mtime > local mtime of msgid: Set local tags of msgid to tags new_mtime = max(new_mtime, mtime) last_mtime[R] = new_mtime This has the advantage of keeping very little state, but the synchronization is also quite primitive. If two hosts change a message's tags in different ways between synchronizations, the more recent of the two will override the full set of tags on that message. This does not strictly require tombstones, though if you make a tag change and then delete the message before a sync, the tag change will be lost without some record of that state. Also, this obviously depends heavily on synchronized clocks. == Three-way merge from host R to host L == Per-host state: - last_mtime: Map from remote hosts to last sync mtime - last_sync: Map from remote hosts to the tag database as of the last sync new_mtime = last_mtime[R] for msgid, mtime, r_tags in messages on host R with mtime >= last_mtime[R]: my_tags = local tags of msgid last_tags = last_sync[R][msgid] for each tag that differs between my_tags and r_tags: if tag is in last_tags: remove tag locally else: add tag locally last_sync[R][msgid] = tags new_mtime = max(new_mtime, mtime) Delete stale messages from last_sync[R] (using tombstones or something) last_mtime[R] = new_mtime This protocol requires significantly more state, but can also reconstruct per-tag changes. Conflict resolution is equivalent to what git would do and is based solely on the current local and remote state and the common ancestor state. This can lead to unintuitive results if a tag on a message has gone through multiple changes on both hosts since the last sync (though, I argue, there are no intuitive results in such situations). Tombstones are only required to garbage collect sync state (and other techniques could be used for that). This also does not depend on time synchronization (though, like any mtime solution, it does depend on mtime monotonicity). The algorithm would work equally well with sequence numbers. I tried coming up with a third algorithm that used mtimes to resolve tagging conflicts, but without per-tag mtimes it degenerated into the first algorithm.