From 69b3a6b62b9e53695e000c1b3cab33c0cb85bcf6 Mon Sep 17 00:00:00 2001 From: Daniel Kahn Gillmor Date: Thu, 11 Feb 2016 12:21:24 +1900 Subject: [PATCH] Re: [PATCH v3 15/16] added notmuch_message_reindex --- b7/0dc879377b862c138acb14c4e7cade79a7566f | 158 ++++++++++++++++++++++ 1 file changed, 158 insertions(+) create mode 100644 b7/0dc879377b862c138acb14c4e7cade79a7566f diff --git a/b7/0dc879377b862c138acb14c4e7cade79a7566f b/b7/0dc879377b862c138acb14c4e7cade79a7566f new file mode 100644 index 000000000..6a453a23d --- /dev/null +++ b/b7/0dc879377b862c138acb14c4e7cade79a7566f @@ -0,0 +1,158 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by arlo.cworth.org (Postfix) with ESMTP id 7DA916DE179D + for ; Wed, 10 Feb 2016 09:21:34 -0800 (PST) +X-Virus-Scanned: Debian amavisd-new at cworth.org +X-Spam-Flag: NO +X-Spam-Score: -0.017 +X-Spam-Level: +X-Spam-Status: No, score=-0.017 tagged_above=-999 required=5 + tests=[AWL=-0.017] autolearn=disabled +Received: from arlo.cworth.org ([127.0.0.1]) + by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id Bl9KNKXl6Bxs for ; + Wed, 10 Feb 2016 09:21:32 -0800 (PST) +Received: from che.mayfirst.org (che.mayfirst.org [209.234.253.108]) + by arlo.cworth.org (Postfix) with ESMTP id 4569E6DE13DA + for ; Wed, 10 Feb 2016 09:21:32 -0800 (PST) +Received: from fifthhorseman.net (unknown [38.109.115.130]) + by che.mayfirst.org (Postfix) with ESMTPSA id 769BCF997; + Wed, 10 Feb 2016 12:21:28 -0500 (EST) +Received: by fifthhorseman.net (Postfix, from userid 1000) + id E61661FF75; Wed, 10 Feb 2016 12:21:27 -0500 (EST) +From: Daniel Kahn Gillmor +To: Jameson Graef Rollins , + Notmuch Mail +Subject: Re: [PATCH v3 15/16] added notmuch_message_reindex +In-Reply-To: <87oabpnzt4.fsf@alice.fifthhorseman.net> +References: <1454272801-23623-1-git-send-email-dkg@fifthhorseman.net> + <1454272801-23623-16-git-send-email-dkg@fifthhorseman.net> + <87mvr9s8gy.fsf@servo.finestructure.net> + <87oabpnzt4.fsf@alice.fifthhorseman.net> +User-Agent: Notmuch/0.21+72~gd8c4f1c (http://notmuchmail.org) Emacs/24.5.1 + (x86_64-pc-linux-gnu) +Date: Wed, 10 Feb 2016 12:21:24 -0500 +Message-ID: <871t8ko50r.fsf@alice.fifthhorseman.net> +MIME-Version: 1.0 +Content-Type: multipart/signed; boundary="==-=-="; + micalg=pgp-sha512; protocol="application/pgp-signature" +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.20 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Wed, 10 Feb 2016 17:21:34 -0000 + +--==-=-= +Content-Type: multipart/mixed; boundary="=-=-=" + +--=-=-= +Content-Type: text/plain +Content-Transfer-Encoding: quoted-printable + +On Tue 2016-02-09 20:01:43 -0500, Daniel Kahn Gillmor wrote: +>> I just wanted to mention that I think there's a problem with the reindex +>> functionality introduced in this patch (or in 16/16). It looks like +>> this function irrevocably busts apart threads. dkg and I are +>> investigating. +> +> it doesn't appear to be irrevocable to me, but it is definitely doing +> something weird with threading. + +OK, this is definitely tickling some problems with threading, but those +are problems that are present already in existing versions of notmuch, +unrelated to this series. + +When removing a message from the database, its earlier presence doesn't +become a ghost message, and as a result anything that points to it +doesn't get assembled into the prior thread properly. + +The attached tarball has a python test showing this behavior with a +simple thread of two messages: + +0 dkg@frigg:~/src/notmuch/threading-test$ ./run-test=20 +Found 2 total files (that's not much mail). +Processed 2 total files in almost no time. +Added 2 new messages to the database. +Threads: 1 +removing and re-adding a@example.com +Threads: 2 +removing and re-adding b@example.com +Threads: 1 +0 dkg@frigg:~/src/notmuch/threading-test$=20 + +the relevant python function is: + + +def remove_and_readd(db, mid): + print('removing and re-adding', mid) + m =3D db.find_message(mid) + f =3D m.get_filename() + db.remove_message(f) + db.add_message(f) + + + +I think when a message is removed from the database, we need to know +whether anything else (in its same thread?) refers to it. If so, we +should keep it around as a ghost message instead of fully removing it. + +does this sound like the right approach? + + --dkg + + +--=-=-= +Content-Type: application/x-gtar-compressed +Content-Disposition: attachment; filename=threading-test.tgz +Content-Transfer-Encoding: base64 + +H4sIAAAAAAACA+1YW2/TMBTus3+FGUhtEUlz71YxBNoF+jAmjSEeIyc+aYOaOHOcbX3ht3OaZhO9 +DARsGRd/L0l8Lo5z/H12XLDrcDITEZuFU2AcZOf+YSGCIKiviPUrGr2O7QQ++gydwMZ22/NdtzPp +tICqVExil1II9T2/H9nXB/eXwHdoLLIMcrVvc8+y+S5YfGjx3Thme47LbAhc5rnRrmvbzl7sscAi +HY1/BmoqkfZpPjEUlGrwIH0s+DAc+nfzH+/X+W85dsfX/G+7/uYkVekkFxLuXf+9u+vvbeh/gDOm +Y+n6PzhyobIqnhqxyJN0Qp5/IRmUJZtAOTAbm9b7/0f/b4v/yPrvB66l9f8R6x9XcvCI9Q88vf4/ +ev1ZS+u/42yu/37g6PW/BZwsC26M+Yiy13DNsmIGJv4Skg9V9Blihc2MkUOmYEQ/AX9BbYseQ0Qd +LBLej+zhyN6jhuVbFjmWIkP/WRrDSqpzMaKRiFbayDuQQNOSqinQJJWlos3k0/uNP4T/UUv8d+1g +C/99zf+W+R9t538URT/F/3WmL9i/qQlnkKAA5DGUI/pyRXlekXFunEExmxuL0HXjinCUgD8uXCvH +ffFfgrHaYhbzB9//DYMN/g+HruZ/G3j6ZFCVKPVpPoD8khZzNRW5S0iaFUIqenME0DyKkhAOCZWQ +iUsIWc7DxWThPR69oFnK+yNCEYVMc9Xr1l44kSj6YYiBjvjUXXrWjhndpzwykxQTNRzu3RoTNGbm +BFSYpDPIWQa9pQEjmhe4iUluDdjHt62ER5ilGYWJKsYiVmIfgsP+eqt5cnp4ZJ4dvTkMP52Nz4/6 +5GL5ejGOUUF4UYGc97q1lHUxdTPK85ov5QjHdYEKVeUqXFKo7PX7ZOuX6q5IGub6jVTRr6S6m/9V +Xt+0ev7roW39/993A83/dvi/4D7O/ykhcF2z/P3p+cnHg3fhwen74/Hb/We94or3B2snhURm1JAJ +3WnMG8eGO2hazbRDiBJo2WJoYugyOS7rivIbWhZMTTe62RpSlSDNhVDQxRyuH+90Q55kTM5DyFg6 +q/1viSQkDu8mLocrQsytC6PebmhoaGhoaGhoaPxt+AoTIjhIACgAAA== +--=-=-=-- + +--==-=-= +Content-Type: application/pgp-signature; name="signature.asc" + +-----BEGIN PGP SIGNATURE----- +Version: GnuPG v2 + +iQJ8BAEBCgBmBQJWu3GUXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w +ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRFREIyRTc0RjU2RkNGMkI2NzI5N0I3MzUy +NEVDRkY1QUZGNjgzNzBBAAoJECTs/1r/aDcKJm8QAIdV/RVM0hjmNJHIGJ+zkg+t +u4ZXt+OBZBkIVWDZ5Ksu9RrqxwG/jrfipVRp2U+GWGQh2wvsaxY6h1+rt942SEIj +dYOxfEGsOC4Zr5YwmBZVFYzT1Ndp1gt9urKLfKzwrbKq9yW060/AOOoc02lobOIX +rKazE9wl+scJfHDaSfpEzd+Ts5awWlXgkWd1hQTJ2z/8qndFoA+HfdA/DnwW1iI+ +advd9w8c+ZCXx/dRAGh9H3aQBD9tPShh9ceF7Szii9SzJ8SAtzO8pVF8ndMEqZ6c +pxR/hilFhTB1FuLqf8feKflUBAGSBa0pI1ceBDqr+7mbaxS+88ZJcpyIJ3b2Hexc +/yTPaxJPeWgQTSFyHp0WsuEU4FeZTh+tJOKL2yRLLqVKfhZ8oDcPSjzLBvbDpst9 +ytzHOTM/GpwxP+bEFr14zi4wAJANmEmJdfmFxUYJbjI4UEn7R7d6qSvnIjKpJbLa +8tt6NobX8UWtycL6PxdYUVFwL6pAe4tmHp6b4b252Us8jR/OkMP8tYroyha0PICQ +gfbfCvEQ9so1URDcTf6zzZ1Wkg0DG0sL10n7Ujwo7omTmLaMvHhCFxtagEOPmTgq +mcEB/6c5ylVHDicHXTYWx/0XMvgea/NAWDud3DIXyu5dg+tUCk74vFOupusBR0ik +L34eAydJ09C2ZQkele84 +=/P4A +-----END PGP SIGNATURE----- +--==-=-=-- -- 2.26.2