Return-Path:
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
 by arlo.cworth.org (Postfix) with ESMTP id 7DA916DE179D
 for ; Wed, 10 Feb 2016 09:21:34 -0800 (PST)
X-Virus-Scanned: Debian amavisd-new at cworth.org
X-Spam-Flag: NO
X-Spam-Score: -0.017
X-Spam-Level:
X-Spam-Status: No, score=-0.017 tagged_above=-999 required=5
 tests=[AWL=-0.017] autolearn=disabled
Received: from arlo.cworth.org ([127.0.0.1])
 by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id Bl9KNKXl6Bxs for ; Wed, 10 Feb 2016 09:21:32 -0800 (PST)
Received: from che.mayfirst.org (che.mayfirst.org [209.234.253.108])
 by arlo.cworth.org (Postfix) with ESMTP id 4569E6DE13DA
 for ; Wed, 10 Feb 2016 09:21:32 -0800 (PST)
Received: from fifthhorseman.net (unknown [38.109.115.130])
 by che.mayfirst.org (Postfix) with ESMTPSA id 769BCF997;
 Wed, 10 Feb 2016 12:21:28 -0500 (EST)
Received: by fifthhorseman.net (Postfix, from userid 1000)
 id E61661FF75; Wed, 10 Feb 2016 12:21:27 -0500 (EST)
From: Daniel Kahn Gillmor
To: Jameson Graef Rollins , Notmuch Mail
Subject: Re: [PATCH v3 15/16] added notmuch_message_reindex
In-Reply-To: <87oabpnzt4.fsf@alice.fifthhorseman.net>
References: <1454272801-23623-1-git-send-email-dkg@fifthhorseman.net>
 <1454272801-23623-16-git-send-email-dkg@fifthhorseman.net>
 <87mvr9s8gy.fsf@servo.finestructure.net>
 <87oabpnzt4.fsf@alice.fifthhorseman.net>
User-Agent: Notmuch/0.21+72~gd8c4f1c (http://notmuchmail.org) Emacs/24.5.1
 (x86_64-pc-linux-gnu)
Date: Wed, 10 Feb 2016 12:21:24 -0500
Message-ID: <871t8ko50r.fsf@alice.fifthhorseman.net>
MIME-Version: 1.0
Content-Type: multipart/signed; boundary="==-=-=";
 micalg=pgp-sha512; protocol="application/pgp-signature"
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.20
Precedence: list
List-Id: "Use and development of the notmuch mail system."
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Wed, 10 Feb 2016 17:21:34 -0000

--==-=-=
Content-Type: multipart/mixed; boundary="=-=-="

--=-=-=
Content-Type: text/plain
Content-Transfer-Encoding: quoted-printable

On Tue 2016-02-09 20:01:43 -0500, Daniel Kahn Gillmor wrote:
>> I just wanted to mention that I think there's a problem with the reindex
>> functionality introduced in this patch (or in 16/16). It looks like
>> this function irrevocably busts apart threads. dkg and I are
>> investigating.
>
> it doesn't appear to be irrevocable to me, but it is definitely doing
> something weird with threading.

OK, this is definitely tickling some problems with threading, but those
are problems that are present already in existing versions of notmuch,
unrelated to this series.

When removing a message from the database, its earlier presence doesn't
become a ghost message, and as a result anything that points to it
doesn't get assembled into the prior thread properly.

The attached tarball has a python test showing this behavior with a
simple thread of two messages:

0 dkg@frigg:~/src/notmuch/threading-test$ ./run-test
Found 2 total files (that's not much mail).
Processed 2 total files in almost no time.
Added 2 new messages to the database.
Threads: 1
removing and re-adding a@example.com
Threads: 2
removing and re-adding b@example.com
Threads: 1
0 dkg@frigg:~/src/notmuch/threading-test$

the relevant python function is:

def remove_and_readd(db, mid):
    print('removing and re-adding', mid)
    m = db.find_message(mid)
    f = m.get_filename()
    db.remove_message(f)
    db.add_message(f)

I think when a message is removed from the database, we need to know
whether anything else (in its same thread?) refers to it. If so, we
should keep it around as a ghost message instead of fully removing it.

does this sound like the right approach?
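As a rough illustration of the proposal above, here is a toy in-memory
model of the idea. It is only a sketch, not notmuch's database code or
its Python bindings: ToyIndex, keep_ghosts, and run are hypothetical
names, and it assumes that b@example.com is a reply to a@example.com. A
removed message that is still referenced keeps its thread mapping (the
"ghost"), so re-adding it lands in the original thread.

```python
import itertools


class ToyIndex:
    """Toy stand-in for a mail index (hypothetical, not notmuch's model)."""

    def __init__(self, keep_ghosts):
        self.keep_ghosts = keep_ghosts
        self.thread_of = {}   # message-id -> thread id (real and ghost entries)
        self.parent_of = {}   # message-id -> parent message-id (real messages only)
        self._next_thread = itertools.count(1)

    def add(self, mid, parent=None):
        # Prefer a thread already recorded for this id (a ghost), then the
        # parent's thread, and only then allocate a fresh thread.
        tid = self.thread_of.get(mid)
        if tid is None and parent is not None:
            tid = self.thread_of.get(parent)
        if tid is None:
            tid = next(self._next_thread)
        self.thread_of[mid] = tid
        self.parent_of[mid] = parent

    def remove(self, mid):
        del self.parent_of[mid]
        still_referenced = mid in self.parent_of.values()
        if not (self.keep_ghosts and still_referenced):
            # Full removal: the thread mapping for this id is forgotten.
            del self.thread_of[mid]
        # Otherwise the thread_of entry stays behind as a ghost.

    def thread_count(self):
        return len({self.thread_of[m] for m in self.parent_of})


def run(keep_ghosts):
    """Replay the two-message test and return the thread count after each step."""
    idx = ToyIndex(keep_ghosts)
    idx.add("a@example.com")
    idx.add("b@example.com", parent="a@example.com")
    counts = [idx.thread_count()]
    for mid, parent in [("a@example.com", None),
                        ("b@example.com", "a@example.com")]:
        idx.remove(mid)
        idx.add(mid, parent)
        counts.append(idx.thread_count())
    return counts


print(run(keep_ghosts=False))  # [1, 2, 1]
print(run(keep_ghosts=True))   # [1, 1, 1]
```

Without ghosts the toy reproduces the 1, 2, 1 thread counts from the
test run; with ghosts the count stays at 1 throughout.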
--dkg --=-=-= Content-Type: application/x-gtar-compressed Content-Disposition: attachment; filename=threading-test.tgz Content-Transfer-Encoding: base64 H4sIAAAAAAACA+1YW2/TMBTus3+FGUhtEUlz71YxBNoF+jAmjSEeIyc+aYOaOHOcbX3ht3OaZhO9 DARsGRd/L0l8Lo5z/H12XLDrcDITEZuFU2AcZOf+YSGCIKiviPUrGr2O7QQ++gydwMZ22/NdtzPp tICqVExil1II9T2/H9nXB/eXwHdoLLIMcrVvc8+y+S5YfGjx3Thme47LbAhc5rnRrmvbzl7sscAi HY1/BmoqkfZpPjEUlGrwIH0s+DAc+nfzH+/X+W85dsfX/G+7/uYkVekkFxLuXf+9u+vvbeh/gDOm Y+n6PzhyobIqnhqxyJN0Qp5/IRmUJZtAOTAbm9b7/0f/b4v/yPrvB66l9f8R6x9XcvCI9Q88vf4/ ev1ZS+u/42yu/37g6PW/BZwsC26M+Yiy13DNsmIGJv4Skg9V9Blihc2MkUOmYEQ/AX9BbYseQ0Qd LBLej+zhyN6jhuVbFjmWIkP/WRrDSqpzMaKRiFbayDuQQNOSqinQJJWlos3k0/uNP4T/UUv8d+1g C/99zf+W+R9t538URT/F/3WmL9i/qQlnkKAA5DGUI/pyRXlekXFunEExmxuL0HXjinCUgD8uXCvH ffFfgrHaYhbzB9//DYMN/g+HruZ/G3j6ZFCVKPVpPoD8khZzNRW5S0iaFUIqenME0DyKkhAOCZWQ iUsIWc7DxWThPR69oFnK+yNCEYVMc9Xr1l44kSj6YYiBjvjUXXrWjhndpzwykxQTNRzu3RoTNGbm BFSYpDPIWQa9pQEjmhe4iUluDdjHt62ER5ilGYWJKsYiVmIfgsP+eqt5cnp4ZJ4dvTkMP52Nz4/6 5GL5ejGOUUF4UYGc97q1lHUxdTPK85ov5QjHdYEKVeUqXFKo7PX7ZOuX6q5IGub6jVTRr6S6m/9V Xt+0ev7roW39/993A83/dvi/4D7O/ykhcF2z/P3p+cnHg3fhwen74/Hb/We94or3B2snhURm1JAJ 3WnMG8eGO2hazbRDiBJo2WJoYugyOS7rivIbWhZMTTe62RpSlSDNhVDQxRyuH+90Q55kTM5DyFg6 q/1viSQkDu8mLocrQsytC6PebmhoaGhoaGhoaPxt+AoTIjhIACgAAA== --=-=-=-- --==-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- Version: GnuPG v2 iQJ8BAEBCgBmBQJWu3GUXxSAAAAAAC4AKGlzc3Vlci1mcHJAbm90YXRpb25zLm9w ZW5wZ3AuZmlmdGhob3JzZW1hbi5uZXRFREIyRTc0RjU2RkNGMkI2NzI5N0I3MzUy NEVDRkY1QUZGNjgzNzBBAAoJECTs/1r/aDcKJm8QAIdV/RVM0hjmNJHIGJ+zkg+t u4ZXt+OBZBkIVWDZ5Ksu9RrqxwG/jrfipVRp2U+GWGQh2wvsaxY6h1+rt942SEIj dYOxfEGsOC4Zr5YwmBZVFYzT1Ndp1gt9urKLfKzwrbKq9yW060/AOOoc02lobOIX rKazE9wl+scJfHDaSfpEzd+Ts5awWlXgkWd1hQTJ2z/8qndFoA+HfdA/DnwW1iI+ advd9w8c+ZCXx/dRAGh9H3aQBD9tPShh9ceF7Szii9SzJ8SAtzO8pVF8ndMEqZ6c pxR/hilFhTB1FuLqf8feKflUBAGSBa0pI1ceBDqr+7mbaxS+88ZJcpyIJ3b2Hexc 
/yTPaxJPeWgQTSFyHp0WsuEU4FeZTh+tJOKL2yRLLqVKfhZ8oDcPSjzLBvbDpst9 ytzHOTM/GpwxP+bEFr14zi4wAJANmEmJdfmFxUYJbjI4UEn7R7d6qSvnIjKpJbLa 8tt6NobX8UWtycL6PxdYUVFwL6pAe4tmHp6b4b252Us8jR/OkMP8tYroyha0PICQ gfbfCvEQ9so1URDcTf6zzZ1Wkg0DG0sL10n7Ujwo7omTmLaMvHhCFxtagEOPmTgq mcEB/6c5ylVHDicHXTYWx/0XMvgea/NAWDud3DIXyu5dg+tUCk74vFOupusBR0ik L34eAydJ09C2ZQkele84 =/P4A -----END PGP SIGNATURE----- --==-=-=--