From: Austin Clements
Date: Sun, 20 Apr 2014 17:46:01 +0000 (+2000)
Subject: Re: excessive thread fusing
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=e39a4824e0e6507d04bd43bf84041468d169e02c;p=notmuch-archives.git

Re: excessive thread fusing
---

diff --git a/e4/b6d4f2d5f55220c8ae4b70d5433562007e0967 b/e4/b6d4f2d5f55220c8ae4b70d5433562007e0967
new file mode 100644
index 000000000..b7ec492a7
--- /dev/null
+++ b/e4/b6d4f2d5f55220c8ae4b70d5433562007e0967
@@ -0,0 +1,140 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 20DD3431FBD
+ for ; Sun, 20 Apr 2014 10:46:14 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.7
+X-Spam-Level:
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5
+ tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id 1wJVgg0Txobt for ;
+ Sun, 20 Apr 2014 10:46:06 -0700 (PDT)
+Received: from dmz-mailsec-scanner-6.mit.edu (dmz-mailsec-scanner-6.mit.edu
+ [18.7.68.35])
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id A1C39431FBC
+ for ; Sun, 20 Apr 2014 10:46:06 -0700 (PDT)
+X-AuditID: 12074423-f79476d000000c51-ca-535407dc81d5
+Received: from mailhub-auth-1.mit.edu ( [18.9.21.35])
+ (using TLS with cipher AES256-SHA (256/256 bits))
+ (Client did not present a certificate)
+ by dmz-mailsec-scanner-6.mit.edu (Symantec Messaging Gateway) with SMTP
+ id 06.66.03153.CD704535; Sun, 20 Apr 2014 13:46:04 -0400 (EDT)
+Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11])
+ by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id s3KHk3xc005104
+ for ; Sun, 20 Apr 2014 13:46:04 -0400
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])
+ (authenticated bits=0)
+ (User authenticated as amdragon@ATHENA.MIT.EDU)
+ by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id s3KHk1jk021826
+ (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT)
+ for ; Sun, 20 Apr 2014 13:46:03 -0400
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80)
+ (envelope-from ) id 1WbvoX-0005zb-D0
+ for notmuch@notmuchmail.org; Sun, 20 Apr 2014 13:46:01 -0400
+Date: Sun, 20 Apr 2014 13:46:01 -0400
+From: Austin Clements
+To: notmuch@notmuchmail.org
+Subject: Re: excessive thread fusing
+Message-ID: <20140420174601.GC25817@mit.edu>
+References: <87ioq5mrbz.fsf@maritornes.cs.unb.ca>
+ 
+ <20140419210439.GC1797@sid.nuvreauspam>
+ <20140420164812.GB25817@mit.edu>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Disposition: inline
+In-Reply-To: <20140420164812.GB25817@mit.edu>
+User-Agent: Mutt/1.5.21 (2010-09-15)
+X-Brightmail-Tracker:
+ H4sIAAAAAAAAA+NgFnrPIsWRmVeSWpSXmKPExsUixCmqrHuHPSTY4Px+RYvrN2cyOzB6PFt1
+ izmAMYrLJiU1J7MstUjfLoEr4/aynywFc0Uqnm38xtLA+JS/i5GTQ0LAROLGpsPMELaYxIV7
+ 69m6GLk4hARmM0lcezKFHcI5zyhx8fFqKOclk8SeZVegnEOMElf/HGUD6WcRUJWYtHUGK4jN
+ JqAhsW3/ckYQW0RAWmLn3dlgcWEBFYmNZ2Yzgdi8AjoS+x8sgNq3hFGi5dhkRoiEoMTJmU9Y
+ QGxmAS2JG/9eAjVwANnSEsv/cYCEOQV0JZrmPQG7WxRo5pST29gmMArOQtI9C0n3LITuBYzM
+ qxhlU3KrdHMTM3OKU5N1i5MT8/JSi3TN9HIzS/RSU0o3MYLCld1FeQfjn4NKhxgFOBiVeHgn
+ fAsKFmJNLCuuzD3EKMnBpCTK+4UpJFiILyk/pTIjsTgjvqg0J7UYGCAczEoivCdfBgcL8aYk
+ VlalFuXDpKQ5WJTEed9aWwULCaQnlqRmp6YWpBbBZGU4OJQkeNuAcSkkWJSanlqRlplTgpBm
+ 4uAEGc4DNNwZpIa3uCAxtzgzHSJ/ilFRSpy3kg0oIQCSyCjNg+uFpZNXjOJArwjz9oG08wBT
+ EVz3K6DBTECD/54JABlckoiQkmpgNNGofVZ4ql+0a4NirNoZs7oeO+Wuz8mut4TKlf5bFwnV
+ Tgya9zeMX0rPqXav38wTLH6qO9dvXnfwjPvmx85LN+u4Bne8cJVtzeVieScX4KZ0LCKwYLu9
+ yZfWU9O6dbnkjDvuXfSs/Bau2j5ZR99tU++FzJt6lsYJymtMyt48+DpnUs7xUCklluKMREMt
+ 5qLiRAAwfaQ3AgMAAA==
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+ 
+List-Unsubscribe: ,
+ 
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+ 
+X-List-Received-Date: Sun, 20 Apr 2014 17:46:14 -0000
+
+Quoth myself on Apr 20 at 12:48 pm:
+> Quoth Andrei POPESCU on Apr 20 at 12:04 am:
+> > On Sb, 19 apr 14, 18:52:02, Eric wrote:
+> > >
+> > > This may not actually be any help, but both hypermail and mhonarc agree
+> > > that two messages form a separate thread from the rest. I believe that
+> > > the latter, at least, is the JWZ algorithm.
+> >
+> > mutt concurs.
+>
+> Can anyone explain why JWZ *doesn't* have the same problem? I don't
+> see how this heuristic doesn't doom it to the same fate:
+>
+>   The References field is populated from the ``References'' and/or
+>   ``In-Reply-To'' headers. If both headers exist, take the first thing
+>   in the In-Reply-To header that looks like a Message-ID, and append
+>   it to the References header.
+>
+> Given this, even considering only messages 18 and 52 (which "should"
+> be in different threads), JWZ should find the common "parent"
+> e.fraga@ucl.ac.uk and link them in to the same thread:
+>
+> Add 18 (step 1)
+> - The combined "references" list is
+> - Creates and links containers 17 <- e.fraga@ucl.ac.uk <- 18 where the
+>   first two are empty
+>
+> Add 52 (step 1)
+> - The combined "references" list is
+>
+> - Creates and links containers 31 <- 32 <- 39
+> - Also considers container e.fraga@ucl.ac.uk, but this is already
+>   linked, so it doesn't change it
+> - Creates container 52 and links e.fraga@ucl.ac.uk <- 52 (step 1C)
+>
+> 18 and 52 will later get promoted over their empty parent (step 4),
+> but will remain in the same thread.
+>
+> What am I missing? Or are these other MUAs not using pure JWZ?
+
+I dug in to mutt's mutt_sort_threads a bit. It's not using JWZ,
+though it's something similar. The most salient thing may be how it
+handles in-reply-to and references:
+
+1. If a message has both in-reply-to and references, the parent chain
+   is the *last* in-reply-to ID and then the references from right to
+   left (skipping the last reference ID if it's the same as the last
+   in-reply-to ID). (See also mutt_parse_references.)
+2. If a message has only in-reply-to, the parent chain is *all* of the
+   IDs in in-reply-to *from right to left* (e.g., the right-most one
+   is the immediate parent).
+3. If a message has only references, the parent chain is that, from
+   right to left.
+
+Like JWZ, mutt creates and links together "empty containers" as it
+scans the parent chain towards the root, though unlike JWZ it stops
+when it finds a non-empty container or a container that already has a
+parent.
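
The three parent-chain rules in the message can be sketched roughly as
follows. This is only an illustration of the rules as described there,
not mutt's actual code: the function name parent_chain, its list-based
arguments, and the message IDs in the usage lines are all hypothetical,
and the headers are assumed to be already parsed into lists of IDs in
header order.

```python
from typing import List


def parent_chain(in_reply_to: List[str], references: List[str]) -> List[str]:
    """Return candidate ancestors, immediate parent first.

    Hypothetical helper: both arguments are lists of message IDs in the
    order they appear in the respective header.
    """
    if in_reply_to and references:
        # Rule 1: the last In-Reply-To ID comes first, then References
        # from right to left, skipping the last reference if it is the
        # same ID as the last In-Reply-To ID.
        head = in_reply_to[-1]
        refs = references[:-1] if references[-1] == head else references
        return [head] + refs[::-1]
    if in_reply_to:
        # Rule 2: only In-Reply-To, so use all of its IDs right to left.
        return in_reply_to[::-1]
    # Rule 3: only References (or neither header), right to left.
    return references[::-1]


# Usage with made-up IDs: the duplicate "b@x" is dropped from References.
chain = parent_chain(["a@x", "b@x"], ["r1@x", "r2@x", "b@x"])
print(chain)
```

The sketch covers only how the chain is ordered. The container linking
described in the last paragraph (stopping at a non-empty container or
one that already has a parent) is left out.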