Re: [PATCH 0/3] Speed up notmuch new for unchanged directories
author Austin Clements <amdragon@MIT.EDU>
Mon, 25 Jun 2012 17:59:15 +0000 (13:59 -0400)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 17:47:48 +0000 (09:47 -0800)
f5/cc0cb326e34683965a34a0e4810dbb5d4ff796 [new file with mode: 0644]

diff --git a/f5/cc0cb326e34683965a34a0e4810dbb5d4ff796 b/f5/cc0cb326e34683965a34a0e4810dbb5d4ff796
new file mode 100644 (file)
index 0000000..0bd180c
--- /dev/null
@@ -0,0 +1,103 @@
+Return-Path: <amdragon@mit.edu>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 15A6C431FB6\r
+       for <notmuch@notmuchmail.org>; Mon, 25 Jun 2012 10:59:22 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.7\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5\r
+       tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id cKxtNjxO4jYF for <notmuch@notmuchmail.org>;\r
+       Mon, 25 Jun 2012 10:59:20 -0700 (PDT)\r
+Received: from dmz-mailsec-scanner-2.mit.edu (DMZ-MAILSEC-SCANNER-2.MIT.EDU\r
+       [18.9.25.13])\r
+       by olra.theworths.org (Postfix) with ESMTP id E24BA431FAF\r
+       for <notmuch@notmuchmail.org>; Mon, 25 Jun 2012 10:59:19 -0700 (PDT)\r
+X-AuditID: 1209190d-b7fd56d000000933-cd-4fe8a6f776df\r
+Received: from mailhub-auth-1.mit.edu ( [18.9.21.35])\r
+       by dmz-mailsec-scanner-2.mit.edu (Symantec Messaging Gateway) with SMTP\r
+       id 3C.53.02355.7F6A8EF4; Mon, 25 Jun 2012 13:59:19 -0400 (EDT)\r
+Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])\r
+       by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id q5PHxIw9018999; \r
+       Mon, 25 Jun 2012 13:59:18 -0400\r
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])\r
+       (authenticated bits=0)\r
+       (User authenticated as amdragon@ATHENA.MIT.EDU)\r
+       by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q5PHxG4x010040\r
+       (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT);\r
+       Mon, 25 Jun 2012 13:59:17 -0400 (EDT)\r
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77)\r
+       (envelope-from <amdragon@mit.edu>)\r
+       id 1SjDZE-0004TW-00; Mon, 25 Jun 2012 13:59:16 -0400\r
+From: Austin Clements <amdragon@MIT.EDU>\r
+To: Sascha Silbe <sascha-ml-reply-to-2012-3@silbe.org>,\r
+       notmuch <notmuch@notmuchmail.org>\r
+Subject: Re: [PATCH 0/3] Speed up notmuch new for unchanged directories\r
+In-Reply-To: <1340555366-25891-1-git-send-email-sascha-pgp@silbe.org>\r
+References: <1340555366-25891-1-git-send-email-sascha-pgp@silbe.org>\r
+User-Agent: Notmuch/0.12+132~gf2f390b (http://notmuchmail.org) Emacs/23.3.1\r
+       (i486-pc-linux-gnu)\r
+Date: Mon, 25 Jun 2012 13:59:15 -0400\r
+Message-ID: <87pq8n1de4.fsf@awakening.csail.mit.edu>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=us-ascii\r
+X-Brightmail-Tracker:\r
+ H4sIAAAAAAAAA+NgFnrEIsWRmVeSWpSXmKPExsUixCmqrPt92Qt/g58HeSyu35zJbPH22Q1G\r
+       ByaPZ6tuMXts/PuDJYApissmJTUnsyy1SN8ugStjVvNr9oKVPBWHz81gbWD8ytnFyMkhIWAi\r
+       sX1TDzuELSZx4d56ti5GLg4hgX2MEpfunmCCcDYwSvTsfc0C4Zxkkljxci0zhLOEUaKpYxUj\r
+       SD+bgIbEtv3LwWwRgSSJR0da2UBsYQF3iYVvLjCB2JwCrhIzHk4EiwsJuEjsWncGrF5UIF7i\r
+       T+9msDiLgKrEg89XWEFsXqD7bl04zg5hC0qcnPmEBcRmFtCSuPHvJdMERoFZSFKzkKQWMDKt\r
+       YpRNya3SzU3MzClOTdYtTk7My0st0jXSy80s0UtNKd3ECA5KSd4djO8OKh1iFOBgVOLh9ah/\r
+       4S/EmlhWXJl7iFGSg0lJlHfREqAQX1J+SmVGYnFGfFFpTmrxIUYJDmYlEd4T84FyvCmJlVWp\r
+       RfkwKWkOFiVx3ispN/2FBNITS1KzU1MLUotgsjIcHEoSvFzA6BMSLEpNT61Iy8wpQUgzcXCC\r
+       DOcBGi67FGR4cUFibnFmOkT+FKOilDivNEizAEgiozQPrheWNF4xigO9IszLDVLFA0w4cN2v\r
+       gAYzAQ1uPfAMZHBJIkJKqoGxaYGhq+uZF+uues8stvSe901md2n9dsZt+9yDGr9f+z4528DN\r
+       846zSltLesXZtywJn5rWzLPpTytlrXtUZDVNy+XhjpPWSWoHTXbOj57dGTBPOuvenE2inaJ5\r
+       6t2SO4TM1bfofzrhkWger2sTfvddQ2Tpc/Xg5y8aXY4HFxhK/8pve+5XNVGJpTgj0VCLuag4\r
+       EQA28Z169QIAAA==\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Mon, 25 Jun 2012 17:59:22 -0000\r
+\r
+On Sun, 24 Jun 2012, Sascha Silbe <sascha-pgp@silbe.org> wrote:\r
+> All the time I thought what makes "notmuch new" so abysmally slow is the\r
+> stat() for each maildir. But as it continued to be slow even after I\r
+> moved most mails out of 'new' (into 'new-20120624'), I strace'd notmuch\r
+> and noticed it listed even unchanged directories, thereby listing and\r
+> iterating over every single one of the 900k mails in my mail store.\r
+>\r
+> There's still quite some room for further improvements as it continues\r
+> to take several minutes to scan < 100 new mails in changed directories\r
+> containing < 1000 mails in total. Even the rsync run that fetches the\r
+> new mails is faster.\r
+\r
+I haven't looked over your patches yet, but this result surprises me.\r
+Could you explain your setup a little more?  How much mail do you have\r
+and across how many directories?  What file system are you using?\r
+\r
+I'm also surprised that your new approach helps.  This directory listing\r
+has to be read off disk one way or the other, but listing directories is\r
+the bread-and-butter of file systems, whereas I would think that Xapian\r
+would require more IO to accomplish the same effect.  Does your patch\r
+win because you can specifically list subdirectories out of Xapian,\r
+making the IO proportional to the number of subdirectories instead of\r
+the number of subdirectories and files (even though the constant factors\r
+probably favor reading from the file system)?\r
+\r
+I like the idea of these patches, I just want to make sure I have a firm\r
+grip on what's being optimized and why it wins.\r