Date: Thu, 24 Oct 2013 17:08:37 -0400
From: Austin Clements <amdragon@MIT.EDU>
To: notmuch@notmuchmail.org
Subject: Re: [PATCH] new: Don't scan unchanged directories with no
Message-ID: <20131024210837.GH20337@mit.edu>
References: <1382646822-24556-1-git-send-email-amdragon@mit.edu>
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <1382646822-24556-1-git-send-email-amdragon@mit.edu>
User-Agent: Mutt/1.5.21 (2010-09-15)

There might be a problem with this patch.  Directory entries that are
*symlinks* to other directories do not increase the containing
directory's link count, but we do count them as directories in
add_files pass 1 and traverse into them.  Hence, if you had a
directory that contained no sub-directories, but did contain symlinks
to other directories, we would fail to notice changes in the symlinked
directories.

We could check if the database thinks there are sub-directories and
only bail early if the directory is unchanged and *both* the file
system and the database think there are no sub-directories.
\r
Quoth myself on Oct 24 at 4:33 pm:
> This can substantially reduce the cost of notmuch new in some
> situations, such as when the file system cache is cold or when the
> Maildir is on NFS.
>  notmuch-new.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> diff --git a/notmuch-new.c b/notmuch-new.c
> index faa33f1..364c73a 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -323,6 +323,26 @@ add_files (notmuch_database_t *notmuch,
>      db_mtime = directory ? notmuch_directory_get_mtime (directory) : 0;
> +    /* If the directory is unchanged from our last scan and has no
> +     * sub-directories, then return without scanning it at all.  In
> +     * some situations, skipping the scan can substantially reduce the
> +     * cost of notmuch new, especially since the huge numbers of files
> +     * in Maildirs make scans expensive, but all files live in leaf
> +     * directories.
> +     *
> +     * To check for sub-directories, we borrow a trick from find,
> +     * kpathsea, and many other UNIX tools: since a directory's link
> +     * count is the number of sub-directories (specifically, their
> +     * '..' entries) plus 2 (the link from the parent and the link for
> +     * '.').  This check is safe even on weird file systems, since
> +     * file systems that can't compute this will return 0 or 1.  This
> +     * is safe even on *really* weird file systems like HFS+ that
> +     * mistakenly return the total number of directory entries, since
> +     * that only inflates the count beyond 2.
> +     */
> +    if (directory && fs_mtime == db_mtime && st.st_nlink == 2)
>      /* If the database knows about this directory, then we sort based
>       * on strcmp to match the database sorting.  Otherwise, we can do
>       * inode-based sorting for faster filesystem operation. */