Sender: suvayu ali <fatkasuvayu@gmail.com>
Date: Tue, 16 Oct 2012 16:55:03 +0200
From: Suvayu Ali <fatkasuvayu+linux@gmail.com>
To: notmuch@notmuchmail.org
Subject: Re: nbook: a notmuch based address book written in python
Message-ID: <20121016145503.GC11488@kuru.dyndns-at-home.com>
Mail-Followup-To: notmuch@notmuchmail.org
References: <20120924082646.GA10577@kuru.dyndns-at-home.com>
 <20120925104457.12264.30350@megatron>
 <20121008093429.GC4534@kuru.dyndns-at-home.com>
 <20121013165851.29671.29869@brick.lan>
 <20121015105830.12412.43278@thinkbox.jade-hamburg.de>
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <20121015105830.12412.43278@thinkbox.jade-hamburg.de>
User-Agent: Mutt/1.5.21 (2011-07-01)
\r
I finally had time to go through your response carefully.

On Mon, Oct 15, 2012 at 12:58:30PM +0200, Justus Winter wrote:
\r
> > > > -------------------------------
> > > > [~] time nbook Patrick
> > > > Error opening /home/pazz/mail/gmail/[Google Mail].All Mail/cur/1330682270_0.12958.megatron,U=8766,FMD5=66ff6a8bc18a8a3ac4b311daa93d358a:2,S: Too many open files
> > > > Traceback (most recent call last):
> > > >   File "/home/pazz/bin/nbook", line 167, in <module>
> > > >   File "/home/pazz/bin/nbook", line 71, in __init__
> > > >   File "/home/pazz/.local/lib/python2.7/site-packages/notmuch/message.py", line 233, in get_header
> > > > notmuch.errors.NullPointerError
\r
> > As mentioned before, I think you invalidate the Database object concurrently
> > while your long-running algorithm goes through all messages.
> > Xapian doesn't handle concurrent access to the index like a normal™ database would.
> > This means you are notified by this error that some changes were detected.
> > Maybe the error message should be more telling here though. Teythoon?
\r
> The reason for this error is exactly what the error message says: you
> are opening too many files. Check out this limit using ulimit -n:
>
> This problem is subtle. Here is a minimal test case:
\r
> import notmuch
>
> with notmuch.Database() as db:
>     query = notmuch.Query(db, 'a').search_messages()
>     for msg in query:
>         msg.get_header('from')
>
> with notmuch.Database() as db:
>     query = notmuch.Query(db, 'a').search_messages()
>     for msg in list(query):
>         msg.get_header('from')
\r
> Error opening /home/teythoon/Maildir/.lists.notmuch/cur/1323251462.M53044P18514.thinkbox,S=7306,W=7466:2,: Too many open files
> Traceback (most recent call last):
>   File "test.py", line 11, in <module>
>     msg.get_header('from')
>   File "/home/teythoon/.local/lib/python2.7/site-packages/notmuch/message.py", line 237, in get_header
>     raise NullPointerError()
> notmuch.errors.NullPointerError
\r
> Observe that it blows up in line 11, while the first version works. The
> only difference is that the second version creates a list from the
> notmuch query. This prevents the garbage collector from collecting the
> message objects and thus closing the file handles. So here's your fix:
\r
> diff --git a/nbook b/nbook
> index 387c71d..b3d4fd6 100755
> @@ -173,7 +173,7 @@ class AddressHeaders(object):
>      query = Query(db, 'from:"{0}" or to:"{0}"'.format(querystr))
> -    msgs = list(query.search_messages())
> +    msgs = query.search_messages()
>      addresses = AddressHeaders(msgs, querystr)
\r
This explanation helped me a lot, thanks!
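To convince myself of the garbage-collection point, here is a minimal
sketch that does not touch notmuch at all. FakeMessage and
search_messages are hypothetical stand-ins made up for illustration
(they are not part of the notmuch bindings), and the counting relies on
CPython's reference counting, so other interpreters may behave
differently:

```python
class FakeMessage(object):
    """Hypothetical stand-in for a message object that owns a file handle."""
    open_handles = 0  # handles currently held by live instances
    peak_handles = 0  # highest value open_handles has reached

    def __init__(self):
        FakeMessage.open_handles += 1
        FakeMessage.peak_handles = max(FakeMessage.peak_handles,
                                       FakeMessage.open_handles)

    def __del__(self):
        # Runs when the object is reclaimed, i.e. when the "file" is closed.
        FakeMessage.open_handles -= 1


def search_messages(count):
    """Lazily yield messages one at a time, like iterating a query result."""
    for _ in range(count):
        yield FakeMessage()


def peak_open_handles(count, materialize):
    """Iterate over count messages and report the peak number of open handles."""
    FakeMessage.open_handles = 0
    FakeMessage.peak_handles = 0
    messages = search_messages(count)
    if materialize:
        messages = list(messages)  # keeps every message alive at once
    for msg in messages:
        pass  # a real loop would call msg.get_header('from') here
    return FakeMessage.peak_handles
```

On CPython, peak_open_handles(50, materialize=False) should stay at one
or two, while materialize=True reaches all 50. With real files, that
second pattern is what runs into the ulimit -n limit.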
\r
> A few more comments:
>
> > from notmuch import *
>
> Please avoid * imports; they prevent tools like pyflakes from checking
> whether you accidentally misspelled any identifiers.

Point taken. I'll be more careful in the future. :)
\r
> > pyversion = float('%d.%d' % (sys.version_info.major, sys.version_info.minor))
> > if pyversion < 2.7:
>
> Converting this to float feels wrong. Consider doing something like
>
> if sys.version_info.major > 2 or (sys.version_info.major == 2 and sys.version_info.minor >= 7):

I incorporated these suggestions too.
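As far as I understand, the same check can also be written as a tuple
comparison, since tuples compare element by element. The sketch below
is only an illustration; is_supported is a hypothetical helper name,
not something taken from nbook:

```python
import sys


def is_supported(version_info, minimum=(2, 7)):
    """Return True if version_info is at least the (major, minor) minimum."""
    # Tuples compare element by element, so (3, 0) and (2, 10) both pass.
    return tuple(version_info[:2]) >= minimum


# Check the running interpreter.
running_is_supported = is_supported(sys.version_info)
```

This also shows why the float conversion is fragile: float('2.10') is
2.1, which compares as older than 2.7, whereas the tuple (2, 10) does
not.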
\r
> > print '`nbook\' needs Python 2.7 or higher for argparse'
>
> Note that in py3k print is a function and not a statement, so you need
> to use parentheses. Consider dropping this at the beginning of all your
> Python files to make py2.7 use the new features:
>
> from __future__ import print_function, absolute_import, unicode_literals
>
> exit is not a builtin function. You have to use sys.exit. Tools like
> pyflakes can spot this kind of mistake. sys.exit also accepts a
> string as argument, which it prints to stderr before exiting with an

I will read up some more about the above suggestions and update
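For my own notes, a small sketch of the sys.exit suggestion.
require_python is a hypothetical helper name chosen for illustration;
the only library behaviour it relies on is that sys.exit raises
SystemExit and, when given a string, prints it to stderr and exits with
status 1:

```python
import sys

MESSAGE = "`nbook' needs Python 2.7 or higher for argparse"


def require_python(version_info=sys.version_info, minimum=(2, 7)):
    """Leave the program with MESSAGE if the interpreter is too old."""
    if tuple(version_info[:2]) < minimum:
        # A string argument is written to stderr; the exit status is 1.
        sys.exit(MESSAGE)
```

Calling require_python() once at the top of the script replaces the
separate print and exit calls.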
\r
> > self.__fromhdr__ += ',' + msg.get_header('from')
>
> Hm, this is somewhat unpythonic. It used to be the case that building
> strings this way was a lot slower than building a list and then
> joining it on a delimiter of your choice
> (i.e. ','.join(from_headers)). This is (was?) because strings are
> immutable in Python and constantly creating strings just to throw them
> away in the next iteration puts a lot of pressure on the memory
> management system. Somewhat recent discussion here:
>
> http://stackoverflow.com/questions/1316887/what-is-the-most-efficient-string-concatenation-method-in-python
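The list-and-join approach described above could look roughly like
this. It is only a sketch: join_from_headers is a hypothetical helper,
and it takes plain header strings rather than notmuch message objects:

```python
def join_from_headers(headers):
    """Collect the non-empty From headers and join them with commas."""
    collected = []
    for header in headers:
        if header:  # skip messages without a From header
            collected.append(header)
    # One join at the end instead of a new string on every iteration.
    return ','.join(collected)
```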
\r
I had a commit with ','.join(..) in a private branch, but thanks for
pointing out the reasons and the links to the discussion. This was very
\r
> > def print_addrs(self, fmtstr='', query=''):
> >     if '' == fmtstr: fmtstr = '%s %s\n'
>
> Ok, several things here:
>
> * The comparison looks weird: you are using the string constant as the
>   first operand. While this is technically not wrong, it is somewhat
>   unpythonic b/c if you read it out loud (''if the empty string is
>   equal to fmtstr'') it somewhat bends the 1:1 mapping of the semantics
>   of your program and the English sentence. It looks like this C hack
>   that is actually unnecessary in Python b/c you cannot use the
>   assignment operator as a value (except for a=b=c=0 style

Yes, you are correct. I'm more used to C/C++, and the reason you mention
is why I tend to write comparisons like that. I'll retrain my fingers
for Python from now on.

> * Please don't put multiple statements in one line.

I will keep that in mind for the future.

> * This can be written shorter and more idiomatic (yay keyword
>
> def print_addrs(self, fmtstr='%s %s\n', query=''):

That was silly of me not to do that in the first place! :-p
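A stripped-down sketch of the default-argument version, outside the
class. format_addrs and the (name, email) pairs are assumptions made
for illustration, since the real method works on data this mail does
not show:

```python
def format_addrs(addresses, fmtstr='%s %s\n'):
    """Format (name, email) pairs using fmtstr, one expansion per pair."""
    # The default lives in the signature, so no empty-string check is needed.
    return ''.join(fmtstr % (name, email) for name, email in addresses)
```

Callers that want another layout pass fmtstr explicitly, for example
format_addrs(pairs, fmtstr='%s <%s>\n').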
\r
Thank you so much for this incredibly informative response. I learned

Open source is the future. It sets us free.
\r