From: Jameson Graef Rollins <jrollins@finestructure.net>
To: Austin Clements <amdragon@MIT.EDU>, notmuch@notmuchmail.org
Subject: Re: [PATCH v3 0/6] Make Emacs search use sexp format
In-Reply-To: <87bo7mtp79.fsf@awakening.csail.mit.edu>
References: <1370047208-12785-1-git-send-email-amdragon@mit.edu>
 <87sj12yqyu.fsf@maritornes.cs.unb.ca>
 <87r4gk8qa5.fsf@servo.finestructure.net>
 <87bo7mtp79.fsf@awakening.csail.mit.edu>
User-Agent: Notmuch/0.15.2+155~g7fa0560 (http://notmuchmail.org) Emacs/24.3.1
 (x86_64-pc-linux-gnu)
Date: Wed, 05 Jun 2013 08:21:59 -0700
Message-ID: <87a9n460rs.fsf@servo.finestructure.net>
Content-Type: multipart/signed; boundary="=-=-=";
 micalg=pgp-sha256; protocol="application/pgp-signature"
Cc: tomi.ollila@iki.fi

On Mon, Jun 03 2013, Austin Clements <amdragon@MIT.EDU> wrote:
>> * Killing a search buffer that is still in the process of being filled
>>   causes errors to be thrown. I'm seeing both of the following

>> [Sun Jun 2 08:26:40 2013]
>> notmuch exited with status killed
>> command: notmuch search --format\=sexp --format-version\=1 --sort\=newest-first to\:jrollins
>> exit signal: killed

>> [Sun Jun 2 08:32:26 2013]
>> notmuch exited with status hangup
>> command: notmuch search --format\=sexp --format-version\=1 --sort\=newest-first to\:jrollins
>> exit signal: hangup

>> This is somewhat understandable, as the notmuch binary exits with an
>> error if it hasn't finished dumping the output, but given how common
>> this particular scenario is I think we should try to avoid throwing
>> errors in this circumstance. I wonder if we shouldn't just modify the
>> binary to not return non-zero if it was manually killed while
>> processing the output, or at least special-case the particular error
>> caused by manually killing the search.

> Your assessment is correct, of course. The right place to fix this is
> in Emacs, not the CLI (the CLI *can't* do anything about this, since it
> gets killed by a signal). Probably we should do something different in
> the sentinel if the search process's buffer is no longer live. Clearly
> we should suppress the status error for the signal, but I think we still
> should report anything that appeared in err-file because it may be
> relevant to why the user killed the buffer (e.g., maybe a notmuch
> wrapper was blocked on something).

That seems like a reasonable approach to me, to suppress the error but
continue to report in *Notmuch errors* buffer.
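
As a rough illustration of the approach agreed above, here is a minimal Python sketch (a language-neutral model, not notmuch or Emacs code). It assumes a POSIX system. The child program, the message text, and the names `run_and_hang_up` and `messages_for_exit` are all hypothetical. It shows that a process ended by SIGHUP has no chance to choose its own exit status, and that the parent can still drop the status error while keeping whatever the child wrote to its error stream.

```python
import signal
import subprocess
import sys

# Hypothetical stand-in for a notmuch wrapper: it writes one line to stderr
# and then blocks. SIGHUP is reset to its default action first, in case the
# parent environment ignores it.
CHILD_SOURCE = (
    "import signal, sys, time\n"
    "signal.signal(signal.SIGHUP, signal.SIG_DFL)\n"
    "sys.stderr.write('wrapper blocked on lock\\n')\n"
    "sys.stderr.flush()\n"
    "time.sleep(60)\n"
)


def run_and_hang_up():
    """Start the child, send it SIGHUP, and return (returncode, stderr text)."""
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stderr=subprocess.PIPE,
        text=True,
    )
    first_line = child.stderr.readline()  # wait until the child has reported
    child.send_signal(signal.SIGHUP)      # what killing the buffer triggers
    _, rest = child.communicate()
    return child.returncode, first_line + rest


def messages_for_exit(returncode, err_text, buffer_live):
    """Decide what to report once the search process has exited."""
    messages = []
    killed_by_signal = returncode < 0  # subprocess reports signal N as -N
    # Suppress the status error only when the user killed the buffer.
    if returncode != 0 and (buffer_live or not killed_by_signal):
        messages.append("search exited abnormally: %d" % returncode)
    # Error output is always reported; it may explain why the user gave up.
    if err_text.strip():
        messages.append(err_text.strip())
    return messages


if __name__ == "__main__":
    code, err = run_and_hang_up()
    print(messages_for_exit(code, err, buffer_live=False))
```

The design choice mirrors the thread: the decision lives in the parent (the sentinel's role), because the child never gets to run any cleanup code after the signal.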
>> * The next thing I'm seeing is this:

>> Opening input file: no such file or directory, /home/jrollins/tmp/nmerr5390CAY

>> I'm not exactly sure what causes this error, but it looks to me like
>> the temporary error file was removed before we were finished with it.

> This one's pretty awesome (and I think is a bug in Emacs). At a high
> level, the sentinel is getting run twice and since the first call
> deletes the error file, the second call fails. At a low level, what
> causes this is fascinating.

> 1) You kill the search buffer. This invokes kill_buffer_processes,
> which sends a SIGHUP to notmuch, but doesn't do anything else.
> Meanwhile, the notmuch search process has printed some more output,
> but Emacs hasn't consumed it yet (this is critical).

> 2) Emacs gets a SIGCHLD from the dying notmuch process, which invokes
> handle_child_signal, which sets the new process status, but can't do
> anything else because it's a signal handler.

> 3) Emacs returns to its idle loop, which calls status_notify, which sees
> that the notmuch process has a new status. This is where things get

> 3.1) Emacs guarantees that it will run process filters on any unconsumed
> output before running the process sentinel, so status_notify calls
> read_process_output, which consumes the final output and calls
> notmuch-search-process-filter.

> 3.1.1) notmuch-search-process-filter contains code to check if the
> search buffer is still alive and, since it's not, it calls
> delete-process.

> 3.1.1.1) delete-process correctly sees that the process is already dead
> and doesn't try to send another signal, *but* it still modifies
> the status to "killed". To deal with the new status, it calls
> status_notify. Dun dun dun. We've seen this function before.

> 3.1.1.1.1) The *recursive* status_notify invocation sees that the
> process has a new status and doesn't have any more output to
> consume, so it invokes our sentinel and returns.

> 3.2) The outer status_notify call (which we're still in) is now done
> flushing pending process output, so it *also* invokes our sentinel.

> It might be that the answer is to just remove the delete-process call
> from the filter. It seems completely redundant (and racy) with Emacs'
> automatic SIGHUP'ing.

Wow, awesome detective work. As mentioned on IRC, this suggestion of
Austin's does seem to fix the problem:

diff --git a/emacs/notmuch.el b/emacs/notmuch.el
index 5a8c957..975ef2b 100644
--- a/emacs/notmuch.el
+++ b/emacs/notmuch.el
@@ -817,7 +817,7 @@ non-authors is found, assume that all of the authors match."
           (inhibit-read-only t)
       (if (not (buffer-live-p results-buf))
-          (delete-process proc)
         (with-current-buffer parse-buf

I'm not sure if this is the ultimate solution, but it does cause the
missing tmp file errors to go away.
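
The double sentinel call described in steps 3 through 3.2 can be reproduced with a toy model. The Python sketch below is not Emacs's C code. The class `Proc`, the `tick`/`update_tick` counters, and every function body are simplifying assumptions that only follow the order of events given in the quoted analysis. It compares a filter that calls `delete_process` with one that does not, which is the effect of the change discussed above.

```python
import os
import tempfile


class Proc:
    """Toy stand-in for a subprocess object (all fields are hypothetical)."""

    def __init__(self, filter_fn):
        fd, self.err_file = tempfile.mkstemp(prefix="nmerr")
        os.close(fd)
        self.filter_fn = filter_fn
        self.pending_output = ["last chunk of results"]  # not yet consumed
        self.tick = 1          # bumped whenever the status changes
        self.update_tick = 0   # last tick the notifier has handled
        self.sentinel_calls = 0
        self.errors = []


def sentinel(proc):
    """Read and delete the error file, as the sentinel in the thread does."""
    proc.sentinel_calls += 1
    try:
        with open(proc.err_file) as handle:
            handle.read()
        os.remove(proc.err_file)
    except FileNotFoundError:
        proc.errors.append("Opening input file: no such file or directory")


def status_notify(proc):
    """Flush pending output through the filter, then run the sentinel."""
    if proc.tick == proc.update_tick:
        return  # no unreported status change
    while proc.pending_output:
        proc.filter_fn(proc, proc.pending_output.pop(0))  # may re-enter us
    proc.update_tick = proc.tick
    sentinel(proc)


def delete_process(proc):
    """Rewrite the status of an already dead process, then notify again."""
    proc.tick += 1
    status_notify(proc)  # the recursive call from step 3.1.1.1


def filter_with_delete(proc, chunk):
    # The results buffer is dead in this scenario, so the filter deletes.
    delete_process(proc)


def filter_without_delete(proc, chunk):
    # Patched behaviour: ignore output for a dead buffer, nothing else.
    pass


def run(filter_fn):
    """Simulate the idle loop noticing the exit; return (calls, errors)."""
    proc = Proc(filter_fn)
    status_notify(proc)
    return proc.sentinel_calls, proc.errors


if __name__ == "__main__":
    print("with delete-process:   ", run(filter_with_delete))
    print("without delete-process:", run(filter_without_delete))
```

In this model the first variant runs the sentinel twice and the second run fails on the missing file, while the second variant runs it once.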

>> * Finally, something happened that caused *12,000* of the following lines
>>   to be sent to the *Notmuch errors* buffer:

>> A Xapian exception occurred performing query: The revision being read has been discarded - you should call Xapian::Database::reopen() and retry the operation

>> Again, this was related to killing a search buffer that was still
>> being filled. I'm pretty sure the database was not modified during

> I have no insight on this one. My best guess is that this has nothing
> to do with this change except that this change makes these warnings
> visible rather than burying them somewhere down in the search results

Yeah, I suspected as much as well.