Date: Mon, 3 Feb 2014 16:40:03 -0500
From: Austin Clements <amdragon@MIT.EDU>
To: Jani Nikula <jani@nikula.org>
Subject: Re: [PATCH v3 6/6] lib: parse messages only once
Message-ID: <20140203214003.GN4375@mit.edu>
References: <cover.1391456555.git.jani@nikula.org>
 <31d785c4a3e4b90862a0fdc545d4e900a4c898e2.1391456555.git.jani@nikula.org>
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: notmuch@notmuchmail.org

Quoth Jani Nikula on Feb 03 at 9:51 pm:
> Use the previously parsed gmime message for indexing instead of
> running an extra parsing pass.
>
> After this change, we'll only do unnecessary parsing of the message
> body for duplicates and non-messages. For regular non-duplicate
> messages, we have now shaved off an extra header parsing round during
>
>  lib/database.cc       |  2 +-
>  lib/index.cc          | 59 ++++++---------------------------------------------
>  lib/message-file.c    |  9 ++++++++
>  lib/notmuch-private.h | 16 ++++++++++++--
>  4 files changed, 30 insertions(+), 56 deletions(-)
>
> diff --git a/lib/database.cc b/lib/database.cc
> index d1bea88..3a29fe7 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -2029,7 +2029,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
>      date = notmuch_message_file_get_header (message_file, "date");
>      _notmuch_message_set_header_values (message, date, from, subject);
> -    ret = _notmuch_message_index_file (message, filename);
> +    ret = _notmuch_message_index_file (message, message_file);
> diff --git a/lib/index.cc b/lib/index.cc
> index 976e49f..71397da 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -425,52 +425,15 @@ _index_mime_part (notmuch_message_t *message,
>  _notmuch_message_index_file (notmuch_message_t *message,
> -                             const char *filename)
> +                             notmuch_message_file_t *message_file)
> -    GMimeStream *stream = NULL;
> -    GMimeParser *parser = NULL;
> -    GMimeMessage *mime_message = NULL;
> +    GMimeMessage *mime_message;
>      InternetAddressList *addresses;
> -    FILE *file = NULL;
>      const char *from, *subject;
> -    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
> -    static int initialized = 0;
> -    char from_buf[5];
> -    bool is_mbox = false;
> -    if (! initialized) {
> -        g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
> -        initialized = 1;
> -    file = fopen (filename, "r");
> -        fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
> -        ret = NOTMUCH_STATUS_FILE_ERROR;
> -    /* Is this mbox? */
> -    if (fread (from_buf, sizeof (from_buf), 1, file) == 1 &&
> -        strncmp (from_buf, "From ", 5) == 0)
> -        is_mbox = true;
> -    /* Evil GMime steals my FILE* here so I won't fclose it. */
> -    stream = g_mime_stream_file_new (file);
> -    parser = g_mime_parser_new_with_stream (stream);
> -    g_mime_parser_set_scan_from (parser, is_mbox);
> -    mime_message = g_mime_parser_construct_message (parser);
> -    if (!g_mime_parser_eos (parser)) {
> -        /* This is a multi-message mbox. */
> -        ret = NOTMUCH_STATUS_FILE_NOT_EMAIL;
> +    mime_message = notmuch_message_file_get_mime_message (message_file);
> +    if (! mime_message)
> +        return NOTMUCH_STATUS_FILE_NOT_EMAIL; /* more like internal error */

Are there situations other than forgetting to call
notmuch_message_file_parse that could cause this?  (Speaking of which,
where is notmuch_message_file_parse called?)

>      from = g_mime_message_get_sender (mime_message);
> @@ -491,15 +454,5 @@ _notmuch_message_index_file (notmuch_message_t *message,
>      _index_mime_part (message, g_mime_message_get_mime_part (mime_message));
> -    if (mime_message)
> -        g_object_unref (mime_message);
> -        g_object_unref (parser);
> -        g_object_unref (stream);
> +    return NOTMUCH_STATUS_SUCCESS;
> diff --git a/lib/message-file.c b/lib/message-file.c
> index 33f6468..99e1dc8 100644
> --- a/lib/message-file.c
> +++ b/lib/message-file.c
> @@ -250,6 +250,15 @@ mboxes is deprecated and may be removed in the future.\n", message->filename);
>      return NOTMUCH_STATUS_SUCCESS;
> +notmuch_message_file_get_mime_message (notmuch_message_file_t *message)
> +    if (! message->parsed)

This seems like another good opportunity to call the parser lazily and
hide notmuch_message_file_parse from the caller, rather than requiring
the caller to implement a particular call sequence (which I wasn't
even able to find above).  This might also clean up the error handling
in the call to notmuch_message_file_get_mime_message above.

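A minimal sketch of the lazy-parsing idea, not taken from the patch.
The names msg_file_t, msg_file_parse, msg_file_get_message and the
parse_calls counter are hypothetical stand-ins for
notmuch_message_file_t and its functions, and a plain string stands in
for the GMimeMessage so the example builds without GMime.  The
accessor triggers the parse itself on first use, so no caller has to
follow a particular call sequence.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for notmuch_message_file_t. */
typedef struct {
    const char *raw;      /* unparsed input */
    bool parsed;          /* has a parse been attempted yet? */
    const char *message;  /* parse result; NULL if the parse failed */
    int parse_calls;      /* demo only: counts how often we parsed */
} msg_file_t;

/* Stand-in parser: treat input containing ':' as a valid message.
 * Returns 0 on success, -1 on failure. */
static int
msg_file_parse (msg_file_t *mf)
{
    assert (mf != NULL);
    mf->parse_calls++;
    mf->parsed = true;
    mf->message = (mf->raw && strchr (mf->raw, ':')) ? mf->raw : NULL;
    return mf->message ? 0 : -1;
}

/* Lazy accessor: parses on first use and caches the result, so a
 * NULL return can only mean that parsing failed. */
static const char *
msg_file_get_message (msg_file_t *mf)
{
    assert (mf != NULL);
    if (! mf->parsed)
        (void) msg_file_parse (mf);
    return mf->message;
}
```

With an accessor shaped like this, the NULL check in
_notmuch_message_index_file would presumably no longer need the
"more like internal error" caveat, since a missing parse call could
not be the cause.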
> +    return message->message;
>  /* return NULL on errors, empty string for non-existing headers */
>  notmuch_message_file_get_header (notmuch_message_file_t *message,
> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> index 7277df1..7559521 100644
> --- a/lib/notmuch-private.h
> +++ b/lib/notmuch-private.h
> @@ -46,6 +46,8 @@ NOTMUCH_BEGIN_DECLS
>  #include <talloc.h>
> +#include <gmime/gmime.h>
>  #include "xutil.h"
>  #include "error_util.h"
> @@ -320,9 +322,11 @@ notmuch_message_get_author (notmuch_message_t *message);
> +typedef struct _notmuch_message_file notmuch_message_file_t;
>  _notmuch_message_index_file (notmuch_message_t *message,
> -                             const char *filename);
> +                             notmuch_message_file_t *message_file);
>  /* message-file.c */
> @@ -330,7 +334,6 @@ _notmuch_message_index_file (notmuch_message_t *message,
>   * into the public interface in notmuch.h
> -typedef struct _notmuch_message_file notmuch_message_file_t;
>  /* Open a file containing a single email message.
> @@ -377,6 +380,15 @@ void
>  notmuch_message_file_restrict_headersv (notmuch_message_file_t *message,
>                                          va_list va_headers);
> +/* Get the gmime message of a parsed message file.
> + * Returns NULL if the message file has not been parsed.
> + * XXX: Would be nice to not have to expose GMimeMessage here.

Maybe just forward-declare struct GMimeMessage?  Then you also
wouldn't need to add the gmime #include.

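A minimal sketch of the forward-declaration approach, under the
assumption that GMime follows the usual GObject convention of
"typedef struct _GMimeMessage GMimeMessage;", so that the struct tag
is _GMimeMessage (worth checking against the installed gmime headers).
msg_file_t and msg_file_get_mime_message are hypothetical stand-ins
for the notmuch names.  A header that only passes pointers around can
get by with the incomplete type and no library #include.

```c
#include <assert.h>
#include <stddef.h>

/* Forward declaration only; the struct tag is an assumption based on
 * the GObject naming convention.  No gmime header is included. */
struct _GMimeMessage;

/* Hypothetical stand-in for notmuch_message_file_t. */
typedef struct {
    struct _GMimeMessage *message;  /* pointer to incomplete type is fine */
} msg_file_t;

/* Hands the pointer back without ever looking inside the struct, so
 * the full definition is never needed in this translation unit. */
static struct _GMimeMessage *
msg_file_get_mime_message (msg_file_t *mf)
{
    assert (mf != NULL);
    return mf->message;
}
```

Only the files that dereference the message (index.cc and
message-file.c in this patch) would then need the real gmime header.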
> +notmuch_message_file_get_mime_message (notmuch_message_file_t *message);
>  /* Get the value of the specified header from the message as a UTF-8 string.
>   * The header name is case insensitive.