Date: Mon, 3 Feb 2014 16:40:03 -0500
From: Austin Clements <amdragon@MIT.EDU>
To: Jani Nikula <jani@nikula.org>
Subject: Re: [PATCH v3 6/6] lib: parse messages only once
Message-ID: <20140203214003.GN4375@mit.edu>
References: <cover.1391456555.git.jani@nikula.org>
        <31d785c4a3e4b90862a0fdc545d4e900a4c898e2.1391456555.git.jani@nikula.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
 <31d785c4a3e4b90862a0fdc545d4e900a4c898e2.1391456555.git.jani@nikula.org>
User-Agent: Mutt/1.5.21 (2010-09-15)
Cc: notmuch@notmuchmail.org

Quoth Jani Nikula on Feb 03 at  9:51 pm:
> Use the previously parsed gmime message for indexing instead of
> running an extra parsing pass.

> After this change, we'll only do unnecessary parsing of the message
> body for duplicates and non-messages. For regular non-duplicate
> messages, we have now shaved off an extra header parsing round during
> indexing.
> ---
>  lib/database.cc       |  2 +-
>  lib/index.cc          | 59 ++++++---------------------------------------------
>  lib/message-file.c    |  9 ++++++++
>  lib/notmuch-private.h | 16 ++++++++++++--
>  4 files changed, 30 insertions(+), 56 deletions(-)

> diff --git a/lib/database.cc b/lib/database.cc
> index d1bea88..3a29fe7 100644
> --- a/lib/database.cc
> +++ b/lib/database.cc
> @@ -2029,7 +2029,7 @@ notmuch_database_add_message (notmuch_database_t *notmuch,
>           date = notmuch_message_file_get_header (message_file, "date");
>           _notmuch_message_set_header_values (message, date, from, subject);
>  
> -         ret = _notmuch_message_index_file (message, filename);
> +         ret = _notmuch_message_index_file (message, message_file);
>           if (ret)
>               goto DONE;
>       } else {
> diff --git a/lib/index.cc b/lib/index.cc
> index 976e49f..71397da 100644
> --- a/lib/index.cc
> +++ b/lib/index.cc
> @@ -425,52 +425,15 @@ _index_mime_part (notmuch_message_t *message,
>  
>  notmuch_status_t
>  _notmuch_message_index_file (notmuch_message_t *message,
> -                          const char *filename)
> +                          notmuch_message_file_t *message_file)
>  {
> -    GMimeStream *stream = NULL;
> -    GMimeParser *parser = NULL;
> -    GMimeMessage *mime_message = NULL;
> +    GMimeMessage *mime_message;
>      InternetAddressList *addresses;
> -    FILE *file = NULL;
>      const char *from, *subject;
> -    notmuch_status_t ret = NOTMUCH_STATUS_SUCCESS;
> -    static int initialized = 0;
> -    char from_buf[5];
> -    bool is_mbox = false;
> -
> -    if (! initialized) {
> -     g_mime_init (GMIME_ENABLE_RFC2047_WORKAROUNDS);
> -     initialized = 1;
> -    }
> -
> -    file = fopen (filename, "r");
> -    if (! file) {
> -     fprintf (stderr, "Error opening %s: %s\n", filename, strerror (errno));
> -     ret = NOTMUCH_STATUS_FILE_ERROR;
> -     goto DONE;
> -    }
> -
> -    /* Is this mbox? */
> -    if (fread (from_buf, sizeof (from_buf), 1, file) == 1 &&
> -     strncmp (from_buf, "From ", 5) == 0)
> -     is_mbox = true;
> -    rewind (file);
> -
> -    /* Evil GMime steals my FILE* here so I won't fclose it. */
> -    stream = g_mime_stream_file_new (file);
> -
> -    parser = g_mime_parser_new_with_stream (stream);
> -    g_mime_parser_set_scan_from (parser, is_mbox);
>  
> -    mime_message = g_mime_parser_construct_message (parser);
> -
> -    if (is_mbox) {
> -     if (!g_mime_parser_eos (parser)) {
> -         /* This is a multi-message mbox. */
> -         ret = NOTMUCH_STATUS_FILE_NOT_EMAIL;
> -         goto DONE;
> -     }
> -    }
> +    mime_message = notmuch_message_file_get_mime_message (message_file);
> +    if (! mime_message)
> +     return NOTMUCH_STATUS_FILE_NOT_EMAIL; /* more like internal error */

Are there situations other than forgetting to call
notmuch_message_file_parse that could cause this?  (Speaking of which,
where is notmuch_message_file_parse called?)

>  
>      from = g_mime_message_get_sender (mime_message);
>  
> @@ -491,15 +454,5 @@ _notmuch_message_index_file (notmuch_message_t *message,
>  
>      _index_mime_part (message, g_mime_message_get_mime_part (mime_message));
>  
> -  DONE:
> -    if (mime_message)
> -     g_object_unref (mime_message);
> -
> -    if (parser)
> -     g_object_unref (parser);
> -
> -    if (stream)
> -     g_object_unref (stream);
> -
> -    return ret;
> +    return NOTMUCH_STATUS_SUCCESS;
>  }
> diff --git a/lib/message-file.c b/lib/message-file.c
> index 33f6468..99e1dc8 100644
> --- a/lib/message-file.c
> +++ b/lib/message-file.c
> @@ -250,6 +250,15 @@ mboxes is deprecated and may be removed in the future.\n", message->filename);
>      return NOTMUCH_STATUS_SUCCESS;
>  }
>  
> +GMimeMessage *
> +notmuch_message_file_get_mime_message (notmuch_message_file_t *message)
> +{
> +    if (! message->parsed)
> +     return NULL;

This seems like another good opportunity to call the parser lazily and
hide notmuch_message_file_parse from the caller, rather than requiring
the caller to implement a particular call sequence (which I wasn't
even able to find above).  This might also clean up the error handling
in the call to notmuch_message_file_get_mime_message above.
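
A rough sketch of the lazy approach, not the actual notmuch code: the
getter triggers parsing itself, so a NULL return can only mean a real
parse failure.  Everything here is a hypothetical stand-in
(message_file_t for notmuch_message_file_t, a string for the
GMimeMessage, and message_file_parse for notmuch_message_file_parse,
which is assumed to be safe to call more than once).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for notmuch_message_file_t; the real struct
 * would hold a GMimeMessage * rather than a string. */
typedef struct {
    bool parsed;          /* set once parsing has succeeded */
    int parse_calls;      /* counts real parsing passes, for illustration */
    const char *message;  /* stand-in for the parsed GMimeMessage */
} message_file_t;

/* Hypothetical stand-in for notmuch_message_file_parse: does the
 * expensive work at most once and returns 0 on success. */
int
message_file_parse (message_file_t *mf)
{
    assert (mf != NULL);
    if (mf->parsed)
        return 0;
    mf->parse_calls++;
    mf->message = "parsed message";
    mf->parsed = true;
    return 0;
}

/* Lazy getter: parses on first use, so callers do not have to follow
 * a particular call sequence. */
const char *
message_file_get_mime_message (message_file_t *mf)
{
    if (message_file_parse (mf) != 0)
        return NULL;  /* a genuine parse failure, not a caller bug */
    return mf->message;
}
```

With this shape, calling the getter twice still runs only one parsing
pass, and the caller's NULL check maps to a file error rather than a
"more like internal error" case.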

> +
> +    return message->message;
> +}
> +
>  /* return NULL on errors, empty string for non-existing headers */
>  const char *
>  notmuch_message_file_get_header (notmuch_message_file_t *message,
> diff --git a/lib/notmuch-private.h b/lib/notmuch-private.h
> index 7277df1..7559521 100644
> --- a/lib/notmuch-private.h
> +++ b/lib/notmuch-private.h
> @@ -46,6 +46,8 @@ NOTMUCH_BEGIN_DECLS
>  
>  #include <talloc.h>
>  
> +#include <gmime/gmime.h>
> +
>  #include "xutil.h"
>  #include "error_util.h"
>  
> @@ -320,9 +322,11 @@ notmuch_message_get_author (notmuch_message_t *message);
>  
>  /* index.cc */
>  
> +typedef struct _notmuch_message_file notmuch_message_file_t;
> +
>  notmuch_status_t
>  _notmuch_message_index_file (notmuch_message_t *message,
> -                          const char *filename);
> +                          notmuch_message_file_t *message_file);
>  
>  /* message-file.c */
>  
> @@ -330,7 +334,6 @@ _notmuch_message_index_file (notmuch_message_t *message,
>   * into the public interface in notmuch.h
>   */
>  
> -typedef struct _notmuch_message_file notmuch_message_file_t;
>  
>  /* Open a file containing a single email message.
>   *
> @@ -377,6 +380,15 @@ void
>  notmuch_message_file_restrict_headersv (notmuch_message_file_t *message,
>                                       va_list va_headers);
>  
> +/* Get the gmime message of a parsed message file.
> + *
> + * Returns NULL if the message file has not been parsed.
> + *
> + * XXX: Would be nice to not have to expose GMimeMessage here.

Maybe just forward-declare struct GMimeMessage?  Then you also
wouldn't need to add the gmime #include.
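
To illustrate the forward-declaration idea in isolation, here is a
minimal sketch with a hypothetical opaque_message_t standing in for
GMimeMessage.  The header part only names the type; the struct body is
visible only in the implementation part.  For GMime itself the
declaration would presumably be
"typedef struct _GMimeMessage GMimeMessage;", which assumes that is how
GMime names the struct, and repeating a typedef that way is only valid
in C11 and C++, so it is worth checking against the headers and
compilers in use.

```c
#include <assert.h>
#include <stddef.h>

/* --- What the private header would contain --- */

/* Forward declaration only: users of the header can pass pointers
 * around but cannot look inside the struct.  opaque_message_t is a
 * hypothetical stand-in for GMimeMessage. */
typedef struct opaque_message opaque_message_t;

opaque_message_t *message_file_get_message (void);
int opaque_message_get_id (const opaque_message_t *message);

/* --- What the implementation file would contain --- */

/* The full definition lives only where it is actually needed. */
struct opaque_message {
    int id;
};

static struct opaque_message the_message = { 42 };

opaque_message_t *
message_file_get_message (void)
{
    return &the_message;
}

int
opaque_message_get_id (const opaque_message_t *message)
{
    assert (message != NULL);
    return message->id;
}
```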

> + */
> +GMimeMessage *
> +notmuch_message_file_get_mime_message (notmuch_message_file_t *message);
> +
>  /* Get the value of the specified header from the message as a UTF-8 string.
>   *
>   * The header name is case insensitive.