1 Return-Path: <tomi.ollila@iki.fi>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by olra.theworths.org (Postfix) with ESMTP id 1246C431FBD
\r
6 for <notmuch@notmuchmail.org>; Fri, 4 Apr 2014 03:51:20 -0700 (PDT)
\r
7 X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
\r
11 X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none]
\r
13 Received: from olra.theworths.org ([127.0.0.1])
\r
14 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
\r
15 with ESMTP id prq7hYyF183R for <notmuch@notmuchmail.org>;
\r
16 Fri, 4 Apr 2014 03:51:10 -0700 (PDT)
\r
17 Received: from guru.guru-group.fi (guru.guru-group.fi [46.183.73.34])
\r
18 by olra.theworths.org (Postfix) with ESMTP id A8BF9431FAF
\r
19 for <notmuch@notmuchmail.org>; Fri, 4 Apr 2014 03:51:09 -0700 (PDT)
\r
20 Received: from guru.guru-group.fi (localhost [IPv6:::1])
\r
21 by guru.guru-group.fi (Postfix) with ESMTP id D023E100051;
\r
22 Fri, 4 Apr 2014 13:51:01 +0300 (EEST)
\r
23 From: Tomi Ollila <tomi.ollila@iki.fi>
\r
24 To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org
\r
25 Subject: Re: [Patch v6 1/6] dump: support gzipped and atomic output
\r
26 References: <1396554083-3892-1-git-send-email-david@tethera.net>
\r
27 <1396554083-3892-2-git-send-email-david@tethera.net>
\r
28 User-Agent: Notmuch/0.17+174~gef82849 (http://notmuchmail.org) Emacs/24.3.1
\r
29 (x86_64-unknown-linux-gnu)
\r
30 X-Face: HhBM'cA~<r"^Xv\KRN0P{vn'Y"Kd;zg_y3S[4)KSN~s?O\"QPoL
\r
31 $[Xv_BD:i/F$WiEWax}R(MPS`^UaptOGD`*/=@\1lKoVa9tnrg0TW?"r7aRtgk[F
\r
32 !)g;OY^,BjTbr)Np:%c_o'jj,Z
\r
In-Reply-To: <1396554083-3892-2-git-send-email-david@tethera.net>
\r
34 Date: Fri, 04 Apr 2014 13:51:01 +0300
\r
35 Message-ID: <m2zjk1cqqy.fsf@guru.guru-group.fi>
\r
37 Content-Type: text/plain
\r
38 X-BeenThere: notmuch@notmuchmail.org
\r
39 X-Mailman-Version: 2.1.13
\r
41 List-Id: "Use and development of the notmuch mail system."
\r
42 <notmuch.notmuchmail.org>
\r
43 List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
\r
44 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
45 List-Archive: <http://notmuchmail.org/pipermail/notmuch>
\r
46 List-Post: <mailto:notmuch@notmuchmail.org>
\r
47 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
48 List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
\r
49 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
50 X-List-Received-Date: Fri, 04 Apr 2014 10:51:20 -0000
\r
52 On Thu, Apr 03 2014, David Bremner <david@tethera.net> wrote:
\r
54 > The main goal is to support gzipped output for future internal
\r
55 > calls (e.g. from notmuch-new) to notmuch_database_dump.
\r
57 > The additional dependency is not very heavy since xapian already pulls
\r
60 > We want the dump to be "atomic", in the sense that after running the
\r
61 > dump file is either present and complete, or not present. This avoids
\r
62 > certain classes of mishaps involving overwriting a good backup with a
\r
63 > bad or partial one.
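For reference, the "either present and complete, or not present" property quoted above is usually obtained by writing to a temporary file next to the target and renaming it into place only after the data has been synced. A minimal sketch under that assumption: atomic_write is a hypothetical helper, not code from the patch, and it calls fsync () where the patch calls fdatasync ().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of notmuch): write LEN bytes of DATA to
 * PATH so that PATH ends up either complete or untouched.
 * Returns 0 on success, -1 on failure. */
static int
atomic_write (const char *path, const char *data, size_t len)
{
    char tempname[4096];
    int fd, n;

    /* The temporary file lives next to the target, so that rename ()
     * stays within one file system. */
    n = snprintf (tempname, sizeof (tempname), "%s.XXXXXX", path);
    if (n < 0 || (size_t) n >= sizeof (tempname))
        return -1;

    fd = mkstemp (tempname);
    if (fd < 0) {
        fprintf (stderr, "Error creating %s: %s\n", tempname, strerror (errno));
        return -1;
    }

    while (len > 0) {
        ssize_t written = write (fd, data, len);

        if (written < 0) {
            if (errno == EINTR)
                continue;
            goto FAIL;
        }
        data += written;
        len -= (size_t) written;
    }

    /* Flush to disk before the rename makes the file visible. */
    if (fsync (fd) != 0)
        goto FAIL;
    if (close (fd) != 0) {
        fd = -1;
        goto FAIL;
    }
    fd = -1;

    if (rename (tempname, path) != 0)
        goto FAIL;
    return 0;

FAIL:
    fprintf (stderr, "Error writing %s: %s\n", path, strerror (errno));
    if (fd >= 0)
        close (fd);
    /* Never leave a partial temporary file behind. */
    unlink (tempname);
    return -1;
}
```

A caller would use it as atomic_write ("dump.out", buf, buf_len); an existing dump.out is only replaced once the new contents are fully on disk.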
\r
Two things in this patch (comments inline), otherwise LGTM.

Except now I remembered one more: in addition to those two, the error message in patch 6/6:

+ fprintf (stderr, "Error duping stdin\n");

could include the reason for the failure:

+ fprintf (stderr, "Error duping stdin: %s\n", strerror (errno));
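To spell out the strerror (errno) suggestion, here is a small sketch. dup_or_report is a hypothetical helper used only for illustration, not code from the series.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: duplicate FD and, on failure, say why.
 * Returns the new descriptor, or -1 on error. */
static int
dup_or_report (int fd, const char *name)
{
    int copy = dup (fd);

    /* errno is still the one set by dup () here, so the message can
     * include the actual cause (e.g. "Bad file descriptor"). */
    if (copy < 0)
        fprintf (stderr, "Error duping %s: %s\n", name, strerror (errno));

    return copy;
}
```

With this, a failing dup_or_report (STDIN_FILENO, "stdin") reports the cause instead of a bare "Error duping stdin".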
\r
78 > INSTALL | 20 ++++++++--
\r
79 > Makefile.local | 2 +-
\r
80 > configure | 28 ++++++++++++--
\r
81 > doc/man1/notmuch-dump.rst | 3 ++
\r
82 > notmuch-client.h | 4 +-
\r
83 > notmuch-dump.c | 95 +++++++++++++++++++++++++++++++++++++----------
\r
84 > test/T240-dump-restore.sh | 12 ++++++
\r
85 > 7 files changed, 136 insertions(+), 28 deletions(-)
\r
87 > diff --git a/INSTALL b/INSTALL
\r
88 > index 690b0ef..b543c50 100644
\r
91 > @@ -20,8 +20,8 @@ configure stage.
\r
95 > -Notmuch depends on three libraries: Xapian, GMime 2.4 or 2.6, and
\r
96 > -Talloc which are each described below:
\r
97 > +Notmuch depends on four libraries: Xapian, GMime 2.4 or 2.6,
\r
98 > +Talloc, and zlib which are each described below:
\r
102 > @@ -60,6 +60,18 @@ Talloc which are each described below:
\r
104 > Talloc is available from http://talloc.samba.org/
\r
109 > + zlib is an extremely popular compression library. It is used
\r
110 > + by Xapian, so if you installed that you will already have
\r
111 > + zlib. You may need to install the zlib headers separately.
\r
113 > + Notmuch needs the transparent write feature of zlib introduced
\r
114 > + in version 1.2.5.2 (Dec. 2011).
\r
116 > + zlib is available from http://zlib.net
\r
118 > Building Documentation
\r
119 > ----------------------
\r
121 > @@ -79,11 +91,11 @@ dependencies with a simple simple command line. For example:
\r
123 > For Debian and similar:
\r
125 > - sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev python-sphinx
\r
126 > + sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev zlib1g-dev python-sphinx
\r
128 > For Fedora and similar:
\r
130 > - sudo yum install xapian-core-devel gmime-devel libtalloc-devel python-sphinx
\r
131 > + sudo yum install xapian-core-devel gmime-devel libtalloc-devel zlib-devel python-sphinx
\r
133 > On other systems, a similar command can be used, but the details of
\r
134 > the package names may be different.
\r
135 > diff --git a/Makefile.local b/Makefile.local
\r
136 > index cb7b106..e5a20a7 100644
\r
137 > --- a/Makefile.local
\r
138 > +++ b/Makefile.local
\r
139 > @@ -41,7 +41,7 @@ PV_FILE=bindings/python/notmuch/version.py
\r
140 > # Smash together user's values with our extra values
\r
141 > FINAL_CFLAGS = -DNOTMUCH_VERSION=$(VERSION) $(CPPFLAGS) $(CFLAGS) $(WARN_CFLAGS) $(extra_cflags) $(CONFIGURE_CFLAGS)
\r
142 > FINAL_CXXFLAGS = $(CPPFLAGS) $(CXXFLAGS) $(WARN_CXXFLAGS) $(extra_cflags) $(extra_cxxflags) $(CONFIGURE_CXXFLAGS)
\r
143 > -FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS)
\r
144 > +FINAL_NOTMUCH_LDFLAGS = $(LDFLAGS) -Lutil -lutil -Llib -lnotmuch $(AS_NEEDED_LDFLAGS) $(GMIME_LDFLAGS) $(TALLOC_LDFLAGS) $(ZLIB_LDFLAGS)
\r
145 > FINAL_NOTMUCH_LINKER = CC
\r
146 > ifneq ($(LINKER_RESOLVES_LIBRARY_DEPENDENCIES),1)
\r
147 > FINAL_NOTMUCH_LDFLAGS += $(CONFIGURE_LDFLAGS)
\r
148 > diff --git a/configure b/configure
\r
149 > index 1d430b9..1d624f7 100755
\r
152 > @@ -340,6 +340,18 @@ else
\r
153 > errors=$((errors + 1))
\r
156 > +printf "Checking for zlib (>= 1.2.5.2)... "
\r
158 > +if pkg-config --atleast-version=1.2.5.2 zlib; then
\r
159 > + printf "Yes.\n"
\r
161 > + zlib_cflags=$(pkg-config --cflags zlib)
\r
162 > + zlib_ldflags=$(pkg-config --libs zlib)
\r
165 > + errors=$((errors + 1))
\r
168 > printf "Checking for talloc development files... "
\r
169 > if pkg-config --exists talloc; then
\r
171 > @@ -496,6 +508,11 @@ EOF
\r
172 > echo " Xapian library (including development files such as headers)"
\r
173 > echo " http://xapian.org/"
\r
175 > + if [ $have_zlib -eq 0 ]; then
\r
176 > + echo " zlib library (including development files such as headers)"
\r
177 > + echo " http://zlib.net/"
\r
The message above should state the required zlib version (>= 1.2.5.2);
it is usually the only text the user sees at the end of a failed configure,
and she may well wonder: 'Hey, I do have zlib, it is ubiquitous!'
\r
185 > if [ $have_gmime -eq 0 ]; then
\r
186 > echo " Either GMime 2.4 library" $GMIME_24_VERSION_CTR "or GMime 2.6 library" $GMIME_26_VERSION_CTR
\r
187 > echo " (including development files such as headers)"
\r
188 > @@ -519,11 +536,11 @@ case a simple command will install everything you need. For example:
\r
190 > On Debian and similar systems:
\r
192 > - sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev
\r
193 > + sudo apt-get install libxapian-dev libgmime-2.6-dev libtalloc-dev zlib1g-dev
\r
195 > Or on Fedora and similar systems:
\r
197 > - sudo yum install xapian-core-devel gmime-devel libtalloc-devel
\r
198 > + sudo yum install xapian-core-devel gmime-devel libtalloc-devel zlib-devel
\r
200 > On other systems, similar commands can be used, but the details of the
\r
201 > package names may be different.
\r
202 > @@ -844,6 +861,10 @@ XAPIAN_LDFLAGS = ${xapian_ldflags}
\r
203 > GMIME_CFLAGS = ${gmime_cflags}
\r
204 > GMIME_LDFLAGS = ${gmime_ldflags}
\r
206 > +# Flags needed to compile and link against zlib
\r
207 > +ZLIB_CFLAGS = ${zlib_cflags}
\r
208 > +ZLIB_LDFLAGS = ${zlib_ldflags}
\r
210 > # Flags needed to compile and link against talloc
\r
211 > TALLOC_CFLAGS = ${talloc_cflags}
\r
212 > TALLOC_LDFLAGS = ${talloc_ldflags}
\r
213 > @@ -882,6 +903,7 @@ CONFIGURE_CFLAGS = -DHAVE_GETLINE=\$(HAVE_GETLINE) \$(GMIME_CFLAGS) \\
\r
214 > -DUTIL_BYTE_ORDER=\$(UTIL_BYTE_ORDER)
\r
216 > CONFIGURE_CXXFLAGS = -DHAVE_GETLINE=\$(HAVE_GETLINE) \$(GMIME_CFLAGS) \\
\r
217 > + \$(ZLIB_CFLAGS) \\
\r
218 > \$(TALLOC_CFLAGS) -DHAVE_VALGRIND=\$(HAVE_VALGRIND) \\
\r
219 > \$(VALGRIND_CFLAGS) \$(XAPIAN_CXXFLAGS) \\
\r
220 > -DHAVE_STRCASESTR=\$(HAVE_STRCASESTR) \\
\r
221 > @@ -892,5 +914,5 @@ CONFIGURE_CXXFLAGS = -DHAVE_GETLINE=\$(HAVE_GETLINE) \$(GMIME_CFLAGS) \\
\r
222 > -DHAVE_XAPIAN_COMPACT=\$(HAVE_XAPIAN_COMPACT) \\
\r
223 > -DUTIL_BYTE_ORDER=\$(UTIL_BYTE_ORDER)
\r
225 > -CONFIGURE_LDFLAGS = \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(XAPIAN_LDFLAGS)
\r
226 > +CONFIGURE_LDFLAGS = \$(GMIME_LDFLAGS) \$(TALLOC_LDFLAGS) \$(ZLIB_LDFLAGS) \$(XAPIAN_LDFLAGS)
\r
228 > diff --git a/doc/man1/notmuch-dump.rst b/doc/man1/notmuch-dump.rst
\r
229 > index 17d1da5..d94cb4f 100644
\r
230 > --- a/doc/man1/notmuch-dump.rst
\r
231 > +++ b/doc/man1/notmuch-dump.rst
\r
232 > @@ -19,6 +19,9 @@ recreated from the messages themselves. The output of notmuch dump is
\r
233 > therefore the only critical thing to backup (and much more friendly to
\r
234 > incremental backup than the native database files.)
\r
237 > + Compress the output in a format compatible with **gzip(1)**.
\r
239 > ``--format=(sup|batch-tag)``
\r
240 > Notmuch restore supports two plain text dump formats, both with one
\r
241 > message-id per line, followed by a list of tags.
\r
242 > diff --git a/notmuch-client.h b/notmuch-client.h
\r
243 > index d110648..e1efbe0 100644
\r
244 > --- a/notmuch-client.h
\r
245 > +++ b/notmuch-client.h
\r
246 > @@ -450,7 +450,9 @@ typedef enum dump_formats {
\r
248 > notmuch_database_dump (notmuch_database_t *notmuch,
\r
249 > const char *output_file_name,
\r
250 > - const char *query_str, dump_format_t output_format);
\r
251 > + const char *query_str,
\r
252 > + dump_format_t output_format,
\r
253 > + notmuch_bool_t gzip_output);
\r
255 > #include "command-line-arguments.h"
\r
257 > diff --git a/notmuch-dump.c b/notmuch-dump.c
\r
258 > index 21702d7..2a7252a 100644
\r
259 > --- a/notmuch-dump.c
\r
260 > +++ b/notmuch-dump.c
\r
261 > @@ -21,9 +21,11 @@
\r
262 > #include "notmuch-client.h"
\r
263 > #include "hex-escape.h"
\r
264 > #include "string-util.h"
\r
265 > +#include <zlib.h>
\r
269 > -database_dump_file (notmuch_database_t *notmuch, FILE *output,
\r
270 > +database_dump_file (notmuch_database_t *notmuch, gzFile output,
\r
271 > const char *query_str, int output_format)
\r
273 > notmuch_query_t *query;
\r
274 > @@ -69,7 +71,7 @@ database_dump_file (notmuch_database_t *notmuch, FILE *output,
\r
277 > if (output_format == DUMP_FORMAT_SUP) {
\r
278 > - fprintf (output, "%s (", message_id);
\r
279 > + gzprintf (output, "%s (", message_id);
\r
282 > for (tags = notmuch_message_get_tags (message);
\r
283 > @@ -78,12 +80,12 @@ database_dump_file (notmuch_database_t *notmuch, FILE *output,
\r
284 > const char *tag_str = notmuch_tags_get (tags);
\r
287 > - fputs (" ", output);
\r
288 > + gzputs (output, " ");
\r
292 > if (output_format == DUMP_FORMAT_SUP) {
\r
293 > - fputs (tag_str, output);
\r
294 > + gzputs (output, tag_str);
\r
296 > if (hex_encode (notmuch, tag_str,
\r
297 > &buffer, &buffer_size) != HEX_SUCCESS) {
\r
298 > @@ -91,12 +93,12 @@ database_dump_file (notmuch_database_t *notmuch, FILE *output,
\r
300 > return EXIT_FAILURE;
\r
302 > - fprintf (output, "+%s", buffer);
\r
303 > + gzprintf (output, "+%s", buffer);
\r
307 > if (output_format == DUMP_FORMAT_SUP) {
\r
308 > - fputs (")\n", output);
\r
309 > + gzputs (output, ")\n");
\r
311 > if (make_boolean_term (notmuch, "id", message_id,
\r
312 > &buffer, &buffer_size)) {
\r
313 > @@ -104,7 +106,7 @@ database_dump_file (notmuch_database_t *notmuch, FILE *output,
\r
314 > message_id, strerror (errno));
\r
315 > return EXIT_FAILURE;
\r
317 > - fprintf (output, " -- %s\n", buffer);
\r
318 > + gzprintf (output, " -- %s\n", buffer);
\r
321 > notmuch_message_destroy (message);
\r
322 > @@ -121,24 +123,77 @@ database_dump_file (notmuch_database_t *notmuch, FILE *output,
\r
324 > notmuch_database_dump (notmuch_database_t *notmuch,
\r
325 > const char *output_file_name,
\r
326 > - const char *query_str, dump_format_t output_format)
\r
327 > + const char *query_str,
\r
328 > + dump_format_t output_format,
\r
329 > + notmuch_bool_t gzip_output)
\r
331 > - FILE *output = stdout;
\r
334 > + const char *mode = gzip_output ? "w9" : "wT";
\r
335 > + const char *name_for_error = output_file_name ? output_file_name : "stdout";
\r
337 > + char *tempname = NULL;
\r
338 > + int outfd = -1;
\r
342 > if (output_file_name) {
\r
343 > - output = fopen (output_file_name, "w");
\r
344 > - if (output == NULL) {
\r
345 > - fprintf (stderr, "Error opening %s for writing: %s\n",
\r
346 > - output_file_name, strerror (errno));
\r
347 > - return EXIT_FAILURE;
\r
349 > + tempname = talloc_asprintf (notmuch, "%s.XXXXXX", output_file_name);
\r
350 > + outfd = mkstemp (tempname);
\r
352 > + outfd = dup (STDOUT_FILENO);
\r
355 > + if (outfd < 0) {
\r
356 > + fprintf (stderr, "Bad output file %s\n", name_for_error);
\r
360 > + output = gzdopen (outfd, mode);
\r
362 > + if (output == NULL) {
\r
363 > + fprintf (stderr, "Error opening %s for (gzip) writing: %s\n",
\r
364 > + name_for_error, strerror (errno));
\r
365 > + if (close (outfd))
\r
366 > + fprintf (stderr, "Error closing %s during shutdown: %s\n",
\r
367 > + name_for_error, strerror (errno));
\r
371 > ret = database_dump_file (notmuch, output, query_str, output_format);
\r
372 > + if (ret) goto DONE;
\r
374 > - if (output != stdout)
\r
375 > - fclose (output);
\r
376 > + ret = gzflush (output, Z_FINISH);
\r
378 > + fprintf (stderr, "Error flushing output: %s\n", gzerror (output, NULL));
\r
382 > + if (output_file_name) {
\r
383 > + ret = fdatasync (outfd);
\r
385 > + fprintf (stderr, "Error syncing %s to disk: %s\n",
\r
386 > + name_for_error, strerror (errno));
\r
391 > + if (gzclose_w (output) != Z_OK) {
\r
392 > + ret = EXIT_FAILURE;
\r
An error message is needed here, for example:

    fprintf (stderr, "Error closing output: %s\n", gzerror (output, NULL));
\r
399 > + if (output_file_name) {
\r
400 > + ret = rename (tempname, output_file_name);
\r
402 > + fprintf (stderr, "Error renaming %s to %s: %s\n",
\r
403 > + tempname, output_file_name, strerror (errno));
\r
409 > + if (ret != EXIT_SUCCESS && output_file_name)
\r
410 > + (void) unlink (tempname);
\r
414 > @@ -158,6 +213,7 @@ notmuch_dump_command (notmuch_config_t *config, int argc, char *argv[])
\r
417 > int output_format = DUMP_FORMAT_BATCH_TAG;
\r
418 > + notmuch_bool_t gzip_output = 0;
\r
420 > notmuch_opt_desc_t options[] = {
\r
421 > { NOTMUCH_OPT_KEYWORD, &output_format, "format", 'f',
\r
422 > @@ -165,6 +221,7 @@ notmuch_dump_command (notmuch_config_t *config, int argc, char *argv[])
\r
423 > { "batch-tag", DUMP_FORMAT_BATCH_TAG },
\r
425 > { NOTMUCH_OPT_STRING, &output_file_name, "output", 'o', 0 },
\r
426 > + { NOTMUCH_OPT_BOOLEAN, &gzip_output, "gzip", 'z', 0 },
\r
427 > { 0, 0, 0, 0, 0 }
\r
430 > @@ -181,7 +238,7 @@ notmuch_dump_command (notmuch_config_t *config, int argc, char *argv[])
\r
433 > ret = notmuch_database_dump (notmuch, output_file_name, query_str,
\r
434 > - output_format);
\r
435 > + output_format, gzip_output);
\r
437 > notmuch_database_destroy (notmuch);
\r
439 > diff --git a/test/T240-dump-restore.sh b/test/T240-dump-restore.sh
\r
440 > index 0004438..d79aca8 100755
\r
441 > --- a/test/T240-dump-restore.sh
\r
442 > +++ b/test/T240-dump-restore.sh
\r
443 > @@ -68,6 +68,18 @@ test_begin_subtest "dump --output=outfile --"
\r
444 > notmuch dump --output=dump-1-arg-dash.actual --
\r
445 > test_expect_equal_file dump.expected dump-1-arg-dash.actual
\r
447 > +# gzipped output
\r
449 > +test_begin_subtest "dump --gzip"
\r
450 > +notmuch dump --gzip > dump-gzip.gz
\r
451 > +gunzip dump-gzip.gz
\r
452 > +test_expect_equal_file dump.expected dump-gzip
\r
454 > +test_begin_subtest "dump --gzip --output=outfile"
\r
455 > +notmuch dump --gzip --output=dump-gzip-outfile.gz
\r
456 > +gunzip dump-gzip-outfile.gz
\r
457 > +test_expect_equal_file dump.expected dump-gzip-outfile
\r
459 > # Note, we assume all messages from cworth have a message-id
\r
460 > # containing cworth.org
\r