1 Return-Path: <tomi.ollila@iki.fi>
\r
2 X-Original-To: notmuch@notmuchmail.org
\r
3 Delivered-To: notmuch@notmuchmail.org
\r
4 Received: from localhost (localhost [127.0.0.1])
\r
5 by olra.theworths.org (Postfix) with ESMTP id 9E965431FAF
\r
6 for <notmuch@notmuchmail.org>; Sat, 8 Sep 2012 06:38:29 -0700 (PDT)
\r
7 X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
\r
11 X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none]
\r
13 Received: from olra.theworths.org ([127.0.0.1])
\r
14 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
\r
15 with ESMTP id Kj6MYN+0UJ1F for <notmuch@notmuchmail.org>;
\r
16 Sat, 8 Sep 2012 06:38:28 -0700 (PDT)
\r
17 Received: from guru.guru-group.fi (guru.guru-group.fi [46.183.73.34])
\r
18 by olra.theworths.org (Postfix) with ESMTP id 4D08B431FAE
\r
19 for <notmuch@notmuchmail.org>; Sat, 8 Sep 2012 06:38:28 -0700 (PDT)
\r
20 Received: from guru.guru-group.fi (localhost [IPv6:::1])
\r
21 by guru.guru-group.fi (Postfix) with ESMTP id 656061000E5;
\r
22 Sat, 8 Sep 2012 16:38:35 +0300 (EEST)
\r
23 From: Tomi Ollila <tomi.ollila@iki.fi>
\r
24 To: david@tethera.net, notmuch@notmuchmail.org
\r
25 Subject: Re: [Patch v3 5/6] test: add generator for random "stub" messages
\r
26 In-Reply-To: <1345382314-5330-6-git-send-email-david@tethera.net>
\r
27 References: <1345382314-5330-1-git-send-email-david@tethera.net>
\r
28 <1345382314-5330-6-git-send-email-david@tethera.net>
\r
29 User-Agent: Notmuch/0.14+11~gd9bf007 (http://notmuchmail.org) Emacs/24.2.1
\r
30 (x86_64-unknown-linux-gnu)
\r
31 X-Face: HhBM'cA~<r"^Xv\KRN0P{vn'Y"Kd;zg_y3S[4)KSN~s?O\"QPoL
\r
32 $[Xv_BD:i/F$WiEWax}R(MPS`^UaptOGD`*/=@\1lKoVa9tnrg0TW?"r7aRtgk[F
\r
33 !)g;OY^,BjTbr)Np:%c_o'jj,Z
\r
34 Date: Sat, 08 Sep 2012 16:38:35 +0300
\r
35 Message-ID: <m2wr04ocro.fsf@guru.guru-group.fi>
\r
37 Content-Type: text/plain
\r
38 Cc: David Bremner <bremner@debian.org>
\r
39 X-BeenThere: notmuch@notmuchmail.org
\r
40 X-Mailman-Version: 2.1.13
\r
42 List-Id: "Use and development of the notmuch mail system."
\r
43 <notmuch.notmuchmail.org>
\r
44 List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
\r
45 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
\r
46 List-Archive: <http://notmuchmail.org/pipermail/notmuch>
\r
47 List-Post: <mailto:notmuch@notmuchmail.org>
\r
48 List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
\r
49 List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
\r
50 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
\r
51 X-List-Received-Date: Sat, 08 Sep 2012 13:38:29 -0000
\r
53 On Sun, Aug 19 2012, david@tethera.net wrote:
\r
55 > From: David Bremner <bremner@debian.org>
\r
57 > Initial use case is testing dump and restore, so we only have
\r
58 > message-ids and tags.
\r
60 > The message ID's are nothing like RFC compliant, but it doesn't seem
\r
61 > any harder to roundtrip random UTF-8 strings than RFC-compliant ones.
\r
63 > Tags are UTF-8, even though notmuch is in principle more generous than
\r
67 Mostly LGTM (the whole series). A few comments inline...
\r
69 Finally, 6/6 adds a known-broken test -- when will we see this code
\r
70 taken into use and the broken test fixed? :)
\r
72 > test/.gitignore | 1 +
\r
73 > test/Makefile.local | 9 +++
\r
75 > test/random-corpus.c | 202 ++++++++++++++++++++++++++++++++++++++++++++++++++
\r
76 > 4 files changed, 213 insertions(+), 1 deletion(-)
\r
77 > create mode 100644 test/random-corpus.c
\r
82 > diff --git a/test/random-corpus.c b/test/random-corpus.c
\r
83 > new file mode 100644
\r
84 > index 0000000..8c5b559
\r
86 > +++ b/test/random-corpus.c
\r
91 > +/* Current largest UTF-32 value defined. Note that most of these will
\r
92 > + * be printed as boxes in most fonts.
\r
95 Should we be talking about UTF-8 values? UTF-8 (currently has the same
\r
99 > +#define GLYPH_MAX 0x10FFFE
\r
102 > +random_unichar ()
\r
104 > + int start = 1, stop = GLYPH_MAX;
\r
105 > + int class = random() % 2;
\r
108 > + * Choose about half ascii as test characters, as ascii
\r
109 > + * punctation and whitespace is the main cause of problems for
\r
110 > + * the (old) restore parser
\r
112 > + switch (class) {
\r
119 > + /* the rest of unicode */
\r
121 > + stop = GLYPH_MAX;
\r
124 > + if (start == stop)
\r
127 > + return start + (random() % (stop - start + 1));
\r
131 > +random_utf8_string (void *ctx, size_t char_count)
\r
134 > + gchar *buf = NULL;
\r
135 > + size_t buf_size = 0;
\r
137 > + size_t offset = 0;
\r
141 > + buf = talloc_realloc (ctx, NULL, gchar, char_count);
\r
142 > + buf_size = char_count;
\r
144 > + for (i = 0; i < char_count; i++) {
\r
145 > + gunichar randomchar;
\r
146 > + size_t written;
\r
148 > + /* 6 for one glyph, one for null */
\r
149 > + if (buf_size - offset < 8) {
\r
150 > + buf_size += 16;
\r
151 > + buf = talloc_realloc (ctx, buf, gchar, buf_size);
\r
153 This reallocation will be hit many times, as only char_count bytes
\r
154 were allocated originally -- that limit will probably be reached before
\r
155 the random string is half built (half the characters use 1 byte, the other
\r
156 half 2, 3 or 4 bytes, mostly 4, even though only half of the 4-byte range is used).
\r
158 Maybe allocate char_count * 2 + 8 initially and, if a realloc is required,
\r
159 grow by (char_count - i) * 2 + 8... or maybe better, just do the latter
\r
160 realloc and replace the initial allocation with buf = NULL; buf_size = 0;
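
To make the growth suggestion concrete, here is a minimal sketch of the "just do the latter realloc" variant. It is not the patch's code. It assumes plain realloc() in place of talloc_realloc(), a hand-written encode_utf8() in place of g_unichar_to_utf8(), and a small xorshift generator (next_random, pick_char) in place of random()/random_unichar(). All of those names are hypothetical, and so is the reallocs counter, which exists only to show how rarely the buffer grows.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GLYPH_MAX 0x10FFFE

/* Hypothetical stand-in for random(): xorshift32, state must be non-zero. */
static uint32_t
next_random (uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* About half ASCII (0x01..0x7F), half anything from 0x80 up to GLYPH_MAX. */
static uint32_t
pick_char (uint32_t *rng)
{
    if (next_random (rng) % 2)
        return 1 + next_random (rng) % 0x7F;
    return 0x80 + next_random (rng) % (GLYPH_MAX - 0x80 + 1);
}

/* Stand-in encoder: writes 1..4 bytes for cp <= 0x10FFFF and returns the
 * count. It does not reject surrogates. */
static size_t
encode_utf8 (uint32_t cp, char *out)
{
    unsigned char *p = (unsigned char *) out;

    if (cp < 0x80) {
        p[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        p[0] = (unsigned char) (0xC0 | (cp >> 6));
        p[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        p[0] = (unsigned char) (0xE0 | (cp >> 12));
        p[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        p[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    p[0] = (unsigned char) (0xF0 | (cp >> 18));
    p[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    p[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    p[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* Start with no buffer; whenever fewer than 5 bytes are free (4 for one
 * character, 1 for the terminator), grow by an estimate for the remaining
 * characters. *reallocs counts how often that happens. */
char *
random_utf8_string (uint32_t *rng, size_t char_count, size_t *reallocs)
{
    char *buf = NULL;
    size_t buf_size = 0, offset = 0, i;

    for (i = 0; i < char_count; i++) {
        if (buf_size - offset < 5) {
            char *tmp;

            buf_size += (char_count - i) * 2 + 8;
            tmp = realloc (buf, buf_size);
            if (tmp == NULL) {
                free (buf);
                return NULL;
            }
            buf = tmp;
            (*reallocs)++;
        }
        offset += encode_utf8 (pick_char (rng), buf + offset);
    }

    if (buf == NULL)            /* char_count == 0: nothing allocated yet */
        return calloc (1, 1);

    assert (offset < buf_size); /* room for the terminator is guaranteed */
    buf[offset] = 0;
    return buf;
}
```

Each growth step covers at least half of the characters still to be written, even if all of them take 4 bytes, so the number of reallocations grows only logarithmically with char_count instead of once every few characters.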
\r
162 Alternatively, you could play with random states: calculate the size,
\r
163 reset the random state, allocate size + 1 and write the chars.
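
The random-state idea could look roughly like the sketch below. It assumes a generator whose whole state is one uint32_t that can be saved and restored by assignment. With random() from the patch, the rewind would presumably be done with srandom() or initstate()/setstate() instead. The helper names (next_random, pick_char, encode_utf8, random_utf8_string_exact) are hypothetical stand-ins, not functions from the patch, glib or talloc.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GLYPH_MAX 0x10FFFE

/* Hypothetical stand-in for random(): xorshift32, state must be non-zero. */
static uint32_t
next_random (uint32_t *state)
{
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return *state = x;
}

/* About half ASCII (0x01..0x7F), half anything from 0x80 up to GLYPH_MAX. */
static uint32_t
pick_char (uint32_t *rng)
{
    if (next_random (rng) % 2)
        return 1 + next_random (rng) % 0x7F;
    return 0x80 + next_random (rng) % (GLYPH_MAX - 0x80 + 1);
}

/* Stand-in encoder: writes 1..4 bytes for cp <= 0x10FFFF and returns the
 * count. It does not reject surrogates. */
static size_t
encode_utf8 (uint32_t cp, char *out)
{
    unsigned char *p = (unsigned char *) out;

    if (cp < 0x80) {
        p[0] = (unsigned char) cp;
        return 1;
    }
    if (cp < 0x800) {
        p[0] = (unsigned char) (0xC0 | (cp >> 6));
        p[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        p[0] = (unsigned char) (0xE0 | (cp >> 12));
        p[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        p[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    p[0] = (unsigned char) (0xF0 | (cp >> 18));
    p[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    p[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    p[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/* Two passes over the same random sequence: the first only measures,
 * the second writes into a buffer of exactly size + 1 bytes. */
char *
random_utf8_string_exact (uint32_t *rng, size_t char_count)
{
    uint32_t saved = *rng;      /* remember where the sequence starts */
    char scratch[4];
    char *buf;
    size_t size = 0, offset = 0, i;

    for (i = 0; i < char_count; i++)
        size += encode_utf8 (pick_char (rng), scratch);

    *rng = saved;               /* rewind so pass two sees the same chars */

    buf = malloc (size + 1);
    if (buf == NULL)
        return NULL;

    for (i = 0; i < char_count; i++)
        offset += encode_utf8 (pick_char (rng), buf + offset);

    assert (offset == size);
    buf[offset] = 0;
    return buf;
}
```

This trades the reallocations for generating every character twice. Whether that is a win depends on how expensive the generator is compared with a handful of realloc calls.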
\r
167 > + randomchar = random_unichar();
\r
169 > + written = g_unichar_to_utf8 (randomchar, buf + offset);
\r
171 > + if (written <= 0) {
\r
172 > + fprintf (stderr, "error converting to utf8\n");
\r
176 > + offset += written;
\r
180 There is an extra newline above. There are a few others in other
\r
181 files (at least after opening and before closing braces).
\r
182 Maybe uncrustify your source :)
\r
184 > + buf[offset] = 0;
\r