Return-Path: <ciprian.craciun@gmail.com>
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
 by olra.theworths.org (Postfix) with ESMTP id 3E88E431FC2
 for <notmuch@notmuchmail.org>; Tue, 14 Aug 2012 10:05:13 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Status: No, score=-0.799 tagged_above=-999 required=5
 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
 FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
 by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
 with ESMTP id uhSEJWkQGR-0 for <notmuch@notmuchmail.org>;
 Tue, 14 Aug 2012 10:05:12 -0700 (PDT)
Received: from mail-wi0-f173.google.com (mail-wi0-f173.google.com
 [209.85.212.173]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
 (No client certificate requested)
 by olra.theworths.org (Postfix) with ESMTPS id 6B921431FAE
 for <notmuch@notmuchmail.org>; Tue, 14 Aug 2012 10:05:12 -0700 (PDT)
Received: by wibhm6 with SMTP id hm6so3900887wib.2
 for <notmuch@notmuchmail.org>; Tue, 14 Aug 2012 10:05:11 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
 h=mime-version:in-reply-to:references:date:message-id:subject:from:to
 :content-type; bh=mD/ocqYSMyihREvcSf2OeRlzDTXe6KGu6E71JFKcjOU=;
 b=bEHA28V/kNaVHIEBjP5F7AZRnTpzOnOdYcRkvpXsylOOpVznfBlc2wGFfMACFX7wD5
 zp95auKVotrL6rcm1+D7LBk7mh2FrJwLYb9DK4P4Rw4aULP8vgS8fZAbyg8DZ16UKl5G
 U7b7PvruXWSUf4vHXCx4bMZXaYRMezNEqFQxY9P2rIBlGkPMteVnT55etnBI3UsWyi5s
 Fy31ZoEWAclADghSwS00xyzBZCD2twwRe0rtzDLQi96aQ2PINKhZkGOqCxWY/qIB1g4e
 3dg66huhl8V5xeR8c83sjW+T5FzxA532a0RdfZYc05Df8uzjf6h9MYgX07pu/yH74JQL
Received: by 10.180.74.33 with SMTP id q1mr29483925wiv.4.1344963911172; Tue,
 14 Aug 2012 10:05:11 -0700 (PDT)
Received: by 10.180.104.196 with HTTP; Tue, 14 Aug 2012 10:05:11 -0700 (PDT)
In-Reply-To: <20120814165044.GP28321@pub.cz.oracle.com>
 <CA+Tk8fwq2thNeKHgfG-EX0hgR7uyqrSce0ZMOhEJBsz1RVtRqg@mail.gmail.com>
 <20120811094635.GY28321@pub.cz.oracle.com> <874no613ms.fsf@flamingspork.com>
 <20120814160442.GO28321@pub.cz.oracle.com>
 <CA+Tk8fwVwWewTS-AVaaapQpLNU6a698acp-_ZmnktJ5ynRrx1A@mail.gmail.com>
 <20120814165044.GP28321@pub.cz.oracle.com>
Date: Tue, 14 Aug 2012 20:05:11 +0300
 <CA+Tk8fwT4Hb3upMoucWUBeP8RMo6hTMi5zkH1HcPC6dhkS60wg@mail.gmail.com>
Subject: Re: Alternative (raw) message store (i.e. instead of maildir)
From: Ciprian Dorin Craciun <ciprian.craciun@gmail.com>
To: Vladimir.Marek@oracle.com, Stewart Smith <stewart@flamingspork.com>,
 notmuch@notmuchmail.org
Content-Type: text/plain; charset=UTF-8
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
 <notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Tue, 14 Aug 2012 17:05:13 -0000

On Tue, Aug 14, 2012 at 7:50 PM, Vladimir Marek
<Vladimir.Marek@oracle.com> wrote:
>> On the other hand, I strongly support having a more optimized
>> backend for emails, especially for such cases. For example,
>> BerkeleyDB would fit such a use case perfectly, especially if we store
>> the body and the headers in separate databases.
>>
>> Just a small experiment: below is the R `summary(emails)` of the
>> sizes of my 700k emails:
>>
>>      Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
>>         8     4364     5374    11510     7042 31090000
>>
>> As seen, 75% of the emails are below 7k, and this is without any
>> compression...
>>
>> Moreover, we could organize the keys so that, in a B-Tree structure,
>> the emails in the same thread are closer together...
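To make the key layout idea concrete, here is a minimal Python sketch. It assumes a store whose B-Tree compares keys bytewise, and it assumes each email has a numeric thread id and a Unix date. The `make_key` and `thread_range` helpers and the field widths are hypothetical and purely illustrative; nothing here is notmuch or BerkeleyDB API.

```python
import struct
from bisect import bisect_left
from typing import List


def make_key(thread_id: int, date: int, message_id: str) -> bytes:
    # Big-endian fixed-width integers sort bytewise in numeric order,
    # so keys cluster by thread first, then by date within the thread.
    return struct.pack(">QQ", thread_id, date) + message_id.encode("utf-8")


def thread_range(sorted_keys: List[bytes], thread_id: int) -> List[bytes]:
    # All keys of one thread share the same 8-byte prefix, so they form
    # a single contiguous slice of the sorted key space.
    lo = bisect_left(sorted_keys, struct.pack(">Q", thread_id))
    hi = bisect_left(sorted_keys, struct.pack(">Q", thread_id + 1))
    return sorted_keys[lo:hi]
```

With such keys, reading a whole thread becomes one short range scan over neighbouring entries instead of many scattered lookups. The sorted list here only stands in for the ordered key space of a real B-Tree.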

> Now I'm not sure if you are talking about some berkeley-db fuse filesystem or
> direct support in notmuch.

I proposed (or rather asked whether it would be possible, or at least
wished) to have an internal interface (SPI) that any mail store would
have to implement in order to be indexed and used by notmuch. I guess
the interface would be quite lightweight, and would need just the
following operations:

* create a cursor iterating through all the emails, yielding only the keys;
* read the envelope (as a byte blob) of a particular key (used
  only for displaying thread lists, etc.);
* read the body (as a byte blob) of a particular key;
* maybe create a cursor iterating over all those emails that have
  changed since a particular timestamp.
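
The operations listed above could be sketched in Python roughly as follows. This is only an illustration under my own assumptions: the `MailStore` and `InMemoryStore` names, the method names, and the `put` helper (used only to fill the toy backend, and not part of the read-only interface) are all hypothetical, and the toy backend assumes that headers and body are separated by the first empty line with plain `\n` line endings.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Tuple


class MailStore(ABC):
    """Hypothetical store interface; the names are illustrative only."""

    @abstractmethod
    def keys(self) -> Iterator[bytes]:
        """Iterate over all emails, yielding only their keys."""

    @abstractmethod
    def read_envelope(self, key: bytes) -> bytes:
        """Return the envelope (headers) of one email as a byte blob."""

    @abstractmethod
    def read_body(self, key: bytes) -> bytes:
        """Return the body of one email as a byte blob."""

    @abstractmethod
    def changed_since(self, timestamp: float) -> Iterator[bytes]:
        """Yield keys of emails changed after the given Unix timestamp."""


class InMemoryStore(MailStore):
    """Toy backend: one dict entry per email, (mtime, envelope, body)."""

    def __init__(self) -> None:
        self._items: Dict[bytes, Tuple[float, bytes, bytes]] = {}

    def put(self, key: bytes, raw: bytes, mtime: float) -> None:
        # Assumption: headers end at the first empty line of the message.
        envelope, _, body = raw.partition(b"\n\n")
        self._items[key] = (mtime, envelope, body)

    def keys(self) -> Iterator[bytes]:
        return iter(sorted(self._items))

    def read_envelope(self, key: bytes) -> bytes:
        return self._items[key][1]

    def read_body(self, key: bytes) -> bytes:
        return self._items[key][2]

    def changed_since(self, timestamp: float) -> Iterator[bytes]:
        return (k for k in sorted(self._items) if self._items[k][0] > timestamp)
```

A maildir, Zip, or BerkeleyDB backend would then be just another subclass, and an indexer would only ever talk to the four abstract methods.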

> I don't have enough cycles to modify notmuch,
> so I started to look at a simpler (codewise) solution ...

> To summarize, what I personally want from the mail storage

We need to make a distinction between current storage (like
maildir) and archival storage (like the Zip or my proposal).

> - ability to read and write mails

It could be done through a small CLI over the proposed API.
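
A rough Python sketch of what such a CLI might look like, with the store reduced to a plain dict for brevity. The `mailstore` program name, the `read` and `write` subcommands, and the `run` function are hypothetical; a real tool would take the message from stdin and call into whatever store interface is eventually agreed upon.

```python
import argparse
from typing import Dict, List, Tuple


def run(argv: List[str], store: Dict[str, bytes],
        data: bytes = b"") -> Tuple[int, bytes]:
    """Run one command; return (exit status, bytes to print)."""
    parser = argparse.ArgumentParser(prog="mailstore")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("read").add_argument("key")
    sub.add_parser("write").add_argument("key")
    args = parser.parse_args(argv)

    if args.command == "write":
        # Store the raw message under the given key.
        store[args.key] = data
        return 0, b""
    if args.key not in store:
        # Unknown key: report failure through the exit status.
        return 1, b""
    return 0, store[args.key]
```

For example, `run(["write", "m1"], store, raw_message)` followed by `run(["read", "m1"], store)` hands the same bytes back.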

> - should work with mutt (or mutt-kz)

This would eliminate any proposal not involving a FUSE wrapper...

> - simple backup to windows drive (files can't contain a colon ':')

This could be done via a dump-like facility. (BerkeleyDB supports
this natively through a tool.)