From bb17f872f52f8d8606feb5f883fa8788257950ca Mon Sep 17 00:00:00 2001
From: Ciprian Dorin Craciun
Date: Tue, 14 Aug 2012 20:05:11 +0300
Subject: [PATCH] Re: Alternative (raw) message store (i.e. instead of maildir)

---
 6e/318bbf5e4992696b352c0af7faea5aeae7f65f | 125 ++++++++++++++++++++++
 1 file changed, 125 insertions(+)
 create mode 100644 6e/318bbf5e4992696b352c0af7faea5aeae7f65f

diff --git a/6e/318bbf5e4992696b352c0af7faea5aeae7f65f b/6e/318bbf5e4992696b352c0af7faea5aeae7f65f
new file mode 100644
index 000000000..74f82e3f9
--- /dev/null
+++ b/6e/318bbf5e4992696b352c0af7faea5aeae7f65f
@@ -0,0 +1,125 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 3E88E431FC2
+ for ; Tue, 14 Aug 2012 10:05:13 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: -0.799
+X-Spam-Level:
+X-Spam-Status: No, score=-0.799 tagged_above=-999 required=5
+ tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
+ FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id uhSEJWkQGR-0 for ;
+ Tue, 14 Aug 2012 10:05:12 -0700 (PDT)
+Received: from mail-wi0-f173.google.com (mail-wi0-f173.google.com
+ [209.85.212.173]) (using TLSv1 with cipher RC4-SHA (128/128 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 6B921431FAE
+ for ; Tue, 14 Aug 2012 10:05:12 -0700 (PDT)
+Received: by wibhm6 with SMTP id hm6so3900887wib.2
+ for ; Tue, 14 Aug 2012 10:05:11 -0700 (PDT)
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113;
+ h=mime-version:in-reply-to:references:date:message-id:subject:from:to
+ :content-type; bh=mD/ocqYSMyihREvcSf2OeRlzDTXe6KGu6E71JFKcjOU=;
+ b=bEHA28V/kNaVHIEBjP5F7AZRnTpzOnOdYcRkvpXsylOOpVznfBlc2wGFfMACFX7wD5
+ zp95auKVotrL6rcm1+D7LBk7mh2FrJwLYb9DK4P4Rw4aULP8vgS8fZAbyg8DZ16UKl5G
+ U7b7PvruXWSUf4vHXCx4bMZXaYRMezNEqFQxY9P2rIBlGkPMteVnT55etnBI3UsWyi5s
+ Fy31ZoEWAclADghSwS00xyzBZCD2twwRe0rtzDLQi96aQ2PINKhZkGOqCxWY/qIB1g4e
+ 3dg66huhl8V5xeR8c83sjW+T5FzxA532a0RdfZYc05Df8uzjf6h9MYgX07pu/yH74JQL
+ eAjg==
+MIME-Version: 1.0
+Received: by 10.180.74.33 with SMTP id q1mr29483925wiv.4.1344963911172; Tue,
+ 14 Aug 2012 10:05:11 -0700 (PDT)
+Received: by 10.180.104.196 with HTTP; Tue, 14 Aug 2012 10:05:11 -0700 (PDT)
+In-Reply-To: <20120814165044.GP28321@pub.cz.oracle.com>
+References:
+
+ <20120811094635.GY28321@pub.cz.oracle.com> <874no613ms.fsf@flamingspork.com>
+ <20120814160442.GO28321@pub.cz.oracle.com>
+
+ <20120814165044.GP28321@pub.cz.oracle.com>
+Date: Tue, 14 Aug 2012 20:05:11 +0300
+Message-ID:
+
+Subject: Re: Alternative (raw) message store (i.e. instead of maildir)
+From: Ciprian Dorin Craciun
+To: Vladimir.Marek@oracle.com, Stewart Smith ,
+ notmuch@notmuchmail.org
+Content-Type: text/plain; charset=UTF-8
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Tue, 14 Aug 2012 17:05:13 -0000
+
+On Tue, Aug 14, 2012 at 7:50 PM, Vladimir Marek
+ wrote:
+>> On the other hand I strongly support having a more optimized
+>> backend for emails, especially for such cases. For example a
+>> BerkeleyDB would perfectly fit such a use case, especially if we store
+>> the body and the headers in separate databases.
+>>
+>> Just a small experiment, below is the R `summary(emails)` of the
+>> sizes of my 700k emails:
+>> ~~~~
+>>     Min.  1st Qu.   Median     Mean  3rd Qu.     Max.
+>>        8     4364     5374    11510     7042 31090000
+>> ~~~~
+>>
+>> As seen, 75% of the emails are below 7k, and this without any compression...
+>>
+>> Moreover we could organize the keys so that in a B-Tree structure
+>> the emails in the same thread are closer together...
+>
+> Now I'm not sure if you talk about some berkeley-db fuse filesystem or
+> direct support in notmuch.
+
+ No tricks. :)
+
+ I proposed -- or, better said, asked whether it is possible, or at least
+wished -- to have an internal interface (SPI) that any mail store would have
+to implement in order to be indexed and used by notmuch. I guess the
+interface would be quite lightweight, and would need just the
+following:
+ * open store;
+ * create a cursor iterating through all the emails, yielding only the keys;
+ * read the envelope (as a byte blob) of a particular key (used
+only for displaying thread lists, etc.);
+ * read the body (as a byte blob) of a particular key;
+ * maybe create a cursor iterating over all those emails that have
+changed since a particular timestamp;
+
+
+> I don't have enough cycles to modify notmuch,
+> so I started to look at simpler (codewise) solution ...
+>
+> To summarize, what I personally want from the mail storage
+
+ We need to make a distinction between current storage (like
+maildir) and archival storage (like the Zip or my proposal).
+
+
+> - ability to read and write mails
+
+ It could be done through a small CLI over the proposed API.
+
+> - should work with mutt (or mutt-kz)
+
+ This would eliminate any proposal not involving a FUSE wrapper...
+
+> - simple backup to windows drive (files can't contain double colon ':')
+
+ This could be done via a dump-like facility. (BerkeleyDB supports
+this natively through a tool.)
-- 
2.26.2
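
The five store operations listed in the message above could look roughly like the sketch below. This is only an illustration of the proposed SPI, not anything that exists in notmuch: the names `MailStore`, `MemoryStore`, `put`, `iter_keys`, `read_envelope`, `read_body` and `iter_changed_since` are all hypothetical, and the toy in-memory backend assumes that headers and body are separated by a blank line with LF line endings.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterator, Tuple


class MailStore(ABC):
    """Hypothetical store interface; not part of notmuch."""

    @abstractmethod
    def iter_keys(self) -> Iterator[str]:
        """Cursor over all emails, yielding only the keys."""

    @abstractmethod
    def read_envelope(self, key: str) -> bytes:
        """Envelope (headers) of one email, as a byte blob."""

    @abstractmethod
    def read_body(self, key: str) -> bytes:
        """Body of one email, as a byte blob."""

    @abstractmethod
    def iter_changed_since(self, timestamp: float) -> Iterator[str]:
        """Cursor over keys of emails changed after `timestamp`."""


class MemoryStore(MailStore):
    """Toy backend kept in a dict; a real one might wrap BerkeleyDB."""

    def __init__(self) -> None:
        # Plays the role of "open store"; a real backend would open
        # its database files here.
        self._mails: Dict[str, Tuple[bytes, bytes, float]] = {}

    def put(self, key: str, raw: bytes, changed_at: float) -> None:
        # Assumption: the first blank line separates headers from body.
        envelope, _, body = raw.partition(b"\n\n")
        self._mails[key] = (envelope, body, changed_at)

    def iter_keys(self) -> Iterator[str]:
        # Sorted iteration mimics the key order of a B-Tree cursor.
        return iter(sorted(self._mails))

    def read_envelope(self, key: str) -> bytes:
        return self._mails[key][0]

    def read_body(self, key: str) -> bytes:
        return self._mails[key][1]

    def iter_changed_since(self, timestamp: float) -> Iterator[str]:
        return (k for k in sorted(self._mails)
                if self._mails[k][2] > timestamp)
```

Keeping envelope and body behind separate calls mirrors the idea of storing headers and bodies in separate databases: a thread list only needs `read_envelope`, so the usually larger bodies are never touched.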