From: Tomi Ollila
To: Eirik Byrkjeflot Anonsen, notmuch mailing list
Subject: Re: On disk tag storage format
In-Reply-To: <874nk8td7p.fsf@star.eba>
References: <874nk8v9zw.fsf@zancas.localnet> <874nk8td7p.fsf@star.eba>
User-Agent: Notmuch/0.14+116~g29fcdb5 (http://notmuchmail.org) Emacs/24.2.1 (x86_64-unknown-linux-gnu)
List-Id: "Use and development of the notmuch mail system."
X-List-Received-Date: Fri, 30 Nov 2012 07:31:38 -0000

On Thu, Nov 29 2012, Eirik Byrkjeflot Anonsen wrote:

> David Bremner writes:
>
>> Austin outlined on IRC a way of representing tags on disk as hardlinks
>> to messages. In order to make the discussion more concrete, I wrote a
>> prototype in python to dump the notmuch database to this format. On my
>> 250k messages, this creates 40k new hardlinks, and uses about 5M of
>> diskspace.
>> The dump process takes about 20s on
>> my core i7 machine. With symbolic links, the same database takes about
>> 150M of disk space; this isn't great but it isn't unbearable either.
>
> And eating 40k inodes, I suppose. Which may matter to some systems.
> (Hardlinks do not use extra inodes, as they are just directory entries
> pointing to already existing inodes).
>
> Of course, the space usage also depends on the file system, as e.g. ext2
> would use 1 complete block (typically 4kiB) to store the file name
> pointed to per symlink. ReiserFS would probably use 5M for the
> directory entries and another 5M for the symlink data (wild guess).

IIRC, in the mid-1990s (some) frisbee fs stored many symbolic links to one
inode and, at the same time, stored multiple link names to the same fs
block ... note that IIRC :D

> eirik

Tomi
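
PS. The inode point quoted above can be checked with a few lines of python. This is only a minimal sketch, assuming a POSIX file system that supports both hardlinks and symlinks; the helper name link_inodes and the file names msg, tag-hard and tag-soft are made up for illustration and have nothing to do with the actual prototype.

```python
import os
import tempfile


def link_inodes(directory):
    """Create a dummy message plus one hardlink and one symlink to it.

    Returns the inode numbers of (message, hardlink, symlink).
    All file names here are hypothetical, chosen for illustration only.
    """
    msg = os.path.join(directory, "msg")
    hard = os.path.join(directory, "tag-hard")
    soft = os.path.join(directory, "tag-soft")

    with open(msg, "w") as f:
        f.write("dummy message\n")

    os.link(msg, hard)     # another directory entry for the existing inode
    os.symlink(msg, soft)  # a separate file whose content is the target path

    # lstat() does not follow symlinks, so it reports the link's own inode.
    return (os.lstat(msg).st_ino,
            os.lstat(hard).st_ino,
            os.lstat(soft).st_ino)


with tempfile.TemporaryDirectory() as d:
    msg_ino, hard_ino, soft_ino = link_inodes(d)
    print("hardlink shares the message inode:", hard_ino == msg_ino)
    print("symlink has an inode of its own:", soft_ino != msg_ino)
```

How much disk space each symlink then costs still depends on the file system, so the sketch says nothing about the 5M vs. 150M figures.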