From: Jani Nikula <jani@nikula.org>
To: Mark Walters <markwalters1009@gmail.com>,
 Tomi Ollila <tomi.ollila@iki.fi>, Vladimir Marek <Vladimir.Marek@oracle.com>,
 notmuch@notmuchmail.org
Subject: Re: Deduplication ?
In-Reply-To: <87ppirqtfa.fsf@qmul.ac.uk>
References: <20140602123212.GA12639@virt.cz.oracle.com>
 <87d2ers9mi.fsf@qmul.ac.uk> <m2ppirs8ea.fsf@guru.guru-group.fi>
 <87ppirqtfa.fsf@qmul.ac.uk>
User-Agent: Notmuch/0.18+24~gfe8cd90 (http://notmuchmail.org) Emacs/24.3.1
 (x86_64-pc-linux-gnu)
Date: Mon, 02 Jun 2014 20:06:09 +0300
Message-ID: <87y4xfz1fi.fsf@nikula.org>
Content-Type: text/plain
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
 <notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
 <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Mon, 02 Jun 2014 17:06:24 -0000

On Mon, 02 Jun 2014, Mark Walters <markwalters1009@gmail.com> wrote:
> Tomi Ollila <tomi.ollila@iki.fi> writes:
>
>> On Mon, Jun 02 2014, Mark Walters <markwalters1009@gmail.com> wrote:
>>
>>> Vladimir Marek <Vladimir.Marek@oracle.com> writes:
>>> If you want to save disk space then you could delete the duplicates
>>> after with something like
>>>
>>> notmuch search --output=files --format=text0 --duplicate=2 '*' piped to
>>
>> What if there are 3 duplicates (or 4... ;)
>
> I was assuming that it was merging 2 duplicate-free bunches of messages,
> but I guess the new 100000 might not be. In that case running the above
> repeatedly (ie until it is a no-op) would be fine.

With 'notmuch new' in between the runs, obviously.

Alternatively, find the biggest --duplicate=N which still outputs
something, and run the command for each N...2.
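
A rough Python sketch of the N...2 idea, in case it helps. It only
collects the file names and leaves the actual removal to the caller.
The helper names (notmuch_duplicates, plan_removals) and the max_n
safety cap are made up for illustration; only the notmuch search
options come from the command quoted above.

```python
import os
import subprocess
from typing import Callable, List


def notmuch_duplicates(n: int) -> List[str]:
    """Return the N-th file name of every message that has at least N files.

    Runs the search command quoted in this thread with --duplicate=n.
    """
    out = subprocess.run(
        ["notmuch", "search", "--output=files", "--format=text0",
         f"--duplicate={n}", "*"],
        check=True, capture_output=True,
    ).stdout
    # text0 output is NUL-separated; skip the empty field after the last NUL.
    return [os.fsdecode(p) for p in out.split(b"\0") if p]


def plan_removals(list_duplicates: Callable[[int], List[str]],
                  max_n: int = 1000) -> List[str]:
    """Collect duplicate files for N = biggest..2 (hypothetical helper).

    list_duplicates(n) must behave like notmuch_duplicates above.
    max_n is an arbitrary cap so a misbehaving callback cannot loop forever.
    """
    # Find the biggest N which still outputs something.
    biggest = 1
    while biggest < max_n and list_duplicates(biggest + 1):
        biggest += 1

    # Run the listing for each N...2; N=1 is the copy that is kept.
    plan: List[str] = []
    for n in range(biggest, 1, -1):
        plan.extend(list_duplicates(n))
    return plan
```

Calling plan_removals(notmuch_duplicates) would give the list of files
to review and delete. Since every listing comes from the same
unmodified index, a single 'notmuch new' after the deletions should be
enough with this variant, but that is my assumption and not something
I have tested.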

>> One should also have some message content heuristics to determine that the
>> content is indeed duplicate and not something totally different (not that
>> we can see the different content anyway... but...)
>
> That would be nice.
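
One possible heuristic, purely as an illustration: ignore all headers
and compare a hash of the decoded body parts of the two files. This is
not something notmuch does, and the function names below are
hypothetical. It will also report a mismatch when a mailing list has
appended a footer to one of the copies, so treat it as a starting
point only.

```python
import hashlib
from email import message_from_bytes


def body_digest(raw: bytes) -> str:
    """Hash the body parts of a raw message, ignoring every header."""
    msg = message_from_bytes(raw)
    digest = hashlib.sha256()
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no content of their own
        payload = part.get_payload(decode=True) or b""
        # Normalise line endings and trailing whitespace so that CRLF vs LF
        # differences between the copies do not matter.
        digest.update(b"\n".join(line.rstrip() for line in payload.splitlines()))
        digest.update(b"\0")  # separator between parts
    return digest.hexdigest()


def looks_like_same_content(raw_a: bytes, raw_b: bytes) -> bool:
    """True if both raw messages have identical (normalised) bodies."""
    return body_digest(raw_a) == body_digest(raw_b)
```

The raw bytes would come from reading the files that notmuch lists for
one message, before deciding whether any of them is safe to delete.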