From: Tomi Ollila
Date: Sat, 7 Jun 2014 13:37:41 +0000 (+0300)
Subject: Re: Deduplication ?
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=ded752ab54192d9fb734cbb21fb3c3c8881ba38d;p=notmuch-archives.git

Re: Deduplication ?
---

diff --git a/3c/489e66737ebc35278a96cc39a5c30df03872ec b/3c/489e66737ebc35278a96cc39a5c30df03872ec
new file mode 100644
index 000000000..80892ca98
--- /dev/null
+++ b/3c/489e66737ebc35278a96cc39a5c30df03872ec
@@ -0,0 +1,103 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 6195C40D1CF
+ for ; Sat, 7 Jun 2014 06:38:01 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: 0
+X-Spam-Level:
+X-Spam-Status: No, score=0 tagged_above=-999 required=5 tests=[none]
+ autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id YYNB-c-X9UwR for ;
+ Sat, 7 Jun 2014 06:37:50 -0700 (PDT)
+Received: from guru.guru-group.fi (guru.guru-group.fi [46.183.73.34])
+ by olra.theworths.org (Postfix) with ESMTP id 9A9D040A924
+ for ; Sat, 7 Jun 2014 06:37:50 -0700 (PDT)
+Received: from guru.guru-group.fi (localhost [IPv6:::1])
+ by guru.guru-group.fi (Postfix) with ESMTP id 4F6681000B3;
+ Sat, 7 Jun 2014 16:37:41 +0300 (EEST)
+From: Tomi Ollila
+To: Vladimir Marek
+Subject: Re: Deduplication ?
+In-Reply-To: <20140606104018.GJ2154@virt.cz.oracle.com>
+References: <20140602123212.GA12639@virt.cz.oracle.com>
+ <87d2ers9mi.fsf@qmul.ac.uk>
+ <87ppirqtfa.fsf@qmul.ac.uk> <87y4xfz1fi.fsf@nikula.org>
+
+ <20140606104018.GJ2154@virt.cz.oracle.com>
+User-Agent: Notmuch/0.18+28~gcecaba1 (http://notmuchmail.org) Emacs/24.3.1
+ (x86_64-unknown-linux-gnu)
+X-Face: HhBM'cA~
+MIME-Version: 1.0
+Content-Type: text/plain
+Cc: notmuch@notmuchmail.org
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Sat, 07 Jun 2014 13:38:01 -0000
+
+On Fri, Jun 06 2014, Vladimir Marek wrote:
+
+> Hi,
+>
+
+ // stuff deleted //
+
+>
+> I'm attaching my perl script if anyone is interested. It's in no way
+> complete solution. It is supposed to be used as
+>
+> notmuch search --output=files --duplicate=2 '*' > dups
+> ./dedup # It opens the file 'dups'
+>
+> The attached version does not remove anything (the 'unlink' command is
+> commented out).
+>
+>
+> Interestingly this does not work (it seems to return all messages):
+> notmuch search --output=messages --duplicate=2 '*'
+>
+> Also I have found that if I run 'notmuch search' and 'notmuch new' at
+> the same time, the notmuch search crashes sometimes. That's why I don't
+> use
+>
+> notmuch search ... | ./dedup
+>
+> Use with care :)
+
+To me, any perl code that lacks use strict; use warnings; looks like a BIG
+footgun ;/
+
+>
+> Thank you for your help
+> --
+> Vlad
+
+
+Tomi
+
+> #!/usr/bin/perl
+>
+> use Data::Dumper;
+> use List::Util;
+>
+>
+> @TO_IGNORE= (
+>