From: Kevin McCarthy
To: notmuch@notmuchmail.org
Subject: [PATCH 1/2] notmuch-mutt: use notmuch --duplicate flag
Date: Wed, 4 Sep 2013 19:05:50 -0700
Message-Id: <1378346751-25548-2-git-send-email-kevin@8t8.us>
X-Mailer: git-send-email 1.8.4.rc3
In-Reply-To: <1378346751-25548-1-git-send-email-kevin@8t8.us>
References: <1378346751-25548-1-git-send-email-kevin@8t8.us>
Cc: zack@upsilon.cc
List-Id: "Use and development of the notmuch mail system."

Change notmuch-mutt to use the new --duplicate=1 flag for duplicate
removal.  This will remove duplicates based on message-id at the notmuch
level.  Previously we were using fdupes or generating sha sums after the
search.  This version will be faster, but will enable the possibility of
hiding search results due to accidental/malicious duplicate message-ids.
---
 contrib/notmuch-mutt/README       |  5 ---
 contrib/notmuch-mutt/notmuch-mutt | 64 +++++++--------------------------------
 2 files changed, 11 insertions(+), 58 deletions(-)

diff --git a/contrib/notmuch-mutt/README b/contrib/notmuch-mutt/README
index e00035c..382ac91 100644
--- a/contrib/notmuch-mutt/README
+++ b/contrib/notmuch-mutt/README
@@ -41,11 +41,6 @@ To *run* notmuch-mutt you will need Perl with the following libraries:
   (Debian package: libstring-shellquote-perl)
 - Term::ReadLine
   (Debian package: libterm-readline-gnu-perl)
-- File::Which
-  (Debian package: libfile-which-perl)
-
-The --remove-dups option will use fdupes
-if it is installed. Version fdupes-1.50-PR2 or higher is required.
 
 To *build* notmuch-mutt documentation you will need:
 
diff --git a/contrib/notmuch-mutt/notmuch-mutt b/contrib/notmuch-mutt/notmuch-mutt
index 00c5ef8..c69b35c 100755
--- a/contrib/notmuch-mutt/notmuch-mutt
+++ b/contrib/notmuch-mutt/notmuch-mutt
@@ -18,8 +18,6 @@ use Mail::Box::Maildir;
 use Pod::Usage;
 use String::ShellQuote;
 use Term::ReadLine;
-use Digest::SHA;
-use File::Which;
 
 
 my $xdg_cache_dir = "$ENV{HOME}/.cache";
@@ -36,65 +34,22 @@ sub empty_maildir($) {
     $folder->close();
 }
 
-# Match files by size and SHA-256; then delete duplicates
-sub builtin_remove_dups($) {
-    my ($maildir) = @_;
-    my (%size_to_files, %sha_to_files);
-
-    # Group files by matching sizes
-    foreach my $file (glob("$maildir/cur/*")) {
-        my $size = -s $file;
-        push(@{$size_to_files{$size}}, $file) if $size;
-    }
-
-    foreach my $same_size_files (values %size_to_files) {
-        # Don't run sha unless there is another file of the same size
-        next if scalar(@$same_size_files) < 2;
-        %sha_to_files = ();
-
-        # Group files with matching sizes by SHA-256
-        foreach my $file (@$same_size_files) {
-            open(my $fh, '<', $file) or next;
-            binmode($fh);
-            my $sha256hash = Digest::SHA->new(256)->addfile($fh)->hexdigest;
-            close($fh);
-
-            push(@{$sha_to_files{$sha256hash}}, $file);
-        }
-
-        # Remove duplicates
-        foreach my $same_sha_files (values %sha_to_files) {
-            next if scalar(@$same_sha_files) < 2;
-            unlink(@{$same_sha_files}[1..$#$same_sha_files]);
-        }
-    }
-}
-
-# Use either fdupes or the built-in scanner to detect and remove duplicate
-# search results in the maildir
-sub remove_duplicates($) {
-    my ($maildir) = @_;
-
-    my $fdupes = which("fdupes");
-    if ($fdupes) {
-        system("$fdupes --hardlinks --symlinks --delete --noprompt"
-               . " --quiet $maildir/cur/ > /dev/null");
-    } else {
-        builtin_remove_dups($maildir);
-    }
-}
-
 # search($maildir, $remove_dups, $query)
 # search mails according to $query with notmuch; store results in $maildir
 sub search($$$) {
     my ($maildir, $remove_dups, $query) = @_;
+    my $dup_option = "";
+
     $query = shell_quote($query);
 
+    if ($remove_dups) {
+        $dup_option = "--duplicate=1";
+    }
+
     empty_maildir($maildir);
-    system("notmuch search --output=files $query"
+    system("notmuch search --output=files $dup_option $query"
            . " | sed -e 's: :\\\\ :g'"
            . " | xargs --no-run-if-empty ln -s -t $maildir/cur/");
-    remove_duplicates($maildir) if ($remove_dups);
 }
 
 sub prompt($$) {
@@ -252,7 +207,10 @@ Instead of using command line search terms, prompt the user for them (only for
 
 =item --remove-dups
 
-Remove duplicates from search results.
+Remove emails with duplicate message-ids from search results. (Passes
+--duplicate=1 to notmuch search command.) Note this can hide search
+results if an email accidentally or maliciously uses the same message-id
+as a different email.
 
 =item -h
 
-- 
1.8.4.rc3
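
To illustrate the trade-off described in the commit message, here is a minimal sketch that is not part of the patch. It contrasts keeping one result per message-id (roughly what --duplicate=1 selects) with keeping one result per content hash (roughly what the removed SHA-256 scanner did). The file names, message-ids, bodies, and the dedupe() helper are all hypothetical, and the sketch hashes an in-memory body string rather than whole files on disk.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Digest::SHA qw(sha256_hex);

# Hypothetical search results: [file name, Message-ID, body].
my @results = (
    ["cur/a", '<1@example.org>', "hello"],
    ["cur/b", '<1@example.org>', "hello"],           # true duplicate of cur/a
    ["cur/c", '<2@example.org>', "report"],
    ["cur/d", '<2@example.org>', "forged content"],  # same id, different mail
);

# Keep the first file seen for each key; $key_of maps a result to its key.
sub dedupe {
    my ($key_of, @msgs) = @_;
    my %seen;
    return map { $_->[0] } grep { !$seen{ $key_of->($_) }++ } @msgs;
}

# One result per message-id versus one result per content hash.
my @by_id   = dedupe(sub { $_[0][1] }, @results);
my @by_hash = dedupe(sub { sha256_hex($_[0][2]) }, @results);

print "by message-id: @by_id\n";
print "by content:    @by_hash\n";
```

In this toy data set the message-id strategy drops cur/d even though its content differs from cur/c, which is the hidden-result case the commit message warns about; the content-hash strategy keeps it at the cost of reading every candidate.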