From: David Bremner
To: Eirik Byrkjeflot Anonsen, notmuch@notmuchmail.org
Subject: Re: Automatic suppression of non-duplicate messages
Date: Sat, 03 Nov 2012 16:53:19 -0400
Message-ID: <87390qxvb4.fsf@maritornes.cs.unb.ca>
In-Reply-To: <87mwyz3s9d.fsf@star.eba>
References: <87mwyz3s9d.fsf@star.eba>

Eirik Byrkjeflot Anonsen writes:

> That's not what I see.
> If I search for a term that only appears in
> one of the "copies", none of the copies are included in the search
> result.

The offending code is at line 1813 of lib/database.cc; the message is
only indexed if the message-id is new. It might be sensible to move
_notmuch_message_index_file into the other branch of the if, but even if
that works fine, something more sophisticated is needed for the call to
_notmuch_message_set_header_values; the invariant that each message has
a single subject seems reasonable.

Offhand I'm not sure of a good method of automatically deciding what is
the same message (with e.g. headers and footer text added by a mailing
list).
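
To make the branch structure described above concrete, here is a rough
toy model. It is not the code from lib/database.cc: ToyIndex, ToyMessage,
index_body and the index_duplicates flag are all hypothetical names made
up for this sketch. With the flag off, a second copy with a known
message-id is dropped without being indexed; with it on, the extra
copy's terms are indexed but the subject from the first copy is kept, so
each message still has a single subject.

```cpp
#include <map>
#include <set>
#include <sstream>
#include <string>

// Toy stand-in for a mail index; these are NOT notmuch's real data structures.
struct ToyMessage {
    std::string subject;          // header value, set once from the first copy
    std::set<std::string> terms;  // searchable body terms
};

struct ToyIndex {
    std::map<std::string, ToyMessage> by_id;  // keyed by Message-ID
    bool index_duplicates = false;            // false: only index new message-ids

    // Split the body on whitespace and record each word as a term.
    static void index_body(ToyMessage &msg, const std::string &body) {
        std::istringstream in(body);
        std::string word;
        while (in >> word)
            msg.terms.insert(word);
    }

    void add(const std::string &id, const std::string &subject,
             const std::string &body) {
        auto it = by_id.find(id);
        if (it == by_id.end()) {
            // New message-id: set the header values and index the body.
            ToyMessage msg;
            msg.subject = subject;
            index_body(msg, body);
            by_id.emplace(id, msg);
        } else if (index_duplicates) {
            // Known message-id: index this copy's terms as well, but keep
            // the subject that the first copy set.
            index_body(it->second, body);
        }
        // Otherwise the duplicate copy's text is dropped without indexing.
    }

    // True if the message with this id contains the given term.
    bool matches(const std::string &id, const std::string &term) const {
        auto it = by_id.find(id);
        return it != by_id.end() && it->second.terms.count(term) > 0;
    }
};
```

In this model, adding a copy whose body carries an extra list footer
leaves the footer unsearchable by default, and searchable once
index_duplicates is set. It says nothing about how to decide that two
files with different message-ids are really the same message.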