From: Michal Sojka
To: Serge Z, notmuch@notmuchmail.org
Subject: Re: Searching through different charsets
Date: Fri, 24 Feb 2012 01:31:35 +0100
Message-ID: <877gzd5axk.fsf@steelpick.2x.cz>
In-Reply-To: <20120222171041.11455.92079@localhost>

On Wed, 22 Feb 2012, Serge Z wrote:
>
> Hello!
>
> I've got the following problem: fetched emails can be in different encodings.
> And searching a term typed in one encoding (system default) does not match the
> same term in another encoding.
>
> The solution, as I see, can be in preprocessing each incoming email to
> "normalize" it and its encoding so that indexer will handle emails in system
> encoding only. Could you please suggest something?

I can confirm this issue and am sending a patch with a test case (marked
as broken) for it. I expect the fix to be quite simple, because all the
encoding/decoding stuff is already implemented in gmime, which notmuch
uses when indexing.

>
> Another issue (not so much wanted but wanted too) is searching through html
> messages without matching html tags.

I don't know whether somebody is working on this or not.

> This problem looks to be solvable by properly configured run-mailcap. Is there
> such solution anywhere?

I don't think that run-mailcap has anything to do with notmuch.

-Michal
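
For illustration only, the "normalize the encoding before indexing" idea from the quoted question can be sketched in a few lines. This is a minimal Python sketch using the standard `email` module, not the gmime-based C code that notmuch uses, and the function name `text_parts_as_unicode` is hypothetical. It assumes each text part declares its charset correctly in its Content-Type header and falls back to lossy decoding when it does not.

```python
from email import message_from_bytes


def text_parts_as_unicode(raw: bytes) -> list[str]:
    """Return every text/plain part of a raw message as a Unicode string.

    Hypothetical helper: it decodes each part from its declared charset
    so that the same word compares equal regardless of how the sender
    encoded it.
    """
    msg = message_from_bytes(raw)
    parts = []
    for part in msg.walk():
        # Multipart containers and non-text parts are skipped.
        if part.get_content_type() != "text/plain":
            continue
        # decode=True undoes base64 / quoted-printable transfer encoding
        # and returns the part body as bytes.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        charset = part.get_content_charset() or "us-ascii"
        try:
            parts.append(payload.decode(charset, errors="replace"))
        except LookupError:
            # Unknown charset name: fall back to UTF-8, replacing bad bytes.
            parts.append(payload.decode("utf-8", errors="replace"))
    return parts
```

With this, a message whose body is encoded in KOI8-R and one encoded in UTF-8 yield the same string for the same text, so a term typed in the system encoding would match both.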