Re: [PATCH] test: Add test for searching of uncommonly encoded messages
author Serge Z <triumhiz@yandex.ru>
Fri, 24 Feb 2012 04:29:25 +0000 (08:29 +0400)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 17:44:53 +0000 (09:44 -0800)
8d/e05734cffb8204abeb245181f97236c05485da [new file with mode: 0644]

diff --git a/8d/e05734cffb8204abeb245181f97236c05485da b/8d/e05734cffb8204abeb245181f97236c05485da
new file mode 100644
index 0000000..5112f3b
--- /dev/null
@@ -0,0 +1,94 @@
+Return-Path: <triumhiz@yandex.ru>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 3F069431FBC\r
+       for <notmuch@notmuchmail.org>; Thu, 23 Feb 2012 20:34:09 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 1.146\r
+X-Spam-Level: *\r
+X-Spam-Status: No, score=1.146 tagged_above=-999 required=5\r
+       tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,\r
+       RCVD_IN_BL_SPAMCOP_NET=1.246, RCVD_IN_DNSWL_NONE=-0.0001]\r
+       autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id VmHE3kXSc61e for <notmuch@notmuchmail.org>;\r
+       Thu, 23 Feb 2012 20:34:08 -0800 (PST)\r
+X-Greylist: delayed 440 seconds by postgrey-1.32 at olra;\r
+       Thu, 23 Feb 2012 20:34:08 PST\r
+Received: from forward12.mail.yandex.net (forward12.mail.yandex.net\r
+       [95.108.130.94])\r
+       by olra.theworths.org (Postfix) with ESMTP id 7946B431FAE\r
+       for <notmuch@notmuchmail.org>; Thu, 23 Feb 2012 20:34:08 -0800 (PST)\r
+Received: from smtp14.mail.yandex.net (smtp14.mail.yandex.net\r
+ [95.108.131.192])     by forward12.mail.yandex.net (Yandex) with ESMTP id\r
+ 38CAAC223F4   for <notmuch@notmuchmail.org>; Fri, 24 Feb 2012 08:26:42 +0400\r
+ (MSK)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail;\r
+       t=1330057602; bh=iOGfT2kHNwEjlBccPL3bXGnX/59XyH6lboabtYqnOxA=;\r
+       h=Content-Type:MIME-Version:Content-Transfer-Encoding:From:To:\r
+       References:In-Reply-To:Message-ID:Subject:Date;\r
+       b=ri4ZfO+orp+XSGBBN18MTlTAwCWECLkyrjwCv4/FMT47Ft2+TiE+gF0pJE24DChK4\r
+       lgVmeLVYouJ4xoATzPoNGmXaDDQYBcIZ0pOWTG+Z1eISCt9QcZMCkNZ/Qb4aYXKB3Z\r
+       dzPMg6wYnPcnE9FASNQuLAt+j90+E81gC7yLv0Ik=\r
+Received: from smtp14.mail.yandex.net (localhost [127.0.0.1])\r
+       by smtp14.mail.yandex.net (Yandex) with ESMTP id 1BA6C1B60018\r
+       for <notmuch@notmuchmail.org>; Fri, 24 Feb 2012 08:26:42 +0400 (MSK)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=yandex.ru; s=mail;\r
+       t=1330057602; bh=iOGfT2kHNwEjlBccPL3bXGnX/59XyH6lboabtYqnOxA=;\r
+       h=Content-Type:MIME-Version:Content-Transfer-Encoding:From:To:\r
+       References:In-Reply-To:Message-ID:Subject:Date;\r
+       b=ri4ZfO+orp+XSGBBN18MTlTAwCWECLkyrjwCv4/FMT47Ft2+TiE+gF0pJE24DChK4\r
+       lgVmeLVYouJ4xoATzPoNGmXaDDQYBcIZ0pOWTG+Z1eISCt9QcZMCkNZ/Qb4aYXKB3Z\r
+       dzPMg6wYnPcnE9FASNQuLAt+j90+E81gC7yLv0Ik=\r
+Received: from host-158-152-66-217.spbmts.ru (host-158-152-66-217.spbmts.ru\r
+       [217.66.152.158])\r
+       by smtp14.mail.yandex.net (nwsmtp/Yandex) with ESMTP id\r
+       QcLWeVa9-QeLigpvl; Fri, 24 Feb 2012 08:26:40 +0400\r
+X-Yandex-Spam: 1\r
+Content-Type: text/plain; charset="utf-8"\r
+MIME-Version: 1.0\r
+Content-Transfer-Encoding: quoted-printable\r
+From: Serge Z <triumhiz@yandex.ru>\r
+User-Agent: alot/0.21+\r
+To: notmuch@notmuchmail.org\r
+References: <877gzd5axk.fsf@steelpick.2x.cz>\r
+       <1330043595-22054-1-git-send-email-sojkam1@fel.cvut.cz>\r
+In-Reply-To: <1330043595-22054-1-git-send-email-sojkam1@fel.cvut.cz>\r
+Message-ID: <20120224042925.2870.87924@localhost>\r
+Subject: Re: [PATCH] test: Add test for searching of uncommonly encoded\r
+       messages\r
+Date: Fri, 24 Feb 2012 08:29:25 +0400\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Fri, 24 Feb 2012 04:34:09 -0000\r
+\r
+\r
+Quoting Michal Sojka (2012-02-24 04:33:15)\r
+>Emails that are encoded differently than as ASCII or UTF-8 are not\r
+>indexed properly by notmuch. It is not possible to search for non-ASCII\r
+>words within those messages.\r
+\r
+OK. But we can preprocess each incoming message right after 'getmail'\r
+to convert it from HTML to text and to UTF-8 encoding.\r
+\r
+One solution is to create a separate script for this and make getmail\r
+pipe all messages to that script, and then to notmuch.\r
+\r
+But it would be better if the maildir contained only the original\r
+messages, so the question is: can we make the notmuch indexing engine\r
+index a preprocessed copy of each message while the maildir keeps the\r
+original message, exactly as it was obtained?\r
+\r