Return-Path: <sojkam1@fel.cvut.cz>
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
	by olra.theworths.org (Postfix) with ESMTP id 169C54196F2
	for <notmuch@notmuchmail.org>; Mon, 17 May 2010 00:56:42 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Status: No, score=-0.001 tagged_above=-999 required=5
	tests=[BAYES_20=-0.001] autolearn=ham
Received: from olra.theworths.org ([127.0.0.1])
	by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
	with ESMTP id q6GKhVqnjwsv for <notmuch@notmuchmail.org>;
	Mon, 17 May 2010 00:56:31 -0700 (PDT)
Received: from max.feld.cvut.cz (max.feld.cvut.cz [147.32.192.36])
	by olra.theworths.org (Postfix) with ESMTP id 334124196F0
	for <notmuch@notmuchmail.org>; Mon, 17 May 2010 00:56:31 -0700 (PDT)
Received: from localhost (unknown [192.168.200.4])
	by max.feld.cvut.cz (Postfix) with ESMTP id B52EC19F33E1;
	Mon, 17 May 2010 09:56:29 +0200 (CEST)
X-Virus-Scanned: IMAP AMAVIS
Received: from max.feld.cvut.cz ([192.168.200.1])
	by localhost (styx.feld.cvut.cz [192.168.200.4]) (amavisd-new,
	with ESMTP id 9HHT-Kdl6kxV; Mon, 17 May 2010 09:56:28 +0200 (CEST)
Received: from imap.feld.cvut.cz (imap.feld.cvut.cz [147.32.192.34])
	by max.feld.cvut.cz (Postfix) with ESMTP id 26F0119F334D;
	Mon, 17 May 2010 09:56:28 +0200 (CEST)
Received: from steelpick.2x.cz (k335-30.felk.cvut.cz [147.32.86.30])
	(Authenticated sender: sojkam1)
	by imap.feld.cvut.cz (Postfix) with ESMTPSA id F30C515C062;
	Mon, 17 May 2010 09:56:27 +0200 (CEST)
Received: from wsh by steelpick.2x.cz with local (Exim 4.71)
	(envelope-from <sojkam1@fel.cvut.cz>)
	id 1ODvBb-0006xe-Kw; Mon, 17 May 2010 09:56:27 +0200
From: Michal Sojka <sojkam1@fel.cvut.cz>
To: Igor Shenderovich <shender.i@gmail.com>, notmuch@notmuchmail.org
Subject: Re: utf-8 in author field
In-Reply-To: <AANLkTilbzkZSQ0fJd_xtA6gq1QbitGgKA8sV3Mt_uoYv@mail.gmail.com>
References: <AANLkTilbzkZSQ0fJd_xtA6gq1QbitGgKA8sV3Mt_uoYv@mail.gmail.com>
User-Agent: Notmuch/0.3.1-33-g594021b (http://notmuchmail.org) Emacs/23.1.1
	(x86_64-pc-linux-gnu)
Date: Mon, 17 May 2010 09:56:27 +0200
Message-ID: <87vdanrmfo.fsf@steelpick.2x.cz>
Content-Type: text/plain; charset=us-ascii
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
	<notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
	<mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
	<mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Mon, 17 May 2010 07:56:42 -0000

On Fri, 14 May 2010, Igor Shenderovich wrote:
> I'm using the latest version of notmuch (cloned from git on May 13), but I
> can't get it to display UTF-8 characters in the authors field. For example,
> I have a letter with the field
>
> "=?UTF-8?B?Z3JpZmZvbiAtINCa0L7QvNC80LXQvdGC0LDRgNC40Lkg0LIg0JbQlg==?=",
>
> (taken from the usual Emacs interface).
>
> However, the body of this letter is perfectly readable (it also contains
> some UTF-8 characters).
>
> What should one do to see the true list of authors?

I see the same thing when headers are not encoded properly according to
RFC 2047. The violation I see most often is of section 5, rule (3):
"An 'encoded-word' MUST NOT appear within a 'quoted-string'", that is,
the encoded word is enclosed in double quotes. I guess the "problem" is
not specific to notmuch but affects all users of the gmime library.
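
To make the violation concrete, here is a minimal standalone sketch in C of
one way to work around it: strip the double quotes around a quoted
encoded-word so that an RFC 2047 decoder gets to see a bare encoded-word.
The function name unquote_encoded_word is hypothetical (it is not part of
notmuch or gmime), and the sketch only handles a quoted string at the very
start of the header value.

```c
#include <string.h>

/* Hypothetical helper, not part of notmuch or gmime.
 *
 * If `value` starts with a double-quoted string whose content begins
 * with "=?" (the start of an RFC 2047 encoded-word), remove that pair
 * of quotes in place.  Returns 1 if `value` was modified, 0 otherwise. */
int
unquote_encoded_word (char *value)
{
    char *closing;

    /* Only act on values of the form "=?charset?enc?text?=" ... */
    if (value[0] != '"' || strncmp (value + 1, "=?", 2) != 0)
        return 0;

    closing = strchr (value + 1, '"');
    if (closing == NULL)
        return 0;   /* unbalanced quote: leave the value alone */

    /* Remove the closing quote first; the +1 moves the NUL as well. */
    memmove (closing, closing + 1, strlen (closing + 1) + 1);

    /* Then remove the opening quote the same way. */
    memmove (value, value + 1, strlen (value + 1) + 1);

    return 1;
}
```

Applied to the quoted value from the original report, this yields the same
encoded-word without the surrounding quotes, which a decoder can then treat
as a normal encoded-word.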

I use the following patch for notmuch to sanitize headers from a popular
mailing list server in the Czech Republic:

From: Michal Sojka <sojkam1@fel.cvut.cz>
Subject: Fix broken headers from pandora.cz

 lib/message-file.c |   34 ++++++++++++++++++++++++++++++++++
 1 files changed, 34 insertions(+), 0 deletions(-)

diff --git a/lib/message-file.c b/lib/message-file.c
index 7722832..abfedc1 100644
--- a/lib/message-file.c
+++ b/lib/message-file.c
@@ -42,6 +42,7 @@ struct _notmuch_message_file {
     int broken_headers;
     size_t header_size; /* Length of full message header in bytes. */
+    notmuch_bool_t pandora_cz_quirk;
 
     /* Parsing state */
@@ -324,7 +325,40 @@ notmuch_message_file_get_header (notmuch_message_file_t *message,
         match = (strcasecmp (header, header_desired) == 0);
 
+        if (strstr(message->value.str, "=40pandora=2Ecz=29") ||
+            strstr(message->value.str, "@pandora.cz") ||
+            message->pandora_cz_quirk)
+        {
+            char *quote = message->value.str;
+            message->pandora_cz_quirk = TRUE;
+            if (*quote == '"') {
+                int len = strlen(quote);
+                bcopy(quote+1, quote, len);
+                quote = strchr(quote, '"');
+                if (quote) {
+                    len = strlen(quote);
+                    bcopy(quote+1, quote, len);
+                }
+            }
+        }
+
         decoded_value = g_mime_utils_header_decode_text (message->value.str);
+
+        if (message->pandora_cz_quirk &&
+            strcasecmp (header, "From") == 0)
+        {
+            /* remove "(<conf>@pandora.cz)" */
+            char *langle = strchr(decoded_value, '<');
+            if (langle) {
+                char *comment = langle - 2;
+                if (comment > decoded_value && *comment == ')')
+                    while (comment > decoded_value && *comment != '(')
+                        comment--;
+                if (comment > decoded_value)
+                    bcopy(langle, comment, strlen(langle)+1);
+            }
+        }
+
         header_sofar = (char *)g_hash_table_lookup (message->headers, header);
         /* we treat the Received: header special - we want to concat ALL of
          * the Received: headers we encounter.

tg: (417274d..) t/Fix-broken-headers-from-pandora.cz (depends on: master)
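
The From clean-up step can also be tried in isolation. The following is a
hedged, standalone C sketch and not the code from the patch: the helper name
strip_comment_before_address and the example addresses are hypothetical, it
uses memmove instead of bcopy, and it leaves the string untouched unless a
")" followed by a single space sits directly before the "<".

```c
#include <string.h>

/* Hypothetical helper, not part of notmuch.
 *
 * Remove a parenthesized comment that sits directly before " <addr>"
 * in an already decoded From value, for example
 *   "Name (conf@pandora.cz) <user@example.org>"
 * becomes
 *   "Name <user@example.org>"
 * Returns 1 if `from` was modified, 0 otherwise. */
int
strip_comment_before_address (char *from)
{
    char *langle = strchr (from, '<');
    char *rparen, *lparen;

    /* We need room for at least ") " in front of the '<'. */
    if (langle == NULL || langle - from < 2)
        return 0;

    rparen = langle - 2;
    if (*rparen != ')' || langle[-1] != ' ')
        return 0;   /* no comment directly before the address */

    /* Walk backwards to the opening parenthesis, staying in bounds. */
    lparen = rparen;
    while (lparen > from && *lparen != '(')
        lparen--;
    if (*lparen != '(')
        return 0;   /* unbalanced: leave the value alone */

    /* Move "<addr>" (including the NUL) over the comment. */
    memmove (lparen, langle, strlen (langle) + 1);
    return 1;
}
```

Returning early when no comment is found keeps ordinary "Name <addr>" values
intact, which matters if the helper is ever applied to messages that do not
come from the list server in question.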