Return-Path: <sojkam1@fel.cvut.cz>
X-Original-To: notmuch@notmuchmail.org
Delivered-To: notmuch@notmuchmail.org
Received: from localhost (localhost [127.0.0.1])
    by olra.theworths.org (Postfix) with ESMTP id B6A6E431FC0
    for <notmuch@notmuchmail.org>; Wed, 23 May 2012 03:15:31 -0700 (PDT)
X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
X-Spam-Status: No, score=-2.3 tagged_above=-999 required=5
    tests=[RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled
Received: from olra.theworths.org ([127.0.0.1])
    by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
    with ESMTP id jxkonYeCNwoI for <notmuch@notmuchmail.org>;
    Wed, 23 May 2012 03:15:30 -0700 (PDT)
Received: from max.feld.cvut.cz (max.feld.cvut.cz [147.32.192.36])
    by olra.theworths.org (Postfix) with ESMTP id 5D85C431FBD
    for <notmuch@notmuchmail.org>; Wed, 23 May 2012 03:15:30 -0700 (PDT)
Received: from localhost (unknown [192.168.200.4])
    by max.feld.cvut.cz (Postfix) with ESMTP id 69D9619F3375;
    Wed, 23 May 2012 12:15:29 +0200 (CEST)
X-Virus-Scanned: IMAP AMAVIS
Received: from max.feld.cvut.cz ([192.168.200.1])
    by localhost (styx.feld.cvut.cz [192.168.200.4]) (amavisd-new,
    with ESMTP id UCiCyRxsfvJD; Wed, 23 May 2012 12:15:20 +0200 (CEST)
Received: from imap.feld.cvut.cz (imap.feld.cvut.cz [147.32.192.34])
    by max.feld.cvut.cz (Postfix) with ESMTP id B350C19F3353;
    Wed, 23 May 2012 12:15:19 +0200 (CEST)
Received: from steelpick.2x.cz (note-sojka.felk.cvut.cz [147.32.86.30])
    (Authenticated sender: sojkam1)
    by imap.feld.cvut.cz (Postfix) with ESMTPSA id 9329D660968;
    Wed, 23 May 2012 12:15:18 +0200 (CEST)
Received: from wsh by steelpick.2x.cz with local (Exim 4.77)
    (envelope-from <sojkam1@fel.cvut.cz>)
    id 1SX8b8-0004sh-EC; Wed, 23 May 2012 12:15:18 +0200
From: Michal Sojka <sojkam1@fel.cvut.cz>
To: Tomi Ollila <tomi.ollila@iki.fi>, Adam Wolfe Gordon <awg+notmuch@xvx.ca>
Subject: Re: emacs complains about encoding?
In-Reply-To: <m27gw4nyfu.fsf@guru.guru-group.fi>
References: <20120515194455.B7AD5100646@guru.guru-group.fi>
    <878vgsbprq.fsf@nikula.org> <m23970bhre.fsf@guru.guru-group.fi>
    <CAMoJFUungAFPWy0d1Lh+rqmpK--P7MMEwNaewWHR=rbYo+BKsA@mail.gmail.com>
    <871umc1int.fsf@steelpick.2x.cz>
    <m27gw4nyfu.fsf@guru.guru-group.fi>
User-Agent: Notmuch/0.13+14~g2d2a5a4 (http://notmuchmail.org) Emacs/23.4.1
    (x86_64-pc-linux-gnu)
Date: Wed, 23 May 2012 12:15:18 +0200
Message-ID: <87r4uburt5.fsf@steelpick.2x.cz>
Content-Type: text/plain; charset=us-ascii
Cc: notmuch@notmuchmail.org
X-BeenThere: notmuch@notmuchmail.org
X-Mailman-Version: 2.1.13
List-Id: "Use and development of the notmuch mail system."
    <notmuch.notmuchmail.org>
List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,
    <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>
List-Archive: <http://notmuchmail.org/pipermail/notmuch>
List-Post: <mailto:notmuch@notmuchmail.org>
List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>
List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,
    <mailto:notmuch-request@notmuchmail.org?subject=subscribe>
X-List-Received-Date: Wed, 23 May 2012 10:15:31 -0000

Tomi Ollila <tomi.ollila@iki.fi> writes:
> Michal Sojka <sojkam1@fel.cvut.cz> writes:
>
>> Adam Wolfe Gordon <awg+notmuch@xvx.ca> writes:
>>> It turns out it's actually not the emacs side, but an interaction
>>> between our JSON reply format and emacs.
>>>
>>> The JSON reply (and show) code includes part content for all text/*
>>> parts except text/html. Because all JSON is required to be UTF-8, it
>>> handles the encoding itself, puts UTF-8 text in, and omits a
>>> content-charset field from the output. Emacs passes on the
>>> content-charset field to mm-display-part-inline if it's available, but
>>> for text/plain parts it's not, leaving mm-display-part-inline to its
>>> own devices for figuring out what the charset is. It seems
>>> mm-display-part-inline correctly figures out that it's UTF-8, and puts
>>> in the series of ugly \nnn characters because that's what emacs does
>>> with UTF-8 sometimes.
>>>
>>> In the original reply stuff (pre-JSON reply format) emacs used the
>>> output of notmuch reply verbatim, so all the charset stuff was handled
>>> in notmuch. Before f6c170fabca8f39e74705e3813504137811bf162, emacs was
>>> using the JSON reply format, but was inserting the text itself instead
>>> of using mm-display-part-inline, so emacs still wasn't trying to do
>>> any charset manipulation. Using mm-display-part-inline is desirable
>>> because it lets us handle non-text/plain (e.g. text/html) parts
>>> correctly in reply, and makes the display more consistent (since we
>>> use it for show). But, it leads to this problem.
>>>
>>> So, there are a couple of solutions I can see:
>>>
>>> 1) Have the JSON formats include the original content-charset even
>>> though they're actually outputting UTF-8. Of the solutions I tried,
>>> this is the best, even though it doesn't sound like a good thing to
>>> do.
>>>
>>> 2) Have the JSON formats include content only if it's actually UTF-8.
>>> This means that for non-UTF-8 parts (including ASCII parts), the emacs
>>> interface has to do more work to display the part content, since it
>>> must fetch it from outside first. When I tried this, it worked but
>>> caused the \nnn to show up when viewing messages in emacs. I suspect
>>> this is because it sets a charset for the whole buffer, and can't
>>> accommodate messages with different charsets in the same buffer
>>> properly. Reply works correctly, though.
>>>
>>> 3) Have the JSON formats include the charset for all parts, but make
>>> it UTF-8 for all parts they include content for (since we're actually
>>> outputting UTF-8). This doesn't seem to fix the problem, even though
>>> it seems like it should.
>>>
>>> If no one has a better idea or a strong reason not to, I'll send a
>>> patch for solution (1).
>>
>> Thank you very much for your analysis. It encouraged me to dig into the
>> problem and I've found another solution, which might be better than
>> those you suggested.
>>
>> I traced what Emacs does with the text inside
>> notmuch-mm-display-part-inline and the wrong charset conversion happens
>> deep in the elisp code, in mm-with-part called by mm-get-part, which is
>> in turn called by mm-inline-text. There is a way to make mm-inline-text
>> not call mm-get-part, which is to set the charset to 'gnus-decoded. This
>> sounds like something that applies to our situation, where the part is
>> already decoded.
>
> You've dug deeper than I did... :)
>
>> The following patch (apply it with git am -c) solves the problem for me.
>> However, I'm not sure it is a universal solution. It sets the charset
>> only if it is not defined in notmuch json output and I'm not sure that
>> this is correct. text/html parts seem to have charset defined, but as
>> you wrote, json is always utf-8, so it might be that we need
>> 'gnus-decoded always, independently of the json output. What do you
>> think?
>
> No -- when non-inlined content is fetched by executing the command
> notmuch show --format=raw --part=n --decrypt id:"<message-id>" the content
> is received in its original charset -- and then the mm-* components need
> to have the correct charset set (well, I think, I have not tested ;).
>
> Also, we cannot rely on the json output not containing content-charset
> information in the future...
>
> I'm currently applying this to my build tree whenever I rebuild notmuch for
> my own use: id:"1337533094-5467-1-git-send-email-tomi.ollila@iki.fi"

Great, this is more or less the same solution :-)
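
To illustrate why marking the part as already decoded helps, here is a
rough sketch in Python rather than elisp. It is not the patch itself:
display_part, ALREADY_DECODED and the Latin-1 guess are made-up names and
assumptions that only mimic the behaviour discussed above, where text
taken from the JSON output is already decoded and a second charset
conversion mangles it.

```python
# Illustrative sketch only. These names are hypothetical and are not part of
# notmuch or Gnus; ALREADY_DECODED plays the role of 'gnus-decoded.
ALREADY_DECODED = object()


def display_part(content, charset=None):
    """Return the text to insert for a part.

    content is a str when it came from the JSON output (already decoded),
    or bytes when it was fetched raw in the message's original charset.
    """
    if charset is ALREADY_DECODED:
        # Text from JSON: skip any further charset conversion.
        return content
    if isinstance(content, str):
        # Mimic the bug: treat decoded text as raw bytes and convert it
        # again with a guessed charset (Latin-1 is an assumption here).
        return content.encode("utf-8").decode(charset or "latin-1")
    # Raw bytes: here the declared charset really is needed.
    return content.decode(charset or "us-ascii")


text = "na\u00efve"  # stand-in for part content carried in the JSON output

mangled = display_part(text)                 # converted twice, no longer equal to text
kept = display_part(text, ALREADY_DECODED)   # passed through untouched
raw = display_part(text.encode("iso-8859-1"), "iso-8859-1")  # raw fetch, real charset
```

The last call corresponds to Tomi's point: content fetched with
--format=raw still arrives in its original charset, so the "already
decoded" marker should only be used for text that came from the JSON
output.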

> I think the current plan is to use the same decoding lookup table that
> notmuch-show is using in reply too.

Which table do you refer to? notmuch-show-handlers-for?

> That is a good plan from a consistency point of view. That just requires
> some code to be moved from notmuch-show.el to some other file (maybe a