Subject: Re: emacs complains about encoding?
From: Adam Wolfe Gordon
To: Tomi Ollila
Cc: notmuch@notmuchmail.org
Date: Sun, 20 May 2012 09:34:51 -0600
References: <20120515194455.B7AD5100646@guru.guru-group.fi> <878vgsbprq.fsf@nikula.org>

On Wed, May 16, 2012 at 3:24 AM, Tomi Ollila wrote:
> Haa, It doesn't matter which is the original encoding of the message;
>
> notmuch reply id:20120515194455.B7AD5100646@guru.guru-group.fi
>
> where  notmuch show --format=raw ^^^  outputs (among other lines):
>
>  Content-Type: text/plain; charset="iso-8859-1"
>  Content-Transfer-Encoding: quoted-printable
>
> and
>
> notmuch reply id:"878vgsbprq.fsf@nikula.org"
>
> where  notmuch show --format=raw ^^^  outputs (among other lines):
>
>  Content-Type: text/plain; charset="utf-8"
>  Content-Transfer-Encoding: base64
>
> produce correct reply content, both in utf-8.
>
> So it is the emacs side which breaks replies.

It turns out it's actually not the emacs side, but an interaction
between our JSON reply format and emacs.

The JSON reply (and show) code includes part content for all text/*
parts except text/html. Because all JSON is required to be UTF-8, it
handles the encoding itself, puts UTF-8 text in, and omits a
content-charset field from the output. Emacs passes on the
content-charset field to mm-display-part-inline if it's available, but
for text/plain parts it's not, leaving mm-display-part-inline to its
own devices for figuring out what the charset is.
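
To make the behaviour described above concrete, here is a minimal Python
sketch. It is not notmuch's actual code (which is C), only an
illustration under stated assumptions: the function part_to_json and its
include_charset flag are hypothetical, and the "content-type" and
"content" keys are assumed for the example. Only the content-charset
field name comes from the discussion above.

```python
import json


def part_to_json(raw_bytes, declared_charset, include_charset=False):
    """Hypothetical helper: emit a text part as UTF-8 encoded JSON.

    The body is decoded from the charset declared in the message and
    re-encoded as UTF-8, because the JSON output as a whole is UTF-8.
    """
    text = raw_bytes.decode(declared_charset)
    # Key names other than "content-charset" are assumptions.
    part = {"content-type": "text/plain", "content": text}
    if include_charset:
        # Also report the ORIGINAL charset, even though the content
        # carried in the JSON is already UTF-8.
        part["content-charset"] = declared_charset
    return json.dumps(part, ensure_ascii=False).encode("utf-8")


# A body as it might appear in an ISO-8859-1 message.
raw = "p\u00e4iv\u00e4\u00e4".encode("iso-8859-1")

# Behaviour described above: UTF-8 content, no content-charset field.
current = json.loads(part_to_json(raw, "iso-8859-1"))

# Variant that keeps the original charset in the output.
with_charset = json.loads(part_to_json(raw, "iso-8859-1", include_charset=True))
```

In the first case the consumer gets no charset hint at all and has to
guess; in the second it gets a hint that names the original charset
rather than the encoding of the JSON itself.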

It seems mm-display-part-inline correctly figures out that it's UTF-8,
and puts in the series of ugly \nnn characters because that's what
emacs does with UTF-8 sometimes.

In the original reply stuff (pre-JSON reply format) emacs used the
output of notmuch reply verbatim, so all the charset stuff was handled
in notmuch. Before f6c170fabca8f39e74705e3813504137811bf162, emacs was
using the JSON reply format, but was inserting the text itself instead
of using mm-display-part-inline, so emacs still wasn't trying to do any
charset manipulation. Using mm-display-part-inline is desirable because
it lets us handle non-text/plain (e.g. text/html) parts correctly in
reply, and makes the display more consistent (since we use it for
show). But it leads to this problem.

So, there are a couple of solutions I can see:

1) Have the JSON formats include the original content-charset even
though they're actually outputting UTF-8. Of the solutions I tried,
this is the best, even though it doesn't sound like a good thing to do.

2) Have the JSON formats include content only if it's actually UTF-8.
This means that for non-UTF-8 parts (including ASCII parts), the emacs
interface has to do more work to display the part content, since it
must fetch it from outside first. When I tried this, it worked but
caused the \nnn to show up when viewing messages in emacs. I suspect
this is because it sets a charset for the whole buffer, and can't
accommodate messages with different charsets in the same buffer
properly. Reply works correctly, though.

3) Have the JSON formats include the charset for all parts, but make it
UTF-8 for all parts they include content for (since we're actually
outputting UTF-8). This doesn't seem to fix the problem, even though it
seems like it should.

If no one has a better idea or a strong reason not to, I'll send a
patch for solution (1).

-- Adam