Re: [PATCH v2 14/20] nmbug-status: Encode output using the user's locale
author W. Trevor King <wking@tremily.us>
Tue, 11 Feb 2014 20:11:35 +0000 (12:11 -0800)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 17:59:54 +0000 (09:59 -0800)
0c/773b0bf86cb0cbd7870c083ed46bb7985f221d [new file with mode: 0644]

diff --git a/0c/773b0bf86cb0cbd7870c083ed46bb7985f221d b/0c/773b0bf86cb0cbd7870c083ed46bb7985f221d
new file mode 100644 (file)
index 0000000..0631c13
--- /dev/null
@@ -0,0 +1,162 @@
+Return-Path: <wking@tremily.us>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 18745431FBF\r
+       for <notmuch@notmuchmail.org>; Tue, 11 Feb 2014 12:11:48 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.1\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.1 tagged_above=-999 required=5\r
+       tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,\r
+       RCVD_IN_DNSWL_NONE=-0.0001] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id Z+rQvOqzTqoa for <notmuch@notmuchmail.org>;\r
+       Tue, 11 Feb 2014 12:11:40 -0800 (PST)\r
+Received: from qmta09.westchester.pa.mail.comcast.net\r
+       (qmta09.westchester.pa.mail.comcast.net [76.96.62.96])\r
+       by olra.theworths.org (Postfix) with ESMTP id E0E70431FBD\r
+       for <notmuch@notmuchmail.org>; Tue, 11 Feb 2014 12:11:39 -0800 (PST)\r
+Received: from omta11.westchester.pa.mail.comcast.net ([76.96.62.36])\r
+       by qmta09.westchester.pa.mail.comcast.net with comcast\r
+       id R3cu1n0020mv7h0598Beaj; Tue, 11 Feb 2014 20:11:38 +0000\r
+Received: from odin.tremily.us ([24.18.63.50])\r
+       by omta11.westchester.pa.mail.comcast.net with comcast\r
+       id R8Bc1n00k152l3L3X8Bdmh; Tue, 11 Feb 2014 20:11:38 +0000\r
+Received: by odin.tremily.us (Postfix, from userid 1000)\r
+       id A59BB1020F20; Tue, 11 Feb 2014 12:11:35 -0800 (PST)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tremily.us; s=odin;\r
+       t=1392149495; bh=xBZ6+21TpuRhhbLqQcsAptAZO1YbnTJnUMbygvjEq34=;\r
+       h=Date:From:To:Cc:Subject:References:In-Reply-To;\r
+       b=Y5JsVfbcudqU/zN63j5HI4OEWRhYMo7FuSaD+aw2CaSyxYJL4MBl9LLRZQckiefH9\r
+       24qh8pKe8cU6oHTq8BG/o8cPOEiKzAO5R1DaPpD3uEGdhpJyHWsC8EewCnXm1rX2Zx\r
+       Zx9GKQfrheneBijLSVecws43yVJD/tniRMahFIvo=\r
+Date: Tue, 11 Feb 2014 12:11:35 -0800\r
+From: "W. Trevor King" <wking@tremily.us>\r
+To: Tomi Ollila <tomi.ollila@iki.fi>\r
+Subject: Re: [PATCH v2 14/20] nmbug-status: Encode output using the user's\r
+       locale\r
+Message-ID: <20140211201135.GJ14197@odin.tremily.us>\r
+References: <cover.1392056624.git.wking@tremily.us>\r
+       <deff072f78f4a7c5b0774e67a8f0517cc704725d.1392056624.git.wking@tremily.us>\r
+       <87eh396e6e.fsf@zancas.localnet>\r
+       <m2sirpu46i.fsf@guru.guru-group.fi>\r
+MIME-Version: 1.0\r
+Content-Type: multipart/signed; micalg=pgp-sha1;\r
+       protocol="application/pgp-signature"; boundary="MR4jz7xdnY3JMfbc"\r
+Content-Disposition: inline\r
+In-Reply-To: <m2sirpu46i.fsf@guru.guru-group.fi>\r
+OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;\r
+       url=http://tremily.us/pubkey.txt\r
+User-Agent: Mutt/1.5.22 (2013-10-16)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;\r
+       s=q20121106; t=1392149498;\r
+       bh=01X+Pt8TM2BiDeu/MUSHZEpOVOy+c896RrQRkcgwF1A=;\r
+       h=Received:Received:Received:Date:From:To:Subject:Message-ID:\r
+       MIME-Version:Content-Type;\r
+       b=r1/xfPiy1uIp5rL3GMCvvxCjsuxiyk2SSmSQ/mBvpz/SGXhA8YvAtXjAfP/t0KreP\r
+       xWz+07dUuOq2IyslTKAbxKA72VWFZbijoIpoozekM7SW9tlgJ8z9vpOyapsYNj5GAk\r
+       TtZRxO1vLZwTwMVN7maFSK/jK6NGvYjSL2zLq/7ZsBfmre1SRRZCkIDc/s95rK92FQ\r
+       2eLxaJ1ZWrKdp2vY8xrbt6j9gfakegBZyOAw4kXrN6sQwPQbAEqkYLWYJcFkK7uRSi\r
+       iIoSlXTRznNXwhsAivJkpXB5MjkeCe8UzUTLV+6iYqW/9U9BIIje5425Z0fYHfardv\r
+       QjCC446Er60Jg==\r
+Cc: notmuch@notmuchmail.org\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Tue, 11 Feb 2014 20:11:48 -0000\r
+\r
+\r
+--MR4jz7xdnY3JMfbc\r
+Content-Type: text/plain; charset=utf-8\r
+Content-Disposition: inline\r
+Content-Transfer-Encoding: quoted-printable\r
+\r
+On Tue, Feb 11, 2014 at 04:14:45PM +0200, Tomi Ollila wrote:\r
+> On Tue, Feb 11 2014, David Bremner wrote:\r
+> > W. Trevor King writes:\r
+> >> Instead of always writing UTF-8, allow the user to configure the\r
+> >> output encoding using their locale.  This is useful for\r
+> >> previewing output in the terminal, for poor souls that don't use\r
+> >> UTF-8 locales ;).\r
+> >\r
+> > =E2=80=A6\r
+> > remote: UnicodeEncodeError: 'ascii' codec can't encode character\r
+> >   u'\u017b' in position 219: ordinal not in range(128)\r
+> >\r
+> > possibly because of\r
+> >\r
+> > LANG=3DC\r
+> > =E2=80=A6\r
+> >\r
+> > I think it's fine to _allow_ the user to configure the output\r
+> > encoding. I'm less sure about _requiring_ it.\r
+\r
+If a user has set LANG=3DC, I expect that's what we should use for\r
+output (in which case dying with an encoding error is the right thing\r
+to do).  If you want UTF-8 output, using a UTF-8 locale seems like a\r
+reasonable requirement.  For the HTML case, we could fall back on\r
+numerical character references (e.g. &#x017b;) if the requested locale\r
+didn't support the required character directly, but I don't see an\r
+easy solution for the text-mode output.\r
+\r
+> That reminded me that yesterday (after review, of course) I thought\r
+> that we probably want configuration file to be parsed as utf-8\r
+> instead of any encoding user may have in their system...\r
+\r
+The POSIX spec for LANG doesn't restrict the scoping to the terminal\r
+input / output [1], so I feel like we should be using LANG to\r
+read the config file as well.  I expect folks with UTF-8 LANGs will\r
+want UTF-8 file contents.  In both cases (terminal output and\r
+config-file input), it is easy for users to pick their preferred\r
+encoding:\r
+\r
+  $ LANG=3Den_US.UTF-8 nmbug-status =E2=80=A6\r
+\r
+I think we should trust what they've chosen, rather than guessing that\r
+they actually want UTF-8 ;).\r
+\r
+Cheers,\r
+Trevor\r
+\r
+[1]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.htm=\r
+l#tag_08_02\r
+\r
+--=20\r
+This email may be signed or encrypted with GnuPG (http://www.gnupg.org).\r
+For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy\r
+\r
+--MR4jz7xdnY3JMfbc\r
+Content-Type: application/pgp-signature; name="signature.asc"\r
+Content-Description: OpenPGP digital signature\r
+\r
+-----BEGIN PGP SIGNATURE-----\r
+Version: GnuPG v2.0.22 (GNU/Linux)\r
+\r
+iQIcBAEBAgAGBQJS+oP0AAoJEKKfehoaNkbt6ZAP/0H+pnE5bI6Fr1nukjMFq93W\r
+OzbHgIHma+iXHjW4++gid1C1upuDVqZ7n4OkMzRf4sLoh9YRnNIvTaoYyiNUlceN\r
+tiNOzTSNI5r/AJpQKP1pOQIySRMNPbBAvxbEHoL7FhQyOOvPlDR2wuLJw0Scg577\r
+6bksYfSiOoxYYPfNb/Nt0PUYBAL3BWgNyH/47DPdqANgZK7OhY2KuzEwBiBChE4l\r
+cW81OlnqiczYPmSfxLjxLHDyJIgDZh7AN++UgXndLtoUy1xF31pIoJFF5KMdF5EU\r
+SaI3zWqrqA4CzZEWUIzTo6VwqCJj3AMzaGN02R/jUrcunfgmSsjoqiPyLGaoU9xr\r
+eFhFUuPTWr9jPqevaPmutUaj2mp2KfSPzv4ImJCsQZIdO/L91ZCVlzt7nVRxOpp4\r
+l50gfHFP0BGiGwFqP+obZIEvTisJII1bIIIvQvVXKdOHIi0aOcrNzrlvS6VCy3E+\r
+P2zxj/PMPApvXisuMybDhpjYrogxwyYREdCpgd3601VXHXVIFicVEgiy5g0AfhIv\r
+1U0l4xUpgcahbf7gNFTV+NigxHBXvJXBGSnAelu5mTACY6TeK6Sw0uDfsb0QzwXS\r
+8XlcR4tol6Sv8+tsgbTaSXiX3LtWDT6BSsq7+cTRVwD/9oebNsgPwK/GrNuCkAEs\r
+fGIcff1XpR4iPmygM/xn\r
+=A9fT\r
+-----END PGP SIGNATURE-----\r
+\r
+--MR4jz7xdnY3JMfbc--\r
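
The two behaviours discussed in the message above can be sketched in Python roughly as follows. This is only an illustration, not the nmbug-status implementation: the helper names `preferred_encoding`, `encode_text`, and `encode_html` are hypothetical, and the encoding is passed in explicitly so the example does not depend on the caller's environment. Text-mode output is encoded strictly and fails on characters the locale cannot represent, while HTML output falls back on numeric character references.

```python
import locale


def preferred_encoding():
    """Return the encoding implied by the user's locale (LANG and friends).

    Hypothetical helper: it only wraps the standard-library call.
    """
    return locale.getpreferredencoding(False)


def encode_text(text, encoding):
    """Encode text-mode output strictly.

    Raises UnicodeEncodeError when the encoding cannot represent a character,
    which matches the "dying with an encoding error" behaviour for LANG=C.
    """
    return text.encode(encoding)


def encode_html(text, encoding):
    """Encode HTML output, replacing unsupported characters with
    numeric character references instead of raising.
    """
    return text.encode(encoding, errors="xmlcharrefreplace")


if __name__ == "__main__":
    sample = "\u017b"  # the character from the quoted traceback
    print("locale encoding:", preferred_encoding())
    print("HTML under ASCII:", encode_html(sample, "ascii"))
    try:
        encode_text(sample, "ascii")
    except UnicodeEncodeError as error:
        print("text under ASCII failed:", error)
```

Python's `xmlcharrefreplace` handler writes the reference in decimal (`&#379;`), which denotes the same character as the hexadecimal form `&#x017b;` mentioned in the message.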