Date: Tue, 11 Feb 2014 12:11:35 -0800
From: "W. Trevor King"
To: Tomi Ollila
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH v2 14/20] nmbug-status: Encode output using the user's locale
Message-ID: <20140211201135.GJ14197@odin.tremily.us>
References: <87eh396e6e.fsf@zancas.localnet>
List-Id: "Use and development of the notmuch mail system."
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1;
	protocol="application/pgp-signature"; boundary="MR4jz7xdnY3JMfbc"
Content-Disposition: inline
OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;
	url=http://tremily.us/pubkey.txt
User-Agent: Mutt/1.5.22 (2013-10-16)

--MR4jz7xdnY3JMfbc
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable

On Tue, Feb 11, 2014 at 04:14:45PM +0200, Tomi Ollila wrote:
> On Tue, Feb 11 2014, David Bremner wrote:
> > W. Trevor King writes:
> >> Instead of always writing UTF-8, allow the user to configure the
> >> output encoding using their locale.  This is useful for
> >> previewing output in the terminal, for poor souls that don't use
> >> UTF-8 locales ;).
> >
> > …
> > remote: UnicodeEncodeError: 'ascii' codec can't encode character
> > u'\u017b' in position 219: ordinal not in range(128)
> >
> > possibly because of
> >
> > LANG=C
> > …
> >
> > I think it's fine to _allow_ the user to configure the output
> > encoding. I'm less sure about _requiring_ it.

If a user has set LANG=C, I expect that's what we should use for
output (in which case dying with an encoding error is the right thing
to do).  If you want UTF-8 output, using a UTF-8 locale seems like a
reasonable requirement.  For the HTML case, we could fall back on
numerical character references (e.g. &#379; for Ż) if the requested
locale didn't support the required character directly, but I don't see
an easy solution for the text-mode output.

> That reminded me that yesterday (after review, of course) I thought
> that we probably want configuration file to be parsed as utf-8
> instead of any encoding user may have in their system...

The POSIX spec for LANG doesn't restrict the scoping to the terminal
input / output [1], so I feel like we should be using LANG to read the
config file as well.  I expect folks with UTF-8 LANGs will want UTF-8
file contents.  In both cases (terminal output and config-file input),
it is easy for users to pick their preferred encoding:

  $ LANG=en_US.UTF-8 nmbug-status …

I think we should trust what they've chosen, rather than guessing that
they actually want UTF-8 ;).

Cheers,
Trevor

[1]: http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08_02

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

--MR4jz7xdnY3JMfbc
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQIcBAEBAgAGBQJS+oP0AAoJEKKfehoaNkbt6ZAP/0H+pnE5bI6Fr1nukjMFq93W
OzbHgIHma+iXHjW4++gid1C1upuDVqZ7n4OkMzRf4sLoh9YRnNIvTaoYyiNUlceN
tiNOzTSNI5r/AJpQKP1pOQIySRMNPbBAvxbEHoL7FhQyOOvPlDR2wuLJw0Scg577
6bksYfSiOoxYYPfNb/Nt0PUYBAL3BWgNyH/47DPdqANgZK7OhY2KuzEwBiBChE4l
cW81OlnqiczYPmSfxLjxLHDyJIgDZh7AN++UgXndLtoUy1xF31pIoJFF5KMdF5EU
SaI3zWqrqA4CzZEWUIzTo6VwqCJj3AMzaGN02R/jUrcunfgmSsjoqiPyLGaoU9xr
eFhFUuPTWr9jPqevaPmutUaj2mp2KfSPzv4ImJCsQZIdO/L91ZCVlzt7nVRxOpp4
l50gfHFP0BGiGwFqP+obZIEvTisJII1bIIIvQvVXKdOHIi0aOcrNzrlvS6VCy3E+
P2zxj/PMPApvXisuMybDhpjYrogxwyYREdCpgd3601VXHXVIFicVEgiy5g0AfhIv
1U0l4xUpgcahbf7gNFTV+NigxHBXvJXBGSnAelu5mTACY6TeK6Sw0uDfsb0QzwXS
8XlcR4tol6Sv8+tsgbTaSXiX3LtWDT6BSsq7+cTRVwD/9oebNsgPwK/GrNuCkAEs
fGIcff1XpR4iPmygM/xn
=A9fT
-----END PGP SIGNATURE-----

--MR4jz7xdnY3JMfbc--
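
For reference, the locale-driven encoding discussed in the message above,
with the numeric-character-reference fallback for the HTML case, might look
roughly like the following in Python.  This is only a sketch and not the
nmbug-status code: the helper name encode_for_locale and its html / encoding
parameters are made up for illustration.  It assumes the standard library's
locale.getpreferredencoding() to find the encoding implied by LANG, and the
built-in 'xmlcharrefreplace' error handler for the fallback.

```python
import locale


def encode_for_locale(text, html=False, encoding=None):
    """Encode text using the user's locale encoding (hypothetical helper).

    For HTML output, characters the encoding cannot represent are
    replaced by numeric character references.  For plain-text output
    the UnicodeEncodeError is allowed to propagate, since there is no
    obvious fallback there.
    """
    if encoding is None:
        # Encoding implied by the user's locale settings (LANG etc.).
        encoding = locale.getpreferredencoding()
    errors = 'xmlcharrefreplace' if html else 'strict'
    return text.encode(encoding, errors)
```

With encoding='ascii' (roughly what LANG=C implies), html=True turns
u'\u017b' into a character reference, while html=False raises
UnicodeEncodeError, matching the "dying with an encoding error" behaviour
described for text-mode output.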