From: Tomi Ollila
To: "W. Trevor King"
Cc: notmuch@notmuchmail.org
Subject: Re: [PATCH 00/17] nmbug-status: Python-3-compabitility and general refactoring
In-Reply-To: <20140204161142.GS14197@odin.tremily.us>
References: <20140204005331.GQ14197@odin.tremily.us> <20140204161142.GS14197@odin.tremily.us>
User-Agent: Notmuch/0.17+55~g4397960 (http://notmuchmail.org) Emacs/24.3.1 (x86_64-unknown-linux-gnu)
List-Id: "Use and development of the notmuch mail system."
X-List-Received-Date: Tue, 04 Feb 2014 18:40:28 -0000

On Tue, Feb 04 2014, "W.
Trevor King" wrote:

>
>   >>> from __future__ import unicode_literals
>   >>> import codecs
>   >>> import locale
>   >>> import sys
>   >>> print(locale.getpreferredencoding())  # same as yours
>   UTF-8
>   >>> print(sys.getdefaultencoding())  # same as yours
>   ascii
>   >>> _ENCODING = locale.getpreferredencoding() or sys.getdefaultencoding()
>   >>> print(_ENCODING)  # double-check default encodings
>   UTF-8
>   >>> byte_stream = sys.stdout  # copied from Page.write
>   >>> stream = codecs.getwriter(encoding=_ENCODING)(stream=byte_stream)
>   >>> data = {'from': '\u017b'}  # fake the troublesome data
>   >>> print(type(data['from']))  # double-check unicode_literals
>
>   >>> string = ' {from}\n'.format(**data)
>   >>> stream.write(string)
>   Ż
>
> It looks like you'll have the same _ENCODING as I do (UTF-8).  That
> means your stream should be wrapped in a UTF-8 StreamWriter, so I
> don't understand why it's converting to ASCII.  Can you run through
> the above on your troublesome machine and confirm that stream.write()
> is still raising the exception?  If it doesn't work, can you just
> paste that whole run in your next email?

I don't know what to paste, so I paste this:

$ python
Python 2.6.6 (r266:84292, Nov 21 2013, 12:39:37)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> data = {'from': '\u017b'}
>>> print(type(data['from']))
>>> string = ' {from}\n'.format(**data)
>>> print string
\u017b

and then:

>>> data = {'from': u'\u017b'}
>>> print(type(data['from']))
>>> string = ' {from}\n'.format(**data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeEncodeError: 'ascii' codec can't encode character u'\u017b' in position 0: ordinal not in range(128)

... and ...

>>> import os
>>> print os.environ['LANG']
en_US.UTF-8

> Thanks,
> Trevor

Tomi
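
A note on the two runs above, offered as a sketch and not as a confirmed diagnosis. Neither of my runs had the `from __future__ import unicode_literals` line, so the template ' {from}\n' was a byte string both times. In the first run the data was a plain byte string too, which would explain the literal \u017b in the output. In the second run the data was unicode, and as far as I understand Python 2, formatting a unicode argument into a byte-string template encodes it with the default codec (ascii here) before any StreamWriter is involved. The snippet below tries to show both halves under those assumptions. `io.BytesIO` stands in for sys.stdout so the written bytes can be inspected, and the names `ascii_failed` and `encoded` are made up for this illustration.

```python
from __future__ import unicode_literals  # makes the literals below unicode on Python 2

import codecs
import io

data = {'from': '\u017b'}  # U+017B, the character from the traceback

# Unicode template and unicode data: formatting stays in unicode,
# so the ascii codec is never consulted.
string = ' {from}\n'.format(**data)

# The implicit step a byte-string template forces on Python 2,
# written out explicitly: encode the unicode value as ascii.
try:
    data['from'].encode('ascii')
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True

# Wrap a byte stream in a UTF-8 StreamWriter, as in the quoted session.
byte_stream = io.BytesIO()  # stand-in for sys.stdout (assumption for this sketch)
stream = codecs.getwriter('utf-8')(byte_stream)
stream.write(string)
encoded = byte_stream.getvalue()  # UTF-8 bytes of ' \u017b\n'
```

If this reading is right, the exception in my second run comes from the .format() call itself, so it says nothing yet about whether stream.write() works on this machine.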