Date: Sun, 29 Mar 2015 09:36:58 -0700
From: "W. Trevor King"
To: Sebastian Fischmeister
Cc: notmuch
Subject: Re: UnicodeDecodeError with python API
Message-ID: <20150329163658.GK22036@odin.tremily.us>
References: <874mp4q7e7.fsf@uwaterloo.ca>
In-Reply-To: <874mp4q7e7.fsf@uwaterloo.ca>
MIME-Version: 1.0
Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="P8VDRTGkfyc6BSVj"
Content-Disposition: inline
OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8; url=http://tremily.us/pubkey.txt
User-Agent: Mutt/1.5.23 (2014-03-12)
List-Id: "Use and development of the notmuch mail system."
X-List-Received-Date: Sun, 29 Mar 2015 16:39:05 -0000

--P8VDRTGkfyc6BSVj
Content-Type: text/plain; charset=utf-8
Content-Disposition: inline
Content-Transfer-Encoding: 8bit

On Sun, Mar 29, 2015 at 09:08:28AM -0400, Sebastian Fischmeister wrote:
> Traceback (most recent call last):
>   File "./test.py", line 66, in
>     print(type(y.get_part(1)))
>   …
>   File "/usr/lib/python3.4/email/parser.py", line 54, in parse
>     data = fp.read(8192)
>   File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
>     return codecs.ascii_decode(input, self.errors)[0]
> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3447: ordinal not in range(128)

My first guess is that the file's encoding doesn't match your locale.
Do you have a non-ASCII locale set?  You can check with:

  $ locale
  LANG=en_US.UTF-8
  LC_CTYPE="en_US.UTF-8"
  LC_NUMERIC="en_US.UTF-8"
  LC_TIME="en_US.UTF-8"
  LC_COLLATE=C
  LC_MONETARY="en_US.UTF-8"
  LC_MESSAGES="en_US.UTF-8"
  LC_PAPER="en_US.UTF-8"
  LC_NAME="en_US.UTF-8"
  LC_ADDRESS="en_US.UTF-8"
  LC_TELEPHONE="en_US.UTF-8"
  LC_MEASUREMENT="en_US.UTF-8"
  LC_IDENTIFICATION="en_US.UTF-8"
  LC_ALL=

You can of course use a different preferred language or another
Unicode-capable encoding.  What you don't want is:

  # locale
  LANG=
  LC_CTYPE="POSIX"
  LC_NUMERIC="POSIX"
  LC_TIME="POSIX"
  LC_COLLATE="POSIX"
  LC_MONETARY="POSIX"
  LC_MESSAGES="POSIX"
  LC_PAPER="POSIX"
  LC_NAME="POSIX"
  LC_ADDRESS="POSIX"
  LC_TELEPHONE="POSIX"
  LC_MEASUREMENT="POSIX"
  LC_IDENTIFICATION="POSIX"
  LC_ALL=

If that's not the problem, could you attach the troublesome message,
or a minimal example derived from it?

Cheers,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

--P8VDRTGkfyc6BSVj
Content-Type: application/pgp-signature; name="signature.asc"
Content-Description: OpenPGP digital signature

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2

iQIcBAEBAgAGBQJVGConAAoJEG8/JgBt8ol86+kP/RLld0FbhJa9Y6LTVsUpw/5c
4Q/fEgbnORSJG1n6OQOy+bAG/N0DUZ06t1tR82ICpYdF8pO2cWGjc8mlwkdqRpA/
hXoHCCPRyOGtgeD+qaTYlyWuXWxw95hEb0IyTlRtAYWqKFlyXT1ZwXrx92kXGNJw
SzIH78telKf7IyO0hJBJrlsRxe+f2vqhzTPDC1Av7xhVflm7Jb/3EWtnqOSsA6aW
yKZdb8nJGjdm0/wwJIBFns6kqsTy5TFpr45nmNfmwf8Uxr5cGHWL3tMVnUXofmsX
7Gje2LBbHJX3sU2lKbNonh5O/t+y4cMVrJI4BCSqFqy3JkNA9GuBgoqyjHh6kEXD
VN2bTS/2JlZfraAmgdF7oVZR8J905HTsaGc9PQK53jWubRXESnWSjIiPW/zqG84v
RCV4eGIAN7rZvQU64AvPkrf6+7/qRueo9+r2ct0M6xQ9CzQHkerweBWE39dCS9LJ
57zWjlwDqvSH/LjZEKJLkm7aCDaphXFOg0exPuxDbBeoRu6Gd+GuNABMVn6lnVpN
Iz/zZ2qcMzsOk+y7NOe5v4AxytCH5gEvZMFncIzoLb6NSQ0DrFpBz7hEUZA/M6YG
TIVhrbaofD1BalRZ5qlUK+kE9NnSqJW8q76rNU2sdqHIAs4+Znl8GQSiPNSFq4gN
i5FMDCfejhVOKVDnyX+l
=ga/y
-----END PGP SIGNATURE-----

--P8VDRTGkfyc6BSVj--
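
The locale check suggested in the message can also be made from inside
Python.  The following is only a minimal sketch, not part of the
original thread and not how the notmuch bindings read files.  The
helper names (preferred_encoding, parse_message_bytes, demo) and the
sample message are hypothetical.  It assumes the traceback comes from a
text-mode read that fell back to ASCII, and shows that parsing the same
bytes in binary mode does not depend on the locale.

```python
import email
import locale
import os
import tempfile


def preferred_encoding():
    """Return the encoding Python picks by default for text-mode files."""
    return locale.getpreferredencoding(False)


def parse_message_bytes(path):
    """Parse a message file in binary mode, independent of the locale."""
    with open(path, "rb") as fp:
        return email.message_from_binary_file(fp)


def demo():
    """Write a small UTF-8 message (made up for this sketch) and parse it."""
    raw = (
        "Subject: test\n"
        "Content-Type: text/plain; charset=utf-8\n"
        "Content-Transfer-Encoding: 8bit\n"
        "\n"
        "Gr\u00fc\u00dfe\n"
    ).encode("utf-8")
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "message.eml")
        with open(path, "wb") as fp:
            fp.write(raw)
        msg = parse_message_bytes(path)
    # get_payload(decode=True) returns bytes; decode them with the
    # charset declared in the message rather than the locale's encoding.
    charset = msg.get_content_charset() or "utf-8"
    body = msg.get_payload(decode=True).decode(charset)
    return msg["Subject"], body


if __name__ == "__main__":
    # 'ANSI_X3.4-1968' or 'US-ASCII' here would match the POSIX locale case.
    print("preferred encoding:", preferred_encoding())
    subject, body = demo()
    # ascii() keeps the output printable even under an ASCII-only locale.
    print(subject, ascii(body))
```

If preferred_encoding() reports an ASCII variant, text-mode reads of a
UTF-8 message can fail in the way shown in the quoted traceback, while
the binary-mode parse above is unaffected.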