From: W. Trevor King
Date: Sun, 29 Mar 2015 16:36:58 +0000 (+1700)
Subject: Re: UnicodeDecodeError with python API
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=1ad8c7df0cc19600a590038f9d3b77948b6acd05;p=notmuch-archives.git

Re: UnicodeDecodeError with python API
---

diff --git a/c3/d5dd578dea862ec0d76f3bb5a4b333db262ab6 b/c3/d5dd578dea862ec0d76f3bb5a4b333db262ab6
new file mode 100644
index 000000000..d8b328cb9
--- /dev/null
+++ b/c3/d5dd578dea862ec0d76f3bb5a4b333db262ab6
@@ -0,0 +1,167 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 85EE2431FC2
+ for ; Sun, 29 Mar 2015 09:39:05 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: 2.338
+X-Spam-Level: **
+X-Spam-Status: No, score=2.338 tagged_above=-999 required=5
+ tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1,
+ DNS_FROM_AHBL_RHSBL=2.438, RCVD_IN_DNSWL_NONE=-0.0001]
+ autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id dOZ1PH1xt+k3 for ;
+ Sun, 29 Mar 2015 09:39:02 -0700 (PDT)
+Received: from resqmta-ch2-09v.sys.comcast.net
+ (resqmta-ch2-09v.sys.comcast.net [69.252.207.41])
+ (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 4B048431FAE
+ for ; Sun, 29 Mar 2015 09:39:02 -0700 (PDT)
+Received: from resomta-ch2-01v.sys.comcast.net ([69.252.207.97])
+ by resqmta-ch2-09v.sys.comcast.net with comcast
+ id 9Udb1q00826dK1R01Uf1hm; Sun, 29 Mar 2015 16:39:01 +0000
+Received: from odin.tremily.us ([67.168.81.176])
+ by resomta-ch2-01v.sys.comcast.net with comcast
+ id 9Ucz1q0063oF5yT01UczZJ; Sun, 29 Mar 2015 16:37:01 +0000
+Received: by odin.tremily.us (Postfix, from userid 1000)
+ id 344AD16F724D; Sun, 29 Mar 2015 09:36:58 -0700 (PDT)
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=tremily.us; s=odin;
+ t=1427647018; bh=cMWZiFbLowP+ASopR1kvaZLYYxWy+p8Vk5vEAWwpIRc=;
+ h=Date:From:To:Cc:Subject:References:In-Reply-To;
+ b=RlzYHFSTHzsPOHOD1GKBqPQ+kKia7Zi2779THSjwpv1Y1kf57hlVuysus7S9uUfvb
+ IKlnA5GR4KBHLc67qPbC7zGMSa2Ofh18i7XsW9+p9ztvov+HCaRCx0A2SO7euPTzhu
+ WzKPAi/RLTSSPb4YqvJBny1BcyPXbnP7XLABrGK8=
+Date: Sun, 29 Mar 2015 09:36:58 -0700
+From: "W. Trevor King"
+To: Sebastian Fischmeister
+Subject: Re: UnicodeDecodeError with python API
+Message-ID: <20150329163658.GK22036@odin.tremily.us>
+References: <874mp4q7e7.fsf@uwaterloo.ca>
+MIME-Version: 1.0
+Content-Type: multipart/signed; micalg=pgp-sha1;
+ protocol="application/pgp-signature"; boundary="P8VDRTGkfyc6BSVj"
+Content-Disposition: inline
+In-Reply-To: <874mp4q7e7.fsf@uwaterloo.ca>
+OpenPGP: id=39A2F3FA2AB17E5D8764F388FC29BDCDF15F5BE8;
+ url=http://tremily.us/pubkey.txt
+User-Agent: Mutt/1.5.23 (2014-03-12)
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net;
+ s=q20140121; t=1427647141;
+ bh=vHyaNQMLHyT3aM26hQwl5+2RVsLu3r3Rx4JFDdKoy5U=;
+ h=Received:Received:Received:Date:From:To:Subject:Message-ID:
+ MIME-Version:Content-Type;
+ b=C8FN9hmtgEMnOooSI1UIHlwaniIkckokQVwXmdkYuozutqgzT7foiCGvEI4su9J7S
+ 0snUEsjIBBu1I2Hpm0ZgzMogFn87jYIwQLKPj6uqoSBxj8gJDmTgCXaqWNcP1kI1WA
+ 4GjdjDs7NRT3cu5/O19GF22RY1KsVMzpwfgS7jIGk2AtQBS4dwEt3oxjNIYaou6oOH
+ XSEl16m3z58fwWbaT1QE2mDmV1RKIDX9YKdWz0lAwqroQqqtOpb+PYE98V+WWcS8+o
+ eqjf5yfEcM3CmjKtTYvARGG9zfetG9AEdiTW96O8+Hy1aKc6ACgRG3jUtoqqN4DA83
+ VWPOJmpb9Zd0w==
+Cc: notmuch
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Sun, 29 Mar 2015 16:39:05 -0000
+
+
+--P8VDRTGkfyc6BSVj
+Content-Type: text/plain; charset=utf-8
+Content-Disposition: inline
+Content-Transfer-Encoding: quoted-printable
+
+On Sun, Mar 29, 2015 at 09:08:28AM -0400, Sebastian Fischmeister wrote:
+> Traceback (most recent call last):
+> File "./test.py", line 66, in
+> print(type(y.get_part(1)))
+> =E2=80=A6
+> File "/usr/lib/python3.4/email/parser.py", line 54, in parse
+> data =3D fp.read(8192)
+> File "/usr/lib/python3.4/encodings/ascii.py", line 26, in decode
+> return codecs.ascii_decode(input, self.errors)[0]
+> UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 3447=
+: ordinal not in range(128)
+
+My first guess is that the file's encoding doesn't match your locale.
+Do you have a non-ASCII locale set? You can check with:
+
+ $ locale
+ LANG=3Den_US.UTF-8
+ LC_CTYPE=3D"en_US.UTF-8"
+ LC_NUMERIC=3D"en_US.UTF-8"
+ LC_TIME=3D"en_US.UTF-8"
+ LC_COLLATE=3DC
+ LC_MONETARY=3D"en_US.UTF-8"
+ LC_MESSAGES=3D"en_US.UTF-8"
+ LC_PAPER=3D"en_US.UTF-8"
+ LC_NAME=3D"en_US.UTF-8"
+ LC_ADDRESS=3D"en_US.UTF-8"
+ LC_TELEPHONE=3D"en_US.UTF-8"
+ LC_MEASUREMENT=3D"en_US.UTF-8"
+ LC_IDENTIFICATION=3D"en_US.UTF-8"
+ LC_ALL=3D
+
+Although you can obviously have a different preferred language or
+Unicode-capable encoding. What you don't want is:
+
+ # locale
+ LANG=3D
+ LC_CTYPE=3D"POSIX"
+ LC_NUMERIC=3D"POSIX"
+ LC_TIME=3D"POSIX"
+ LC_COLLATE=3D"POSIX"
+ LC_MONETARY=3D"POSIX"
+ LC_MESSAGES=3D"POSIX"
+ LC_PAPER=3D"POSIX"
+ LC_NAME=3D"POSIX"
+ LC_ADDRESS=3D"POSIX"
+ LC_TELEPHONE=3D"POSIX"
+ LC_MEASUREMENT=3D"POSIX"
+ LC_IDENTIFICATION=3D"POSIX"
+ LC_ALL=3D
+
+If that's not the problem, could you attach the troublesome message?
+Or a minimal example derived from the troublesome message?
+
+Cheers,
+Trevor
+
+--=20
+This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
+For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
+
+--P8VDRTGkfyc6BSVj
+Content-Type: application/pgp-signature; name="signature.asc"
+Content-Description: OpenPGP digital signature
+
+-----BEGIN PGP SIGNATURE-----
+Version: GnuPG v2
+
+iQIcBAEBAgAGBQJVGConAAoJEG8/JgBt8ol86+kP/RLld0FbhJa9Y6LTVsUpw/5c
+4Q/fEgbnORSJG1n6OQOy+bAG/N0DUZ06t1tR82ICpYdF8pO2cWGjc8mlwkdqRpA/
+hXoHCCPRyOGtgeD+qaTYlyWuXWxw95hEb0IyTlRtAYWqKFlyXT1ZwXrx92kXGNJw
+SzIH78telKf7IyO0hJBJrlsRxe+f2vqhzTPDC1Av7xhVflm7Jb/3EWtnqOSsA6aW
+yKZdb8nJGjdm0/wwJIBFns6kqsTy5TFpr45nmNfmwf8Uxr5cGHWL3tMVnUXofmsX
+7Gje2LBbHJX3sU2lKbNonh5O/t+y4cMVrJI4BCSqFqy3JkNA9GuBgoqyjHh6kEXD
+VN2bTS/2JlZfraAmgdF7oVZR8J905HTsaGc9PQK53jWubRXESnWSjIiPW/zqG84v
+RCV4eGIAN7rZvQU64AvPkrf6+7/qRueo9+r2ct0M6xQ9CzQHkerweBWE39dCS9LJ
+57zWjlwDqvSH/LjZEKJLkm7aCDaphXFOg0exPuxDbBeoRu6Gd+GuNABMVn6lnVpN
+Iz/zZ2qcMzsOk+y7NOe5v4AxytCH5gEvZMFncIzoLb6NSQ0DrFpBz7hEUZA/M6YG
+TIVhrbaofD1BalRZ5qlUK+kE9NnSqJW8q76rNU2sdqHIAs4+Znl8GQSiPNSFq4gN
+i5FMDCfejhVOKVDnyX+l
+=ga/y
+-----END PGP SIGNATURE-----
+
+--P8VDRTGkfyc6BSVj--
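
The locale check described in the message can also be sketched from inside Python, which may be handy when the script runs under cron or another environment whose locale differs from the interactive shell. This is only a rough illustration: `preferred_encoding` and `parse_message_bytes` are hypothetical helper names, not part of the notmuch bindings, and the sketch assumes the path to the message file is already known. The first helper reports the encoding Python derives from the locale for text-mode `open()`; the second reads the message in binary mode so that the locale is never asked to decode it.

```python
import email
import email.policy
import locale


def preferred_encoding():
    """Return the encoding Python derives from the current locale.

    Text-mode open() falls back to this when no encoding= is given.
    """
    return locale.getpreferredencoding(False)


def parse_message_bytes(path):
    """Parse a message file opened in binary mode (hypothetical helper).

    Reading bytes avoids the locale-dependent decoding that a text-mode
    file object performs, so bytes such as 0xc3 are handed to the email
    parser untouched instead of going through the 'ascii' codec.
    """
    with open(path, 'rb') as f:
        return email.message_from_binary_file(f, policy=email.policy.default)


if __name__ == '__main__':
    import sys

    print('preferred encoding:', preferred_encoding())
    if len(sys.argv) > 1:
        # The path on the command line is assumed to be a raw message file.
        message = parse_message_bytes(sys.argv[1])
        print('subject:', message['Subject'])
```

On glibc systems a POSIX/C locale typically shows up here as `ANSI_X3.4-1968` (another name for ASCII), though newer Python versions may coerce the C locale to a UTF-8 one, so a UTF-8 answer does not by itself rule out a locale problem in older interpreters such as the 3.4 in the traceback.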