From: Amadeusz Żołnowski Date: Sun, 7 Jul 2013 07:00:16 +0000 (+0200) Subject: [notmuch] Unicode in Python bindings X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=c5d930e266d7e9c3cee3fcb8e5fd2fff1ffe0088;p=notmuch-archives.git [notmuch] Unicode in Python bindings --- diff --git a/16/f851c20302a3e40812bcd86a5eca2d88652798 b/16/f851c20302a3e40812bcd86a5eca2d88652798 new file mode 100644 index 000000000..11ae55d03 --- /dev/null +++ b/16/f851c20302a3e40812bcd86a5eca2d88652798 @@ -0,0 +1,171 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id 12681431FAF + for ; Tue, 9 Jul 2013 02:49:47 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: 0.002 +X-Spam-Level: +X-Spam-Status: No, score=0.002 tagged_above=-999 required=5 + tests=[TVD_RCVD_SPACE_BRACKET=0.001, UNPARSEABLE_RELAY=0.001] + autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id IISlbJzESpJX for ; + Tue, 9 Jul 2013 02:49:39 -0700 (PDT) +Received: from mail.cryptobitch.de (cryptobitch.de [88.198.7.68]) + (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) + (No client certificate requested) + by olra.theworths.org (Postfix) with ESMTPS id B2B4C431FAE + for ; Tue, 9 Jul 2013 02:49:38 -0700 (PDT) +Received: from mail.jade-hamburg.de (mail.jade-hamburg.de [85.183.11.228]) + (using TLSv1 with cipher ADH-AES256-SHA (256/256 bits)) + (No client certificate requested) + by mail.cryptobitch.de (Postfix) with ESMTPSA id 6D5456A4A0C + for ; Tue, 9 Jul 2013 11:49:32 +0200 (CEST) +Received: by mail.jade-hamburg.de (Postfix, from userid 401) + id B8578DF2A2; Tue, 9 Jul 2013 11:49:31 +0200 (CEST) +Received: from thinkbox.jade-hamburg.de (cryptobitch.de [88.198.7.68]) + (using TLSv1 with cipher DHE-RSA-AES128-SHA (128/128 
bits)) + (No client certificate requested) (Authenticated sender: teythoon) + by mail.jade-hamburg.de (Postfix) with ESMTPSA id 6C537DF28B + for ; Tue, 9 Jul 2013 11:49:27 +0200 (CEST) +Received: from teythoon by thinkbox.jade-hamburg.de with local (Exim 4.80) + (envelope-from ) id 1UwUY2-0001Qr-71 + for notmuch@notmuchmail.org; Tue, 09 Jul 2013 11:49:26 +0200 +Resent-Date: Tue, 09 Jul 2013 11:49:26 +0200 +Resent-Message-Id: +Received: from mailhost.informatik.uni-hamburg.de [134.100.9.70] + by jadE.jadE-Hamburg.de with IMAP (fetchmail-6.3.9-rc2) + for (single-drop); + Sun, 07 Jul 2013 09:01:19 +0200 (CEST) +Received: from mailhost.informatik.uni-hamburg.de ([unix socket]) + by mailhost (Cyrus v2.3.16) with LMTPA; + Sun, 07 Jul 2013 09:00:24 +0200 +X-Sieve: CMU Sieve 2.3 +Received: from localhost (localhost [127.0.0.1]) + by mailhost.informatik.uni-hamburg.de (Postfix) with ESMTP id 085F02BC + for <4winter@informatik.uni-hamburg.de>; + Sun, 7 Jul 2013 09:00:24 +0200 (CEST) +X-Virus-Scanned: amavisd-new at informatik.uni-hamburg.de +Received: from mailhost.informatik.uni-hamburg.de ([127.0.0.1]) + by localhost (mailhost.informatik.uni-hamburg.de [127.0.0.1]) + (amavisd-new, port 10024) + with LMTP id 4QEaQhRmctP4 for <4winter@informatik.uni-hamburg.de>; + Sun, 7 Jul 2013 09:00:20 +0200 (CEST) +X-policyd-weight: NOT_IN_SBL_XBL_SPAMHAUS=-1.5 NOT_IN_SPAMCOP=-1.5 + BL_NJABL=SKIP(-1.5) CL_IP_EQ_HELO_IP=-2 (check from: .aidecoe. - helo: + .mail-bk0-f41.google. - helo-domain: .google.) 
+ FROM/MX_MATCHES_HELO(DOMAIN)=-2; rate: -8.5 +Received: from mail-bk0-f41.google.com (mail-bk0-f41.google.com + [209.85.214.41]) (using TLSv1 with cipher RC4-SHA (128/128 bits)) + (Client CN "smtp.gmail.com", + Issuer "Google Internet Authority" (verified OK)) + by mailhost.informatik.uni-hamburg.de (Postfix) with ESMTPS id 54E692BB + for <4winter@informatik.uni-hamburg.de>; + Sun, 7 Jul 2013 09:00:19 +0200 (CEST) +Received: by mail-bk0-f41.google.com with SMTP id jc3so1465854bkc.14 + for <4winter@informatik.uni-hamburg.de>; + Sun, 07 Jul 2013 00:00:19 -0700 (PDT) +X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; + d=google.com; s=20120113; + h=from:to:subject:user-agent:date:message-id:mime-version + :content-type:x-gm-message-state; + bh=iV83mH3O1R0iAxhhbmEE6gBIGmdWraTzccthWZKWoy8=; + b=YNTPVhysrfqll3pNjX14dQpeAsy6UcUCgd8FvSz09/S9XEaYyYg5yeHz2wrvDZlBlX + 1bBwV+GY3ibYNYfUyo6JqrhtJ+sERLnW1XYudk3Fk6u96XzFzuttZpKCtWKbvltjfVph + TM41v814YdxlN7A+sk3DQ7jUf7+On0EzDty5sulB+b0Xt1U/zPePrYCJoQYsDhwrIXlQ + yn4tgvG88bU+PWXI5pWmeQE3SalkUZn/y9rkULyiFPMJDMTKxpZFdkwB+DzgXofkp21j + lWaP6S27oXo2vmXDph7WDtSb7X2yKfFkwtNvSCRtj/BnPr4zYZuVGvtd4TLW/2CWs8+e + 1Wig== +X-Received: by 10.205.130.67 with SMTP id hl3mr2583382bkc.61.1373180419276; + Sun, 07 Jul 2013 00:00:19 -0700 (PDT) +From: Amadeusz =?utf-8?B?xbtvxYJub3dza2k=?= +To: Justus Winter <4winter@informatik.uni-hamburg.de> +Subject: [notmuch] Unicode in Python bindings +User-Agent: Notmuch/0.15.2 (http://notmuchmail.org) Emacs/24.3.1 + (x86_64-pc-linux-gnu) +Date: Sun, 07 Jul 2013 09:00:16 +0200 +Message-ID: <87txk6zwfz.fsf@raeviah.aidecoe.name> +MIME-Version: 1.0 +Content-Type: multipart/signed; boundary="=-=-="; + micalg=pgp-sha1; protocol="application/pgp-signature" +X-Gm-Message-State: + ALoCoQl9lHlNXHB3cdksAJhxQ8jpPMCVt2kXej1sNqQ7ciTTAWy6VBuT8bk7E3ijw0KPgILxZGk/ +X-Alot-OpenPGP-Signature-Valid: True +X-Alot-OpenPGP-Signature-Message: Valid: F0134531E1DBFAB5 +Resent-From: Justus Winter 
<4winter@informatik.uni-hamburg.de> +Resent-To: notmuch mailing list +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Tue, 09 Jul 2013 09:49:47 -0000 + +--=-=-= +Content-Type: text/plain; charset=utf-8 +Content-Transfer-Encoding: quoted-printable + +Hello, + +I have come across a problem with Unicode [1] in the afew mail filter which +uses Notmuch Python bindings and it has eventually brought us to +confusion about Unicode handling in Python bindings. + +Shouldn't __unicode__() methods return value of type unicode? Let's +take an example of __unicode__() method from Message class: + + def __unicode__(self): + format =3D "%s (%s) (%s)" + return format % (self.get_header('from'), + self.get_tags(),=20=20=20=20=20=20=20=20=20=20=20= +=20=20=20=20=20 + date.fromtimestamp(self.get_date()), + ) + +format is of type str, not unicode and method is eventually going to +return str, while the user of the API is expecting unicode type. + +I haven't programmed in Python 3, yet - only in Python 2, so maybe I am +missing something. When I was writing a big project in Python 2, I have +eventually decided to use u'' literals everywhere and decode any str to +unicode ASAP - and this solved all issues wrt encodings. I guess that +mixing Python 2 and 3 gets even more problematic. + +Could you review (and fix if it is needed) Python bindings in context of +unicode handling, please?
+ + +[1] https://github.com/teythoon/afew/issues/36 + + +Regards, + +=2D-=20 +Amadeusz =C5=BBo=C5=82nowski + +--=-=-= +Content-Type: application/pgp-signature + +-----BEGIN PGP SIGNATURE----- +Version: GnuPG v2.0.20 (GNU/Linux) + +iQEcBAEBAgAGBQJR2RIAAAoJEPATRTHh2/q1jeEH/2+RY69kyuKNz7gdBY5IvKrV +f1WYHMeGkHFcWMC6Rm6dbaXrzfJe6IP7XW+3MaWNErbfsBQzRiUdI+DPPUQZOpI9 +KVFCx1wa4jtBrf++kUowV3GlKGyyoDr8W9Gii8wnAw7rRRX9Qv4CP0sNhxXLj5xR +WcmjFLGvuEUXUVZZCAqKfpuXa+BA/ix1gPSfHEK3Gr8TkKbsFzR2GGZxpyq+znsq +NrYBLcD2zAP9UUQ+WVKpo8+x9y++WnEkduWqDX2exvUhHl2u3Rl6co0Mg03/HtSQ +1Bb7MbbATEDC1KD7GbxQ42XOEEzXjiIXeAjGQFmEVtTIOalSbZ5TFVJF1OGWlKU= +=peoP +-----END PGP SIGNATURE----- +--=-=-=--
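
The approach described in the message above (text literals everywhere, byte strings decoded as early as possible) might be sketched as follows. This is only an illustration and not code from the notmuch bindings: to_text and describe_message are hypothetical names, UTF-8 is an assumed encoding, and the sender, tags and timestamp are passed in directly instead of being read from a Message object.

```python
from datetime import date

# The text type: "unicode" on Python 2, "str" on Python 3.
text_type = type(u"")


def to_text(value, encoding="utf-8"):
    """Hypothetical helper: return value as text, decoding byte strings.

    The UTF-8 default is an assumption made for this sketch.
    """
    if isinstance(value, bytes):
        return value.decode(encoding)
    if isinstance(value, text_type):
        return value
    return text_type(value)


def describe_message(sender, tags, timestamp):
    """Hypothetical stand-in for a __unicode__ body that returns text."""
    # A text template, so the formatted result is text as well.
    template = u"%s (%s) (%s)"
    return template % (
        to_text(sender),
        u" ".join(to_text(tag) for tag in tags),
        date.fromtimestamp(timestamp),
    )


# Inputs arrive as a mix of byte strings and text; the result is always text.
summary = describe_message(
    u"Amadeusz \u017bo\u0142nowski".encode("utf-8"),
    [b"inbox", u"unread"],
    1373180416,
)
```

Decoding at the boundary keeps every later formatting step working on text only, which is the property the message asks __unicode__() to guarantee.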