From 09fe6dffcdfdeb6e07c8ce73e510157031bc21ff Mon Sep 17 00:00:00 2001 From: "W. Trevor King" Date: Sat, 15 Feb 2014 08:48:55 +1600 Subject: [PATCH] [PATCH v4 4/4] nmbug-status: Hardcode UTF-8 instead of using the user's locale --- a2/7ef7ab83aa1686e4fe22d249676bb4ad80a5e3 | 127 ++++++++++++++++++++++ 1 file changed, 127 insertions(+) create mode 100644 a2/7ef7ab83aa1686e4fe22d249676bb4ad80a5e3 diff --git a/a2/7ef7ab83aa1686e4fe22d249676bb4ad80a5e3 b/a2/7ef7ab83aa1686e4fe22d249676bb4ad80a5e3 new file mode 100644 index 000000000..ad5c3db19 --- /dev/null +++ b/a2/7ef7ab83aa1686e4fe22d249676bb4ad80a5e3 @@ -0,0 +1,127 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id C08A3431FBD + for ; Fri, 14 Feb 2014 08:50:21 -0800 (PST) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Amavis-Alert: BAD HEADER SECTION, Duplicate header field: "References" +X-Spam-Flag: NO +X-Spam-Score: 0 +X-Spam-Level: +X-Spam-Status: No, score=0 tagged_above=-999 required=5 + tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001] + autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id XZtU8LgAnfeu for ; + Fri, 14 Feb 2014 08:50:14 -0800 (PST) +Received: from qmta04.westchester.pa.mail.comcast.net + (qmta04.westchester.pa.mail.comcast.net [76.96.62.40]) + by olra.theworths.org (Postfix) with ESMTP id 79F2C431FBC + for ; Fri, 14 Feb 2014 08:50:14 -0800 (PST) +Received: from omta17.westchester.pa.mail.comcast.net ([76.96.62.89]) + by qmta04.westchester.pa.mail.comcast.net with comcast + id SETK1n00A1vXlb854GqEZW; Fri, 14 Feb 2014 16:50:14 +0000 +Received: from odin.tremily.us ([24.18.63.50]) + by omta17.westchester.pa.mail.comcast.net with comcast + id SGqD1n005152l3L3dGqDh0; Fri, 14 Feb 2014 16:50:14 +0000 +Received: from 
mjolnir.tremily.us (unknown [192.168.0.140]) + by odin.tremily.us (Postfix) with ESMTPS id C18EC103A9A6; + Fri, 14 Feb 2014 08:50:12 -0800 (PST) +Received: (nullmailer pid 18390 invoked by uid 1000); + Fri, 14 Feb 2014 16:48:57 -0000 +From: "W. Trevor King" +To: notmuch@notmuchmail.org +Subject: [PATCH v4 4/4] nmbug-status: Hardcode UTF-8 instead of using the + user's locale +Date: Fri, 14 Feb 2014 08:48:55 -0800 +Message-Id: + <2ccf8081e5195473199923f76e71cc33552b63df.1392395932.git.wking@tremily.us> +X-Mailer: git-send-email 1.8.5.2.8.g0f6c0d1 +In-Reply-To: +References: +In-Reply-To: +References: +DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=comcast.net; + s=q20121106; t=1392396614; + bh=hZxSG2hNFjKlPLJssd6MNC3pRSCBUYBw9bx42O44jD0=; + h=Received:Received:Received:Received:From:To:Subject:Date: + Message-Id; + b=P4e8PcZIjcDuRxQMvIcp9AZ2PvYtWJ1vsA+AQ/gfXNkWbh1FIb9N8fVDLQvQuawQG + 03tzjQ7ysYejbqjnW7071OofLz/IUa4ptMTffAna9v9dLDTrBBxeuo1V7EL57yEDZi + dGcl1b7N+QEokrZBYjMaKowlsdwAhDv3bwkOPNEmhv6w1UH/Yv2T9RV/rPVQMmrokf + N9sEnZzOg2h7pzxT09VjAmV6VxkBtDudjFs5NZovTikgEgRdSyj/RA5ln0w3443Rwq + uGVMIcQZddidm95dVm42bDC3A08mUAMwOb8dGZUJMro0MmgYdbfufEWsz48yFvxJRk + xVWAwvkKtBIgw== +Cc: Tomi Ollila +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Fri, 14 Feb 2014 16:50:21 -0000 + +David [1] and Tomi [2] both feel that the user's choice of LANG is not +explicit enough to have such a strong effect on nmbug-status. For +example, cron jobs usually default to LANG=C, and that is going to +give you ASCII output: + + $ LANG=C python -c 'import locale; print(locale.getpreferredencoding())' + ANSI_X3.4-1968 + +Trying to print Unicode author names (and other strings) in that +encoding would crash nmbug-status with a UnicodeEncodeError. 
To avoid +that, this patch hardcodes UTF-8, which can handle generic Unicode, +and is the preferred encoding (regardless of LANG settings) for +everyone who has chimed in on the list so far. I'd prefer trusting +LANG, but in the absence of any users that prefer non-UTF-8 encodings +I'm fine with this approach. + +While we could achieve the same effect on the output content by +dropping the previous patch (nmbug-status: Encode output using the +user's locale), Tomi also wanted UTF-8 hardcoded as the config-file +encoding [2]. Keeping the output encoding patch and then adding this +to hardcode both the config-file and output encodings at once seems +the easiest route, now that fd29d3f (nmbug-status: Decode Popen output +using the user's locale, 2014-02-10) has landed in master. + +[1]: id="877g8z4v4x.fsf@zancas.localnet" + http://article.gmane.org/gmane.mail.notmuch.general/17202 +[2]: id="m2vbwj79lu.fsf@guru.guru-group.fi" + http://article.gmane.org/gmane.mail.notmuch.general/17209 +--- + devel/nmbug/nmbug-status | 3 +-- + 1 file changed, 1 insertion(+), 2 deletions(-) + +diff --git a/devel/nmbug/nmbug-status b/devel/nmbug/nmbug-status +index c4532f1..ef7169a 100755 +--- a/devel/nmbug/nmbug-status ++++ b/devel/nmbug/nmbug-status +@@ -13,7 +13,6 @@ import codecs + import collections + import datetime + import email.utils +-import locale + try: # Python 3 + from urllib.parse import quote + except ImportError: # Python 2 +@@ -27,7 +26,7 @@ import subprocess + import xml.sax.saxutils + + +-_ENCODING = locale.getpreferredencoding() or sys.getdefaultencoding() ++_ENCODING = 'UTF-8' + _PAGES = {} + + +-- +1.8.5.2.8.g0f6c0d1 + -- 2.26.2
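
The failure mode described in the commit message can be reproduced outside nmbug-status. The following is a minimal sketch, not code from the patch: the sample author name and the `try_encode` helper are hypothetical, and only the `_ENCODING = 'UTF-8'` assignment mirrors the change itself. It shows that the ASCII codec (the encoding reported as ANSI_X3.4-1968 under LANG=C) cannot represent non-ASCII text, while UTF-8 can.

```python
# -*- coding: utf-8 -*-
from __future__ import unicode_literals

_ENCODING = 'UTF-8'  # the value the patch hardcodes

# Hypothetical sample text; not taken from nmbug-status or the list.
author = 'Jos\u00e9 \u00c5str\u00f6m'


def try_encode(text, encoding):
    """Return text encoded as bytes, or None if the codec cannot represent it."""
    try:
        return text.encode(encoding)
    except UnicodeEncodeError:
        return None


# ANSI_X3.4-1968 is the formal name for ASCII, so 'ascii' stands in for
# what locale.getpreferredencoding() reports under LANG=C.
ascii_result = try_encode(author, 'ascii')
utf8_result = try_encode(author, _ENCODING)

print('ascii encodable:', ascii_result is not None)
print('utf-8 encodable:', utf8_result is not None)
```

With the encoding fixed to UTF-8, the result no longer depends on the environment the script runs in, which is the property the cron-job example in the commit message is concerned with.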