[PATCH v1 1/2] emacs: Observe the charset of MIME parts when reading them.
author David Edmondson <dme@dme.org>
Sat, 30 Apr 2016 06:51:47 +0000 (07:51 +0100)
committer W. Trevor King <wking@tremily.us>
Sat, 20 Aug 2016 23:21:42 +0000 (16:21 -0700)
3c/d77137e8331a2fb4435388e1a619a6a545d3eb [new file with mode: 0644]

diff --git a/3c/d77137e8331a2fb4435388e1a619a6a545d3eb b/3c/d77137e8331a2fb4435388e1a619a6a545d3eb
new file mode 100644 (file)
index 0000000..0e5f2d5
--- /dev/null
@@ -0,0 +1,130 @@
+Return-Path: <dme@dme.org>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+ by arlo.cworth.org (Postfix) with ESMTP id 415696DE035C\r
+ for <notmuch@notmuchmail.org>; Fri, 29 Apr 2016 23:52:02 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at cworth.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 0.231\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=0.231 tagged_above=-999 required=5 tests=[AWL=0.298, \r
+ DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_LOW=-0.7,\r
+ RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_NEUTRAL=0.652,\r
+ UNPARSEABLE_RELAY=0.001] autolearn=disabled\r
+Received: from arlo.cworth.org ([127.0.0.1])\r
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)\r
+ with ESMTP id zQeplO8f_yv9 for <notmuch@notmuchmail.org>;\r
+ Fri, 29 Apr 2016 23:51:54 -0700 (PDT)\r
+Received: from mail-wm0-f49.google.com (mail-wm0-f49.google.com\r
+ [74.125.82.49]) by arlo.cworth.org (Postfix) with ESMTPS id 9AF426DE034D for\r
+ <notmuch@notmuchmail.org>; Fri, 29 Apr 2016 23:51:53 -0700 (PDT)\r
+Received: by mail-wm0-f49.google.com with SMTP id n129so50703025wmn.1\r
+ for <notmuch@notmuchmail.org>; Fri, 29 Apr 2016 23:51:53 -0700 (PDT)\r
+DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\r
+ d=dme-org.20150623.gappssmtp.com; s=20150623;\r
+ h=from:to:subject:date:message-id:in-reply-to:references;\r
+ bh=Jn5Xj5FrI+RVuER5XWDYYAQrxqIEiH1Kkda3TGp4IW0=;\r
+ b=f8mvvXiwA3BWMxVq+9LXpIH2m9dkunv5rGmltLq5501uvmpoIvIbMYDJYQT6TPX/8Y\r
+ XpEHnEv/FHRg60cMLV/Q9Y6uL5gnCOX8kSh9In/Fz1dShhVD1Aaob7UG+73Udje6FByw\r
+ Z1j8c6KVGMOOMuFNvM/7DTd53rBUUnNgueTIIkQJEvsVwRKZ2cH+pCGLASsZl5YILNMG\r
+ Ebo8GaGvbe8MhmLi9Z9yOGWomFpFurCI7J1lW8uyXPl+swgzZ3UUeFw+YWVvzlrj8c5r\r
+ B32o3NMTaB3ate0Xe2n/NZirLefJ4Isq/DTWhtIioUy1lyNYw1a0ehxzcIn/pKg0w3W2\r
+ S9iw==\r
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\r
+ d=1e100.net; s=20130820;\r
+ h=x-gm-message-state:from:to:subject:date:message-id:in-reply-to\r
+ :references;\r
+ bh=Jn5Xj5FrI+RVuER5XWDYYAQrxqIEiH1Kkda3TGp4IW0=;\r
+ b=Fvy5D49GJeo+9N1qjHay4JgTdOpI6hyc6Xt7NraIoCuCXQUwMIZ/IX8jSHcueFP7qs\r
+ MBc/27CpOeQusM2NcxpPKNdpjnHvyvwdtnxpFraMWkVTQkpKr+CqzrjzT1BrsIcxLe5Z\r
+ UrxJR+6wqgc9Cr6Mne5jmBLE9aLQWXhRbuyu8HJx3Jo1kuezD7WHHLYmRImI54nbVl0+\r
+ JM92Kjw+5gr5sUrDCxtNQ7+0QcpqixwXvNVhCTVxTQ+6IyLbsASOMeW15E01PQz9Xl83\r
+ nI4ElvrUBmZkp5J6ti6qDCky08D8yBT8707Bf5tjdm2MYziqGxCiOTIGGlYVxFJVicCh\r
+ cyQQ==\r
+X-Gm-Message-State:\r
+ AOPr4FWR/+GnChNZg2dKmU+UlfboOO/oOpXxoLQvwEg04TmcmtV0OYMEFsH2suXLssVMTQ==\r
+X-Received: by 10.194.58.138 with SMTP id r10mr25459936wjq.153.1461999112294; \r
+ Fri, 29 Apr 2016 23:51:52 -0700 (PDT)\r
+Received: from disaster-area.hh.sledj.net (disaster-area.hh.sledj.net.\r
+ [81.149.164.25])\r
+ by smtp.gmail.com with ESMTPSA id k139sm6774756wmg.24.2016.04.29.23.51.50\r
+ for <notmuch@notmuchmail.org>\r
+ (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\r
+ Fri, 29 Apr 2016 23:51:51 -0700 (PDT)\r
+Received: from localhost (disaster-area.hh.sledj.net [local])\r
+ by disaster-area.hh.sledj.net (OpenSMTPD) with ESMTPA id 539b1410\r
+ for <notmuch@notmuchmail.org>; Sat, 30 Apr 2016 06:51:48 +0000 (UTC)\r
+From: David Edmondson <dme@dme.org>\r
+To: notmuch@notmuchmail.org\r
+Subject: [PATCH v1 1/2] emacs: Observe the charset of MIME parts when reading\r
+ them.\r
+Date: Sat, 30 Apr 2016 07:51:47 +0100\r
+Message-Id: <1461999108-68582-2-git-send-email-dme@dme.org>\r
+X-Mailer: git-send-email 2.7.1\r
+In-Reply-To: <1461999108-68582-1-git-send-email-dme@dme.org>\r
+References: <1461999108-68582-1-git-send-email-dme@dme.org>\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.20\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+ <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <https://notmuchmail.org/mailman/options/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch/>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <https://notmuchmail.org/mailman/listinfo/notmuch>,\r
+ <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sat, 30 Apr 2016 06:52:02 -0000\r
+\r
+`notmuch--get-bodypart-raw' previously read every non-binary MIME part\r
+as though it were UTF-8 encoded. This was demonstrated to be wrong: a\r
+part marked as ISO-8859-1 that included accented characters was\r
+decoded with the wrong charset, and those characters were rendered\r
+incorrectly as a result.\r
+\r
+Rather than assuming UTF-8, attempt to use the part's declared charset\r
+when reading it, falling back to US-ASCII if the declared charset is\r
+unknown, unsupported or invalid.\r
+---\r
+ emacs/notmuch-lib.el | 16 +++++++++++++++-\r
+ 1 file changed, 15 insertions(+), 1 deletion(-)\r
+\r
+diff --git a/emacs/notmuch-lib.el b/emacs/notmuch-lib.el\r
+index 78978ee..f05ded6 100644\r
+--- a/emacs/notmuch-lib.el\r
++++ b/emacs/notmuch-lib.el\r
+@@ -23,6 +23,7 @@\r
+ \r
+ ;;; Code:\r
+ \r
++(require 'mm-util)\r
+ (require 'mm-view)\r
+ (require 'mm-decode)\r
+ (require 'cl)\r
+@@ -572,7 +573,20 @@ the given type."\r
+                                  ,@(when process-crypto '("--decrypt"))\r
+                                  ,(notmuch-id-to-query (plist-get msg :id))))\r
+                          (coding-system-for-read\r
+-                          (if binaryp 'no-conversion 'utf-8)))\r
++                          (if binaryp 'no-conversion\r
++                            (let ((coding-system (mm-charset-to-coding-system\r
++                                                  (plist-get part :content-charset))))\r
++                              ;; Sadly,\r
++                              ;; `mm-charset-to-coding-system' seems\r
++                              ;; to return things that are not\r
++                              ;; considered acceptable values for\r
++                              ;; `coding-system-for-read'.\r
++                              (if (coding-system-p coding-system)\r
++                                  coding-system\r
+                                ;; RFC 2046 says that the default\r
+                                ;; charset is US-ASCII. RFC 6657\r
+                                ;; complicates this somewhat.\r
++                                'us-ascii)))))\r
+                      (apply #'call-process notmuch-command nil '(t nil) nil args)\r
+                      (buffer-string))))))\r
+     (when (and cache data)\r
+-- \r
+2.7.1\r
+\r
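
The charset selection in the hunk above can be read as a small standalone helper. The following is only a sketch under stated assumptions, not the code from the patch: the function name `my-part-read-coding-system' is hypothetical, and the explicit nil check is an addition made here because `coding-system-p' also returns t for nil. `mm-charset-to-coding-system' and `coding-system-p' are the same calls the patch uses.

```elisp
;; Sketch only: `my-part-read-coding-system' is a hypothetical name,
;; not part of notmuch or of the patch above.
(require 'mm-util)

(defun my-part-read-coding-system (charset)
  "Return a coding system suitable for reading a part declared as CHARSET.
CHARSET is a MIME charset name (a string) or nil.  Fall back to
`us-ascii' when CHARSET is missing or does not map to a coding
system that Emacs knows about."
  (let ((coding-system (and charset
                            (mm-charset-to-coding-system charset))))
    ;; `coding-system-p' accepts nil as well, so nil is tested
    ;; explicitly (an assumption added in this sketch) to make sure
    ;; the fallback applies when no charset is available.
    (if (and coding-system (coding-system-p coding-system))
        coding-system
      'us-ascii)))
```

A caller could bind `coding-system-for-read' to the result of `(my-part-read-coding-system (plist-get part :content-charset))' around the `call-process' invocation, which mirrors the structure of the patched code for the non-binary case.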