From: Austin Clements <amdragon@mit.edu>
To: David Bremner <david@tethera.net>, notmuch@notmuchmail.org
Subject: Re: [PATCH 06/11] emacs: Remove broken `notmuch-get-bodypart-content'
In-Reply-To: <8738e8p13v.fsf@maritornes.cs.unb.ca>
References: <1398105468-14317-1-git-send-email-amdragon@mit.edu>
 <1398105468-14317-7-git-send-email-amdragon@mit.edu>
 <8738e8p13v.fsf@maritornes.cs.unb.ca>
User-Agent: Notmuch/0.18.1+86~gef5e66a (http://notmuchmail.org) Emacs/24.4.1
 (x86_64-pc-linux-gnu)
Date: Sat, 24 Jan 2015 12:10:29 -0500
Message-ID: <874mrgumt6.fsf@csail.mit.edu>
\r
On Fri, 11 Jul 2014, David Bremner <david@tethera.net> wrote:
> Austin Clements <amdragon@MIT.EDU> writes:
>
>> +This returns the content of the given part as a multibyte Lisp
>
> What does "multibyte" mean here? utf8? current encoding?
\r
Elisp has two kinds of strings: "unibyte strings" and "multibyte
strings":

https://www.gnu.org/software/emacs/manual/html_node/elisp/Non_002dASCII-in-Strings.html

You can think of unibyte strings as binary data; they're just vectors of
bytes without any particular encoding semantics (though when you use a
unibyte string you can endow it with an encoding). Multibyte strings,
however, are text; they're vectors of Unicode code points.
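To make the distinction concrete, here is a minimal sketch (not part of the patch; `my-string-kinds-demo` is a made-up name used only for illustration) that converts one multibyte string to UTF-8 bytes and back:

```elisp
;; Illustrative only; `my-string-kinds-demo' is a hypothetical name.
(defun my-string-kinds-demo ()
  "Return a list of facts about one multibyte and one unibyte string."
  (let* ((text "caf\u00e9")                          ; multibyte: 4 characters
         (bytes (encode-coding-string text 'utf-8))  ; unibyte: 5 raw bytes
         (back (decode-coding-string bytes 'utf-8))) ; multibyte again
    (list (multibyte-string-p text)     ; is TEXT multibyte?
          (multibyte-string-p bytes)    ; is BYTES multibyte?
          (length text)                 ; counts characters
          (length bytes)                ; counts bytes
          (string= text back))))        ; does the round trip preserve the text?

(my-string-kinds-demo)   ; => (t nil 4 5 t)
```

The "é" is one element of the multibyte string but two elements of the unibyte one, which is why the two lengths differ.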
\r
>> +string after performing content transfer decoding and any
>> +necessary charset decoding. It is an error to use this for
>> +non-text/* parts."
>> +  (let ((content (plist-get part :content)))
>> +    (when (not content)
>> +      ;; Use show --format=sexp to fetch decoded content
>> +      (let* ((args `("show" "--format=sexp" "--include-html"
>> +                     ,(format "--part=%s" (plist-get part :id))
>> +                     ,@(when process-crypto '("--decrypt"))
>> +                     ,(notmuch-id-to-query (plist-get msg :id))))
>> +             (npart (apply #'notmuch-call-notmuch-sexp args)))
>> +        (setq content (plist-get npart :content))
>> +        (when (not content)
>> +          (error "Internal error: No :content from %S" args))))
>
> I'm a bit curious at the lack of setting "coding-system-for-read" here.
> Are we assuming the user has their environment set up correctly? Not so
> much a criticism as being nervous about everything coding-system
\r
That is interesting. coding-system-for-read should really go in
notmuch-call-notmuch-sexp, but I worry that, while *almost* all strings
the CLI outputs are UTF-8, not quite all of them are. For example, we
output filenames exactly as the OS reports the bytes to us (which is
necessary, in a sense, because POSIX enforces no particular encoding on
file names, but still really unfortunate).

We could set coding-system-for-read, but a full solution needs more
cooperation from the CLI. Possibly the right answer, at least for the
sexp format, is to do our own UTF-8 to "\uXXXX" escapes for strings that
are known to be UTF-8 and leave the raw bytes for the few that aren't.
Then we would set the coding-system-for-read to 'no-conversion and I
think everything would Just Work.
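As a rough sketch of the Emacs half of that idea only: the helper below reads a subprocess's output as raw bytes by binding coding-system-for-read to 'no-conversion. The name `my-call-process-raw` is hypothetical, `printf` merely stands in for the CLI, and the explicit per-string decode is a substitute for the "\uXXXX" escapes proposed above, which would need CLI changes that don't exist.

```elisp
;; Hypothetical helper for illustration; not part of notmuch.
(defun my-call-process-raw (program &rest args)
  "Run PROGRAM with ARGS and return its stdout as a unibyte string."
  (with-temp-buffer
    (set-buffer-multibyte nil)          ; buffer holds raw bytes, not characters
    (let ((coding-system-for-read 'no-conversion))
      (apply #'call-process program nil t nil args))
    (buffer-string)))

;; A string known to be UTF-8 is decoded explicitly after the fact ...
(decode-coding-string (my-call-process-raw "printf" "caf\\303\\251") 'utf-8)

;; ... while something like a filename can stay as undecoded bytes.
(my-call-process-raw "printf" "\\377")
```

Decoding per string keeps the choice of encoding next to the code that knows what the string is, at the cost of every caller having to remember to do it.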
\r
That doesn't help for JSON, which is supposed to be all UTF-8 all the
time. I can think of solutions there, but they're all ugly and involve
things like encoding filenames as base64 when they aren't valid UTF-8.

So... I don't think I'm going to do anything about this at this moment.

> I didn't see anything else to object to in this patch or the previous
\r