From: Jani Nikula
To: Albin Stjerna, notmuch@notmuchmail.org
Subject: Re: Bug: problem decoding some non-ascii characters in subjects
Date: Sat, 09 Feb 2013 12:06:58 +0200

On Sat, 09 Feb 2013, Albin Stjerna wrote:
> Jani Nikula wrote:
>
>> On Fri, 08 Feb 2013, Albin Stjerna wrote:
>> > I've been noticing that notmuch has some problems decoding certain
>> > strangely-encoded non-ascii characters in certain emails. For example,
>> > today I got this: [BIBLIST] Digitaliseringensprojektens skadliga
>> > f=?ISO-8859-1?Q?=F6rk=E4rlek_f=F6r_?= PDF-formatet (should be
>> > rendered: »Digitaliseringsprojektens skadliga förkärlek för
>> > PDF-formatet«).
>> >
>> > Apparently, some metadata is passed on to help the MUA decode the
>> > string, but notmuch doesn't seem to handle it. Entire emails can of
>> > course be supplied as needed.
>
>> Please copy paste the Subject: header directly from the message file.
>
> The exact Subject: header (from the file, not notmuch) is:
> Subject: [BIBLIST] Digitaliseringensprojektens skadliga f=?ISO-8859-1?Q?=F6rk=E4rlek_f=F6r_?= PDF-formatet

Is that entirely on one line in the original message file? If not, where
exactly is it split?

Either way, at a glance, it seems like the encoding is malformed. I think
the encoded-word ("=?" charset "?" encoding "?" encoded-text "?=") should
be separated from the surrounding text by whitespace to make it an atom
[RFC 2047, RFC 2822]. If you manually move the leading 'f' after the "?Q?"
bit, it works as expected.

It looks like the bug is in the sender's user agent.

BR,
Jani.
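
To illustrate the point about atoms, here is a minimal Python sketch of a strict decoder that only decodes a token when the whole whitespace-delimited token is an encoded-word. It is not the code notmuch or its MIME library actually uses; the names `decode_word` and `decode_header_strict` are made up for this example, and it ignores details such as dropping the whitespace between two adjacent encoded-words.

```python
import base64
import quopri
import re

# RFC 2047 encoded-word: "=?" charset "?" encoding "?" encoded-text "?="
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=")


def decode_word(token: str) -> str:
    """Decode token only if the *entire* token is an encoded-word."""
    match = ENCODED_WORD.fullmatch(token)
    if match is None:
        # Not a standalone atom (e.g. it has a leading 'f'): leave it alone.
        return token
    charset, encoding, payload = match.groups()
    if encoding.upper() == "Q":
        # header=True makes "_" decode to a space, as the Q encoding requires.
        raw = quopri.decodestring(payload.encode("ascii"), header=True)
    else:
        raw = base64.b64decode(payload)
    return raw.decode(charset)


def decode_header_strict(value: str) -> str:
    """Hypothetical strict decoder: treat each space-separated token as an atom."""
    return " ".join(decode_word(token) for token in value.split(" "))


# The Subject: as reported, with 'f' glued to the front of the encoded-word.
malformed = ("[BIBLIST] Digitaliseringensprojektens skadliga "
             "f=?ISO-8859-1?Q?=F6rk=E4rlek_f=F6r_?= PDF-formatet")

# The same header with the leading 'f' moved after the "?Q?" bit.
fixed = ("[BIBLIST] Digitaliseringensprojektens skadliga "
         "=?ISO-8859-1?Q?f=F6rk=E4rlek_f=F6r_?= PDF-formatet")

print(decode_header_strict(malformed))  # encoded-word is left undecoded
print(decode_header_strict(fixed))      # encoded-word is decoded
```

A more lenient decoder could search for encoded-words anywhere inside a token, which would also handle the malformed header, at the cost of accepting input that the RFC does not allow.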