From: Carl Worth
To: Albin Stjerna, notmuch@notmuchmail.org
Subject: Re: SpamAssassin or: why can't I search for »[«?
In-Reply-To: <87zkyvxkdd.fsf@nyx.luftslott.org>
References: <87zkyvxkdd.fsf@nyx.luftslott.org>
User-Agent: Notmuch/0.3.1 (http://notmuchmail.org) Emacs/23.2.1 (i486-pc-linux-gnu)
Date: Fri, 29 Oct 2010 16:18:31 -0700
Message-ID: <871v78r3js.fsf@yoom.home.cworth.org>
List-Id: "Use and development of the notmuch mail system."

On Wed, 16 Jun 2010 13:55:10 +0200, Albin Stjerna wrote:
> I've been trying to get notmuch to apply the tag »spam« to these mails,
> but it seems I can neither make it search for upper-case letters nor
> »[«/»]«. My current solution is to tag everything with »spam« somewhere
> in the subject header as spam, which leads to lots of false positives.

Hi Albin,

I'm sorry that nobody answered this fairly simple question of yours
earlier.

What's happening here is that Xapian (the indexer used by notmuch) looks
for "word characters" and "non-word characters" that separate
words[*]. Then, the words are indexed (with numeric information
indicating their position) and the separators are thrown away. So
there's no way to search for separators such as »[«/»]«.

As for case-sensitivity, Xapian does provide capabilities such that
notmuch could offer optional case-sensitive searching. But that might
require more storage space than notmuch is currently using. It would
also require us to add some syntax to the search terms so that a user
can request case-sensitive searches.

Meanwhile, as a long-term fix for your problem, we plan to let you use
notmuch to search for a header such as "X-Spam-Flag: YES". This isn't
currently possible, but when we implement it, it should be much more
reliable than finding flagged spam by looking for words in the subject.

> Also, and much less importantly, is there any way to have notmuch
> harvest email addresses for BBDB?

We haven't written code to do the "insinuate" into bbdb thing by
default, but someone could do that.

Early in my use of notmuch I wrote some scripts that ran notmuch
commands, grepped out addresses, and stuffed them into bbdb. That was
nice at first, but not usable in the long term since the database didn't
grow as new addresses appeared in emails.

More recently, we've added support for tab-based completion of
addresses based on automatic searching through your notmuch mail store
(rather than something external like bbdb). This is quite nice, but
currently takes a bit of effort to set up. See the "how to get email
address completion" instructions here:

	http://notmuchmail.org/emacstips/

In the future, I'd like to get this address completion working by
default without requiring the download of an additional tool (like the
current notmuch_addresses.py or addrlookup programs). Having more direct
support for address completion within the notmuch database itself will
make it faster as well (the current tools are grubbing through actual
mail files to find complete addresses).

-Carl

[*] I'm sure I'm using the wrong terminology for Xapian, and I might
have some details wrong, but the basic idea is hopefully correct.

-- 
carl.d.worth@intel.com
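
A rough illustration of the indexing behaviour described in the message
above, in Python. This is only a sketch and not Xapian's actual
tokenizer: the regular expression, the lowercasing step, the function
name index_terms, and the sample subject line are all assumptions made
for the example. It shows why a bracketed, upper-case subject tag and
the plain word end up as the same indexed term.

```python
import re

# Hypothetical stand-in for an indexer's word splitter (NOT Xapian's real
# rules): runs of word characters become terms, everything else is
# treated as a separator and dropped.
WORD_RE = re.compile(r"\w+")


def index_terms(text):
    """Return (position, term) pairs for the words in text.

    Terms are lowercased (an assumption, modelling a case-insensitive
    index) and separators such as "[" and "]" never reach the result.
    """
    return [
        (position, match.group().lower())
        for position, match in enumerate(WORD_RE.finditer(text), start=1)
    ]


# Made-up subject line of the kind a spam filter might produce.
print(index_terms("[SPAM] Cheap watches"))

# The bracketed upper-case tag and the bare word index identically,
# which is why a search on the subject cannot tell them apart.
print(index_terms("[SPAM]") == index_terms("spam"))
```

Under these assumptions the brackets and the letter case are both lost
before anything is stored, so no query syntax could recover them from
the index afterwards.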