From 67e13e05980351ca3e6a3fcd3ee1c0967a610f60 Mon Sep 17 00:00:00 2001 From: Xu Wang Date: Wed, 30 Sep 2015 00:51:01 +2000 Subject: [PATCH] Re: correct way to search for only PDF attachments --- 11/f805837615c0a21471fc0ef9d61dc4edb6f449 | 108 ++++++++++++++++++++++ 1 file changed, 108 insertions(+) create mode 100644 11/f805837615c0a21471fc0ef9d61dc4edb6f449 diff --git a/11/f805837615c0a21471fc0ef9d61dc4edb6f449 b/11/f805837615c0a21471fc0ef9d61dc4edb6f449 new file mode 100644 index 000000000..55e39e732 --- /dev/null +++ b/11/f805837615c0a21471fc0ef9d61dc4edb6f449 @@ -0,0 +1,108 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by arlo.cworth.org (Postfix) with ESMTP id E19DC6DE0A7F + for ; Mon, 28 Sep 2015 21:51:05 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at cworth.org +X-Spam-Flag: NO +X-Spam-Score: -0.523 +X-Spam-Level: +X-Spam-Status: No, score=-0.523 tagged_above=-999 required=5 tests=[AWL=0.047, + DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, + FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, RCVD_IN_DNSWL_LOW=-0.7, + RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001] + autolearn=disabled +Received: from arlo.cworth.org ([127.0.0.1]) + by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id 28DnPa_fmzmN for ; + Mon, 28 Sep 2015 21:51:03 -0700 (PDT) +Received: from mail-ob0-f196.google.com (mail-ob0-f196.google.com + [209.85.214.196]) + by arlo.cworth.org (Postfix) with ESMTPS id 0B46A6DE0274 + for ; Mon, 28 Sep 2015 21:51:03 -0700 (PDT) +Received: by obczc1 with SMTP id zc1so2128774obc.3 + for ; Mon, 28 Sep 2015 21:51:02 -0700 (PDT) +DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; + h=mime-version:in-reply-to:references:date:message-id:subject:from:to + :cc:content-type; + bh=3/hUH2KXcFihYxx+QXvvPTSSotEVHReXOmmc0xAoAhI=; + 
b=FVm06jCA7VonwRUhAgi8PksthWpTWDbEb1XcW6QV4nk62SPxN9K+u1qMmvKrGWidkX + VW5lu1ox2Hsh4RzybkNBQM5l/LQ4/FM7KXbDjhO3N9OpsMqv0A3yds9xnEJpPmOSKkr0 + ndf+HoCqWd1GX2pYx0E5P90XYaJciz279CWbTVo9hNX8mUg2mTgpyPjma4uQYT3ybiHC + DM7Dbq11j30TxF4kGpCS+cHc+hwzJPhc0lX3eGkwA0uIi1Z8EZG0A/X5kVVXi7y9HZf7 + TFXanK/oy1DzfINJWvvIMljl2KdzoM4TvzuxwluJro36nP7d6lhLUOX1X3mgt3XgpiUj + tadg== +MIME-Version: 1.0 +X-Received: by 10.182.138.40 with SMTP id qn8mr99812obb.78.1443502262004; Mon, + 28 Sep 2015 21:51:02 -0700 (PDT) +Received: by 10.202.212.204 with HTTP; Mon, 28 Sep 2015 21:51:01 -0700 (PDT) +In-Reply-To: <87vbau9e8i.fsf@wondoo.home.cworth.org> +References: + + <87vbau9e8i.fsf@wondoo.home.cworth.org> +Date: Tue, 29 Sep 2015 00:51:01 -0400 +Message-ID: + +Subject: Re: correct way to search for only PDF attachments +From: Xu Wang +To: Carl Worth +Content-Type: text/plain; charset=UTF-8 +Cc: notmuch@notmuchmail.org +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.18 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Tue, 29 Sep 2015 04:51:06 -0000 + +On Mon, Sep 28, 2015 at 10:00 PM, Carl Worth wrote: +> On Mon, Sep 28 2015, Xu Wang wrote: +>> I would look to look for all emails from a colleague jongho. I tried: +>> +>> from:jongho attachment:pdf +>> +>> which seems to do as I wanted. +> +> Good. That should work. +> +>> To understand more, what does the following search for? +>> +>> from:jongho attachment:.*pdf +> +> Uhm, probably only strange things. There are some mechanisms for getting +> notmuch to emit some debugging information on what the final search +> terms end up being, (but I don't recall if they still require +> recompilation or not). +> +> I'm not testing now, but I wouldn't be surprised if that ended up doing +> something like searching for a phrase like "attachment pdf" anywhere +> within a message. 
(The Xapian parser can be somewhat unpredictable when +> you give it unexpected input.) +> +>> Also, how does the first one above know that I want only PDF +>> attachments and not an attachment called "pdformula.txt" ? +> +> It doesn't know that you want only PDF attachments. The key part is that +> the indexing is performed by breaking text up into individual terms, (at +> punctuation boundaries usually). So a search specification like +> "attachment:pdf" is searching for things that were indexed with the +> "pdf" term within the attachment prefix. So that won't match a filename +> like pdformula.txt, (which would be indexed as two terms, "pdformula" +> and "txt"), but it would match pdf.ormula.txt, (which would be indexed +> as three terms, "pdf", "ormula" and "txt"). +> +> The Xapian documentation can be examined if you want more details. + +This is highly useful. Thanks for such an explanation!! Thank you, Carl. + +Kind regards, + +Xu -- 2.26.2
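
The term-splitting behaviour Carl describes can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names split_terms and matches are hypothetical, and the regular expression (split at any run of non-alphanumeric ASCII characters) is a simplified stand-in for the indexer, not Xapian's actual tokenizer.

```python
import re


def split_terms(text):
    """Split text into lowercase terms at runs of non-alphanumeric characters.

    Assumption: a simplified approximation of "breaking text up into
    individual terms at punctuation boundaries", ASCII only. It is not
    the tokenizer that Xapian or notmuch really use.
    """
    return [t.lower() for t in re.split(r"[^0-9A-Za-z]+", text) if t]


def matches(term, filename):
    """Return True if `term` is one of the whole terms of `filename`."""
    return term.lower() in split_terms(filename)


print(split_terms("pdformula.txt"))      # ['pdformula', 'txt']
print(matches("pdf", "pdformula.txt"))   # False: "pdf" is only a prefix of a term
print(matches("pdf", "pdf.ormula.txt"))  # True: "pdf" is a term of its own
print(matches("pdf", "report.pdf"))      # True: the extension is a separate term
```

Under this model, attachment:pdf is a whole-term match within the attachment prefix, which is why it finds report.pdf and pdf.ormula.txt but not pdformula.txt.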