From: Todd
Date: Sat, 10 Jan 2015 14:38:09 +0000 (+1800)
Subject: Re: [PATCH] Index Content-Type of attachments with a contenttype prefix
X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=7222a0173345d3f2f09b690f7cf92f2bbb96cc1b;p=notmuch-archives.git

Re: [PATCH] Index Content-Type of attachments with a contenttype prefix
---

diff --git a/03/d519bb4fe678bfd504a89bb510a7a131ff3252 b/03/d519bb4fe678bfd504a89bb510a7a131ff3252
new file mode 100644
index 000000000..164246ec4
--- /dev/null
+++ b/03/d519bb4fe678bfd504a89bb510a7a131ff3252
@@ -0,0 +1,205 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by olra.theworths.org (Postfix) with ESMTP id 41068429E52
+ for ; Sat, 10 Jan 2015 06:38:37 -0800 (PST)
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org
+X-Spam-Flag: NO
+X-Spam-Score: 2.438
+X-Spam-Level: **
+X-Spam-Status: No, score=2.438 tagged_above=-999 required=5
+ tests=[DNS_FROM_AHBL_RHSBL=2.438] autolearn=disabled
+Received: from olra.theworths.org ([127.0.0.1])
+ by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id URu+DJ9Q8apK for ;
+ Sat, 10 Jan 2015 06:38:33 -0800 (PST)
+Received: from s75.web-hosting.com (s75.web-hosting.com [198.187.31.9])
+ (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))
+ (No client certificate requested)
+ by olra.theworths.org (Postfix) with ESMTPS id 958D6429E54
+ for ; Sat, 10 Jan 2015 06:38:33 -0800 (PST)
+Received: from user-69-73-37-128.knology.net ([69.73.37.128]:33524
+ helo=tz-lab) by server75.web-hosting.com with esmtpsa
+ (UNKNOWN:DHE-RSA-AES128-SHA:128) (Exim 4.82) (envelope-from
+ ) id 1Y9xBQ-0046Hr-I9; Sat, 10 Jan 2015 09:38:32
+ -0500
+From: Todd
+To: Jani Nikula , notmuch@notmuchmail.org
+Subject: Re: [PATCH] Index Content-Type of attachments with a contenttype
+ prefix
+In-Reply-To: <8761ce7s16.fsf@nikula.org>
+References:
<1420849787-4401-1-git-send-email-todd@electricoding.com>
+ <8761ce7s16.fsf@nikula.org>
+User-Agent: Notmuch/0.19+17~gd8b219d (http://notmuchmail.org) Emacs/24.4.1
+ (x86_64-unknown-linux-gnu)
+Date: Sat, 10 Jan 2015 08:38:09 -0600
+Message-ID: <87fvbi8zvy.fsf@electricoding.com>
+MIME-Version: 1.0
+Content-Type: text/plain
+X-AntiAbuse: This header was added to track abuse,
+ please include it with any abuse report
+X-AntiAbuse: Primary Hostname - server75.web-hosting.com
+X-AntiAbuse: Original Domain - notmuchmail.org
+X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]
+X-AntiAbuse: Sender Address Domain - electricoding.com
+X-Get-Message-Sender-Via: server75.web-hosting.com: authenticated_id:
+ todd@electricoding.com
+X-Source:
+X-Source-Args:
+X-Source-Dir:
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.13
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Sat, 10 Jan 2015 14:38:37 -0000
+
+
+>>>>> "Jani" == Jani Nikula writes:
+
+ Jani> On Sat, 10 Jan 2015, Todd wrote:
+ >> I wanted to tag messages with calendar invitations, but couldn't as
+ >> the information wasn't indexed.
+ >>
+ >> This patch allows for queries like:
+ >>
+ >> Find calendar invites
+ >> - contenttype:text/calendar or contenttype:application/ics
+ >>
+ >> Find any image attachments
+ >> - contenttype:image
+ >>
+ >> Find all patches
+ >> - contenttype:text/x-patch
+ >>
+ >>
+ >> - Todd
+ >>
+ >> ---
+ >> NEWS | 6 ++++++
+ >> completion/notmuch-completion.bash | 2 +-
+ >> doc/man7/notmuch-search-terms.rst | 6 ++++++
+ >> emacs/notmuch.el | 2 +-
+ >> lib/database.cc | 1 +
+ >> lib/index.cc | 5 +++++
+ >> test/T190-multipart.sh | 32 ++++++++++++++++++++++++++++++++
+
+ Jani> IMO these could be split into several patches.
+
+ No problem, I'll split them up the next time I post.
+
+ >> 7 files changed, 52 insertions(+), 2 deletions(-)
+ >>
+ >> diff --git a/NEWS b/NEWS
+ >> index 44e8d05..5f4622c 100644
+ >> --- a/NEWS
+ >> +++ b/NEWS
+ >> @@ -15,6 +15,12 @@ keyboard shortcuts to saved searches.
+ >> Command-Line Interface
+ >> ----------------------
+ >>
+ >> +There is a new `contenttype:` search prefix
+ >> +
+ >> + The new `contenttype:` search prefix allows searching for the
+ >> + content-type of attachments, which is now indexed by `notmuch
+ >> + insert`. See the `notmuch-search-terms` manual page for details.
+ >> +
+
+ Jani> Admittedly I did not have the time to dig into details, but I think
+ Jani> "attachment" is misleading, as it's really all mime parts, right?
+
+ Jani> Will this also index the Content-Type: header of the message itself,
+ Jani> regardless of whether it has mime structure or not? Maybe it
+ Jani> should?
+
+ Yes, all mime-parts. It does not index the Content-Type of the
+ message itself. That probably wouldn't be difficult to add if it's
+ a desired feature, but if there are plans for indexing other message
+ headers it may fit better there.
+
+ I also wasn't too happy with a "contenttype" keyword and debated
+ just indexing the information under "attachment" along with the
+ filename.
+
+ >> Stopped `notmuch dump` failing if someone writes to the database
+ >>
+ >> The dump command now takes the write lock when running.
This
+ >> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash
+ >> index d58dc8b..05b5969 100644
+ >> --- a/completion/notmuch-completion.bash
+ >> +++ b/completion/notmuch-completion.bash
+ >> @@ -61,7 +61,7 @@ _notmuch_search_terms()
+ >> sed "s|^$path/||" | grep -v "\(^\|/\)\(cur\|new\|tmp\)$" ) )
+ >> ;;
+ >> *)
+ >> - local search_terms="from: to: subject: attachment: tag: id: thread: folder: path: date:"
+ >> + local search_terms="from: to: subject: attachment: contenttype: tag: id: thread: folder: path: date:"
+ >> compopt -o nospace
+ >> COMPREPLY=( $(compgen -W "${search_terms}" -- ${cur}) )
+ >> ;;
+ >> diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst
+ >> index 1acdaa0..d126ce6 100644
+ >> --- a/doc/man7/notmuch-search-terms.rst
+ >> +++ b/doc/man7/notmuch-search-terms.rst
+ >> @@ -40,6 +40,8 @@ indicate user-supplied values):
+ >>
+ >> - attachment:
+ >>
+ >> +- contenttype:
+ >> +
+ >> - tag: (or is:)
+ >>
+ >> - id:
+ >> @@ -66,6 +68,10 @@ by including quotation marks around the phrase, immediately following
+ >> The **attachment:** prefix can be used to search for specific filenames
+ >> (or extensions) of attachments to email messages.
+ >>
+ >> +The **contenttype:** prefix can be used to search for specific
+ >> +content-types of attachments to email messages (as specified by the
+ >> +sender).
+ >> +
+ >> For **tag:** and **is:** valid tag values include **inbox** and
+ >> **unread** by default for new messages added by **notmuch new** as well
+ >> as any other tag values added manually with **notmuch tag**.
+ >> diff --git a/emacs/notmuch.el b/emacs/notmuch.el
+ >> index 218486a..702700c 100644
+ >> --- a/emacs/notmuch.el
+ >> +++ b/emacs/notmuch.el
+ >> @@ -858,7 +858,7 @@ PROMPT is the string to prompt with."
+ >> (lexical-let
+ >> ((completions
+ >> (append (list "folder:" "path:" "thread:" "id:" "date:" "from:" "to:"
+ >> - "subject:" "attachment:")
+ >> + "subject:" "attachment:" "contenttype:")
+ >> (mapcar (lambda (tag)
+ >> (concat "tag:" (notmuch-escape-boolean-term tag)))
+ >> (process-lines notmuch-command "search" "--output=tags" "*")))))
+ >> diff --git a/lib/database.cc b/lib/database.cc
+ >> index 3601f9d..a7a64c9 100644
+ >> --- a/lib/database.cc
+ >> +++ b/lib/database.cc
+ >> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {
+ >> { "from", "XFROM" },
+ >> { "to", "XTO" },
+ >> { "attachment", "XATTACHMENT" },
+ >> + { "contenttype", "XCONTENTTYPE"},
+ >> { "subject", "XSUBJECT"},
+
+ Jani> Is the use of probabilistic prefix intentional? I think it's probably
+ Jani> the right thing to do, but just checking.
+
+ I'm not familiar with Xapian and just followed the precedent of
+ attachment.
+
+ Jani> BR,
+ Jani> Jani.
+
+ - Todd
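
For readers following the thread, here is a rough sketch of the "all mime-parts, but not the message itself" behaviour discussed above. It is not the code from lib/index.cc, which is not quoted in this reply. It uses Python's standard email package rather than notmuch's own parser, and the names content_types and SAMPLE are made up for illustration. Skipping the top-level part is only one reading of "does not index the Content-Type of the message itself".

```python
from email import message_from_string
from typing import List


def content_types(raw_message: str) -> List[str]:
    """Collect the Content-Type of every MIME part below the top level."""
    msg = message_from_string(raw_message)
    types = []
    for part in msg.walk():
        # walk() yields the message itself first; skip it (assumption:
        # this mirrors "does not index the Content-Type of the message").
        if part is msg:
            continue
        types.append(part.get_content_type())
    return types


# Hypothetical two-part message: a text body plus a calendar invitation.
SAMPLE = "\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="XYZ"',
    "",
    "--XYZ",
    "Content-Type: text/plain",
    "",
    "See the attached invitation.",
    "--XYZ",
    "Content-Type: text/calendar",
    "",
    "BEGIN:VCALENDAR",
    "END:VCALENDAR",
    "--XYZ--",
    "",
])

if __name__ == "__main__":
    print(content_types(SAMPLE))
```

A message like SAMPLE is the kind that a query such as contenttype:text/calendar is meant to find. The top-level multipart/mixed type is deliberately left out of the result.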