Re: [PATCH] Index Content-Type of attachments with a contenttype prefix
author Todd <todd@electricoding.com>
Sat, 10 Jan 2015 14:38:09 +0000 (08:38 -0600)
committer W. Trevor King <wking@tremily.us>
Sat, 20 Aug 2016 21:47:20 +0000 (14:47 -0700)
03/d519bb4fe678bfd504a89bb510a7a131ff3252 [new file with mode: 0644]

diff --git a/03/d519bb4fe678bfd504a89bb510a7a131ff3252 b/03/d519bb4fe678bfd504a89bb510a7a131ff3252
new file mode 100644 (file)
index 0000000..164246e
--- /dev/null
@@ -0,0 +1,205 @@
+Return-Path: <todd@electricoding.com>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 41068429E52\r
+       for <notmuch@notmuchmail.org>; Sat, 10 Jan 2015 06:38:37 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 2.438\r
+X-Spam-Level: **\r
+X-Spam-Status: No, score=2.438 tagged_above=-999 required=5\r
+       tests=[DNS_FROM_AHBL_RHSBL=2.438] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id URu+DJ9Q8apK for <notmuch@notmuchmail.org>;\r
+       Sat, 10 Jan 2015 06:38:33 -0800 (PST)\r
+Received: from s75.web-hosting.com (s75.web-hosting.com [198.187.31.9])\r
+       (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits))\r
+       (No client certificate requested)\r
+       by olra.theworths.org (Postfix) with ESMTPS id 958D6429E54\r
+       for <notmuch@notmuchmail.org>; Sat, 10 Jan 2015 06:38:33 -0800 (PST)\r
+Received: from user-69-73-37-128.knology.net ([69.73.37.128]:33524\r
+ helo=tz-lab)  by server75.web-hosting.com with esmtpsa\r
+       (UNKNOWN:DHE-RSA-AES128-SHA:128) (Exim 4.82)    (envelope-from\r
+ <todd@electricoding.com>)     id 1Y9xBQ-0046Hr-I9; Sat, 10 Jan 2015 09:38:32\r
+ -0500\r
+From: Todd <todd@electricoding.com>\r
+To: Jani Nikula <jani@nikula.org>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH] Index Content-Type of attachments with a contenttype\r
+       prefix\r
+In-Reply-To: <8761ce7s16.fsf@nikula.org>\r
+References: <1420849787-4401-1-git-send-email-todd@electricoding.com>\r
+       <8761ce7s16.fsf@nikula.org>\r
+User-Agent: Notmuch/0.19+17~gd8b219d (http://notmuchmail.org) Emacs/24.4.1\r
+       (x86_64-unknown-linux-gnu)\r
+Date: Sat, 10 Jan 2015 08:38:09 -0600\r
+Message-ID: <87fvbi8zvy.fsf@electricoding.com>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain\r
+X-AntiAbuse: This header was added to track abuse,\r
+       please include it with any abuse report\r
+X-AntiAbuse: Primary Hostname - server75.web-hosting.com\r
+X-AntiAbuse: Original Domain - notmuchmail.org\r
+X-AntiAbuse: Originator/Caller UID/GID - [47 12] / [47 12]\r
+X-AntiAbuse: Sender Address Domain - electricoding.com\r
+X-Get-Message-Sender-Via: server75.web-hosting.com: authenticated_id:\r
+       todd@electricoding.com\r
+X-Source: \r
+X-Source-Args: \r
+X-Source-Dir: \r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sat, 10 Jan 2015 14:38:37 -0000\r
+\r
+\r
+>>>>> "Jani" == Jani Nikula <jani@nikula.org> writes:\r
+\r
+    Jani> On Sat, 10 Jan 2015, Todd <todd@electricoding.com> wrote:\r
+    >> I wanted to tag messages with calendar invitations, but couldn't as\r
+    >> the information wasn't indexed.\r
+    >> \r
+    >> This patch allows for queries like:\r
+    >> \r
+    >> Find calendar invites\r
+    >> - contenttype:text/calendar or contenttype:application/ics\r
+    >> \r
+    >> Find any image attachments\r
+    >> - contenttype:image\r
+    >> \r
+    >> Find all patches\r
+    >> - contenttype:text/x-patch\r
+    >> \r
+    >> \r
+    >> - Todd\r
+    >> \r
+    >> ---\r
+    >> NEWS                               |  6 ++++++\r
+    >> completion/notmuch-completion.bash |  2 +-\r
+    >> doc/man7/notmuch-search-terms.rst  |  6 ++++++\r
+    >> emacs/notmuch.el                   |  2 +-\r
+    >> lib/database.cc                    |  1 +\r
+    >> lib/index.cc                       |  5 +++++\r
+    >> test/T190-multipart.sh             | 32 ++++++++++++++++++++++++++++++++\r
+\r
+    Jani> IMO these could be split into several patches.\r
+\r
+    No problem, I'll split them up the next time I post.\r
+\r
+    >> 7 files changed, 52 insertions(+), 2 deletions(-)\r
+    >> \r
+    >> diff --git a/NEWS b/NEWS\r
+    >> index 44e8d05..5f4622c 100644\r
+    >> --- a/NEWS\r
+    >> +++ b/NEWS\r
+    >> @@ -15,6 +15,12 @@ keyboard shortcuts to saved searches.\r
+    >> Command-Line Interface\r
+    >> ----------------------\r
+    >> \r
+    >> +There is a new `contenttype:` search prefix\r
+    >> +\r
+    >> +  The new `contenttype:` search prefix allows searching for the\r
+    >> +  content-type of attachments, which is now indexed by `notmuch\r
+    >> +  insert`. See the `notmuch-search-terms` manual page for details.\r
+    >> +\r
+\r
+    Jani> Admittedly I did not have the time to dig into details, but I think\r
+    Jani> "attachment" is misleading, as it's really all mime parts, right?\r
+\r
+    Jani> Will this also index the Content-Type: header of the message itself,\r
+    Jani> regardless of whether it has mime structure or not? Maybe it\r
+    Jani> should?\r
+\r
+    Yes, all mime-parts. It does not index the Content-Type of the\r
+    message itself.  That probably wouldn't be difficult to add if it's\r
+    a desired feature, but if there are plans for indexing other message\r
+    headers it may fit better there.\r
+\r
+    I also wasn't too happy with a "contenttype" keyword and debated\r
+    just indexing the information under "attachment" along with the\r
+    filename.\r
+\r
+    >> Stopped `notmuch dump` failing if someone writes to the database\r
+    >> \r
+    >> The dump command now takes the write lock when running. This\r
+    >> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash\r
+    >> index d58dc8b..05b5969 100644\r
+    >> --- a/completion/notmuch-completion.bash\r
+    >> +++ b/completion/notmuch-completion.bash\r
+    >> @@ -61,7 +61,7 @@ _notmuch_search_terms()\r
+    >> sed "s|^$path/||" | grep -v "\(^\|/\)\(cur\|new\|tmp\)$" ) )\r
+    >> ;;\r
+    >> *)\r
+    >> -           local search_terms="from: to: subject: attachment: tag: id: thread: folder: path: date:"\r
+    >> +           local search_terms="from: to: subject: attachment: contenttype: tag: id: thread: folder: path: date:"\r
+    >> compopt -o nospace\r
+    >> COMPREPLY=( $(compgen -W "${search_terms}" -- ${cur}) )\r
+    >> ;;\r
+    >> diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst\r
+    >> index 1acdaa0..d126ce6 100644\r
+    >> --- a/doc/man7/notmuch-search-terms.rst\r
+    >> +++ b/doc/man7/notmuch-search-terms.rst\r
+    >> @@ -40,6 +40,8 @@ indicate user-supplied values):\r
+    >> \r
+    >> -  attachment:<word>\r
+    >> \r
+    >> +-  contenttype:<word>\r
+    >> +\r
+    >> -  tag:<tag> (or is:<tag>)\r
+    >> \r
+    >> -  id:<message-id>\r
+    >> @@ -66,6 +68,10 @@ by including quotation marks around the phrase, immediately following\r
+    >> The **attachment:** prefix can be used to search for specific filenames\r
+    >> (or extensions) of attachments to email messages.\r
+    >> \r
+    >> +The **contenttype:** prefix can be used to search for specific\r
+    >> +content-types of attachments to email messages (as specified by the\r
+    >> +sender).\r
+    >> +\r
+    >> For **tag:** and **is:** valid tag values include **inbox** and\r
+    >> **unread** by default for new messages added by **notmuch new** as well\r
+    >> as any other tag values added manually with **notmuch tag**.\r
+    >> diff --git a/emacs/notmuch.el b/emacs/notmuch.el\r
+    >> index 218486a..702700c 100644\r
+    >> --- a/emacs/notmuch.el\r
+    >> +++ b/emacs/notmuch.el\r
+    >> @@ -858,7 +858,7 @@ PROMPT is the string to prompt with."\r
+    >> (lexical-let\r
+    >> ((completions\r
+    >> (append (list "folder:" "path:" "thread:" "id:" "date:" "from:" "to:"\r
+    >> -                     "subject:" "attachment:")\r
+    >> +                     "subject:" "attachment:" "contenttype:")\r
+    >> (mapcar (lambda (tag)\r
+    >> (concat "tag:" (notmuch-escape-boolean-term tag)))\r
+    >> (process-lines notmuch-command "search" "--output=tags" "*")))))\r
+    >> diff --git a/lib/database.cc b/lib/database.cc\r
+    >> index 3601f9d..a7a64c9 100644\r
+    >> --- a/lib/database.cc\r
+    >> +++ b/lib/database.cc\r
+    >> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {\r
+    >> { "from",                       "XFROM" },\r
+    >> { "to",                 "XTO" },\r
+    >> { "attachment",         "XATTACHMENT" },\r
+    >> +    { "contenttype",           "XCONTENTTYPE"},\r
+    >> { "subject",            "XSUBJECT"},\r
+\r
+    Jani> Is the use of probabilistic prefix intentional? I think it's probably\r
+    Jani> the right thing to do, but just checking.\r
+\r
+    I'm not familiar with Xapian and just followed the precedent of\r
+    attachment.\r
+\r
+    Jani> BR,\r
+    Jani> Jani.\r
+\r
+    - Todd\r