Re: [PATCH] Index Content-Type of attachments with a contenttype prefix
author Jani Nikula <jani@nikula.org>
Sat, 10 Jan 2015 12:13:09 +0000 (14:13 +0200)
committer W. Trevor King <wking@tremily.us>
Sat, 20 Aug 2016 21:47:20 +0000 (14:47 -0700)
d4/770ea703ce09f95d91a9991d50fec3b5978309 [new file with mode: 0644]

diff --git a/d4/770ea703ce09f95d91a9991d50fec3b5978309 b/d4/770ea703ce09f95d91a9991d50fec3b5978309
new file mode 100644
index 0000000..3133a32
--- /dev/null
@@ -0,0 +1,264 @@
+Return-Path: <jani@nikula.org>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id E0887431E64\r
+       for <notmuch@notmuchmail.org>; Sat, 10 Jan 2015 04:13:04 -0800 (PST)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: 1.738\r
+X-Spam-Level: *\r
+X-Spam-Status: No, score=1.738 tagged_above=-999 required=5\r
+       tests=[DNS_FROM_AHBL_RHSBL=2.438, RCVD_IN_DNSWL_LOW=-0.7]\r
+       autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id YZd839BIRTqo for <notmuch@notmuchmail.org>;\r
+       Sat, 10 Jan 2015 04:13:01 -0800 (PST)\r
+Received: from mail-wg0-f51.google.com (mail-wg0-f51.google.com\r
+ [74.125.82.51])       (using TLSv1 with cipher RC4-SHA (128/128 bits))        (No client\r
+ certificate requested)        by olra.theworths.org (Postfix) with ESMTPS id\r
+ 057B7431FAF   for <notmuch@notmuchmail.org>; Sat, 10 Jan 2015 04:13:01 -0800\r
+ (PST)\r
+Received: by mail-wg0-f51.google.com with SMTP id x12so12235571wgg.10\r
+       for <notmuch@notmuchmail.org>; Sat, 10 Jan 2015 04:12:59 -0800 (PST)\r
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\r
+       d=1e100.net; s=20130820;\r
+       h=x-gm-message-state:from:to:subject:in-reply-to:references\r
+       :user-agent:date:message-id:mime-version:content-type;\r
+       bh=qyewTrDxRgBL0ElwA74BVsu5aODFBpuk0lSWb5qt/+Q=;\r
+       b=GUtG2CKWj/obDHbhC7hwi3+GjjeMmqCCVYx8lNwnUE0YYR7T2IIB1EJb5bBxF6Vi74\r
+       hqQAXbgkF5waLHUOnjugs9K95S2zXcGXAsYPG/wKijbG+QbOTTbSGt8AXVqIWbm15F8v\r
+       Z1dg9WQAO1PutC3/5hXVfLVZiBOibeHFQWKTRzyB23DeUOJ9n+/dzJZhP/uUm8yHeuWs\r
+       eYrJ6GIk1RcJKfwKXGwlaBNbJUnNU/F3MWIEN1TNaLb4dWv/8nL4QjLn1H3gR4ZyPegE\r
+       0oFfp2J4JCn3QuYRUV1FvafsH42v6T0j50hwUGH5g6/zMVxl5JZV6OvYP/lIpxduaOc1\r
+       A9uQ==\r
+X-Gm-Message-State:\r
+ ALoCoQmGwSVpDeUQd2tPiVlc8KhPvTMpw/JX7x1T5Gd4xZb3+dd7KztrIg+ShvJsQ2rSOSgxa1r4\r
+X-Received: by 10.180.21.225 with SMTP id y1mr13417812wie.42.1420891979749;\r
+       Sat, 10 Jan 2015 04:12:59 -0800 (PST)\r
+Received: from localhost (mobile-internet-bcee14-89.dhcp.inet.fi.\r
+       [188.238.20.89])\r
+       by mx.google.com with ESMTPSA id r3sm2150348wic.10.2015.01.10.04.12.58\r
+       (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);\r
+       Sat, 10 Jan 2015 04:12:59 -0800 (PST)\r
+From: Jani Nikula <jani@nikula.org>\r
+To: Todd <todd@electricoding.com>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH] Index Content-Type of attachments with a contenttype\r
+       prefix\r
+In-Reply-To: <1420849787-4401-1-git-send-email-todd@electricoding.com>\r
+References: <1420849787-4401-1-git-send-email-todd@electricoding.com>\r
+User-Agent: Notmuch/0.19+6~gf2e3d2c (http://notmuchmail.org) Emacs/24.4.1\r
+       (x86_64-pc-linux-gnu)\r
+Date: Sat, 10 Jan 2015 14:13:09 +0200\r
+Message-ID: <8761ce7s16.fsf@nikula.org>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sat, 10 Jan 2015 12:13:05 -0000\r
+\r
+On Sat, 10 Jan 2015, Todd <todd@electricoding.com> wrote:\r
+> I wanted to tag messages with calendar invitations, but couldn't as\r
+> the information wasn't indexed.\r
+>\r
+> This patch allows for queries like:\r
+>\r
+> Find calendar invites\r
+> - contenttype:text/calendar or contenttype:application/ics\r
+>\r
+> Find any image attachments\r
+> - contenttype:image\r
+>\r
+> Find all patches\r
+> - contenttype:text/x-patch\r
+>\r
+>\r
+> - Todd\r
+>\r
+> ---\r
+>  NEWS                               |  6 ++++++\r
+>  completion/notmuch-completion.bash |  2 +-\r
+>  doc/man7/notmuch-search-terms.rst  |  6 ++++++\r
+>  emacs/notmuch.el                   |  2 +-\r
+>  lib/database.cc                    |  1 +\r
+>  lib/index.cc                       |  5 +++++\r
+>  test/T190-multipart.sh             | 32 ++++++++++++++++++++++++++++++++\r
+\r
+IMO these could be split into several patches.\r
+\r
+>  7 files changed, 52 insertions(+), 2 deletions(-)\r
+>\r
+> diff --git a/NEWS b/NEWS\r
+> index 44e8d05..5f4622c 100644\r
+> --- a/NEWS\r
+> +++ b/NEWS\r
+> @@ -15,6 +15,12 @@ keyboard shortcuts to saved searches.\r
+>  Command-Line Interface\r
+>  ----------------------\r
+>\r
+> +There is a new `contenttype:` search prefix\r
+> +\r
+> +  The new `contenttype:` search prefix allows searching for the\r
+> +  content-type of attachments, which is now indexed by `notmuch\r
+> +  insert`. See the `notmuch-search-terms` manual page for details.\r
+> +\r
+\r
+Admittedly I did not have the time to dig into details, but I think\r
+"attachment" is misleading, as it's really all mime parts, right?\r
+\r
+Will this also index the Content-Type: header of the message itself,\r
+regardless of whether it has mime structure or not? Maybe it should?\r
+\r
+>  Stopped `notmuch dump` failing if someone writes to the database\r
+>\r
+>    The dump command now takes the write lock when running. This\r
+> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash\r
+> index d58dc8b..05b5969 100644\r
+> --- a/completion/notmuch-completion.bash\r
+> +++ b/completion/notmuch-completion.bash\r
+> @@ -61,7 +61,7 @@ _notmuch_search_terms()\r
+>              sed "s|^$path/||" | grep -v "\(^\|/\)\(cur\|new\|tmp\)$" ) )\r
+>          ;;\r
+>      *)\r
+> -        local search_terms="from: to: subject: attachment: tag: id: thread: folder: path: date:"\r
+> +        local search_terms="from: to: subject: attachment: contenttype: tag: id: thread: folder: path: date:"\r
+>          compopt -o nospace\r
+>          COMPREPLY=( $(compgen -W "${search_terms}" -- ${cur}) )\r
+>          ;;\r
+> diff --git a/doc/man7/notmuch-search-terms.rst b/doc/man7/notmuch-search-terms.rst\r
+> index 1acdaa0..d126ce6 100644\r
+> --- a/doc/man7/notmuch-search-terms.rst\r
+> +++ b/doc/man7/notmuch-search-terms.rst\r
+> @@ -40,6 +40,8 @@ indicate user-supplied values):\r
+>\r
+>  -  attachment:<word>\r
+>\r
+> +-  contenttype:<word>\r
+> +\r
+>  -  tag:<tag> (or is:<tag>)\r
+>\r
+>  -  id:<message-id>\r
+> @@ -66,6 +68,10 @@ by including quotation marks around the phrase, immediately following\r
+>  The **attachment:** prefix can be used to search for specific filenames\r
+>  (or extensions) of attachments to email messages.\r
+>\r
+> +The **contenttype:** prefix can be used to search for specific\r
+> +content-types of attachments to email messages (as specified by the\r
+> +sender).\r
+> +\r
+>  For **tag:** and **is:** valid tag values include **inbox** and\r
+>  **unread** by default for new messages added by **notmuch new** as well\r
+>  as any other tag values added manually with **notmuch tag**.\r
+> diff --git a/emacs/notmuch.el b/emacs/notmuch.el\r
+> index 218486a..702700c 100644\r
+> --- a/emacs/notmuch.el\r
+> +++ b/emacs/notmuch.el\r
+> @@ -858,7 +858,7 @@ PROMPT is the string to prompt with."\r
+>    (lexical-let\r
+>        ((completions\r
+>      (append (list "folder:" "path:" "thread:" "id:" "date:" "from:" "to:"\r
+> -                  "subject:" "attachment:")\r
+> +                  "subject:" "attachment:" "contenttype:")\r
+>              (mapcar (lambda (tag)\r
+>                        (concat "tag:" (notmuch-escape-boolean-term tag)))\r
+>                      (process-lines notmuch-command "search" "--output=tags" "*")))))\r
+> diff --git a/lib/database.cc b/lib/database.cc\r
+> index 3601f9d..a7a64c9 100644\r
+> --- a/lib/database.cc\r
+> +++ b/lib/database.cc\r
+> @@ -254,6 +254,7 @@ static prefix_t PROBABILISTIC_PREFIX[]= {\r
+>      { "from",                       "XFROM" },\r
+>      { "to",                 "XTO" },\r
+>      { "attachment",         "XATTACHMENT" },\r
+> +    { "contenttype",                "XCONTENTTYPE"},\r
+>      { "subject",            "XSUBJECT"},\r
+\r
+Is the use of probabilistic prefix intentional? I think it's probably\r
+the right thing to do, but just checking.\r
+\r
+BR,\r
+Jani.\r
+\r
+>  };\r
+>\r
+> diff --git a/lib/index.cc b/lib/index.cc\r
+> index 1a2e63d..c3f7c6b 100644\r
+> --- a/lib/index.cc\r
+> +++ b/lib/index.cc\r
+> @@ -346,6 +346,11 @@ _index_mime_part (notmuch_message_t *message,\r
+>      return;\r
+>      }\r
+>\r
+> +    GMimeContentType*  content_type = g_mime_object_get_content_type(part);\r
+> +    if (content_type) {\r
+> +    _notmuch_message_gen_terms (message, "contenttype", g_mime_content_type_to_string(content_type));\r
+> +    }\r
+> +\r
+>      if (GMIME_IS_MESSAGE_PART (part)) {\r
+>      GMimeMessage *mime_message;\r
+>\r
+> diff --git a/test/T190-multipart.sh b/test/T190-multipart.sh\r
+> index 85cbf67..e3270a7 100755\r
+> --- a/test/T190-multipart.sh\r
+> +++ b/test/T190-multipart.sh\r
+> @@ -104,6 +104,30 @@ Content-Transfer-Encoding: base64\r
+>  7w0K\r
+>  --==-=-=--\r
+>  EOF\r
+> +\r
+> +cat <<EOF > content_types\r
+> +From: Todd <todd@electricoding.com>\r
+> +To: todd@electricoding.com\r
+> +Subject: odd content types\r
+> +Date: Fri, 05 Jan 2001 15:42:57 +0000\r
+> +User-Agent: Notmuch/0.5 (http://notmuchmail.org) Emacs/23.3.1 (i486-pc-linux-gnu)\r
+> +Message-ID: <87liy5ap01.fsf@yoom.home.cworth.org>\r
+> +MIME-Version: 1.0\r
+> +Content-Type: multipart/alternative; boundary="==-=-=="\r
+> +\r
+> +--==-=-==\r
+> +Content-Type: application/unique_identifier\r
+> +\r
+> +<p>This is an embedded message, with a multipart/alternative part.</p>\r
+> +\r
+> +--==-=-==\r
+> +Content-Type: text/some_other_identifier\r
+> +\r
+> +This is an embedded message, with a multipart/alternative part.\r
+> +\r
+> +--==-=-==--\r
+> +EOF\r
+> +cat content_types >> ${MAIL_DIR}/odd_content_type\r
+>  notmuch new > /dev/null\r
+>\r
+>  test_begin_subtest "--format=text --part=0, full message"\r
+> @@ -727,4 +751,12 @@ test_begin_subtest "html parts included"\r
+>  notmuch show --format=json --include-html id:htmlmessage > OUTPUT\r
+>  test_expect_equal_json "$(cat OUTPUT)" "$(cat EXPECTED.withhtml)"\r
+>\r
+> +test_begin_subtest "indexes content-type"\r
+> +output=$(notmuch search contenttype:application/unique_identifier | notmuch_search_sanitize)\r
+> +test_expect_equal "$output" "thread:XXX   2001-01-05 [1/1] Todd; odd content types (inbox unread)"\r
+> +\r
+> +output=$(notmuch search contenttype:text/some_other_identifier | notmuch_search_sanitize)\r
+> +test_expect_equal "$output" "thread:XXX   2001-01-05 [1/1] Todd; odd content types (inbox unread)"\r
+> +\r
+> +\r
+>  test_done\r
+> --\r
+> 1.9.1\r
+> _______________________________________________\r
+> notmuch mailing list\r
+> notmuch@notmuchmail.org\r
+> http://notmuchmail.org/mailman/listinfo/notmuch\r