Re: [PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients}

author Michal Sojka <sojkam1@fel.cvut.cz>

Mon, 13 Oct 2014 21:38:30 +0000 (23:38 +0200)

committer W. Trevor King <wking@tremily.us>

Fri, 7 Nov 2014 18:05:15 +0000 (10:05 -0800)
author Michal Sojka <sojkam1@fel.cvut.cz>
Mon, 13 Oct 2014 21:38:30 +0000 (23:38 +0200)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 18:05:15 +0000 (10:05 -0800)
diff --git a/3e/8dd3409f580a8e0286a2eef0c2526ac063eb35 b/3e/8dd3409f580a8e0286a2eef0c2526ac063eb35

new file mode 100644 (file)

index 0000000..7715f01
--- /dev/null
+++ b/3e/8dd3409f580a8e0286a2eef0c2526ac063eb35
@@ -0,0 +1,325 @@
+Return-Path: <sojkam1@fel.cvut.cz>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 1C8E1431FAF\r
+       for <notmuch@notmuchmail.org>; Mon, 13 Oct 2014 14:38:44 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -2.3\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-2.3 tagged_above=-999 required=5\r
+       tests=[RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id d50R1f3cgZVk for <notmuch@notmuchmail.org>;\r
+       Mon, 13 Oct 2014 14:38:40 -0700 (PDT)\r
+Received: from max.feld.cvut.cz (max.feld.cvut.cz [147.32.192.36])\r
+       by olra.theworths.org (Postfix) with ESMTP id 80EB9431FAE\r
+       for <notmuch@notmuchmail.org>; Mon, 13 Oct 2014 14:38:40 -0700 (PDT)\r
+Received: from localhost (unknown [192.168.200.7])\r
+       by max.feld.cvut.cz (Postfix) with ESMTP id 4D9475CCF4A;\r
+       Mon, 13 Oct 2014 23:38:35 +0200 (CEST)\r
+X-Virus-Scanned: IMAP STYX AMAVIS\r
+Received: from max.feld.cvut.cz ([192.168.200.1])\r
+       by localhost (styx.feld.cvut.cz [192.168.200.7]) (amavisd-new,\r
+       port 10044)\r
+       with ESMTP id bY1nH3zVLXiJ; Mon, 13 Oct 2014 23:38:31 +0200 (CEST)\r
+Received: from imap.feld.cvut.cz (imap.feld.cvut.cz [147.32.192.34])\r
+       by max.feld.cvut.cz (Postfix) with ESMTP id 2F68B5CCEAE;\r
+       Mon, 13 Oct 2014 23:38:31 +0200 (CEST)\r
+Received: from wsh by steelpick.2x.cz with local (Exim 4.84)\r
+       (envelope-from <sojkam1@fel.cvut.cz>)\r
+       id 1XdnK2-0005jf-7H; Mon, 13 Oct 2014 23:38:30 +0200\r
+From: Michal Sojka <sojkam1@fel.cvut.cz>\r
+To: Tomi Ollila <tomi.ollila@iki.fi>, notmuch@notmuchmail.org\r
+Subject: Re: [PATCH v3 3/4] cli: Extend the search command for\r
+       --output={sender, recipients}\r
+In-Reply-To: <m238arrdzp.fsf@guru.guru-group.fi>\r
+References: <87zjd51phx.fsf@steelpick.2x.cz>\r
+       <1413150093-8383-1-git-send-email-sojkam1@fel.cvut.cz>\r
+       <1413150093-8383-4-git-send-email-sojkam1@fel.cvut.cz>\r
+       <m238arrdzp.fsf@guru.guru-group.fi>\r
+User-Agent: Notmuch/0.18.1+101~g56b0ff0 (http://notmuchmail.org) Emacs/24.3.1\r
+       (x86_64-pc-linux-gnu)\r
+Date: Mon, 13 Oct 2014 23:38:30 +0200\r
+Message-ID: <871tqbejl5.fsf@steelpick.2x.cz>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Mon, 13 Oct 2014 21:38:44 -0000\r
+\r
+On Mon, Oct 13 2014, Tomi Ollila wrote:\r
+> On Mon, Oct 13 2014, Michal Sojka <sojkam1@fel.cvut.cz> wrote:\r
+>\r
+>> The new outputs allow printing senders, recipients or both of matching\r
+>> messages. The --output option is converted from "keyword" argument to\r
+>> "flags" argument, which means that the user can use --output=sender and\r
+>> --output=recipients simultaneously, to print both. Other combinations\r
+>> produce an error.\r
+>>\r
+>> This code based on a patch from Jani Nikula.\r
+>> ---\r
+>>  completion/notmuch-completion.bash |   2 +-\r
+>>  completion/notmuch-completion.zsh  |   3 +-\r
+>>  doc/man1/notmuch-search.rst        |  22 +++++++-\r
+>>  notmuch-search.c                   | 110 ++++++++++++++++++++++++++++++++++---\r
+>>  test/T090-search-output.sh         |  64 +++++++++++++++++++++\r
+>>  5 files changed, 189 insertions(+), 12 deletions(-)\r
+>>\r
+>> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash\r
+>> index 0571dc9..cfbd389 100644\r
+>> --- a/completion/notmuch-completion.bash\r
+>> +++ b/completion/notmuch-completion.bash\r
+>> @@ -294,7 +294,7 @@ _notmuch_search()\r
+>>         return\r
+>>         ;;\r
+>>     --output)\r
+>> -       COMPREPLY=( $( compgen -W "summary threads messages files tags" -- "${cur}" ) )\r
+>> +       COMPREPLY=( $( compgen -W "summary threads messages files tags sender recipients" -- "${cur}" ) )\r
+>>         return\r
+>>         ;;\r
+>>     --sort)\r
+>> diff --git a/completion/notmuch-completion.zsh b/completion/notmuch-completion.zsh\r
+>> index 67a9aba..3e52a00 100644\r
+>> --- a/completion/notmuch-completion.zsh\r
+>> +++ b/completion/notmuch-completion.zsh\r
+>> @@ -52,7 +52,8 @@ _notmuch_search()\r
+>>    _arguments -s : \\r
+>>      '--max-threads=[display only the first x threads from the search results]:number of threads to show: ' \\r
+>>      '--first=[omit the first x threads from the search results]:number of threads to omit: ' \\r
+>> -    '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))'\r
+>> +    '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' \\r
+>> +    '--output=[select what to output]:output:((summary threads messages files tags sender recipients))'\r
+>>  }\r
+>>\r
+>>  _notmuch()\r
+>> diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst\r
+>> index 90160f2..c9d38b1 100644\r
+>> --- a/doc/man1/notmuch-search.rst\r
+>> +++ b/doc/man1/notmuch-search.rst\r
+>> @@ -35,7 +35,7 @@ Supported options for **search** include\r
+>>          intended for programs that invoke **notmuch(1)** internally. If\r
+>>          omitted, the latest supported version will be used.\r
+>>\r
+>> -    ``--output=(summary|threads|messages|files|tags)``\r
+>> +    ``--output=(summary|threads|messages|files|tags|sender|recipients)``\r
+>>\r
+>>          **summary**\r
+>>              Output a summary of each thread with any message matching\r
+>> @@ -78,6 +78,26 @@ Supported options for **search** include\r
+>>              by null characters (--format=text0), as a JSON array\r
+>>              (--format=json), or as an S-Expression list (--format=sexp).\r
+>>\r
+>> +   **sender**\r
+>> +            Output all addresses from the *From* header that appear on\r
+>> +            any message matching the search terms, either one per line\r
+>> +            (--format=text), separated by null characters\r
+>> +            (--format=text0), as a JSON array (--format=json), or as\r
+>> +            an S-Expression list (--format=sexp).\r
+>> +\r
+>> +       Note: Searching for **sender** should be much faster than\r
+>> +       searching for **recipients**, because sender addresses are\r
+>> +       cached directly in the database whereas other addresses\r
+>> +       need to be fetched from message files.\r
+>> +\r
+>> +   **recipients**\r
+>> +            Like **sender** but for addresses from *To*, *Cc* and\r
+>> +       *Bcc* headers.\r
+>> +\r
+>> +   This option can be given multiple times to combine different\r
+>> +   outputs. Curently, this is only supported for **sender** and\r
+>> +   **recipients** outputs.\r
+>> +\r
+>>      ``--sort=``\ (**newest-first**\ \|\ **oldest-first**)\r
+>>          This option can be used to present results in either\r
+>>          chronological order (**oldest-first**) or reverse chronological\r
+>> diff --git a/notmuch-search.c b/notmuch-search.c\r
+>> index 5ac2a26..74588f8 100644\r
+>> --- a/notmuch-search.c\r
+>> +++ b/notmuch-search.c\r
+>> @@ -23,11 +23,14 @@\r
+>>  #include "string-util.h"\r
+>>\r
+>>  typedef enum {\r
+>> -    OUTPUT_SUMMARY,\r
+>> -    OUTPUT_THREADS,\r
+>> -    OUTPUT_MESSAGES,\r
+>> -    OUTPUT_FILES,\r
+>> -    OUTPUT_TAGS\r
+>> +    OUTPUT_SUMMARY = 1 << 0,\r
+>> +    OUTPUT_THREADS = 1 << 1,\r
+>> +    OUTPUT_MESSAGES        = 1 << 2,\r
+>> +    OUTPUT_FILES   = 1 << 3,\r
+>> +    OUTPUT_TAGS            = 1 << 4,\r
+>> +    OUTPUT_SENDER  = 1 << 5,\r
+>> +    OUTPUT_RECIPIENTS      = 1 << 6,\r
+>> +    OUTPUT_ADDRESSES       = OUTPUT_SENDER | OUTPUT_RECIPIENTS,\r
+>\r
+> leftover, like mentioned below (this comment added just before\r
+> sending)\r
+\r
+This is needed to suppress a warning about unknown enum value in the\r
+switch statement.\r
+\r
+>>  } output_t;\r
+>>\r
+>>  typedef struct {\r
+>> @@ -220,6 +223,67 @@ do_search_threads (search_options_t *o)\r
+>>      return 0;\r
+>>  }\r
+>>\r
+>> +static void\r
+>> +print_address_list (const search_options_t *o, InternetAddressList *list)\r
+>> +{\r
+>> +    InternetAddress *address;\r
+>> +    int i;\r
+>> +\r
+>> +    for (i = 0; i < internet_address_list_length (list); i++) {\r
+>> +   address = internet_address_list_get_address (list, i);\r
+>> +   if (INTERNET_ADDRESS_IS_GROUP (address)) {\r
+>> +       InternetAddressGroup *group;\r
+>> +       InternetAddressList *group_list;\r
+>> +\r
+>> +       group = INTERNET_ADDRESS_GROUP (address);\r
+>> +       group_list = internet_address_group_get_members (group);\r
+>> +       if (group_list == NULL)\r
+>> +           continue;\r
+>> +\r
+>> +       print_address_list (o, group_list);\r
+>> +   } else {\r
+>> +       InternetAddressMailbox *mailbox;\r
+>> +       const char *name;\r
+>> +       const char *addr;\r
+>> +       char *full_address;\r
+>> +\r
+>> +       mailbox = INTERNET_ADDRESS_MAILBOX (address);\r
+>> +\r
+>> +       name = internet_address_get_name (address);\r
+>> +       addr = internet_address_mailbox_get_addr (mailbox);\r
+>> +\r
+>> +       if (name && *name)\r
+>> +           full_address = talloc_asprintf (o->format, "%s <%s>", name, addr);\r
+>> +       else\r
+>> +           full_address = talloc_strdup (o->format, addr);\r
+>> +\r
+>> +       if (!full_address) {\r
+>\r
+> Apart from minor style issue like space after ! and the leftover ADDRESSES\r
+> parts (w/o that I would not have commented about !<SPC>) the first 3\r
+> patches look pretty good to me. I have not tested those yet.\r
+\r
+If you want, I can send another version.\r
+\r
+> But we keep to have some disagreement w/ unique/duplicate/filter-by\r
+> handling ;/\r
+>\r
+> I (currently) rest the case of first/last/most common handling to just\r
+> how the --sort=(newest-first|oldest-first) affects the order...\r
+>\r
+> Let's consider the following list of output if no /deduplication/ is done:\r
+>\r
+>   John Doe <john@example.com>\r
+>   Dr. John Doe <john@example.com>\r
+>   John Doe <JOHN@EXAMPLE.COM>\r
+>   John Doe <john@doe.name>\r
+>   Dr. John Doe <john@doe.name>\r
+>   John Doe <JOHN@doe.name.example.com>\r
+>   John Doe <john@doe.name>\r
+>   Dr. John Doe <john@example.com>\r
+>   Dr. John Doe <john@example.com>\r
+>   Dr. John Doe <john@doe.name>\r
+>   John Doe <john@example.com>\r
+>   John Doe <john@doe.name.example.com>\r
+>\r
+> To stir the pool a little more, this could be the output when\r
+> --duplicate=all (the default) is given.\r
+\r
+This seems counter-intuitive. A command without --duplicate gives you a\r
+certain number of addresses and when you add --duplicate, you get less\r
+addresses?\r
+\r
+Moreover, --duplicate is already used with --output=files and has\r
+different meaning.\r
+\r
+> With --duplicate=none the output could be (first match by unique\r
+> case-insensitive address):\r
+>\r
+>   John Doe <john@example.com>\r
+>   John Doe <john@doe.name>\r
+>   John Doe <john@doe.name.example.com>\r
+\r
+This seems confusing to me. If I see "duplicate none", I would simply\r
+expect that duplicate lines are removed from the --duplicate=all list\r
+above. Not that just duplicate addresses would be removed.\r
+\r
+> (many people may have the same name, but email address is unique per person\r
+> -- therefore I think 'none' limiting that just to John Doe <john@example.com>\r
+> would be too little)\r
+\r
+Here I agree that filtering by address part is perhaps most useful.\r
+\r
+> and with --duplicate=address\r
+>\r
+>   John Doe <john@example.com>\r
+>   Dr. John Doe <john@example.com>\r
+>   John Doe <john@doe.name>\r
+>   Dr. John Doe <john@doe.name>\r
+>   John Doe <JOHN@doe.name.example.com>\r
+\r
+I don't get this. You say --duplicate=address but the output contains\r
+also duplicate names.\r
+\r
+> (from this output user can choose how the recipient is to be called\r
+> (like "pseudonyms" mentioned in id:20141010113202.GE28601@TP_L520.localdomain )\r
+> when sending email)\r
+>\r
+> and --duplicate=address-casesensitive\r
+>\r
+>   John Doe <john@example.com>\r
+>   Dr. John Doe <john@example.com>\r
+>   John Doe <JOHN@EXAMPLE.COM>\r
+>   John Doe <john@doe.name>\r
+>   Dr. John Doe <john@doe.name>\r
+>   John Doe <JOHN@doe.name.example.com>\r
+>   John Doe <john@doe.name.example.com>\r
+>\r
+> This reuse of --duplicate was thought out after Jani's irc mention of it.\r
+> This scheme would leave no room tho the filter-by=name suggestion -- for\r
+> completeness that would make this look:\r
+>\r
+>   John Doe <john@example.com>\r
+>   Dr. John Doe <john@example.com>\r
+>\r
+> This doesn't look too bad in this particular case but not having ability to\r
+> see all potential addresses (perhaps the only working address is now\r
+> hidden) isn't not much for general use. Well, maybe for some specific use\r
+> --duplicate=no-unique-addresses could be useful :O\r
+>\r
+> Ok, this took an hour to get written to (w/ some interruptions). Healthy\r
+> criticism appreciated :D\r
+\r
+I don't know. You seem to think about this in the opposite way than how\r
+it is implemented. The implementation really filters things out whereas\r
+you specify what not to filter.\r
+\r
+My feeling is that if you would implement your proposal, the code would\r
+be more complex than in my patch, because the mapping between command\r
+line options and the actual algorithm would require some extra code. And\r
+in a previous comment, you preferred simplicity.\r
+\r
+Hopefully, you consider the above as "healthy" criticism.\r
+\r
+Thanks for quick review.\r
+-Michal\r
author	Michal Sojka <sojkam1@fel.cvut.cz>
	Mon, 13 Oct 2014 21:38:30 +0000 (23:38 +0200)
committer	W. Trevor King <wking@tremily.us>
	Fri, 7 Nov 2014 18:05:15 +0000 (10:05 -0800)