From: Michal Sojka Date: Mon, 13 Oct 2014 21:38:30 +0000 (+0200) Subject: Re: [PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients} X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=c994037e040c4383208aacf487903c2e319dca34;p=notmuch-archives.git Re: [PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients} --- diff --git a/3e/8dd3409f580a8e0286a2eef0c2526ac063eb35 b/3e/8dd3409f580a8e0286a2eef0c2526ac063eb35 new file mode 100644 index 000000000..7715f0186 --- /dev/null +++ b/3e/8dd3409f580a8e0286a2eef0c2526ac063eb35 @@ -0,0 +1,325 @@ +Return-Path: +X-Original-To: notmuch@notmuchmail.org +Delivered-To: notmuch@notmuchmail.org +Received: from localhost (localhost [127.0.0.1]) + by olra.theworths.org (Postfix) with ESMTP id 1C8E1431FAF + for ; Mon, 13 Oct 2014 14:38:44 -0700 (PDT) +X-Virus-Scanned: Debian amavisd-new at olra.theworths.org +X-Spam-Flag: NO +X-Spam-Score: -2.3 +X-Spam-Level: +X-Spam-Status: No, score=-2.3 tagged_above=-999 required=5 + tests=[RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled +Received: from olra.theworths.org ([127.0.0.1]) + by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) + with ESMTP id d50R1f3cgZVk for ; + Mon, 13 Oct 2014 14:38:40 -0700 (PDT) +Received: from max.feld.cvut.cz (max.feld.cvut.cz [147.32.192.36]) + by olra.theworths.org (Postfix) with ESMTP id 80EB9431FAE + for ; Mon, 13 Oct 2014 14:38:40 -0700 (PDT) +Received: from localhost (unknown [192.168.200.7]) + by max.feld.cvut.cz (Postfix) with ESMTP id 4D9475CCF4A; + Mon, 13 Oct 2014 23:38:35 +0200 (CEST) +X-Virus-Scanned: IMAP STYX AMAVIS +Received: from max.feld.cvut.cz ([192.168.200.1]) + by localhost (styx.feld.cvut.cz [192.168.200.7]) (amavisd-new, + port 10044) + with ESMTP id bY1nH3zVLXiJ; Mon, 13 Oct 2014 23:38:31 +0200 (CEST) +Received: from imap.feld.cvut.cz (imap.feld.cvut.cz [147.32.192.34]) + by max.feld.cvut.cz (Postfix) with ESMTP id 2F68B5CCEAE; + Mon, 13 Oct 2014 23:38:31 +0200 (CEST) +Received: from wsh by steelpick.2x.cz with local (Exim 4.84) + (envelope-from ) + id 1XdnK2-0005jf-7H; Mon, 13 Oct 2014 23:38:30 +0200 +From: Michal Sojka +To: Tomi Ollila , notmuch@notmuchmail.org +Subject: Re: [PATCH v3 3/4] cli: Extend the search command for + --output={sender, recipients} +In-Reply-To: +References: <87zjd51phx.fsf@steelpick.2x.cz> + <1413150093-8383-1-git-send-email-sojkam1@fel.cvut.cz> + <1413150093-8383-4-git-send-email-sojkam1@fel.cvut.cz> + +User-Agent: Notmuch/0.18.1+101~g56b0ff0 (http://notmuchmail.org) Emacs/24.3.1 + (x86_64-pc-linux-gnu) +Date: Mon, 13 Oct 2014 23:38:30 +0200 +Message-ID: <871tqbejl5.fsf@steelpick.2x.cz> +MIME-Version: 1.0 +Content-Type: text/plain +X-BeenThere: notmuch@notmuchmail.org +X-Mailman-Version: 2.1.13 +Precedence: list +List-Id: "Use and development of the notmuch mail system." + +List-Unsubscribe: , + +List-Archive: +List-Post: +List-Help: +List-Subscribe: , + +X-List-Received-Date: Mon, 13 Oct 2014 21:38:44 -0000 + +On Mon, Oct 13 2014, Tomi Ollila wrote: +> On Mon, Oct 13 2014, Michal Sojka wrote: +> +>> The new outputs allow printing senders, recipients or both of matching +>> messages. The --output option is converted from "keyword" argument to +>> "flags" argument, which means that the user can use --output=sender and +>> --output=recipients simultaneously, to print both. Other combinations +>> produce an error. +>> +>> This code based on a patch from Jani Nikula. +>> --- +>> completion/notmuch-completion.bash | 2 +- +>> completion/notmuch-completion.zsh | 3 +- +>> doc/man1/notmuch-search.rst | 22 +++++++- +>> notmuch-search.c | 110 ++++++++++++++++++++++++++++++++++--- +>> test/T090-search-output.sh | 64 +++++++++++++++++++++ +>> 5 files changed, 189 insertions(+), 12 deletions(-) +>> +>> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash +>> index 0571dc9..cfbd389 100644 +>> --- a/completion/notmuch-completion.bash +>> +++ b/completion/notmuch-completion.bash +>> @@ -294,7 +294,7 @@ _notmuch_search() +>> return +>> ;; +>> --output) +>> - COMPREPLY=( $( compgen -W "summary threads messages files tags" -- "${cur}" ) ) +>> + COMPREPLY=( $( compgen -W "summary threads messages files tags sender recipients" -- "${cur}" ) ) +>> return +>> ;; +>> --sort) +>> diff --git a/completion/notmuch-completion.zsh b/completion/notmuch-completion.zsh +>> index 67a9aba..3e52a00 100644 +>> --- a/completion/notmuch-completion.zsh +>> +++ b/completion/notmuch-completion.zsh +>> @@ -52,7 +52,8 @@ _notmuch_search() +>> _arguments -s : \ +>> '--max-threads=[display only the first x threads from the search results]:number of threads to show: ' \ +>> '--first=[omit the first x threads from the search results]:number of threads to omit: ' \ +>> - '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' +>> + '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' \ +>> + '--output=[select what to output]:output:((summary threads messages files tags sender recipients))' +>> } +>> +>> _notmuch() +>> diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst +>> index 90160f2..c9d38b1 100644 +>> --- a/doc/man1/notmuch-search.rst +>> +++ b/doc/man1/notmuch-search.rst +>> @@ -35,7 +35,7 @@ Supported options for **search** include +>> intended for programs that invoke **notmuch(1)** internally. If +>> omitted, the latest supported version will be used. +>> +>> - ``--output=(summary|threads|messages|files|tags)`` +>> + ``--output=(summary|threads|messages|files|tags|sender|recipients)`` +>> +>> **summary** +>> Output a summary of each thread with any message matching +>> @@ -78,6 +78,26 @@ Supported options for **search** include +>> by null characters (--format=text0), as a JSON array +>> (--format=json), or as an S-Expression list (--format=sexp). +>> +>> + **sender** +>> + Output all addresses from the *From* header that appear on +>> + any message matching the search terms, either one per line +>> + (--format=text), separated by null characters +>> + (--format=text0), as a JSON array (--format=json), or as +>> + an S-Expression list (--format=sexp). +>> + +>> + Note: Searching for **sender** should be much faster than +>> + searching for **recipients**, because sender addresses are +>> + cached directly in the database whereas other addresses +>> + need to be fetched from message files. +>> + +>> + **recipients** +>> + Like **sender** but for addresses from *To*, *Cc* and +>> + *Bcc* headers. +>> + +>> + This option can be given multiple times to combine different +>> + outputs. Curently, this is only supported for **sender** and +>> + **recipients** outputs. +>> + +>> ``--sort=``\ (**newest-first**\ \|\ **oldest-first**) +>> This option can be used to present results in either +>> chronological order (**oldest-first**) or reverse chronological +>> diff --git a/notmuch-search.c b/notmuch-search.c +>> index 5ac2a26..74588f8 100644 +>> --- a/notmuch-search.c +>> +++ b/notmuch-search.c +>> @@ -23,11 +23,14 @@ +>> #include "string-util.h" +>> +>> typedef enum { +>> - OUTPUT_SUMMARY, +>> - OUTPUT_THREADS, +>> - OUTPUT_MESSAGES, +>> - OUTPUT_FILES, +>> - OUTPUT_TAGS +>> + OUTPUT_SUMMARY = 1 << 0, +>> + OUTPUT_THREADS = 1 << 1, +>> + OUTPUT_MESSAGES = 1 << 2, +>> + OUTPUT_FILES = 1 << 3, +>> + OUTPUT_TAGS = 1 << 4, +>> + OUTPUT_SENDER = 1 << 5, +>> + OUTPUT_RECIPIENTS = 1 << 6, +>> + OUTPUT_ADDRESSES = OUTPUT_SENDER | OUTPUT_RECIPIENTS, +> +> leftover, like mentioned below (this comment added just before +> sending) + +This is needed to suppress a warning about unknown enum value in the +switch statement. + +>> } output_t; +>> +>> typedef struct { +>> @@ -220,6 +223,67 @@ do_search_threads (search_options_t *o) +>> return 0; +>> } +>> +>> +static void +>> +print_address_list (const search_options_t *o, InternetAddressList *list) +>> +{ +>> + InternetAddress *address; +>> + int i; +>> + +>> + for (i = 0; i < internet_address_list_length (list); i++) { +>> + address = internet_address_list_get_address (list, i); +>> + if (INTERNET_ADDRESS_IS_GROUP (address)) { +>> + InternetAddressGroup *group; +>> + InternetAddressList *group_list; +>> + +>> + group = INTERNET_ADDRESS_GROUP (address); +>> + group_list = internet_address_group_get_members (group); +>> + if (group_list == NULL) +>> + continue; +>> + +>> + print_address_list (o, group_list); +>> + } else { +>> + InternetAddressMailbox *mailbox; +>> + const char *name; +>> + const char *addr; +>> + char *full_address; +>> + +>> + mailbox = INTERNET_ADDRESS_MAILBOX (address); +>> + +>> + name = internet_address_get_name (address); +>> + addr = internet_address_mailbox_get_addr (mailbox); +>> + +>> + if (name && *name) +>> + full_address = talloc_asprintf (o->format, "%s <%s>", name, addr); +>> + else +>> + full_address = talloc_strdup (o->format, addr); +>> + +>> + if (!full_address) { +> +> Apart from minor style issue like space after ! and the leftover ADDRESSES +> parts (w/o that I would not have commented about !) the first 3 +> patches look pretty good to me. I have not tested those yet. + +If you want, I can send another version. + +> But we keep to have some disagreement w/ unique/duplicate/filter-by +> handling ;/ +> +> I (currently) rest the case of first/last/most common handling to just +> how the --sort=(newest-first|oldest-first) affects the order... +> +> Let's consider the following list of output if no /deduplication/ is done: +> +> John Doe +> Dr. John Doe +> John Doe +> John Doe +> Dr. John Doe +> John Doe +> John Doe +> Dr. John Doe +> Dr. John Doe +> Dr. John Doe +> John Doe +> John Doe +> +> To stir the pool a little more, this could be the output when +> --duplicate=all (the default) is given. + +This seems counter-intuitive. A command without --duplicate gives you a +certain number of addresses and when you add --duplicate, you get less +addresses? + +Moreover, --duplicate is already used with --output=files and has +different meaning. + +> With --duplicate=none the output could be (first match by unique +> case-insensitive address): +> +> John Doe +> John Doe +> John Doe + +This seems confusing to me. If I see "duplicate none", I would simply +expect that duplicate lines are removed from the --duplicate=all list +above. Not that just duplicate addresses would be removed. + +> (many people may have the same name, but email address is unique per person +> -- therefore I think 'none' limiting that just to John Doe +> would be too little) + +Here I agree that filtering by address part is perhaps most useful. + +> and with --duplicate=address +> +> John Doe +> Dr. John Doe +> John Doe +> Dr. John Doe +> John Doe + +I don't get this. You say --duplicate=address but the output contains +also duplicate names. + +> (from this output user can choose how the recipient is to be called +> (like "pseudonyms" mentioned in id:20141010113202.GE28601@TP_L520.localdomain ) +> when sending email) +> +> and --duplicate=address-casesensitive +> +> John Doe +> Dr. John Doe +> John Doe +> John Doe +> Dr. John Doe +> John Doe +> John Doe +> +> This reuse of --duplicate was thought out after Jani's irc mention of it. +> This scheme would leave no room tho the filter-by=name suggestion -- for +> completeness that would make this look: +> +> John Doe +> Dr. John Doe +> +> This doesn't look too bad in this particular case but not having ability to +> see all potential addresses (perhaps the only working address is now +> hidden) isn't not much for general use. Well, maybe for some specific use +> --duplicate=no-unique-addresses could be useful :O +> +> Ok, this took an hour to get written to (w/ some interruptions). Healthy +> criticism appreciated :D + +I don't know. You seem to think about this in the opposite way than how +it is implemented. The implementation really filters things out whereas +you specify what not to filter. + +My feeling is that if you would implement your proposal, the code would +be more complex than in my patch, because the mapping between command +line options and the actual algorithm would require some extra code. And +in a previous comment, you preferred simplicity. + +Hopefully, you consider the above as "healthy" criticism. + +Thanks for quick review. +-Michal