Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 1C8E1431FAF for ; Mon, 13 Oct 2014 14:38:44 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -2.3 X-Spam-Level: X-Spam-Status: No, score=-2.3 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id d50R1f3cgZVk for ; Mon, 13 Oct 2014 14:38:40 -0700 (PDT) Received: from max.feld.cvut.cz (max.feld.cvut.cz [147.32.192.36]) by olra.theworths.org (Postfix) with ESMTP id 80EB9431FAE for ; Mon, 13 Oct 2014 14:38:40 -0700 (PDT) Received: from localhost (unknown [192.168.200.7]) by max.feld.cvut.cz (Postfix) with ESMTP id 4D9475CCF4A; Mon, 13 Oct 2014 23:38:35 +0200 (CEST) X-Virus-Scanned: IMAP STYX AMAVIS Received: from max.feld.cvut.cz ([192.168.200.1]) by localhost (styx.feld.cvut.cz [192.168.200.7]) (amavisd-new, port 10044) with ESMTP id bY1nH3zVLXiJ; Mon, 13 Oct 2014 23:38:31 +0200 (CEST) Received: from imap.feld.cvut.cz (imap.feld.cvut.cz [147.32.192.34]) by max.feld.cvut.cz (Postfix) with ESMTP id 2F68B5CCEAE; Mon, 13 Oct 2014 23:38:31 +0200 (CEST) Received: from wsh by steelpick.2x.cz with local (Exim 4.84) (envelope-from ) id 1XdnK2-0005jf-7H; Mon, 13 Oct 2014 23:38:30 +0200 From: Michal Sojka To: Tomi Ollila , notmuch@notmuchmail.org Subject: Re: [PATCH v3 3/4] cli: Extend the search command for --output={sender, recipients} In-Reply-To: References: <87zjd51phx.fsf@steelpick.2x.cz> <1413150093-8383-1-git-send-email-sojkam1@fel.cvut.cz> <1413150093-8383-4-git-send-email-sojkam1@fel.cvut.cz> User-Agent: Notmuch/0.18.1+101~g56b0ff0 (http://notmuchmail.org) Emacs/24.3.1 (x86_64-pc-linux-gnu) Date: Mon, 13 Oct 2014 23:38:30 +0200 Message-ID: <871tqbejl5.fsf@steelpick.2x.cz> MIME-Version: 1.0 Content-Type: text/plain X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 13 Oct 2014 21:38:44 -0000 On Mon, Oct 13 2014, Tomi Ollila wrote: > On Mon, Oct 13 2014, Michal Sojka wrote: > >> The new outputs allow printing senders, recipients or both of matching >> messages. The --output option is converted from "keyword" argument to >> "flags" argument, which means that the user can use --output=sender and >> --output=recipients simultaneously, to print both. Other combinations >> produce an error. >> >> This code based on a patch from Jani Nikula. >> --- >> completion/notmuch-completion.bash | 2 +- >> completion/notmuch-completion.zsh | 3 +- >> doc/man1/notmuch-search.rst | 22 +++++++- >> notmuch-search.c | 110 ++++++++++++++++++++++++++++++++++--- >> test/T090-search-output.sh | 64 +++++++++++++++++++++ >> 5 files changed, 189 insertions(+), 12 deletions(-) >> >> diff --git a/completion/notmuch-completion.bash b/completion/notmuch-completion.bash >> index 0571dc9..cfbd389 100644 >> --- a/completion/notmuch-completion.bash >> +++ b/completion/notmuch-completion.bash >> @@ -294,7 +294,7 @@ _notmuch_search() >> return >> ;; >> --output) >> - COMPREPLY=( $( compgen -W "summary threads messages files tags" -- "${cur}" ) ) >> + COMPREPLY=( $( compgen -W "summary threads messages files tags sender recipients" -- "${cur}" ) ) >> return >> ;; >> --sort) >> diff --git a/completion/notmuch-completion.zsh b/completion/notmuch-completion.zsh >> index 67a9aba..3e52a00 100644 >> --- a/completion/notmuch-completion.zsh >> +++ b/completion/notmuch-completion.zsh >> @@ -52,7 +52,8 @@ _notmuch_search() >> _arguments -s : \ >> '--max-threads=[display only the first x threads from the search results]:number of threads to show: ' \ >> '--first=[omit the first x threads from the search results]:number of threads to omit: ' \ >> - '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' >> + '--sort=[sort results]:sorting:((newest-first\:"reverse chronological order" oldest-first\:"chronological order"))' \ >> + '--output=[select what to output]:output:((summary threads messages files tags sender recipients))' >> } >> >> _notmuch() >> diff --git a/doc/man1/notmuch-search.rst b/doc/man1/notmuch-search.rst >> index 90160f2..c9d38b1 100644 >> --- a/doc/man1/notmuch-search.rst >> +++ b/doc/man1/notmuch-search.rst >> @@ -35,7 +35,7 @@ Supported options for **search** include >> intended for programs that invoke **notmuch(1)** internally. If >> omitted, the latest supported version will be used. >> >> - ``--output=(summary|threads|messages|files|tags)`` >> + ``--output=(summary|threads|messages|files|tags|sender|recipients)`` >> >> **summary** >> Output a summary of each thread with any message matching >> @@ -78,6 +78,26 @@ Supported options for **search** include >> by null characters (--format=text0), as a JSON array >> (--format=json), or as an S-Expression list (--format=sexp). >> >> + **sender** >> + Output all addresses from the *From* header that appear on >> + any message matching the search terms, either one per line >> + (--format=text), separated by null characters >> + (--format=text0), as a JSON array (--format=json), or as >> + an S-Expression list (--format=sexp). >> + >> + Note: Searching for **sender** should be much faster than >> + searching for **recipients**, because sender addresses are >> + cached directly in the database whereas other addresses >> + need to be fetched from message files. >> + >> + **recipients** >> + Like **sender** but for addresses from *To*, *Cc* and >> + *Bcc* headers. >> + >> + This option can be given multiple times to combine different >> + outputs. Curently, this is only supported for **sender** and >> + **recipients** outputs. >> + >> ``--sort=``\ (**newest-first**\ \|\ **oldest-first**) >> This option can be used to present results in either >> chronological order (**oldest-first**) or reverse chronological >> diff --git a/notmuch-search.c b/notmuch-search.c >> index 5ac2a26..74588f8 100644 >> --- a/notmuch-search.c >> +++ b/notmuch-search.c >> @@ -23,11 +23,14 @@ >> #include "string-util.h" >> >> typedef enum { >> - OUTPUT_SUMMARY, >> - OUTPUT_THREADS, >> - OUTPUT_MESSAGES, >> - OUTPUT_FILES, >> - OUTPUT_TAGS >> + OUTPUT_SUMMARY = 1 << 0, >> + OUTPUT_THREADS = 1 << 1, >> + OUTPUT_MESSAGES = 1 << 2, >> + OUTPUT_FILES = 1 << 3, >> + OUTPUT_TAGS = 1 << 4, >> + OUTPUT_SENDER = 1 << 5, >> + OUTPUT_RECIPIENTS = 1 << 6, >> + OUTPUT_ADDRESSES = OUTPUT_SENDER | OUTPUT_RECIPIENTS, > > leftover, like mentioned below (this comment added just before > sending) This is needed to suppress a warning about unknown enum value in the switch statement. >> } output_t; >> >> typedef struct { >> @@ -220,6 +223,67 @@ do_search_threads (search_options_t *o) >> return 0; >> } >> >> +static void >> +print_address_list (const search_options_t *o, InternetAddressList *list) >> +{ >> + InternetAddress *address; >> + int i; >> + >> + for (i = 0; i < internet_address_list_length (list); i++) { >> + address = internet_address_list_get_address (list, i); >> + if (INTERNET_ADDRESS_IS_GROUP (address)) { >> + InternetAddressGroup *group; >> + InternetAddressList *group_list; >> + >> + group = INTERNET_ADDRESS_GROUP (address); >> + group_list = internet_address_group_get_members (group); >> + if (group_list == NULL) >> + continue; >> + >> + print_address_list (o, group_list); >> + } else { >> + InternetAddressMailbox *mailbox; >> + const char *name; >> + const char *addr; >> + char *full_address; >> + >> + mailbox = INTERNET_ADDRESS_MAILBOX (address); >> + >> + name = internet_address_get_name (address); >> + addr = internet_address_mailbox_get_addr (mailbox); >> + >> + if (name && *name) >> + full_address = talloc_asprintf (o->format, "%s <%s>", name, addr); >> + else >> + full_address = talloc_strdup (o->format, addr); >> + >> + if (!full_address) { > > Apart from minor style issue like space after ! and the leftover ADDRESSES > parts (w/o that I would not have commented about !) the first 3 > patches look pretty good to me. I have not tested those yet. If you want, I can send another version. > But we keep to have some disagreement w/ unique/duplicate/filter-by > handling ;/ > > I (currently) rest the case of first/last/most common handling to just > how the --sort=(newest-first|oldest-first) affects the order... > > Let's consider the following list of output if no /deduplication/ is done: > > John Doe > Dr. John Doe > John Doe > John Doe > Dr. John Doe > John Doe > John Doe > Dr. John Doe > Dr. John Doe > Dr. John Doe > John Doe > John Doe > > To stir the pool a little more, this could be the output when > --duplicate=all (the default) is given. This seems counter-intuitive. A command without --duplicate gives you a certain number of addresses and when you add --duplicate, you get less addresses? Moreover, --duplicate is already used with --output=files and has different meaning. > With --duplicate=none the output could be (first match by unique > case-insensitive address): > > John Doe > John Doe > John Doe This seems confusing to me. If I see "duplicate none", I would simply expect that duplicate lines are removed from the --duplicate=all list above. Not that just duplicate addresses would be removed. > (many people may have the same name, but email address is unique per person > -- therefore I think 'none' limiting that just to John Doe > would be too little) Here I agree that filtering by address part is perhaps most useful. > and with --duplicate=address > > John Doe > Dr. John Doe > John Doe > Dr. John Doe > John Doe I don't get this. You say --duplicate=address but the output contains also duplicate names. > (from this output user can choose how the recipient is to be called > (like "pseudonyms" mentioned in id:20141010113202.GE28601@TP_L520.localdomain ) > when sending email) > > and --duplicate=address-casesensitive > > John Doe > Dr. John Doe > John Doe > John Doe > Dr. John Doe > John Doe > John Doe > > This reuse of --duplicate was thought out after Jani's irc mention of it. > This scheme would leave no room tho the filter-by=name suggestion -- for > completeness that would make this look: > > John Doe > Dr. John Doe > > This doesn't look too bad in this particular case but not having ability to > see all potential addresses (perhaps the only working address is now > hidden) isn't not much for general use. Well, maybe for some specific use > --duplicate=no-unique-addresses could be useful :O > > Ok, this took an hour to get written to (w/ some interruptions). Healthy > criticism appreciated :D I don't know. You seem to think about this in the opposite way than how it is implemented. The implementation really filters things out whereas you specify what not to filter. My feeling is that if you would implement your proposal, the code would be more complex than in my patch, because the mapping between command line options and the actual algorithm would require some extra code. And in a previous comment, you preferred simplicity. Hopefully, you consider the above as "healthy" criticism. Thanks for quick review. -Michal