From: Jani Nikula
To:
Tomi Ollila, Mark Walters, notmuch@notmuchmail.org
Subject: Re: [PATCH 0/5] notmuch batch count
References: <8738y2ui4y.fsf@qmul.ac.uk>
User-Agent: Notmuch/0.14+259~gdee88db (http://notmuchmail.org) Emacs/23.2.1 (x86_64-pc-linux-gnu)
Date: Mon, 21 Jan 2013 18:21:52 +0100
Message-ID: <87fw1u30zz.fsf@nikula.org>
List-Id: "Use and development of the notmuch mail system."

On Wed, 16 Jan 2013, Tomi Ollila wrote:
> On Wed, Jan 16 2013, Mark Walters wrote:
>
>> On Tue, 15 Jan 2013, Jani Nikula wrote:
>>> Hi all -
>>>
>>> Notmuch remote usage [1] is a pretty handy way of accessing a notmuch
>>> database on a remote server. However, the more saved searches and tags
>>> you have, the slower notmuch-hello becomes, and it ends up being by
>>> far the biggest usability issue with remote notmuch. This is because
>>> notmuch-hello issues a separate 'notmuch count' for each saved search
>>> and tag.
>>>
>>> One could argue that notmuch-hello should be fixed somehow, but I chose
>>> to try another route: batch support for notmuch count. This enables
>>> notmuch-hello to get the counts for all the saved searches or tags in a
>>> single call. The performance improvement is huge in remote usage, but
>>> it's not limited to that. Regular local usage benefits from it too, but
>>> it's not as obviously noticeable.
>>
>> This series looks good to me (that is, the code looks fine).
>>
>> Two questions are:
>>
>> Do we want this functionality? I think it is useful even on local setups,
>> particularly if people have lots of tags (the section that shows all
>> tags can be quite noticeably sped up).
>> It is a substantial improvement
>> on remote setups but I am not sure if that is sufficiently common to
>> warrant the change. At least the code path is the same so it will get
>> enough testing.
>
> I do want the functionality. Especially where I am now, it takes about
> 0.4 sec for 'ssh remote echo foo' to get executed (using connection sharing).
> Pipelining the count requests could make all the count requests Emacs
> does (in my current set) complete in less than 1 sec.
>
>> Secondly, if we do want the functionality, should it be more general so
>> that it can do searches etc. too? I think this is less clear. Count is
>> likely to be the most useful one since running several (simultaneous)
>> counts is probably more common than running several simultaneous searches.
>
> One could argue that we should send json "documents" to notmuch in
> stdin and notmuch would output json(/sexp) "documents". That is just
> SMOP. I bet Austin would like this solution, especially the part
> that involves writing or integrating a json parser >;).
> I'd be happy with this 'batch' approach.
>
> I'll be testing this soon, but refrain from reviewing the code
> until 0.15 is out.

id:87a9s5cp38.fsf@zancas.localnet ;)

J.

>
>>
>> Best wishes
>>
>> Mark
>
>
> Tomi
>
>
>>
>>
>>>
>>> Here's a script that demonstrates one-by-one count vs.
>>> batch count, locally and over ssh (assuming ssh key authentication is
>>> set up), over 10 iterations:
>>>
>>> #!/bin/bash
>>>
>>> echo "tag count:"
>>> notmuch search --output=tags "*" | wc -l
>>>
>>> for remote in "" "ssh example.com"; do
>>>     export remote
>>>     echo "one-by-one count:"
>>>     time sh -c 'for i in `seq 10`; do notmuch search --format=text0 --output=tags "*" | xargs -0 -n 1 -I "{}" $remote notmuch count tag:"{}" > /dev/null; done'
>>>
>>>     echo "batch count:"
>>>     time sh -c 'for i in `seq 10`; do notmuch search --format=text --output=tags "*" | sed "s/.*/tag:\"&\"/" | $remote notmuch count --batch > /dev/null; done'
>>> done
>>>
>>> And here's the output of it in my setup:
>>>
>>> tag count:
>>> 36
>>> one-by-one count:
>>>
>>> real    0m2.349s
>>> user    0m0.552s
>>> sys     0m0.868s
>>> batch count:
>>>
>>> real    0m0.179s
>>> user    0m0.120s
>>> sys     0m0.064s
>>> one-by-one count:
>>>
>>> real    0m56.527s
>>> user    0m1.424s
>>> sys     0m1.164s
>>> batch count:
>>>
>>> real    0m2.407s
>>> user    0m0.068s
>>> sys     0m0.040s
>>>
>>> As can be seen, in local usage (the first pair of results) the speedup
>>> is more than 10x, although one-by-one notmuch count is usually
>>> sufficiently fast. The difference is more noticeable in remote use (the
>>> second pair of results), where the speedup is 20x here, and any
>>> additional, occasional network latency is multiplied by tag count. (That
>>> result is actually faster than usual for me, but it's still 5+ seconds
>>> to display or refresh notmuch-hello.)
>>>
>>> Mark has written a patch that I've been using to switch notmuch-hello to
>>> use batch count. That has made me switch from running notmuch in ssh to
>>> using remote notmuch. The great thing is that we could switch to using
>>> that in Emacs with no special casing for remote usage, and it would
>>> speed things up also in local use. I'm expecting Mark to post his patch
>>> in reply to this series.
>>>
>>> Mark actually wrote the elisp part based on the rough idea prior to any
>>> of this cli plumbing, so I felt obliged to follow up. So thanks Mark!
>>>
>>>
>>> BR,
>>> Jani.
>>>
>>>
>>> [1] http://notmuchmail.org/remoteusage/ (the page could use some
>>> cleanup; it's really not nearly as complicated as the page suggests)
>>>
>>>
>>> Jani Nikula (5):
>>>   cli: remove useless strdup
>>>   cli: extract count printing to a separate function in notmuch count
>>>   cli: add --batch option to notmuch count
>>>   man: document notmuch count --batch and --input options
>>>   test: notmuch count --batch and --input options
>>>
>>>  man/man1/notmuch-count.1 |   20 +++++++++
>>>  notmuch-count.c          |  111 +++++++++++++++++++++++++++++++++++----------
>>>  test/count               |   46 +++++++++++++++++++
>>>  3 files changed, 150 insertions(+), 27 deletions(-)
>>>
>>> --
>>> 1.7.10.4
>> _______________________________________________
>> notmuch mailing list
>> notmuch@notmuchmail.org
>> http://notmuchmail.org/mailman/listinfo/notmuch
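
For anyone wanting to try the batch pipeline from the quoted script outside
the benchmark loop, it could be wrapped roughly as below. This is only a
sketch under a few assumptions: the function names tags_to_queries and
batch_count_tags are made up for illustration and are not part of notmuch,
tag names are assumed to contain no double quotes, and 'notmuch count
--batch' is assumed to read one query per line from stdin as proposed in
this series.

```sh
#!/bin/sh
# Sketch only: both helper names are hypothetical, not part of notmuch.

# Turn a list of tags (one per line on stdin) into one query per line,
# e.g. "inbox" becomes tag:"inbox".
# Assumption: tag names contain no double quotes.
tags_to_queries() {
    sed 's/.*/tag:"&"/'
}

# Count the messages for every tag with a single 'notmuch count' call.
# Any arguments are used as a command prefix for the count, for example
# "ssh example.com", mirroring $remote in the quoted script.
# Assumption: --batch reads queries from stdin, as in this patch series.
batch_count_tags() {
    notmuch search --output=tags '*' | tags_to_queries |
        "$@" notmuch count --batch
}
```

Calling batch_count_tags with no arguments runs the count locally, while
'batch_count_tags ssh example.com' sends all queries over one ssh
connection instead of one connection per tag.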