From: Austin Clements
To: Karel Zak
Cc: notmuch@notmuchmail.org
Date: Sun, 27 Feb 2011 03:45:05 -0500
Subject: Re: [PATCH] new: read db_files and db_subdirs if mtime changed
In-Reply-To: <1296855871-15702-1-git-send-email-kzak@redhat.com>

Looks good: faster than, but provably equivalent to, the original code.
The notmuch_directory_get_child_* functions are side-effect free,
db_files/db_subdirs aren't used between where they were set in the old
code and where they are set in the new code, and db_files/db_subdirs
are initialized to NULL when declared.

Another timing data point:

Old code: ./notmuch new  0.77s user 0.28s system 99% cpu 1.051 total
New code: ./notmuch new  0.09s user 0.27s system 98% cpu 0.368 total

I wonder if an even faster approach than the current recursive walk
would be to get *all* of the directory names and mtimes out of Xapian
in one pass and stat them all. If a directory's mtime didn't change,
then there's no need to scandir that directory at all. This could even
beat the "time find >/dev/null" bound, but the gains may be too
marginal to make it worthwhile.

On Fri, Feb 4, 2011 at 4:44 PM, Karel Zak wrote:
> The db_files and db_subdirs are unnecessary for unchanged directories.
>
> maildir with 10000 e-mails:
>
> old version:
>        $ time ./notmuch new
>        No new mail.
>
>        real    0m0.053s
>        user    0m0.028s
>        sys     0m0.026s
>
> new version:
>        $ time ./notmuch new
>        No new mail.
>
>        real    0m0.032s
>        user    0m0.009s
>        sys     0m0.023s
>
> Signed-off-by: Karel Zak
> ---
>  notmuch-new.c |   15 ++++++---------
>  1 files changed, 6 insertions(+), 9 deletions(-)
>
> diff --git a/notmuch-new.c b/notmuch-new.c
> index 941f9d6..31d4553 100644
> --- a/notmuch-new.c
> +++ b/notmuch-new.c
> @@ -247,15 +247,7 @@ add_files_recursive (notmuch_database_t *notmuch,
>      directory = notmuch_database_get_directory (notmuch, path);
>      db_mtime = notmuch_directory_get_mtime (directory);
>
> -    if (db_mtime == 0) {
> -        new_directory = TRUE;
> -        db_files = NULL;
> -        db_subdirs = NULL;
> -    } else {
> -        new_directory = FALSE;
> -        db_files = notmuch_directory_get_child_files (directory);
> -        db_subdirs = notmuch_directory_get_child_directories (directory);
> -    }
> +    new_directory = db_mtime ? FALSE : TRUE;
>
>      /* If the database knows about this directory, then we sort based
>       * on strcmp to match the database sorting. Otherwise, we can do
> @@ -328,6 +320,11 @@ add_files_recursive (notmuch_database_t *notmuch,
>      if (fs_mtime == db_mtime)
>          goto DONE;
>
> +    if (!new_directory) {
> +        db_files = notmuch_directory_get_child_files (directory);
> +        db_subdirs = notmuch_directory_get_child_directories (directory);
> +    }
> +
>      /* Pass 2: Scan for new files, removed files, and removed directories. */
>      for (i = 0; i < num_fs_entries; i++)
>      {
> --
> 1.7.3.4
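
To make the one-pass idea from my reply concrete, here is a rough sketch
of the stat-everything step. It is not notmuch code: dir_record_t and
find_changed_dirs are hypothetical names, and it assumes the
(path, stored mtime) pairs have already been pulled out of the database
by some bulk query that is not shown here. Directories that fail to
stat are reported as changed so the caller can handle removals.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical record: a directory path plus the mtime the database
 * last stored for it. How these are fetched is out of scope here. */
typedef struct {
    const char *path;
    time_t db_mtime;
} dir_record_t;

/* Stat every known directory once and collect pointers to the records
 * whose filesystem mtime differs from the stored one. A directory that
 * cannot be stat'ed is also treated as changed. The caller must provide
 * room for n pointers in `changed`. Returns the number collected. */
size_t
find_changed_dirs (const dir_record_t *dirs, size_t n,
                   const dir_record_t **changed)
{
    size_t count = 0;
    size_t i;

    assert (dirs != NULL || n == 0);
    assert (changed != NULL || n == 0);

    for (i = 0; i < n; i++) {
        struct stat st;

        if (stat (dirs[i].path, &st) != 0 ||
            st.st_mtime != dirs[i].db_mtime)
            changed[count++] = &dirs[i];
    }

    return count;
}
```

Only the directories returned here would then need a scandir and a
comparison against db_files/db_subdirs; everything else is skipped
after a single stat. Whether that actually beats the recursive walk
would have to be measured.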