From 2d2335ad92d70e1f6cfb8499505e459043080af6 Mon Sep 17 00:00:00 2001
From: Olly Betts
Date: Fri, 8 Apr 2016 00:25:38 +0100
Subject: [PATCH] Re: slowdown in notmuch perf suite with xapian 1.3.5

---
 ca/aae1e2b616d9d51c73c1318f5a311cba3c66b6 | 91 +++++++++++++++++++++++
 1 file changed, 91 insertions(+)
 create mode 100644 ca/aae1e2b616d9d51c73c1318f5a311cba3c66b6

diff --git a/ca/aae1e2b616d9d51c73c1318f5a311cba3c66b6 b/ca/aae1e2b616d9d51c73c1318f5a311cba3c66b6
new file mode 100644
index 000000000..768964399
--- /dev/null
+++ b/ca/aae1e2b616d9d51c73c1318f5a311cba3c66b6
@@ -0,0 +1,91 @@
+Return-Path:
+X-Original-To: notmuch@notmuchmail.org
+Delivered-To: notmuch@notmuchmail.org
+Received: from localhost (localhost [127.0.0.1])
+ by arlo.cworth.org (Postfix) with ESMTP id 23BCB6DE02B5
+ for ; Thu, 7 Apr 2016 17:03:15 -0700 (PDT)
+X-Virus-Scanned: Debian amavisd-new at cworth.org
+X-Spam-Flag: NO
+X-Spam-Score: -2.421
+X-Spam-Level:
+X-Spam-Status: No, score=-2.421 tagged_above=-999 required=5
+ tests=[AWL=-0.120, RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001]
+ autolearn=disabled
+Received: from arlo.cworth.org ([127.0.0.1])
+ by localhost (arlo.cworth.org [127.0.0.1]) (amavisd-new, port 10024)
+ with ESMTP id zCKOoGFKIfci for ;
+ Thu, 7 Apr 2016 17:03:07 -0700 (PDT)
+X-Greylist: delayed 2244 seconds by postgrey-1.35 at arlo;
+ Thu, 07 Apr 2016 17:03:06 PDT
+Received: from atreus.tartarus.org (atreus.tartarus.org [80.252.125.10])
+ by arlo.cworth.org (Postfix) with ESMTPS id CF23C6DE0134
+ for ; Thu, 7 Apr 2016 17:03:06 -0700 (PDT)
+Received: from olly by atreus.tartarus.org with local (Exim 4.69)
+ (envelope-from )
+ id 1aoJIw-0002w0-2y; Fri, 08 Apr 2016 00:25:38 +0100
+Date: Fri, 8 Apr 2016 00:25:38 +0100
+From: Olly Betts
+To: David Bremner
+Cc: notmuch@notmuchmail.org, xapian-discuss@lists.xapian.org
+Subject: Re: slowdown in notmuch perf suite with xapian 1.3.5
+Message-ID: <20160407232537.GB29434@survex.com>
+Reply-To: Xapian Discussion
+Mail-Followup-To: David Bremner ,
+ notmuch@notmuchmail.org, xapian-discuss@lists.xapian.org
+References: <87twjd639d.fsf@zancas.localnet>
+MIME-Version: 1.0
+Content-Type: text/plain; charset=us-ascii
+Content-Disposition: inline
+In-Reply-To: <87twjd639d.fsf@zancas.localnet>
+User-Agent: Mutt/1.5.21 (2010-09-15)
+X-BeenThere: notmuch@notmuchmail.org
+X-Mailman-Version: 2.1.20
+Precedence: list
+List-Id: "Use and development of the notmuch mail system."
+
+List-Unsubscribe: ,
+
+List-Archive:
+List-Post:
+List-Help:
+List-Subscribe: ,
+
+X-List-Received-Date: Fri, 08 Apr 2016 00:03:15 -0000
+
+On Thu, Apr 07, 2016 at 08:56:46AM -0300, David Bremner wrote:
+> I hadn't noticed any interactive slowdown, but when I got around to
+> running the notmuch performance suite, there seems to be some noticeable
+> slowdown with the glass backend (default in Xapian 1.3.5) compared to
+> chert (using xapian 1.2.22)
+
+Some of this is pretty much expected, though other parts I don't
+entirely understand.
+
+One of the big changes in glass is how the position table is structured.
+In chert, it is ordered by (document,term) but in glass that has been
+changed to (term,document).
+
+This change makes a huge difference to phrase searches in cases where
+a lot of phrase data is needed, but it has an indexing time cost -
+adding a new document can no longer just append a load of entries to
+the position table, but instead we need to buffer up the changes, and
+then merge the entries within the existing table.
+
+The trade-off isn't ideal for everyone, but the cases of slow phrase
+searches were a real pain point that needed addressing. The plan is
+to optimise indexing speed in other ways to regain this loss - some
+of that has been done but there's a lot more to do still.
+
+So the T00-new.sh numbers make sense - there's more work to do, and
+we need to read existing positional data more to insert the new stuff,
+so the increased reads and writes make sense.
+
+But guessing at what the other two tests do, I wouldn't expect them to
+be affected by this.
+
+I'm also a bit puzzled by how glass can manage not to read any data
+for "dump *", and several tests seem to not read or write anything
+for either backend. What exactly are the "In/Out" numbers?
+
+Cheers,
+ Olly
-- 
2.26.2