From: Mark Walters
To: Austin Clements, Tomi Ollila
Subject: Re: [PATCH] cli: notmuch-show with framing newlines between threads in JSON.
In-Reply-To: <20120702035241.GD6220@mit.edu>
References: <1341041024-5342-1-git-send-email-markwalters1009@gmail.com> <20120702001215.GC6220@mit.edu> <20120702035241.GD6220@mit.edu>
Date: Sun, 08 Jul 2012 06:30:28 +0100
Message-ID: <87ehomu8ej.fsf@qmul.ac.uk>
Cc: notmuch@notmuchmail.org

On Mon, 02 Jul 2012, Austin Clements wrote:
> Quoth myself on Jul 01 at 8:12 pm:
>> Quoth Tomi Ollila on Jul 02 at 1:13 am:
>> > On Sat, Jun 30 2012, Mark Walters wrote:
>> >
>> > > Add newlines between complete threads to make asynchronous parsing
>> > > of the JSON easier.
>> > > ---
>> > >
>> > > notmuch-pick uses the JSON output of notmuch show but, in many cases,
>> > > for many threads.
>> > > This can take quite a long time when displaying a
>> > > large number of messages (say 20 seconds for the 10,000 messages in
>> > > the notmuch archive). Thus it is desirable to display results
>> > > incrementally in the same way that search currently does.
>> > >
>> > > To make this easier this patch adds newlines between each toplevel
>> > > thread. So the output becomes
>> > >
>> > > [
>> > > thread1
>> > > , thread2
>> > > , thread3
>> > > ...
>> > > , last_thread
>> > > ]
>> > >
>> > > Thus the parser can easily tell if it has enough data to do some more
>> > > parsing.
>> > >
>> > > Obviously, this changes the JSON output. This should not break any
>> > > consumer as the JSON parsers should not mind. However, it does break
>> > > several tests. Obviously, I will fix these but I wanted to check if
>> > > people were basically happy with the change first.
>> >
>> > To provide this feature rather than relying on newlines the parser should
>> > use its state to notice when one thread ends.
>> >
>> > Such a change could be used (privately) for human consumption -- allowing
>> > free change of whitespace during inspection (in a debugging session or so).
>> > Computer software should not rely (or suffer) from any additional
>> > (or lack thereof) whitespace there is...
>> >
>> > ... or at least a really convincing argument for the change needs to
>> > be presented (before "restricting" the json output notmuch spits out).
>>
>> Given a JSON parser that only knows how to parse complete JSON
>> expressions, it's potentially very inefficient to keep attempting to
>> parse something when you don't know if it's complete. The newlines
>> provide an in-band framing so the consumer knows when there's a
>> complete object to be parsed.
>>
>> In effect, this defines a super-protocol of JSON that's compatible
>> with standard JSON, but easy to incrementally parse.
>>
>> That said, just this weekend I implemented JSON-based search with
>> incremental JSON parsing and I took a slightly different approach. I
>> still put framing into the newlines of the search results, but rather
>> than rely on it for correctness, the consumer uses it as an
>> optimization that only hints that a complete JSON expression is
>> probably available. If the expression turns out to be incomplete,
>> that's okay.
>>
>> I considered building a fully-incremental JSON parser that never
>> backtracks by more than a token, which would eliminate even the cost
>> of reparsing, but if we do move to S-expressions (which I think we
>> should), we want to let Emacs' C implementation do as much of the
>> parsing as possible, and the only thing we can do with that is read a
>> complete expression.
>
> Actually, I take that back. While we can't do fast incremental
> S-expression parsing, `parse-partial-sexp' can tell us incrementally
> (and probably very quickly) *if* there's a complete expression ready
> to parse, so we could avoid calling into the parser at all unless it
> would succeed.
>
> I'll try this out in my incremental JSON parser and see how well it
> works.

I have converted pick to use Austin's incremental parser and all works
well so this seems the way to go. Hence I have marked my original patch
obsolete.

Best wishes

Mark
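
For anyone reading this in the archive, the hint-based framing idea quoted
above can be sketched roughly as follows. This is only an illustration in
Python, not the Emacs Lisp parser that pick actually uses. The names
`iter_lines` and `iter_threads` are hypothetical, and the sketch assumes the
layout proposed in the patch: an opening "[" followed by one thread per line,
each later thread prefixed by a comma, and a closing "]" on its own line. A
newline is treated purely as a hint: if the accumulated text does not parse
yet, the consumer just waits for more data.

```python
import json


def iter_lines(chunks):
    """Split an iterable of text chunks into lines (hypothetical helper)."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            yield line
    if buffer:
        # Final line without a trailing newline.
        yield buffer


def iter_threads(chunks):
    """Yield each top-level thread as soon as it parses (hypothetical name).

    Assumes the framed layout from the patch: "[" first, then one thread
    per line, with a leading comma on every thread after the first.
    """
    pending = ""      # text accumulated for the thread being read
    started = False   # True once the opening "[" has been consumed
    for line in iter_lines(chunks):
        if not started:
            line = line.lstrip()
            if not line:
                continue
            if not line.startswith("["):
                raise ValueError("expected '[' at start of output")
            started = True
            line = line[1:]
        pending += line + "\n"
        # Drop the separator comma that precedes every thread but the first.
        candidate = pending.strip().lstrip(",").strip()
        if not candidate:
            pending = ""
            continue
        if candidate == "]":
            return  # closing bracket of the top-level list
        try:
            thread = json.loads(candidate)
        except json.JSONDecodeError:
            # The newline was only a hint; keep accumulating.
            continue
        pending = ""
        yield thread
    if pending.strip():
        raise ValueError("output ended in the middle of a thread")
```

Each failed attempt reparses the accumulated text from the start of the
thread, which is the reparsing cost mentioned in the quoted discussion. A
check like `parse-partial-sexp' would avoid calling the parser until it is
known to succeed.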