Date: Sun, 1 Jul 2012 20:12:35 -0400
From: Austin Clements
To: Tomi Ollila
Subject: Re: [PATCH] cli: notmuch-show with framing newlines between threads in JSON.
Message-ID: <20120702001215.GC6220@mit.edu>
References: <1341041024-5342-1-git-send-email-markwalters1009@gmail.com>
Cc: notmuch@notmuchmail.org

Quoth Tomi Ollila on Jul 02 at 1:13 am:
> On Sat, Jun 30 2012, Mark Walters wrote:
>
> > Add newlines between complete threads to make asynchronous parsing
> > of the JSON easier.
> > ---
> >
> > notmuch-pick uses the JSON output of notmuch show but, in many cases,
> > for many threads. This can take quite a long time when displaying a
> > large number of messages (say 20 seconds for the 10,000 messages in
> > the notmuch archive).
> > Thus it is desirable to display results
> > incrementally in the same way that search currently does.
> >
> > To make this easier this patch adds newlines between each toplevel
> > thread. So the output becomes
> >
> > [
> > thread1
> > , thread2
> > , thread3
> > ...
> > , last_thread
> > ]
> >
> > Thus the parser can easily tell if it has enough data to do some more
> > parsing.
> >
> > Obviously, this changes the JSON output. This should not break any
> > consumer as the JSON parsers should not mind. However, it does break
> > several tests. Obviously, I will fix these but I wanted to check if
> > people were basically happy with the change first.
>
> To provide this feature rather than relying on newlines the parser should
> use its state to notice when one thread ends.
>
> Such a change could be used (privately) for human consumption -- allowing
> free change of whitespace during inspection (in a debugging session or so).
> Computer software should not rely (or suffer) from any additional
> (or lack thereof) whitespace there is...
>
> ... or at least a really convincing argument for the change needs to
> be presented (before "restricting" the json output notmuch spits out).

Given a JSON parser that only knows how to parse complete JSON
expressions, it's potentially very inefficient to keep attempting to
parse something when you don't know if it's complete.  The newlines
provide an in-band framing so the consumer knows when there's a
complete object to be parsed.  In effect, this defines a
super-protocol of JSON that's compatible with standard JSON, but easy
to incrementally parse.

That said, just this weekend I implemented JSON-based search with
incremental JSON parsing and I took a slightly different approach.  I
still put framing into the newlines of the search results, but rather
than rely on it for correctness, the consumer uses it as an
optimization that only hints that a complete JSON expression is
probably available.
If the expression turns out to be incomplete, that's okay.

I considered building a fully-incremental JSON parser that never
backtracks by more than a token, which would eliminate even the cost
of reparsing, but if we do move to S-expressions (which I think we
should), we want to let Emacs' C implementation do as much of the
parsing as possible, and the only thing we can do with that is read a
complete expression.

> Btw: AFAIC (json-read) parses the whole json object (ignoring whitespace,
> including newlines outside strings). So I guess notmuch-pick uses something
> slightly different (probably using json.el subroutines)..
>
> Btw2: I'm very interested to see notmuch-pick in action -- I just don't
> see this as a way to do this particular support properly.
>
> Btw3: if search is ever going to use json we'll face the same problem --
> unless writing each line as a separate json object (and starting to use
> s-expressions for speed)

Done.  I'll post the patches after a little more cleanup.

> > Also, should devel/schemata be updated? It seems a little unclear as
> > this is not really a "JSON" change as the JSON does not care about the
> > newlines.
> >
> > Best wishes
>
> and best luck with your notmuch-pick work.
>
> >
> > Mark
>
> Tomi
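
P.S. For anyone who wants the newline-as-hint idea spelled out, here
is a rough sketch in Python (not Emacs Lisp, and not the code from
the patches mentioned above).  The class name FramedArrayReader and
its feed method are made up for illustration; nothing here is notmuch
API.  It assumes the producer eventually emits a newline after every
top-level element and after the closing "]", as in Mark's layout.  A
newline only triggers a parse attempt; if the attempt fails, the text
is kept and retried at the next newline, so a newline in the middle
of an element is harmless.

```python
import json


class FramedArrayReader:
    """Incrementally extract the elements of a newline-framed JSON array."""

    def __init__(self):
        self.buf = ""        # input received but not yet consumed
        self.scan = 0        # offset in buf to resume searching for a newline
        self.opened = False  # has the outer "[" been consumed?
        self.closed = False  # has the outer "]" been seen?

    def feed(self, chunk):
        """Add a chunk of text; return the elements completed by it."""
        self.buf += chunk
        items = []
        if not self.opened:
            self.buf = self.buf.lstrip()
            if not self.buf:
                return items          # nothing but whitespace so far
            if self.buf[0] != "[":
                raise ValueError("expected a JSON array")
            self.buf = self.buf[1:]   # drop the outer "[" exactly once
            self.opened = True
        while not self.closed:
            nl = self.buf.find("\n", self.scan)
            if nl < 0:
                break                 # no hint yet; wait for more input
            text = self.buf[:nl].strip()
            if text.startswith(","):
                text = text[1:].lstrip()   # separator before this element
            if text == "]":
                self.closed = True
                break
            if text:
                try:
                    items.append(json.loads(text))
                except json.JSONDecodeError:
                    # The hint was wrong: the element continues past this
                    # newline.  Keep the text and retry at the next newline.
                    self.scan = nl + 1
                    continue
            # Element parsed (or the line was blank): consume through newline.
            self.buf = self.buf[nl + 1:]
            self.scan = 0
        return items
```

Each failed attempt reparses the element from its start, which is the
reparsing cost discussed above; the hint just keeps the number of
attempts small when the framing is accurate.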