Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 16126429E34 for ; Thu, 12 Jul 2012 04:25:17 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -1.098 X-Spam-Level: X-Spam-Status: No, score=-1.098 tagged_above=-999 required=5 tests=[DKIM_ADSP_CUSTOM_MED=0.001, FREEMAIL_FROM=0.001, NML_ADSP_CUSTOM_MED=1.2, RCVD_IN_DNSWL_MED=-2.3] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id 8cEXHoa72d83 for ; Thu, 12 Jul 2012 04:25:15 -0700 (PDT) Received: from mail2.qmul.ac.uk (mail2.qmul.ac.uk [138.37.6.6]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by olra.theworths.org (Postfix) with ESMTPS id 45246431FAE for ; Thu, 12 Jul 2012 04:25:15 -0700 (PDT) Received: from smtp.qmul.ac.uk ([138.37.6.40]) by mail2.qmul.ac.uk with esmtp (Exim 4.71) (envelope-from ) id 1SpHWC-0003YE-Ou; Thu, 12 Jul 2012 12:25:13 +0100 Received: from 94-192-233-223.zone6.bethere.co.uk ([94.192.233.223] helo=localhost) by smtp.qmul.ac.uk with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.69) (envelope-from ) id 1SpHWC-0004vz-3a; Thu, 12 Jul 2012 12:25:12 +0100 From: Mark Walters To: craven@gmx.net, notmuch@notmuchmail.org Subject: Re: [PATCH v4 1/3] Add support for structured output formatters. In-Reply-To: <87a9z5teaj.fsf@nexoid.at> References: <87d34hsdx8.fsf@awakening.csail.mit.edu> <1342079004-5300-1-git-send-email-craven@gmx.net> <1342079004-5300-2-git-send-email-craven@gmx.net> <87pq81tjv4.fsf@qmul.ac.uk> <87a9z5teaj.fsf@nexoid.at> User-Agent: Notmuch/0.13.2+61~gf708609 (http://notmuchmail.org) Emacs/23.4.1 (x86_64-pc-linux-gnu) Date: Thu, 12 Jul 2012 12:25:09 +0100 Message-ID: <87liipte5m.fsf@qmul.ac.uk> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Sender-Host-Address: 94.192.233.223 X-QM-SPAM-Info: Sender has good ham record. :) X-QM-Body-MD5: cffec11c2a64bc7d6e3fe01dff04a7c9 (of first 20000 bytes) X-SpamAssassin-Score: -1.8 X-SpamAssassin-SpamBar: - X-SpamAssassin-Report: The QM spam filters have analysed this message to determine if it is spam. We require at least 5.0 points to mark a message as spam. This message scored -1.8 points. Summary of the scoring: * -2.3 RCVD_IN_DNSWL_MED RBL: Sender listed at http://www.dnswl.org/, * medium trust * [138.37.6.40 listed in list.dnswl.org] * 0.0 FREEMAIL_FROM Sender email is commonly abused enduser mail provider * (markwalters1009[at]gmail.com) * -0.0 T_RP_MATCHES_RCVD Envelope sender domain matches handover relay * domain * 0.5 AWL AWL: From: address is in the auto white-list X-QM-Scan-Virus: ClamAV says the message is clean X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 12 Jul 2012 11:25:17 -0000 On Thu, 12 Jul 2012, craven@gmx.net wrote: >> what is the advantage of having this as one function rather than end_map >> and end_list? Indeed, my choice (but I think most other people would >> disagree) would be to have both functions but still keep state as this >> currently does and then throw an error if the code closes the wrong >> thing. > > There's probably no advantage, one way or the other is fine, I'd say. > I've thought about introducing checks into the formatter functions, to > raise errors for improper closing, map_key not inside a map and things > like that, I just wasn't sure that would be acceptable. I will leave others to comment. >> A second question: do you have an implementation in this style for >> s-expressions. I find it hard to tell whether the interface is right >> with just a single example. Even a completely hacky not ready for review >> example would be helpful. > > See the attached patch :) This looks great. I found it much easier to review by splitting sprinter.c into two files sprinter-json.c and sprinter-sexp.c and then running meld on them. The similarity is then very clear. It might be worth submitting them as two files, but I leave other people to comment. (Doing so made some of the difference between json and s-exp clear: like that keys in mapkeys are quoted in json but not in s-exp) It could be that some of the code could be merged, but I am not sure that there is much advantage. I would imagine that these two sprinter.c files would basically never change so there is not much risk of them diverging. I wonder if it would be worth using aggregate_t for both rather than using the closing symbol for this purpose in the json output. In any case this patch answers my query: the new structure does generalise very easily to s-expressions! Best wishes Mark > > Thanks for the suggestions! > > Peter > From cf2c5eeab814970736510ca2210b909643a8cf19 Mon Sep 17 00:00:00 2001 > From: > Date: Thu, 12 Jul 2012 10:17:05 +0200 > Subject: [PATCH] Add an S-Expression output format. > > --- > notmuch-search.c | 7 ++- > sprinter.c | 170 +++++++++++++++++++++++++++++++++++++++++++++++++++++++ > sprinter.h | 4 ++ > 3 files changed, 180 insertions(+), 1 deletion(-) > > diff --git a/notmuch-search.c b/notmuch-search.c > index b853f5f..f8bea9b 100644 > --- a/notmuch-search.c > +++ b/notmuch-search.c > @@ -77,6 +77,7 @@ do_search_threads (sprinter_t *format, > > if (format != sprinter_text) { > format->begin_list (format); > + format->frame (format); > } > > for (i = 0; > @@ -380,7 +381,7 @@ notmuch_search_command (void *ctx, int argc, char *argv[]) > int exclude = EXCLUDE_TRUE; > unsigned int i; > > - enum { NOTMUCH_FORMAT_JSON, NOTMUCH_FORMAT_TEXT } > + enum { NOTMUCH_FORMAT_JSON, NOTMUCH_FORMAT_TEXT, NOTMUCH_FORMAT_SEXP } > format_sel = NOTMUCH_FORMAT_TEXT; > > notmuch_opt_desc_t options[] = { > @@ -391,6 +392,7 @@ notmuch_search_command (void *ctx, int argc, char *argv[]) > { NOTMUCH_OPT_KEYWORD, &format_sel, "format", 'f', > (notmuch_keyword_t []){ { "json", NOTMUCH_FORMAT_JSON }, > { "text", NOTMUCH_FORMAT_TEXT }, > + { "sexp", NOTMUCH_FORMAT_SEXP }, > { 0, 0 } } }, > { NOTMUCH_OPT_KEYWORD, &output, "output", 'o', > (notmuch_keyword_t []){ { "summary", OUTPUT_SUMMARY }, > @@ -422,6 +424,9 @@ notmuch_search_command (void *ctx, int argc, char *argv[]) > case NOTMUCH_FORMAT_JSON: > format = sprinter_json_new (ctx, stdout); > break; > + case NOTMUCH_FORMAT_SEXP: > + format = sprinter_sexp_new (ctx, stdout); > + break; > } > > config = notmuch_config_open (ctx, NULL, NULL); > diff --git a/sprinter.c b/sprinter.c > index 649f79a..fce0f9b 100644 > --- a/sprinter.c > +++ b/sprinter.c > @@ -170,3 +170,173 @@ sprinter_json_new(const void *ctx, FILE *stream) > res->stream = stream; > return &res->vtable; > } > + > +/* > + * Every below here is the implementation of the SEXP printer. > + */ > + > +typedef enum { MAP, LIST } aggregate_t; > + > +struct sprinter_sexp > +{ > + struct sprinter vtable; > + FILE *stream; > + /* Top of the state stack, or NULL if the printer is not currently > + * inside any aggregate types. */ > + struct sexp_state *state; > +}; > + > +struct sexp_state > +{ > + struct sexp_state *parent; > + /* True if nothing has been printed in this aggregate yet. > + * Suppresses the comma before a value. */ > + notmuch_bool_t first; > + /* The character that closes the current aggregate. */ > + aggregate_t type; > +}; > + > +static struct sprinter_sexp * > +sexp_begin_value(struct sprinter *sp) > +{ > + struct sprinter_sexp *spsx = (struct sprinter_sexp*)sp; > + if (spsx->state) { > + if (!spsx->state->first) > + fputc (' ', spsx->stream); > + else > + spsx->state->first = false; > + } > + return spsx; > +} > + > +static void > +sexp_begin_aggregate(struct sprinter *sp, aggregate_t type) > +{ > + struct sprinter_sexp *spsx = (struct sprinter_sexp*)sp; > + struct sexp_state *state = talloc (spsx, struct sexp_state); > + > + fputc ('(', spsx->stream); > + state->parent = spsx->state; > + state->first = true; > + spsx->state = state; > + state->type = type; > +} > + > +static void > +sexp_begin_map(struct sprinter *sp) > +{ > + sexp_begin_aggregate (sp, MAP); > +} > + > +static void > +sexp_begin_list(struct sprinter *sp) > +{ > + sexp_begin_aggregate (sp, LIST); > +} > + > +static void > +sexp_end(struct sprinter *sp) > +{ > + struct sprinter_sexp *spsx = (struct sprinter_sexp*)sp; > + struct sexp_state *state = spsx->state; > + > + fputc (')', spsx->stream); > + spsx->state = state->parent; > + talloc_free (state); > + if(spsx->state == NULL) > + fputc ('\n', spsx->stream); > +} > + > +static void > +sexp_string(struct sprinter *sp, const char *val) > +{ > + static const char * const escapes[] = { > + ['\"'] = "\\\"", ['\\'] = "\\\\", ['\b'] = "\\b", > + ['\f'] = "\\f", ['\n'] = "\\n", ['\t'] = "\\t" > + }; > + struct sprinter_sexp *spsx = sexp_begin_value(sp); > + fputc ('"', spsx->stream); > + for (; *val; ++val) { > + unsigned char ch = *val; > + if (ch < ARRAY_SIZE(escapes) && escapes[ch]) > + fputs (escapes[ch], spsx->stream); > + else if (ch >= 32) > + fputc (ch, spsx->stream); > + else > + fprintf (spsx->stream, "\\u%04x", ch); > + } > + fputc ('"', spsx->stream); > + if (spsx->state != NULL && spsx->state->type == MAP) > + fputc (')', spsx->stream); > + spsx->state->first = false; > +} > + > +static void > +sexp_integer(struct sprinter *sp, int val) > +{ > + struct sprinter_sexp *spsx = sexp_begin_value(sp); > + fprintf (spsx->stream, "%d", val); > + if (spsx->state != NULL && spsx->state->type == MAP) > + fputc (')', spsx->stream); > +} > + > +static void > +sexp_boolean(struct sprinter *sp, notmuch_bool_t val) > +{ > + struct sprinter_sexp *spsx = sexp_begin_value(sp); > + fputs (val ? "#t" : "#f", spsx->stream); > + if (spsx->state != NULL && spsx->state->type == MAP) > + fputc (')', spsx->stream); > +} > + > +static void > +sexp_null(struct sprinter *sp) > +{ > + struct sprinter_sexp *spsx = sexp_begin_value(sp); > + fputs ("'()", spsx->stream); > + spsx->state->first = false; > +} > + > +static void > +sexp_map_key(struct sprinter *sp, const char *key) > +{ > + struct sprinter_sexp *spsx = sexp_begin_value(sp); > + fputc ('(', spsx->stream); > + fputs (key, spsx->stream); > + fputs (" . ", spsx->stream); > + spsx->state->first = true; > +} > + > +static void > +sexp_frame(struct sprinter *sp) > +{ > + struct sprinter_sexp *spsx = (struct sprinter_sexp*)sp; > + fputc ('\n', spsx->stream); > +} > + > +struct sprinter * > +sprinter_sexp_new(const void *ctx, FILE *stream) > +{ > + static const struct sprinter_sexp template = { > + .vtable = { > + .begin_map = sexp_begin_map, > + .begin_list = sexp_begin_list, > + .end = sexp_end, > + .string = sexp_string, > + .integer = sexp_integer, > + .boolean = sexp_boolean, > + .null = sexp_null, > + .map_key = sexp_map_key, > + .frame = sexp_frame, > + } > + }; > + struct sprinter_sexp *res; > + > + res = talloc (ctx, struct sprinter_sexp); > + if (!res) > + return NULL; > + > + *res = template; > + res->stream = stream; > + return &res->vtable; > +} > diff --git a/sprinter.h b/sprinter.h > index 1dad9a0..a89eaa5 100644 > --- a/sprinter.h > +++ b/sprinter.h > @@ -40,6 +40,10 @@ typedef struct sprinter > struct sprinter * > sprinter_json_new(const void *ctx, FILE *stream); > > +/* Create a new structure printer that emits S-Expressions */ > +struct sprinter * > +sprinter_sexp_new(const void *ctx, FILE *stream); > + > /* A dummy structure printer that signifies that standard text output is > * to be used instead of any structured format. > */ > -- > 1.7.11.1