Re: [PATCH v5 2/9] parse-time-string: add a date/time parser to notmuch
author Austin Clements <amdragon@MIT.EDU>
Thu, 25 Oct 2012 18:58:16 +0000 (14:58 -0400)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 17:50:02 +0000 (09:50 -0800)
db/f235d0a66c23c5fea0a8f78ff756a507d4ff83 [new file with mode: 0644]

diff --git a/db/f235d0a66c23c5fea0a8f78ff756a507d4ff83 b/db/f235d0a66c23c5fea0a8f78ff756a507d4ff83
new file mode 100644 (file)
index 0000000..a6591df
--- /dev/null
@@ -0,0 +1,108 @@
+Return-Path: <amdragon@mit.edu>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 9F864431FCB\r
+       for <notmuch@notmuchmail.org>; Thu, 25 Oct 2012 11:58:20 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.7\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5\r
+       tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id Zd8ZJHWfDQc7 for <notmuch@notmuchmail.org>;\r
+       Thu, 25 Oct 2012 11:58:20 -0700 (PDT)\r
+Received: from dmz-mailsec-scanner-7.mit.edu (DMZ-MAILSEC-SCANNER-7.MIT.EDU\r
+       [18.7.68.36])\r
+       by olra.theworths.org (Postfix) with ESMTP id 15149431FAE\r
+       for <notmuch@notmuchmail.org>; Thu, 25 Oct 2012 11:58:20 -0700 (PDT)\r
+X-AuditID: 12074424-b7fce6d000000925-18-50898bcb02c7\r
+Received: from mailhub-auth-1.mit.edu ( [18.9.21.35])\r
+       by dmz-mailsec-scanner-7.mit.edu (Symantec Messaging Gateway) with SMTP\r
+       id 96.F9.02341.BCB89805; Thu, 25 Oct 2012 14:58:19 -0400 (EDT)\r
+Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])\r
+       by mailhub-auth-1.mit.edu (8.13.8/8.9.2) with ESMTP id q9PIwIia012687; \r
+       Thu, 25 Oct 2012 14:58:19 -0400\r
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])\r
+       (authenticated bits=0)\r
+       (User authenticated as amdragon@ATHENA.MIT.EDU)\r
+       by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q9PIwGQ9019986\r
+       (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT);\r
+       Thu, 25 Oct 2012 14:58:18 -0400 (EDT)\r
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77)\r
+       (envelope-from <amdragon@MIT.EDU>)\r
+       id 1TRSdE-0004NP-Iu; Thu, 25 Oct 2012 14:58:16 -0400\r
+Date: Thu, 25 Oct 2012 14:58:16 -0400\r
+From: Austin Clements <amdragon@MIT.EDU>\r
+To: Jani Nikula <jani@nikula.org>\r
+Subject: Re: [PATCH v5 2/9] parse-time-string: add a date/time parser to\r
+       notmuch\r
+Message-ID: <20121025185816.GX14861@mit.edu>\r
+References: <cover.1350854171.git.jani@nikula.org>\r
+       <a90d3b687895a26f765539d6c0420038a74ee42f.1350854171.git.jani@nikula.org>\r
+       <20121022081444.GM14861@mit.edu>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=us-ascii\r
+Content-Disposition: inline\r
+In-Reply-To: <20121022081444.GM14861@mit.edu>\r
+User-Agent: Mutt/1.5.21 (2010-09-15)\r
+X-Brightmail-Tracker:\r
+ H4sIAAAAAAAAA+NgFmpmleLIzCtJLcpLzFFi42IR4hRV1j3d3RlgsP+PpkXTdGeL6zdnMjsw\r
+       edy6/5rd49mqW8wBTFFcNimpOZllqUX6dglcGV39DcwF13grZs3ewNrA+Jeri5GTQ0LARGLq\r
+       hT2sELaYxIV769m6GLk4hAT2MUocPPaQGcLZwCixYe4ZFgjnJJNE/85fbCAtQgJLGCVubhUH\r
+       sVkEVCVWNL1lB7HZBDQktu1fzghiiwgoSmw+uR/MZhaQlvj2u5kJxBYWCJL49f4uC4jNK6Aj\r
+       8an9JDvEgoWMErc2tbFCJAQlTs58wgLRrCVx499LoGYOsEHL/3GAhDkFdCVWTm0AKxEVUJGY\r
+       cnIb2wRGoVlIumch6Z6F0L2AkXkVo2xKbpVubmJmTnFqsm5xcmJeXmqRrrlebmaJXmpK6SZG\r
+       UFizu6jsYGw+pHSIUYCDUYmHNyKlM0CINbGsuDL3EKMkB5OSKO+CWqAQX1J+SmVGYnFGfFFp\r
+       TmrxIUYJDmYlEd7jxUA53pTEyqrUonyYlDQHi5I47/WUm/5CAumJJanZqakFqUUwWRkODiUJ\r
+       3uWtQI2CRanpqRVpmTklCGkmDk6Q4TxAw++B1PAWFyTmFmemQ+RPMSpKifPOBkkIgCQySvPg\r
+       emFp5xWjONArwrzbQap4gCkLrvsV0GAmoMFirGCDSxIRUlINjDuCJ20Q1Ck2/Pbt15x7q7rF\r
+       Z+5NEb21cdOnNWVitj77eo5sDHg5bzlDZOD78/K8GQtDZX7P1X+U9DHDy271h7XsXK/mFX7I\r
+       OT1h/8X+D9uPhTJtahd4WCirYPFx7pYzrR+tC9KsX1ScX/Ph0VNd4xOTXmYnbWi/3fHs3nv+\r
+       2h3JU3wlFyY+kl2lxFKckWioxVxUnAgAV/o5exYDAAA=\r
+Cc: notmuch@notmuchmail.org\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Thu, 25 Oct 2012 18:58:20 -0000\r
+\r
+Quoth myself on Oct 22 at  4:14 am:\r
+> Overall this looks pretty good to me, and I must say, this parser is\r
+> amazingly flexible and copes well with a remarkably hostile grammar.\r
+> \r
+> A lot of little comments below (sorry if any of this ground has\r
+> already been covered in the previous four versions).\r
+> \r
+> I do have one broad comment.  While I'm all for ad hoc parsers for ad\r
+> hoc grammars like dates, there is one piece of the literature I think\r
+> this parser suffers for by ignoring: tokenizing.  I think it would\r
+> simplify a lot of this code if it did a tokenizing pass before the\r
+> parsing pass.  It doesn't have to be a serious tokenizer with\r
+> streaming and keywords and token types and junk; just something that\r
+> first splits the input into substrings, possibly just non-overlapping\r
+> matches of [[:digit:]]+|[[:alpha:]]+|[-+:/.].  This would simplify the\r
+> handling of postponed numbers because, with trivial lookahead in the\r
+> token stream, you wouldn't have to postpone them.  Likewise, it would\r
+> eliminate last_field.  It would simplify keyword matching because you\r
+> wouldn't have to worry about matching substrings (I spent a long time\r
+> staring at that code before I figured out what it would and wouldn't\r
+> accept).  Most important, I think it would make the parser more\r
+> predictable for users; for example, the parser currently accepts\r
+> things like "saturtoday" because it's aggressively single-pass.\r
+\r
+I should add that I am not at all opposed to this patch as it is\r
+currently designed.  We need a date parser.  My comment about\r
+separating tokenization only points out one way this code could\r
+probably be simplified, if someone were so inclined or if simplifying\r
+the code would help it clear any hurdles.\r
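
The tokenizing pass suggested in the quoted text can be sketched roughly as follows. This is only an illustration in Python, not code from the patch or from notmuch: the names `TOKEN_RE`, `tokenize`, `KEYWORDS`, and `is_keyword` are hypothetical, the keyword table is a two-entry stand-in, and the POSIX classes from the message are replaced by ASCII ranges because Python's `re` module does not support `[[:digit:]]` or `[[:alpha:]]`.

```python
import re

# ASCII stand-in for [[:digit:]]+|[[:alpha:]]+|[-+:/.] from the message.
# Anything the pattern does not match (whitespace, commas, ...) is skipped.
TOKEN_RE = re.compile(r"[0-9]+|[A-Za-z]+|[-+:/.]")


def tokenize(text):
    """Return the non-overlapping token matches in text, left to right."""
    return TOKEN_RE.findall(text)


# Hypothetical keyword table, for illustration only.
KEYWORDS = {"saturday", "today"}


def is_keyword(token):
    """Match a whole token against the table, ignoring case.

    Because a run of letters is always one token, a run-together word
    such as "saturtoday" cannot match keyword by keyword.
    """
    return token.lower() in KEYWORDS
```

With this split, `tokenize("12:30 oct-25")` gives `['12', ':', '30', 'oct', '-', '25']`, so a parser can look one token ahead instead of postponing numbers, and `tokenize("saturtoday")` gives the single token `['saturtoday']`, which the whole-token keyword check rejects.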