Re: [PATCH v5 2/9] parse-time-string: add a date/time parser to notmuch

author Austin Clements <amdragon@MIT.EDU>

Mon, 22 Oct 2012 08:14:44 +0000 (04:14 +2000)

committer W. Trevor King <wking@tremily.us>

Fri, 7 Nov 2014 17:49:58 +0000 (09:49 -0800)
author Austin Clements <amdragon@MIT.EDU>
Mon, 22 Oct 2012 08:14:44 +0000 (04:14 +2000)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 17:49:58 +0000 (09:49 -0800)
diff --git a/1f/92c0c1fdfbab359c7dbf4e9fb057766d1183df b/1f/92c0c1fdfbab359c7dbf4e9fb057766d1183df

new file mode 100644 (file)

index 0000000..71e765d
--- /dev/null
+++ b/1f/92c0c1fdfbab359c7dbf4e9fb057766d1183df
@@ -0,0 +1,1926 @@
+Return-Path: <amdragon@mit.edu>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 7B8FC431FB6\r
+       for <notmuch@notmuchmail.org>; Mon, 22 Oct 2012 01:14:55 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.7\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5\r
+       tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id NRHArV8rEDeK for <notmuch@notmuchmail.org>;\r
+       Mon, 22 Oct 2012 01:14:51 -0700 (PDT)\r
+Received: from dmz-mailsec-scanner-4.mit.edu (DMZ-MAILSEC-SCANNER-4.MIT.EDU\r
+       [18.9.25.15])\r
+       by olra.theworths.org (Postfix) with ESMTP id 2860C431FAE\r
+       for <notmuch@notmuchmail.org>; Mon, 22 Oct 2012 01:14:51 -0700 (PDT)\r
+X-AuditID: 1209190f-b7f636d00000095b-5a-508500781c92\r
+Received: from mailhub-auth-4.mit.edu ( [18.7.62.39])\r
+       by dmz-mailsec-scanner-4.mit.edu (Symantec Messaging Gateway) with SMTP\r
+       id CE.F4.02395.87005805; Mon, 22 Oct 2012 04:14:48 -0400 (EDT)\r
+Received: from outgoing.mit.edu (OUTGOING-AUTH.MIT.EDU [18.7.22.103])\r
+       by mailhub-auth-4.mit.edu (8.13.8/8.9.2) with ESMTP id q9M8El5W017881; \r
+       Mon, 22 Oct 2012 04:14:48 -0400\r
+Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91])\r
+       (authenticated bits=0)\r
+       (User authenticated as amdragon@ATHENA.MIT.EDU)\r
+       by outgoing.mit.edu (8.13.6/8.12.4) with ESMTP id q9M8Ej6M021444\r
+       (version=TLSv1/SSLv3 cipher=AES256-SHA bits=256 verify=NOT);\r
+       Mon, 22 Oct 2012 04:14:47 -0400 (EDT)\r
+Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.77)\r
+       (envelope-from <amdragon@mit.edu>)\r
+       id 1TQD9p-0003T9-D0; Mon, 22 Oct 2012 04:14:45 -0400\r
+Date: Mon, 22 Oct 2012 04:14:44 -0400\r
+From: Austin Clements <amdragon@MIT.EDU>\r
+To: Jani Nikula <jani@nikula.org>\r
+Subject: Re: [PATCH v5 2/9] parse-time-string: add a date/time parser to\r
+       notmuch\r
+Message-ID: <20121022081444.GM14861@mit.edu>\r
+References: <cover.1350854171.git.jani@nikula.org>\r
+       <a90d3b687895a26f765539d6c0420038a74ee42f.1350854171.git.jani@nikula.org>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=iso-8859-1\r
+Content-Disposition: inline\r
+Content-Transfer-Encoding: 8bit\r
+In-Reply-To:\r
+ <a90d3b687895a26f765539d6c0420038a74ee42f.1350854171.git.jani@nikula.org>\r
+User-Agent: Mutt/1.5.21 (2010-09-15)\r
+X-Brightmail-Tracker:\r
+ H4sIAAAAAAAAA+NgFlrOKsWRmVeSWpSXmKPExsUixG6nrlvB0Bpg0HLZzKJpurPF9ZszmR2Y\r
+       PG7df83u8WzVLeYApigum5TUnMyy1CJ9uwSujBVnjrMXfN/PXLHt+wSWBsYTp5m6GDk4JARM\r
+       JK4usu5i5AQyxSQu3FvP1sXIxSEksI9R4uzkxawQzgZGiT+//rBAOCeZJG7ePc4E4SxhlJj/\r
+       4ToLyCgWAVWJoz1JIKPYBDQktu1fzghiiwgoSmw+uR/MZhaQlvj2u5kJxBYWCJL49f4uC4jN\r
+       K6AjcfXsPjBbSKBO4vvES6wQcUGJkzOfsED06kjs3HqHDWQVyJzl/zggwvISzVtnM4PYnAJh\r
+       EpdONbOD2KICKhJTTm5jm8AoPAvJpFlIJs1CmDQLyaQFjCyrGGVTcqt0cxMzc4pTk3WLkxPz\r
+       8lKLdE30cjNL9FJTSjcxguKAU5J/B+O3g0qHGAU4GJV4eBuutwQIsSaWFVfmHmKU5GBSEuW9\r
+       9wcoxJeUn1KZkVicEV9UmpNafIhRgoNZSYT3VDxQjjclsbIqtSgfJiXNwaIkzns15aa/kEB6\r
+       YklqdmpqQWoRTFaGg0NJgrflP1CjYFFqempFWmZOCUKaiYMTZDgP0PDZIDW8xQWJucWZ6RD5\r
+       U4y6HFf+zX3IKMSSl5+XKiXOuwakSACkKKM0D24OLH29YhQHekuYdw5IFQ8w9cFNegW0hAlo\r
+       iTl3I8iSkkSElFQD4/wjOzd8SrQy/5tRXZummSnUUHz7lF+QXsP+D8u37FY0bGNOmHRPPiO0\r
+       1ai/W41f7+nzNpHwmWuezfFcdrWqx3T3NtM1Ffw8Szeukwr1apgQnGrQ735v9/d97qffT37b\r
+       ueUBz7+Dfc1uml63P61we8frMV9G4PrZbcoHc+7aLWLON76i89yvSImlOCPRUIu5qDgRANxm\r
+       F5o6AwAA\r
+Cc: notmuch@notmuchmail.org\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Mon, 22 Oct 2012 08:14:55 -0000\r
+\r
+Overall this looks pretty good to me, and I must say, this parser is\r
+amazingly flexible and copes well with a remarkably hostile grammar.\r
+\r
+A lot of little comments below (sorry if any of this ground has\r
+already been covered in the previous four versions).\r
+\r
+I do have one broad comment.  While I'm all for ad hoc parsers for ad\r
+hoc grammars like dates, there is one piece of the literature I think\r
+this parser suffers for by ignoring: tokenizing.  I think it would\r
+simplify a lot of this code if it did a tokenizing pass before the\r
+parsing pass.  It doesn't have to be a serious tokenizer with\r
+streaming and keywords and token types and junk; just something that\r
+first splits the input into substrings, possibly just non-overlapping\r
+matches of [[:digit:]]+|[[:alpha:]]+|[-+:/.].  This would simplify the\r
+handling of postponed numbers because, with trivial lookahead in the\r
+token stream, you wouldn't have to postpone them.  Likewise, it would\r
+eliminate last_field.  It would simplify keyword matching because you\r
+wouldn't have to worry about matching substrings (I spent a long time\r
+staring at that code before I figured out what it would and wouldn't\r
+accept).  Most important, I think it would make the parser more\r
+predictable for users; for example, the parser currently accepts\r
+things like "saturtoday" because it's aggressively single-pass.\r
+\r
+Quoth Jani Nikula on Oct 22 at 12:22 am:\r
+> Add a date/time parser to notmuch, to be used for adding date range\r
+> query support for notmuch lib later on. Add the parser to a directory\r
+> of its own to make it independent of the rest of the notmuch code\r
+> base.\r
+> \r
+> Signed-off-by: Jani Nikula <jani@nikula.org>\r
+> ---\r
+>  Makefile                              |    2 +-\r
+>  parse-time-string/Makefile            |    5 +\r
+>  parse-time-string/Makefile.local      |   12 +\r
+>  parse-time-string/README              |    9 +\r
+>  parse-time-string/parse-time-string.c | 1477 +++++++++++++++++++++++++++++++++\r
+>  parse-time-string/parse-time-string.h |  102 +++\r
+>  6 files changed, 1606 insertions(+), 1 deletion(-)\r
+>  create mode 100644 parse-time-string/Makefile\r
+>  create mode 100644 parse-time-string/Makefile.local\r
+>  create mode 100644 parse-time-string/README\r
+>  create mode 100644 parse-time-string/parse-time-string.c\r
+>  create mode 100644 parse-time-string/parse-time-string.h\r
+> \r
+> diff --git a/Makefile b/Makefile\r
+> index e5e2e3a..bb9c316 100644\r
+> --- a/Makefile\r
+> +++ b/Makefile\r
+> @@ -3,7 +3,7 @@\r
+>  all:\r
+>  \r
+>  # List all subdirectories here. Each contains its own Makefile.local\r
+> -subdirs = compat completion emacs lib man util test\r
+> +subdirs = compat completion emacs lib man parse-time-string util test\r
+>  \r
+>  # We make all targets depend on the Makefiles themselves.\r
+>  global_deps = Makefile Makefile.config Makefile.local \\r
+> diff --git a/parse-time-string/Makefile b/parse-time-string/Makefile\r
+> new file mode 100644\r
+> index 0000000..fa25832\r
+> --- /dev/null\r
+> +++ b/parse-time-string/Makefile\r
+> @@ -0,0 +1,5 @@\r
+> +all:\r
+> +    $(MAKE) -C .. all\r
+> +\r
+> +.DEFAULT:\r
+> +    $(MAKE) -C .. $@\r
+> diff --git a/parse-time-string/Makefile.local b/parse-time-string/Makefile.local\r
+> new file mode 100644\r
+> index 0000000..53534f3\r
+> --- /dev/null\r
+> +++ b/parse-time-string/Makefile.local\r
+> @@ -0,0 +1,12 @@\r
+> +dir := parse-time-string\r
+> +extra_cflags += -I$(srcdir)/$(dir)\r
+> +\r
+> +libparse-time-string_c_srcs := $(dir)/parse-time-string.c\r
+> +\r
+> +libparse-time-string_modules := $(libparse-time-string_c_srcs:.c=.o)\r
+> +\r
+> +$(dir)/libparse-time-string.a: $(libparse-time-string_modules)\r
+> +    $(call quiet,AR) rcs $@ $^\r
+> +\r
+> +SRCS := $(SRCS) $(libparse-time-string_c_srcs)\r
+> +CLEAN := $(CLEAN) $(libparse-time-string_modules) $(dir)/libparse-time-string.a\r
+> diff --git a/parse-time-string/README b/parse-time-string/README\r
+> new file mode 100644\r
+> index 0000000..300ff1f\r
+> --- /dev/null\r
+> +++ b/parse-time-string/README\r
+> @@ -0,0 +1,9 @@\r
+> +PARSE TIME STRING\r
+> +=================\r
+> +\r
+> +parse_time_string() is a date/time parser originally written for\r
+> +notmuch by Jani Nikula <jani@nikula.org>. However, there is nothing\r
+> +notmuch specific in it, and it should be kept reusable for other\r
+> +projects, and ready to be packaged on its own as needed. Please do not\r
+> +add dependencies on or references to anything notmuch specific. The\r
+> +parser should only depend on the C library.\r
+> diff --git a/parse-time-string/parse-time-string.c b/parse-time-string/parse-time-string.c\r
+> new file mode 100644\r
+> index 0000000..942041a\r
+> --- /dev/null\r
+> +++ b/parse-time-string/parse-time-string.c\r
+> @@ -0,0 +1,1477 @@\r
+> +/*\r
+> + * parse time string - user friendly date and time parser\r
+> + * Copyright © 2012 Jani Nikula\r
+> + *\r
+> + * This program is free software: you can redistribute it and/or modify\r
+> + * it under the terms of the GNU General Public License as published by\r
+> + * the Free Software Foundation, either version 2 of the License, or\r
+> + * (at your option) any later version.\r
+> + *\r
+> + * This program is distributed in the hope that it will be useful,\r
+> + * but WITHOUT ANY WARRANTY; without even the implied warranty of\r
+> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\r
+> + * GNU General Public License for more details.\r
+> + *\r
+> + * You should have received a copy of the GNU General Public License\r
+> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.\r
+> + *\r
+> + * Author: Jani Nikula <jani@nikula.org>\r
+> + */\r
+> +\r
+> +#include <assert.h>\r
+> +#include <ctype.h>\r
+> +#include <errno.h>\r
+> +#include <limits.h>\r
+> +#include <stdio.h>\r
+> +#include <stdarg.h>\r
+> +#include <stdbool.h>\r
+> +#include <stdlib.h>\r
+> +#include <string.h>\r
+> +#include <strings.h>\r
+> +#include <time.h>\r
+> +#include <sys/time.h>\r
+> +#include <sys/types.h>\r
+> +\r
+> +#include "parse-time-string.h"\r
+> +\r
+> +/*\r
+> + * IMPLEMENTATION DETAILS\r
+> + *\r
+> + * At a high level, the parsing is done in two phases: 1) actual\r
+> + * parsing of the input string and storing the parsed data into\r
+> + * 'struct state', and 2) processing of the data in 'struct state'\r
+> + * according to current time (or provided reference time) and\r
+> + * rounding. This is evident in the main entry point function\r
+> + * parse_time_string().\r
+> + *\r
+> + * 1) The parsing phase - parse_input()\r
+> + *\r
+> + * Parsing is greedy and happens from left to right. The parsing is as\r
+> + * unambiguous as possible; only unambiguous date/time formats are\r
+> + * accepted. Redundant or contradictory absolute date/time in the\r
+> + * input (e.g. date specified multiple times/ways) is not\r
+> + * accepted. Relative date/time on the other hand just accumulates if\r
+> + * present multiple times (e.g. "5 days 5 days" just turns into 10\r
+> + * days).\r
+> + *\r
+> + * Parsing decisions are made on the input format, not value. For\r
+> + * example, "20/5/2005" fails because the recognized format here is\r
+> + * MM/D/YYYY, even though the values would suggest DD/M/YYYY.\r
+> + *\r
+> + * Parsing is mostly stateless in the sense that parsing decisions are\r
+> + * not made based on the values of previously parsed data, or whether\r
+> + * certain data is present in the first place. (There are a few\r
+> + * exceptions to the latter part, though, such as parsing of time zone\r
+> + * that would otherwise look like plain time.)\r
+> + *\r
+> + * When the parser encounters a number that is not greedily parsed as\r
+> + * part of a format, the interpretation is postponed until the next\r
+> + * token is parsed. The parser for the next token may consume the\r
+> + * previously postponed number. For example, when parsing "20 May" the\r
+> + * meaning of "20" is not known until "May" is parsed. If the parser\r
+> + * for the next token does not consume the postponed number, the\r
+> + * number is handled as a "lone" number before parser for the next\r
+> + * token finishes.\r
+> + *\r
+> + * 2) The processing phase - create_output()\r
+> + *\r
+> + * Once the parser in phase 1 has finished, 'struct state' contains\r
+> + * all the information from the input string, and it's no longer\r
+> + * needed. Since the parser does not even handle the concept of "now",\r
+> + * the processing initializes the fields referring to the current\r
+> + * date/time.\r
+> + *\r
+> + * If requested, the result is rounded towards past or future. The\r
+> + * idea behind rounding is to support parsing date/time ranges in an\r
+> + * obvious way. For example, for a range defined as two dates (without\r
+> + * time), one would typically want to have an inclusive range from the\r
+> + * beginning of start date to the end of the end date. The caller\r
+> + * would use rounding towards past in the start date, and towards\r
+> + * future in the end date.\r
+> + *\r
+> + * The absolute date and time is shifted by the relative date and\r
+> + * time, and time zone adjustments are made. Daylight saving time\r
+> + * (DST) is specifically *not* handled at all.\r
+> + *\r
+> + * Finally, the result is stored to time_t.\r
+> + */\r
+> +\r
+> +#define unused(x) x __attribute__ ((unused))\r
+> +\r
+> +/* XXX: Redefine these to add i18n support. The keyword table uses\r
+> + * N_() to mark strings to be translated; they are accessed\r
+> + * dynamically using _(). */\r
+> +#define _(s) (s)    /* i18n: define as gettext (s) */\r
+> +#define N_(s) (s)   /* i18n: define as gettext_noop (s) */\r
+> +\r
+> +#define ARRAY_SIZE(a) (sizeof (a) / sizeof (a[0]))\r
+> +\r
+> +/*\r
+> + * Field indices in the tm and set arrays of struct state.\r
+> + *\r
+> + * NOTE: There's some code that depends on the ordering of this enum.\r
+> + */\r
+> +enum field {\r
+> +    /* Keep SEC...YEAR in this order. */\r
+> +    TM_ABS_SEC,             /* seconds */\r
+> +    TM_ABS_MIN,             /* minutes */\r
+> +    TM_ABS_HOUR,    /* hours */\r
+> +    TM_ABS_MDAY,    /* day of the month */\r
+> +    TM_ABS_MON,             /* month */\r
+> +    TM_ABS_YEAR,    /* year */\r
+> +\r
+> +    TM_ABS_WDAY,    /* day of the week. special: may be relative */\r
+\r
+Given that this may be relative, should it really be called\r
+TM_ABS_WDAY?\r
+\r
+> +    TM_ABS_ISDST,   /* daylight saving time */\r
+> +\r
+> +    TM_AMPM,                /* am vs. pm */\r
+> +    TM_TZ,          /* timezone in minutes */\r
+> +\r
+> +    /* Keep SEC...YEAR in this order. */\r
+> +    TM_REL_SEC,             /* seconds relative to absolute or reference time */\r
+> +    TM_REL_MIN,             /* minutes ... */\r
+> +    TM_REL_HOUR,    /* hours ... */\r
+> +    TM_REL_DAY,             /* days ... */\r
+> +    TM_REL_MON,             /* months ... */\r
+> +    TM_REL_YEAR,    /* years ... */\r
+> +    TM_REL_WEEK,    /* weeks ... */\r
+> +\r
+> +    TM_NONE,                /* not a field */\r
+> +\r
+> +    TM_SIZE = TM_NONE,\r
+> +    TM_FIRST_ABS = TM_ABS_SEC,\r
+> +    TM_FIRST_REL = TM_REL_SEC,\r
+> +};\r
+> +\r
+> +/* Values for the set array of struct state. */\r
+> +enum field_set {\r
+> +    FIELD_UNSET,    /* The field has not been touched by parser. */\r
+> +    FIELD_SET,              /* The field has been set by parser. */\r
+> +    FIELD_NOW,              /* The field will be set to reference time. */\r
+> +};\r
+> +\r
+> +static enum field\r
+> +next_abs_field (enum field field)\r
+> +{\r
+> +    /* NOTE: Depends on the enum ordering. */\r
+> +    return field < TM_ABS_YEAR ? field + 1 : TM_NONE;\r
+> +}\r
+> +\r
+> +static enum field\r
+> +abs_to_rel_field (enum field field)\r
+> +{\r
+> +    assert (field <= TM_ABS_YEAR);\r
+> +\r
+> +    /* NOTE: Depends on the enum ordering. */\r
+> +    return field + (TM_FIRST_REL - TM_FIRST_ABS);\r
+> +}\r
+> +\r
+> +/* Get epoch value for field. */\r
+\r
+Explain what an "epoch value" for a field is.\r
+\r
+> +static int\r
+> +field_epoch (enum field field)\r
+> +{\r
+> +    if (field == TM_ABS_MDAY || field == TM_ABS_MON)\r
+> +    return 1;\r
+> +    else if (field == TM_ABS_YEAR)\r
+> +    return 1970;\r
+> +    else\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* The parsing state. */\r
+> +struct state {\r
+> +    int tm[TM_SIZE];                        /* parsed date and time */\r
+> +    enum field_set set[TM_SIZE];    /* set status of tm */\r
+> +\r
+> +    enum field last_field;  /* Previously set field. */\r
+> +    char delim;\r
+> +\r
+> +    int postponed_length;   /* Number of digits in postponed value. */\r
+> +    int postponed_value;\r
+> +    char postponed_delim;   /* The delimiter preceding postponed number. */\r
+> +};\r
+> +\r
+> +/*\r
+> + * Helpers for postponed numbers.\r
+> + *\r
+> + * postponed_length is the number of digits in postponed value. 0\r
+> + * means there is no postponed number. -1 means there is a postponed\r
+> + * number, but it comes from a keyword, and it doesn't have digits.\r
+> + */\r
+> +static int\r
+> +get_postponed_length (struct state *state)\r
+> +{\r
+> +    return state->postponed_length;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Consume a previously postponed number. Return true if a number was\r
+> + * in fact postponed, false otherwise. Store the postponed number's\r
+> + * value in *v, length in the input string in *n (or -1 if the number\r
+> + * was written out and parsed as a keyword), and the preceding\r
+> + * delimiter to *d.\r
+\r
+Mention that v, n, and d are unchanged if no number is postponed?  You\r
+exploit this for default values elsewhere in the code.\r
+\r
+> + */\r
+> +static bool\r
+> +get_postponed_number (struct state *state, int *v, int *n, char *d)\r
+\r
+Maybe "consume_postponed_number" to emphasize that this function has\r
+side-effects (and isn't simply a "getter")?\r
+\r
+> +{\r
+> +    if (!state->postponed_length)\r
+> +    return false;\r
+> +\r
+> +    if (n)\r
+> +    *n = state->postponed_length;\r
+> +\r
+> +    if (v)\r
+> +    *v = state->postponed_value;\r
+> +\r
+> +    if (d)\r
+> +    *d = state->postponed_delim;\r
+> +\r
+> +    state->postponed_length = 0;\r
+> +    state->postponed_value = 0;\r
+> +    state->postponed_delim = 0;\r
+> +\r
+> +    return true;\r
+> +}\r
+> +\r
+> +static int parse_postponed_number (struct state *state, enum field next_field);\r
+> +\r
+> +/*\r
+> + * Postpone a number to be handled later. If one exists already,\r
+> + * handle it first. n may be -1 to indicate a keyword that has no\r
+> + * number length.\r
+> + */\r
+> +static int\r
+> +set_postponed_number (struct state *state, int v, int n)\r
+> +{\r
+> +    int r;\r
+> +    char d = state->delim;\r
+> +\r
+> +    /* Parse a previously postponed number, if any. */\r
+> +    r = parse_postponed_number (state, TM_NONE);\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    state->postponed_length = n;\r
+> +    state->postponed_value = v;\r
+> +    state->postponed_delim = d;\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +static void\r
+> +set_delim (struct state *state, char delim)\r
+> +{\r
+> +    state->delim = delim;\r
+> +}\r
+> +\r
+> +static void\r
+> +unset_delim (struct state *state)\r
+> +{\r
+> +    state->delim = 0;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Field set/get/mod helpers.\r
+> + */\r
+> +\r
+> +/* Return true if field has been set. */\r
+> +static bool\r
+> +is_field_set (struct state *state, enum field field)\r
+> +{\r
+> +    assert (field < ARRAY_SIZE (state->tm));\r
+> +\r
+> +    return field < ARRAY_SIZE (state->set) &&\r
+\r
+state->tm and state->set are the same size, so this will always by\r
+true given that the assert hasn't fired.  Is this just defensive\r
+programming?\r
+\r
+> +       state->set[field] != FIELD_UNSET;\r
+> +}\r
+> +\r
+> +static void\r
+> +unset_field (struct state *state, enum field field)\r
+> +{\r
+> +    assert (field < ARRAY_SIZE (state->tm));\r
+> +\r
+> +    state->set[field] = FIELD_UNSET;\r
+> +    state->tm[field] = 0;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Set field to value. A field can only be set once to ensure the\r
+> + * input does not contain redundant and potentially conflicting data.\r
+> + */\r
+> +static int\r
+> +set_field (struct state *state, enum field field, int value)\r
+> +{\r
+> +    int r;\r
+> +\r
+> +    assert (field < ARRAY_SIZE (state->tm));\r
+> +\r
+> +    /* Fields can only be set once. */\r
+> +    if (field < ARRAY_SIZE (state->set) && state->set[field] != FIELD_UNSET)\r
+\r
+Same comment about array sizes.  Also, this should probably call\r
+is_field_set instead of open-coding it (which would make the array\r
+size check even more redundant!)\r
+\r
+> +    return -PARSE_TIME_ERR_ALREADYSET;\r
+> +\r
+> +    state->set[field] = FIELD_SET;\r
+> +\r
+> +    /* Parse a previously postponed number, if any. */\r
+> +    r = parse_postponed_number (state, field);\r
+\r
+I don't understand the big picture with postponed number handling yet,\r
+but is it worth mentioning in this function's doc comment that it\r
+processes postponed numbers?\r
+\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    unset_delim (state);\r
+> +\r
+> +    state->tm[field] = value;\r
+> +    state->last_field = field;\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Mark n fields in fields to be set to the reference date/time in the\r
+> + * specified time zone, or local timezone if not specified. The fields\r
+> + * will be initialized after parsing is complete and timezone is\r
+> + * known.\r
+> + */\r
+> +static int\r
+> +set_fields_to_now (struct state *state, enum field *fields, size_t n)\r
+> +{\r
+> +    size_t i;\r
+> +    int r;\r
+> +\r
+> +    for (i = 0; i < n; i++) {\r
+> +    r = set_field (state, fields[i], 0);\r
+> +    if (r)\r
+> +        return r;\r
+> +    state->set[fields[i]] = FIELD_NOW;\r
+> +    }\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Modify field by adding value to it. To be used on relative fields,\r
+> + * which can be modified multiple times (to accumulate). */\r
+> +static int\r
+> +mod_field (struct state *state, enum field field, int value)\r
+\r
+add_to_field?\r
+\r
+> +{\r
+> +    int r;\r
+> +\r
+> +    assert (field < ARRAY_SIZE (state->tm));   /* assert relative??? */\r
+> +\r
+> +    if (field < ARRAY_SIZE (state->set))\r
+\r
+Another redundant check?\r
+\r
+> +    state->set[field] = FIELD_SET;\r
+> +\r
+> +    /* Parse a previously postponed number, if any. */\r
+> +    r = parse_postponed_number (state, field);\r
+\r
+This postponed number stuff is getting really confusing...\r
+\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    unset_delim (state);\r
+> +\r
+> +    state->tm[field] += value;\r
+> +    state->last_field = field;\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Get field value. Make sure the field is set before query. It's most\r
+> + * likely an error to call this while parsing (for example fields set\r
+> + * as FIELD_NOW will only be set to some value after parsing).\r
+> + */\r
+> +static int\r
+> +get_field (struct state *state, enum field field)\r
+> +{\r
+> +    assert (field < ARRAY_SIZE (state->tm));\r
+\r
+Assert that the field is set?\r
+\r
+> +\r
+> +    return state->tm[field];\r
+> +}\r
+> +\r
+> +/*\r
+> + * Validity checkers.\r
+> + */\r
+> +static bool is_valid_12hour (int h)\r
+> +{\r
+> +    return h >= 0 && h <= 12;\r
+\r
+h >= 1?\r
+\r
+> +}\r
+> +\r
+> +static bool is_valid_time (int h, int m, int s)\r
+> +{\r
+> +    /* Allow 24:00:00 to denote end of day. */\r
+> +    if (h == 24 && m == 0 && s == 0)\r
+> +    return true;\r
+> +\r
+> +    return h >= 0 && h <= 23 && m >= 0 && m <= 59 && s >= 0 && s <= 59;\r
+> +}\r
+> +\r
+> +static bool is_valid_mday (int mday)\r
+> +{\r
+> +    return mday >= 1 && mday <= 31;\r
+> +}\r
+> +\r
+> +static bool is_valid_mon (int mon)\r
+> +{\r
+> +    return mon >= 1 && mon <= 12;\r
+> +}\r
+> +\r
+> +static bool is_valid_year (int year)\r
+> +{\r
+> +    return year >= 1970;\r
+> +}\r
+> +\r
+> +static bool is_valid_date (int year, int mon, int mday)\r
+> +{\r
+> +    return is_valid_year (year) && is_valid_mon (mon) && is_valid_mday (mday);\r
+> +}\r
+> +\r
+> +/* Unset indicator for time and date set helpers. */\r
+> +#define UNSET -1\r
+> +\r
+> +/* Time set helper. No input checking. Use UNSET (-1) to leave unset. */\r
+> +static int\r
+> +set_abs_time (struct state *state, int hour, int min, int sec)\r
+> +{\r
+> +    int r;\r
+> +\r
+> +    if (hour != UNSET) {\r
+> +    if ((r = set_field (state, TM_ABS_HOUR, hour)))\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    if (min != UNSET) {\r
+> +    if ((r = set_field (state, TM_ABS_MIN, min)))\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    if (sec != UNSET) {\r
+> +    if ((r = set_field (state, TM_ABS_SEC, sec)))\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Date set helper. No input checking. Use UNSET (-1) to leave unset. */\r
+> +static int\r
+> +set_abs_date (struct state *state, int year, int mon, int mday)\r
+> +{\r
+> +    int r;\r
+> +\r
+> +    if (year != UNSET) {\r
+> +    if ((r = set_field (state, TM_ABS_YEAR, year)))\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    if (mon != UNSET) {\r
+> +    if ((r = set_field (state, TM_ABS_MON, mon)))\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    if (mday != UNSET) {\r
+> +    if ((r = set_field (state, TM_ABS_MDAY, mday)))\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Keyword parsing and handling.\r
+> + */\r
+> +struct keyword;\r
+> +typedef int (*setter_t)(struct state *state, struct keyword *kw);\r
+> +\r
+> +struct keyword {\r
+> +    const char *name;       /* keyword */\r
+> +    enum field field;       /* field to set, or FIELD_NONE if N/A */\r
+> +    int value;              /* value to set, or 0 if N/A */\r
+> +    setter_t set;   /* function to use for setting, if non-NULL */\r
+> +};\r
+> +\r
+> +/*\r
+> + * Setter callback functions for keywords.\r
+> + */\r
+> +static int\r
+> +kw_set_default (struct state *state, struct keyword *kw)\r
+\r
+It took me a while to figure out what the name of this had to do with\r
+the action it performs, then I realized that it's never used in the\r
+table and only called when set is NULL.  Given that, I think it would\r
+make more sense to just put the set_field call in place of the one\r
+current call to kw_set_default.  Currently, this seems like one\r
+indirection too much.\r
+\r
+> +{\r
+> +    return set_field (state, kw->field, kw->value);\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_rel (struct state *state, struct keyword *kw)\r
+> +{\r
+> +    int multiplier = 1;\r
+> +\r
+> +    /* Get a previously set multiplier, if any. */\r
+> +    get_postponed_number (state, &multiplier, NULL, NULL);\r
+> +\r
+> +    /* Accumulate relative field values. */\r
+> +    return mod_field (state, kw->field, multiplier * kw->value);\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_number (struct state *state, struct keyword *kw)\r
+> +{\r
+> +    /* -1 = no length, from keyword. */\r
+> +    return set_postponed_number (state, kw->value, -1);\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_month (struct state *state, struct keyword *kw)\r
+> +{\r
+> +    int n = get_postponed_length (state);\r
+> +\r
+> +    /* Consume postponed number if it could be mday. This handles "20\r
+> +     * January". */\r
+> +    if (n == 1 || n == 2) {\r
+\r
+Should this be (n && is_valid_mday (state->postponed_value))?  It\r
+seems a little odd that postponed numbers three digits or longer are\r
+treated as independent, but two digits numbers > 31 are an error.\r
+\r
+> +    int r, v;\r
+> +\r
+> +    get_postponed_number (state, &v, NULL, NULL);\r
+> +\r
+> +    if (!is_valid_mday (v))\r
+> +        return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +    r = set_field (state, TM_ABS_MDAY, v);\r
+> +    if (r)\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    return set_field (state, kw->field, kw->value);\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_ampm (struct state *state, struct keyword *kw)\r
+> +{\r
+> +    int n = get_postponed_length (state);\r
+> +\r
+> +    /* Consume postponed number if it could be hour. This handles\r
+> +     * "5pm". */\r
+> +    if (n == 1 || n == 2) {\r
+\r
+Same comment as for kw_set_month.\r
+\r
+> +    int r, v;\r
+> +\r
+> +    get_postponed_number (state, &v, NULL, NULL);\r
+> +\r
+> +    if (!is_valid_12hour (v))\r
+> +        return -PARSE_TIME_ERR_INVALIDTIME;\r
+> +\r
+> +    r = set_abs_time (state, v, 0, 0);\r
+> +    if (r)\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    return set_field (state, kw->field, kw->value);\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_timeofday (struct state *state, struct keyword *kw)\r
+> +{\r
+> +    return set_abs_time (state, kw->value, 0, 0);\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_today (struct state *state, unused (struct keyword *kw))\r
+> +{\r
+> +    enum field fields[] = { TM_ABS_YEAR, TM_ABS_MON, TM_ABS_MDAY };\r
+> +\r
+> +    return set_fields_to_now (state, fields, ARRAY_SIZE (fields));\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_now (struct state *state, unused (struct keyword *kw))\r
+> +{\r
+> +    enum field fields[] = { TM_ABS_HOUR, TM_ABS_MIN, TM_ABS_SEC };\r
+> +\r
+> +    return set_fields_to_now (state, fields, ARRAY_SIZE (fields));\r
+> +}\r
+> +\r
+> +static int\r
+> +kw_set_ordinal (struct state *state, struct keyword *kw)\r
+> +{\r
+> +    int n, v;\r
+> +\r
+> +    /* Require a postponed number. */\r
+> +    if (!get_postponed_number (state, &v, &n, NULL))\r
+> +    return -PARSE_TIME_ERR_DATEFORMAT;\r
+> +\r
+> +    /* Ordinals are mday. */\r
+> +    if (n != 1 && n != 2)\r
+\r
+Is this redundant with your is_valid_mday test below?\r
+\r
+> +    return -PARSE_TIME_ERR_DATEFORMAT;\r
+> +\r
+> +    /* Be strict about st, nd, rd, and lax about th. */\r
+> +    if (strcasecmp (kw->name, "st") == 0 && v != 1 && v != 21 && v != 31)\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +    else if (strcasecmp (kw->name, "nd") == 0 && v != 2 && v != 22)\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +    else if (strcasecmp (kw->name, "rd") == 0 && v != 3 && v != 23)\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +    else if (strcasecmp (kw->name, "th") == 0 && !is_valid_mday (v))\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +    return set_field (state, TM_ABS_MDAY, v);\r
+> +}\r
+> +\r
+> +/*\r
+> + * Accepted keywords.\r
+> + *\r
+> + * A keyword may optionally contain a '|' to indicate the minimum\r
+> + * match length. Without one, full match is required. It's advisable\r
+> + * to keep the minimum match parts unique across all keywords.\r
+> + *\r
+> + * If keyword begins with upper case letter, then the matching will be\r
+> + * case sensitive. Otherwise the matching is case insensitive.\r
+> + *\r
+> + * If setter is NULL, set_default will be used.\r
+> + *\r
+> + * Note: Order matters. Matching is greedy, longest match is used, but\r
+> + * of equal length matches the first one is used, unless there's an\r
+> + * equal length case sensitive match which trumps case insensitive\r
+> + * matches.\r
+\r
+If you do have a tokenizer (or disallow mashing keywords together),\r
+then all of complexity arising from longest match goes away because\r
+the keyword token either will or won't match a keyword.  If you also\r
+eliminate the rule for case sensitivity and put case-sensitive things\r
+before conflicting case-insensitive things (so put "M" before\r
+"m|inutes"), then you can simply use the first match.\r
+\r
+> + */\r
+> +static struct keyword keywords[] = {\r
+> +    /* Weekdays. */\r
+> +    { N_("sun|day"),        TM_ABS_WDAY,    0,      NULL },\r
+> +    { N_("mon|day"),        TM_ABS_WDAY,    1,      NULL },\r
+> +    { N_("tue|sday"),       TM_ABS_WDAY,    2,      NULL },\r
+> +    { N_("wed|nesday"),     TM_ABS_WDAY,    3,      NULL },\r
+> +    { N_("thu|rsday"),      TM_ABS_WDAY,    4,      NULL },\r
+> +    { N_("fri|day"),        TM_ABS_WDAY,    5,      NULL },\r
+> +    { N_("sat|urday"),      TM_ABS_WDAY,    6,      NULL },\r
+> +\r
+> +    /* Months. */\r
+> +    { N_("jan|uary"),       TM_ABS_MON,     1,      kw_set_month },\r
+> +    { N_("feb|ruary"),      TM_ABS_MON,     2,      kw_set_month },\r
+> +    { N_("mar|ch"), TM_ABS_MON,     3,      kw_set_month },\r
+> +    { N_("apr|il"), TM_ABS_MON,     4,      kw_set_month },\r
+> +    { N_("may"),    TM_ABS_MON,     5,      kw_set_month },\r
+> +    { N_("jun|e"),  TM_ABS_MON,     6,      kw_set_month },\r
+> +    { N_("jul|y"),  TM_ABS_MON,     7,      kw_set_month },\r
+> +    { N_("aug|ust"),        TM_ABS_MON,     8,      kw_set_month },\r
+> +    { N_("sep|tember"),     TM_ABS_MON,     9,      kw_set_month },\r
+> +    { N_("oct|ober"),       TM_ABS_MON,     10,     kw_set_month },\r
+> +    { N_("nov|ember"),      TM_ABS_MON,     11,     kw_set_month },\r
+> +    { N_("dec|ember"),      TM_ABS_MON,     12,     kw_set_month },\r
+> +\r
+> +    /* Durations. */\r
+> +    { N_("y|ears"), TM_REL_YEAR,    1,      kw_set_rel },\r
+> +    { N_("w|eeks"), TM_REL_WEEK,    1,      kw_set_rel },\r
+> +    { N_("d|ays"),  TM_REL_DAY,     1,      kw_set_rel },\r
+> +    { N_("h|ours"), TM_REL_HOUR,    1,      kw_set_rel },\r
+> +    { N_("hr|s"),   TM_REL_HOUR,    1,      kw_set_rel },\r
+> +    { N_("m|inutes"),       TM_REL_MIN,     1,      kw_set_rel },\r
+> +    /* M=months, m=minutes */\r
+> +    { N_("M"),              TM_REL_MON,     1,      kw_set_rel },\r
+> +    { N_("mins"),   TM_REL_MIN,     1,      kw_set_rel },\r
+> +    { N_("mo|nths"),        TM_REL_MON,     1,      kw_set_rel },\r
+> +    { N_("s|econds"),       TM_REL_SEC,     1,      kw_set_rel },\r
+> +    { N_("secs"),   TM_REL_SEC,     1,      kw_set_rel },\r
+> +\r
+> +    /* Numbers. */\r
+> +    { N_("one"),    TM_NONE,        1,      kw_set_number },\r
+> +    { N_("two"),    TM_NONE,        2,      kw_set_number },\r
+> +    { N_("three"),  TM_NONE,        3,      kw_set_number },\r
+> +    { N_("four"),   TM_NONE,        4,      kw_set_number },\r
+> +    { N_("five"),   TM_NONE,        5,      kw_set_number },\r
+> +    { N_("six"),    TM_NONE,        6,      kw_set_number },\r
+> +    { N_("seven"),  TM_NONE,        7,      kw_set_number },\r
+> +    { N_("eight"),  TM_NONE,        8,      kw_set_number },\r
+> +    { N_("nine"),   TM_NONE,        9,      kw_set_number },\r
+> +    { N_("ten"),    TM_NONE,        10,     kw_set_number },\r
+> +    { N_("dozen"),  TM_NONE,        12,     kw_set_number },\r
+> +    { N_("hundred"),        TM_NONE,        100,    kw_set_number },\r
+> +\r
+> +    /* Special number forms. */\r
+> +    { N_("this"),   TM_NONE,        0,      kw_set_number },\r
+> +    { N_("last"),   TM_NONE,        1,      kw_set_number },\r
+> +\r
+> +    /* Other special keywords. */\r
+> +    { N_("yesterday"),      TM_REL_DAY,     1,      kw_set_rel },\r
+> +    { N_("today"),  TM_NONE,        0,      kw_set_today },\r
+> +    { N_("now"),    TM_NONE,        0,      kw_set_now },\r
+> +    { N_("noon"),   TM_NONE,        12,     kw_set_timeofday },\r
+> +    { N_("midnight"),       TM_NONE,        0,      kw_set_timeofday },\r
+> +    { N_("am"),             TM_AMPM,        0,      kw_set_ampm },\r
+> +    { N_("a.m."),   TM_AMPM,        0,      kw_set_ampm },\r
+> +    { N_("pm"),             TM_AMPM,        1,      kw_set_ampm },\r
+> +    { N_("p.m."),   TM_AMPM,        1,      kw_set_ampm },\r
+> +    { N_("st"),             TM_NONE,        0,      kw_set_ordinal },\r
+> +    { N_("nd"),             TM_NONE,        0,      kw_set_ordinal },\r
+> +    { N_("rd"),             TM_NONE,        0,      kw_set_ordinal },\r
+> +    { N_("th"),             TM_NONE,        0,      kw_set_ordinal },\r
+> +\r
+> +    /* Timezone codes: offset in minutes. XXX: Add more codes. */\r
+> +    { N_("pst"),    TM_TZ,          -8*60,  NULL },\r
+> +    { N_("mst"),    TM_TZ,          -7*60,  NULL },\r
+> +    { N_("cst"),    TM_TZ,          -6*60,  NULL },\r
+> +    { N_("est"),    TM_TZ,          -5*60,  NULL },\r
+> +    { N_("ast"),    TM_TZ,          -4*60,  NULL },\r
+> +    { N_("nst"),    TM_TZ,          -(3*60+30),     NULL },\r
+> +\r
+> +    { N_("gmt"),    TM_TZ,          0,      NULL },\r
+> +    { N_("utc"),    TM_TZ,          0,      NULL },\r
+> +\r
+> +    { N_("wet"),    TM_TZ,          0,      NULL },\r
+> +    { N_("cet"),    TM_TZ,          1*60,   NULL },\r
+> +    { N_("eet"),    TM_TZ,          2*60,   NULL },\r
+> +    { N_("fet"),    TM_TZ,          3*60,   NULL },\r
+> +\r
+> +    { N_("wat"),    TM_TZ,          1*60,   NULL },\r
+> +    { N_("cat"),    TM_TZ,          2*60,   NULL },\r
+> +    { N_("eat"),    TM_TZ,          3*60,   NULL },\r
+> +};\r
+> +\r
+> +/*\r
+> + * Compare strings s and keyword. Return number of matching chars on\r
+> + * match, 0 for no match. Match must be at least n chars, or all of\r
+> + * keyword if n < 0, otherwise it's not a match. Use match_case for\r
+> + * case sensitive matching.\r
+> + */\r
+> +static size_t\r
+> +match_keyword (const char *s, const char *keyword, ssize_t n, bool match_case)\r
+> +{\r
+> +    ssize_t i;\r
+> +\r
+> +    if (!n)\r
+> +    return 0;\r
+> +\r
+> +    for (i = 0; *s && *keyword; i++, s++, keyword++) {\r
+> +    if (match_case) {\r
+> +        if (*s != *keyword)\r
+\r
+The pointer arithmetic doesn't seem to buy anything here.  What about\r
+just looping over i and using s[i] and keyword[i]?\r
+\r
+> +            break;\r
+> +    } else {\r
+> +        if (tolower ((unsigned char) *s) !=\r
+> +            tolower ((unsigned char) *keyword))\r
+\r
+I don't think the cast to unsigned char is necessary.\r
+\r
+> +            break;\r
+> +    }\r
+> +    }\r
+> +\r
+> +    if (n > 0)\r
+> +    return i < n ? 0 : i;\r
+> +    else\r
+> +    return *keyword ? 0 : i;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Parse a keyword. Return < 0 on error, number of parsed chars on\r
+> + * success.\r
+> + */\r
+> +static ssize_t\r
+> +parse_keyword (struct state *state, const char *s)\r
+> +{\r
+> +    unsigned int i;\r
+> +    size_t n, max_n = 0;\r
+> +    struct keyword *kw = NULL;\r
+> +    int r;\r
+> +\r
+> +    /* Match longest keyword */\r
+> +    for (i = 0; i < ARRAY_SIZE (keywords); i++) {\r
+> +    /* Match case if keyword begins with upper case letter. */\r
+> +    bool mcase = isupper ((unsigned char) keywords[i].name[0]);\r
+\r
+Same with this cast.\r
+\r
+> +    ssize_t minlen = -1;\r
+> +    char keyword[128];\r
+> +    char *p;\r
+> +\r
+> +    strncpy (keyword, _(keywords[i].name), sizeof (keyword));\r
+> +\r
+> +    /* Truncate too long keywords. XXX: Make this dynamic? */\r
+> +    keyword[sizeof (keyword) - 1] = '\0';\r
+> +\r
+> +    /* Minimum match length. */\r
+> +    p = strchr (keyword, '|');\r
+> +    if (p) {\r
+> +        minlen = p - keyword;\r
+> +\r
+> +        /* Remove the minimum match length separator. */\r
+> +        memmove (p, p + 1, strlen (p + 1) + 1);\r
+> +    }\r
+\r
+Would it make more sense to make match_keyword aware of the |\r
+character?  Then you wouldn't need this dance with copying the keyword\r
+into a scratch buffer.  I'm thinking something like (untested)\r
+\r
+static size_t\r
+match_keyword (const char *s, const char *keyword, bool match_case)\r
+{\r
+    size_t i;\r
+    bool prefix_matched = false;\r
+\r
+    for (i = 0; *s && *keyword; i++, s++, keyword++) {\r
+        if (*keyword == '|') {\r
+            prefix_matched = true;\r
+            ++keyword;\r
+        }\r
+        if (match_case && *s != *keyword)\r
+            return 0;\r
+        else if (tolower (*s) != tolower (*keyword))\r
+            return 0;\r
+    }\r
+\r
+    if (!*keyword || prefix_matched)\r
+        return i;\r
+    return 0;\r
+}\r
+\r
+> +\r
+> +    n = match_keyword (s, keyword, minlen, mcase);\r
+> +    if (n > max_n || (n == max_n && mcase)) {\r
+> +        max_n = n;\r
+> +        kw = &keywords[i];\r
+> +    }\r
+> +    }\r
+> +\r
+> +    if (!kw)\r
+> +    return -PARSE_TIME_ERR_KEYWORD;\r
+> +\r
+> +    if (kw->set)\r
+> +    r = kw->set (state, kw);\r
+> +    else\r
+> +    r = kw_set_default (state, kw);\r
+> +\r
+> +    if (r < 0)\r
+> +    return r;\r
+> +\r
+> +    return max_n;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Non-keyword parsers and their helpers.\r
+> + */\r
+> +\r
+> +static int\r
+> +set_user_tz (struct state *state, char sign, int hour, int min)\r
+> +{\r
+> +    int tz = hour * 60 + min;\r
+> +\r
+> +    assert (sign == '+' || sign == '-');\r
+> +\r
+> +    if (hour < 0 || hour > 14 || min < 0 || min > 59 || min % 15)\r
+\r
+Good to see you're not forgetting our Kiribati notmuch user base.\r
+\r
+> +    return -PARSE_TIME_ERR_INVALIDTIME;\r
+> +\r
+> +    if (sign == '-')\r
+> +    tz = -tz;\r
+> +\r
+> +    return set_field (state, TM_TZ, tz);\r
+> +}\r
+> +\r
+> +/*\r
+> + * Parse a previously postponed number if one exists. Independent\r
+> + * parsing of a postponed number when it wasn't consumed during\r
+> + * parsing of the following token.\r
+> + */\r
+> +static int\r
+> +parse_postponed_number (struct state *state, unused (enum field next_field))\r
+> +{\r
+> +    int v, n;\r
+> +    char d;\r
+> +\r
+> +    /* Bail out if there's no postponed number. */\r
+> +    if (!get_postponed_number (state, &v, &n, &d))\r
+> +    return 0;\r
+> +\r
+> +    if (n == 1 || n == 2) {\r
+> +    /* Notable exception: Previous field affects parsing. This\r
+> +     * handles "January 20". */\r
+> +    if (state->last_field == TM_ABS_MON) {\r
+> +        /* D[D] */\r
+> +        if (!is_valid_mday (v))\r
+> +            return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +        return set_field (state, TM_ABS_MDAY, v);\r
+> +    } else if (n == 2) {\r
+> +        /* XXX: Only allow if last field is hour, min, or sec? */\r
+> +        if (d == '+' || d == '-') {\r
+> +            /* +/-HH */\r
+> +            return set_user_tz (state, d, v, 0);\r
+> +        }\r
+> +    }\r
+> +    } else if (n == 4) {\r
+> +    /* Notable exception: Value affects parsing. Time zones are\r
+> +     * always at most 1400 and we don't understand years before\r
+> +     * 1970. */\r
+> +    if (!is_valid_year (v)) {\r
+> +        if (d == '+' || d == '-') {\r
+> +            /* +/-HHMM */\r
+> +            return set_user_tz (state, d, v / 100, v % 100);\r
+> +        }\r
+> +    } else {\r
+> +        /* YYYY */\r
+> +        return set_field (state, TM_ABS_YEAR, v);\r
+> +    }\r
+> +    } else if (n == 6) {\r
+> +    /* HHMMSS */\r
+> +    int hour = v / 10000;\r
+> +    int min = (v / 100) % 100;\r
+> +    int sec = v % 100;\r
+> +\r
+> +    if (!is_valid_time (hour, min, sec))\r
+> +        return -PARSE_TIME_ERR_INVALIDTIME;\r
+> +\r
+> +    return set_abs_time (state, hour, min, sec);\r
+> +    } else if (n == 8) {\r
+> +    /* YYYYMMDD */\r
+> +    int year = v / 10000;\r
+> +    int mon = (v / 100) % 100;\r
+> +    int mday = v % 100;\r
+> +\r
+> +    if (!is_valid_date (year, mon, mday))\r
+> +        return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +    return set_abs_date (state, year, mon, mday);\r
+> +    } else {\r
+> +    return -PARSE_TIME_ERR_FORMAT;\r
+\r
+No need for the else block, given the return at the end.\r
+\r
+> +    }\r
+> +\r
+> +    return -PARSE_TIME_ERR_FORMAT;\r
+> +}\r
+> +\r
+> +static int tm_get_field (const struct tm *tm, enum field field);\r
+> +\r
+> +static int\r
+> +set_timestamp (struct state *state, time_t t)\r
+> +{\r
+> +    struct tm tm;\r
+> +    enum field f;\r
+> +    int r;\r
+> +\r
+> +    if (gmtime_r (&t, &tm) == NULL)\r
+> +    return -PARSE_TIME_ERR_LIB;\r
+> +\r
+> +    for (f = TM_ABS_SEC; f != TM_NONE; f = next_abs_field (f)) {\r
+> +    r = set_field (state, f, tm_get_field (&tm, f));\r
+> +    if (r)\r
+> +        return r;\r
+> +    }\r
+> +\r
+> +    r = set_field (state, TM_TZ, 0);\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    /* XXX: Prevent TM_AMPM with timestamp, e.g. "@123456 pm" */\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Parse a single number. Typically postpone parsing until later. */\r
+> +static int\r
+> +parse_single_number (struct state *state, unsigned long v,\r
+> +                 unsigned long n)\r
+> +{\r
+> +    assert (n);\r
+> +\r
+> +    if (state->delim == '@')\r
+> +    return set_timestamp (state, (time_t) v);\r
+> +\r
+> +    if (v > INT_MAX)\r
+> +    return -PARSE_TIME_ERR_FORMAT;\r
+> +\r
+> +    return set_postponed_number (state, v, n);\r
+> +}\r
+> +\r
+> +static bool\r
+> +is_time_sep (char c)\r
+> +{\r
+> +    return c == ':';\r
+> +}\r
+> +\r
+> +static bool\r
+> +is_date_sep (char c)\r
+> +{\r
+> +    return c == '/' || c == '-' || c == '.';\r
+> +}\r
+> +\r
+> +static bool\r
+> +is_sep (char c)\r
+> +{\r
+> +    return is_time_sep (c) || is_date_sep (c);\r
+> +}\r
+> +\r
+> +/* Two-digit year: 00...69 is 2000s, 70...99 1900s, if n == 0 keep\r
+> + * unset. */\r
+> +static int\r
+> +expand_year (unsigned long year, size_t n)\r
+> +{\r
+> +    if (n == 2) {\r
+> +    return (year < 70 ? 2000 : 1900) + year;\r
+> +    } else if (n == 4) {\r
+> +    return year;\r
+> +    } else {\r
+> +    return UNSET;\r
+> +    }\r
+> +}\r
+> +\r
+> +/* Parse a date number triplet. */\r
+> +static int\r
+> +parse_date (struct state *state, char sep,\r
+> +        unsigned long v1, unsigned long v2, unsigned long v3,\r
+> +        size_t n1, size_t n2, size_t n3)\r
+> +{\r
+> +    int year = UNSET, mon = UNSET, mday = UNSET;\r
+> +\r
+> +    assert (is_date_sep (sep));\r
+> +\r
+> +    switch (sep) {\r
+> +    case '/': /* Date: M[M]/D[D][/YY[YY]] or M[M]/YYYY */\r
+> +    if (n1 != 1 && n1 != 2)\r
+> +        return -PARSE_TIME_ERR_DATEFORMAT;\r
+> +\r
+> +    if ((n2 == 1 || n2 == 2) && (n3 == 0 || n3 == 2 || n3 == 4)) {\r
+> +        /* M[M]/D[D][/YY[YY]] */\r
+> +        year = expand_year (v3, n3);\r
+> +        mon = v1;\r
+> +        mday = v2;\r
+> +    } else if (n2 == 4 && n3 == 0) {\r
+> +        /* M[M]/YYYY */\r
+> +        year = v2;\r
+> +        mon = v1;\r
+> +    } else {\r
+> +        return -PARSE_TIME_ERR_DATEFORMAT;\r
+> +    }\r
+> +    break;\r
+> +\r
+> +    case '-': /* Date: YYYY-MM[-DD] or DD-MM[-YY[YY]] or MM-YYYY */\r
+> +    if (n1 == 4 && n2 == 2 && (n3 == 0 || n3 == 2)) {\r
+> +        /* YYYY-MM[-DD] */\r
+> +        year = v1;\r
+> +        mon = v2;\r
+> +        if (n3)\r
+> +            mday = v3;\r
+> +    } else if (n1 == 2 && n2 == 2 && (n3 == 0 || n3 == 2 || n3 == 4)) {\r
+> +        /* DD-MM[-YY[YY]] */\r
+> +        year = expand_year (v3, n3);\r
+> +        mon = v2;\r
+> +        mday = v1;\r
+> +    } else if (n1 == 2 && n2 == 4 && n3 == 0) {\r
+> +        /* MM-YYYY */\r
+> +        year = v2;\r
+> +        mon = v1;\r
+> +    } else {\r
+> +        return -PARSE_TIME_ERR_DATEFORMAT;\r
+> +    }\r
+> +    break;\r
+> +\r
+> +    case '.': /* Date: D[D].M[M][.[YY[YY]]] */\r
+> +    if ((n1 != 1 && n1 != 2) || (n2 != 1 && n2 != 2) ||\r
+> +        (n3 != 0 && n3 != 2 && n3 != 4))\r
+> +        return -PARSE_TIME_ERR_DATEFORMAT;\r
+> +\r
+> +    year = expand_year (v3, n3);\r
+> +    mon = v2;\r
+> +    mday = v1;\r
+> +    break;\r
+> +    }\r
+> +\r
+> +    if (year != UNSET && !is_valid_year (year))\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +    if (mon != UNSET && !is_valid_mon (mon))\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +    if (mday != UNSET && !is_valid_mday (mday))\r
+> +    return -PARSE_TIME_ERR_INVALIDDATE;\r
+> +\r
+> +    return set_abs_date (state, year, mon, mday);\r
+> +}\r
+> +\r
+> +/* Parse a time number triplet. */\r
+> +static int\r
+> +parse_time (struct state *state, char sep,\r
+> +        unsigned long v1, unsigned long v2, unsigned long v3,\r
+> +        size_t n1, size_t n2, size_t n3)\r
+> +{\r
+> +    assert (is_time_sep (sep));\r
+> +\r
+> +    if ((n1 != 1 && n1 != 2) || n2 != 2 || (n3 != 0 && n3 != 2))\r
+> +    return -PARSE_TIME_ERR_TIMEFORMAT;\r
+> +\r
+> +    /*\r
+> +     * Notable exception: Previously set fields affect\r
+> +     * parsing. Interpret (+|-)HH:MM as time zone only if hour and\r
+> +     * minute have been set.\r
+> +     *\r
+> +     * XXX: This could be fixed by restricting the delimiters\r
+> +     * preceding time. For '+' it would be justified, but for '-' it\r
+> +     * might be inconvenient. However prefer to allow '-' as an\r
+> +     * insignificant delimiter preceding time for convenience, and\r
+> +     * handle '+' the same way for consistency between positive and\r
+> +     * negative time zones.\r
+> +     */\r
+> +    if (is_field_set (state, TM_ABS_HOUR) &&\r
+> +    is_field_set (state, TM_ABS_MIN) &&\r
+> +    n1 == 2 && n2 == 2 && n3 == 0 &&\r
+> +    (state->delim == '+' || state->delim == '-')) {\r
+> +    return set_user_tz (state, state->delim, v1, v2);\r
+> +    }\r
+> +\r
+> +    if (!is_valid_time (v1, v2, v3))\r
+> +    return -PARSE_TIME_ERR_INVALIDTIME;\r
+> +\r
+> +    return set_abs_time (state, v1, v2, n3 ? v3 : 0);\r
+> +}\r
+> +\r
+> +/* strtoul helper that assigns length. */\r
+> +static unsigned long\r
+> +strtoul_len (const char *s, const char **endp, size_t *len)\r
+> +{\r
+> +    unsigned long val = strtoul (s, (char **) endp, 10);\r
+\r
+This could technically get confused by really large numbers, but I\r
+don't know if that's worth worrying about.\r
+\r
+> +\r
+> +    *len = *endp - s;\r
+> +    return val;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Parse a (group of) number(s). Return < 0 on error, number of parsed\r
+> + * chars on success.\r
+> + */\r
+> +static ssize_t\r
+> +parse_number (struct state *state, const char *s)\r
+> +{\r
+> +    int r;\r
+> +    unsigned long v1, v2, v3 = 0;\r
+> +    size_t n1, n2, n3 = 0;\r
+> +    const char *p = s;\r
+> +    char sep;\r
+> +\r
+> +    v1 = strtoul_len (p, &p, &n1);\r
+> +\r
+> +    if (is_sep (*p) && isdigit ((unsigned char) *(p + 1))) {\r
+\r
+Unnecessary cast?\r
+\r
+> +    sep = *p;\r
+> +    v2 = strtoul_len (p + 1, &p, &n2);\r
+> +    } else {\r
+> +    /* A single number. */\r
+> +    r = parse_single_number (state, v1, n1);\r
+> +    if (r)\r
+> +        return r;\r
+> +\r
+> +    return p - s;\r
+\r
+I found the control flow here confusing.  You might want to flip the\r
+two conditions so the single number return happens first and the rest\r
+of the code flows straight through:\r
+\r
+if (!is_sep (*p) || !isdigit (*(p + 1))) {\r
+    ...\r
+    return p - s;\r
+}\r
+\r
+sep = *p;\r
+...\r
+\r
+> +    }\r
+> +\r
+> +    /* A group of two or three numbers? */\r
+> +    if (*p == sep && isdigit ((unsigned char) *(p + 1)))\r
+> +    v3 = strtoul_len (p + 1, &p, &n3);\r
+> +\r
+> +    if (is_time_sep (sep))\r
+> +    r = parse_time (state, sep, v1, v2, v3, n1, n2, n3);\r
+> +    else\r
+> +    r = parse_date (state, sep, v1, v2, v3, n1, n2, n3);\r
+> +\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    return p - s;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Parse delimiter(s). Throw away all except the last one, which is\r
+> + * stored for parsing the next non-delimiter. Return < 0 on error,\r
+> + * number of parsed chars on success.\r
+> + *\r
+> + * XXX: We might want to be more strict here.\r
+> + */\r
+> +static ssize_t\r
+> +parse_delim (struct state *state, const char *s)\r
+> +{\r
+> +    const char *p = s;\r
+> +\r
+> +    /*\r
+> +     * Skip non-alpha and non-digit, and store the last for further\r
+> +     * processing.\r
+> +     */\r
+> +    while (*p && !isalnum ((unsigned char) *p)) {\r
+> +    set_delim (state, *p);\r
+> +    p++;\r
+> +    }\r
+> +\r
+> +    return p - s;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Parse a date/time string. Return < 0 on error, number of parsed\r
+> + * chars on success.\r
+> + */\r
+> +static ssize_t\r
+> +parse_input (struct state *state, const char *s)\r
+> +{\r
+> +    const char *p = s;\r
+> +    ssize_t n;\r
+> +    int r;\r
+> +\r
+> +    while (*p) {\r
+> +    if (isalpha ((unsigned char) *p)) {\r
+> +        n = parse_keyword (state, p);\r
+> +    } else if (isdigit ((unsigned char) *p)) {\r
+> +        n = parse_number (state, p);\r
+> +    } else {\r
+> +        n = parse_delim (state, p);\r
+> +    }\r
+> +\r
+> +    if (n <= 0) {\r
+> +        if (n == 0)\r
+> +            n = -PARSE_TIME_ERR;\r
+> +\r
+> +        return n;\r
+> +    }\r
+> +\r
+> +    p += n;\r
+> +    }\r
+> +\r
+> +    /* Parse a previously postponed number, if any. */\r
+> +    r = parse_postponed_number (state, TM_NONE);\r
+> +    if (r < 0)\r
+> +    return r;\r
+> +\r
+> +    return p - s;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Processing the parsed input.\r
+> + */\r
+> +\r
+> +/*\r
+> + * Initialize reference time to tm. Use time zone in state if\r
+> + * specified, otherwise local time. Use now for reference time if\r
+> + * non-NULL, otherwise current time.\r
+> + */\r
+> +static int\r
+> +initialize_now (struct state *state, struct tm *tm, const time_t *now)\r
+\r
+Should tm be the last argument, since it's an out-argument?\r
+\r
+Why is now a pointer?  Just so it can be NULL?\r
+\r
+> +{\r
+> +    time_t t;\r
+> +\r
+> +    if (now) {\r
+> +    t = *now;\r
+> +    } else {\r
+> +    if (time (&t) == (time_t) -1)\r
+> +        return -PARSE_TIME_ERR_LIB;\r
+> +    }\r
+> +\r
+> +    if (is_field_set (state, TM_TZ)) {\r
+> +    /* Some other time zone. */\r
+> +\r
+> +    /* Adjust now according to the TZ. */\r
+> +    t += get_field (state, TM_TZ) * 60;\r
+> +\r
+> +    /* It's not gm, but this doesn't mess with the TZ. */\r
+> +    if (gmtime_r (&t, tm) == NULL)\r
+> +        return -PARSE_TIME_ERR_LIB;\r
+> +    } else {\r
+> +    /* Local time. */\r
+> +    if (localtime_r (&t, tm) == NULL)\r
+> +        return -PARSE_TIME_ERR_LIB;\r
+> +    }\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/*\r
+> + * Normalize tm according to mktime(3). Both mktime(3) and\r
+\r
+This comment could elaborate a bit on what it means to normalize a tm.\r
+\r
+> + * localtime_r(3) use local time, but they cancel each other out here,\r
+> + * making this function agnostic to time zone.\r
+> + */\r
+> +static int\r
+> +normalize_tm (struct tm *tm)\r
+> +{\r
+> +    time_t t = mktime (tm);\r
+> +\r
+> +    if (t == (time_t) -1)\r
+> +    return -PARSE_TIME_ERR_LIB;\r
+> +\r
+> +    if (!localtime_r (&t, tm))\r
+> +    return -PARSE_TIME_ERR_LIB;\r
+\r
+Do you actually need this call to localtime_r or can you just return\r
+after the mktime modifies tm?  Does this have to do with timezones?\r
+\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Get field out of a struct tm. */\r
+> +static int\r
+> +tm_get_field (const struct tm *tm, enum field field)\r
+> +{\r
+> +    switch (field) {\r
+> +    case TM_ABS_SEC:        return tm->tm_sec;\r
+> +    case TM_ABS_MIN:        return tm->tm_min;\r
+> +    case TM_ABS_HOUR:       return tm->tm_hour;\r
+> +    case TM_ABS_MDAY:       return tm->tm_mday;\r
+> +    case TM_ABS_MON:        return tm->tm_mon + 1; /* 0- to 1-based */\r
+> +    case TM_ABS_YEAR:       return 1900 + tm->tm_year;\r
+> +    case TM_ABS_WDAY:       return tm->tm_wday;\r
+> +    case TM_ABS_ISDST:      return tm->tm_isdst;\r
+> +    default:\r
+> +    assert (false);\r
+> +    break;\r
+> +    }\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Modify hour according to am/pm setting. */\r
+> +static int\r
+> +fixup_ampm (struct state *state)\r
+> +{\r
+> +    int hour, hdiff = 0;\r
+> +\r
+> +    if (!is_field_set (state, TM_AMPM))\r
+> +    return 0;\r
+> +\r
+> +    if (!is_field_set (state, TM_ABS_HOUR))\r
+> +    return -PARSE_TIME_ERR_TIMEFORMAT;\r
+> +\r
+> +    hour = get_field (state, TM_ABS_HOUR);\r
+> +    if (!is_valid_12hour (hour))\r
+> +    return -PARSE_TIME_ERR_INVALIDTIME;\r
+> +\r
+> +    if (get_field (state, TM_AMPM)) {\r
+> +    /* 12pm is noon. */\r
+> +    if (hour != 12)\r
+> +        hdiff = 12;\r
+> +    } else {\r
+> +    /* 12am is midnight, beginning of day. */\r
+> +    if (hour == 12)\r
+> +        hdiff = -12;\r
+> +    }\r
+> +\r
+> +    mod_field (state, TM_REL_HOUR, -hdiff);\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Combine absolute and relative fields, and round. */\r
+> +static int\r
+> +create_output (struct state *state, time_t *t_out, const time_t *ref,\r
+> +           int round)\r
+> +{\r
+> +    struct tm tm = { .tm_isdst = -1 };\r
+> +    struct tm now;\r
+> +    time_t t;\r
+> +    enum field f;\r
+> +    int r;\r
+> +    int week_round = PARSE_TIME_NO_ROUND;\r
+> +\r
+> +    r = initialize_now (state, &now, ref);\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    /* Initialize fields flagged as "now" to reference time. */\r
+> +    for (f = TM_ABS_SEC; f != TM_NONE; f = next_abs_field (f)) {\r
+> +    if (state->set[f] == FIELD_NOW) {\r
+> +        state->tm[f] = tm_get_field (&now, f);\r
+> +        state->set[f] = FIELD_SET;\r
+> +    }\r
+> +    }\r
+> +\r
+> +    /*\r
+> +     * If WDAY is set but MDAY is not, we consider WDAY relative\r
+> +     *\r
+> +     * XXX: This fails on stuff like "two months monday" because two\r
+> +     * months ago wasn't the same day as today. Postpone until we know\r
+> +     * date?\r
+> +     */\r
+> +    if (is_field_set (state, TM_ABS_WDAY) &&\r
+> +    !is_field_set (state, TM_ABS_MDAY)) {\r
+> +    int wday = get_field (state, TM_ABS_WDAY);\r
+> +    int today = tm_get_field (&now, TM_ABS_WDAY);\r
+> +    int rel_days;\r
+> +\r
+> +    if (today > wday)\r
+> +        rel_days = today - wday;\r
+> +    else\r
+> +        rel_days = today + 7 - wday;\r
+> +\r
+> +    /* This also prevents special week rounding from happening. */\r
+> +    mod_field (state, TM_REL_DAY, rel_days);\r
+> +\r
+> +    unset_field (state, TM_ABS_WDAY);\r
+> +    }\r
+> +\r
+> +    r = fixup_ampm (state);\r
+> +    if (r)\r
+> +    return r;\r
+> +\r
+> +    /*\r
+> +     * Iterate fields from most accurate to least accurate, and set\r
+> +     * unset fields according to requested rounding.\r
+> +     */\r
+> +    for (f = TM_ABS_SEC; f != TM_NONE; f = next_abs_field (f)) {\r
+> +    if (round != PARSE_TIME_NO_ROUND) {\r
+> +        enum field r = abs_to_rel_field (f);\r
+> +\r
+> +        if (is_field_set (state, f) || is_field_set (state, r)) {\r
+> +            if (round >= PARSE_TIME_ROUND_UP && f != TM_ABS_SEC) {\r
+> +                mod_field (state, r, -1);\r
+\r
+Crazy.  This could use a comment.  It took me a while to figure out\r
+why this was -1, though maybe that's just because it's late.\r
+\r
+> +                if (round == PARSE_TIME_ROUND_UP_INCLUSIVE)\r
+> +                    mod_field (state, TM_REL_SEC, 1);\r
+> +            }\r
+> +            round = PARSE_TIME_NO_ROUND; /* No more rounding. */\r
+> +        } else {\r
+> +            if (f == TM_ABS_MDAY &&\r
+> +                is_field_set (state, TM_REL_WEEK)) {\r
+> +                /* Week is most accurate. */\r
+> +                week_round = round;\r
+> +                round = PARSE_TIME_NO_ROUND;\r
+> +            } else {\r
+> +                set_field (state, f, field_epoch (f));\r
+> +            }\r
+> +        }\r
+> +    }\r
+> +\r
+> +    if (!is_field_set (state, f))\r
+> +        set_field (state, f, tm_get_field (&now, f));\r
+> +    }\r
+> +\r
+> +    /* Special case: rounding with week accuracy. */\r
+> +    if (week_round != PARSE_TIME_NO_ROUND) {\r
+> +    /* Temporarily set more accurate fields to now. */\r
+> +    set_field (state, TM_ABS_SEC, tm_get_field (&now, TM_ABS_SEC));\r
+> +    set_field (state, TM_ABS_MIN, tm_get_field (&now, TM_ABS_MIN));\r
+> +    set_field (state, TM_ABS_HOUR, tm_get_field (&now, TM_ABS_HOUR));\r
+> +    set_field (state, TM_ABS_MDAY, tm_get_field (&now, TM_ABS_MDAY));\r
+> +    }\r
+> +\r
+> +    /*\r
+> +     * Set all fields. They may contain out of range values before\r
+> +     * normalization by mktime(3).\r
+> +     */\r
+> +    tm.tm_sec = get_field (state, TM_ABS_SEC) - get_field (state, TM_REL_SEC);\r
+> +    tm.tm_min = get_field (state, TM_ABS_MIN) - get_field (state, TM_REL_MIN);\r
+> +    tm.tm_hour = get_field (state, TM_ABS_HOUR) - get_field (state, TM_REL_HOUR);\r
+> +    tm.tm_mday = get_field (state, TM_ABS_MDAY) -\r
+> +             get_field (state, TM_REL_DAY) - 7 * get_field (state, TM_REL_WEEK);\r
+> +    tm.tm_mon = get_field (state, TM_ABS_MON) - get_field (state, TM_REL_MON);\r
+> +    tm.tm_mon--; /* 1- to 0-based */\r
+> +    tm.tm_year = get_field (state, TM_ABS_YEAR) - get_field (state, TM_REL_YEAR) - 1900;\r
+> +\r
+> +    /*\r
+> +     * It's always normal time.\r
+> +     *\r
+> +     * XXX: This is probably not a solution that universally\r
+> +     * works. Just make sure DST is not taken into account. We don't\r
+> +     * want rounding to be affected by DST.\r
+> +     */\r
+> +    tm.tm_isdst = -1;\r
+> +\r
+> +    /* Special case: rounding with week accuracy. */\r
+> +    if (week_round != PARSE_TIME_NO_ROUND) {\r
+> +    /* Normalize to get proper tm.wday. */\r
+> +    r = normalize_tm (&tm);\r
+> +    if (r < 0)\r
+> +        return r;\r
+> +\r
+> +    /* Set more accurate fields back to zero. */\r
+> +    tm.tm_sec = 0;\r
+> +    tm.tm_min = 0;\r
+> +    tm.tm_hour = 0;\r
+> +    tm.tm_isdst = -1;\r
+> +\r
+> +    /* Monday is the true 1st day of week, but this is easier. */\r
+> +    if (week_round >= PARSE_TIME_ROUND_UP) {\r
+> +        tm.tm_mday += 7 - tm.tm_wday;\r
+> +        if (week_round == PARSE_TIME_ROUND_UP_INCLUSIVE)\r
+> +            tm.tm_sec--;\r
+> +    } else {\r
+> +        tm.tm_mday -= tm.tm_wday;\r
+> +    }\r
+> +    }\r
+> +\r
+> +    if (is_field_set (state, TM_TZ)) {\r
+> +    /* tm is in specified TZ, convert to UTC for timegm(3). */\r
+> +    tm.tm_min -= get_field (state, TM_TZ);\r
+> +    t = timegm (&tm);\r
+> +    } else {\r
+> +    /* tm is in local time. */\r
+> +    t = mktime (&tm);\r
+> +    }\r
+> +\r
+> +    if (t == (time_t) -1)\r
+> +    return -PARSE_TIME_ERR_LIB;\r
+> +\r
+> +    *t_out = t;\r
+> +\r
+> +    return 0;\r
+> +}\r
+> +\r
+> +/* Internally, all errors are < 0. parse_time_string() returns errors > 0. */\r
+> +#define EXTERNAL_ERR(r) (-r)\r
+> +\r
+> +int\r
+> +parse_time_string (const char *s, time_t *t, const time_t *ref, int round)\r
+> +{\r
+> +    struct state state = { .last_field = TM_NONE };\r
+> +    int r;\r
+> +\r
+> +    if (!s || !t)\r
+> +    return EXTERNAL_ERR (-PARSE_TIME_ERR);\r
+> +\r
+> +    r = parse_input (&state, s);\r
+> +    if (r < 0)\r
+> +    return EXTERNAL_ERR (r);\r
+> +\r
+> +    r = create_output (&state, t, ref, round);\r
+> +    if (r < 0)\r
+> +    return EXTERNAL_ERR (r);\r
+> +\r
+> +    return 0;\r
+> +}\r
+> diff --git a/parse-time-string/parse-time-string.h b/parse-time-string/parse-time-string.h\r
+> new file mode 100644\r
+> index 0000000..bfa4ee3\r
+> --- /dev/null\r
+> +++ b/parse-time-string/parse-time-string.h\r
+> @@ -0,0 +1,102 @@\r
+> +/*\r
+> + * parse time string - user friendly date and time parser\r
+> + * Copyright © 2012 Jani Nikula\r
+> + *\r
+> + * This program is free software: you can redistribute it and/or modify\r
+> + * it under the terms of the GNU General Public License as published by\r
+> + * the Free Software Foundation, either version 2 of the License, or\r
+> + * (at your option) any later version.\r
+> + *\r
+> + * This program is distributed in the hope that it will be useful,\r
+> + * but WITHOUT ANY WARRANTY; without even the implied warranty of\r
+> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\r
+> + * GNU General Public License for more details.\r
+> + *\r
+> + * You should have received a copy of the GNU General Public License\r
+> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.\r
+> + *\r
+> + * Author: Jani Nikula <jani@nikula.org>\r
+> + */\r
+> +\r
+> +#ifndef PARSE_TIME_STRING_H\r
+> +#define PARSE_TIME_STRING_H\r
+> +\r
+> +#ifdef __cplusplus\r
+> +extern "C" {\r
+> +#endif\r
+> +\r
+> +#include <time.h>\r
+> +\r
+> +/* return values for parse_time_string() */\r
+> +enum {\r
+> +    PARSE_TIME_OK = 0,\r
+> +    PARSE_TIME_ERR,         /* unspecified error */\r
+> +    PARSE_TIME_ERR_LIB,             /* library call failed */\r
+> +    PARSE_TIME_ERR_ALREADYSET,      /* attempt to set unit twice */\r
+> +    PARSE_TIME_ERR_FORMAT,  /* generic date/time format error */\r
+> +    PARSE_TIME_ERR_DATEFORMAT,      /* date format error */\r
+> +    PARSE_TIME_ERR_TIMEFORMAT,      /* time format error */\r
+> +    PARSE_TIME_ERR_INVALIDDATE,     /* date value error */\r
+> +    PARSE_TIME_ERR_INVALIDTIME,     /* time value error */\r
+> +    PARSE_TIME_ERR_KEYWORD, /* unknown keyword */\r
+> +};\r
+> +\r
+> +/* round values for parse_time_string() */\r
+> +enum {\r
+> +    PARSE_TIME_ROUND_DOWN = -1,\r
+> +    PARSE_TIME_NO_ROUND = 0,\r
+> +    PARSE_TIME_ROUND_UP = 1,\r
+> +    PARSE_TIME_ROUND_UP_INCLUSIVE = 2,\r
+> +};\r
+> +\r
+> +/**\r
+> + * parse_time_string() - user friendly date and time parser\r
+> + * @s:              string to parse\r
+> + * @t:              pointer to time_t to store parsed time in\r
+> + * @ref:    pointer to time_t containing reference date/time, or NULL\r
+> + * @round:  PARSE_TIME_NO_ROUND, PARSE_TIME_ROUND_DOWN, or\r
+> + *          PARSE_TIME_ROUND_UP\r
+> + *\r
+> + * Parse a date/time string 's' and store the parsed date/time result\r
+> + * in 't'.\r
+> + *\r
+> + * A reference date/time is used for determining the "date/time units"\r
+> + * (roughly equivalent to struct tm members) not specified by 's'. If\r
+> + * 'ref' is non-NULL, it must contain a pointer to a time_t to be used\r
+> + * as reference date/time. Otherwise, the current time is used.\r
+> + *\r
+> + * If 's' does not specify a full date/time, the 'round' parameter\r
+> + * specifies if and how the result should be rounded as follows:\r
+> + *\r
+> + *   PARSE_TIME_NO_ROUND: All date/time units that are not specified\r
+> + *   by 's' are set to the corresponding unit derived from the\r
+> + *   reference date/time.\r
+> + *\r
+> + *   PARSE_TIME_ROUND_DOWN: All date/time units that are more accurate\r
+> + *   than the most accurate unit specified by 's' are set to the\r
+> + *   smallest valid value for that unit. Rest of the unspecified units\r
+> + *   are set as in PARSE_TIME_NO_ROUND.\r
+> + *\r
+> + *   PARSE_TIME_ROUND_UP: All date/time units that are more accurate\r
+> + *   than the most accurate unit specified by 's' are set to the\r
+> + *   smallest valid value for that unit. The most accurate unit\r
+> + *   specified by 's' is incremented by one (and this is rolled over\r
+> + *   to the less accurate units as necessary), unless the most\r
+> + *   accurate unit is seconds. Rest of the unspecified units are set\r
+> + *   as in PARSE_TIME_NO_ROUND.\r
+> + *\r
+> + *   PARSE_TIME_ROUND_UP_INCLUSIVE: Same as PARSE_TIME_ROUND_UP, minus\r
+> + *   one second, unless the most accurate unit specified by 's' is\r
+> + *   seconds. This is useful for callers that require a value for\r
+> + *   inclusive comparison of the result.\r
+> + *\r
+> + * Return 0 (PARSE_TIME_OK) for succesfully parsed date/time, or one\r
+> + * of PARSE_TIME_ERR_* on error. 't' is not modified on error.\r
+> + */\r
+> +int parse_time_string (const char *s, time_t *t, const time_t *ref, int round);\r
+> +\r
+> +#ifdef __cplusplus\r
+> +}\r
+> +#endif\r
+> +\r
+> +#endif /* PARSE_TIME_STRING_H */\r
+\r
+Made it!\r
author	Austin Clements <amdragon@MIT.EDU>
	Mon, 22 Oct 2012 08:14:44 +0000 (04:14 +2000)
committer	W. Trevor King <wking@tremily.us>
	Fri, 7 Nov 2014 17:49:58 +0000 (09:49 -0800)