Re: [PATCH v5 2/9] parse-time-string: add a date/time parser to notmuch

author Jani Nikula <jani@nikula.org>

Sun, 28 Oct 2012 22:30:58 +0000 (00:30 +0200)

committer W. Trevor King <wking@tremily.us>

Fri, 7 Nov 2014 17:50:08 +0000 (09:50 -0800)
author Jani Nikula <jani@nikula.org>
Sun, 28 Oct 2012 22:30:58 +0000 (00:30 +0200)
committer W. Trevor King <wking@tremily.us>
Fri, 7 Nov 2014 17:50:08 +0000 (09:50 -0800)
diff --git a/2a/9df48161141a650843df56f6287e0c26664b00 b/2a/9df48161141a650843df56f6287e0c26664b00

new file mode 100644 (file)

index 0000000..3209b71
--- /dev/null
+++ b/2a/9df48161141a650843df56f6287e0c26664b00
@@ -0,0 +1,2028 @@
+Return-Path: <jani@nikula.org>\r
+X-Original-To: notmuch@notmuchmail.org\r
+Delivered-To: notmuch@notmuchmail.org\r
+Received: from localhost (localhost [127.0.0.1])\r
+       by olra.theworths.org (Postfix) with ESMTP id 1CFAA431FBC\r
+       for <notmuch@notmuchmail.org>; Sun, 28 Oct 2012 15:31:10 -0700 (PDT)\r
+X-Virus-Scanned: Debian amavisd-new at olra.theworths.org\r
+X-Spam-Flag: NO\r
+X-Spam-Score: -0.7\r
+X-Spam-Level: \r
+X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5\r
+       tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled\r
+Received: from olra.theworths.org ([127.0.0.1])\r
+       by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024)\r
+       with ESMTP id sN0KyEeLZazz for <notmuch@notmuchmail.org>;\r
+       Sun, 28 Oct 2012 15:31:05 -0700 (PDT)\r
+Received: from mail-la0-f53.google.com (mail-la0-f53.google.com\r
+       [209.85.215.53]) (using TLSv1 with cipher RC4-SHA (128/128 bits))\r
+       (No client certificate requested)\r
+       by olra.theworths.org (Postfix) with ESMTPS id 54A67431FAF\r
+       for <notmuch@notmuchmail.org>; Sun, 28 Oct 2012 15:31:05 -0700 (PDT)\r
+Received: by mail-la0-f53.google.com with SMTP id l5so3911531lah.26\r
+       for <notmuch@notmuchmail.org>; Sun, 28 Oct 2012 15:31:03 -0700 (PDT)\r
+X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;\r
+       d=google.com; s=20120113;\r
+       h=from:to:cc:subject:in-reply-to:references:user-agent:date\r
+       :message-id:mime-version:content-type:content-transfer-encoding\r
+       :x-gm-message-state;\r
+       bh=e3QxPM1CpFjLKigQh3lrDcHDa0IsfqyhBANEYJFyS4E=;\r
+       b=B+aWonXsSo8ySyvs0etLlxL4qj8OIyBLrgfbKsz1rT7a4ztxKsmyNASbIWpORIDanx\r
+       c/CzDHOxBvih+62DyeOQgSAmPca3WF10xSzJieuwQ3iDXSS2/BBDJRaQ0CB6sy+iTt6L\r
+       WcLX+UXxjHQpUt+lPvo737LXmXZCLhP5bKFwa+5F8SUuOi8UIybG7sLbfRVhzr0eEFBo\r
+       kmNE7rWkuQh39VMfl64q8NGC/YF4CEGHNUSLIm0kMhfIKeaQ/OlzPIDWoIOu9nyUmrB4\r
+       BMgIDKNEhv7k4diosc6REY1tcAqbFCqGrkicz89W5K5AoJIeAXCbdi4+zhvADMN/u0/q\r
+       NOyQ==\r
+Received: by 10.112.13.173 with SMTP id i13mr11308881lbc.108.1351463463635;\r
+       Sun, 28 Oct 2012 15:31:03 -0700 (PDT)\r
+Received: from localhost (dsl-hkibrasgw4-fe51df00-27.dhcp.inet.fi.\r
+       [80.223.81.27])\r
+       by mx.google.com with ESMTPS id sj3sm2475866lab.2.2012.10.28.15.30.59\r
+       (version=SSLv3 cipher=OTHER); Sun, 28 Oct 2012 15:31:02 -0700 (PDT)\r
+From: Jani Nikula <jani@nikula.org>\r
+To: Austin Clements <amdragon@MIT.EDU>\r
+Subject: Re: [PATCH v5 2/9] parse-time-string: add a date/time parser to\r
+       notmuch\r
+In-Reply-To: <20121022081444.GM14861@mit.edu>\r
+References: <cover.1350854171.git.jani@nikula.org>\r
+       <a90d3b687895a26f765539d6c0420038a74ee42f.1350854171.git.jani@nikula.org>\r
+       <20121022081444.GM14861@mit.edu>\r
+User-Agent: Notmuch/0.14+46~g272a1f1 (http://notmuchmail.org) Emacs/23.4.1\r
+       (i686-pc-linux-gnu)\r
+Date: Mon, 29 Oct 2012 00:30:58 +0200\r
+Message-ID: <87lieqkz4t.fsf@nikula.org>\r
+MIME-Version: 1.0\r
+Content-Type: text/plain; charset=utf-8\r
+Content-Transfer-Encoding: quoted-printable\r
+X-Gm-Message-State:\r
+ ALoCoQm6jf4JuYwdb9RlFTkXoM6Z/MlwK54xvKsGp5NZ8DomdiPjlQ9mfBxrUEOziFTavRivxsFL\r
+Cc: notmuch@notmuchmail.org\r
+X-BeenThere: notmuch@notmuchmail.org\r
+X-Mailman-Version: 2.1.13\r
+Precedence: list\r
+List-Id: "Use and development of the notmuch mail system."\r
+       <notmuch.notmuchmail.org>\r
+List-Unsubscribe: <http://notmuchmail.org/mailman/options/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=unsubscribe>\r
+List-Archive: <http://notmuchmail.org/pipermail/notmuch>\r
+List-Post: <mailto:notmuch@notmuchmail.org>\r
+List-Help: <mailto:notmuch-request@notmuchmail.org?subject=help>\r
+List-Subscribe: <http://notmuchmail.org/mailman/listinfo/notmuch>,\r
+       <mailto:notmuch-request@notmuchmail.org?subject=subscribe>\r
+X-List-Received-Date: Sun, 28 Oct 2012 22:31:10 -0000\r
+\r
+On Mon, 22 Oct 2012, Austin Clements <amdragon@MIT.EDU> wrote:\r
+> Overall this looks pretty good to me, and I must say, this parser is\r
+> amazingly flexible and copes well with a remarkably hostile grammar.\r
+>\r
+> A lot of little comments below (sorry if any of this ground has\r
+> already been covered in the previous four versions).\r
+\r
+Nope, apart from "postponed numbers are confusing".\r
+\r
+> I do have one broad comment.  While I'm all for ad hoc parsers for ad\r
+> hoc grammars like dates, there is one piece of the literature I think\r
+> this parser suffers for by ignoring: tokenizing.  I think it would\r
+> simplify a lot of this code if it did a tokenizing pass before the\r
+> parsing pass.  It doesn't have to be a serious tokenizer with\r
+> streaming and keywords and token types and junk; just something that\r
+> first splits the input into substrings, possibly just non-overlapping\r
+> matches of [[:digit:]]+|[[:alpha:]]+|[-+:/.].  This would simplify the\r
+> handling of postponed numbers because, with trivial lookahead in the\r
+> token stream, you wouldn't have to postpone them.  Likewise, it would\r
+> eliminate last_field.  It would simplify keyword matching because you\r
+> wouldn't have to worry about matching substrings (I spent a long time\r
+> staring at that code before I figured out what it would and wouldn't\r
+> accept).  Most important, I think it would make the parser more\r
+> predictable for users; for example, the parser currently accepts\r
+> things like "saturtoday" because it's aggressively single-pass.\r
+\r
+I'll fix this to require a non-keyword character between keywords, but,\r
+since you're not adamant about it, I'll pass adding the tokenizer, at\r
+least for now.\r
+\r
+> Quoth Jani Nikula on Oct 22 at 12:22 am:\r
+>> Add a date/time parser to notmuch, to be used for adding date range\r
+>> query support for notmuch lib later on. Add the parser to a directory\r
+>> of its own to make it independent of the rest of the notmuch code\r
+>> base.\r
+>>=20\r
+>> Signed-off-by: Jani Nikula <jani@nikula.org>\r
+>> ---\r
+>>  Makefile                              |    2 +-\r
+>>  parse-time-string/Makefile            |    5 +\r
+>>  parse-time-string/Makefile.local      |   12 +\r
+>>  parse-time-string/README              |    9 +\r
+>>  parse-time-string/parse-time-string.c | 1477 ++++++++++++++++++++++++++=\r
++++++++\r
+>>  parse-time-string/parse-time-string.h |  102 +++\r
+>>  6 files changed, 1606 insertions(+), 1 deletion(-)\r
+>>  create mode 100644 parse-time-string/Makefile\r
+>>  create mode 100644 parse-time-string/Makefile.local\r
+>>  create mode 100644 parse-time-string/README\r
+>>  create mode 100644 parse-time-string/parse-time-string.c\r
+>>  create mode 100644 parse-time-string/parse-time-string.h\r
+>>=20\r
+>> diff --git a/Makefile b/Makefile\r
+>> index e5e2e3a..bb9c316 100644\r
+>> --- a/Makefile\r
+>> +++ b/Makefile\r
+>> @@ -3,7 +3,7 @@\r
+>>  all:\r
+>>=20=20\r
+>>  # List all subdirectories here. Each contains its own Makefile.local\r
+>> -subdirs =3D compat completion emacs lib man util test\r
+>> +subdirs =3D compat completion emacs lib man parse-time-string util test\r
+>>=20=20\r
+>>  # We make all targets depend on the Makefiles themselves.\r
+>>  global_deps =3D Makefile Makefile.config Makefile.local \\r
+>> diff --git a/parse-time-string/Makefile b/parse-time-string/Makefile\r
+>> new file mode 100644\r
+>> index 0000000..fa25832\r
+>> --- /dev/null\r
+>> +++ b/parse-time-string/Makefile\r
+>> @@ -0,0 +1,5 @@\r
+>> +all:\r
+>> +   $(MAKE) -C .. all\r
+>> +\r
+>> +.DEFAULT:\r
+>> +   $(MAKE) -C .. $@\r
+>> diff --git a/parse-time-string/Makefile.local b/parse-time-string/Makefi=\r
+le.local\r
+>> new file mode 100644\r
+>> index 0000000..53534f3\r
+>> --- /dev/null\r
+>> +++ b/parse-time-string/Makefile.local\r
+>> @@ -0,0 +1,12 @@\r
+>> +dir :=3D parse-time-string\r
+>> +extra_cflags +=3D -I$(srcdir)/$(dir)\r
+>> +\r
+>> +libparse-time-string_c_srcs :=3D $(dir)/parse-time-string.c\r
+>> +\r
+>> +libparse-time-string_modules :=3D $(libparse-time-string_c_srcs:.c=3D.o)\r
+>> +\r
+>> +$(dir)/libparse-time-string.a: $(libparse-time-string_modules)\r
+>> +   $(call quiet,AR) rcs $@ $^\r
+>> +\r
+>> +SRCS :=3D $(SRCS) $(libparse-time-string_c_srcs)\r
+>> +CLEAN :=3D $(CLEAN) $(libparse-time-string_modules) $(dir)/libparse-tim=\r
+e-string.a\r
+>> diff --git a/parse-time-string/README b/parse-time-string/README\r
+>> new file mode 100644\r
+>> index 0000000..300ff1f\r
+>> --- /dev/null\r
+>> +++ b/parse-time-string/README\r
+>> @@ -0,0 +1,9 @@\r
+>> +PARSE TIME STRING\r
+>> +=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D=3D\r
+>> +\r
+>> +parse_time_string() is a date/time parser originally written for\r
+>> +notmuch by Jani Nikula <jani@nikula.org>. However, there is nothing\r
+>> +notmuch specific in it, and it should be kept reusable for other\r
+>> +projects, and ready to be packaged on its own as needed. Please do not\r
+>> +add dependencies on or references to anything notmuch specific. The\r
+>> +parser should only depend on the C library.\r
+>> diff --git a/parse-time-string/parse-time-string.c b/parse-time-string/p=\r
+arse-time-string.c\r
+>> new file mode 100644\r
+>> index 0000000..942041a\r
+>> --- /dev/null\r
+>> +++ b/parse-time-string/parse-time-string.c\r
+>> @@ -0,0 +1,1477 @@\r
+>> +/*\r
+>> + * parse time string - user friendly date and time parser\r
+>> + * Copyright =C2=A9 2012 Jani Nikula\r
+>> + *\r
+>> + * This program is free software: you can redistribute it and/or modify\r
+>> + * it under the terms of the GNU General Public License as published by\r
+>> + * the Free Software Foundation, either version 2 of the License, or\r
+>> + * (at your option) any later version.\r
+>> + *\r
+>> + * This program is distributed in the hope that it will be useful,\r
+>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of\r
+>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\r
+>> + * GNU General Public License for more details.\r
+>> + *\r
+>> + * You should have received a copy of the GNU General Public License\r
+>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.\r
+>> + *\r
+>> + * Author: Jani Nikula <jani@nikula.org>\r
+>> + */\r
+>> +\r
+>> +#include <assert.h>\r
+>> +#include <ctype.h>\r
+>> +#include <errno.h>\r
+>> +#include <limits.h>\r
+>> +#include <stdio.h>\r
+>> +#include <stdarg.h>\r
+>> +#include <stdbool.h>\r
+>> +#include <stdlib.h>\r
+>> +#include <string.h>\r
+>> +#include <strings.h>\r
+>> +#include <time.h>\r
+>> +#include <sys/time.h>\r
+>> +#include <sys/types.h>\r
+>> +\r
+>> +#include "parse-time-string.h"\r
+>> +\r
+>> +/*\r
+>> + * IMPLEMENTATION DETAILS\r
+>> + *\r
+>> + * At a high level, the parsing is done in two phases: 1) actual\r
+>> + * parsing of the input string and storing the parsed data into\r
+>> + * 'struct state', and 2) processing of the data in 'struct state'\r
+>> + * according to current time (or provided reference time) and\r
+>> + * rounding. This is evident in the main entry point function\r
+>> + * parse_time_string().\r
+>> + *\r
+>> + * 1) The parsing phase - parse_input()\r
+>> + *\r
+>> + * Parsing is greedy and happens from left to right. The parsing is as\r
+>> + * unambiguous as possible; only unambiguous date/time formats are\r
+>> + * accepted. Redundant or contradictory absolute date/time in the\r
+>> + * input (e.g. date specified multiple times/ways) is not\r
+>> + * accepted. Relative date/time on the other hand just accumulates if\r
+>> + * present multiple times (e.g. "5 days 5 days" just turns into 10\r
+>> + * days).\r
+>> + *\r
+>> + * Parsing decisions are made on the input format, not value. For\r
+>> + * example, "20/5/2005" fails because the recognized format here is\r
+>> + * MM/D/YYYY, even though the values would suggest DD/M/YYYY.\r
+>> + *\r
+>> + * Parsing is mostly stateless in the sense that parsing decisions are\r
+>> + * not made based on the values of previously parsed data, or whether\r
+>> + * certain data is present in the first place. (There are a few\r
+>> + * exceptions to the latter part, though, such as parsing of time zone\r
+>> + * that would otherwise look like plain time.)\r
+>> + *\r
+>> + * When the parser encounters a number that is not greedily parsed as\r
+>> + * part of a format, the interpretation is postponed until the next\r
+>> + * token is parsed. The parser for the next token may consume the\r
+>> + * previously postponed number. For example, when parsing "20 May" the\r
+>> + * meaning of "20" is not known until "May" is parsed. If the parser\r
+>> + * for the next token does not consume the postponed number, the\r
+>> + * number is handled as a "lone" number before parser for the next\r
+>> + * token finishes.\r
+>> + *\r
+>> + * 2) The processing phase - create_output()\r
+>> + *\r
+>> + * Once the parser in phase 1 has finished, 'struct state' contains\r
+>> + * all the information from the input string, and it's no longer\r
+>> + * needed. Since the parser does not even handle the concept of "now",\r
+>> + * the processing initializes the fields referring to the current\r
+>> + * date/time.\r
+>> + *\r
+>> + * If requested, the result is rounded towards past or future. The\r
+>> + * idea behind rounding is to support parsing date/time ranges in an\r
+>> + * obvious way. For example, for a range defined as two dates (without\r
+>> + * time), one would typically want to have an inclusive range from the\r
+>> + * beginning of start date to the end of the end date. The caller\r
+>> + * would use rounding towards past in the start date, and towards\r
+>> + * future in the end date.\r
+>> + *\r
+>> + * The absolute date and time is shifted by the relative date and\r
+>> + * time, and time zone adjustments are made. Daylight saving time\r
+>> + * (DST) is specifically *not* handled at all.\r
+>> + *\r
+>> + * Finally, the result is stored to time_t.\r
+>> + */\r
+>> +\r
+>> +#define unused(x) x __attribute__ ((unused))\r
+>> +\r
+>> +/* XXX: Redefine these to add i18n support. The keyword table uses\r
+>> + * N_() to mark strings to be translated; they are accessed\r
+>> + * dynamically using _(). */\r
+>> +#define _(s) (s)   /* i18n: define as gettext (s) */\r
+>> +#define N_(s) (s)  /* i18n: define as gettext_noop (s) */\r
+>> +\r
+>> +#define ARRAY_SIZE(a) (sizeof (a) / sizeof (a[0]))\r
+>> +\r
+>> +/*\r
+>> + * Field indices in the tm and set arrays of struct state.\r
+>> + *\r
+>> + * NOTE: There's some code that depends on the ordering of this enum.\r
+>> + */\r
+>> +enum field {\r
+>> +    /* Keep SEC...YEAR in this order. */\r
+>> +    TM_ABS_SEC,            /* seconds */\r
+>> +    TM_ABS_MIN,            /* minutes */\r
+>> +    TM_ABS_HOUR,   /* hours */\r
+>> +    TM_ABS_MDAY,   /* day of the month */\r
+>> +    TM_ABS_MON,            /* month */\r
+>> +    TM_ABS_YEAR,   /* year */\r
+>> +\r
+>> +    TM_ABS_WDAY,   /* day of the week. special: may be relative */\r
+>\r
+> Given that this may be relative, should it really be called\r
+> TM_ABS_WDAY?\r
+\r
+Will change to TM_WDAY.\r
+\r
+>> +    TM_ABS_ISDST,  /* daylight saving time */\r
+>> +\r
+>> +    TM_AMPM,               /* am vs. pm */\r
+>> +    TM_TZ,         /* timezone in minutes */\r
+>> +\r
+>> +    /* Keep SEC...YEAR in this order. */\r
+>> +    TM_REL_SEC,            /* seconds relative to absolute or reference time */\r
+>> +    TM_REL_MIN,            /* minutes ... */\r
+>> +    TM_REL_HOUR,   /* hours ... */\r
+>> +    TM_REL_DAY,            /* days ... */\r
+>> +    TM_REL_MON,            /* months ... */\r
+>> +    TM_REL_YEAR,   /* years ... */\r
+>> +    TM_REL_WEEK,   /* weeks ... */\r
+>> +\r
+>> +    TM_NONE,               /* not a field */\r
+>> +\r
+>> +    TM_SIZE =3D TM_NONE,\r
+>> +    TM_FIRST_ABS =3D TM_ABS_SEC,\r
+>> +    TM_FIRST_REL =3D TM_REL_SEC,\r
+>> +};\r
+>> +\r
+>> +/* Values for the set array of struct state. */\r
+>> +enum field_set {\r
+>> +    FIELD_UNSET,   /* The field has not been touched by parser. */\r
+>> +    FIELD_SET,             /* The field has been set by parser. */\r
+>> +    FIELD_NOW,             /* The field will be set to reference time. */\r
+>> +};\r
+>> +\r
+>> +static enum field\r
+>> +next_abs_field (enum field field)\r
+>> +{\r
+>> +    /* NOTE: Depends on the enum ordering. */\r
+>> +    return field < TM_ABS_YEAR ? field + 1 : TM_NONE;\r
+>> +}\r
+>> +\r
+>> +static enum field\r
+>> +abs_to_rel_field (enum field field)\r
+>> +{\r
+>> +    assert (field <=3D TM_ABS_YEAR);\r
+>> +\r
+>> +    /* NOTE: Depends on the enum ordering. */\r
+>> +    return field + (TM_FIRST_REL - TM_FIRST_ABS);\r
+>> +}\r
+>> +\r
+>> +/* Get epoch value for field. */\r
+>\r
+> Explain what an "epoch value" for a field is.\r
+\r
+Will do.\r
+\r
+>> +static int\r
+>> +field_epoch (enum field field)\r
+>> +{\r
+>> +    if (field =3D=3D TM_ABS_MDAY || field =3D=3D TM_ABS_MON)\r
+>> +   return 1;\r
+>> +    else if (field =3D=3D TM_ABS_YEAR)\r
+>> +   return 1970;\r
+>> +    else\r
+>> +   return 0;\r
+>> +}\r
+>> +\r
+>> +/* The parsing state. */\r
+>> +struct state {\r
+>> +    int tm[TM_SIZE];                       /* parsed date and time */\r
+>> +    enum field_set set[TM_SIZE];   /* set status of tm */\r
+>> +\r
+>> +    enum field last_field; /* Previously set field. */\r
+>> +    char delim;\r
+>> +\r
+>> +    int postponed_length;  /* Number of digits in postponed value. */\r
+>> +    int postponed_value;\r
+>> +    char postponed_delim;  /* The delimiter preceding postponed number. =\r
+*/\r
+>> +};\r
+>> +\r
+>> +/*\r
+>> + * Helpers for postponed numbers.\r
+>> + *\r
+>> + * postponed_length is the number of digits in postponed value. 0\r
+>> + * means there is no postponed number. -1 means there is a postponed\r
+>> + * number, but it comes from a keyword, and it doesn't have digits.\r
+>> + */\r
+>> +static int\r
+>> +get_postponed_length (struct state *state)\r
+>> +{\r
+>> +    return state->postponed_length;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Consume a previously postponed number. Return true if a number was\r
+>> + * in fact postponed, false otherwise. Store the postponed number's\r
+>> + * value in *v, length in the input string in *n (or -1 if the number\r
+>> + * was written out and parsed as a keyword), and the preceding\r
+>> + * delimiter to *d.\r
+>\r
+> Mention that v, n, and d are unchanged if no number is postponed?  You\r
+> exploit this for default values elsewhere in the code.\r
+\r
+Will do.\r
+\r
+>> + */\r
+>> +static bool\r
+>> +get_postponed_number (struct state *state, int *v, int *n, char *d)\r
+>\r
+> Maybe "consume_postponed_number" to emphasize that this function has\r
+> side-effects (and isn't simply a "getter")?\r
+\r
+Agreed.\r
+\r
+>> +{\r
+>> +    if (!state->postponed_length)\r
+>> +   return false;\r
+>> +\r
+>> +    if (n)\r
+>> +   *n =3D state->postponed_length;\r
+>> +\r
+>> +    if (v)\r
+>> +   *v =3D state->postponed_value;\r
+>> +\r
+>> +    if (d)\r
+>> +   *d =3D state->postponed_delim;\r
+>> +\r
+>> +    state->postponed_length =3D 0;\r
+>> +    state->postponed_value =3D 0;\r
+>> +    state->postponed_delim =3D 0;\r
+>> +\r
+>> +    return true;\r
+>> +}\r
+>> +\r
+>> +static int parse_postponed_number (struct state *state, enum field next=\r
+_field);\r
+>> +\r
+>> +/*\r
+>> + * Postpone a number to be handled later. If one exists already,\r
+>> + * handle it first. n may be -1 to indicate a keyword that has no\r
+>> + * number length.\r
+>> + */\r
+>> +static int\r
+>> +set_postponed_number (struct state *state, int v, int n)\r
+>> +{\r
+>> +    int r;\r
+>> +    char d =3D state->delim;\r
+>> +\r
+>> +    /* Parse a previously postponed number, if any. */\r
+>> +    r =3D parse_postponed_number (state, TM_NONE);\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    state->postponed_length =3D n;\r
+>> +    state->postponed_value =3D v;\r
+>> +    state->postponed_delim =3D d;\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +static void\r
+>> +set_delim (struct state *state, char delim)\r
+>> +{\r
+>> +    state->delim =3D delim;\r
+>> +}\r
+>> +\r
+>> +static void\r
+>> +unset_delim (struct state *state)\r
+>> +{\r
+>> +    state->delim =3D 0;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Field set/get/mod helpers.\r
+>> + */\r
+>> +\r
+>> +/* Return true if field has been set. */\r
+>> +static bool\r
+>> +is_field_set (struct state *state, enum field field)\r
+>> +{\r
+>> +    assert (field < ARRAY_SIZE (state->tm));\r
+>> +\r
+>> +    return field < ARRAY_SIZE (state->set) &&\r
+>\r
+> state->tm and state->set are the same size, so this will always by\r
+> true given that the assert hasn't fired.  Is this just defensive\r
+> programming?\r
+\r
+This is leftover from when state->tm and state->set weren't the same\r
+size. Will clean up the asserts and checks.\r
+\r
+>> +      state->set[field] !=3D FIELD_UNSET;\r
+>> +}\r
+>> +\r
+>> +static void\r
+>> +unset_field (struct state *state, enum field field)\r
+>> +{\r
+>> +    assert (field < ARRAY_SIZE (state->tm));\r
+>> +\r
+>> +    state->set[field] =3D FIELD_UNSET;\r
+>> +    state->tm[field] =3D 0;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Set field to value. A field can only be set once to ensure the\r
+>> + * input does not contain redundant and potentially conflicting data.\r
+>> + */\r
+>> +static int\r
+>> +set_field (struct state *state, enum field field, int value)\r
+>> +{\r
+>> +    int r;\r
+>> +\r
+>> +    assert (field < ARRAY_SIZE (state->tm));\r
+>> +\r
+>> +    /* Fields can only be set once. */\r
+>> +    if (field < ARRAY_SIZE (state->set) && state->set[field] !=3D FIELD=\r
+_UNSET)\r
+>\r
+> Same comment about array sizes.  Also, this should probably call\r
+> is_field_set instead of open-coding it (which would make the array\r
+> size check even more redundant!)\r
+\r
+Agreed.\r
+\r
+>> +   return -PARSE_TIME_ERR_ALREADYSET;\r
+>> +\r
+>> +    state->set[field] =3D FIELD_SET;\r
+>> +\r
+>> +    /* Parse a previously postponed number, if any. */\r
+>> +    r =3D parse_postponed_number (state, field);\r
+>\r
+> I don't understand the big picture with postponed number handling yet,\r
+> but is it worth mentioning in this function's doc comment that it\r
+> processes postponed numbers?\r
+>\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    unset_delim (state);\r
+>> +\r
+>> +    state->tm[field] =3D value;\r
+>> +    state->last_field =3D field;\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Mark n fields in fields to be set to the reference date/time in the\r
+>> + * specified time zone, or local timezone if not specified. The fields\r
+>> + * will be initialized after parsing is complete and timezone is\r
+>> + * known.\r
+>> + */\r
+>> +static int\r
+>> +set_fields_to_now (struct state *state, enum field *fields, size_t n)\r
+>> +{\r
+>> +    size_t i;\r
+>> +    int r;\r
+>> +\r
+>> +    for (i =3D 0; i < n; i++) {\r
+>> +   r =3D set_field (state, fields[i], 0);\r
+>> +   if (r)\r
+>> +       return r;\r
+>> +   state->set[fields[i]] =3D FIELD_NOW;\r
+>> +    }\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Modify field by adding value to it. To be used on relative fields,\r
+>> + * which can be modified multiple times (to accumulate). */\r
+>> +static int\r
+>> +mod_field (struct state *state, enum field field, int value)\r
+>\r
+> add_to_field?\r
+\r
+Agreed.\r
+\r
+>> +{\r
+>> +    int r;\r
+>> +\r
+>> +    assert (field < ARRAY_SIZE (state->tm));   /* assert relative??? */\r
+>> +\r
+>> +    if (field < ARRAY_SIZE (state->set))\r
+>\r
+> Another redundant check?\r
+\r
+Yes.\r
+\r
+>> +   state->set[field] =3D FIELD_SET;\r
+>> +\r
+>> +    /* Parse a previously postponed number, if any. */\r
+>> +    r =3D parse_postponed_number (state, field);\r
+>\r
+> This postponed number stuff is getting really confusing...\r
+\r
+Ouch...\r
+\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    unset_delim (state);\r
+>> +\r
+>> +    state->tm[field] +=3D value;\r
+>> +    state->last_field =3D field;\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Get field value. Make sure the field is set before query. It's most\r
+>> + * likely an error to call this while parsing (for example fields set\r
+>> + * as FIELD_NOW will only be set to some value after parsing).\r
+>> + */\r
+>> +static int\r
+>> +get_field (struct state *state, enum field field)\r
+>> +{\r
+>> +    assert (field < ARRAY_SIZE (state->tm));\r
+>\r
+> Assert that the field is set?\r
+\r
+The relative fields might not be set, but have 0 value by default,\r
+during create_output().\r
+\r
+>> +\r
+>> +    return state->tm[field];\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Validity checkers.\r
+>> + */\r
+>> +static bool is_valid_12hour (int h)\r
+>> +{\r
+>> +    return h >=3D 0 && h <=3D 12;\r
+>\r
+> h >=3D 1?\r
+\r
+Will fix.\r
+\r
+>> +}\r
+>> +\r
+>> +static bool is_valid_time (int h, int m, int s)\r
+>> +{\r
+>> +    /* Allow 24:00:00 to denote end of day. */\r
+>> +    if (h =3D=3D 24 && m =3D=3D 0 && s =3D=3D 0)\r
+>> +   return true;\r
+>> +\r
+>> +    return h >=3D 0 && h <=3D 23 && m >=3D 0 && m <=3D 59 && s >=3D 0 &=\r
+& s <=3D 59;\r
+>> +}\r
+>> +\r
+>> +static bool is_valid_mday (int mday)\r
+>> +{\r
+>> +    return mday >=3D 1 && mday <=3D 31;\r
+>> +}\r
+>> +\r
+>> +static bool is_valid_mon (int mon)\r
+>> +{\r
+>> +    return mon >=3D 1 && mon <=3D 12;\r
+>> +}\r
+>> +\r
+>> +static bool is_valid_year (int year)\r
+>> +{\r
+>> +    return year >=3D 1970;\r
+>> +}\r
+>> +\r
+>> +static bool is_valid_date (int year, int mon, int mday)\r
+>> +{\r
+>> +    return is_valid_year (year) && is_valid_mon (mon) && is_valid_mday =\r
+(mday);\r
+>> +}\r
+>> +\r
+>> +/* Unset indicator for time and date set helpers. */\r
+>> +#define UNSET -1\r
+>> +\r
+>> +/* Time set helper. No input checking. Use UNSET (-1) to leave unset. */\r
+>> +static int\r
+>> +set_abs_time (struct state *state, int hour, int min, int sec)\r
+>> +{\r
+>> +    int r;\r
+>> +\r
+>> +    if (hour !=3D UNSET) {\r
+>> +   if ((r =3D set_field (state, TM_ABS_HOUR, hour)))\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    if (min !=3D UNSET) {\r
+>> +   if ((r =3D set_field (state, TM_ABS_MIN, min)))\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    if (sec !=3D UNSET) {\r
+>> +   if ((r =3D set_field (state, TM_ABS_SEC, sec)))\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Date set helper. No input checking. Use UNSET (-1) to leave unset. */\r
+>> +static int\r
+>> +set_abs_date (struct state *state, int year, int mon, int mday)\r
+>> +{\r
+>> +    int r;\r
+>> +\r
+>> +    if (year !=3D UNSET) {\r
+>> +   if ((r =3D set_field (state, TM_ABS_YEAR, year)))\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    if (mon !=3D UNSET) {\r
+>> +   if ((r =3D set_field (state, TM_ABS_MON, mon)))\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    if (mday !=3D UNSET) {\r
+>> +   if ((r =3D set_field (state, TM_ABS_MDAY, mday)))\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Keyword parsing and handling.\r
+>> + */\r
+>> +struct keyword;\r
+>> +typedef int (*setter_t)(struct state *state, struct keyword *kw);\r
+>> +\r
+>> +struct keyword {\r
+>> +    const char *name;      /* keyword */\r
+>> +    enum field field;      /* field to set, or FIELD_NONE if N/A */\r
+>> +    int value;             /* value to set, or 0 if N/A */\r
+>> +    setter_t set;  /* function to use for setting, if non-NULL */\r
+>> +};\r
+>> +\r
+>> +/*\r
+>> + * Setter callback functions for keywords.\r
+>> + */\r
+>> +static int\r
+>> +kw_set_default (struct state *state, struct keyword *kw)\r
+>\r
+> It took me a while to figure out what the name of this had to do with\r
+> the action it performs, then I realized that it's never used in the\r
+> table and only called when set is NULL.  Given that, I think it would\r
+> make more sense to just put the set_field call in place of the one\r
+> current call to kw_set_default.  Currently, this seems like one\r
+> indirection too much.\r
+\r
+Agreed.\r
+\r
+>> +{\r
+>> +    return set_field (state, kw->field, kw->value);\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_rel (struct state *state, struct keyword *kw)\r
+>> +{\r
+>> +    int multiplier =3D 1;\r
+>> +\r
+>> +    /* Get a previously set multiplier, if any. */\r
+>> +    get_postponed_number (state, &multiplier, NULL, NULL);\r
+>> +\r
+>> +    /* Accumulate relative field values. */\r
+>> +    return mod_field (state, kw->field, multiplier * kw->value);\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_number (struct state *state, struct keyword *kw)\r
+>> +{\r
+>> +    /* -1 =3D no length, from keyword. */\r
+>> +    return set_postponed_number (state, kw->value, -1);\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_month (struct state *state, struct keyword *kw)\r
+>> +{\r
+>> +    int n =3D get_postponed_length (state);\r
+>> +\r
+>> +    /* Consume postponed number if it could be mday. This handles "20\r
+>> +     * January". */\r
+>> +    if (n =3D=3D 1 || n =3D=3D 2) {\r
+>\r
+> Should this be (n && is_valid_mday (state->postponed_value))?  It\r
+> seems a little odd that postponed numbers three digits or longer are\r
+> treated as independent, but two digits numbers > 31 are an error.\r
+\r
+I'm inclined to treating any one- or two-digit number preceding a month\r
+name as day of the month, and not letting the value affect that\r
+decision.\r
+\r
+>> +   int r, v;\r
+>> +\r
+>> +   get_postponed_number (state, &v, NULL, NULL);\r
+>> +\r
+>> +   if (!is_valid_mday (v))\r
+>> +       return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +   r =3D set_field (state, TM_ABS_MDAY, v);\r
+>> +   if (r)\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    return set_field (state, kw->field, kw->value);\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_ampm (struct state *state, struct keyword *kw)\r
+>> +{\r
+>> +    int n =3D get_postponed_length (state);\r
+>> +\r
+>> +    /* Consume postponed number if it could be hour. This handles\r
+>> +     * "5pm". */\r
+>> +    if (n =3D=3D 1 || n =3D=3D 2) {\r
+>\r
+> Same comment as for kw_set_month.\r
+\r
+Same as above.\r
+\r
+>> +   int r, v;\r
+>> +\r
+>> +   get_postponed_number (state, &v, NULL, NULL);\r
+>> +\r
+>> +   if (!is_valid_12hour (v))\r
+>> +       return -PARSE_TIME_ERR_INVALIDTIME;\r
+>> +\r
+>> +   r =3D set_abs_time (state, v, 0, 0);\r
+>> +   if (r)\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    return set_field (state, kw->field, kw->value);\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_timeofday (struct state *state, struct keyword *kw)\r
+>> +{\r
+>> +    return set_abs_time (state, kw->value, 0, 0);\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_today (struct state *state, unused (struct keyword *kw))\r
+>> +{\r
+>> +    enum field fields[] =3D { TM_ABS_YEAR, TM_ABS_MON, TM_ABS_MDAY };\r
+>> +\r
+>> +    return set_fields_to_now (state, fields, ARRAY_SIZE (fields));\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_now (struct state *state, unused (struct keyword *kw))\r
+>> +{\r
+>> +    enum field fields[] =3D { TM_ABS_HOUR, TM_ABS_MIN, TM_ABS_SEC };\r
+>> +\r
+>> +    return set_fields_to_now (state, fields, ARRAY_SIZE (fields));\r
+>> +}\r
+>> +\r
+>> +static int\r
+>> +kw_set_ordinal (struct state *state, struct keyword *kw)\r
+>> +{\r
+>> +    int n, v;\r
+>> +\r
+>> +    /* Require a postponed number. */\r
+>> +    if (!get_postponed_number (state, &v, &n, NULL))\r
+>> +   return -PARSE_TIME_ERR_DATEFORMAT;\r
+>> +\r
+>> +    /* Ordinals are mday. */\r
+>> +    if (n !=3D 1 && n !=3D 2)\r
+>\r
+> Is this redundant with your is_valid_mday test below?\r
+\r
+No, this rejects stuff like "005th" and "five th".\r
+\r
+>> +   return -PARSE_TIME_ERR_DATEFORMAT;\r
+>> +\r
+>> +    /* Be strict about st, nd, rd, and lax about th. */\r
+>> +    if (strcasecmp (kw->name, "st") =3D=3D 0 && v !=3D 1 && v !=3D 21 &=\r
+& v !=3D 31)\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +    else if (strcasecmp (kw->name, "nd") =3D=3D 0 && v !=3D 2 && v !=3D=\r
+ 22)\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +    else if (strcasecmp (kw->name, "rd") =3D=3D 0 && v !=3D 3 && v !=3D=\r
+ 23)\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +    else if (strcasecmp (kw->name, "th") =3D=3D 0 && !is_valid_mday (v))\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +    return set_field (state, TM_ABS_MDAY, v);\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Accepted keywords.\r
+>> + *\r
+>> + * A keyword may optionally contain a '|' to indicate the minimum\r
+>> + * match length. Without one, full match is required. It's advisable\r
+>> + * to keep the minimum match parts unique across all keywords.\r
+>> + *\r
+>> + * If keyword begins with upper case letter, then the matching will be\r
+>> + * case sensitive. Otherwise the matching is case insensitive.\r
+>> + *\r
+>> + * If setter is NULL, set_default will be used.\r
+>> + *\r
+>> + * Note: Order matters. Matching is greedy, longest match is used, but\r
+>> + * of equal length matches the first one is used, unless there's an\r
+>> + * equal length case sensitive match which trumps case insensitive\r
+>> + * matches.\r
+>\r
+> If you do have a tokenizer (or disallow mashing keywords together),\r
+> then all of complexity arising from longest match goes away because\r
+> the keyword token either will or won't match a keyword.  If you also\r
+> eliminate the rule for case sensitivity and put case-sensitive things\r
+> before conflicting case-insensitive things (so put "M" before\r
+> "m|inutes"), then you can simply use the first match.\r
+\r
+At least one reason for going through the whole table is that if this\r
+ever gets i18n support, the conflicting things might be different. While\r
+order matters in principle, you should create the table so that it\r
+really doesn't matter.\r
+\r
+>\r
+>> + */\r
+>> +static struct keyword keywords[] =3D {\r
+>> +    /* Weekdays. */\r
+>> +    { N_("sun|day"),       TM_ABS_WDAY,    0,      NULL },\r
+>> +    { N_("mon|day"),       TM_ABS_WDAY,    1,      NULL },\r
+>> +    { N_("tue|sday"),      TM_ABS_WDAY,    2,      NULL },\r
+>> +    { N_("wed|nesday"),    TM_ABS_WDAY,    3,      NULL },\r
+>> +    { N_("thu|rsday"),     TM_ABS_WDAY,    4,      NULL },\r
+>> +    { N_("fri|day"),       TM_ABS_WDAY,    5,      NULL },\r
+>> +    { N_("sat|urday"),     TM_ABS_WDAY,    6,      NULL },\r
+>> +\r
+>> +    /* Months. */\r
+>> +    { N_("jan|uary"),      TM_ABS_MON,     1,      kw_set_month },\r
+>> +    { N_("feb|ruary"),     TM_ABS_MON,     2,      kw_set_month },\r
+>> +    { N_("mar|ch"),        TM_ABS_MON,     3,      kw_set_month },\r
+>> +    { N_("apr|il"),        TM_ABS_MON,     4,      kw_set_month },\r
+>> +    { N_("may"),   TM_ABS_MON,     5,      kw_set_month },\r
+>> +    { N_("jun|e"), TM_ABS_MON,     6,      kw_set_month },\r
+>> +    { N_("jul|y"), TM_ABS_MON,     7,      kw_set_month },\r
+>> +    { N_("aug|ust"),       TM_ABS_MON,     8,      kw_set_month },\r
+>> +    { N_("sep|tember"),    TM_ABS_MON,     9,      kw_set_month },\r
+>> +    { N_("oct|ober"),      TM_ABS_MON,     10,     kw_set_month },\r
+>> +    { N_("nov|ember"),     TM_ABS_MON,     11,     kw_set_month },\r
+>> +    { N_("dec|ember"),     TM_ABS_MON,     12,     kw_set_month },\r
+>> +\r
+>> +    /* Durations. */\r
+>> +    { N_("y|ears"),        TM_REL_YEAR,    1,      kw_set_rel },\r
+>> +    { N_("w|eeks"),        TM_REL_WEEK,    1,      kw_set_rel },\r
+>> +    { N_("d|ays"), TM_REL_DAY,     1,      kw_set_rel },\r
+>> +    { N_("h|ours"),        TM_REL_HOUR,    1,      kw_set_rel },\r
+>> +    { N_("hr|s"),  TM_REL_HOUR,    1,      kw_set_rel },\r
+>> +    { N_("m|inutes"),      TM_REL_MIN,     1,      kw_set_rel },\r
+>> +    /* M=3Dmonths, m=3Dminutes */\r
+>> +    { N_("M"),             TM_REL_MON,     1,      kw_set_rel },\r
+>> +    { N_("mins"),  TM_REL_MIN,     1,      kw_set_rel },\r
+>> +    { N_("mo|nths"),       TM_REL_MON,     1,      kw_set_rel },\r
+>> +    { N_("s|econds"),      TM_REL_SEC,     1,      kw_set_rel },\r
+>> +    { N_("secs"),  TM_REL_SEC,     1,      kw_set_rel },\r
+>> +\r
+>> +    /* Numbers. */\r
+>> +    { N_("one"),   TM_NONE,        1,      kw_set_number },\r
+>> +    { N_("two"),   TM_NONE,        2,      kw_set_number },\r
+>> +    { N_("three"), TM_NONE,        3,      kw_set_number },\r
+>> +    { N_("four"),  TM_NONE,        4,      kw_set_number },\r
+>> +    { N_("five"),  TM_NONE,        5,      kw_set_number },\r
+>> +    { N_("six"),   TM_NONE,        6,      kw_set_number },\r
+>> +    { N_("seven"), TM_NONE,        7,      kw_set_number },\r
+>> +    { N_("eight"), TM_NONE,        8,      kw_set_number },\r
+>> +    { N_("nine"),  TM_NONE,        9,      kw_set_number },\r
+>> +    { N_("ten"),   TM_NONE,        10,     kw_set_number },\r
+>> +    { N_("dozen"), TM_NONE,        12,     kw_set_number },\r
+>> +    { N_("hundred"),       TM_NONE,        100,    kw_set_number },\r
+>> +\r
+>> +    /* Special number forms. */\r
+>> +    { N_("this"),  TM_NONE,        0,      kw_set_number },\r
+>> +    { N_("last"),  TM_NONE,        1,      kw_set_number },\r
+>> +\r
+>> +    /* Other special keywords. */\r
+>> +    { N_("yesterday"),     TM_REL_DAY,     1,      kw_set_rel },\r
+>> +    { N_("today"), TM_NONE,        0,      kw_set_today },\r
+>> +    { N_("now"),   TM_NONE,        0,      kw_set_now },\r
+>> +    { N_("noon"),  TM_NONE,        12,     kw_set_timeofday },\r
+>> +    { N_("midnight"),      TM_NONE,        0,      kw_set_timeofday },\r
+>> +    { N_("am"),            TM_AMPM,        0,      kw_set_ampm },\r
+>> +    { N_("a.m."),  TM_AMPM,        0,      kw_set_ampm },\r
+>> +    { N_("pm"),            TM_AMPM,        1,      kw_set_ampm },\r
+>> +    { N_("p.m."),  TM_AMPM,        1,      kw_set_ampm },\r
+>> +    { N_("st"),            TM_NONE,        0,      kw_set_ordinal },\r
+>> +    { N_("nd"),            TM_NONE,        0,      kw_set_ordinal },\r
+>> +    { N_("rd"),            TM_NONE,        0,      kw_set_ordinal },\r
+>> +    { N_("th"),            TM_NONE,        0,      kw_set_ordinal },\r
+>> +\r
+>> +    /* Timezone codes: offset in minutes. XXX: Add more codes. */\r
+>> +    { N_("pst"),   TM_TZ,          -8*60,  NULL },\r
+>> +    { N_("mst"),   TM_TZ,          -7*60,  NULL },\r
+>> +    { N_("cst"),   TM_TZ,          -6*60,  NULL },\r
+>> +    { N_("est"),   TM_TZ,          -5*60,  NULL },\r
+>> +    { N_("ast"),   TM_TZ,          -4*60,  NULL },\r
+>> +    { N_("nst"),   TM_TZ,          -(3*60+30),     NULL },\r
+>> +\r
+>> +    { N_("gmt"),   TM_TZ,          0,      NULL },\r
+>> +    { N_("utc"),   TM_TZ,          0,      NULL },\r
+>> +\r
+>> +    { N_("wet"),   TM_TZ,          0,      NULL },\r
+>> +    { N_("cet"),   TM_TZ,          1*60,   NULL },\r
+>> +    { N_("eet"),   TM_TZ,          2*60,   NULL },\r
+>> +    { N_("fet"),   TM_TZ,          3*60,   NULL },\r
+>> +\r
+>> +    { N_("wat"),   TM_TZ,          1*60,   NULL },\r
+>> +    { N_("cat"),   TM_TZ,          2*60,   NULL },\r
+>> +    { N_("eat"),   TM_TZ,          3*60,   NULL },\r
+>> +};\r
+>> +\r
+>> +/*\r
+>> + * Compare strings s and keyword. Return number of matching chars on\r
+>> + * match, 0 for no match. Match must be at least n chars, or all of\r
+>> + * keyword if n < 0, otherwise it's not a match. Use match_case for\r
+>> + * case sensitive matching.\r
+>> + */\r
+>> +static size_t\r
+>> +match_keyword (const char *s, const char *keyword, ssize_t n, bool matc=\r
+h_case)\r
+>> +{\r
+>> +    ssize_t i;\r
+>> +\r
+>> +    if (!n)\r
+>> +   return 0;\r
+>> +\r
+>> +    for (i =3D 0; *s && *keyword; i++, s++, keyword++) {\r
+>> +   if (match_case) {\r
+>> +       if (*s !=3D *keyword)\r
+>\r
+> The pointer arithmetic doesn't seem to buy anything here.  What about\r
+> just looping over i and using s[i] and keyword[i]?\r
+\r
+The pointer arithmetic will be useful when I implement your other\r
+suggestion of handling '|' here. ;) Otherwise, I'd need two index\r
+variables.\r
+\r
+>\r
+>> +           break;\r
+>> +   } else {\r
+>> +       if (tolower ((unsigned char) *s) !=3D\r
+>> +           tolower ((unsigned char) *keyword))\r
+>\r
+> I don't think the cast to unsigned char is necessary.\r
+\r
+As discussed on IRC, pedantically it is necessary, as ctype.h functions\r
+accept an int that must have the value of an unsigned char or EOF, and\r
+char might be signed.\r
+\r
+>> +           break;\r
+>> +   }\r
+>> +    }\r
+>> +\r
+>> +    if (n > 0)\r
+>> +   return i < n ? 0 : i;\r
+>> +    else\r
+>> +   return *keyword ? 0 : i;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Parse a keyword. Return < 0 on error, number of parsed chars on\r
+>> + * success.\r
+>> + */\r
+>> +static ssize_t\r
+>> +parse_keyword (struct state *state, const char *s)\r
+>> +{\r
+>> +    unsigned int i;\r
+>> +    size_t n, max_n =3D 0;\r
+>> +    struct keyword *kw =3D NULL;\r
+>> +    int r;\r
+>> +\r
+>> +    /* Match longest keyword */\r
+>> +    for (i =3D 0; i < ARRAY_SIZE (keywords); i++) {\r
+>> +   /* Match case if keyword begins with upper case letter. */\r
+>> +   bool mcase =3D isupper ((unsigned char) keywords[i].name[0]);\r
+>\r
+> Same with this cast.\r
+>\r
+>> +   ssize_t minlen =3D -1;\r
+>> +   char keyword[128];\r
+>> +   char *p;\r
+>> +\r
+>> +   strncpy (keyword, _(keywords[i].name), sizeof (keyword));\r
+>> +\r
+>> +   /* Truncate too long keywords. XXX: Make this dynamic? */\r
+>> +   keyword[sizeof (keyword) - 1] =3D '\0';\r
+>> +\r
+>> +   /* Minimum match length. */\r
+>> +   p =3D strchr (keyword, '|');\r
+>> +   if (p) {\r
+>> +       minlen =3D p - keyword;\r
+>> +\r
+>> +       /* Remove the minimum match length separator. */\r
+>> +       memmove (p, p + 1, strlen (p + 1) + 1);\r
+>> +   }\r
+>\r
+> Would it make more sense to make match_keyword aware of the |\r
+> character?  Then you wouldn't need this dance with copying the keyword\r
+> into a scratch buffer.  I'm thinking something like (untested)\r
+\r
+Agreed.\r
+\r
+> static size_t\r
+> match_keyword (const char *s, const char *keyword, bool match_case)\r
+> {\r
+>     size_t i;\r
+>     bool prefix_matched =3D false;\r
+>\r
+>     for (i =3D 0; *s && *keyword; i++, s++, keyword++) {\r
+>         if (*keyword =3D=3D '|') {\r
+>             prefix_matched =3D true;\r
+>             ++keyword;\r
+>         }\r
+>         if (match_case && *s !=3D *keyword)\r
+>             return 0;\r
+>         else if (tolower (*s) !=3D tolower (*keyword))\r
+>             return 0;\r
+>     }\r
+>\r
+>     if (!*keyword || prefix_matched)\r
+>         return i;\r
+>     return 0;\r
+> }\r
+>\r
+>> +\r
+>> +   n =3D match_keyword (s, keyword, minlen, mcase);\r
+>> +   if (n > max_n || (n =3D=3D max_n && mcase)) {\r
+>> +       max_n =3D n;\r
+>> +       kw =3D &keywords[i];\r
+>> +   }\r
+>> +    }\r
+>> +\r
+>> +    if (!kw)\r
+>> +   return -PARSE_TIME_ERR_KEYWORD;\r
+>> +\r
+>> +    if (kw->set)\r
+>> +   r =3D kw->set (state, kw);\r
+>> +    else\r
+>> +   r =3D kw_set_default (state, kw);\r
+>> +\r
+>> +    if (r < 0)\r
+>> +   return r;\r
+>> +\r
+>> +    return max_n;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Non-keyword parsers and their helpers.\r
+>> + */\r
+>> +\r
+>> +static int\r
+>> +set_user_tz (struct state *state, char sign, int hour, int min)\r
+>> +{\r
+>> +    int tz =3D hour * 60 + min;\r
+>> +\r
+>> +    assert (sign =3D=3D '+' || sign =3D=3D '-');\r
+>> +\r
+>> +    if (hour < 0 || hour > 14 || min < 0 || min > 59 || min % 15)\r
+>\r
+> Good to see you're not forgetting our Kiribati notmuch user base.\r
+\r
+:)\r
+\r
+>> +   return -PARSE_TIME_ERR_INVALIDTIME;\r
+>> +\r
+>> +    if (sign =3D=3D '-')\r
+>> +   tz =3D -tz;\r
+>> +\r
+>> +    return set_field (state, TM_TZ, tz);\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Parse a previously postponed number if one exists. Independent\r
+>> + * parsing of a postponed number when it wasn't consumed during\r
+>> + * parsing of the following token.\r
+>> + */\r
+>> +static int\r
+>> +parse_postponed_number (struct state *state, unused (enum field next_fi=\r
+eld))\r
+>> +{\r
+>> +    int v, n;\r
+>> +    char d;\r
+>> +\r
+>> +    /* Bail out if there's no postponed number. */\r
+>> +    if (!get_postponed_number (state, &v, &n, &d))\r
+>> +   return 0;\r
+>> +\r
+>> +    if (n =3D=3D 1 || n =3D=3D 2) {\r
+>> +   /* Notable exception: Previous field affects parsing. This\r
+>> +    * handles "January 20". */\r
+>> +   if (state->last_field =3D=3D TM_ABS_MON) {\r
+>> +       /* D[D] */\r
+>> +       if (!is_valid_mday (v))\r
+>> +           return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +       return set_field (state, TM_ABS_MDAY, v);\r
+>> +   } else if (n =3D=3D 2) {\r
+>> +       /* XXX: Only allow if last field is hour, min, or sec? */\r
+>> +       if (d =3D=3D '+' || d =3D=3D '-') {\r
+>> +           /* +/-HH */\r
+>> +           return set_user_tz (state, d, v, 0);\r
+>> +       }\r
+>> +   }\r
+>> +    } else if (n =3D=3D 4) {\r
+>> +   /* Notable exception: Value affects parsing. Time zones are\r
+>> +    * always at most 1400 and we don't understand years before\r
+>> +    * 1970. */\r
+>> +   if (!is_valid_year (v)) {\r
+>> +       if (d =3D=3D '+' || d =3D=3D '-') {\r
+>> +           /* +/-HHMM */\r
+>> +           return set_user_tz (state, d, v / 100, v % 100);\r
+>> +       }\r
+>> +   } else {\r
+>> +       /* YYYY */\r
+>> +       return set_field (state, TM_ABS_YEAR, v);\r
+>> +   }\r
+>> +    } else if (n =3D=3D 6) {\r
+>> +   /* HHMMSS */\r
+>> +   int hour =3D v / 10000;\r
+>> +   int min =3D (v / 100) % 100;\r
+>> +   int sec =3D v % 100;\r
+>> +\r
+>> +   if (!is_valid_time (hour, min, sec))\r
+>> +       return -PARSE_TIME_ERR_INVALIDTIME;\r
+>> +\r
+>> +   return set_abs_time (state, hour, min, sec);\r
+>> +    } else if (n =3D=3D 8) {\r
+>> +   /* YYYYMMDD */\r
+>> +   int year =3D v / 10000;\r
+>> +   int mon =3D (v / 100) % 100;\r
+>> +   int mday =3D v % 100;\r
+>> +\r
+>> +   if (!is_valid_date (year, mon, mday))\r
+>> +       return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +   return set_abs_date (state, year, mon, mday);\r
+>> +    } else {\r
+>> +   return -PARSE_TIME_ERR_FORMAT;\r
+>\r
+> No need for the else block, given the return at the end.\r
+\r
+Will fix.\r
+\r
+>> +    }\r
+>> +\r
+>> +    return -PARSE_TIME_ERR_FORMAT;\r
+>> +}\r
+>> +\r
+>> +static int tm_get_field (const struct tm *tm, enum field field);\r
+>> +\r
+>> +static int\r
+>> +set_timestamp (struct state *state, time_t t)\r
+>> +{\r
+>> +    struct tm tm;\r
+>> +    enum field f;\r
+>> +    int r;\r
+>> +\r
+>> +    if (gmtime_r (&t, &tm) =3D=3D NULL)\r
+>> +   return -PARSE_TIME_ERR_LIB;\r
+>> +\r
+>> +    for (f =3D TM_ABS_SEC; f !=3D TM_NONE; f =3D next_abs_field (f)) {\r
+>> +   r =3D set_field (state, f, tm_get_field (&tm, f));\r
+>> +   if (r)\r
+>> +       return r;\r
+>> +    }\r
+>> +\r
+>> +    r =3D set_field (state, TM_TZ, 0);\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    /* XXX: Prevent TM_AMPM with timestamp, e.g. "@123456 pm" */\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Parse a single number. Typically postpone parsing until later. */\r
+>> +static int\r
+>> +parse_single_number (struct state *state, unsigned long v,\r
+>> +                unsigned long n)\r
+>> +{\r
+>> +    assert (n);\r
+>> +\r
+>> +    if (state->delim =3D=3D '@')\r
+>> +   return set_timestamp (state, (time_t) v);\r
+>> +\r
+>> +    if (v > INT_MAX)\r
+>> +   return -PARSE_TIME_ERR_FORMAT;\r
+>> +\r
+>> +    return set_postponed_number (state, v, n);\r
+>> +}\r
+>> +\r
+>> +static bool\r
+>> +is_time_sep (char c)\r
+>> +{\r
+>> +    return c =3D=3D ':';\r
+>> +}\r
+>> +\r
+>> +static bool\r
+>> +is_date_sep (char c)\r
+>> +{\r
+>> +    return c =3D=3D '/' || c =3D=3D '-' || c =3D=3D '.';\r
+>> +}\r
+>> +\r
+>> +static bool\r
+>> +is_sep (char c)\r
+>> +{\r
+>> +    return is_time_sep (c) || is_date_sep (c);\r
+>> +}\r
+>> +\r
+>> +/* Two-digit year: 00...69 is 2000s, 70...99 1900s, if n =3D=3D 0 keep\r
+>> + * unset. */\r
+>> +static int\r
+>> +expand_year (unsigned long year, size_t n)\r
+>> +{\r
+>> +    if (n =3D=3D 2) {\r
+>> +   return (year < 70 ? 2000 : 1900) + year;\r
+>> +    } else if (n =3D=3D 4) {\r
+>> +   return year;\r
+>> +    } else {\r
+>> +   return UNSET;\r
+>> +    }\r
+>> +}\r
+>> +\r
+>> +/* Parse a date number triplet. */\r
+>> +static int\r
+>> +parse_date (struct state *state, char sep,\r
+>> +       unsigned long v1, unsigned long v2, unsigned long v3,\r
+>> +       size_t n1, size_t n2, size_t n3)\r
+>> +{\r
+>> +    int year =3D UNSET, mon =3D UNSET, mday =3D UNSET;\r
+>> +\r
+>> +    assert (is_date_sep (sep));\r
+>> +\r
+>> +    switch (sep) {\r
+>> +    case '/': /* Date: M[M]/D[D][/YY[YY]] or M[M]/YYYY */\r
+>> +   if (n1 !=3D 1 && n1 !=3D 2)\r
+>> +       return -PARSE_TIME_ERR_DATEFORMAT;\r
+>> +\r
+>> +   if ((n2 =3D=3D 1 || n2 =3D=3D 2) && (n3 =3D=3D 0 || n3 =3D=3D 2 || n3 =\r
+=3D=3D 4)) {\r
+>> +       /* M[M]/D[D][/YY[YY]] */\r
+>> +       year =3D expand_year (v3, n3);\r
+>> +       mon =3D v1;\r
+>> +       mday =3D v2;\r
+>> +   } else if (n2 =3D=3D 4 && n3 =3D=3D 0) {\r
+>> +       /* M[M]/YYYY */\r
+>> +       year =3D v2;\r
+>> +       mon =3D v1;\r
+>> +   } else {\r
+>> +       return -PARSE_TIME_ERR_DATEFORMAT;\r
+>> +   }\r
+>> +   break;\r
+>> +\r
+>> +    case '-': /* Date: YYYY-MM[-DD] or DD-MM[-YY[YY]] or MM-YYYY */\r
+>> +   if (n1 =3D=3D 4 && n2 =3D=3D 2 && (n3 =3D=3D 0 || n3 =3D=3D 2)) {\r
+>> +       /* YYYY-MM[-DD] */\r
+>> +       year =3D v1;\r
+>> +       mon =3D v2;\r
+>> +       if (n3)\r
+>> +           mday =3D v3;\r
+>> +   } else if (n1 =3D=3D 2 && n2 =3D=3D 2 && (n3 =3D=3D 0 || n3 =3D=3D 2 |=\r
+| n3 =3D=3D 4)) {\r
+>> +       /* DD-MM[-YY[YY]] */\r
+>> +       year =3D expand_year (v3, n3);\r
+>> +       mon =3D v2;\r
+>> +       mday =3D v1;\r
+>> +   } else if (n1 =3D=3D 2 && n2 =3D=3D 4 && n3 =3D=3D 0) {\r
+>> +       /* MM-YYYY */\r
+>> +       year =3D v2;\r
+>> +       mon =3D v1;\r
+>> +   } else {\r
+>> +       return -PARSE_TIME_ERR_DATEFORMAT;\r
+>> +   }\r
+>> +   break;\r
+>> +\r
+>> +    case '.': /* Date: D[D].M[M][.[YY[YY]]] */\r
+>> +   if ((n1 !=3D 1 && n1 !=3D 2) || (n2 !=3D 1 && n2 !=3D 2) ||\r
+>> +       (n3 !=3D 0 && n3 !=3D 2 && n3 !=3D 4))\r
+>> +       return -PARSE_TIME_ERR_DATEFORMAT;\r
+>> +\r
+>> +   year =3D expand_year (v3, n3);\r
+>> +   mon =3D v2;\r
+>> +   mday =3D v1;\r
+>> +   break;\r
+>> +    }\r
+>> +\r
+>> +    if (year !=3D UNSET && !is_valid_year (year))\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +    if (mon !=3D UNSET && !is_valid_mon (mon))\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +    if (mday !=3D UNSET && !is_valid_mday (mday))\r
+>> +   return -PARSE_TIME_ERR_INVALIDDATE;\r
+>> +\r
+>> +    return set_abs_date (state, year, mon, mday);\r
+>> +}\r
+>> +\r
+>> +/* Parse a time number triplet. */\r
+>> +static int\r
+>> +parse_time (struct state *state, char sep,\r
+>> +       unsigned long v1, unsigned long v2, unsigned long v3,\r
+>> +       size_t n1, size_t n2, size_t n3)\r
+>> +{\r
+>> +    assert (is_time_sep (sep));\r
+>> +\r
+>> +    if ((n1 !=3D 1 && n1 !=3D 2) || n2 !=3D 2 || (n3 !=3D 0 && n3 !=3D =\r
+2))\r
+>> +   return -PARSE_TIME_ERR_TIMEFORMAT;\r
+>> +\r
+>> +    /*\r
+>> +     * Notable exception: Previously set fields affect\r
+>> +     * parsing. Interpret (+|-)HH:MM as time zone only if hour and\r
+>> +     * minute have been set.\r
+>> +     *\r
+>> +     * XXX: This could be fixed by restricting the delimiters\r
+>> +     * preceding time. For '+' it would be justified, but for '-' it\r
+>> +     * might be inconvenient. However prefer to allow '-' as an\r
+>> +     * insignificant delimiter preceding time for convenience, and\r
+>> +     * handle '+' the same way for consistency between positive and\r
+>> +     * negative time zones.\r
+>> +     */\r
+>> +    if (is_field_set (state, TM_ABS_HOUR) &&\r
+>> +   is_field_set (state, TM_ABS_MIN) &&\r
+>> +   n1 =3D=3D 2 && n2 =3D=3D 2 && n3 =3D=3D 0 &&\r
+>> +   (state->delim =3D=3D '+' || state->delim =3D=3D '-')) {\r
+>> +   return set_user_tz (state, state->delim, v1, v2);\r
+>> +    }\r
+>> +\r
+>> +    if (!is_valid_time (v1, v2, v3))\r
+>> +   return -PARSE_TIME_ERR_INVALIDTIME;\r
+>> +\r
+>> +    return set_abs_time (state, v1, v2, n3 ? v3 : 0);\r
+>> +}\r
+>> +\r
+>> +/* strtoul helper that assigns length. */\r
+>> +static unsigned long\r
+>> +strtoul_len (const char *s, const char **endp, size_t *len)\r
+>> +{\r
+>> +    unsigned long val =3D strtoul (s, (char **) endp, 10);\r
+>\r
+> This could technically get confused by really large numbers, but I\r
+> don't know if that's worth worrying about.\r
+\r
+I think I'll just ignore that for now.\r
+\r
+>> +\r
+>> +    *len =3D *endp - s;\r
+>> +    return val;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Parse a (group of) number(s). Return < 0 on error, number of parsed\r
+>> + * chars on success.\r
+>> + */\r
+>> +static ssize_t\r
+>> +parse_number (struct state *state, const char *s)\r
+>> +{\r
+>> +    int r;\r
+>> +    unsigned long v1, v2, v3 =3D 0;\r
+>> +    size_t n1, n2, n3 =3D 0;\r
+>> +    const char *p =3D s;\r
+>> +    char sep;\r
+>> +\r
+>> +    v1 =3D strtoul_len (p, &p, &n1);\r
+>> +\r
+>> +    if (is_sep (*p) && isdigit ((unsigned char) *(p + 1))) {\r
+>\r
+> Unnecessary cast?\r
+>\r
+>> +   sep =3D *p;\r
+>> +   v2 =3D strtoul_len (p + 1, &p, &n2);\r
+>> +    } else {\r
+>> +   /* A single number. */\r
+>> +   r =3D parse_single_number (state, v1, n1);\r
+>> +   if (r)\r
+>> +       return r;\r
+>> +\r
+>> +   return p - s;\r
+>\r
+> I found the control flow here confusing.  You might want to flip the\r
+> two conditions so the single number return happens first and the rest\r
+> of the code flows straight through:\r
+\r
+Agreed.\r
+\r
+> if (!is_sep (*p) || !isdigit (*(p + 1))) {\r
+>     ...\r
+>     return p - s;\r
+> }\r
+>\r
+> sep =3D *p;\r
+> ...\r
+>\r
+>> +    }\r
+>> +\r
+>> +    /* A group of two or three numbers? */\r
+>> +    if (*p =3D=3D sep && isdigit ((unsigned char) *(p + 1)))\r
+>> +   v3 =3D strtoul_len (p + 1, &p, &n3);\r
+>> +\r
+>> +    if (is_time_sep (sep))\r
+>> +   r =3D parse_time (state, sep, v1, v2, v3, n1, n2, n3);\r
+>> +    else\r
+>> +   r =3D parse_date (state, sep, v1, v2, v3, n1, n2, n3);\r
+>> +\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    return p - s;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Parse delimiter(s). Throw away all except the last one, which is\r
+>> + * stored for parsing the next non-delimiter. Return < 0 on error,\r
+>> + * number of parsed chars on success.\r
+>> + *\r
+>> + * XXX: We might want to be more strict here.\r
+>> + */\r
+>> +static ssize_t\r
+>> +parse_delim (struct state *state, const char *s)\r
+>> +{\r
+>> +    const char *p =3D s;\r
+>> +\r
+>> +    /*\r
+>> +     * Skip non-alpha and non-digit, and store the last for further\r
+>> +     * processing.\r
+>> +     */\r
+>> +    while (*p && !isalnum ((unsigned char) *p)) {\r
+>> +   set_delim (state, *p);\r
+>> +   p++;\r
+>> +    }\r
+>> +\r
+>> +    return p - s;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Parse a date/time string. Return < 0 on error, number of parsed\r
+>> + * chars on success.\r
+>> + */\r
+>> +static ssize_t\r
+>> +parse_input (struct state *state, const char *s)\r
+>> +{\r
+>> +    const char *p =3D s;\r
+>> +    ssize_t n;\r
+>> +    int r;\r
+>> +\r
+>> +    while (*p) {\r
+>> +   if (isalpha ((unsigned char) *p)) {\r
+>> +       n =3D parse_keyword (state, p);\r
+>> +   } else if (isdigit ((unsigned char) *p)) {\r
+>> +       n =3D parse_number (state, p);\r
+>> +   } else {\r
+>> +       n =3D parse_delim (state, p);\r
+>> +   }\r
+>> +\r
+>> +   if (n <=3D 0) {\r
+>> +       if (n =3D=3D 0)\r
+>> +           n =3D -PARSE_TIME_ERR;\r
+>> +\r
+>> +       return n;\r
+>> +   }\r
+>> +\r
+>> +   p +=3D n;\r
+>> +    }\r
+>> +\r
+>> +    /* Parse a previously postponed number, if any. */\r
+>> +    r =3D parse_postponed_number (state, TM_NONE);\r
+>> +    if (r < 0)\r
+>> +   return r;\r
+>> +\r
+>> +    return p - s;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Processing the parsed input.\r
+>> + */\r
+>> +\r
+>> +/*\r
+>> + * Initialize reference time to tm. Use time zone in state if\r
+>> + * specified, otherwise local time. Use now for reference time if\r
+>> + * non-NULL, otherwise current time.\r
+>> + */\r
+>> +static int\r
+>> +initialize_now (struct state *state, struct tm *tm, const time_t *now)\r
+>\r
+> Should tm be the last argument, since it's an out-argument?\r
+\r
+Will change.\r
+\r
+> Why is now a pointer?  Just so it can be NULL?\r
+\r
+Yes, coming all the way from the API.\r
+\r
+>> +{\r
+>> +    time_t t;\r
+>> +\r
+>> +    if (now) {\r
+>> +   t =3D *now;\r
+>> +    } else {\r
+>> +   if (time (&t) =3D=3D (time_t) -1)\r
+>> +       return -PARSE_TIME_ERR_LIB;\r
+>> +    }\r
+>> +\r
+>> +    if (is_field_set (state, TM_TZ)) {\r
+>> +   /* Some other time zone. */\r
+>> +\r
+>> +   /* Adjust now according to the TZ. */\r
+>> +   t +=3D get_field (state, TM_TZ) * 60;\r
+>> +\r
+>> +   /* It's not gm, but this doesn't mess with the TZ. */\r
+>> +   if (gmtime_r (&t, tm) =3D=3D NULL)\r
+>> +       return -PARSE_TIME_ERR_LIB;\r
+>> +    } else {\r
+>> +   /* Local time. */\r
+>> +   if (localtime_r (&t, tm) =3D=3D NULL)\r
+>> +       return -PARSE_TIME_ERR_LIB;\r
+>> +    }\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/*\r
+>> + * Normalize tm according to mktime(3). Both mktime(3) and\r
+>\r
+> This comment could elaborate a bit on what it means to normalize a tm.\r
+\r
+Agreed.\r
+\r
+>> + * localtime_r(3) use local time, but they cancel each other out here,\r
+>> + * making this function agnostic to time zone.\r
+>> + */\r
+>> +static int\r
+>> +normalize_tm (struct tm *tm)\r
+>> +{\r
+>> +    time_t t =3D mktime (tm);\r
+>> +\r
+>> +    if (t =3D=3D (time_t) -1)\r
+>> +   return -PARSE_TIME_ERR_LIB;\r
+>> +\r
+>> +    if (!localtime_r (&t, tm))\r
+>> +   return -PARSE_TIME_ERR_LIB;\r
+>\r
+> Do you actually need this call to localtime_r or can you just return\r
+> after the mktime modifies tm?  Does this have to do with timezones?\r
+\r
+Hmm, I'm not sure. I think I'll just keep it like this, because that's\r
+the way it has worked for months...\r
+\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Get field out of a struct tm. */\r
+>> +static int\r
+>> +tm_get_field (const struct tm *tm, enum field field)\r
+>> +{\r
+>> +    switch (field) {\r
+>> +    case TM_ABS_SEC:       return tm->tm_sec;\r
+>> +    case TM_ABS_MIN:       return tm->tm_min;\r
+>> +    case TM_ABS_HOUR:      return tm->tm_hour;\r
+>> +    case TM_ABS_MDAY:      return tm->tm_mday;\r
+>> +    case TM_ABS_MON:       return tm->tm_mon + 1; /* 0- to 1-based */\r
+>> +    case TM_ABS_YEAR:      return 1900 + tm->tm_year;\r
+>> +    case TM_ABS_WDAY:      return tm->tm_wday;\r
+>> +    case TM_ABS_ISDST:     return tm->tm_isdst;\r
+>> +    default:\r
+>> +   assert (false);\r
+>> +   break;\r
+>> +    }\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Modify hour according to am/pm setting. */\r
+>> +static int\r
+>> +fixup_ampm (struct state *state)\r
+>> +{\r
+>> +    int hour, hdiff =3D 0;\r
+>> +\r
+>> +    if (!is_field_set (state, TM_AMPM))\r
+>> +   return 0;\r
+>> +\r
+>> +    if (!is_field_set (state, TM_ABS_HOUR))\r
+>> +   return -PARSE_TIME_ERR_TIMEFORMAT;\r
+>> +\r
+>> +    hour =3D get_field (state, TM_ABS_HOUR);\r
+>> +    if (!is_valid_12hour (hour))\r
+>> +   return -PARSE_TIME_ERR_INVALIDTIME;\r
+>> +\r
+>> +    if (get_field (state, TM_AMPM)) {\r
+>> +   /* 12pm is noon. */\r
+>> +   if (hour !=3D 12)\r
+>> +       hdiff =3D 12;\r
+>> +    } else {\r
+>> +   /* 12am is midnight, beginning of day. */\r
+>> +   if (hour =3D=3D 12)\r
+>> +       hdiff =3D -12;\r
+>> +    }\r
+>> +\r
+>> +    mod_field (state, TM_REL_HOUR, -hdiff);\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Combine absolute and relative fields, and round. */\r
+>> +static int\r
+>> +create_output (struct state *state, time_t *t_out, const time_t *ref,\r
+>> +          int round)\r
+>> +{\r
+>> +    struct tm tm =3D { .tm_isdst =3D -1 };\r
+>> +    struct tm now;\r
+>> +    time_t t;\r
+>> +    enum field f;\r
+>> +    int r;\r
+>> +    int week_round =3D PARSE_TIME_NO_ROUND;\r
+>> +\r
+>> +    r =3D initialize_now (state, &now, ref);\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    /* Initialize fields flagged as "now" to reference time. */\r
+>> +    for (f =3D TM_ABS_SEC; f !=3D TM_NONE; f =3D next_abs_field (f)) {\r
+>> +   if (state->set[f] =3D=3D FIELD_NOW) {\r
+>> +       state->tm[f] =3D tm_get_field (&now, f);\r
+>> +       state->set[f] =3D FIELD_SET;\r
+>> +   }\r
+>> +    }\r
+>> +\r
+>> +    /*\r
+>> +     * If WDAY is set but MDAY is not, we consider WDAY relative\r
+>> +     *\r
+>> +     * XXX: This fails on stuff like "two months monday" because two\r
+>> +     * months ago wasn't the same day as today. Postpone until we know\r
+>> +     * date?\r
+>> +     */\r
+>> +    if (is_field_set (state, TM_ABS_WDAY) &&\r
+>> +   !is_field_set (state, TM_ABS_MDAY)) {\r
+>> +   int wday =3D get_field (state, TM_ABS_WDAY);\r
+>> +   int today =3D tm_get_field (&now, TM_ABS_WDAY);\r
+>> +   int rel_days;\r
+>> +\r
+>> +   if (today > wday)\r
+>> +       rel_days =3D today - wday;\r
+>> +   else\r
+>> +       rel_days =3D today + 7 - wday;\r
+>> +\r
+>> +   /* This also prevents special week rounding from happening. */\r
+>> +   mod_field (state, TM_REL_DAY, rel_days);\r
+>> +\r
+>> +   unset_field (state, TM_ABS_WDAY);\r
+>> +    }\r
+>> +\r
+>> +    r =3D fixup_ampm (state);\r
+>> +    if (r)\r
+>> +   return r;\r
+>> +\r
+>> +    /*\r
+>> +     * Iterate fields from most accurate to least accurate, and set\r
+>> +     * unset fields according to requested rounding.\r
+>> +     */\r
+>> +    for (f =3D TM_ABS_SEC; f !=3D TM_NONE; f =3D next_abs_field (f)) {\r
+>> +   if (round !=3D PARSE_TIME_NO_ROUND) {\r
+>> +       enum field r =3D abs_to_rel_field (f);\r
+>> +\r
+>> +       if (is_field_set (state, f) || is_field_set (state, r)) {\r
+>> +           if (round >=3D PARSE_TIME_ROUND_UP && f !=3D TM_ABS_SEC) {\r
+>> +               mod_field (state, r, -1);\r
+>\r
+> Crazy.  This could use a comment.  It took me a while to figure out\r
+> why this was -1, though maybe that's just because it's late.\r
+\r
+Will do.\r
+\r
+/* You're not expected to understand this */ ;)\r
+\r
+>> +               if (round =3D=3D PARSE_TIME_ROUND_UP_INCLUSIVE)\r
+>> +                   mod_field (state, TM_REL_SEC, 1);\r
+>> +           }\r
+>> +           round =3D PARSE_TIME_NO_ROUND; /* No more rounding. */\r
+>> +       } else {\r
+>> +           if (f =3D=3D TM_ABS_MDAY &&\r
+>> +               is_field_set (state, TM_REL_WEEK)) {\r
+>> +               /* Week is most accurate. */\r
+>> +               week_round =3D round;\r
+>> +               round =3D PARSE_TIME_NO_ROUND;\r
+>> +           } else {\r
+>> +               set_field (state, f, field_epoch (f));\r
+>> +           }\r
+>> +       }\r
+>> +   }\r
+>> +\r
+>> +   if (!is_field_set (state, f))\r
+>> +       set_field (state, f, tm_get_field (&now, f));\r
+>> +    }\r
+>> +\r
+>> +    /* Special case: rounding with week accuracy. */\r
+>> +    if (week_round !=3D PARSE_TIME_NO_ROUND) {\r
+>> +   /* Temporarily set more accurate fields to now. */\r
+>> +   set_field (state, TM_ABS_SEC, tm_get_field (&now, TM_ABS_SEC));\r
+>> +   set_field (state, TM_ABS_MIN, tm_get_field (&now, TM_ABS_MIN));\r
+>> +   set_field (state, TM_ABS_HOUR, tm_get_field (&now, TM_ABS_HOUR));\r
+>> +   set_field (state, TM_ABS_MDAY, tm_get_field (&now, TM_ABS_MDAY));\r
+>> +    }\r
+>> +\r
+>> +    /*\r
+>> +     * Set all fields. They may contain out of range values before\r
+>> +     * normalization by mktime(3).\r
+>> +     */\r
+>> +    tm.tm_sec =3D get_field (state, TM_ABS_SEC) - get_field (state, TM_=\r
+REL_SEC);\r
+>> +    tm.tm_min =3D get_field (state, TM_ABS_MIN) - get_field (state, TM_=\r
+REL_MIN);\r
+>> +    tm.tm_hour =3D get_field (state, TM_ABS_HOUR) - get_field (state, T=\r
+M_REL_HOUR);\r
+>> +    tm.tm_mday =3D get_field (state, TM_ABS_MDAY) -\r
+>> +            get_field (state, TM_REL_DAY) - 7 * get_field (state, TM_REL_WEEK);\r
+>> +    tm.tm_mon =3D get_field (state, TM_ABS_MON) - get_field (state, TM_=\r
+REL_MON);\r
+>> +    tm.tm_mon--; /* 1- to 0-based */\r
+>> +    tm.tm_year =3D get_field (state, TM_ABS_YEAR) - get_field (state, T=\r
+M_REL_YEAR) - 1900;\r
+>> +\r
+>> +    /*\r
+>> +     * It's always normal time.\r
+>> +     *\r
+>> +     * XXX: This is probably not a solution that universally\r
+>> +     * works. Just make sure DST is not taken into account. We don't\r
+>> +     * want rounding to be affected by DST.\r
+>> +     */\r
+>> +    tm.tm_isdst =3D -1;\r
+>> +\r
+>> +    /* Special case: rounding with week accuracy. */\r
+>> +    if (week_round !=3D PARSE_TIME_NO_ROUND) {\r
+>> +   /* Normalize to get proper tm.wday. */\r
+>> +   r =3D normalize_tm (&tm);\r
+>> +   if (r < 0)\r
+>> +       return r;\r
+>> +\r
+>> +   /* Set more accurate fields back to zero. */\r
+>> +   tm.tm_sec =3D 0;\r
+>> +   tm.tm_min =3D 0;\r
+>> +   tm.tm_hour =3D 0;\r
+>> +   tm.tm_isdst =3D -1;\r
+>> +\r
+>> +   /* Monday is the true 1st day of week, but this is easier. */\r
+>> +   if (week_round >=3D PARSE_TIME_ROUND_UP) {\r
+>> +       tm.tm_mday +=3D 7 - tm.tm_wday;\r
+>> +       if (week_round =3D=3D PARSE_TIME_ROUND_UP_INCLUSIVE)\r
+>> +           tm.tm_sec--;\r
+>> +   } else {\r
+>> +       tm.tm_mday -=3D tm.tm_wday;\r
+>> +   }\r
+>> +    }\r
+>> +\r
+>> +    if (is_field_set (state, TM_TZ)) {\r
+>> +   /* tm is in specified TZ, convert to UTC for timegm(3). */\r
+>> +   tm.tm_min -=3D get_field (state, TM_TZ);\r
+>> +   t =3D timegm (&tm);\r
+>> +    } else {\r
+>> +   /* tm is in local time. */\r
+>> +   t =3D mktime (&tm);\r
+>> +    }\r
+>> +\r
+>> +    if (t =3D=3D (time_t) -1)\r
+>> +   return -PARSE_TIME_ERR_LIB;\r
+>> +\r
+>> +    *t_out =3D t;\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> +\r
+>> +/* Internally, all errors are < 0. parse_time_string() returns errors >=\r
+ 0. */\r
+>> +#define EXTERNAL_ERR(r) (-r)\r
+>> +\r
+>> +int\r
+>> +parse_time_string (const char *s, time_t *t, const time_t *ref, int rou=\r
+nd)\r
+>> +{\r
+>> +    struct state state =3D { .last_field =3D TM_NONE };\r
+>> +    int r;\r
+>> +\r
+>> +    if (!s || !t)\r
+>> +   return EXTERNAL_ERR (-PARSE_TIME_ERR);\r
+>> +\r
+>> +    r =3D parse_input (&state, s);\r
+>> +    if (r < 0)\r
+>> +   return EXTERNAL_ERR (r);\r
+>> +\r
+>> +    r =3D create_output (&state, t, ref, round);\r
+>> +    if (r < 0)\r
+>> +   return EXTERNAL_ERR (r);\r
+>> +\r
+>> +    return 0;\r
+>> +}\r
+>> diff --git a/parse-time-string/parse-time-string.h b/parse-time-string/p=\r
+arse-time-string.h\r
+>> new file mode 100644\r
+>> index 0000000..bfa4ee3\r
+>> --- /dev/null\r
+>> +++ b/parse-time-string/parse-time-string.h\r
+>> @@ -0,0 +1,102 @@\r
+>> +/*\r
+>> + * parse time string - user friendly date and time parser\r
+>> + * Copyright =C2=A9 2012 Jani Nikula\r
+>> + *\r
+>> + * This program is free software: you can redistribute it and/or modify\r
+>> + * it under the terms of the GNU General Public License as published by\r
+>> + * the Free Software Foundation, either version 2 of the License, or\r
+>> + * (at your option) any later version.\r
+>> + *\r
+>> + * This program is distributed in the hope that it will be useful,\r
+>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of\r
+>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the\r
+>> + * GNU General Public License for more details.\r
+>> + *\r
+>> + * You should have received a copy of the GNU General Public License\r
+>> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.\r
+>> + *\r
+>> + * Author: Jani Nikula <jani@nikula.org>\r
+>> + */\r
+>> +\r
+>> +#ifndef PARSE_TIME_STRING_H\r
+>> +#define PARSE_TIME_STRING_H\r
+>> +\r
+>> +#ifdef __cplusplus\r
+>> +extern "C" {\r
+>> +#endif\r
+>> +\r
+>> +#include <time.h>\r
+>> +\r
+>> +/* return values for parse_time_string() */\r
+>> +enum {\r
+>> +    PARSE_TIME_OK =3D 0,\r
+>> +    PARSE_TIME_ERR,                /* unspecified error */\r
+>> +    PARSE_TIME_ERR_LIB,            /* library call failed */\r
+>> +    PARSE_TIME_ERR_ALREADYSET,     /* attempt to set unit twice */\r
+>> +    PARSE_TIME_ERR_FORMAT, /* generic date/time format error */\r
+>> +    PARSE_TIME_ERR_DATEFORMAT,     /* date format error */\r
+>> +    PARSE_TIME_ERR_TIMEFORMAT,     /* time format error */\r
+>> +    PARSE_TIME_ERR_INVALIDDATE,    /* date value error */\r
+>> +    PARSE_TIME_ERR_INVALIDTIME,    /* time value error */\r
+>> +    PARSE_TIME_ERR_KEYWORD,        /* unknown keyword */\r
+>> +};\r
+>> +\r
+>> +/* round values for parse_time_string() */\r
+>> +enum {\r
+>> +    PARSE_TIME_ROUND_DOWN =3D -1,\r
+>> +    PARSE_TIME_NO_ROUND =3D 0,\r
+>> +    PARSE_TIME_ROUND_UP =3D 1,\r
+>> +    PARSE_TIME_ROUND_UP_INCLUSIVE =3D 2,\r
+>> +};\r
+>> +\r
+>> +/**\r
+>> + * parse_time_string() - user friendly date and time parser\r
+>> + * @s:             string to parse\r
+>> + * @t:             pointer to time_t to store parsed time in\r
+>> + * @ref:   pointer to time_t containing reference date/time, or NULL\r
+>> + * @round: PARSE_TIME_NO_ROUND, PARSE_TIME_ROUND_DOWN, or\r
+>> + *         PARSE_TIME_ROUND_UP\r
+>> + *\r
+>> + * Parse a date/time string 's' and store the parsed date/time result\r
+>> + * in 't'.\r
+>> + *\r
+>> + * A reference date/time is used for determining the "date/time units"\r
+>> + * (roughly equivalent to struct tm members) not specified by 's'. If\r
+>> + * 'ref' is non-NULL, it must contain a pointer to a time_t to be used\r
+>> + * as reference date/time. Otherwise, the current time is used.\r
+>> + *\r
+>> + * If 's' does not specify a full date/time, the 'round' parameter\r
+>> + * specifies if and how the result should be rounded as follows:\r
+>> + *\r
+>> + *   PARSE_TIME_NO_ROUND: All date/time units that are not specified\r
+>> + *   by 's' are set to the corresponding unit derived from the\r
+>> + *   reference date/time.\r
+>> + *\r
+>> + *   PARSE_TIME_ROUND_DOWN: All date/time units that are more accurate\r
+>> + *   than the most accurate unit specified by 's' are set to the\r
+>> + *   smallest valid value for that unit. Rest of the unspecified units\r
+>> + *   are set as in PARSE_TIME_NO_ROUND.\r
+>> + *\r
+>> + *   PARSE_TIME_ROUND_UP: All date/time units that are more accurate\r
+>> + *   than the most accurate unit specified by 's' are set to the\r
+>> + *   smallest valid value for that unit. The most accurate unit\r
+>> + *   specified by 's' is incremented by one (and this is rolled over\r
+>> + *   to the less accurate units as necessary), unless the most\r
+>> + *   accurate unit is seconds. Rest of the unspecified units are set\r
+>> + *   as in PARSE_TIME_NO_ROUND.\r
+>> + *\r
+>> + *   PARSE_TIME_ROUND_UP_INCLUSIVE: Same as PARSE_TIME_ROUND_UP, minus\r
+>> + *   one second, unless the most accurate unit specified by 's' is\r
+>> + *   seconds. This is useful for callers that require a value for\r
+>> + *   inclusive comparison of the result.\r
+>> + *\r
+>> + * Return 0 (PARSE_TIME_OK) for succesfully parsed date/time, or one\r
+>> + * of PARSE_TIME_ERR_* on error. 't' is not modified on error.\r
+>> + */\r
+>> +int parse_time_string (const char *s, time_t *t, const time_t *ref, int=\r
+ round);\r
+>> +\r
+>> +#ifdef __cplusplus\r
+>> +}\r
+>> +#endif\r
+>> +\r
+>> +#endif /* PARSE_TIME_STRING_H */\r
+>\r
+> Made it!\r
+\r
+Thanks for your very helpful and constructive review, as always!\r
+\r
+BR,\r
+Jani.\r
author	Jani Nikula <jani@nikula.org>
	Sun, 28 Oct 2012 22:30:58 +0000 (00:30 +0200)
committer	W. Trevor King <wking@tremily.us>
	Fri, 7 Nov 2014 17:50:08 +0000 (09:50 -0800)