Return-Path: X-Original-To: notmuch@notmuchmail.org Delivered-To: notmuch@notmuchmail.org Received: from localhost (localhost [127.0.0.1]) by olra.theworths.org (Postfix) with ESMTP id 83C6A431FBD for ; Mon, 10 Mar 2014 19:07:08 -0700 (PDT) X-Virus-Scanned: Debian amavisd-new at olra.theworths.org X-Spam-Flag: NO X-Spam-Score: -0.7 X-Spam-Level: X-Spam-Status: No, score=-0.7 tagged_above=-999 required=5 tests=[RCVD_IN_DNSWL_LOW=-0.7] autolearn=disabled Received: from olra.theworths.org ([127.0.0.1]) by localhost (olra.theworths.org [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id cuG6QWbMUVIK for ; Mon, 10 Mar 2014 19:07:04 -0700 (PDT) Received: from dmz-mailsec-scanner-4.mit.edu (dmz-mailsec-scanner-4.mit.edu [18.9.25.15]) (using TLSv1 with cipher DHE-RSA-AES256-SHA (256/256 bits)) (No client certificate requested) by olra.theworths.org (Postfix) with ESMTPS id 25D0B431FBC for ; Mon, 10 Mar 2014 19:07:04 -0700 (PDT) X-AuditID: 1209190f-f790b6d000000c3a-93-531e6fc74afc Received: from mailhub-auth-3.mit.edu ( [18.9.21.43]) (using TLS with cipher AES256-SHA (256/256 bits)) (Client did not present a certificate) by dmz-mailsec-scanner-4.mit.edu (Symantec Messaging Gateway) with SMTP id F2.BA.03130.7CF6E135; Mon, 10 Mar 2014 22:07:03 -0400 (EDT) Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) by mailhub-auth-3.mit.edu (8.13.8/8.9.2) with ESMTP id s2B271aO002954; Mon, 10 Mar 2014 22:07:02 -0400 Received: from awakening.csail.mit.edu (awakening.csail.mit.edu [18.26.4.91]) (authenticated bits=0) (User authenticated as amdragon@ATHENA.MIT.EDU) by outgoing.mit.edu (8.13.8/8.12.4) with ESMTP id s2B26xxj008042 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES128-SHA bits=128 verify=NOT); Mon, 10 Mar 2014 22:07:00 -0400 Received: from amthrax by awakening.csail.mit.edu with local (Exim 4.80) (envelope-from ) id 1WNC5r-0008IW-Gs; Mon, 10 Mar 2014 22:06:59 -0400 From: Austin Clements To: Mark Walters , notmuch@notmuchmail.org Subject: Re: [PATCH v2 1/7] Make keys of notmuch-tag-formats regexps and use caching In-Reply-To: <1392841212-8494-2-git-send-email-markwalters1009@gmail.com> References: <1392841212-8494-1-git-send-email-markwalters1009@gmail.com> <1392841212-8494-2-git-send-email-markwalters1009@gmail.com> User-Agent: Notmuch/0.17~rc2+14~g06f47e0 (http://notmuchmail.org) Emacs/23.4.1 (i486-pc-linux-gnu) Date: Mon, 10 Mar 2014 22:06:59 -0400 Message-ID: <87eh2932v0.fsf@awakening.csail.mit.edu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii X-Brightmail-Tracker: H4sIAAAAAAAAA+NgFnrEIsWRmVeSWpSXmKPExsUixCmqrXs8Xy7YYPt8NovVc3ksrt+cyezA 5LFz1l12j2erbjEHMEVx2aSk5mSWpRbp2yVwZdx++out4KVXxdFLm9gbGE9bdTFyckgImEgs OTKdCcIWk7hwbz1bFyMXh5DAbCaJHyc/skA4Gxklzs08AZU5zSQx5ck3ZghnCaPEzeXP2UH6 2QQ0JLbtX84IYosIuEo8/faZGcQWFgiReHZ5DVgNp4CnxNmpfYwQza2MEheWXmEBSYgKJEmc nvqbFcRmEVCV+HhhLlicF+jAXZP2M0HYghInZz4BizMLaEnc+PeSaQKjwCwkqVlIUgsYmVYx yqbkVunmJmbmFKcm6xYnJ+blpRbpmujlZpbopaaUbmIEB6Uk/w7GbweVDjEKcDAq8fAG+MoF C7EmlhVX5h5ilORgUhLlXZcJFOJLyk+pzEgszogvKs1JLT7EKMHBrCTCe9oCKMebklhZlVqU D5OS5mBREuftOysRLCSQnliSmp2aWpBaBJOV4eBQkuB9kQfUKFiUmp5akZaZU4KQZuLgBBnO AzScA6SGt7ggMbc4Mx0if4pRUUqcVx4kIQCSyCjNg+uFJY1XjOJArwjzXgCp4gEmHLjuV0CD mYAGNx+XAhlckoiQkmpgjHl77Sdj94TfAcoLVsfXGrb5lQsV/5sb3KEqci3fK27z1PW/lilE tGyvUTbwrD9x9r9Vj+qW9/s1XXZMTLubefynw944wydhy9XyDoodKZRl1LBQPcsULfmBVXxO iPRil/Tf9Z/7dgv73H47+Z983QTvKR7ROUW/F56Szzn/JvrlozuWnx6dVmIpzkg01GIuKk4E ADs20GL1AgAA X-BeenThere: notmuch@notmuchmail.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: "Use and development of the notmuch mail system." List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 11 Mar 2014 02:07:08 -0000 On Wed, 19 Feb 2014, Mark Walters wrote: > From: Austin Clements > > This patch switches notmuch-tag-formats to use regexps with caching > for performance. > > We have to clear the cache somehow on changes to notmuch-tag-formats. > This version takes the simplest approach: search/show/tree all clear > the cache whenever they start loading. > > We cannot use assoc-default since there's no way to distinguish a > missing key from a present key with a null cdr: thus, we use assoc* > from cl instead. > > Performance-wise, the caching of regexp lookup makes this at least as > fast as the previous code using assoc (see > id:1392226351-31440-1-git-send-email-amdragon@mit.edu for timing > details). How about this for a commit message? - 8< - This modifies `notmuch-tag-format-tag' to treat the keys of `notmuch-tag-formats' as (anchored) regexps, rather than literal strings. This is clearly more flexible, as it allows for prefix matching, defining a fallback format, etc. This may cause compatibility problems if people have customized `notmuch-tag-formats' to match tags that contain regexp specials, but this seems unlikely. Regular expression matching has quite a performance hit over string lookup, so this also introduces a simple cache from exact tags to formatted strings. The number of unique tags is likely to be quite small, so this cache should have a high hit rate. In addition to eliminating the regexp lookup in the common case, this cache stores fully formatted tags, eliminating the repeated evaluation of potentially expensive, user-specified formatting code. This makes regexp lookup at least as fast as assoc for unformatted tags (e.g., inbox) and *faster* than the current code for formatted tags (e.g., unread): inbox (usec) unread (usec) assoc: 0.4 2.8 regexp: 3.2 7.2 regexp+caching: 0.4 0.4 (Though even at 7.2 usec, tag formatting is not our top bottleneck.) This cache must be explicitly cleared to keep it coherent, so this adds the appropriate clearing calls. - >8 - > --- > emacs/notmuch-show.el | 1 + > emacs/notmuch-tag.el | 70 +++++++++++++++++++++++++++++++++--------------- > emacs/notmuch-tree.el | 1 + > emacs/notmuch.el | 1 + > 4 files changed, 51 insertions(+), 22 deletions(-) > > diff --git a/emacs/notmuch-show.el b/emacs/notmuch-show.el > index 1ac80ca..4bddf6c 100644 > --- a/emacs/notmuch-show.el > +++ b/emacs/notmuch-show.el > @@ -1145,6 +1145,7 @@ function is used." > ;; Don't track undo information for this buffer > (set 'buffer-undo-list t) > > + (notmuch-tag-clear-cache) > (erase-buffer) > (goto-char (point-min)) > (save-excursion > diff --git a/emacs/notmuch-tag.el b/emacs/notmuch-tag.el > index 908e7ad..47e0205 100644 > --- a/emacs/notmuch-tag.el > +++ b/emacs/notmuch-tag.el > @@ -28,23 +28,39 @@ > (require 'crm) > (require 'notmuch-lib) > > +;; (notmuch-tag-clear-cache will be called by the defcustom > +;; notmuch-tag-formats, so it has to be defined first.) This is no longer true, so this comment can be removed and the following two defs can be moved to a more natural location further down in the file. > + > +(defvar notmuch-tag--format-cache (make-hash-table :test 'equal) > + "Cache of tag format lookup. Internal to `notmuch-tag-format-tag'.") > + > +(defun notmuch-tag-clear-cache () > + "Clear the internal cache of tag formats. > + > +This must be called after changes to `notmuch-tag-formats'." Likewise, this sentence is out of date. I would just omit it and add the doc I suggest below. > + (clrhash notmuch-tag--format-cache)) > + > (defcustom notmuch-tag-formats > '(("unread" (propertize tag 'face '(:foreground "red"))) > ("flagged" (propertize tag 'face '(:foreground "blue")) > (notmuch-tag-format-image-data tag (notmuch-tag-star-icon)))) > "Custom formats for individual tags. > > -This gives a list that maps from tag names to lists of formatting > -expressions. The car of each element gives a tag name and the > -cdr gives a list of Elisp expressions that modify the tag. If > -the list is empty, the tag will simply be hidden. Otherwise, > -each expression will be evaluated in order: for the first > -expression, the variable `tag' will be bound to the tag name; for > -each later expression, the variable `tag' will be bound to the > -result of the previous expression. In this way, each expression > -can build on the formatting performed by the previous expression. > -The result of the last expression will displayed in place of the > -tag. > +This is an association list that maps from tag name regexps to > +lists of formatting expressions. The first entry whose car > +regexp-matches a tag will be used to format that tag. The regexp > +is implicitly anchored, so to match a literal tag name, just use > +that tag name (if it contains special regexp characters like > +\".\" or \"*\", these have to be escaped). The cdr of the > +matching entry gives a list of Elisp expressions that modify the > +tag. If the list is empty, the tag will simply be hidden. > +Otherwise, each expression will be evaluated in order: for the > +first expression, the variable `tag' will be bound to the tag > +name; for each later expression, the variable `tag' will be bound > +to the result of the previous expression. In this way, each > +expression can build on the formatting performed by the previous > +expression. The result of the last expression will displayed in > +place of the tag. > > For example, to replace a tag with another string, simply use > that string as a formatting expression. To change the foreground > @@ -56,7 +72,7 @@ with images." > > :group 'notmuch-search > :group 'notmuch-show > - :type '(alist :key-type (string :tag "Tag") > + :type '(alist :key-type (regexp :tag "Tag") > :extra-offset -3 > :value-type > (radio :format "%v" > @@ -137,16 +153,26 @@ This can be used with `notmuch-tag-format-image-data'." > > (defun notmuch-tag-format-tag (tag) > "Format TAG by looking into `notmuch-tag-formats'." This would be a great place to mention that modes need to call `notmuch-tag-clear-cache' if they intent to use formatted tags. "Format TAG according to `notmuch-tag-formats'. Callers must ensure that the tag format cache has been recently cleared via `notmuch-tag-clear-cache' before using this function. For example, it would be appropriate to clear the cache just prior to filling a buffer that uses formatted tags." > - (let ((formats (assoc tag notmuch-tag-formats))) > - (cond > - ((null formats) ;; - Tag not in `notmuch-tag-formats', > - tag) ;; the format is the tag itself. > - ((null (cdr formats)) ;; - Tag was deliberately hidden, > - nil) ;; no format must be returned > - (t ;; - Tag was found and has formats, > - (let ((tag tag)) ;; we must apply all the formats. > - (dolist (format (cdr formats) tag) > - (setq tag (eval format)))))))) > + (let ((formatted (gethash tag notmuch-tag--format-cache 'missing))) > + (when (eq formatted 'missing) > + (let* ((formats > + (save-match-data This is a better place for the comment about assoc* that's currently in the commit message. ;; Don't use assoc-default since there's no way to distinguish a missing ;; key from a present key with a null cdr:. > + (assoc* tag notmuch-tag-formats > + :test (lambda (tag key) > + (and (eq (string-match key tag) 0) > + (= (match-end 0) (length tag)))))))) > + (setq formatted > + (cond > + ((null formats) ;; - Tag not in `notmuch-tag-formats', > + tag) ;; the format is the tag itself. > + ((null (cdr formats)) ;; - Tag was deliberately hidden, > + nil) ;; no format must be returned > + (t ;; - Tag was found and has formats, > + (let ((tag tag)) ;; we must apply all the formats. > + (dolist (format (cdr formats) tag) > + (setq tag (eval format))))))) > + (puthash tag formatted notmuch-tag--format-cache))) > + formatted)) > > (defun notmuch-tag-format-tags (tags &optional face) > "Return a string representing formatted TAGS." > diff --git a/emacs/notmuch-tree.el b/emacs/notmuch-tree.el > index 4f2ac02..a106e09 100644 > --- a/emacs/notmuch-tree.el > +++ b/emacs/notmuch-tree.el > @@ -881,6 +881,7 @@ the same as for the function notmuch-tree." > (message-arg "--entire-thread")) > (if (equal (car (process-lines notmuch-command "count" search-args)) "0") > (setq search-args basic-query)) > + (notmuch-tag-clear-cache) > (let ((proc (notmuch-start-notmuch > "notmuch-tree" (current-buffer) #'notmuch-tree-process-sentinel > "show" "--body=false" "--format=sexp" > diff --git a/emacs/notmuch.el b/emacs/notmuch.el > index 0471750..0c767f7 100644 > --- a/emacs/notmuch.el > +++ b/emacs/notmuch.el > @@ -888,6 +888,7 @@ the configured default sort order." > (set 'notmuch-search-oldest-first oldest-first) > (set 'notmuch-search-target-thread target-thread) > (set 'notmuch-search-target-line target-line) > + (notmuch-tag-clear-cache) > (let ((proc (get-buffer-process (current-buffer))) > (inhibit-read-only t)) > (if proc > -- > 1.7.9.1