From: Joey Hess Date: Tue, 16 Nov 2010 17:19:13 +0000 (-0400) Subject: thoughts X-Git-Url: http://git.tremily.us/?a=commitdiff_plain;h=9a7387a53be2b1e182003f3e86cb76d7f10c4b67;p=ikiwiki.git thoughts --- diff --git a/doc/todo/Improving_the_efficiency_of_match__95__glob.mdwn b/doc/todo/Improving_the_efficiency_of_match__95__glob.mdwn index 43571ead7..0fc059ad7 100644 --- a/doc/todo/Improving_the_efficiency_of_match__95__glob.mdwn +++ b/doc/todo/Improving_the_efficiency_of_match__95__glob.mdwn @@ -22,6 +22,23 @@ Here's my patch - please consider it! -- [[KathrynAndersen]] >>>>> I think it's because my patch focuses on match_glob while the memoize patch focuses on `glob2re`, and `glob2re` is called in `filecheck`, `meta` and `po` as well as in `match_glob` and `match_user`; thus the memoized `glob2re` is dealing with a bigger set of globs to look up, and thus could be just that little bit slower. -- [[KathrynAndersen]] +>>>>>> What may be going on is that glob2re is already a fairly fast +>>>>>> function, so the overhead of memoizing it with the very generic +>>>>>> `_memoizer` (see its source) swamps the memoization gain. Note +>>>>>> that the few functions memoized with the Memoizer before were much +>>>>>> more expensive, so that little overhead was acceptable then. +>>>>>> +>>>>>> It also may be that Kathryn's patch is slightly faster due to using +>>>>>> the construct `$foo =~ $regexp` rather than `$foo =~ /$regexp/` +>>>>>> (probably avoids a copy or something like that internally) -- +>>>>>> this despite checking both `exists` and `defined` on the hash, which +>>>>>> should be reundant AFAICS. +>>>>>> +>>>>>> My guess is that the best of both worlds would be to move +>>>>>> the byhand memoization to glob2re and have it return a compiled +>>>>>> `/^/i` regexp that can be used without further modifiction in most +>>>>>> cases. --[[Joey]] + -------------------------------------------------------------- Benchmarks done with Devel::Profile on the same testbed IkiWiki setup. I'm just showing the start of the profile output, since that's what's relevant.