git-rev-list --bisect: optimization
This improves the performance of revision bisection.
The idea is to avoid rather expensive count_distance() function,
which counts the number of commits that are reachable from any
given commit (including itself) in the set. When a commit has
only one relevant parent commit, the number of commits the
commit can reach is exactly the number of commits that the
parent can reach plus one; instead of running count_distance()
on commits that are on straight single strand of pearls, we can
just add one to the parents' count.
On the other hand, for a merge commit, because the commits
reachable from one parent can be reachable from another parent,
you cannot just add the parents' counts up plus one for the
commit itself; that would overcount ancestors that are reachable
from more than one parents.
The algorithm used in the patch runs count_distance() on merge
commits, and uses the util field of commit objects to remember
them. After that, the number of commits reachable from each of
the remaining commits is counted by finding a commit whose count
is not yet known but the count for its (sole) parent is known,
and adding one to the parent's count, until we assign numbers to
everybody.
Another small optimization is whenever we find a half-way commit
(that is, a commit that can reach exactly half of the commits),
we stop giving counts to remaining commits, as we will not find
any better commit than we just found.
The performance to bisect between v1.0.0 and v1.5.0 in git.git
repository was improved by saying good and bad in turns from
3.68 seconds down to 1.26 seconds. Bisecting the kernel between
v2.6.18 and v2.6.20 was sped up from 21.84 seconds down to 4.22
seconds.
Signed-off-by: Junio C Hamano <junkio@cox.net>