Mark Levedahl [Sat, 11 Aug 2007 00:39:24 +0000 (20:39 -0400)]
builtin-bundle - use buffered reads for bundle header
This eliminates all use of byte-at-a-time reading of data in this
function: as Junio noted, a bundle file is seekable so we can
reset the file position to the first part of the pack-file using lseek
after reading the header.
Signed-off-by: Mark Levedahl <mdl123@verizon.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
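The approach described in this commit message (read the header in large chunks, then seek back to where the pack data starts) can be sketched as follows. This is an illustration in Python, not the C code from the patch; the function name `read_bundle_header` and the `chunk_size` parameter are hypothetical, and the sketch assumes the descriptor is positioned at offset 0 and that the header ends at the first blank line.

```python
import os

def read_bundle_header(fd, chunk_size=4096):
    """Read the bundle header with large reads, then rewind to the pack data.

    Assumes fd is a seekable descriptor positioned at offset 0 and that the
    header is terminated by the first empty line.
    """
    buf = b""
    while b"\n\n" not in buf:
        chunk = os.read(fd, chunk_size)
        if not chunk:
            raise ValueError("no blank line terminating the header")
        buf += chunk
    header_end = buf.index(b"\n\n") + 2
    # The chunked reads have consumed some pack data; because the file is
    # seekable, reset the position to the first byte after the header.
    os.lseek(fd, header_end, os.SEEK_SET)
    return buf[:header_end - 2].decode().split("\n")
```

After the call, the next read on the descriptor returns the first bytes of the pack, so a pack reader can take over without knowing how much was buffered.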
Mark Levedahl [Fri, 10 Aug 2007 22:29:49 +0000 (18:29 -0400)]
builtin-bundle.c - use stream buffered input for rev-list
git-bundle create on cygwin was nearly unusable due to 1 character
at a time (unbuffered) reading from an exec'ed process. Fix by using
fdopen to get a buffered stream.
Results for "time git bundle create test.bdl v1.0.3..v1.5.2" are:
before this patch:
        cygwin       linux
real    1m38.828s    0m3.578s
user    0m12.122s    0m2.896s
sys     1m28.215s    0m0.692s
after this patch:
real    0m3.688s     0m2.835s
user    0m3.075s     0m2.731s
sys     0m1.075s     0m0.149s
Signed-off-by: Mark Levedahl <mdl123@verizon.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
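To illustrate the difference this commit describes, here is a small Python sketch contrasting byte-at-a-time reads on a pipe with a buffered stream obtained from `os.fdopen` (Python's counterpart of the C `fdopen`). Both function names are hypothetical, and the sketch reads from any pipe descriptor rather than from an actual rev-list process.

```python
import os

def read_lines_bytewise(fd):
    """Unbuffered: one read() system call per byte, as before the patch."""
    lines, current = [], bytearray()
    while True:
        byte = os.read(fd, 1)
        if not byte:
            break
        if byte == b"\n":
            lines.append(bytes(current))
            current.clear()
        else:
            current += byte
    if current:
        lines.append(bytes(current))  # final line without a trailing newline
    os.close(fd)
    return lines

def read_lines_buffered(fd):
    """Buffered: wrap the descriptor in a stream that reads in large blocks."""
    with os.fdopen(fd, "rb") as stream:
        return [line.rstrip(b"\n") for line in stream]
```

Both functions return the same lines; the buffered version simply issues far fewer system calls, which is where the cygwin numbers above suggest the time was going.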
Junio C Hamano [Thu, 9 Aug 2007 05:04:06 +0000 (22:04 -0700)]
allow git-bundle to create bottomless bundle
Mark Levedahl <mlevedahl@gmail.com> writes:
> Junio C Hamano wrote:
>> While "git bundle" was a useful way to sneakernet incremental
>> changes, we did not allow:
>>
> Thanks - I've been thinking for months I could fix this bug, never
> figured it out and didn't want to nag Dscho one more time. I confirm
> that this allows creation of bundles with arbitrary refs, not just
> those under refs/heads. Yahoo!
the bundle records that it requires v2.6.20^0 commit (correct)
and gives you tag v2.6.22 (incorrect); the bug is that the
object it lists in fact is the commit v2.6.22^0, not the tag.
This is because the revision range operation .. is always about
a set of commits, but the code near where my patch touches does
not validate the sha1 value obtained from dwim_ref() against
the commit object name e->item->sha1 before placing the
head information in the commit.
Junio C Hamano [Thu, 9 Aug 2007 00:01:49 +0000 (17:01 -0700)]
allow git-bundle to create bottomless bundle
While "git bundle" was a useful way to sneakernet incremental
changes, we did not allow:
$ git bundle create v2.6.20.bndl v2.6.20
to create a bundle that contains the whole history to a
well-known good revision. Such a bundle can be mirrored
everywhere, and people can prime their repository with it to
reduce the load on the repository that serves near the tip of
the development.
Linus Torvalds [Fri, 10 Aug 2007 19:31:20 +0000 (12:31 -0700)]
Optimize the two-way merge of git-read-tree too
This trivially optimizes the two-way merge case of git-read-tree too,
which affects switching branches.
When you have tons and tons of files in your repository, but there are
only small differences in the branches (maybe just a couple of files
changed), the biggest cost of the branch switching was actually just the
index calculations.
This fixes it (timings for switching between the "testing" and "master"
branches in the 100,000 file testing-repo-from-hell, where the branches
only differ in one small file).
Before:
[torvalds@woody bummer]$ time git checkout master
real 0m9.919s
user 0m8.461s
sys 0m0.264s
After:
[torvalds@woody bummer]$ time git checkout testing
real 0m0.576s
user 0m0.348s
sys 0m0.228s
so it's easily an order of magnitude different.
This concludes the series. I think we could/should do the three-way merge
too (to speed up merges), but I'm lazy. Somebody else can do it.
The rule is very simple: you need to remove the old entry if:
- you want to remove the file entirely
- you replace it with a "merge conflict" entry (ie a non-stage-0 entry)
and you can avoid removing it if you either
- keep the old one
- or resolve it to a new one.
and these rules should all be valid for the three-way case too.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
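The removal rule stated in the commit message above can be written down as a small decision function. This is only a restatement of the four cases in Python; the `Outcome` enum and the function name are hypothetical and do not correspond to identifiers in unpack-trees.

```python
from enum import Enum

class Outcome(Enum):
    REMOVE_FILE = "remove the file entirely"
    CONFLICT = "replace it with a non-stage-0 (merge conflict) entry"
    KEEP_OLD = "keep the old entry"
    RESOLVE_NEW = "resolve it to a new entry"

def must_remove_old_entry(outcome):
    """True when the old index entry has to be removed first."""
    return outcome in (Outcome.REMOVE_FILE, Outcome.CONFLICT)
```

In the two remaining cases the old entry can stay where it is or be replaced in place, which avoids moving entries around in the index array.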
Linus Torvalds [Fri, 10 Aug 2007 19:21:20 +0000 (12:21 -0700)]
Optimize the common cases of git-read-tree
This optimizes bind_merge() and oneway_merge() to not unnecessarily
remove and re-add the old index entries when they can just get replaced
by updated ones.
This makes these operations much faster for large trees (where "large"
is in the 50,000+ file range), because we don't unnecessarily move index
entries around in the index array all the time.
Using the "bummer" tree (a test-tree with 100,000 files) we get:
Before:
[torvalds@woody bummer]$ time git commit -m"Change one file" 50/500
real 0m9.470s
user 0m8.729s
sys 0m0.476s
After:
[torvalds@woody bummer]$ time git commit -m"Change one file" 50/500
real 0m1.173s
user 0m0.720s
sys 0m0.452s
so for large trees this is easily very noticeable indeed.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Linus Torvalds [Fri, 10 Aug 2007 19:15:54 +0000 (12:15 -0700)]
Move old index entry removal from "unpack_trees()" into the individual functions
This makes no changes to current code, but it allows the individual merge
functions to decide what to do about the old entry. They might decide to
update it in place, rather than being forced to always delete and re-add it.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Linus Torvalds [Fri, 10 Aug 2007 16:51:58 +0000 (09:51 -0700)]
Fix "git commit directory/" performance anomaly
This trivial patch avoids re-hashing files that are already clean in the
index. This mirrors what commit 0781b8a9b2fe760fc4ed519a3a26e4b9bd6ccffe
did for "git add .", only for "git commit ." instead.
This improves the cold-cache case immensely, since we don't need to bring
in all the file contents, just the index and any files dirty in the index.
Before:
[torvalds@woody linux]$ time git commit .
real 1m49.537s
user 0m3.892s
sys 0m2.432s
After:
[torvalds@woody linux]$ time git commit .
real 0m14.273s
user 0m1.312s
sys 0m0.516s
(both after doing a "echo 3 > /proc/sys/vm/drop_caches" to get cold-cache
behaviour - even with the index optimization git still has to "lstat()"
all the files, so with a truly cold cache, bringing all the inodes in
will take some time).
[jc: trivial "return 0;" fixed]
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Thu, 9 Aug 2007 20:42:50 +0000 (13:42 -0700)]
Optimize "diff --cached" performance.
The read_tree() function is called only from the call chain to
run "git diff --cached" (this includes the internal call made by
git-runstatus to run_diff_index()). The function vacates stage #1
and reads the tree into that stage without any funky "merge"
magic. The caller then goes and compares stage #1 entries from
the tree with stage #0 entries from the original index.
When adding the cache entries this way, it used the general
purpose add_cache_entry(). This function looks for an existing
entry to replace or, if there is none, finds where to insert the
new entry, resolves D/F conflicts, and does all the other things.
For the purpose of reading entries into an empty stage, none of
that processing is needed. We can instead append everything and
then sort the result at the end.
This commit changes read_tree() to first make sure that there are
no existing cache entries at the specified stage, and if that is the
case, it runs add_cache_entry() with the ADD_CACHE_JUST_APPEND flag
(new), and then sorts the resulting cache using qsort().
This new flag tells add_cache_entry() to omit all the checks
such as "Does this path already exist? Does adding this path
remove other existing entries because it turns a directory to a
file?" and instead append the given cache entry straight at the
end of the active cache. The caller of course is expected to
sort the resulting cache at the end before using the result.
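The trade-off described above (find the right slot for every entry versus append everything and sort once) can be sketched in Python. This is a simplified illustration, not git's C code: entries are plain path strings, the real ordering also involves the stage, and both function names are hypothetical.

```python
import bisect

def add_entry_general(cache, name):
    """General-purpose add: look up the slot, then replace or insert."""
    pos = bisect.bisect_left(cache, name)
    if pos < len(cache) and cache[pos] == name:
        cache[pos] = name          # replace the existing entry
    else:
        cache.insert(pos, name)    # shifts every later entry by one slot

def read_into_empty_stage(names):
    """Append-only variant: no lookups while adding, one sort at the end."""
    cache = []
    for name in names:
        cache.append(name)
    cache.sort()
    return cache
```

The second form is only valid when the stage is known to be empty beforehand, which is why read_tree() checks for that before choosing it.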
Linus Torvalds [Fri, 10 Aug 2007 05:21:29 +0000 (22:21 -0700)]
Start moving unpack-trees to "struct tree_desc"
This doesn't actually change any real code, but it changes the interface
to unpack_trees() to take an array of "struct tree_desc" entries, the same
way the tree-walk.c functions do.
The reason for this is that we would be much better off if we can do the
tree-unpacking using the generic "traverse_trees()" functionality instead
of having the special "unpack" infrastructure.
This really is a pretty minimal diff, just to change the calling
convention. It passes all the tests, and looks sane. There were only two
users of "unpack_trees()": builtin-read-tree and merge-recursive, and I
tried to keep the changes minimal.
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reinstate the old behaviour when GIT_DIR is set and GIT_WORK_TREE is unset
The old behaviour was to unilaterally assume that the cwd is the work tree
when GIT_DIR was set but GIT_WORK_TREE wasn't, no matter whether we are inside
the GIT_DIR, or whether GIT_DIR is actually something like ../../../.git.
Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Fri, 10 Aug 2007 07:49:27 +0000 (00:49 -0700)]
Fix an illustration in git-rev-parse.txt
This hides the backslash at the end of line from AsciiDoc
toolchain by introducing a trailing whitespace on one line in an
illustration in git-rev-parse.txt.
send-email: get all the quoting of realnames right
- when sending several mails I got a slightly different behaviour for the first
mail compared to the second through last ones. The reason is that $from was
assigned in line 608 and was not reset when beginning to handle the next
mail.
- Email::Valid can only handle properly quoted real names, so quote arguments
to extract_valid_address.
This patch cleans up variable naming to better differentiate between the sender
of the mail and its author.
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
send-email: rfc822 forbids using <address@domain> without a non-empty "phrase"
Email::Valid does respect this, considering such a mailbox specification
invalid. b06c6bc831cbb9e9eb82fd3ffd5a2b674cd940d0 addressed the issue, but
only if Email::Valid is available.
Signed-off-by: Uwe Kleine-König <ukleinek@informatik.uni-freiburg.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shawn O. Pearce [Thu, 9 Aug 2007 06:38:16 +0000 (02:38 -0400)]
Use the empty tree for base diff in paranoid-update on new branches
We have to load a tree difference for the purpose of testing
file patterns. But if our branch is being created and there is no
specific base to difference against in the rule, our base will be
'0'x40. This is (usually) not a valid tree-ish object in a Git
repository, so there's nothing to difference against.
Instead of creating the empty tree and running git-diff against
that, we just take the output of `ls-tree -r --name-only` and mark
every returned pathname as an add.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
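The shortcut described in this commit message, treating every path listed by `ls-tree -r --name-only` as an addition, might look roughly like this in Python. The hook itself is not written in Python, and the names `NULL_BASE` and `adds_from_ls_tree` are hypothetical; the sketch assumes the command output is already available as a string with one path per line.

```python
NULL_BASE = "0" * 40  # the '0'x40 base seen when a branch is being created

def adds_from_ls_tree(ls_tree_output):
    """Map every listed pathname to 'A' (added relative to an empty tree)."""
    return {path: "A" for path in ls_tree_output.splitlines() if path}
```

For example, an output of "Makefile" and "src/main.c" on separate lines yields both paths marked "A", the same answer a diff against the empty tree would give.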
Shawn O. Pearce [Thu, 9 Aug 2007 06:38:12 +0000 (02:38 -0400)]
Teach the update-paranoid to look at file differences
In some applications of the update hook a user may be allowed to
modify a branch, but only if the file level difference is also an
allowed change. This is the commonly requested feature of allowing
users to modify only certain files.
A new repository.*.allow syntax permits granting the three basic
file level operations:
A: file is added relative to the other tree
M: file exists in both trees, but its SHA-1 or mode differs
D: file is removed relative to the other tree
on a per-branch and path-name basis. The user must also have a
branch level allow line already granting them access to create,
rewind or update (CRU) that branch before the hook will consult
any file level rules.
In order for a branch change to succeed _all_ files that differ
relative to some base (by default the old value of this branch,
but it can also be any valid tree-ish) must be allowed by file
level allow rules. A push is rejected if any diff exists that
is not covered by at least one allow rule.
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
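The acceptance condition described above (every differing file must be covered by at least one file level allow rule) can be sketched as follows. The rule representation used here, a list of (operations, regular expression) pairs, is an assumption made for illustration and is not the hook's actual repository.*.allow syntax; `push_allowed` is a hypothetical name.

```python
import re

def push_allowed(diff, rules):
    """diff maps path -> 'A', 'M' or 'D'; rules is a list of (ops, pattern).

    The push is rejected as soon as one differing path is not covered by
    any rule that permits its operation.
    """
    for path, op in diff.items():
        covered = any(op in ops and re.search(pattern, path)
                      for ops, pattern in rules)
        if not covered:
            return False
    return True
```

With rules `[("AM", r"^docs/"), ("D", r"^tmp/")]`, adding `docs/a.txt` passes while deleting it is rejected, since no rule grants D under `docs/`.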
Shawn O. Pearce [Thu, 9 Aug 2007 06:38:09 +0000 (02:38 -0400)]
Teach update-paranoid how to store ACLs organized by groups
In some applications of this paranoid update hook the set of ACL
rules that need to be applied to a user can be large, and the
number of users that those rules must also be applied to can be
more than a handful of individuals. Rather than repeating the same
rules multiple times (once for each user) we now allow users to be
members of groups, where the group supplies the list of ACL rules.
For various reasons we don't depend on the underlying OS groups
and instead perform our own group handling.
Users can be made a member of one or more groups by setting the
user.memberOf property within the "users/$who.acl" file:
This will cause the hook to also parse the "groups/$groupname.acl"
file for each value of user.memberOf, and merge any allow rules
that match the current repository with the user's own private rules
(if they had any).
Since some rules are basically the same but may have a component
differ based on the individual user, any user.* key may be inserted
into a rule using the "${user.foo}" syntax. The allow rule does
not match if the user does not define one (and exactly one) value
for the key "foo".
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
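The "${user.foo}" substitution described above, including the requirement that the user define exactly one value for the key, can be sketched in Python. The function name `expand_rule`, the dictionary layout of the user properties, and the sample rule text are all hypothetical; the real hook may match keys and parse rules differently.

```python
import re

_USER_VAR = re.compile(r"\$\{user\.([A-Za-z0-9]+)\}")

def expand_rule(rule, user_props):
    """Expand ${user.KEY} references in a rule.

    user_props maps a key to the list of values the user defined for it.
    Returns the expanded rule, or None when any referenced key does not
    have exactly one value (such a rule does not match).
    """
    ok = True

    def substitute(match):
        nonlocal ok
        values = user_props.get(match.group(1), [])
        if len(values) != 1:
            ok = False
            return ""
        return values[0]

    expanded = _USER_VAR.sub(substitute, rule)
    return expanded if ok else None
```

A shared group rule such as "refs/heads/${user.name}/" then expands to a per-user pattern, and is skipped for users with zero or several values for the key.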
Brian Downing [Thu, 9 Aug 2007 04:26:10 +0000 (23:26 -0500)]
cvsserver: Fix for work trees
git-cvsserver used checkout-index internally for commit and annotate.
Since a work tree is required for this to function now, this was
breaking. Work around this by defining GIT_WORK_TREE=. in the
appropriate places.
Signed-off-by: Brian Downing <bdowning@lavos.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simon Hausmann [Wed, 8 Aug 2007 15:06:55 +0000 (17:06 +0200)]
git-p4: Fix git-p4 submit to include only changed files in the perforce submit template.
Parse the files section in the "p4 change -o" output and remove lines with file changes in unrelated depot paths.
Signed-off-by: Simon Hausmann <simon@lst.de>
Signed-off-by: Marius Storm-Olsen <marius@trolltech.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Junio C Hamano [Wed, 8 Aug 2007 20:41:46 +0000 (13:41 -0700)]
Reorder the list of commands in the manual.
The basic idea was proposed by Steve Hoelzer; in order to make
the list easier to search, we keep the command list, in the
script that generates it, in "sort -d" order.
Simon Hausmann [Tue, 7 Aug 2007 10:28:00 +0000 (12:28 +0200)]
git-p4: Fix support for symlinks.
Detect symlinks as file type, set the git file mode accordingly and strip off the trailing newline in the p4 print output.
Make the mode handling a bit more readable at the same time.
Signed-off-by: Simon Hausmann <simon@lst.de>
Acked-by: Brian Swetland <swetland@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>