author | Nguyễn Thái Ngọc Duy <pclouds@gmail.com> | 2013-05-26 08:16:17 +0700 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2013-05-28 08:07:20 -0700 |
commit | c6807a40dcd29f7e5ad1e2f4fc44f1729c9afa11 | |
tree | 756db497bfc2308440f6e746dac6138e1bd32501 /connected.c | |
parent | 920734b069b269937b25a692d21c1623cbaec4b0 | |
clone: open a shortcut for connectivity check

In order to make sure the cloned repository is good, we run "rev-list
--objects --not --all $new_refs" on the repository. This is expensive
on large repositories. This patch attempts to mitigate the impact in
this special case.

In the "good" clone case, we only have one pack. If all of the
following conditions are met, we can be sure that all objects reachable
from the new refs exist, which is the intention of running "rev-list ...":

 - all refs point to an object in the pack
 - there are no dangling pointers in any object in the pack
 - no objects in the pack point to objects outside the pack

The second and third checks can be done with the help of index-pack as
a slight variation of the --strict check (which introduces a new
condition for the shortcut: pack transfer must be used and the number
of objects must be large enough for index-pack to be called). The first
is checked in check_everything_connected after we get an "ok" from
index-pack.

"index-pack + new checks" is still faster than the current "index-pack
+ rev-list", which is the whole point of this patch. If any of the
conditions fail, we fall back to the good old but expensive "rev-list
...". In that case it's even more expensive because we have to pay for
the new checks in index-pack. But that should only happen when the
other side is either buggy or malicious.
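
The three conditions above can be illustrated with a toy model. This is only a sketch under stated assumptions: the `toy_*` names, the integer object ids and the fixed-size link array are all hypothetical and are not Git's data structures. It simply shows why "refs are in the pack" plus "every pointer resolves inside the pack" implies that everything reachable from the refs exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_LINKS 4

/* Hypothetical toy object: an id plus the ids of the objects it points to. */
struct toy_object {
	int id;
	int links[TOY_MAX_LINKS];
	size_t nr_links;
};

/* Linear lookup; a real pack would consult its index instead. */
static bool toy_in_pack(const struct toy_object *pack, size_t nr, int id)
{
	for (size_t i = 0; i < nr; i++)
		if (pack[i].id == id)
			return true;
	return false;
}

/* Conditions 2 and 3: every pointer resolves to an object inside the pack. */
static bool toy_pack_is_closed(const struct toy_object *pack, size_t nr)
{
	for (size_t i = 0; i < nr; i++) {
		assert(pack[i].nr_links <= TOY_MAX_LINKS);
		for (size_t j = 0; j < pack[i].nr_links; j++)
			if (!toy_in_pack(pack, nr, pack[i].links[j]))
				return false;
	}
	return true;
}

/* Condition 1: every new ref points at an object in the pack. */
static bool toy_refs_in_pack(const struct toy_object *pack, size_t nr,
			     const int *refs, size_t nr_refs)
{
	for (size_t i = 0; i < nr_refs; i++)
		if (!toy_in_pack(pack, nr, refs[i]))
			return false;
	return true;
}

/* True when the shortcut applies; otherwise fall back to the full walk. */
static bool toy_can_skip_rev_list(const struct toy_object *pack, size_t nr,
				  const int *refs, size_t nr_refs)
{
	return toy_pack_is_closed(pack, nr) &&
	       toy_refs_in_pack(pack, nr, refs, nr_refs);
}
```

Note that the toy model makes an all-or-nothing decision. In the diff to connected.c, the ref check is applied per ref: refs found in the new pack are skipped, and any remaining refs are still fed to rev-list.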

Cloning linux-2.6 over file://

        before       after
real    3m25.693s    2m53.050s
user    5m2.037s     4m42.396s
sys     0m13.750s    0m16.574s

A more realistic test with ssh:// over wireless

        before       after
real    11m26.629s   10m4.213s
user    5m43.196s    5m19.444s
sys     0m35.812s    0m37.630s
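
To put the "real" rows in relative terms, the timings can be converted to seconds and compared. The helpers below are hypothetical (they are not part of Git) and assume the "XmY.ZZZs" format printed by time(1) as shown above. By this arithmetic the wall-clock saving is roughly 16% for file:// and roughly 12% for ssh://.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: parse a time(1)-style "XmY.ZZZs" value into
 * seconds. Returns -1.0 if the string does not match that shape.
 */
static double parse_time_seconds(const char *s)
{
	int minutes;
	double seconds;

	assert(s != NULL);
	if (sscanf(s, "%dm%lfs", &minutes, &seconds) != 2)
		return -1.0;
	return minutes * 60.0 + seconds;
}

/*
 * Hypothetical helper: percentage of wall-clock time saved going from
 * "before" to "after". Returns -1.0 if either value cannot be parsed.
 */
static double percent_saved(const char *before, const char *after)
{
	double b = parse_time_seconds(before);
	double a = parse_time_seconds(after);

	if (b <= 0.0 || a < 0.0)
		return -1.0;
	return (b - a) / b * 100.0;
}
```

For example, `percent_saved("3m25.693s", "2m53.050s")` compares 205.693 seconds against 173.050 seconds.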

This shortcut is not applied to shallow clones, partly because shallow
clones should have no more objects than a usual fetch and the cost of
rev-list is acceptable, partly to avoid dealing with corner cases when
grafting is involved.

This shortcut does not apply to the unpack-objects code path either,
because the number of objects must be small in order to trigger that
code path.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'connected.c')
-rw-r--r-- | connected.c | 34 |
1 files changed, 33 insertions, 1 deletions
diff --git a/connected.c b/connected.c
index 1e89c1cd1d..fae8d64c12 100644
--- a/connected.c
+++ b/connected.c
@@ -2,7 +2,12 @@
 #include "run-command.h"
 #include "sigchain.h"
 #include "connected.h"
+#include "transport.h"
 
+int check_everything_connected(sha1_iterate_fn fn, int quiet, void *cb_data)
+{
+	return check_everything_connected_with_transport(fn, quiet, cb_data, NULL);
+}
 /*
  * If we feed all the commits we want to verify to this command
  *
@@ -14,7 +19,10 @@
  *
  * Returns 0 if everything is connected, non-zero otherwise.
  */
-int check_everything_connected(sha1_iterate_fn fn, int quiet, void *cb_data)
+int check_everything_connected_with_transport(sha1_iterate_fn fn,
+					      int quiet,
+					      void *cb_data,
+					      struct transport *transport)
 {
 	struct child_process rev_list;
 	const char *argv[] = {"rev-list", "--objects",
@@ -22,10 +30,23 @@ int check_everything_connected(sha1_iterate_fn fn, int quiet, void *cb_data)
 	char commit[41];
 	unsigned char sha1[20];
 	int err = 0;
+	struct packed_git *new_pack = NULL;
 
 	if (fn(cb_data, sha1))
 		return err;
 
+	if (transport && transport->smart_options &&
+	    transport->smart_options->self_contained_and_connected &&
+	    transport->pack_lockfile &&
+	    !suffixcmp(transport->pack_lockfile, ".keep")) {
+		struct strbuf idx_file = STRBUF_INIT;
+		strbuf_addstr(&idx_file, transport->pack_lockfile);
+		strbuf_setlen(&idx_file, idx_file.len - 5); /* ".keep" */
+		strbuf_addstr(&idx_file, ".idx");
+		new_pack = add_packed_git(idx_file.buf, idx_file.len, 1);
+		strbuf_release(&idx_file);
+	}
+
 	if (quiet)
 		argv[5] = "--quiet";
 
@@ -42,6 +63,17 @@ int check_everything_connected(sha1_iterate_fn fn, int quiet, void *cb_data)
 
 	commit[40] = '\n';
 	do {
+		/*
+		 * If index-pack already checked that:
+		 * - there are no dangling pointers in the new pack
+		 * - the pack is self contained
+		 * Then if the updated ref is in the new pack, then we
+		 * are sure the ref is good and not sending it to
+		 * rev-list for verification.
+		 */
+		if (new_pack && find_pack_entry_one(sha1, new_pack))
+			continue;
+
 		memcpy(commit, sha1_to_hex(sha1), 40);
 		if (write_in_full(rev_list.in, commit, 41) < 0) {
 			if (errno != EPIPE && errno != EINVAL)
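
The third hunk derives the pack index path by trimming the five-character ".keep" suffix from the lockfile path and appending ".idx". As a minimal standalone sketch of that string manipulation, using only the C standard library rather than Git's strbuf API, one might write the following. The function name `idx_path_from_keep` is hypothetical and does not exist in Git.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): given a path ending in ".keep",
 * return a newly allocated copy with that suffix replaced by ".idx".
 * Returns NULL if the suffix is absent or allocation fails.
 * The caller owns the result and must free() it.
 */
static char *idx_path_from_keep(const char *keep_path)
{
	static const char keep[] = ".keep";
	static const char idx[] = ".idx";
	const size_t keep_len = sizeof(keep) - 1; /* 5, as in the patch */
	size_t len, base_len;
	char *out;

	assert(keep_path != NULL);
	len = strlen(keep_path);
	if (len < keep_len ||
	    memcmp(keep_path + len - keep_len, keep, keep_len) != 0)
		return NULL;

	base_len = len - keep_len;
	out = malloc(base_len + sizeof(idx)); /* sizeof(idx) includes the NUL */
	if (!out)
		return NULL;
	memcpy(out, keep_path, base_len);
	memcpy(out + base_len, idx, sizeof(idx));
	return out;
}
```

Unlike the patch, which checks the suffix with suffixcmp() before trimming, this sketch folds the check and the rewrite into one function so that a non-matching path yields NULL instead of a truncated name.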