diff options
author | Jeff King <peff@peff.net> | 2010-04-01 20:14:24 -0400 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2010-04-02 00:11:20 -0700 |
commit | b3373982667dc983b8dacf33861d25b20bafb995 (patch) | |
tree | 852fb52d64888301061e03ec99f4906385cb3aa9 /commit.h | |
parent | d9bae1a178f0f8b198ea611e874975214ad6f990 (diff) | |
download | git-b3373982667dc983b8dacf33861d25b20bafb995.tar.gz |
diff: avoid useless filespec population
builtin_diff calls fill_mmfile fairly early, which in turn
calls diff_populate_filespec, which actually retrieves the
file's blob contents into a buffer. Long ago, this was
sensible as we would need to look at the blobs eventually.
These days, however, we may not ever want those blobs if we
end up using a textconv cache, and for large binary files
(exactly the sort for which you might have a textconv
cache), just retrieving the objects can be costly.
This patch just pushes the fill_mmfile call a bit later, so
we can avoid populating the filespec in some cases. There
is one thing to note that looks like a bug but isn't. We
push the fill_mmfile down into the first branch of a
conditional. It seems like we would need it on the other
branch, too, but we don't; fill_textconv does it for us (in
fact, before this, we were just writing over the results of
the fill_mmfile on that branch).
Here's a timing sample on a commit with 45 changed jpgs and
avis. The result is fully textconv cached, but we still
wasted a lot of time just pulling the blobs from storage.
The total size of the blobs (source and dest) is about
180M.
[before]
$ time git show >/dev/null
real 0m0.352s
user 0m0.148s
sys 0m0.200s
[after]
$ time git show >/dev/null
real 0m0.009s
user 0m0.004s
sys 0m0.004s
And that's on a warm cache. On a cold cache, the "after"
case is not much worse, but the "before" case has to do an
extra 180M of I/O.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'commit.h')
0 files changed, 0 insertions, 0 deletions