author: Johan Herland <johan@herland.net> 2011-04-29 11:36:21 +0200
committer: Junio C Hamano <gitster@pobox.com> 2011-04-29 11:22:55 -0700
commit: 1c57a627bf269f3c83c48ad724cd8b14292502ef
tree: cf9a49d2ff4665e31f21c4885ce9193dccf87d78 /Documentation
parent: 712d2c7dd893212756c21787fc12d6f71327e167
New --dirstat=lines mode, doing dirstat analysis based on diffstat
This patch adds an alternative implementation of show_dirstat(), called
show_dirstat_by_line(), which uses the more expensive diffstat analysis
(as opposed to show_dirstat()'s own (relatively inexpensive) analysis)
to derive the numbers from which the --dirstat output is computed.

The alternative implementation is controlled by the new "lines" parameter
to the --dirstat option (or the diff.dirstat config variable).

For binary files, the diffstat analysis counts bytes instead of lines,
so to prevent binary files from dominating the dirstat results, the
byte counts for binary files are divided by 64 before being compared to
their textual/line-based counterparts. This is a stupid and ugly - but
very cheap - heuristic.

In linux-2.6.git, running the three different --dirstat modes:

  time git diff v2.6.20..v2.6.30 --dirstat=changes > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=lines > /dev/null
vs.
  time git diff v2.6.20..v2.6.30 --dirstat=files > /dev/null

yields the following average runtimes on my machine:

 - "changes" (default): ~6.0 s
 - "lines":             ~9.6 s
 - "files":             ~0.1 s

So, as expected, there's a considerable performance hit (~60%) by going
through the full diffstat analysis as compared to the default "changes"
analysis (obviously, "files" is much faster than both). As such, the
"lines" mode is probably only useful if you really need the --dirstat
numbers to be consistent with the numbers returned from the other
--*stat options.

The patch also includes documentation and tests for the new dirstat mode.

Improved-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
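
To illustrate the binary-file heuristic described in the commit message, here is a minimal Python sketch. It is not the C code this patch adds to git. The names `file_damage` and `dirstat_by_line_sketch` are hypothetical, rounding partial 64-byte chunks up is an assumption of the sketch, and grouping by immediate parent directory leaves out the threshold and cumulative handling that real --dirstat output applies.

```python
from collections import defaultdict

# Bytes of binary change treated as one "line", per the commit message.
BINARY_CHUNK = 64


def file_damage(added, deleted, is_binary=False):
    """Return the per-file weight in a 'lines'-style analysis.

    For text files, `added` and `deleted` are line counts.
    For binary files they are byte counts, scaled down to 64-byte chunks.
    Rounding up (so a tiny binary change still counts) is an assumption.
    """
    total = added + deleted
    if is_binary:
        total = (total + BINARY_CHUNK - 1) // BINARY_CHUNK
    return total


def dirstat_by_line_sketch(files):
    """Map each immediate parent directory to its share (in percent).

    `files` is an iterable of (path, added, deleted, is_binary) tuples,
    i.e. the kind of numbers a diffstat pass would produce.
    """
    per_dir = defaultdict(int)
    total = 0
    for path, added, deleted, is_binary in files:
        damage = file_damage(added, deleted, is_binary)
        if "/" in path:
            directory = path.rsplit("/", 1)[0] + "/"
        else:
            directory = "./"  # hypothetical label for top-level files
        per_dir[directory] += damage
        total += damage
    if total == 0:
        return {}
    return {d: 100.0 * dmg / total for d, dmg in per_dir.items()}


if __name__ == "__main__":
    sample = [
        ("Documentation/diff-config.txt", 8, 0, False),
        ("Documentation/diff-options.txt", 8, 0, False),
        ("t/blob.bin", 640, 384, True),  # made-up binary file, byte counts
    ]
    for directory, percent in sorted(dirstat_by_line_sketch(sample).items()):
        print(f"{percent:5.1f}% {directory}")
```

Without the division by 64, the made-up binary file above would contribute 1024 units against 16 for the two text files and swamp the result, which is the effect the heuristic is meant to avoid.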
Diffstat (limited to 'Documentation')
-rw-r--r-- Documentation/diff-config.txt | 8
-rw-r--r-- Documentation/diff-options.txt | 8
2 files changed, 16 insertions, 0 deletions
diff --git a/Documentation/diff-config.txt b/Documentation/diff-config.txt
index 228329d4ad..1aed79e7dc 100644
--- a/Documentation/diff-config.txt
+++ b/Documentation/diff-config.txt
@@ -23,6 +23,14 @@ diff.dirstat::
the amount of pure code movements within a file. In other words,
rearranging lines in a file is not counted as much as other changes.
This is the default behavior when no parameter is given.
+`lines`;;
+ Compute the dirstat numbers by doing the regular line-based diff
+ analysis, and summing the removed/added line counts. (For binary
+ files, count 64-byte chunks instead, since binary files have no
+ natural concept of lines). This is a more expensive `--dirstat`
+ behavior than the `changes` behavior, but it does count rearranged
+ lines within a file as much as other changes. The resulting output
+ is consistent with what you get from the other `--*stat` options.
`files`;;
Compute the dirstat numbers by counting the number of files changed.
Each changed file counts equally in the dirstat analysis. This is
diff --git a/Documentation/diff-options.txt b/Documentation/diff-options.txt
index dc023523ee..bddceb06b7 100644
--- a/Documentation/diff-options.txt
+++ b/Documentation/diff-options.txt
@@ -81,6 +81,14 @@ endif::git-format-patch[]
the amount of pure code movements within a file. In other words,
rearranging lines in a file is not counted as much as other changes.
This is the default behavior when no parameter is given.
+`lines`;;
+ Compute the dirstat numbers by doing the regular line-based diff
+ analysis, and summing the removed/added line counts. (For binary
+ files, count 64-byte chunks instead, since binary files have no
+ natural concept of lines). This is a more expensive `--dirstat`
+ behavior than the `changes` behavior, but it does count rearranged
+ lines within a file as much as other changes. The resulting output
+ is consistent with what you get from the other `--*stat` options.
`files`;;
Compute the dirstat numbers by counting the number of files changed.
Each changed file counts equally in the dirstat analysis. This is