summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorÆvar Arnfjörð Bjarmason <avarab@gmail.com>2017-05-25 19:45:21 +0000
committerJunio C Hamano <gitster@pobox.com>2017-05-26 12:52:36 +0900
commit3878c7a5404dc20b24d7a3df2dce76bdc25fbb7f (patch)
tree9bfe520ff11bf380080f4b543aebec7f18131d7c
parentb11ad029cb67070f04e131b3afb739da5d086509 (diff)
downloadgit-3878c7a5404dc20b24d7a3df2dce76bdc25fbb7f.tar.gz
perf: add a comparison test of grep regex engines
Add a very basic performance comparison test comparing the POSIX basic, extended and perl engines. In theory the "basic" and "extended" engines should be implemented using the same underlying code with a slightly different pattern parser, but some implementations may not do this. Jump through some slight hoops to test both, which is worthwhile since "basic" is the default. Running this on an i7 3.4GHz Linux 4.9.0-2 Debian testing against a checkout of linux.git & latest upstream PCRE, both PCRE and git compiled with -O3 using gcc 7.1.1: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux ./run p7820-grep-engines.sh [...] Test this tree --------------------------------------------------------------- 7820.1: basic grep 'how.to' 0.34(1.24+0.53) 7820.2: extended grep 'how.to' 0.33(1.23+0.45) 7820.3: perl grep 'how.to' 0.31(1.05+0.56) 7820.5: basic grep '^how to' 0.32(1.24+0.42) 7820.6: extended grep '^how to' 0.33(1.20+0.44) 7820.7: perl grep '^how to' 0.57(2.67+0.42) 7820.9: basic grep '[how] to' 0.51(2.16+0.45) 7820.10: extended grep '[how] to' 0.49(2.20+0.43) 7820.11: perl grep '[how] to' 0.56(2.60+0.43) 7820.13: basic grep '\(e.t[^ ]*\|v.ry\) rare' 0.66(3.25+0.40) 7820.14: extended grep '(e.t[^ ]*|v.ry) rare' 0.65(3.19+0.46) 7820.15: perl grep '(e.t[^ ]*|v.ry) rare' 1.05(5.74+0.34) 7820.17: basic grep 'm\(ú\|u\)lt.b\(æ\|y\)te' 0.34(1.28+0.47) 7820.18: extended grep 'm(ú|u)lt.b(æ|y)te' 0.34(1.38+0.38) 7820.19: perl grep 'm(ú|u)lt.b(æ|y)te' 0.39(1.56+0.44) Options can also be passed to git-grep via the GIT_PERF_7820_GREP_OPTS environment variable. There are various modes such as "-v" that have very different performance profiles, but handling the combinatorial explosion of testing all those options would make this script much more complex and harder to maintain. Instead just add the ability to do one-shot runs with arbitrary options, e.g.: $ GIT_PERF_REPEAT_COUNT=10 GIT_PERF_LARGE_REPO=~/g/linux GIT_PERF_7820_GREP_OPTS=" -i" ./run p7820-grep-engines.sh [...] Test this tree ------------------------------------------------------------------ 7820.1: basic grep -i 'how.to' 0.49(1.72+0.38) 7820.2: extended grep -i 'how.to' 0.46(1.64+0.42) 7820.3: perl grep -i 'how.to' 0.44(1.45+0.45) 7820.5: basic grep -i '^how to' 0.47(1.76+0.38) 7820.6: extended grep -i '^how to' 0.47(1.70+0.42) 7820.7: perl grep -i '^how to' 0.65(2.72+0.37) 7820.9: basic grep -i '[how] to' 0.86(3.64+0.42) 7820.10: extended grep -i '[how] to' 0.84(3.62+0.46) 7820.11: perl grep -i '[how] to' 0.73(3.06+0.39) 7820.13: basic grep -i '\(e.t[^ ]*\|v.ry\) rare' 1.63(8.13+0.36) 7820.14: extended grep -i '(e.t[^ ]*|v.ry) rare' 1.64(8.01+0.44) 7820.15: perl grep -i '(e.t[^ ]*|v.ry) rare' 1.44(6.88+0.44) 7820.17: basic grep -i 'm\(ú\|u\)lt.b\(æ\|y\)te' 0.66(2.67+0.44) 7820.18: extended grep -i 'm(ú|u)lt.b(æ|y)te' 0.66(2.67+0.43) 7820.19: perl grep -i 'm(ú|u)lt.b(æ|y)te' 0.59(2.31+0.37) Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
-rwxr-xr-xt/perf/p7820-grep-engines.sh56
1 files changed, 56 insertions, 0 deletions
diff --git a/t/perf/p7820-grep-engines.sh b/t/perf/p7820-grep-engines.sh
new file mode 100755
index 0000000000..62aba19e76
--- /dev/null
+++ b/t/perf/p7820-grep-engines.sh
@@ -0,0 +1,56 @@
+#!/bin/sh
+
+test_description="Comparison of git-grep's regex engines
+
+Set GIT_PERF_7820_GREP_OPTS in the environment to pass options to
+git-grep. Make sure to include a leading space,
+e.g. GIT_PERF_7820_GREP_OPTS=' -i'. Some options to try:
+
+ -i
+ -w
+ -v
+ -vi
+ -vw
+ -viw
+"
+
+. ./perf-lib.sh
+
+test_perf_large_repo
+test_checkout_worktree
+
+for pattern in \
+ 'how.to' \
+ '^how to' \
+ '[how] to' \
+ '\(e.t[^ ]*\|v.ry\) rare' \
+ 'm\(ú\|u\)lt.b\(æ\|y\)te'
+do
+ for engine in basic extended perl
+ do
+ if test $engine != "basic"
+ then
+ # Poor man's basic -> extended converter.
+ pattern=$(echo "$pattern" | sed 's/\\//g')
+ fi
+ if test $engine = "perl" && ! test_have_prereq PCRE
+ then
+ prereq="PCRE"
+ else
+ prereq=""
+ fi
+ test_perf $prereq "$engine grep$GIT_PERF_7820_GREP_OPTS '$pattern'" "
+ git -c grep.patternType=$engine grep$GIT_PERF_7820_GREP_OPTS -- '$pattern' >'out.$engine' || :
+ "
+ done
+
+ test_expect_success "assert that all engines found the same for$GIT_PERF_7820_GREP_OPTS '$pattern'" '
+ test_cmp out.basic out.extended &&
+ if test_have_prereq PCRE
+ then
+ test_cmp out.basic out.perl
+ fi
+ '
+done
+
+test_done