make /fixed-substr/ much faster.

TL;DR: on platforms with a libc memchr() implementation which makes good use of underlying hardware support, patterns which include fixed substrings will now often be much faster; for example with glibc on on a recent x86_64 CPU, this: $s = "a" x 1000 . "wxyz"; $s =~ /wxyz/ for 1..30000 is now about 7 times faster. On systems with slow memchr(), e.g. 32-bit ARM Raspberry Pi, there will be a small or little speedup. Conversely, some pathological cases, such as "ab" x 1000 =~ /aa/ will be slower now; up to 3 times slower on the rPi, 1.5x slower on x86_64. In detail: The perl core includes a Boyer-Moore substring matcher, Perl_fbm_instr(), which is used to quickly find a fixed substring within a longer string, where a table of lookups is pre-computed from the substring. As well as being used in index() when the substring is a constant, its main use is in patterns. When the regex engine compiles a pattern, it typically takes note of the two longest fixed substrings within the pattern; for example in /[01]abc\w+de\d+fghij+/ the two longest are "abc" and "fghij". The engine uses Perl_fbm_instr() to scan for these two strings before running the full NFA. This often allows the string to be quickly rejected, or to find a suitable minimum starting point to run the NFA. However, Perl_fbm_instr() was written about 16 years ago and has been virtually untouched since, so it could do with some love. It currently special-cases strings of length 1 and 2, using roll-your-own loops along the lines of while (s < end) { if (*s++ = c1) ... } while strings of length 3+ use the Boyer-Moore algorithm. The big advantage of BM is that in a best-case, where none of the characters from the substring are found in this region of the string, it only has to test every N'th char, where N is length of the substring. For example when searching for wxyz in abcdefghikl..., it just reads and tests d,h,l,.. However these days some platforms have decent memchr() implementations. For example, glibc has assembly-level implementations for i386, x86_64, sparc32/64, powerpc32/64, s390-32/64, arm, m68k and ia64 by the looks of it. These can often be substantially faster than a C-level implementation. This commit makes Perl_fbm_instr() use memchr() where possible. For the length == 1 special case, it just calls memchr() directly rather than using a loop as previously. For the length == 2 special case, it continues to distinguish the cases where the two chars of the substring are the same or differ. For the former it calls memchr() after an initial direct failure, i.e. if (*s != c) { s++; s = memchr(....); ... } For the latter case it does a similar direct test first (to avoid the costly overhead of a call to memchr() when the next char is the one we seek anyway), but in addition, on each failure to find the second char following a found first char, it swaps which char it's searching for. This means that in something like "aaaaaaa..." =~ /ab/, it wont keep hopping 1 char position with memchar(s,'a'); after the first hop it will do memchr(s,'b') and skip lots of chars in one go. This helps reduce the number of pathological cases. For the length >= 3 cases (normal BM), it keeps using BM, but after each iteration where the pointer has been incremented by the skip determined by the BM algorithm, it now does an additional if (*s != c) { s++; s = memchr(....); ... } step before running the next iteration of BM.
author: David Mitchell <davem@iabyn.com> 2015-09-26 13:12:40 +0100
committer: David Mitchell <davem@iabyn.com> 2015-10-13 09:00:26 +0100
commit: 147f21b5b8054c559a1ffb568dbf310244fa0c91 (patch)
tree: 354af91ff33d42ba70decaea17146c06f5a07aa2 /t/perf
parent: 41c8d07a26fff22a68ce29ba9cedf18578b32343 (diff)
download: perl-147f21b5b8054c559a1ffb568dbf310244fa0c91.tar.gz
1 files changed, 88 insertions, 0 deletions
diff --git a/t/perf/benchmarks b/t/perf/benchmarks
index 7fcc1fd253..9456a6e7bb 100644
--- a/t/perf/benchmarks
+++ b/t/perf/benchmarks
@@ -236,6 +236,94 @@
     },
 
 
+    # using a const string as second arg to index triggers using FBM.
+    # the FBM matcher special-cases 1,2-byte strings.
+    #
+    'expr::index::short_const1' => {
+        desc    => 'index of a short string against a 1 char const substr',
+        setup   => 'my $x = "aaaab"',
+        code    => 'index $x, "b"',
+    },
+    'expr::index::long_const1' => {
+        desc    => 'index of a long string against a 1 char const substr',
+        setup   => 'my $x = "a" x 1000 . "b"',
+        code    => 'index $x, "b"',
+    },
+    'expr::index::short_const2aabc_bc' => {
+        desc    => 'index of a short string against a 2 char const substr',
+        setup   => 'my $x = "aaaabc"',
+        code    => 'index $x, "bc"',
+    },
+    'expr::index::long_const2aabc_bc' => {
+        desc    => 'index of a long string against a 2 char const substr',
+        setup   => 'my $x = "a" x 1000 . "bc"',
+        code    => 'index $x, "bc"',
+    },
+    'expr::index::long_const2aa_ab' => {
+        desc    => 'index of a long string aaa.. against const substr "ab"',
+        setup   => 'my $x = "a" x 1000',
+        code    => 'index $x, "ab"',
+    },
+    'expr::index::long_const2bb_ab' => {
+        desc    => 'index of a long string bbb.. against const substr "ab"',
+        setup   => 'my $x = "b" x 1000',
+        code    => 'index $x, "ab"',
+    },
+    'expr::index::long_const2aa_bb' => {
+        desc    => 'index of a long string aaa.. against const substr "bb"',
+        setup   => 'my $x = "a" x 1000',
+        code    => 'index $x, "bb"',
+    },
+    # this one is designed to be pathological
+    'expr::index::long_const2ab_aa' => {
+        desc    => 'index of a long string abab.. against const substr "aa"',
+        setup   => 'my $x = "ab" x 500',
+        code    => 'index $x, "aa"',
+    },
+    # near misses with gaps, 1st letter
+    'expr::index::long_const2aaxx_xy' => {
+        desc    => 'index of a long string with "xx"s against const substr "xy"',
+        setup   => 'my $x = "aaaaaaaaxx" x 100',
+        code    => 'index $x, "xy"',
+    },
+    # near misses with gaps, 2nd letter
+    'expr::index::long_const2aayy_xy' => {
+        desc    => 'index of a long string with "yy"s against const substr "xy"',
+        setup   => 'my $x = "aaaaaaaayy" x 100',
+        code    => 'index $x, "xy"',
+    },
+    # near misses with gaps, duplicate letter
+    'expr::index::long_const2aaxy_xx' => {
+        desc    => 'index of a long string with "xy"s against const substr "xx"',
+        setup   => 'my $x = "aaaaaaaaxy" x 100',
+        code    => 'index $x, "xx"',
+    },
+    # alternating near misses with gaps
+    'expr::index::long_const2aaxxaayy_xy' => {
+        desc    => 'index of a long string with "xx/yy"s against const substr "xy"',
+        setup   => 'my $x = "aaaaaaaaxxbbbbbbbbyy" x 50',
+        code    => 'index $x, "xy"',
+    },
+    'expr::index::short_const3aabcd_bcd' => {
+        desc    => 'index of a short string against a 3 char const substr',
+        setup   => 'my $x = "aaaabcd"',
+        code    => 'index $x, "bcd"',
+    },
+    'expr::index::long_const3aabcd_bcd' => {
+        desc    => 'index of a long string against a 3 char const substr',
+        setup   => 'my $x = "a" x 1000 . "bcd"',
+        code    => 'index $x, "bcd"',
+    },
+    'expr::index::long_const3ab_abc' => {
+        desc    => 'index of a long string of "ab"s against a 3 char const substr "abc"',
+        setup   => 'my $x = "ab" x 500',
+        code    => 'index $x, "abc"',
+    },
+    'expr::index::long_const3bc_abc' => {
+        desc    => 'index of a long string of "bc"s against a 3 char const substr "abc"',
+        setup   => 'my $x = "bc" x 500',
+        code    => 'index $x, "abc"',
+    },
     'expr::index::utf8_position_1' => {
         desc    => 'index of a utf8 string, matching at position 1',
         setup   => 'my $x = "abc". chr(0x100); chop $x',
author	David Mitchell <davem@iabyn.com>	2015-09-26 13:12:40 +0100
committer	David Mitchell <davem@iabyn.com>	2015-10-13 09:00:26 +0100
commit	147f21b5b8054c559a1ffb568dbf310244fa0c91 (patch)
tree	354af91ff33d42ba70decaea17146c06f5a07aa2 /t/perf
parent	41c8d07a26fff22a68ce29ba9cedf18578b32343 (diff)
download	perl-147f21b5b8054c559a1ffb568dbf310244fa0c91.tar.gz