author Karl Williamson <khw@cpan.org> 2018-01-13 15:40:34 -0700
committer Karl Williamson <khw@cpan.org> 2018-01-15 15:20:47 -0700
commit 1d2af5744d75143cf7ee8bfd33d4366a95dd1b95
tree c84a30f7324a7318473554e3544a137a6b88ce7a /embed.fnc
parent 5d0379de16ad15d28efd4497c918e0ed272eb8c3
Avoid some branches
This replaces some looping with branchless code in two places: looking for the first UTF-8 variant byte in a string (which is used under several circumstances), and looking for an ASCII or non-ASCII character during pattern matching. Recent commits changed these operations to do word-at-a-time lookups, essentially vectorizing the problem into 4 or 8 parallel probes. Until this commit, however, once the word containing the desired byte was found, that word was scanned byte-at-a-time in a loop. I found some bit hacks on the internet which, when stitched together, can find the first desired byte in the word without branching, and do so while the word is still loaded, without having to load each byte.
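The idea described above can be sketched roughly as follows. This is an illustration only, not the code added by this commit: the names `load_le64` and `first_variant_byte` are hypothetical, a 64-bit word is assumed, and the word is assembled so that the first byte in memory is the least significant one (as on a little-endian ASCII platform). The particular combination of bit tricks used here is one common choice and may differ from what the commit uses.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build a 64-bit word whose least significant byte
 * is p[0], regardless of the host's byte order. */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t w = 0;
    for (int i = 7; i >= 0; i--)
        w = (w << 8) | p[i];
    return w;
}

/* Hypothetical sketch: return the index (0..7) of the first byte in
 * 'word' whose high bit is set, using no loop and no branch.
 * Precondition: at least one byte has its high bit set. */
static unsigned first_variant_byte(uint64_t word)
{
    const uint64_t MSB = UINT64_C(0x8080808080808080);
    const uint64_t LSB = UINT64_C(0x0101010101010101);

    word &= MSB;            /* keep only the high bit of every byte */
    assert(word != 0);      /* caller must have found a variant already */

    word &= ~word + 1;      /* isolate the lowest set bit */

    /* word - 1 sets every bit below the isolated one; masking with LSB
     * leaves exactly one bit per byte, for bytes 0 up to and including
     * the target byte. */
    word = (word - 1) & LSB;

    /* Multiplying by LSB sums those per-byte bits into the top byte. */
    return (unsigned)((word * LSB) >> 56) - 1;
}
```

For example, for the eight bytes `"abc\xC3\xA9" "fgh"`, `first_variant_byte(load_le64(...))` yields 3, the offset of the `0xC3` byte. The caller is expected to have tested `word & 0x8080808080808080` first, so the only branch left is the per-word check in the outer loop.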
Diffstat (limited to 'embed.fnc')
-rw-r--r-- embed.fnc 3
1 files changed, 3 insertions, 0 deletions
diff --git a/embed.fnc b/embed.fnc
index beb52e8b66..cd654dd1e7 100644
--- a/embed.fnc
+++ b/embed.fnc
@@ -806,6 +806,9 @@ AndmoR |bool |is_utf8_invariant_string|NN const U8* const s \
AnidR |bool |is_utf8_invariant_string_loc|NN const U8* const s \
|STRLEN len \
|NULLOK const U8 ** ep
+#ifndef EBCDIC
+AniR |unsigned int|_variant_byte_number|PERL_UINTMAX_T word
+#endif
#if defined(PERL_CORE) || defined(PERL_EXT)
EinR |Size_t |variant_under_utf8_count|NN const U8* const s \
|NN const U8* const e