author | Karl Williamson <khw@cpan.org> | 2018-01-13 15:40:34 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2018-01-15 15:20:47 -0700 |
commit | 1d2af5744d75143cf7ee8bfd33d4366a95dd1b95 (patch) | |
tree | c84a30f7324a7318473554e3544a137a6b88ce7a /embed.fnc | |
parent | 5d0379de16ad15d28efd4497c918e0ed272eb8c3 (diff) | |
download | perl-1d2af5744d75143cf7ee8bfd33d4366a95dd1b95.tar.gz |
Avoid some branches
This replaces some looping with branchless code in two places: looking
for the first UTF-8 variant byte in a string (which is used under
several circumstances), and looking for an ASCII or non-ASCII character
during pattern matching.
Recent commits changed these operations to do word-at-a-time lookups,
essentially vectorizing the problem into 4 or 8 parallel probes.
But until this commit, once the word containing the desired byte was
found, that word was still scanned byte-at-a-time in a loop.
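For illustration, the word-at-a-time approach with a per-byte rescan can be sketched roughly as follows. This is a minimal sketch, not the code in the Perl core: the function name `first_variant_offset` and the mask name `HIGH_BITS` are hypothetical, and it assumes an ASCII platform with a 64-bit word, where a UTF-8 variant byte is any byte with its high bit set.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One 0x80 per byte: a non-zero AND means some byte in the word is a variant.
 * (Hypothetical name; assumes 8-byte words.) */
#define HIGH_BITS UINT64_C(0x8080808080808080)

/* Return the offset of the first byte >= 0x80 in s[0..len), or len if
 * there is none. */
static size_t first_variant_offset(const unsigned char *s, size_t len)
{
    size_t i = 0;

    assert(s != NULL || len == 0);

    /* Probe eight bytes per iteration. */
    while (len - i >= sizeof(uint64_t)) {
        uint64_t word;
        size_t j;

        memcpy(&word, s + i, sizeof word);      /* alignment-safe load */
        if (word & HIGH_BITS) {
            /* The word contains a variant; rescan it one byte at a time.
             * This inner loop is the part a branchless lookup can replace. */
            for (j = 0; j < sizeof word; j++) {
                if (s[i + j] & 0x80)
                    return i + j;
            }
        }
        i += sizeof word;
    }

    /* Fewer than eight bytes remain; finish byte-at-a-time. */
    for (; i < len; i++) {
        if (s[i] & 0x80)
            return i;
    }
    return len;
}
```

Because the rescan reads the bytes back from memory in order, this sketch does not depend on byte order.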
I found some bit hacks on the internet which, when stitched together,
can find the first desired byte in the word without branching, and do
so while the word is still loaded, without having to load each byte.
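One way such bit hacks can be combined is sketched below. This is not necessarily the sequence this commit uses: the function name `first_variant_byte` and the multiplier constant are assumptions chosen for illustration, and the sketch assumes a 64-bit word in which byte 0 of the string sits in the least significant bits (as after a load on a little-endian machine).

```c
#include <assert.h>
#include <stdint.h>

/* Return the index (0..7) of the first byte in 'word' whose high bit is
 * set, counting from the least significant byte.  'word' must contain at
 * least one such byte.  (Hypothetical helper, not the core's function.) */
static unsigned int first_variant_byte(uint64_t word)
{
    assert(word & UINT64_C(0x8080808080808080));

    /* Keep only the high bit of each byte. */
    word &= UINT64_C(0x8080808080808080);

    /* Isolate the lowest set bit; it is bit 8*k + 7 for the answer k. */
    word &= ~word + 1;

    /* Move that bit to the bottom of its byte, leaving 1 << (8*k). */
    word >>= 7;

    /* Multiplying by 1 << (8*k) shifts the constant left by k bytes, so
     * the top byte of the product is byte (7 - k) of the constant, which
     * was chosen to hold the value k. */
    word *= UINT64_C(0x0001020304050607);

    return (unsigned int)(word >> 56);
}
```

For example, a word whose bytes from least to most significant are `41 41 00 C3 41 00 E9 00` yields 3, the index of the `C3` byte. A big-endian variant would need to isolate the highest set bit instead of the lowest.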
Diffstat (limited to 'embed.fnc')
-rw-r--r-- | embed.fnc | 3 |
1 files changed, 3 insertions, 0 deletions
@@ -806,6 +806,9 @@ AndmoR |bool |is_utf8_invariant_string|NN const U8* const s \
 AnidR |bool |is_utf8_invariant_string_loc|NN const U8* const s \
 |STRLEN len \
 |NULLOK const U8 ** ep
+#ifndef EBCDIC
+AniR |unsigned int|_variant_byte_number|PERL_UINTMAX_T word
+#endif
 #if defined(PERL_CORE) || defined(PERL_EXT)
 EinR |Size_t |variant_under_utf8_count|NN const U8* const s \
 |NN const U8* const e