delta/perl.git - github.com: perl/perl5.git

author	Karl Williamson <khw@cpan.org>	2017-11-15 10:19:33 -0700
committer	Karl Williamson <khw@cpan.org>	2017-11-23 14:18:51 -0700
commit	e17544a60909ed9555c0dad7cd24afc40eb736e7 (patch)
tree	3e49108314dd819ad6880ebaeb4640c0e8b3494d /doop.c
parent	46a08a6f3bc2ec1482773059c74749f47b161b01 (diff)
download	perl-e17544a60909ed9555c0dad7cd24afc40eb736e7.tar.gz

Search for UTF-8 invariants by word

The functions is_utf8_invariant_string() and is_utf8_invariant_string_loc() are used in several places in the core and are part of the public API. This commit speeds them up significantly on ASCII (not EBCDIC) platforms by parsing a word at a time instead of a byte at a time. (Per-byte parsing is retained for any initial bytes needed to reach the next word boundary, and for any final bytes that don't fill an entire word.)

The following results were obtained parsing a long string on a machine with 64-bit words:

              byte    word
             ------  ------
        Ir   100.00  665.35
        Dr   100.00  797.03
        Dw   100.00  102.12
      COND   100.00  799.27
       IND   100.00   97.56
    COND_m   100.00  144.83
     IND_m   100.00   75.00
     Ir_m1   100.00  100.00
     Dr_m1   100.00  100.02
     Dw_m1   100.00  104.12
     Ir_mm   100.00  100.00
     Dr_mm   100.00  100.00
     Dw_mm   100.00  100.00

100% is the baseline; numbers larger than that are improvements. The COND measurement indicates, for example, that there are about 1/8 as many conditional branches in the word-at-a-time version.
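The diff itself is not shown here, so the following is only a minimal sketch of the technique the message describes, not the code from this commit. It assumes an ASCII platform and 64-bit words; the function name is_ascii_invariant and the mask constant are hypothetical and do not come from the Perl source. On such a platform a byte is a UTF-8 invariant exactly when its high bit is clear, so a whole word can be tested with a single mask.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration; not Perl's is_utf8_invariant_string().
 * Returns 1 if every byte in s[0..len-1] has its high bit clear
 * (i.e. is an ASCII-platform UTF-8 invariant), else 0. */
static int is_ascii_invariant(const unsigned char *s, size_t len)
{
    const unsigned char *end = s + len;
    /* One 0x80 per byte: any set bit means a non-invariant byte. */
    const uint64_t high_bits = UINT64_C(0x8080808080808080);

    /* Per-byte: consume initial bytes up to the next word boundary. */
    while (s < end && ((uintptr_t)s % sizeof(uint64_t)) != 0) {
        if (*s & 0x80)
            return 0;
        s++;
    }

    /* Word-at-a-time: one load and one branch per 8 bytes. */
    while ((size_t)(end - s) >= sizeof(uint64_t)) {
        uint64_t word;
        memcpy(&word, s, sizeof word);  /* avoids aliasing problems */
        if (word & high_bits)
            return 0;
        s += sizeof word;
    }

    /* Per-byte: final bytes that don't fill an entire word. */
    while (s < end) {
        if (*s & 0x80)
            return 0;
        s++;
    }

    return 1;
}
```

The single test per 8-byte word is consistent with the roughly eightfold drop in conditional branches reported for COND. A _loc variant would additionally need to report where the first variant byte is; this sketch does not attempt that.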

Diffstat (limited to 'doop.c')

0 files changed, 0 insertions, 0 deletions

