author | Karl Williamson <khw@cpan.org> | 2017-06-06 02:06:30 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2022-08-20 20:13:36 -0600 |
commit | 9b6b0f247b0616a79146a34a5b2490ea71618d7c (patch) | |
tree | fcdd1da8105995beeadb84817486aa1a15f6e013 /.mailmap | |
parent | 55e09612c46a02ba88349885254b48ae51cf30d8 (diff) | |
download | perl-9b6b0f247b0616a79146a34a5b2490ea71618d7c.tar.gz |
Per-word utf8_to_bytes()
This changes utf8_to_bytes() to do an initial per-word scan to determine
whether the source is actually downgradable before starting the
conversion. This scan is significantly faster than the current
per-character scan. However, the conversion itself costs about the same
as before, so for a downgradable source the overall time is a wash with
the previous scheme.
The gain is that a non-downgradable source is detected more quickly.
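As a rough illustration of what a per-word scan can look like, the following C sketch tests eight bytes at a time for any byte of 0xC4 or above. It assumes an ASCII platform and input that is already well-formed UTF-8, in which case such a byte can only be the start byte of a character above U+00FF. The name `is_downgradable_sketch` and the fixed 64-bit word size are assumptions made for the example; this is not the code in this commit.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not Perl's implementation.
 * Assumes 's' holds well-formed UTF-8 of 'len' bytes.
 * Returns true if every character is below U+0100, i.e. no byte is >= 0xC4. */
static bool is_downgradable_sketch(const unsigned char *s, size_t len)
{
    const uint64_t bit7     = UINT64_C(0x8080808080808080);
    const uint64_t bits2to5 = UINT64_C(0x3C3C3C3C3C3C3C3C);
    const uint64_t low7     = UINT64_C(0x7F7F7F7F7F7F7F7F);
    size_t i = 0;

    /* Word-at-a-time pass. memcpy avoids unaligned and aliasing problems. */
    for (; i + sizeof(uint64_t) <= len; i += sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, s + i, sizeof w);

        /* Bit 7 of each byte is set here iff bits 7 and 6 were both set. */
        uint64_t top_two = w & (w << 1);

        /* Bit 7 of each byte is set here iff any of bits 2..5 was set.
         * Each byte sum is at most 0x3C + 0x7F, so no carry crosses bytes. */
        uint64_t any_mid = (w & bits2to5) + low7;

        /* A byte is >= 0xC4 iff both conditions hold for it. */
        if (top_two & any_mid & bit7)
            return false;
    }

    /* Byte-at-a-time pass over the remaining tail. */
    for (; i < len; i++) {
        if (s[i] >= 0xC4)
            return false;
    }
    return true;
}
```

Because the scan only reads and never writes, it can bail out at the first offending word, which is consistent with the large Ir and COND gains reported for the non-downgradable case.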
cachegrind yields the following, based on a 100K-character string; in
the non-downgradable case, the only character that is too large is the
one immediately after those 100K characters:
Key:
Ir Instruction read
Dr Data read
Dw Data write
COND conditional branches
IND indirect branches
_m branch predict miss
_m1 level 1 cache miss
_mm last cache (e.g. L3) miss
- indeterminate percentage (e.g. 1/0)
The numbers represent relative counts per loop iteration, compared to
blead at 100.0%.
Higher is better: for example, using half as many instructions gives 200%,
while using twice as many gives 50%.
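The percentage convention described above amounts to dividing the blead count by the new count. A minimal C sketch, with `relative_percent` as a hypothetical helper name; it does not handle a zero count, which the key reports as indeterminate:

```c
/* Hypothetical helper: relative score with blead fixed at 100%.
 * Fewer events than blead gives a value above 100 (better),
 * more events gives a value below 100 (worse).
 * The caller must ensure new_count is non-zero. */
static double relative_percent(double blead_count, double new_count)
{
    return 100.0 * blead_count / new_count;
}
```

For example, 1000 instructions in blead against 500 in the new code gives 200.0, and 1000 against 2000 gives 50.0.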
unicode::bytes_to_utf8_legal_API_test
Downgrading 100K valid characters
         blead  proposed
        ------  --------
Ir      100.00     99.99
Dr      100.00    100.03
Dw      100.00    100.04
COND    100.00    100.05
IND     100.00    100.00
COND_m  100.00     87.25
IND_m   100.00    100.00
Ir_m1   100.00    123.25
Dr_m1   100.00    100.18
Dw_m1   100.00     99.94
Ir_mm   100.00    100.00
Dr_mm   100.00    100.00
Dw_mm   100.00    100.00
unicode::bytes_to_utf8_illegal
Finding too high a character after 100K valid ones
         blead    fast
        ------  ------
Ir      100.00  188.91
Dr      100.00  179.77
Dw      100.00   66.75
COND    100.00  278.47
IND     100.00  100.00
COND_m  100.00   88.71
IND_m   100.00  100.00
Ir_m1   100.00  121.86
Dr_m1   100.00  100.01
Dw_m1   100.00  100.03
Ir_mm   100.00  100.00
Dr_mm   100.00  100.00
Dw_mm   100.00  100.00