author | Karl Williamson <khw@cpan.org> | 2016-08-31 20:33:21 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2016-08-31 20:33:21 -0600 |
commit | 8d19ebbca9eecf219cc453cffe88722722860dd9 (patch) | |
tree | 3cdd8f02788e9061d090782ed6be525887204a92 /unicode_constants.h | |
parent | 0baa827e0fd16abde2450ecee673f26319010e2d (diff) | |
parent | 2b6852c008f43c765471849e5576c5425c5d9e23 (diff) | |
download | perl-8d19ebbca9eecf219cc453cffe88722722860dd9.tar.gz |
Merge branch for improving API UTF-8 handling into blead

This set of commits allows XS code to check for valid UTF-8 more easily
and quickly, without rolling its own checks, which could overlook
security considerations.

Most of the small UTF-8 handling functions have now been inlined, and
the validity-only checking function has been rewritten to never need to
actually calculate the code point the UTF-8 represents.
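
As an illustration of validity checking that never computes a code point, here is a minimal sketch in C. It is not the code from this branch: the name `sketch_is_valid_utf8` and the helper `is_cont` are hypothetical, and the sketch assumes strict Unicode UTF-8 (no overlongs, no surrogates, nothing above U+10FFFF), which may differ from what Perl's own functions accept. It only compares bytes against the allowed ranges for each position.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if b is a continuation byte (10xxxxxx). */
static bool is_cont(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Hypothetical sketch, not Perl's implementation: returns true if the
 * len bytes at s are well-formed strict UTF-8.  No code point is ever
 * assembled; each byte is only checked against a permitted range. */
bool sketch_is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b0 = s[i];
        unsigned char lo = 0x80, hi = 0xBF; /* range for the 2nd byte */
        size_t need;                        /* continuation bytes due */

        if (b0 < 0x80) {                    /* ASCII */
            i++;
            continue;
        }
        else if (b0 >= 0xC2 && b0 <= 0xDF) {
            need = 1;                       /* C0 and C1 are overlong */
        }
        else if (b0 >= 0xE0 && b0 <= 0xEF) {
            need = 2;
            if (b0 == 0xE0) lo = 0xA0;      /* reject overlongs */
            else if (b0 == 0xED) hi = 0x9F; /* reject surrogates */
        }
        else if (b0 >= 0xF0 && b0 <= 0xF4) {
            need = 3;
            if (b0 == 0xF0) lo = 0x90;      /* reject overlongs */
            else if (b0 == 0xF4) hi = 0x8F; /* reject > U+10FFFF */
        }
        else {
            return false;  /* stray continuation or invalid start byte */
        }

        if (len - i <= need)
            return false;                   /* truncated sequence */
        if (s[i + 1] < lo || s[i + 1] > hi)
            return false;
        for (size_t k = 2; k <= need; k++) {
            if (!is_cont(s[i + k]))
                return false;
        }
        i += need + 1;
    }
    return true;
}
```

The restricted range on the second byte is what rules out overlong forms, surrogates, and values past U+10FFFF without decoding anything; all later bytes only need the generic continuation check.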

The original impetus for this was a set of changes in Encode that made
it vulnerable to malformed UTF-8. Those changes were made to speed up
its UTF-8 processing. Changing Encode to use the new functions speeds it
up on valid input by more than a factor of 5 over the original
implementation, at the expense of slowing down entirely invalid input by
a factor of 4. Since we expect mostly valid input, this is a big win
overall. The original hand-rolled Encode changes sped up the handling of
valid input by a factor of about 1.5, without appreciably slowing down
the handling of invalid input.

Diffstat (limited to 'unicode_constants.h')
0 files changed, 0 insertions, 0 deletions