author | Karl Williamson <khw@cpan.org> | 2018-12-15 21:10:44 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2018-12-26 12:50:37 -0700 |
commit | a56633d124943ba26d859e646540b2c7ca8fa4c9 (patch) | |
tree | feaa5673ed199286a85216d4336c2c3fe82b17a8 /regcomp.sym | |
parent | 817985d646817b410e3ca7ee60a1f3403c84b5b2 (diff) | |
download | perl-a56633d124943ba26d859e646540b2c7ca8fa4c9.tar.gz |
regcomp.c: Generate EXACTFU_SS only for non-UTF8
It turns out that now, the regular methods for handling multi-character
folds work for the ones involving LATIN SMALL LETTER SHARP S when the
pattern is in UTF-8. So the special code for handling this case can be
removed, and a regular EXACTFU node is generated. This has the
advantage of being trie-able and of requiring fewer operations at run
time, because the pattern is pre-folded at compile time and doesn't have
to be re-folded on each backtrack at run time.
This means that the EXACTFU_SS node type will only be generated for
non-UTF-8 patterns, and the handling of it is unchanged in these cases.
Diffstat (limited to 'regcomp.sym')
-rw-r--r-- | regcomp.sym | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/regcomp.sym b/regcomp.sym
index ddf5ba886f..235305dbc9 100644
--- a/regcomp.sym
+++ b/regcomp.sym
@@ -107,7 +107,7 @@ EXACTFAA EXACT, str ; Match this string using /iaa rules (w/len) (stri
 # End of important relative ordering.
-EXACTFU_SS EXACT, str ; Match this string using /iu rules (w/len); (string folded iff in UTF-8; non-UTF8 folded length > unfolded).
+EXACTFU_SS EXACT, str ; Match this string using /iu rules (w/len); (string not UTF-8, only portions guaranteed to be folded; folded length > unfolded).
 EXACTFLU8 EXACT, str ; Like EXACTFU, but use /il, UTF-8, folded, and everything in it is above 255.
 EXACTFAA_NO_TRIE EXACT, str ; Match this string using /iaa rules (w/len) (string not UTF-8, not guaranteed to be folded, not currently trie-able).