author: Karl Williamson <khw@cpan.org> 2018-12-15 21:10:44 -0700
committer: Karl Williamson <khw@cpan.org> 2018-12-26 12:50:37 -0700
commit: a56633d124943ba26d859e646540b2c7ca8fa4c9
tree: feaa5673ed199286a85216d4336c2c3fe82b17a8 /regcomp.sym
parent: 817985d646817b410e3ca7ee60a1f3403c84b5b2
regcomp.c: Generate EXACTFU_SS only for non-UTF8
The regular methods for handling multi-character folds now work for the folds involving LATIN SMALL LETTER SHARP S when the pattern is in UTF-8, so the special code for this case can be removed and a regular EXACTFU node generated instead. This has the advantage of being trie-able, and of requiring fewer operations at run time: the pattern is pre-folded at compile time and doesn't have to be re-folded on each backtrack.

As a result, the EXACTFU_SS node type is now generated only for non-UTF-8 patterns, and its handling in those cases is unchanged.
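As a rough illustration of the fold discussed above (not code from this commit), the Python sketch below uses `str.casefold()` as a stand-in for Perl's /iu folding. The helper names `fold_lengths` and `matches_caseless` are hypothetical. It shows that the sharp s folds to "ss", that the fold grows a one-byte Latin-1 string to two bytes while leaving the UTF-8 byte length the same, and that pre-folding both sides once reduces a caseless comparison to a plain equality test.

```python
# Illustration only: Python's str.casefold() stands in for Perl's /iu folding.
# fold_lengths and matches_caseless are hypothetical helpers, not Perl internals.

SHARP_S = "\u00df"  # LATIN SMALL LETTER SHARP S


def fold_lengths(text, encoding):
    """Return (unfolded, folded) byte lengths of text in the given encoding."""
    folded = text.casefold()
    return len(text.encode(encoding)), len(folded.encode(encoding))


def matches_caseless(pattern, target):
    """Whole-string caseless comparison done by folding both sides once."""
    return pattern.casefold() == target.casefold()


if __name__ == "__main__":
    print(repr(SHARP_S.casefold()))
    print(fold_lengths(SHARP_S, "latin-1"))
    print(fold_lengths(SHARP_S, "utf-8"))
    print(matches_caseless("Stra\u00dfe", "STRASSE"))
```

In this sketch the Latin-1 case is the one where the folded string is longer than the unfolded one, which corresponds to the "folded length > unfolded" note in the node description for non-UTF-8 patterns.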
Diffstat (limited to 'regcomp.sym')
-rw-r--r-- regcomp.sym 2
1 files changed, 1 insertions, 1 deletions
diff --git a/regcomp.sym b/regcomp.sym
index ddf5ba886f..235305dbc9 100644
--- a/regcomp.sym
+++ b/regcomp.sym
@@ -107,7 +107,7 @@ EXACTFAA EXACT, str ; Match this string using /iaa rules (w/len) (stri
# End of important relative ordering.
-EXACTFU_SS EXACT, str ; Match this string using /iu rules (w/len); (string folded iff in UTF-8; non-UTF8 folded length > unfolded).
+EXACTFU_SS EXACT, str ; Match this string using /iu rules (w/len); (string not UTF-8, only portions guaranteed to be folded; folded length > unfolded).
EXACTFLU8 EXACT, str ; Like EXACTFU, but use /il, UTF-8, folded, and everything in it is above 255.
EXACTFAA_NO_TRIE EXACT, str ; Match this string using /iaa rules (w/len) (string not UTF-8, not guaranteed to be folded, not currently trie-able).