author | Karl Williamson <khw@cpan.org> | 2018-12-15 21:10:44 -0700 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2018-12-26 12:50:37 -0700 |
commit | a56633d124943ba26d859e646540b2c7ca8fa4c9 (patch) | |
tree | feaa5673ed199286a85216d4336c2c3fe82b17a8 /regnodes.h | |
parent | 817985d646817b410e3ca7ee60a1f3403c84b5b2 (diff) | |
download | perl-a56633d124943ba26d859e646540b2c7ca8fa4c9.tar.gz |
regcomp.c: Generate EXACTFU_SS only for non-UTF8
It turns out that the regular methods for handling multi-character
folds now work for the ones involving LATIN SMALL LETTER SHARP S when
the pattern is in UTF-8. So the special code for handling this case can
be removed, and a regular EXACTFU node is generated instead. This has
the advantage of being trie-able, and of requiring fewer operations at
run time: the pattern is pre-folded at compile time, so it doesn't have
to be re-folded on each backtrack at run time.
This means that the EXACTFU_SS node type will only be generated for
non-UTF-8 patterns, and the handling of it is unchanged in these cases.
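
The length conditions in the regnodes.h comments ("folded length <= unfolded" versus "folded length > unfolded") can be illustrated with a small sketch. This is not code from regcomp.c: the helper names `folded_len_latin1` and `folded_len_utf8` are hypothetical, and the sketch considers only the fold of LATIN SMALL LETTER SHARP S (U+00DF) to "ss", ignoring every other fold. It shows that the fold grows a non-UTF-8 (Latin-1) string by one byte per sharp s, while in UTF-8 the two-byte encoding 0xC3 0x9F folds to two bytes, so the folded string never outgrows the original.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not a Perl internal: byte length of a Latin-1
 * (non-UTF-8) string after folding, counting only the expansion of
 * LATIN SMALL LETTER SHARP S (byte 0xDF) to "ss". */
size_t folded_len_latin1(const unsigned char *s, size_t len)
{
    size_t out = 0;

    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        out += (s[i] == 0xDF) ? 2 : 1;  /* one byte in, "ss" out */
    return out;
}

/* Hypothetical helper, not a Perl internal: the same count for a UTF-8
 * string, where U+00DF is the two-byte sequence 0xC3 0x9F. */
size_t folded_len_utf8(const unsigned char *s, size_t len)
{
    size_t out = 0;
    size_t i = 0;

    assert(s != NULL || len == 0);
    while (i < len) {
        if (i + 1 < len && s[i] == 0xC3 && s[i + 1] == 0x9F) {
            out += 2;   /* two bytes in, "ss" (two bytes) out */
            i += 2;
        } else {
            out += 1;   /* all other bytes are copied through */
            i += 1;
        }
    }
    return out;
}
```

Under these assumptions, "Straße" takes 6 bytes in Latin-1 and folds to 7, whereas its 7-byte UTF-8 form folds to 7. This is consistent with the UTF-8 pattern being folded at compile time without needing extra room, and with only the non-UTF-8 case needing the separate EXACTFU_SS handling.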
Diffstat (limited to 'regnodes.h')
-rw-r--r-- | regnodes.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/regnodes.h b/regnodes.h
index 94b444379c..9fd01a70fc 100644
--- a/regnodes.h
+++ b/regnodes.h
@@ -53,7 +53,7 @@
 #define EXACTFL 39 /* 0x27 Match this string using /il rules (w/len); (string not guaranteed to be folded). */
 #define EXACTFU 40 /* 0x28 Match this string using /iu rules (w/len); (string folded iff in UTF-8; non-UTF8 folded length <= unfolded). */
 #define EXACTFAA 41 /* 0x29 Match this string using /iaa rules (w/len) (string folded iff in UTF-8; non-UTF8 folded length <= unfolded). */
-#define EXACTFU_SS 42 /* 0x2a Match this string using /iu rules (w/len); (string folded iff in UTF-8; non-UTF8 folded length > unfolded). */
+#define EXACTFU_SS 42 /* 0x2a Match this string using /iu rules (w/len); (string not UTF-8, only portions guaranteed to be folded; folded length > unfolded). */
 #define EXACTFLU8 43 /* 0x2b Like EXACTFU, but use /il, UTF-8, folded, and everything in it is above 255. */
 #define EXACTFAA_NO_TRIE 44 /* 0x2c Match this string using /iaa rules (w/len) (string not UTF-8, not guaranteed to be folded, not currently trie-able). */
 #define EXACT_ONLY8 45 /* 0x2d Like EXACT, but only UTF-8 encoded targets can match */