author: Karl Williamson <public@khwilliamson.com> 2012-01-06 13:46:17 -0700
committer: Karl Williamson <public@khwilliamson.com> 2012-01-13 09:58:39 -0700
commit: bf4c00b474859c4f7090aa4d9988621f0cd3946c
tree: 38e1a0505468ed8a82f5815c6a4a4e27436c0feb /regcomp.c
parent: d9105c956266099fce1d9a12501a44113182711e
regcomp.c: Better optimize [classes] under /aa.
An optimization introduced in 5.14 applies to bracketed character classes of the very special form [Bb], which can be optimized into an EXACTFish node. Because the characters involved are ASCII, such a class can be optimized to an EXACTFA node. If the surrounding options are /aa, any adjacent EXACTFish nodes are likely to be EXACTFA as well, so optimize to that node type instead of EXACTFU, as was done previously. This allows the optimizer to collapse adjacent nodes. For example, qr/a[B]c/aai now gets optimized to a single EXACTFA of "abc". Previously it would have been optimized to EXACTFA<a> . EXACTFU<b> . EXACTFA<c>.
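The effect described above can be illustrated with a small standalone model. This is only a sketch, not code from regcomp.c: the `node` struct, the `fold_op` enum, and the functions `choose_fold_op`, `join_adjacent`, and `build_aBc` are hypothetical names invented for illustration. The flags `aa` and `uni` stand in for MORE_ASCII_RESTRICTED and AT_LEAST_UNI_SEMANTICS, and the model assumes that unicode semantics also hold under /aa, which is consistent with the class previously becoming EXACTFU.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the node types named in the commit message. */
typedef enum { NODE_EXACTF, NODE_EXACTFU, NODE_EXACTFA } fold_op;

typedef struct {
    fold_op op;
    char    text[16];
} node;

/* Same branch order as the patched code: EXACTFA first, then EXACTFU,
 * otherwise EXACTF. Passing aa = false reproduces the pre-patch choice. */
fold_op choose_fold_op(bool aa, bool uni, unsigned char value)
{
    bool ascii = value < 128;
    if (aa && ascii)
        return NODE_EXACTFA;
    if (uni || !ascii)
        return NODE_EXACTFU;
    return NODE_EXACTF;
}

/* Merge each run of adjacent nodes that have exactly the same type.
 * Works in place and returns the number of nodes that remain. */
size_t join_adjacent(node *nodes, size_t n)
{
    size_t out = 0;
    for (size_t i = 0; i < n; i++) {
        node *prev = out > 0 ? &nodes[out - 1] : NULL;
        if (prev != NULL && prev->op == nodes[i].op) {
            /* The toy buffer is fixed-size; refuse to overflow it. */
            assert(strlen(prev->text) + strlen(nodes[i].text) < sizeof prev->text);
            strcat(prev->text, nodes[i].text);
        } else {
            nodes[out++] = nodes[i];
        }
    }
    return out;
}

/* Model a pattern shaped like a[B]c under /i. The outer literals are taken
 * to be EXACTFA, as in the commit message; the class [B] is lowercased and
 * given whatever type choose_fold_op() selects. */
size_t build_aBc(node out[3], bool aa, bool uni)
{
    node a = { NODE_EXACTFA, "a" };
    node c = { NODE_EXACTFA, "c" };
    node b = { NODE_EXACTF, "" };   /* text is zero-filled, so it stays terminated */
    unsigned char value = (unsigned char)tolower('B');

    b.op = choose_fold_op(aa, uni, value);
    b.text[0] = (char)value;

    out[0] = a;
    out[1] = b;
    out[2] = c;
    return join_adjacent(out, 3);
}
```

In this model, `build_aBc(nodes, true, true)` leaves a single EXACTFA-like node holding "abc", while `build_aBc(nodes, false, true)`, which disables the new first branch, leaves three nodes because the middle one has a different type from its neighbours.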
Diffstat (limited to 'regcomp.c')
-rw-r--r-- regcomp.c 14
1 files changed, 9 insertions, 5 deletions
diff --git a/regcomp.c b/regcomp.c
index b71942ff00..c2cc4c410d 100644
--- a/regcomp.c
+++ b/regcomp.c
@@ -11062,12 +11062,16 @@ parseit:
              * is just the lower case of the current one (which may resolve to
              * itself, or to the other one */
             value = toLOWER_LATIN1(value);
-            if (AT_LEAST_UNI_SEMANTICS || !isASCII(value)) {
-                /* To join adjacent nodes, they must be the exact EXACTish
-                 * type. Try to use the most likely type, by using EXACTFU if
-                 * the regex calls for them, or is required because the
-                 * character is non-ASCII */
+            /* To join adjacent nodes, they must be the exact EXACTish type.
+             * Try to use the most likely type, by using EXACTFA if possible,
+             * then EXACTFU if the regex calls for it, or is required because
+             * the character is non-ASCII. (If <value> is ASCII, its fold is
+             * also ASCII for the cases where we get here.) */
+            if (MORE_ASCII_RESTRICTED && isASCII(value)) {
+                op = EXACTFA;
+            }
+            else if (AT_LEAST_UNI_SEMANTICS || !isASCII(value)) {
                 op = EXACTFU;
             }
             else { /* Otherwise, more likely to be EXACTF type */