author | Karl Williamson <public@khwilliamson.com> | 2012-01-06 13:46:17 -0700 |
---|---|---|
committer | Karl Williamson <public@khwilliamson.com> | 2012-01-13 09:58:39 -0700 |
commit | bf4c00b474859c4f7090aa4d9988621f0cd3946c (patch) | |
tree | 38e1a0505468ed8a82f5815c6a4a4e27436c0feb /regcomp.c | |
parent | d9105c956266099fce1d9a12501a44113182711e (diff) | |
regcomp.c: Better optimize [classes] under /aa.
Perl 5.14 introduced an optimization for bracketed character classes of
the very special form [Bb], which can be turned into an EXACTFish node.
When the characters are ASCII, the class can become an EXACTFA node. If
the surrounding options are /aa, it is likely that any adjacent EXACTFish
nodes will be EXACTFA, so optimize to that node instead of the previous
EXACTFU. This allows the optimizer to join any adjacent nodes. For
example,
qr/a[B]c/aai
will now get optimized to an EXACTFA of "abc". Previously it would
have gotten optimized to EXACTFA<a> . EXACTFU<b> . EXACTFA<c>.
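
The effect described above can be sketched in a small standalone C program fragment. This is only an illustration, not the regcomp.c code: the `node` struct, `choose_fold_node` and `join_adjacent` are hypothetical names, and the two flag parameters stand in for the `MORE_ASCII_RESTRICTED` and `AT_LEAST_UNI_SEMANTICS` tests in the patch. It assumes, as the commit message implies, that adjacent nodes are joined only when their types match exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the node types named in the commit message. */
typedef enum { EXACTF, EXACTFU, EXACTFA } node_type;

typedef struct {
    node_type type;
    char text[32];
} node;

/* Same order of tests as the patched code: prefer EXACTFA for an ASCII
 * character under /aa, then EXACTFU if Unicode semantics apply or the
 * character is non-ASCII, otherwise EXACTF. */
static node_type choose_fold_node(int more_ascii_restricted,
                                  int at_least_uni_semantics,
                                  unsigned char value)
{
    int is_ascii = value < 128;

    if (more_ascii_restricted && is_ascii)
        return EXACTFA;
    if (at_least_uni_semantics || !is_ascii)
        return EXACTFU;
    return EXACTF;
}

/* Joins runs of adjacent nodes that have exactly the same type, in place.
 * Returns the number of nodes left (assumption: this is a toy model of the
 * joining step, not the real optimizer). */
static size_t join_adjacent(node *nodes, size_t n)
{
    size_t out = 0;

    assert(nodes != NULL);
    if (n == 0)
        return 0;
    for (size_t i = 1; i < n; i++) {
        size_t joined_len = strlen(nodes[out].text) + strlen(nodes[i].text);

        if (nodes[i].type == nodes[out].type
            && joined_len < sizeof nodes[out].text) {
            strcat(nodes[out].text, nodes[i].text);   /* same type: merge */
        } else {
            nodes[++out] = nodes[i];                  /* different type: keep */
        }
    }
    return out + 1;
}
```

With both flags set (modelling /aa), `choose_fold_node(1, 1, 'b')` yields `EXACTFA`, so the three nodes for "a", "b" and "c" join into a single node holding "abc". If the middle node is `EXACTFU` instead, `join_adjacent` leaves all three nodes separate.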
Diffstat (limited to 'regcomp.c')
-rw-r--r-- | regcomp.c | 14 |
1 files changed, 9 insertions, 5 deletions
@@ -11062,12 +11062,16 @@ parseit:
          * is just the lower case of the current one (which may resolve to
          * itself, or to the other one */
         value = toLOWER_LATIN1(value);
-        if (AT_LEAST_UNI_SEMANTICS || !isASCII(value)) {
-            /* To join adjacent nodes, they must be the exact EXACTish
-             * type. Try to use the most likely type, by using EXACTFU if
-             * the regex calls for them, or is required because the
-             * character is non-ASCII */
+        /* To join adjacent nodes, they must be the exact EXACTish type.
+         * Try to use the most likely type, by using EXACTFA if possible,
+         * then EXACTFU if the regex calls for it, or is required because
+         * the character is non-ASCII. (If <value> is ASCII, its fold is
+         * also ASCII for the cases where we get here.) */
+        if (MORE_ASCII_RESTRICTED && isASCII(value)) {
+            op = EXACTFA;
+        }
+        else if (AT_LEAST_UNI_SEMANTICS || !isASCII(value)) {
             op = EXACTFU;
         }
         else { /* Otherwise, more likely to be EXACTF type */