author | Karl Williamson <khw@cpan.org> | 2019-06-04 12:16:10 -0600 |
---|---|---|
committer | Karl Williamson <khw@cpan.org> | 2019-06-26 09:01:27 -0600 |
commit | 29a889ef8a5621dae70b129c9b5db9e83e1087f9 (patch) | |
tree | db39af97de82472d6b107fa39e08fcc8274bf335 /regcomp.sym | |
parent | f6eaa562638a777c6c2e56637898eb90a0f40412 (diff) | |
regex: Add lower bound to ANYOFH nodes UTF-8 byte
This commit adds a lower bound for the first UTF-8 byte matchable by an
ANYOFH node. The flags field is otherwise unused, and using it for this
purpose allows code to rule out match possibilities without having to
convert from UTF-8 to code point.
It might be better to do the inverse instead and have the field be an
upper bound, because the conversion is cheap for smaller numbers. The
following commit mostly addresses this.
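The pre-filter described in the commit message can be sketched roughly as follows. This is not the code from Perl's regex engine: the `high_class` struct, `decode_utf8`, and `high_class_matches` are hypothetical names invented for illustration, the decoder assumes well-formed UTF-8, and a linear scan stands in for the real lookup. It relies on the fact that well-formed UTF-8 sorts bytewise in code point order, so a start byte below the stored lower bound cannot begin any matchable character.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ANYOFH-style node (not Perl's real layout). */
typedef struct {
    uint8_t lowest_start_byte;    /* plays the role of the flags field */
    const uint32_t *code_points;  /* matchable code points, all above 0xFF */
    size_t count;
} high_class;

/* Decode one UTF-8 sequence. Assumes well-formed input; no validation. */
static uint32_t decode_utf8(const uint8_t *s)
{
    if (s[0] < 0x80)
        return s[0];
    if (s[0] < 0xE0)
        return ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    if (s[0] < 0xF0)
        return ((uint32_t)(s[0] & 0x0F) << 12)
             | ((uint32_t)(s[1] & 0x3F) << 6)
             |  (uint32_t)(s[2] & 0x3F);
    return ((uint32_t)(s[0] & 0x07) << 18)
         | ((uint32_t)(s[1] & 0x3F) << 12)
         | ((uint32_t)(s[2] & 0x3F) << 6)
         |  (uint32_t)(s[3] & 0x3F);
}

/* Return true if the character starting at s is in the class. */
static bool high_class_matches(const high_class *node, const uint8_t *s)
{
    /* Cheap rejection: one byte comparison, no decoding needed. */
    if (s[0] < node->lowest_start_byte)
        return false;

    /* Only candidates that pass the filter pay for the conversion. */
    uint32_t cp = decode_utf8(s);
    for (size_t i = 0; i < node->count; i++) {
        if (node->code_points[i] == cp)
            return true;
    }
    return false;
}
```

For a class containing only U+20AC and U+1F600, the stored bound would be 0xE2, so any input character whose first byte is below 0xE2 (for example U+0100, encoded as C4 80) is rejected by the single comparison. An upper bound would be the mirror image of the same test, which is the trade-off the message raises.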
Diffstat (limited to 'regcomp.sym')
-rw-r--r-- | regcomp.sym | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/regcomp.sym b/regcomp.sym
index 4b8885d0b3..9e2c6d3aea 100644
--- a/regcomp.sym
+++ b/regcomp.sym
@@ -64,7 +64,7 @@ ANYOFL ANYOF, sv charclass S ; Like ANYOF, but /l is in effect
 ANYOFPOSIXL ANYOF, sv charclass_posixl S ; Like ANYOFL, but matches [[:posix:]] classes
 # Must be sequential
-ANYOFH ANYOF, sv 1 S ; Like ANYOF, but only has "High" matches, none in the bitmap;
+ANYOFH ANYOF, sv 1 S ; Like ANYOF, but only has "High" matches, none in the bitmap; the flags field contains the lowest matchable UTF-8 start byte
 ANYOFHb ANYOF, sv 1 S ; Like ANYOFH, but all matches share the same UTF-8 start byte, given in the flags field
 ANYOFM ANYOFM byte 1 S ; Like ANYOF, but matches an invariant byte as determined by the mask and arg