author: Karl Williamson <khw@cpan.org> 2019-06-04 12:16:10 -0600
committer: Karl Williamson <khw@cpan.org> 2019-06-26 09:01:27 -0600
commit: 29a889ef8a5621dae70b129c9b5db9e83e1087f9
tree: db39af97de82472d6b107fa39e08fcc8274bf335 /regcomp.sym
parent: f6eaa562638a777c6c2e56637898eb90a0f40412
regex: Add lower bound to ANYOFH nodes UTF-8 byte
This commit adds a lower bound for the first UTF-8 byte matchable by an ANYOFH node. The node's flags field is otherwise unused, and using it for this purpose lets code rule out match possibilities without having to convert from UTF-8 to a code point. It might be better to do the inverse and have the field hold an upper bound instead, because the conversion is cheap for smaller numbers. The following commit mostly addresses this.
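The early-rejection idea described above can be sketched in C roughly as follows. This is a minimal illustration, not perl's actual matching code: the `sketch_anyofh` struct and the `might_match` function are hypothetical names invented here, and the sketch assumes standard UTF-8, where a character's start byte never decreases as its code point increases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for an ANYOFH node; NOT perl's real regnode layout.
 * The single field plays the role of the flags field from the commit. */
typedef struct {
    uint8_t lowest_start_byte;  /* lowest UTF-8 start byte that can match */
} sketch_anyofh;

/* Returns false when the UTF-8 character beginning at s cannot match the
 * node, judged only by its first byte. A true result means "maybe": the
 * caller would still have to decode the character and do the full lookup. */
static bool might_match(const sketch_anyofh *node, const uint8_t *s)
{
    /* In standard UTF-8, a smaller start byte implies a smaller code
     * point, so anything below the bound is ruled out without decoding. */
    return s[0] >= node->lowest_start_byte;
}
```

For instance, if the lowest code point a class can match were U+2000 (encoded as E2 80 80), the stored bound would be 0xE2. A target character such as U+00E9 (C3 A9) would then be rejected by a single byte comparison, while U+20AC (E2 82 AC) would pass this check and proceed to the full test.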
Diffstat (limited to 'regcomp.sym')
-rw-r--r-- regcomp.sym | 2
1 file changed, 1 insertion, 1 deletion
diff --git a/regcomp.sym b/regcomp.sym
index 4b8885d0b3..9e2c6d3aea 100644
--- a/regcomp.sym
+++ b/regcomp.sym
@@ -64,7 +64,7 @@ ANYOFL ANYOF, sv charclass S ; Like ANYOF, but /l is in effect
ANYOFPOSIXL ANYOF, sv charclass_posixl S ; Like ANYOFL, but matches [[:posix:]] classes
# Must be sequential
-ANYOFH ANYOF, sv 1 S ; Like ANYOF, but only has "High" matches, none in the bitmap;
+ANYOFH ANYOF, sv 1 S ; Like ANYOF, but only has "High" matches, none in the bitmap; the flags field contains the lowest matchable UTF-8 start byte
ANYOFHb ANYOF, sv 1 S ; Like ANYOFH, but all matches share the same UTF-8 start byte, given in the flags field
ANYOFM ANYOFM byte 1 S ; Like ANYOF, but matches an invariant byte as determined by the mask and arg