field | value | date |
---|---|---|
author | Karl Williamson <khw@cpan.org> | 2019-06-04 12:16:10 -0600 |
committer | Karl Williamson <khw@cpan.org> | 2019-06-26 09:01:27 -0600 |
commit | 29a889ef8a5621dae70b129c9b5db9e83e1087f9 (patch) | |
tree | db39af97de82472d6b107fa39e08fcc8274bf335 /regnodes.h | |
parent | f6eaa562638a777c6c2e56637898eb90a0f40412 (diff) | |
download | perl-29a889ef8a5621dae70b129c9b5db9e83e1087f9.tar.gz | |
regex: Add lower bound to ANYOFH nodes UTF-8 byte
This commit adds a lower bound for the first UTF-8 byte matchable by an
ANYOFH node. The flags field is otherwise unused, and using it for this
purpose allows code to rule out match possibilities without having to
convert from UTF-8 to code point.
It might be better to do the inverse instead and have the field be an
upper bound, because the conversion is cheap for smaller numbers. The
following commit mostly addresses this.
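
The early-out that the commit message describes can be sketched in C roughly as follows. This is a minimal illustration, not the regex engine's actual code: `toy_anyofh`, `toy_anyofh_matches`, `decode_utf8`, and `decode_calls` are hypothetical names, the match set is reduced to a single code point range, and the input is assumed to be well-formed standard UTF-8. The sketch relies on the fact that UTF-8 byte order follows code point order, so a character whose first byte is below the lowest matchable start byte cannot match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an ANYOFH node; NOT Perl's real regnode layout. */
typedef struct {
    uint8_t  lowest_start_byte; /* plays the role of the flags field */
    uint32_t lo, hi;            /* toy match set: one inclusive code point range */
} toy_anyofh;

/* Counts full decodes so the effect of the early-out can be observed. */
static int decode_calls = 0;

/* Decode one UTF-8 sequence of 1 to 4 bytes. Assumes well-formed input. */
static uint32_t decode_utf8(const uint8_t *s)
{
    assert(s != NULL);
    if (s[0] < 0x80)
        return s[0];
    if (s[0] < 0xE0)
        return ((uint32_t)(s[0] & 0x1F) << 6) | (uint32_t)(s[1] & 0x3F);
    if (s[0] < 0xF0)
        return ((uint32_t)(s[0] & 0x0F) << 12)
             | ((uint32_t)(s[1] & 0x3F) << 6)
             |  (uint32_t)(s[2] & 0x3F);
    return ((uint32_t)(s[0] & 0x07) << 18)
         | ((uint32_t)(s[1] & 0x3F) << 12)
         | ((uint32_t)(s[2] & 0x3F) << 6)
         |  (uint32_t)(s[3] & 0x3F);
}

/* Return true if the character starting at s is in the node's match set. */
static bool toy_anyofh_matches(const toy_anyofh *node, const uint8_t *s)
{
    /* Cheap rejection: one byte comparison, no code point conversion. */
    if (s[0] < node->lowest_start_byte)
        return false;

    /* Only now pay for the conversion and the real membership test. */
    decode_calls++;
    uint32_t cp = decode_utf8(s);
    return cp >= node->lo && cp <= node->hi;
}
```

For a node covering U+0391 to U+03C9 (lowest start byte `0xCE`), the input bytes `C3 A9` are rejected by the byte comparison alone, while `CE B1` and `D2 90` both pass it and are decoded, with only the first of those two matching. An upper bound, as the message suggests, would instead reject characters with large start bytes, which are the ones that cost the most to decode.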
Diffstat (limited to 'regnodes.h')
-rw-r--r-- | regnodes.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/regnodes.h b/regnodes.h
index 487b6c2dee..5e39b5035b 100644
--- a/regnodes.h
+++ b/regnodes.h
@@ -33,7 +33,7 @@
 #define ANYOFD 19 /* 0x13 Like ANYOF, but /d is in effect */
 #define ANYOFL 20 /* 0x14 Like ANYOF, but /l is in effect */
 #define ANYOFPOSIXL 21 /* 0x15 Like ANYOFL, but matches [[:posix:]] classes */
-#define ANYOFH 22 /* 0x16 Like ANYOF, but only has "High" matches, none in the bitmap; */
+#define ANYOFH 22 /* 0x16 Like ANYOF, but only has "High" matches, none in the bitmap; the flags field contains the lowest matchable UTF-8 start byte */
 #define ANYOFHb 23 /* 0x17 Like ANYOFH, but all matches share the same UTF-8 start byte, given in the flags field */
 #define ANYOFM 24 /* 0x18 Like ANYOF, but matches an invariant byte as determined by the mask and arg */
 #define NANYOFM 25 /* 0x19 complement of ANYOFM */