author: Yves Orton <demerphq@gmail.com> 2023-01-15 13:00:46 +0100
committer: Yves Orton <demerphq@gmail.com> 2023-03-13 21:26:08 +0800
commit: 17e3e02ad120eabda2bdb6c297a70d53294437ef
tree: 6fc99228c2a34c7ee5ec892de4c1f1a980e2f240 /regcomp.sym
parent: 59db194299c94c6707095797c3df0e2f67ff82b2
regex engine - simplify regnode structures and make them consistent
This eliminates the regnode_2L data structure and merges it with the older regnode_2 data structure. At the same time it makes each "arg" property of the various regnode types that have one be consistently structured as an anonymous union like this:

    union {
        U32 arg1u;
        I32 arg1i;
        struct {
            U16 arg1a;
            U16 arg1b;
        };
    };

We then expose four macros for accessing each slot: ARG1u(), ARG1i(), ARG1a() and ARG1b(). Code then explicitly designates which view it wants. The old logic used ARG() to access a U32 arg1 and ARG1() to access an I32 arg1, which was confusing to say the least.

The regnode_2L structure had a U32 arg1 and an I32 arg2, while the regnode_2 data structure had two I32 args. With the new set of macros we use regnode_2 for both, and use the appropriate macros to show whether we want signed or unsigned values.

This also renames regnode_4 to regnode_3. The 3 stands for "three 32-bit args". However, as each slot can also store two U16s, a regnode_3 can hold up to six U16s, or three I32s, or a combination. For instance, the CURLY style nodes use regnode_3 to store four values: ARG1i() for the min count, ARG2i() for the max count, and ARG3a() and ARG3b() for parens before and inside the quantifier.

It also renames the function reganode() to reg1node() and reg2Lanode() to reg2node(). The 2L thing was just confusing.
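The slot layout described in the message can be sketched in standalone C. This is only an illustration under stated assumptions: the `demo_*` types, the `DEMO_ARG*` macros and `demo_make_curly()` are hypothetical names invented here, the integer typedefs are stand-ins for the ones Perl gets from its own headers, and the real regnode structures carry a header that is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for Perl's integer typedefs (the real ones come from perl.h). */
typedef uint32_t U32;
typedef int32_t  I32;
typedef uint16_t U16;

/* One 32-bit argument slot with three views (anonymous struct needs C11). */
typedef union {
    U32 u;                      /* unsigned view, like arg1u */
    I32 i;                      /* signed view, like arg1i */
    struct { U16 a; U16 b; };   /* two 16-bit halves, like arg1a/arg1b */
} demo_slot;

/* Every view shares the same four bytes. */
static_assert(sizeof(demo_slot) == sizeof(U32), "slot must stay 32 bits");

/* Hypothetical three-slot payload, loosely modelled on regnode_3
 * (the real node also has a header, which is left out here). */
typedef struct {
    demo_slot arg1, arg2, arg3;
} demo_node3;

/* Illustrative accessors; the names only mimic ARG1i() and friends. */
#define DEMO_ARG1i(n) ((n)->arg1.i)
#define DEMO_ARG2i(n) ((n)->arg2.i)
#define DEMO_ARG3a(n) ((n)->arg3.a)
#define DEMO_ARG3b(n) ((n)->arg3.b)

/* Pack four CURLY-style values into three slots: two signed counts
 * plus two 16-bit paren numbers sharing the third slot. */
static demo_node3 demo_make_curly(I32 min, I32 max,
                                  U16 parens_before, U16 parens_inside)
{
    demo_node3 n;
    DEMO_ARG1i(&n) = min;
    DEMO_ARG2i(&n) = max;
    DEMO_ARG3a(&n) = parens_before;
    DEMO_ARG3b(&n) = parens_inside;
    return n;
}
```

With this sketch, `demo_make_curly(2, 5, 1, 3)` stores a min of 2, a max of 5 and the two paren numbers 1 and 3 in three 32-bit slots, and each accessor name states whether the caller wants the signed, unsigned or 16-bit view.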
Diffstat (limited to 'regcomp.sym')
-rw-r--r-- regcomp.sym 16
1 files changed, 8 insertions, 8 deletions
diff --git a/regcomp.sym b/regcomp.sym
index d58f1cb54f..1c0af0cd53 100644
--- a/regcomp.sym
+++ b/regcomp.sym
@@ -217,10 +217,10 @@ TAIL NOTHING, no ; Match empty string. Can jump here from outsi
STAR STAR, node 0 V ; Match this (simple) thing 0 or more times: /A{0,}B/ where A is width 1 char
PLUS PLUS, node 0 V ; Match this (simple) thing 1 or more times: /A{1,}B/ where A is width 1 char
-CURLY CURLY, sv 4 V ; Match this (simple) thing {n,m} times: /A{m,n}B/ where A is width 1 char
-CURLYN CURLY, no 4 V ; Capture next-after-this simple thing: /(A){m,n}B/ where A is width 1 char
-CURLYM CURLY, no 4 V ; Capture this medium-complex thing {n,m} times: /(A){m,n}B/ where A is fixed-length
-CURLYX CURLY, sv 4 V ; Match/Capture this complex thing {n,m} times.
+CURLY CURLY, sv 3 V ; Match this (simple) thing {n,m} times: /A{m,n}B/ where A is width 1 char
+CURLYN CURLY, no 3 V ; Capture next-after-this simple thing: /(A){m,n}B/ where A is width 1 char
+CURLYM CURLY, no 3 V ; Capture this medium-complex thing {n,m} times: /(A){m,n}B/ where A is fixed-length
+CURLYX CURLY, sv 3 V ; Match/Capture this complex thing {n,m} times.
#*This terminator creates a loop structure for CURLYX
WHILEM WHILEM, no 0 V ; Do curly processing and see if rest matches.
@@ -252,7 +252,7 @@ REFFAN REF, num 1 V ; Match already matched string, using /aai rul
#*Support for long RE
LONGJMP LONGJMP, off 1 . 1 ; Jump far away.
-BRANCHJ BRANCHJ, off 2L V 1 ; BRANCH with long offset.
+BRANCHJ BRANCHJ, off 2 V 1 ; BRANCH with long offset.
#*Special Case Regops
IFMATCH BRANCHJ, off 1 . 1 ; Succeeds if the following matches; non-zero flags "f", next_off "o" means lookbehind assertion starting "f..(f-o)" characters before current
@@ -263,7 +263,7 @@ GROUPP GROUPP, num 1 ; Whether the group matched.
#*The heavy worker
-EVAL EVAL, evl/flags 2L ; Execute some Perl code.
+EVAL EVAL, evl/flags 2 ; Execute some Perl code.
#*Modifiers
@@ -274,7 +274,7 @@ LOGICAL LOGICAL, no ; Next opcode should set the flag only.
RENUM BRANCHJ, off 1 . 1 ; Group with independently numbered parens.
#*Regex Subroutines
-GOSUB GOSUB, num/ofs 2L ; recurse to paren arg1 at (signed) ofs arg2
+GOSUB GOSUB, num/ofs 2 ; recurse to paren arg1 at (signed) ofs arg2
#*Special conditionals
GROUPPN GROUPPN, no-sv 1 ; Whether the group matched.
@@ -284,7 +284,7 @@ DEFINEP DEFINEP, none 1 ; Never execute directly.
#*Backtracking Verbs
ENDLIKE ENDLIKE, none ; Used only for the type field of verbs
OPFAIL ENDLIKE, no-sv 1 ; Same as (?!), but with verb arg
-ACCEPT ENDLIKE, no-sv/num 2L ; Accepts the current matched string, with verbar
+ACCEPT ENDLIKE, no-sv/num 2 ; Accepts the current matched string, with verbar
#*Verbs With Arguments
VERB VERB, no-sv 1 ; Used only for the type field of verbs