author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2011-06-02 19:04:54 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2011-06-02 19:04:54 +0000
commit 8e87d659e6c3275819d3e760f912d7daf1175036
tree 66fbde1fdd75c8168eb510e90d13b0fc1428d90a /HACKING
parent adf2e795979e12190b8631f1360e941c4efb104d
Refactoring to reduce stack usage for possessively quantified subpatterns. Also
fixed a number of bugs related to repeated subpatterns. Some further tidies
consequent on the removal of OP_OPT are also in this patch.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@604 2f5784b3-3f2a-0410-8824-cb99058d5e15
Diffstat (limited to 'HACKING')
-rw-r--r-- HACKING 14
1 files changed, 12 insertions, 2 deletions
diff --git a/HACKING b/HACKING
index 690b47e..709609b 100644
--- a/HACKING
+++ b/HACKING
@@ -349,8 +349,9 @@ number immediately follows the offset, always as a 2-byte item.
OP_KET is used for subpatterns that do not repeat indefinitely, while
OP_KETRMIN and OP_KETRMAX are used for indefinite repetitions, minimally or
-maximally respectively. All three are followed by LINK_SIZE bytes giving (as a
-positive number) the offset back to the matching bracket opcode.
+maximally respectively (see below for possessive repetitions). All three are
+followed by LINK_SIZE bytes giving (as a positive number) the offset back to
+the matching bracket opcode.
If a subpattern is quantified such that it is permitted to match zero times, it
is preceded by one of OP_BRAZERO, OP_BRAMINZERO, or OP_SKIPZERO. These are
@@ -377,6 +378,15 @@ final replication is changed to OP_SBRA or OP_SCBRA. This tells the matcher
that it needs to check for matching an empty string when it hits OP_KETRMIN or
OP_KETRMAX, and if so, to break the loop.
+Possessive brackets
+-------------------
+
+When a repeated group (capturing or non-capturing) is marked as possessive by
+the "+" notation, e.g. (abc)++, different opcodes are used. Their names all
+have POS on the end, e.g. OP_BRAPOS instead of OP_BRA and OP_SCBRAPOS instead
+of OP_SCBRA. The end of such a group is marked by OP_KETRPOS. If the minimum
+repetition is zero, the group is preceded by OP_BRAPOSZERO.
+
Assertions
----------