Fix problems with conditional references to duplicate named subpatterns.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@459 2f5784b3-3f2a-0410-8824-cb99058d5e15
author: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2009-10-04 09:21:39 +0000
committer: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2009-10-04 09:21:39 +0000
commit: ff6ca31f93c0b34a945871afc954a0aa54800137 (patch)
tree: 9f44d2b7a8367d7b81c7f31670b36b9d696bb8b4
parent: c0aaf57a170aff4923dab5442eb87ad8b09d6c58 (diff)
download: pcre-ff6ca31f93c0b34a945871afc954a0aa54800137.tar.gz
14 files changed, 430 insertions, 61 deletions
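The core of the change is the run-time lookup in pcre_exec.c: when a by-name condition fails for the number stored in the opcode, the name table is searched for other groups with the same name. The following is a minimal, self-contained sketch of that scan, not the code from this commit. It assumes a table of fixed-size entries (a 2-byte big-endian group number followed by a NUL-terminated name) sorted by name so that duplicates are adjacent. The names demo_table, get2 and named_group_is_set, and the is_set array standing in for the offset vector, are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical name table: "A" -> 1, "A" -> 4, "B" -> 2, sorted by name.
   Each entry is 2 bytes of group number followed by a NUL-terminated name. */
#define DEMO_COUNT      3
#define DEMO_ENTRY_SIZE 4

static const unsigned char demo_table[] = {
  0, 1, 'A', 0,
  0, 4, 'A', 0,
  0, 2, 'B', 0
};

/* Read a 2-byte big-endian value, in the spirit of PCRE's GET2 macro. */
static int get2(const unsigned char *p)
{
  return (p[0] << 8) | p[1];
}

/* Return 1 if group refno, or any other group sharing its name, is set.
   is_set[n] is nonzero when group n has captured; nsets is its length. */
static int named_group_is_set(const unsigned char *table, int count,
                              int entry_size, int refno,
                              const int *is_set, int nsets)
{
  const unsigned char *slotA = table;
  const unsigned char *slotB;
  int i, n;

  assert(entry_size > 2);

  /* The number stored at compile time is tested first. */
  if (refno < nsets && is_set[refno]) return 1;

  /* Find the table entry for that number. */
  for (i = 0; i < count; i++)
    {
    if (get2(slotA) == refno) break;
    slotA += entry_size;
    }
  if (i >= count) return 0;

  /* Scan down through adjacent entries with the same name. */
  slotB = slotA;
  while (slotB > table)
    {
    slotB -= entry_size;
    if (strcmp((const char *)slotA + 2, (const char *)slotB + 2) != 0) break;
    n = get2(slotB);
    if (n < nsets && is_set[n]) return 1;
    }

  /* Scan up through adjacent entries with the same name. */
  slotB = slotA;
  for (i++; i < count; i++)
    {
    slotB += entry_size;
    if (strcmp((const char *)slotA + 2, (const char *)slotB + 2) != 0) break;
    n = get2(slotB);
    if (n < nsets && is_set[n]) return 1;
    }

  return 0;
}
```

With this table, a test on group 1 succeeds when only group 4 is set, because both are named "A". It fails when only group 2 is set, because that group is named "B".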
diff --git a/ChangeLog b/ChangeLog
index 3aa750c..a1d2af4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -157,6 +157,14 @@ Version 8.00 ??-???-??
     names, because these cannot be distinguished in PCRE, and this has caused
     confusion. (This is a difference from Perl.)
     
+28. When duplicate subpattern names are present (necessarily with different 
+    numbers, as required by 27 above), and a test is made by name in a 
+    conditional pattern, either for a subpattern having been matched, or for 
+    recursion in such a pattern, all the associated numbered subpatterns are 
+    tested, and the overall condition is true if the condition is true for any
+    one of them. This is the way Perl works, and is also more like the way
+    testing by number works.
+    
 
 Version 7.9 11-Apr-09
 ---------------------
diff --git a/doc/pcrecompat.3 b/doc/pcrecompat.3
index 8bd0675..243e9bf 100644
--- a/doc/pcrecompat.3
+++ b/doc/pcrecompat.3
@@ -98,22 +98,14 @@ the pattern /^(a(b)?)+$/ in Perl leaves $2 unset, but in PCRE it is set to "b".
 argument. PCRE does not support (*MARK).
 .P
 12. PCRE's handling of duplicate subpattern numbers and duplicate subpattern 
-names is not as general as Perl's. This is a consequence of the fact the PCRE 
-works internally just with numbers, using an external table to translate 
-between numbers and names. The following are some specific differences:
-.sp
-(a) A pattern such as (?|(?<a>A)|(?<b)B), where the two capturing 
-parentheses have the same number but different names, is not supported, and 
-causes an error at compile time. If it were allowed, it would not be possible 
-to distinguish which parentheses matched, because both names map to capturing
-subpattern number 1. To avoid this confusing situation, an error is given at 
-compile time.
-.sp
-(b) A condition test for a subpattern with a name that is duplicated gives
-unpredictable results. For example, when the pattern
-(?:(?<a>A)|(?<a>B))(?('a')...|...) is compiled (the PCRE_DUPNAMES option is
-required), the condition test (?('a') is set to test whether subpattern 1 has
-matched, ignoring subpattern 2, even though it has the same name.
+names is not as general as Perl's. This is a consequence of the fact that PCRE
+works internally just with numbers, using an external table to translate
+between numbers and names. In particular, a pattern such as (?|(?<a>A)|(?<b>B)),
+where the two capturing parentheses have the same number but different names,
+is not supported, and causes an error at compile time. If it were allowed, it
+would not be possible to distinguish which parentheses matched, because both
+names map to capturing subpattern number 1. To avoid this confusing situation,
+an error is given at compile time.
 .P
 13. PCRE provides some extensions to the Perl regular expression facilities.
 Perl 5.10 includes new features that are not in earlier versions of Perl, some
@@ -172,6 +164,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 03 October 2009
+Last updated: 04 October 2009
 Copyright (c) 1997-2009 University of Cambridge.
 .fi
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index 8a69ef0..6cb24f4 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -1175,7 +1175,15 @@ pattern matches "abcabc" or "defabc":
 .sp
   /(?|(abc)|(def))(?1)/
 .sp
-An alternative approach to using the "branch reset" feature is to use
+If a
+.\" HTML <a href="#conditions">
+.\" </a>
+condition test
+.\"
+for a subpattern's having matched refers to a non-unique number, the test is
+true if any of the subpatterns of that number have matched.
+.P
+An alternative approach to using this "branch reset" feature is to use
 duplicate named subpatterns, as described in the next section.
 .
 .
@@ -1188,7 +1196,8 @@ if an expression is modified, the numbers may change. To help with this
 difficulty, PCRE supports the naming of subpatterns. This feature was not
 added to Perl until release 5.10. Python had the feature earlier, and PCRE
 introduced it at release 4.0, using the Python syntax. PCRE now supports both
-the Perl and the Python syntax.
+the Perl and the Python syntax. Perl allows identically numbered subpatterns to
+have different names, but PCRE does not.
 .P
 In PCRE, a subpattern can be named in one of three ways: (?<name>...) or
 (?'name'...) as in Perl, or (?P<name>...) as in Python. References to capturing
@@ -1235,10 +1244,23 @@ subpattern, as described in the previous section.)
 .P
 The convenience function for extracting the data by name returns the substring
 for the first (and in this example, the only) subpattern of that name that
-matched. This saves searching to find which numbered subpattern it was. If you
-make a reference to a non-unique named subpattern from elsewhere in the
-pattern, the one that corresponds to the lowest number is used. For further
-details of the interfaces for handling named subpatterns, see the
+matched. This saves searching to find which numbered subpattern it was. 
+.P
+If you make a backreference to a non-unique named subpattern from elsewhere in
+the pattern, the one that corresponds to the first occurrence of the name is
+used. In the absence of duplicate numbers (see the previous section) this is
+the one with the lowest number. If you use a named reference in a condition
+test (see the
+.\"
+.\" HTML <a href="#conditions">
+.\" </a>
+section about conditions
+.\"
+below), either to check whether a subpattern has matched, or to check for 
+recursion, all subpatterns with the same name are tested. If the condition is
+true for any one of them, the overall condition is true. This is the same
+behaviour as testing by number. For further details of the interfaces for
+handling named subpatterns, see the
 .\" HREF
 \fBpcreapi\fP
 .\"
@@ -1877,6 +1899,9 @@ Rewriting the above example to use a named subpattern gives this:
 .sp
   (?<OPEN> \e( )?    [^()]+    (?(<OPEN>) \e) )
 .sp
+If the name used in a condition of this kind is a duplicate, the test is 
+applied to all subpatterns of the same name, and is true if any one of them has 
+matched.
 .
 .SS "Checking for pattern recursion"
 .rs
@@ -1890,14 +1915,16 @@ letter R, for example:
 .sp
 the condition is true if the most recent recursion is into a subpattern whose
 number or name is given. This condition does not check the entire recursion
-stack.
+stack. If the name used in a condition of this kind is a duplicate, the test is 
+applied to all subpatterns of the same name, and is true if any one of them is 
+the most recent recursion. 
 .P
 At "top level", all these recursion test conditions are false. 
 .\" HTML <a href="#recursion">
 .\" </a>
-Recursive patterns
+The syntax for recursive patterns
 .\"
-are described below.
+is described below.
 .
 .SS "Defining subpatterns for use by reference only"
 .rs
@@ -2391,6 +2418,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 03 October 2009
+Last updated: 04 October 2009
 Copyright (c) 1997-2009 University of Cambridge.
 .fi
diff --git a/pcre_compile.c b/pcre_compile.c
index aed9801..5dd5cba 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -1317,7 +1317,9 @@ for (;;)
 
     case OP_CALLOUT:
     case OP_CREF:
+    case OP_NCREF:
     case OP_RREF:
+    case OP_NRREF:
     case OP_DEF:
     code += _pcre_OP_lengths[*code];
     break;
@@ -1432,7 +1434,9 @@ for (;;)
 
     case OP_REVERSE:
     case OP_CREF:
+    case OP_NCREF:
     case OP_RREF:
+    case OP_NRREF:
     case OP_DEF:
     case OP_OPT:
     case OP_CALLOUT:
@@ -4654,7 +4658,10 @@ we set the flag only if there is a literal "\r" or "\n" in the class. */
           }
 
         /* Otherwise (did not start with "+" or "-"), start by looking for the
-        name. */
+        name. If we find a name, add one to the opcode to change OP_CREF or 
+        OP_RREF into OP_NCREF or OP_NRREF. These behave exactly the same, 
+        except they record that the reference was originally to a name. The 
+        information is used to check duplicate names. */
 
         slot = cd->name_table;
         for (i = 0; i < cd->names_found; i++)
@@ -4669,6 +4676,7 @@ we set the flag only if there is a literal "\r" or "\n" in the class. */
           {
           recno = GET2(slot, 0);
           PUT2(code, 2+LINK_SIZE, recno);
+          code[1+LINK_SIZE]++;
           }
 
         /* Search the pattern for a forward reference */
@@ -4677,6 +4685,7 @@ we set the flag only if there is a literal "\r" or "\n" in the class. */
                         (options & PCRE_EXTENDED) != 0)) > 0)
           {
           PUT2(code, 2+LINK_SIZE, i);
+          code[1+LINK_SIZE]++;
           }
 
         /* If terminator == 0 it means that the name followed directly after
@@ -6156,7 +6165,9 @@ do {
      switch (*scode)
        {
        case OP_CREF:
+       case OP_NCREF:
        case OP_RREF:
+       case OP_NRREF:
        case OP_DEF:
        return FALSE;
 
diff --git a/pcre_dfa_exec.c b/pcre_dfa_exec.c
index bdb4668..ce1c456 100644
--- a/pcre_dfa_exec.c
+++ b/pcre_dfa_exec.c
@@ -2287,7 +2287,8 @@ for (;;)
 
         /* Back reference conditions are not supported */
 
-        if (condcode == OP_CREF) return PCRE_ERROR_DFA_UCOND;
+        if (condcode == OP_CREF || condcode == OP_NCREF) 
+          return PCRE_ERROR_DFA_UCOND;
 
         /* The DEFINE condition is always false */
 
@@ -2298,7 +2299,7 @@ for (;;)
         which means "test if in any recursion". We can't test for specifically
         recursed groups. */
 
-        else if (condcode == OP_RREF)
+        else if (condcode == OP_RREF || condcode == OP_NRREF)
           {
           int value = GET2(code, LINK_SIZE+2);
           if (value != RREF_ANY) return PCRE_ERROR_DFA_UCOND;
diff --git a/pcre_exec.c b/pcre_exec.c
index a585481..607c57a 100644
--- a/pcre_exec.c
+++ b/pcre_exec.c
@@ -839,18 +839,139 @@ for (;;)
 
     /* Now see what the actual condition is */
 
-    if (condcode == OP_RREF)         /* Recursion test */
+    if (condcode == OP_RREF || condcode == OP_NRREF)    /* Recursion test */
       {
-      offset = GET2(ecode, LINK_SIZE + 2);     /* Recursion group number*/
-      condition = md->recursive != NULL &&
-        (offset == RREF_ANY || offset == md->recursive->group_num);
-      ecode += condition? 3 : GET(ecode, 1);
-      }
+      if (md->recursive == NULL)                /* Not recursing => FALSE */
+        {
+        condition = FALSE;  
+        ecode += GET(ecode, 1);                         
+        } 
+      else
+        {    
+        int recno = GET2(ecode, LINK_SIZE + 2);   /* Recursion group number*/
+        condition =  (recno == RREF_ANY || recno == md->recursive->group_num);
+          
+        /* If the test is for recursion into a specific subpattern, and it is
+        false, but the test was set up by name, scan the table to see if the
+        name refers to any other numbers, and test them. The condition is true
+        if any one is set. */
+         
+        if (!condition && condcode == OP_NRREF && recno != RREF_ANY)
+          {
+          uschar *slotA = md->name_table;
+          for (i = 0; i < md->name_count; i++)
+            { 
+            if (GET2(slotA, 0) == recno) break; 
+            slotA += md->name_entry_size;
+            }
+             
+          /* Found a name for the number - there can be only one; duplicate
+          names for different numbers are allowed, but not vice versa. First
+          scan down for duplicates. */
+            
+          if (i < md->name_count)
+            {    
+            uschar *slotB = slotA;
+            while (slotB > md->name_table)
+              {
+              slotB -= md->name_entry_size;
+              if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+                {
+                condition = GET2(slotB, 0) == md->recursive->group_num;
+                if (condition) break;   
+                }    
+              else break;
+              } 
+        
+            /* Scan up for duplicates */
+        
+            if (!condition)
+              { 
+              slotB = slotA;
+              for (i++; i < md->name_count; i++)
+                {
+                slotB += md->name_entry_size;
+                if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+                  {
+                  condition = GET2(slotB, 0) == md->recursive->group_num;
+                  if (condition) break;
+                  }    
+                else break;
+                }  
+              } 
+            }
+          }  
+        
+        /* Choose branch according to the condition */
+         
+        ecode += condition? 3 : GET(ecode, 1);
+        }
+      }   
 
-    else if (condcode == OP_CREF)    /* Group used test */
+    else if (condcode == OP_CREF || condcode == OP_NCREF)  /* Group used test */
       {
       offset = GET2(ecode, LINK_SIZE+2) << 1;  /* Doubled ref number */
       condition = offset < offset_top && md->offset_vector[offset] >= 0;
+      
+      /* If the numbered capture is unset, but the reference was by name,
+      scan the table to see if the name refers to any other numbers, and test 
+      them. The condition is true if any one is set. This is tediously similar 
+      to the code above, but not close enough to try to amalgamate. */ 
+      
+      if (!condition && condcode == OP_NCREF)
+        {
+        int refno = offset >> 1; 
+        uschar *slotA = md->name_table;
+         
+        for (i = 0; i < md->name_count; i++)
+          { 
+          if (GET2(slotA, 0) == refno) break; 
+          slotA += md->name_entry_size;
+          }
+           
+        /* Found a name for the number - there can be only one; duplicate names 
+        for different numbers are allowed, but not vice versa. First scan down 
+        for duplicates. */
+          
+        if (i < md->name_count)
+          {    
+          uschar *slotB = slotA;
+          while (slotB > md->name_table)
+            {
+            slotB -= md->name_entry_size;
+            if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+              {
+              offset = GET2(slotB, 0) << 1;
+              condition = offset < offset_top && 
+                md->offset_vector[offset] >= 0;
+              if (condition) break;   
+              }    
+            else break;
+            } 
+      
+          /* Scan up for duplicates */
+      
+          if (!condition)
+            { 
+            slotB = slotA;
+            for (i++; i < md->name_count; i++)
+              {
+              slotB += md->name_entry_size;
+              if (strcmp((char *)slotA + 2, (char *)slotB + 2) == 0)
+                {
+                offset = GET2(slotB, 0) << 1;
+                condition = offset < offset_top && 
+                  md->offset_vector[offset] >= 0;
+                if (condition) break;   
+                }    
+              else break;
+              } 
+            }   
+          }
+        }  
+         
+      /* Choose branch according to the condition */
+
       ecode += condition? 3 : GET(ecode, 1);
       }
 
@@ -4889,6 +5010,13 @@ if (re == NULL || subject == NULL ||
    (offsets == NULL && offsetcount > 0)) return PCRE_ERROR_NULL;
 if (offsetcount < 0) return PCRE_ERROR_BADCOUNT;
 
+/* This information is for finding all the numbers associated with a given 
+name, for condition testing. */
+
+md->name_table = (uschar *)re + re->name_table_offset;
+md->name_count = re->name_count;
+md->name_entry_size = re->name_entry_size;
+
 /* Fish out the optional data from the extra_data structure, first setting
 the default values. */
 
diff --git a/pcre_internal.h b/pcre_internal.h
index 9ea130c..9ba2c88 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -1347,29 +1347,33 @@ enum {
   OP_SCBRA,          /* 98 Start of capturing bracket, check empty */
   OP_SCOND,          /* 99 Conditional group, check empty */
 
+  /* The next two pairs must (respectively) be kept together. */
+   
   OP_CREF,           /* 100 Used to hold a capture number as condition */
-  OP_RREF,           /* 101 Used to hold a recursion number as condition */
-  OP_DEF,            /* 102 The DEFINE condition */
+  OP_NCREF,          /* 101 Same, but generated by a name reference */
+  OP_RREF,           /* 102 Used to hold a recursion number as condition */
+  OP_NRREF,          /* 103 Same, but generated by a name reference */
+  OP_DEF,            /* 104 The DEFINE condition */
 
-  OP_BRAZERO,        /* 103 These two must remain together and in this */
-  OP_BRAMINZERO,     /* 104 order. */
+  OP_BRAZERO,        /* 105 These two must remain together and in this */
+  OP_BRAMINZERO,     /* 106 order. */
 
   /* These are backtracking control verbs */
 
-  OP_PRUNE,          /* 105 */
-  OP_SKIP,           /* 106 */
-  OP_THEN,           /* 107 */
-  OP_COMMIT,         /* 108 */
+  OP_PRUNE,          /* 107 */
+  OP_SKIP,           /* 108 */
+  OP_THEN,           /* 109 */
+  OP_COMMIT,         /* 110 */
 
   /* These are forced failure and success verbs */
 
-  OP_FAIL,           /* 109 */
-  OP_ACCEPT,         /* 110 */
-  OP_CLOSE,          /* 111 Used before OP_ACCEPT to close open captures */ 
+  OP_FAIL,           /* 111 */
+  OP_ACCEPT,         /* 112 */
+  OP_CLOSE,          /* 113 Used before OP_ACCEPT to close open captures */
 
   /* This is used to skip a subpattern with a {0} quantifier */
 
-  OP_SKIPZERO        /* 112 */
+  OP_SKIPZERO        /* 114 */
 };
 
 
@@ -1393,7 +1397,8 @@ for debugging. The macro is referenced only in pcre_printint.c. */
   "Alt", "Ket", "KetRmax", "KetRmin", "Assert", "Assert not",     \
   "AssertB", "AssertB not", "Reverse",                            \
   "Once", "Bra", "CBra", "Cond", "SBra", "SCBra", "SCond",        \
-  "Cond ref", "Cond rec", "Cond def", "Brazero", "Braminzero",    \
+  "Cond ref", "Cond nref", "Cond rec", "Cond nrec", "Cond def",   \
+  "Brazero", "Braminzero",                                        \
   "*PRUNE", "*SKIP", "*THEN", "*COMMIT", "*FAIL", "*ACCEPT",      \
   "Close", "Skip zero"
 
@@ -1455,15 +1460,16 @@ in UTF-8 mode. The code that uses this table must know about such things. */
   1+LINK_SIZE,                   /* SBRA                                   */ \
   3+LINK_SIZE,                   /* SCBRA                                  */ \
   1+LINK_SIZE,                   /* SCOND                                  */ \
-  3,                             /* CREF                                   */ \
-  3,                             /* RREF                                   */ \
+  3, 3,                          /* CREF, NCREF                            */ \
+  3, 3,                          /* RREF, NRREF                            */ \
   1,                             /* DEF                                    */ \
   1, 1,                          /* BRAZERO, BRAMINZERO                    */ \
   1, 1, 1, 1,                    /* PRUNE, SKIP, THEN, COMMIT,             */ \
   1, 1, 3, 1                     /* FAIL, ACCEPT, CLOSE, SKIPZERO          */
 
 
-/* A magic value for OP_RREF to indicate the "any recursion" condition. */
+/* A magic value for OP_RREF and OP_NRREF to indicate the "any recursion"
+condition. */
 
 #define RREF_ANY  0xffff
 
@@ -1521,17 +1527,17 @@ typedef struct pcre_study_data {
   pcre_uint32 size;               /* Total that was malloced */
   pcre_uint32 flags;              /* Private flags */
   uschar start_bits[32];          /* Starting char bits */
-  pcre_uint32 minlength;          /* Minimum subject length */ 
+  pcre_uint32 minlength;          /* Minimum subject length */
 } pcre_study_data;
 
-/* Structure for building a chain of open capturing subpatterns during 
-compiling, so that instructions to close them can be compiled when (*ACCEPT) is 
+/* Structure for building a chain of open capturing subpatterns during
+compiling, so that instructions to close them can be compiled when (*ACCEPT) is
 encountered. */
 
 typedef struct open_capitem {
   struct open_capitem *next;    /* Chain link */
   pcre_uint16 number;           /* Capture number */
-} open_capitem;    
+} open_capitem;
 
 /* Structure for passing "static" information around between the functions
 doing the compiling, so that they are thread-safe. */
@@ -1545,7 +1551,7 @@ typedef struct compile_data {
   const uschar *start_code;     /* The start of the compiled code */
   const uschar *start_pattern;  /* The start of the pattern */
   const uschar *end_pattern;    /* The end of the pattern */
-  open_capitem *open_caps;      /* Chain of open capture items */ 
+  open_capitem *open_caps;      /* Chain of open capture items */
   uschar *hwm;                  /* High watermark of workspace */
   uschar *name_table;           /* The name/number table */
   int  names_found;             /* Number of entries so far */
@@ -1558,7 +1564,7 @@ typedef struct compile_data {
   int  external_flags;          /* External flag bits to be set */
   int  req_varyopt;             /* "After variable item" flag for reqbyte */
   BOOL had_accept;              /* (*ACCEPT) encountered */
-  BOOL check_lookbehind;        /* Lookbehinds need later checking */ 
+  BOOL check_lookbehind;        /* Lookbehinds need later checking */
   int  nltype;                  /* Newline type */
   int  nllen;                   /* Newline string length */
   uschar nl[4];                 /* Newline string when fixed length */
@@ -1582,7 +1588,7 @@ typedef struct recursion_info {
   USPTR save_start;             /* Old value of mstart */
   int *offset_save;             /* Pointer to start of saved offsets */
   int saved_max;                /* Number of saved offsets */
-  int offset_top;               /* Current value of offset_top */ 
+  int offset_top;               /* Current value of offset_top */
 } recursion_info;
 
 /* Structure for building a chain of data for holding the values of the subject
@@ -1607,6 +1613,9 @@ typedef struct match_data {
   int    offset_max;            /* The maximum usable for return data */
   int    nltype;                /* Newline type */
   int    nllen;                 /* Newline string length */
+  int    name_count;            /* Number of names in name table */
+  int    name_entry_size;       /* Size of entry in names table */
+  uschar *name_table;           /* Table of names */  
   uschar nl[4];                 /* Newline string when fixed */
   const uschar *lcc;            /* Points to lower casing table */
   const uschar *ctypes;         /* Points to table of type maps */
diff --git a/pcre_printint.src b/pcre_printint.src
index e096f6d..60d12f9 100644
--- a/pcre_printint.src
+++ b/pcre_printint.src
@@ -251,6 +251,7 @@ for(;;)
     break;   
 
     case OP_CREF:
+    case OP_NCREF: 
     fprintf(f, "%3d %s", GET2(code,1), OP_names[*code]);
     break;
 
@@ -262,6 +263,14 @@ for(;;)
       fprintf(f, "    Cond recurse %d", c);
     break;
 
+    case OP_NRREF:
+    c = GET2(code, 1);
+    if (c == RREF_ANY)
+      fprintf(f, "    Cond nrecurse any");
+    else
+      fprintf(f, "    Cond nrecurse %d", c);
+    break;
+
     case OP_DEF:
     fprintf(f, "    Cond def");
     break;
diff --git a/pcre_study.c b/pcre_study.c
index 29f5482..23f51a0 100644
--- a/pcre_study.c
+++ b/pcre_study.c
@@ -140,7 +140,9 @@ for (;;)
 
     case OP_REVERSE:
     case OP_CREF:
+    case OP_NCREF:
     case OP_RREF:
+    case OP_NRREF:
     case OP_DEF:
     case OP_OPT:
     case OP_CALLOUT:
diff --git a/perltest.pl b/perltest.pl
index 0d290c1..c4f1c97 100755
--- a/perltest.pl
+++ b/perltest.pl
@@ -90,6 +90,10 @@ for (;;)
   # Remove /8 from a UTF-8 pattern.
 
   $utf8 = $pattern =~ s/8(?=[a-z]*$)//;
+  
+  # Remove /J from a pattern with duplicate names.
+  
+  $pattern =~ s/J(?=[a-z]*$)//;  
 
   # Check that the pattern is valid
 
diff --git a/testdata/testinput11 b/testdata/testinput11
index 7286c3d..936bdb1 100644
--- a/testdata/testinput11
+++ b/testdata/testinput11
@@ -297,4 +297,10 @@
     defdef
     abcdef    
 
+/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |b(?<quote> (?<apostrophe>')|(?<realquote>")) ) (?('quote')[a-z]+|[0-9]+)/xJ
+    a\"aaaaa
+    b\"aaaaa 
+    ** Failers 
+    b\"11111
+
 /-- End of testinput11 --/
diff --git a/testdata/testinput2 b/testdata/testinput2
index 317a474..ac108c4 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -3104,4 +3104,25 @@ a random value. /Ix
 
 /(?|(?<a>A)|(?<b>B))/ 
 
+/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |
+    b(?<quote> (?<apostrophe>')|(?<realquote>")) ) 
+    (?('quote')[a-z]+|[0-9]+)/JIx
+    a"aaaaa
+    b"aaaaa 
+    ** Failers 
+    b"11111
+    a"11111 
+    
+/^(?|(a)(b)(c)(?<D>d)|(?<D>e)) (?('D')X|Y)/JDx
+    abcdX
+    eX
+    ** Failers
+    abcdY
+    ey     
+    
+/(?<A>a) (b)(c)  (?<A>d  (?(R&A)$ | (?4)) )/JDx
+    abcdd
+    ** Failers
+    abcdde  
+
 /-- End of testinput2 --/
diff --git a/testdata/testoutput11 b/testdata/testoutput11
index aa4d592..734339a 100644
--- a/testdata/testoutput11
+++ b/testdata/testoutput11
@@ -628,4 +628,23 @@ No match
     abcdef    
 No match
 
+/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |b(?<quote> (?<apostrophe>')|(?<realquote>")) ) (?('quote')[a-z]+|[0-9]+)/xJ
+    a\"aaaaa
+ 0: a"aaaaa
+ 1: "
+ 2: <unset>
+ 3: "
+    b\"aaaaa 
+ 0: b"aaaaa
+ 1: <unset>
+ 2: <unset>
+ 3: <unset>
+ 4: "
+ 5: <unset>
+ 6: "
+    ** Failers 
+No match
+    b\"11111
+No match
+
 /-- End of testinput11 --/
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index 953ce75..f3afc0d 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -7469,7 +7469,7 @@ No match
         ^
         CBra 1
         Cond
-      2 Cond ref
+      2 Cond nref
         y
         Ket
         [()]
@@ -10240,4 +10240,136 @@ No need char
 /(?|(?<a>A)|(?<b>B))/ 
 Failed: different names for subpatterns of the same number are not allowed at offset 15
 
+/(?:a(?<quote> (?<apostrophe>')|(?<realquote>")) |
+    b(?<quote> (?<apostrophe>')|(?<realquote>")) ) 
+    (?('quote')[a-z]+|[0-9]+)/JIx
+Capturing subpattern count = 6
+Named capturing subpatterns:
+  apostrophe   2
+  apostrophe   5
+  quote        1
+  quote        4
+  realquote    3
+  realquote    6
+Options: extended dupnames
+No first char
+No need char
+    a"aaaaa
+ 0: a"aaaaa
+ 1: "
+ 2: <unset>
+ 3: "
+    b"aaaaa 
+ 0: b"aaaaa
+ 1: <unset>
+ 2: <unset>
+ 3: <unset>
+ 4: "
+ 5: <unset>
+ 6: "
+    ** Failers 
+No match
+    b"11111
+No match
+    a"11111 
+No match
+    
+/^(?|(a)(b)(c)(?<D>d)|(?<D>e)) (?('D')X|Y)/JDx
+------------------------------------------------------------------
+  0  79 Bra
+  3     ^
+  4  43 Bra
+  7   7 CBra 1
+ 12     a
+ 14   7 Ket
+ 17   7 CBra 2
+ 22     b
+ 24   7 Ket
+ 27   7 CBra 3
+ 32     c
+ 34   7 Ket
+ 37   7 CBra 4
+ 42     d
+ 44   7 Ket
+ 47  13 Alt
+ 50   7 CBra 1
+ 55     e
+ 57   7 Ket
+ 60  56 Ket
+ 63   8 Cond
+ 66   4 Cond nref
+ 69     X
+ 71   5 Alt
+ 74     Y
+ 76  13 Ket
+ 79  79 Ket
+ 82     End
+------------------------------------------------------------------
+Capturing subpattern count = 4
+Named capturing subpatterns:
+  D   4
+  D   1
+Options: anchored extended dupnames
+No first char
+No need char
+    abcdX
+ 0: abcdX
+ 1: a
+ 2: b
+ 3: c
+ 4: d
+    eX
+ 0: eX
+ 1: e
+    ** Failers
+No match
+    abcdY
+No match
+    ey     
+No match
+    
+/(?<A>a) (b)(c)  (?<A>d  (?(R&A)$ | (?4)) )/JDx
+------------------------------------------------------------------
+  0  65 Bra
+  3   7 CBra 1
+  8     a
+ 10   7 Ket
+ 13   7 CBra 2
+ 18     b
+ 20   7 Ket
+ 23   7 CBra 3
+ 28     c
+ 30   7 Ket
+ 33  29 CBra 4
+ 38     d
+ 40   7 Cond
+ 43     Cond nrecurse 1
+ 46     $
+ 47  12 Alt
+ 50   6 Once
+ 53  33 Recurse
+ 56   6 Ket
+ 59  19 Ket
+ 62  29 Ket
+ 65  65 Ket
+ 68     End
+------------------------------------------------------------------
+Capturing subpattern count = 4
+Named capturing subpatterns:
+  A   1
+  A   4
+Options: extended dupnames
+First char = 'a'
+Need char = 'd'
+    abcdd
+ 0: abcdd
+ 1: a
+ 2: b
+ 3: c
+ 4: dd
+    ** Failers
+No match
+    abcdde  
+No match
+
 /-- End of testinput2 --/
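The compile-time side of the change relies on each by-name opcode sitting immediately after its by-number counterpart, so that incrementing the opcode records that the reference was made by name. A small sketch of that pairing follows. It reuses the opcode names and values shown in pcre_internal.h above, but the helper functions mark_named and is_named_condition are hypothetical and do not appear in the commit.

```c
#include <assert.h>

/* Opcode values as listed in the pcre_internal.h hunk; each pair must stay
   adjacent for the "add one" trick to work. */
enum {
  OP_CREF = 100,   /* capture number as condition */
  OP_NCREF,        /* same, but generated by a name reference */
  OP_RREF,         /* recursion number as condition */
  OP_NRREF,        /* same, but generated by a name reference */
  OP_DEF           /* the DEFINE condition */
};

/* Hypothetical helper: turn a by-number condition opcode into its by-name
   twin, as the compiler does with code[1+LINK_SIZE]++. */
static int mark_named(int op)
{
  assert(op == OP_CREF || op == OP_RREF);
  return op + 1;
}

/* Hypothetical helper: does this condition need the duplicate-name scan? */
static int is_named_condition(int op)
{
  return op == OP_NCREF || op == OP_NRREF;
}
```

Keeping the pairs adjacent means no second lookup table is needed at compile time, at the cost of the ordering constraint noted in the enum comment.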