1 files changed, 48 insertions, 19 deletions
diff --git a/doc/html/pcre2pattern.html b/doc/html/pcre2pattern.html
index 0daddaf..c88e931 100644
--- a/doc/html/pcre2pattern.html
+++ b/doc/html/pcre2pattern.html
@@ -669,8 +669,8 @@ This is an example of an "atomic group", details of which are given
 This particular group matches either the two-character sequence CR followed by
 LF, or one of the single characters LF (linefeed, U+000A), VT (vertical tab,
 U+000B), FF (form feed, U+000C), CR (carriage return, U+000D), or NEL (next
-line, U+0085). The two-character sequence is treated as a single unit that
-cannot be split.
+line, U+0085). Because this is an atomic group, the two-character sequence is
+treated as a single unit that cannot be split.
 </P>
 <P>
 In other modes, two additional characters whose codepoints are greater than 255
@@ -1186,6 +1186,16 @@ when the <i>startoffset</i> argument of <b>pcre2_match()</b> is non-zero. The
 PCRE2_DOLLAR_ENDONLY option is ignored if PCRE2_MULTILINE is set.
 </P>
 <P>
+When the newline convention (see
+<a href="#newlines">"Newline conventions"</a>
+below) recognizes the two-character sequence CRLF as a newline, this is
+preferred, even if the single characters CR and LF are also recognized as
+newlines. For example, if the newline convention is "any", a multiline mode
+circumflex matches before "xyz" in the string "abc\r\nxyz" rather than after
+CR, even though CR on its own is a valid newline. (It also matches at the very
+start of the string, of course.)
+</P>
+<P>
 Note that the sequences \A, \Z, and \z can be used to match the start and
 end of the subject in both modes, and if all branches of a pattern start with
 \A it is always anchored, whether or not PCRE2_MULTILINE is set.
@@ -1236,7 +1246,7 @@ with \C in UTF-8 or UTF-16 mode means that the rest of the string may start
 with a malformed UTF character. This has undefined results, because PCRE2
 assumes that it is matching character by character in a valid UTF string (by
 default it checks the subject string's validity at the start of processing
-unless the PCRE2_NO_UTF_CHECK option is used). 
+unless the PCRE2_NO_UTF_CHECK option is used).
 </P>
 <P>
 An application can lock out the use of \C by setting the
@@ -1247,9 +1257,9 @@ build PCRE2 with the use of \C permanently disabled.
 PCRE2 does not allow \C to appear in lookbehind assertions
 <a href="#lookbehind">(described below)</a>
 in a UTF mode, because this would make it impossible to calculate the length of
-the lookbehind. Neither the alternative matching function 
-<b>pcre2_dfa_match()</b> not the JIT optimizer support \C in a UTF mode. The 
-former gives a match-time error; the latter fails to optimize and so the match 
+the lookbehind. Neither the alternative matching function
+<b>pcre2_dfa_match()</b> nor the JIT optimizer support \C in a UTF mode. The
+former gives a match-time error; the latter fails to optimize and so the match
 is always run using the interpreter.
 </P>
 <P>
@@ -1341,11 +1351,11 @@ example [\000-\037]. Ranges can include any characters that are valid for the
 current mode.
 </P>
 <P>
-There is a special case in EBCDIC environments for ranges whose end points are 
-both specified as literal letters in the same case. For compatibility with 
-Perl, EBCDIC code points within the range that are not letters are omitted. For 
-example, [h-k] matches only four characters, even though the codes for h and k 
-are 0x88 and 0x92, a range of 11 code points. However, if the range is 
+There is a special case in EBCDIC environments for ranges whose end points are
+both specified as literal letters in the same case. For compatibility with
+Perl, EBCDIC code points within the range that are not letters are omitted. For
+example, [h-k] matches only four characters, even though the codes for h and k
+are 0x88 and 0x92, a range of 11 code points. However, if the range is
 specified numerically, for example, [\x88-\x92] or [h-\x92], all code points
 are included.
 </P>
@@ -1672,6 +1682,10 @@ first one in the pattern with the given number. The following pattern matches
 <pre>
   /(?|(abc)|(def))(?1)/
 </pre>
+A relative reference such as (?-1) is no different: it is just a convenient way
+of computing an absolute group number.
+</P>
+<P>
 If a
 <a href="#conditions">condition test</a>
 for a subpattern's having matched refers to a non-unique number, the test is
@@ -2512,7 +2526,7 @@ For example:
   (?(VERSION&#62;=10.4)yes|no)
 </pre>
 This pattern matches "yes" if the PCRE2 version is greater or equal to 10.4, or
-"no" otherwise. The fractional part of the version number may not contain more 
+"no" otherwise. The fractional part of the version number may not contain more
 than two digits.
 </P>
 <br><b>
@@ -2626,6 +2640,21 @@ parentheses preceding the recursion. In other words, a negative number counts
 capturing parentheses leftwards from the point at which it is encountered.
 </P>
 <P>
+Be aware, however, that if
+<a href="#dupsubpatternnumber">duplicate subpattern numbers</a>
+are in use, relative references refer to the earliest subpattern with the
+appropriate number. Consider, for example:
+<pre>
+  (?|(a)|(b)) (c) (?-2)
+</pre>
+The first two capturing groups (a) and (b) are both numbered 1, and group (c)
+is number 2. When the reference (?-2) is encountered, the second most recently
+opened parenthesis has the number 1, but it is the first such group (the (a)
+group) to which the recursion refers. This would be the same if an absolute
+reference (?1) were used. In other words, relative references are just a
+shorthand for computing a group number.
+</P>
+<P>
 It is also possible to refer to subsequently opened parentheses, by writing
 references such as (?+2). However, these cannot be recursive because the
 reference is not inside the parentheses that are referenced. They are always
@@ -2929,13 +2958,13 @@ depending on whether or not a name is present.
 </P>
 <P>
 By default, for compatibility with Perl, a name is any sequence of characters
-that does not include a closing parenthesis. The name is not processed in 
+that does not include a closing parenthesis. The name is not processed in
 any way, and it is not possible to include a closing parenthesis in the name.
-However, if the PCRE2_ALT_VERBNAMES option is set, normal backslash processing 
-is applied to verb names and only an unescaped closing parenthesis terminates 
-the name. A closing parenthesis can be included in a name either as \) or 
-between \Q and \E. If the PCRE2_EXTENDED option is set, unescaped whitespace 
-in verb names is skipped and #-comments are recognized, exactly as in the rest 
+However, if the PCRE2_ALT_VERBNAMES option is set, normal backslash processing
+is applied to verb names and only an unescaped closing parenthesis terminates
+the name. A closing parenthesis can be included in a name either as \) or
+between \Q and \E. If the PCRE2_EXTENDED option is set, unescaped whitespace
+in verb names is skipped and #-comments are recognized, exactly as in the rest
 of the pattern.
 </P>
 <P>
@@ -3359,7 +3388,7 @@ Cambridge, England.
 </P>
 <br><a name="SEC30" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 01 November 2015
+Last updated: 13 November 2015
 <br>
 Copyright &copy; 1997-2015 University of Cambridge.
 <br>