3 files changed, 724 insertions, 896 deletions
diff --git a/doc/html/pcre2partial.html b/doc/html/pcre2partial.html
index a2faa76..e0f37ea 100644
--- a/doc/html/pcre2partial.html
+++ b/doc/html/pcre2partial.html
@@ -14,85 +14,123 @@ please consult the man page, in case the conversion went wrong.
 <br>
 <ul>
 <li><a name="TOC1" href="#SEC1">PARTIAL MATCHING IN PCRE2</a>
-<li><a name="TOC2" href="#SEC2">PARTIAL MATCHING USING pcre2_match()</a>
-<li><a name="TOC3" href="#SEC3">PARTIAL MATCHING USING pcre2_dfa_match()</a>
-<li><a name="TOC4" href="#SEC4">PARTIAL MATCHING AND WORD BOUNDARIES</a>
-<li><a name="TOC5" href="#SEC5">EXAMPLE OF PARTIAL MATCHING USING PCRE2TEST</a>
+<li><a name="TOC2" href="#SEC2">REQUIREMENTS FOR A PARTIAL MATCH</a>
+<li><a name="TOC3" href="#SEC3">PARTIAL MATCHING USING pcre2_match()</a>
+<li><a name="TOC4" href="#SEC4">MULTI-SEGMENT MATCHING WITH pcre2_match()</a>
+<li><a name="TOC5" href="#SEC5">PARTIAL MATCHING USING pcre2_dfa_match()</a>
 <li><a name="TOC6" href="#SEC6">MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()</a>
-<li><a name="TOC7" href="#SEC7">MULTI-SEGMENT MATCHING WITH pcre2_match()</a>
-<li><a name="TOC8" href="#SEC8">ISSUES WITH MULTI-SEGMENT MATCHING</a>
-<li><a name="TOC9" href="#SEC9">AUTHOR</a>
-<li><a name="TOC10" href="#SEC10">REVISION</a>
+<li><a name="TOC7" href="#SEC7">AUTHOR</a>
+<li><a name="TOC8" href="#SEC8">REVISION</a>
 </ul>
 <br><a name="SEC1" href="#TOC1">PARTIAL MATCHING IN PCRE2</a><br>
 <P>
-In normal use of PCRE2, if the subject string that is passed to a matching
-function matches as far as it goes, but is too short to match the entire
-pattern, PCRE2_ERROR_NOMATCH is returned. There are circumstances where it
-might be helpful to distinguish this case from other cases in which there is no
-match.
+In normal use of PCRE2, if there is a match up to the end of a subject string,
+but more characters are needed to match the entire pattern, PCRE2_ERROR_NOMATCH
+is returned, just like any other failing match. There are circumstances where
+it might be helpful to distinguish this "partial match" case.
 </P>
 <P>
-Consider, for example, an application where a human is required to type in data
-for a field with specific formatting requirements. An example might be a date
-in the form <i>ddmmmyy</i>, defined by this pattern:
-<pre>
-  ^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$
-</pre>
-If the application sees the user's keystrokes one by one, and can check that
-what has been typed so far is potentially valid, it is able to raise an error
-as soon as a mistake is made, by beeping and not reflecting the character that
-has been typed, for example. This immediate feedback is likely to be a better
-user interface than a check that is delayed until the entire string has been
-entered. Partial matching can also be useful when the subject string is very
-long and is not all available at once, as discussed below.
+One example is an application where the subject string is very long, and not
+all available at once. The requirement here is to be able to do the matching
+segment by segment, but special action is needed when a matched substring spans
+the boundary between two segments.
+</P>
+<P>
+Another example is checking a user input string as it is typed, to ensure that
+it conforms to a required format. Invalid characters can be immediately
+diagnosed and rejected, giving instant feedback.
 </P>
 <P>
-PCRE2 supports partial matching by means of the PCRE2_PARTIAL_SOFT and
-PCRE2_PARTIAL_HARD options, which can be set when calling a matching function.
-The difference between the two options is whether or not a partial match is
-preferred to an alternative complete match, though the details differ between
-the two types of matching function. If both options are set, PCRE2_PARTIAL_HARD
-takes precedence.
+Partial matching is a PCRE2-specific feature; it is not Perl-compatible. It is
+requested by setting one of the PCRE2_PARTIAL_HARD or PCRE2_PARTIAL_SOFT
+options when calling a matching function. The difference between the two
+options is whether or not a partial match is preferred to an alternative
+complete match, though the details differ between the two types of matching
+function. If both options are set, PCRE2_PARTIAL_HARD takes precedence.
 </P>
 <P>
-If you want to use partial matching with just-in-time optimized code, you must
-call <b>pcre2_jit_compile()</b> with one or both of these options:
+If you want to use partial matching with just-in-time optimized code, as well 
+as setting a partial match option for the matching function, you must also call
+<b>pcre2_jit_compile()</b> with one or both of these options:
 <pre>
-  PCRE2_JIT_PARTIAL_SOFT
   PCRE2_JIT_PARTIAL_HARD
+  PCRE2_JIT_PARTIAL_SOFT
 </pre>
 PCRE2_JIT_COMPLETE should also be set if you are going to run non-partial
-matches on the same pattern. If the appropriate JIT mode has not been compiled,
-interpretive matching code is used.
+matches on the same pattern. Separate code is compiled for each mode. If the
+appropriate JIT mode has not been compiled, interpretive matching code is used.
 </P>
 <P>
 Setting a partial matching option disables two of PCRE2's standard
-optimizations. PCRE2 remembers the last literal code unit in a pattern, and
-abandons matching immediately if it is not present in the subject string. This
-optimization cannot be used for a subject string that might match only
-partially. PCRE2 also knows the minimum length of a matching string, and does
+optimization hints. PCRE2 remembers the last literal code unit in a pattern,
+and abandons matching immediately if it is not present in the subject string.
+This optimization cannot be used for a subject string that might match only
+partially. PCRE2 also remembers a minimum length of a matching string, and does
 not bother to run the matching function on shorter strings. This optimization
 is also disabled for partial matching.
 </P>
-<br><a name="SEC2" href="#TOC1">PARTIAL MATCHING USING pcre2_match()</a><br>
+<br><a name="SEC2" href="#TOC1">REQUIREMENTS FOR A PARTIAL MATCH</a><br>
+<P>
+A possible partial match occurs during matching when the end of the subject
+string is reached successfully, but either more characters are needed to
+complete the match, or the addition of more characters might change what is
+matched.
+</P>
+<P>
+Example 1: if the pattern is /abc/ and the subject is "ab", more characters are
+definitely needed to complete a match. In this case both hard and soft matching
+options yield a partial match.
+</P>
+<P>
+Example 2: if the pattern is /ab+/ and the subject is "ab", a complete match
+can be found, but the addition of more characters might change what is
+matched. In this case, only PCRE2_PARTIAL_HARD returns a partial match;
+PCRE2_PARTIAL_SOFT returns the complete match.
+</P>
+<P>
+On reaching the end of the subject, when PCRE2_PARTIAL_HARD is set, if the next
+pattern item is \z, \Z, \b, \B, or $, there is always a partial match.
+Otherwise, for both options, the next pattern item must be one that inspects a
+character, and at least one of the following must be true:
+</P>
+<P>
+(1) At least one character has already been inspected. An inspected character
+need not form part of the final matched string; lookbehind assertions and the
+\K escape sequence provide ways of inspecting characters before the start of a
+matched string.
+</P>
 <P>
-A partial match occurs during a call to <b>pcre2_match()</b> when the end of the
-subject string is reached successfully, but matching cannot continue because
-more characters are needed, and in addition, either at least one character in
-the subject has been inspected or the pattern contains a lookbehind, or (when 
-PCRE2_PARTIAL_HARD is set) the pattern could match an empty string. An
-inspected character need not form part of the final matched string; lookbehind
-assertions and the \K escape sequence provide ways of inspecting characters
-before the start of a matched string.
+(2) The pattern contains one or more lookbehind assertions. This condition
+exists in case there is a lookbehind that inspects characters before the start 
+of the match.
 </P>
 <P>
-The three additional requirements define the cases where adding more characters
-to the existing subject may complete the same match that would occur if they
-had all been present in the first place. Without these conditions there would
-be a partial match of an empty string at the end of the subject for all
-unanchored patterns (and also for anchored patterns if the subject itself is
-empty).
+(3) There is a special case when the whole pattern can match an empty string.
+When the starting point is at the end of the subject, the empty string match is
+a possibility, and if PCRE2_PARTIAL_SOFT is set and neither of the above
+conditions is true, it is returned. However, because adding more characters
+might result in a non-empty match, PCRE2_PARTIAL_HARD returns a partial match,
+which in this case means "there is going to be a match at this point, but until
+some more characters are added, we do not know if it will be an empty string or
+something longer".
+</P>
+<br><a name="SEC3" href="#TOC1">PARTIAL MATCHING USING pcre2_match()</a><br>
+<P>
+When a partial matching option is set, the result of calling
+<b>pcre2_match()</b> can be one of the following:
+</P>
+<P>
+<b>A successful match</b>
+A complete match has been found, starting and ending within this subject.
+</P>
+<P>
+<b>PCRE2_ERROR_NOMATCH</b>
+No match can start anywhere in this subject.
+</P>
+<P>
+<b>PCRE2_ERROR_PARTIAL</b>
+Adding more characters may result in a complete match that uses one or more
+characters from the end of this subject.
 </P>
 <P>
 When a partial match is returned, the first two elements in the ovector point
@@ -110,26 +148,6 @@ these characters are needed for a subsequent re-match with additional
 characters.
 </P>
 <P>
-What happens when a partial match is identified depends on which of the two
-partial matching options is set.
-</P>
-<br><b>
-PCRE2_PARTIAL_SOFT WITH pcre2_match()
-</b><br>
-<P>
-If PCRE2_PARTIAL_SOFT is set when <b>pcre2_match()</b> identifies a partial
-match, the partial match is remembered, but matching continues as normal, and
-other alternatives in the pattern are tried. If no complete match can be found,
-PCRE2_ERROR_PARTIAL is returned instead of PCRE2_ERROR_NOMATCH.
-</P>
-<P>
-This option is "soft" because it prefers a complete match over a partial match.
-All the various matching items in a pattern behave as if the subject string is
-potentially complete. For example, \z, \Z, and $ match at the end of the
-subject, as normal, and for \b and \B the end of the subject is treated as a
-non-alphanumeric.
-</P>
-<P>
 If there is more than one partial match, the first one that was found provides
 the data that is returned. Consider this pattern:
 <pre>
@@ -138,26 +156,34 @@ the data that is returned. Consider this pattern:
 If this is matched against the subject string "abc123dog", both alternatives
 fail to match, but the end of the subject is reached during matching, so
 PCRE2_ERROR_PARTIAL is returned. The offsets are set to 3 and 9, identifying
-"123dog" as the first partial match that was found. (In this example, there are
-two partial matches, because "dog" on its own partially matches the second
-alternative.)
+"123dog" as the first partial match. (In this example, there are two partial
+matches, because "dog" on its own partially matches the second alternative.)
 </P>
 <br><b>
-PCRE2_PARTIAL_HARD WITH pcre2_match()
+How a partial match is processed by pcre2_match()
 </b><br>
 <P>
-If PCRE2_PARTIAL_HARD is set for <b>pcre2_match()</b>, PCRE2_ERROR_PARTIAL is
-returned as soon as a partial match is found, without continuing to search for
-possible complete matches. This option is "hard" because it prefers an earlier
-partial match over a later complete match. For this reason, the assumption is
-made that the end of the supplied subject string may not be the true end of the
-available data, and so, if \z, \Z, \b, \B, or $ are encountered at the end
-of the subject, the result is PCRE2_ERROR_PARTIAL, whether or not any 
-characters have been inspected.
+What happens when a partial match is identified depends on which of the two
+partial matching options is set.
+</P>
+<P>
+If PCRE2_PARTIAL_HARD is set, PCRE2_ERROR_PARTIAL is returned as soon as a
+partial match is found, without continuing to search for possible complete
+matches. This option is "hard" because it prefers an earlier partial match over
+a later complete match. For this reason, the assumption is made that the end of
+the supplied subject string is not the true end of the available data, which is 
+why \z, \Z, \b, \B, and $ always give a partial match.
+</P>
+<P>
+If PCRE2_PARTIAL_SOFT is set, the partial match is remembered, but matching
+continues as normal, and other alternatives in the pattern are tried. If no
+complete match can be found, PCRE2_ERROR_PARTIAL is returned instead of
+PCRE2_ERROR_NOMATCH. This option is "soft" because it prefers a complete match
+over a partial match. All the various matching items in a pattern behave as if
+the subject string is potentially complete; \z, \Z, and $ match at the end of
+the subject, as normal, and for \b and \B the end of the subject is treated
+as a non-alphanumeric.
 </P>
-<br><b>
-Comparing hard and soft partial matching
-</b><br>
 <P>
 The difference between the two partial matching options can be illustrated by a
 pattern such as:
@@ -182,154 +208,85 @@ to follow this explanation by thinking of the two patterns like this:
 The second pattern will never match "dogsbody", because it will always find the
 shorter match first.
 </P>
-<br><a name="SEC3" href="#TOC1">PARTIAL MATCHING USING pcre2_dfa_match()</a><br>
-<P>
-The DFA functions move along the subject string character by character, without
-backtracking, searching for all possible matches simultaneously. If the end of
-the subject is reached before the end of the pattern, there is the possibility
-of a partial match, again provided that at least one character has been
-inspected.
-</P>
-<P>
-When PCRE2_PARTIAL_SOFT is set, PCRE2_ERROR_PARTIAL is returned only if there
-have been no complete matches. Otherwise, the complete matches are returned.
-However, if PCRE2_PARTIAL_HARD is set, a partial match takes precedence over
-any complete matches. The portion of the string that was matched when the
-longest partial match was found is set as the first matching string.
-</P>
-<P>
-Because the DFA functions always search for all possible matches, and there is
-no difference between greedy and ungreedy repetition, their behaviour is
-different from the standard functions when PCRE2_PARTIAL_HARD is set. Consider
-the string "dog" matched against the ungreedy pattern shown above:
-<pre>
-  /dog(sbody)??/
-</pre>
-Whereas the standard function stops as soon as it finds the complete match for
-"dog", the DFA function also finds the partial match for "dogsbody", and so
-returns that when PCRE2_PARTIAL_HARD is set.
-</P>
-<br><a name="SEC4" href="#TOC1">PARTIAL MATCHING AND WORD BOUNDARIES</a><br>
+<br><b>
+Example of partial matching using pcre2test
+</b><br>
 <P>
-If a pattern ends with one of sequences \b or \B, which test for word
-boundaries, partial matching with PCRE2_PARTIAL_SOFT can give counter-intuitive
-results. Consider this pattern:
-<pre>
-  /\bcat\b/
-</pre>
-This matches "cat", provided there is a word boundary at either end. If the
-subject string is "the cat", the comparison of the final "t" with a following
-character cannot take place, so a partial match is found. However, normal
-matching carries on, and \b matches at the end of the subject when the last
-character is a letter, so a complete match is found. The result, therefore, is
-<i>not</i> PCRE2_ERROR_PARTIAL. Using PCRE2_PARTIAL_HARD in this case does yield
-PCRE2_ERROR_PARTIAL, because then the partial match takes precedence.
-</P>
-<br><a name="SEC5" href="#TOC1">EXAMPLE OF PARTIAL MATCHING USING PCRE2TEST</a><br>
-<P>
-If the <b>partial_soft</b> (or <b>ps</b>) modifier is present on a
-<b>pcre2test</b> data line, the PCRE2_PARTIAL_SOFT option is used for the match.
-Here is a run of <b>pcre2test</b> that uses the date example quoted above:
+The <b>pcre2test</b> data modifiers <b>partial_hard</b> (or <b>ph</b>) and
+<b>partial_soft</b> (or <b>ps</b>) set PCRE2_PARTIAL_HARD and PCRE2_PARTIAL_SOFT,
+respectively, when calling <b>pcre2_match()</b>. Here is a run of
+<b>pcre2test</b> using a pattern that matches the whole subject in the form of a
+date:
 <pre>
     re&#62; /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
-  data&#62; 25jun04\=ps
-   0: 25jun04
-   1: jun
-  data&#62; 25dec3\=ps
+  data&#62; 25dec3\=ph
   Partial match: 25dec3
-  data&#62; 3ju\=ps
+  data&#62; 3ju\=ph
   Partial match: 3ju
-  data&#62; 3juj\=ps
-  No match
-  data&#62; j\=ps
+  data&#62; 3juj\=ph
   No match
 </pre>
-The first data string is matched completely, so <b>pcre2test</b> shows the
-matched substrings. The remaining four strings do not match the complete
-pattern, but the first two are partial matches. Similar output is obtained
-if DFA matching is used.
-</P>
-<P>
-If the <b>partial_hard</b> (or <b>ph</b>) modifier is present on a
-<b>pcre2test</b> data line, the PCRE2_PARTIAL_HARD option is set for the match.
-</P>
-<br><a name="SEC6" href="#TOC1">MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()</a><br>
-<P>
-When a partial match has been found using a DFA matching function, it is
-possible to continue the match by providing additional subject data and calling
-the function again with the same compiled regular expression, this time setting
-the PCRE2_DFA_RESTART option. You must pass the same working space as before,
-because this is where details of the previous partial match are stored. Here is
-an example using <b>pcre2test</b>:
+This example gives the same results for both hard and soft partial matching 
+options. Here is an example where there is a difference:
 <pre>
     re&#62; /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
-  data&#62; 23ja\=dfa,ps
-  Partial match: 23ja
-  data&#62; n05\=dfa,dfa_restart
-   0: n05
+  data&#62; 25jun04\=ps
+   0: 25jun04
+   1: jun
+  data&#62; 25jun04\=ph
+  Partial match: 25jun04 
 </pre>
-The first call has "23ja" as the subject, and requests partial matching; the
-second call has "n05" as the subject for the continued (restarted) match.
-Notice that when the match is complete, only the last part is shown; PCRE2 does
-not retain the previously partially-matched string. It is up to the calling
-program to do that if it needs to.
-</P>
-<P>
-That means that, for an unanchored pattern, if a continued match fails, it is
-not possible to try again at a new starting point. All this facility is capable
-of doing is continuing with the previous match attempt. In the previous
-example, if the second set of data is "ug23" the result is no match, even
-though there would be a match for "aug23" if the entire string were given at
-once. Depending on the application, this may or may not be what you want.
-The only way to allow for starting again at the next character is to retain the
-matched part of the subject and try a new complete match.
+With PCRE2_PARTIAL_SOFT, the subject is matched completely. For
+PCRE2_PARTIAL_HARD, however, the subject is assumed not to be complete, so
+there is only a partial match.
 </P>
+<br><a name="SEC4" href="#TOC1">MULTI-SEGMENT MATCHING WITH pcre2_match()</a><br>
 <P>
-You can set the PCRE2_PARTIAL_SOFT or PCRE2_PARTIAL_HARD options with
-PCRE2_DFA_RESTART to continue partial matching over multiple segments. This
-facility can be used to pass very long subject strings to the DFA matching
-functions.
+PCRE was not originally designed with multi-segment matching in mind. However,
+over time, features (including partial matching) that make multi-segment
+matching possible have been added. The string is searched segment by segment by
+calling <b>pcre2_match()</b> repeatedly, with the aim of achieving the same 
+results that would be obtained if the entire string were available for searching.
 </P>
-<br><a name="SEC7" href="#TOC1">MULTI-SEGMENT MATCHING WITH pcre2_match()</a><br>
 <P>
-Unlike the DFA function, it is not possible to restart the previous match with
-a new segment of data when using <b>pcre2_match()</b>. Instead, new data must be
-added to the previous subject string, and the entire match re-run, starting
-from the point where the partial match occurred. Earlier data can be discarded.
+Special logic must be implemented to handle a matched substring that spans a
+segment boundary. PCRE2_PARTIAL_HARD should be used, because it returns a
+partial match at the end of a segment whenever there is the possibility of
+changing the match by adding more characters. The PCRE2_NOTBOL option should
+also be set for all but the first segment.
 </P>
 <P>
-It is best to use PCRE2_PARTIAL_HARD in this situation, because it does not
-treat the end of a segment as the end of the subject when matching \z, \Z,
-\b, \B, and $. Consider an unanchored pattern that matches dates:
+When a partial match occurs, the next segment must be added to the current 
+subject and the match re-run, using the <i>startoffset</i> argument of 
+<b>pcre2_match()</b> to begin at the point where the partial match started.
+Multi-segment matching is usually used to search for substrings in the middle
+of very long sequences, so the patterns are normally not anchored. For example:
 <pre>
     re&#62; /\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d/
-  data&#62; The date is 23ja\=ph
+  data&#62; ...the date is 23ja\=ph
   Partial match: 23ja
+  data&#62; ...the date is 23jan19 and on that day...\=offset=15
+   0: 23jan19
+   1: jan
 </pre>
-At this stage, an application could discard the text preceding "23ja", add on
-text from the next segment, and call the matching function again. Unlike the
-DFA matching function, the entire matching string must always be available,
-and the complete matching process occurs for each call, so more memory and more
-processing time is needed.
+Note the use of the <b>offset</b> modifier to start the new match where the 
+partial match was found.
 </P>
-<br><a name="SEC8" href="#TOC1">ISSUES WITH MULTI-SEGMENT MATCHING</a><br>
 <P>
-Certain types of pattern may give problems with multi-segment matching,
-whichever matching function is used.
+In this simple example, the next segment was just added to the one in which the 
+partial match was found. However, if there are memory constraints, it may be 
+necessary to discard text that precedes the partial match before adding the 
+next segment. In cases such as the above, where the pattern does not contain
+any lookbehinds, it is sufficient to retain only the partially matched
+substring. However, if a pattern contains a lookbehind assertion, characters
+that precede the start of the partial match may have been inspected during the
+matching process.
 </P>
 <P>
-1. If the pattern contains a test for the beginning of a line, you need to pass
-the PCRE2_NOTBOL option when the subject string for any call does start at the
-beginning of a line. There is also a PCRE2_NOTEOL option, but in practice when
-doing multi-segment matching you should be using PCRE2_PARTIAL_HARD, which
-includes the effect of PCRE2_NOTEOL.
-</P>
-<P>
-2. If a pattern contains a lookbehind assertion, characters that precede the
-start of the partial match may have been inspected during the matching process.
-When using <b>pcre2_match()</b>, sufficient characters must be retained for the
-next match attempt. You can ensure that enough characters are retained by doing
-the following:
+The only lookbehind information that is available is the length of the longest
+lookbehind in a pattern. This may not, of course, be at the start of the
+pattern, but retaining that many characters before the partial match is
+sufficient, if not always strictly necessary. The way to do this is as follows:
 </P>
 <P>
 Before doing any matching, find the length of the longest lookbehind in the
@@ -339,13 +296,8 @@ partial match, moving back from the ovector[0] offset in the subject by the
 number of characters given for the maximum lookbehind gets you to the earliest
 character that must be retained. In a non-UTF or a 32-bit situation, moving
 back is just a subtraction, but in UTF-8 or UTF-16 you have to count characters
-while moving back through the code units.
-</P>
-<P>
-Characters before the point you have now reached can be discarded, and after
-the next segment has been added to what is retained, you should run the next
-match with the <b>startoffset</b> argument set so that the match begins at the
-same point as before.
+while moving back through the code units. Characters before the point you have
+now reached can be discarded.
 </P>
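<P>
The backward step described above can be sketched in C for a UTF-8 subject.
This is a minimal sketch under stated assumptions, not PCRE2 code:
<b>retain_from()</b> is a hypothetical helper name, the subject is assumed to be
valid UTF-8, and <i>max_lookbehind</i> is assumed to be the value reported by
<b>pcre2_pattern_info()</b> for PCRE2_INFO_MAXLOOKBEHIND.
</P>

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of the PCRE2 API.
   Starting at match_start (the ovector[0] offset of a partial match), move
   back by at most max_lookbehind characters in a UTF-8 subject and return
   the code unit offset of the earliest character that must be retained. */
size_t retain_from(const unsigned char *subject, size_t match_start,
                   size_t max_lookbehind)
{
    assert(subject != NULL);
    size_t pos = match_start;

    while (max_lookbehind > 0 && pos > 0) {
        pos--;                          /* step back one code unit */
        /* UTF-8 continuation bytes have the form 10xxxxxx; keep moving
           back until the first code unit of the character is reached. */
        while (pos > 0 && (subject[pos] & 0xC0) == 0x80)
            pos--;
        max_lookbehind--;               /* one whole character consumed */
    }
    return pos;                         /* stops early at the subject start */
}
```

<P>
In a non-UTF or 32-bit situation the loop is unnecessary, because the result is
just the subtraction of the lookbehind count from the offset, with a lower
limit of zero.
</P>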
 <P>
 For example, if the pattern "(?&#60;=123)abc" is partially matched against the
@@ -353,62 +305,67 @@ string "xx123ab", the ovector offsets are 5 and 7 ("ab"). The maximum
 lookbehind count is 3, so all characters before offset 2 can be discarded. The
 value of <b>startoffset</b> for the next match should be 3. When <b>pcre2test</b>
 displays a partial match, it indicates the lookbehind characters with '&#60;'
-characters if the "allusedtext" modifier is set:
+characters if the <b>allusedtext</b> modifier is set:
 <pre>
     re&#62; "(?&#60;=123)abc"
   data&#62; xx123ab\=ph,allusedtext
   Partial match: 123ab
                  &#60;&#60;&#60;
 </pre>
-However, the "allusedtext" modifier is not available for JIT matching, because 
-JIT matching does not maintain the first and last consulted characters.
-</P>
-<P>
-3. Matching a subject string that is split into multiple segments may not
-always produce exactly the same result as matching over one single long string
-when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and Word
-Boundaries" above describes an issue that arises if the pattern ends with \b
-or \B. Another kind of difference may occur when there are multiple matching
-possibilities, because (for PCRE2_PARTIAL_SOFT) a partial match result is given
-only when there are no completed matches. This means that as soon as the
-shortest match has been found, continuation to a new subject segment is no
-longer possible. Consider this <b>pcre2test</b> example:
+Note that the <b>allusedtext</b> modifier is not available for JIT matching,
+because JIT matching does not maintain the first and last consulted characters.
+</P>
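<P>
One possible way of carrying the retained text into the next match attempt is
sketched below. The function <b>carry_over()</b> and all its parameters are
hypothetical and are not part of PCRE2. The caller is assumed to supply a
buffer that is large enough for the retained text plus the next segment, and
to have already worked out the offset of the earliest character to keep.
</P>

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not part of the PCRE2 API.
   buf holds *len code units of the current subject. Everything before
   keep_from is discarded, the next segment is appended, and *len is
   updated. The return value is the offset at which the partial match now
   starts, which is a candidate for the startoffset argument of the next
   call to pcre2_match(). */
size_t carry_over(char *buf, size_t *len, size_t cap, size_t keep_from,
                  size_t match_start, const char *next, size_t next_len)
{
    assert(keep_from <= match_start && match_start <= *len);

    size_t kept = *len - keep_from;     /* retained code units */
    assert(kept + next_len <= cap);     /* caller must provide the space */

    memmove(buf, buf + keep_from, kept);    /* regions may overlap */
    memcpy(buf + kept, next, next_len);     /* append the new segment */
    *len = kept + next_len;

    return match_start - keep_from;
}
```

<P>
For the "xx123ab" example above, <i>keep_from</i> is 2 and <i>match_start</i> is
5, so the retained text is "123ab" and the returned offset is 3, which agrees
with the value of <b>startoffset</b> given in the text.
</P>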
+<br><a name="SEC5" href="#TOC1">PARTIAL MATCHING USING pcre2_dfa_match()</a><br>
+<P>
+The DFA function moves along the subject string character by character, without
+backtracking, searching for all possible matches simultaneously. If the end of
+the subject is reached before the end of the pattern, there is the possibility
+of a partial match.
+</P>
+<P>
+When PCRE2_PARTIAL_SOFT is set, PCRE2_ERROR_PARTIAL is returned only if there
+have been no complete matches. Otherwise, the complete matches are returned.
+If PCRE2_PARTIAL_HARD is set, a partial match takes precedence over any
+complete matches. The portion of the string that was matched when the longest
+partial match was found is set as the first matching string.
+</P>
+<P>
+Because the DFA function always searches for all possible matches, and there is
+no difference between greedy and ungreedy repetition, its behaviour is
+different from that of <b>pcre2_match()</b>. Consider the string "dog" matched
+against this ungreedy pattern:
 <pre>
-    re&#62; /dog(sbody)?/
-  data&#62; dogsb\=ps
-   0: dog
-  data&#62; do\=ps,dfa
-  Partial match: do
-  data&#62; gsb\=ps,dfa,dfa_restart
-   0: g
-  data&#62; dogsbody\=dfa
-   0: dogsbody
-   1: dog
+  /dog(sbody)??/
 </pre>
-The first data line passes the string "dogsb" to a standard matching function,
-setting the PCRE2_PARTIAL_SOFT option. Although the string is a partial match
-for "dogsbody", the result is not PCRE2_ERROR_PARTIAL, because the shorter
-string "dog" is a complete match. Similarly, when the subject is presented to
-a DFA matching function in several parts ("do" and "gsb" being the first two)
-the match stops when "dog" has been found, and it is not possible to continue.
-On the other hand, if "dogsbody" is presented as a single string, a DFA
-matching function finds both matches.
-</P>
-<P>
-Because of these problems, it is best to use PCRE2_PARTIAL_HARD when matching
-multi-segment data. The example above then behaves differently:
+Whereas the standard function stops as soon as it finds the complete match for
+"dog", the DFA function also finds the partial match for "dogsbody", and so
+returns that when PCRE2_PARTIAL_HARD is set.
+</P>
+<br><a name="SEC6" href="#TOC1">MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()</a><br>
+<P>
+When a partial match has been found using the DFA matching function, it is
+possible to continue the match by providing additional subject data and calling
+the function again with the same compiled regular expression, this time setting
+the PCRE2_DFA_RESTART option. You must pass the same working space as before,
+because this is where details of the previous partial match are stored. You can
+set the PCRE2_PARTIAL_SOFT or PCRE2_PARTIAL_HARD options with PCRE2_DFA_RESTART
+to continue partial matching over multiple segments. Here is an example using
+<b>pcre2test</b>:
 <pre>
-    re&#62; /dog(sbody)?/
-  data&#62; dogsb\=ph
-  Partial match: dogsb
-  data&#62; do\=ps,dfa
-  Partial match: do
-  data&#62; gsb\=ph,dfa,dfa_restart
-  Partial match: gsb
+    re&#62; /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
+  data&#62; 23ja\=dfa,ps
+  Partial match: 23ja
+  data&#62; n05\=dfa,dfa_restart
+   0: n05
 </pre>
-4. Patterns that contain alternatives at the top level which do not all start
-with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
-used. For example, consider this pattern:
+The first call has "23ja" as the subject, and requests partial matching; the
+second call has "n05" as the subject for the continued (restarted) match.
+Notice that when the match is complete, only the last part is shown; PCRE2 does
+not retain the previously partially-matched string. It is up to the calling
+program to do that if it needs to. This means that, for an unanchored pattern,
+if a continued match fails, it is not possible to try again at a new starting
+point. All this facility is capable of doing is continuing with the previous
+match attempt. For example, consider this pattern:
 <pre>
   1234|3789
 </pre>
@@ -417,30 +374,18 @@ alternative is found at offset 3. There is no partial match for the second
 alternative, because such a match does not start at the same point in the
 subject string. Attempting to continue with the string "7890" does not yield a
 match because only those alternatives that match at one point in the subject
-are remembered. The problem arises because the start of the second alternative
-matches within the first alternative. There is no problem with anchored
-patterns or patterns such as:
-<pre>
-  1234|ABCD
-</pre>
-where no string can be a partial match for both alternatives. This is not a
-problem if a standard matching function is used, because the entire match has
-to be rerun each time:
-<pre>
-    re&#62; /1234|3789/
-  data&#62; ABC123\=ph
-  Partial match: 123
-  data&#62; 1237890
-   0: 3789
-</pre>
-Of course, instead of using PCRE2_DFA_RESTART, the same technique of re-running
-the entire match can also be used with the DFA matching function. Another
-possibility is to work with two buffers. If a partial match at offset <i>n</i>
-in the first buffer is followed by "no match" when PCRE2_DFA_RESTART is used on
-the second buffer, you can then try a new match starting at offset <i>n+1</i> in
-the first buffer.
+are remembered. Depending on the application, this may or may not be what you
+want.
+</P>
+<P>
+If you do want to allow for starting again at the next character, one way of
+doing it is to retain the matched part of the segment and try a new complete
+match, as described for <b>pcre2_match()</b> above. Another possibility is to
+work with two buffers. If a partial match at offset <i>n</i> in the first buffer
+is followed by "no match" when PCRE2_DFA_RESTART is used on the second buffer,
+you can then try a new match starting at offset <i>n+1</i> in the first buffer.
 </P>
-<br><a name="SEC9" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC7" href="#TOC1">AUTHOR</a><br>
 <P>
 Philip Hazel
 <br>
@@ -449,9 +394,9 @@ University Computing Service
 Cambridge, England.
 <br>
 </P>
-<br><a name="SEC10" href="#TOC1">REVISION</a><br>
+<br><a name="SEC8" href="#TOC1">REVISION</a><br>
 <P>
-Last updated: 22 July 2019
+Last updated: 07 August 2019
 <br>
 Copyright &copy; 1997-2019 University of Cambridge.
 <br>
diff --git a/doc/pcre2.txt b/doc/pcre2.txt
index e3eb3f4..a990396 100644
--- a/doc/pcre2.txt
+++ b/doc/pcre2.txt
@@ -5650,72 +5650,109 @@ NAME
 
 PARTIAL MATCHING IN PCRE2
 
-       In  normal  use  of  PCRE2,  if  the subject string that is passed to a
-       matching function matches as far as it goes, but is too short to  match
-       the  entire pattern, PCRE2_ERROR_NOMATCH is returned. There are circum-
-       stances where it might be helpful to distinguish this case  from  other
-       cases in which there is no match.
-
-       Consider, for example, an application where a human is required to type
-       in data for a field with specific formatting requirements.  An  example
-       might be a date in the form ddmmmyy, defined by this pattern:
-
-         ^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$
-
-       If the application sees the user's keystrokes one by one, and can check
-       that what has been typed so far is potentially valid,  it  is  able  to
-       raise  an  error  as  soon as a mistake is made, by beeping and not re-
-       flecting the character that has been typed, for example. This immediate
-       feedback  is  likely to be a better user interface than a check that is
-       delayed until the entire string has been entered. Partial matching  can
-       also  be  useful  when  the  subject string is very long and is not all
-       available at once, as discussed below.
-
-       PCRE2 supports partial matching by means of the PCRE2_PARTIAL_SOFT  and
-       PCRE2_PARTIAL_HARD  options,  which  can be set when calling a matching
-       function.  The difference between the two options is whether or  not  a
-       partial match is preferred to an alternative complete match, though the
-       details differ between the two types of matching function. If both  op-
-       tions are set, PCRE2_PARTIAL_HARD takes precedence.
-
-       If  you  want to use partial matching with just-in-time optimized code,
-       you must call pcre2_jit_compile() with one or both of these options:
+       In  normal use of PCRE2, if there is a match up to the end of a subject
+       string, but more characters are needed to  match  the  entire  pattern,
+       PCRE2_ERROR_NOMATCH  is  returned,  just  like any other failing match.
+       There are circumstances where it might be helpful to  distinguish  this
+       "partial match" case.
+
+       One  example  is  an application where the subject string is very long,
+       and not all available at once. The requirement here is to be able to do
+       the  matching  segment  by segment, but special action is needed when a
+       matched substring spans the boundary between two segments.
+
+       Another example is checking a user input string as it is typed, to  en-
+       sure  that  it conforms to a required format. Invalid characters can be
+       immediately diagnosed and rejected, giving instant feedback.
+
+       Partial matching is a PCRE2-specific feature; it is  not  Perl-compati-
+       ble.  It  is  requested  by  setting  one  of the PCRE2_PARTIAL_HARD or
+       PCRE2_PARTIAL_SOFT options when calling a matching function.  The  dif-
+       ference  between  the  two options is whether or not a partial match is
+       preferred to an alternative complete match, though the  details  differ
+       between  the  two  types of matching function. If both options are set,
+       PCRE2_PARTIAL_HARD takes precedence.
+
+       If you want to use partial matching with just-in-time  optimized  code,
+       as  well  as  setting a partial match option for the matching function,
+       you must also call pcre2_jit_compile() with one or both  of  these  op-
+       tions:
 
-         PCRE2_JIT_PARTIAL_SOFT
          PCRE2_JIT_PARTIAL_HARD
+         PCRE2_JIT_PARTIAL_SOFT
 
-       PCRE2_JIT_COMPLETE should also be set if you are going to run  non-par-
-       tial  matches  on the same pattern. If the appropriate JIT mode has not
-       been compiled, interpretive matching code is used.
+       PCRE2_JIT_COMPLETE  should also be set if you are going to run non-par-
+       tial matches on the same pattern. Separate code is  compiled  for  each
+       mode.  If  the appropriate JIT mode has not been compiled, interpretive
+       matching code is used.
 
        Setting a partial matching option disables two of PCRE2's standard  op-
-       timizations.  PCRE2  remembers the last literal code unit in a pattern,
-       and abandons matching immediately if it is not present in  the  subject
-       string.  This  optimization  cannot  be  used for a subject string that
-       might match only partially. PCRE2 also knows the minimum  length  of  a
-       matching  string,  and  does not bother to run the matching function on
-       shorter strings. This optimization is also disabled for partial  match-
-       ing.
+       timization  hints. PCRE2 remembers the last literal code unit in a pat-
+       tern, and abandons matching immediately if it is  not  present  in  the
+       subject  string.  This optimization cannot be used for a subject string
+       that might match only partially. PCRE2 also remembers a minimum  length
+       of  a matching string, and does not bother to run the matching function
+       on shorter strings. This optimization  is  also  disabled  for  partial
+       matching.
+
+
+REQUIREMENTS FOR A PARTIAL MATCH
+
+       A  possible  partial  match  occurs during matching when the end of the
+       subject string is reached successfully, but either more characters  are
+       needed  to complete the match, or the addition of more characters might
+       change what is matched.
+
+       Example 1: if the pattern is /abc/ and the subject is "ab", more  char-
+       acters  are  definitely  needed  to complete a match. In this case both
+       hard and soft matching options yield a partial match.
+
+       Example 2: if the pattern is /ab+/ and the subject is "ab", a  complete
+       match  can  be  found, but the addition of more characters might change
+       what is matched. In this case, only PCRE2_PARTIAL_HARD returns  a  par-
+       tial match; PCRE2_PARTIAL_SOFT returns the complete match.
+
+       On  reaching the end of the subject, when PCRE2_PARTIAL_HARD is set, if
+       the next pattern item is \z, \Z, \b, \B, or $ there is always a partial
+       match.   Otherwise, for both options, the next pattern item must be one
+       that inspects a character, and at least one of the  following  must  be
+       true:
+
+       (1)  At  least  one  character has already been inspected. An inspected
+       character need not form part of the final  matched  string;  lookbehind
+       assertions  and the \K escape sequence provide ways of inspecting char-
+       acters before the start of a matched string.
+
+       (2) The pattern contains one or more lookbehind assertions. This condi-
+       tion  exists in case there is a lookbehind that inspects characters be-
+       fore the start of the match.
+
+       (3) There is a special case when the whole pattern can match  an  empty
+       string.   When  the  starting  point  is at the end of the subject, the
+       empty string match is a possibility, and if PCRE2_PARTIAL_SOFT  is  set
+       and  neither  of the above conditions is true, it is returned. However,
+       because adding more characters  might  result  in  a  non-empty  match,
+       PCRE2_PARTIAL_HARD  returns  a  partial match, which in this case means
+       "there is going to be a match at this point, but until some more  char-
+       acters are added, we do not know if it will be an empty string or some-
+       thing longer".
 
 
 PARTIAL MATCHING USING pcre2_match()
 
-       A  partial  match occurs during a call to pcre2_match() when the end of
-       the subject string is reached successfully, but  matching  cannot  con-
-       tinue  because  more  characters are needed, and in addition, either at
-       least one character in the subject has been inspected  or  the  pattern
-       contains  a lookbehind, or (when PCRE2_PARTIAL_HARD is set) the pattern
-       could match an empty string. An inspected character need not form  part
-       of  the  final  matched string; lookbehind assertions and the \K escape
-       sequence provide ways of inspecting characters before the  start  of  a
-       matched string.
-
-       The  three  additional  requirements define the cases where adding more
-       characters to the existing subject may complete  the  same  match  that
-       would  occur  if  they had all been present in the first place. Without
-       these conditions there would be a partial match of an empty  string  at
-       the  end  of  the subject for all unanchored patterns (and also for an-
-       chored patterns if the subject itself is empty).
+       When  a  partial  matching  option  is  set,  the  result  of   calling
+       pcre2_match() can be one of the following:
+
+       A successful match
+         A complete match has been found, starting and ending within this sub-
+         ject.
+
+       PCRE2_ERROR_NOMATCH
+         No match can start anywhere in this subject.
+
+       PCRE2_ERROR_PARTIAL
+         Adding more characters may result in a complete match that  uses  one
+         or more characters from the end of this subject.
 
        When a partial match is returned, the first two elements in the ovector
        point to the portion of the subject that was matched, but the values in
@@ -5725,29 +5762,12 @@ PARTIAL MATCHING USING pcre2_match()
          /abc\K123/
 
        If it is matched against "456abc123xyz" the result is a complete match,
-       and the ovector defines the matched string as "123", because \K  resets
-       the  "start  of  match" point. However, if a partial match is requested
-       and the subject string is "456abc12", a partial match is found for  the
-       string  "abc12",  because  all these characters are needed for a subse-
+       and  the ovector defines the matched string as "123", because \K resets
+       the "start of match" point. However, if a partial  match  is  requested
+       and  the subject string is "456abc12", a partial match is found for the
+       string "abc12", because all these characters are needed  for  a  subse-
        quent re-match with additional characters.
 
-       What happens when a partial match is identified depends on which of the
-       two partial matching options is set.
-
-   PCRE2_PARTIAL_SOFT WITH pcre2_match()
-
-       If  PCRE2_PARTIAL_SOFT  is  set when pcre2_match() identifies a partial
-       match, the partial match is remembered, but matching continues as  nor-
-       mal,  and  other  alternatives in the pattern are tried. If no complete
-       match  can  be  found,  PCRE2_ERROR_PARTIAL  is  returned  instead   of
-       PCRE2_ERROR_NOMATCH.
-
-       This  option  is "soft" because it prefers a complete match over a par-
-       tial match.  All the various matching items in a pattern behave  as  if
-       the  subject string is potentially complete. For example, \z, \Z, and $
-       match at the end of the subject, as normal, and for \b and \B  the  end
-       of the subject is treated as a non-alphanumeric.
-
        If  there  is more than one partial match, the first one that was found
        provides the data that is returned. Consider this pattern:
 
@@ -5756,23 +5776,31 @@ PARTIAL MATCHING USING pcre2_match()
        If this is matched against the subject string "abc123dog", both  alter-
        natives  fail  to  match,  but the end of the subject is reached during
        matching, so PCRE2_ERROR_PARTIAL is returned. The offsets are set to  3
-       and  9, identifying "123dog" as the first partial match that was found.
-       (In this example, there are two partial matches, because "dog"  on  its
-       own partially matches the second alternative.)
-
-   PCRE2_PARTIAL_HARD WITH pcre2_match()
-
-       If  PCRE2_PARTIAL_HARD is set for pcre2_match(), PCRE2_ERROR_PARTIAL is
-       returned as soon as a partial match is  found,  without  continuing  to
-       search  for possible complete matches. This option is "hard" because it
-       prefers an earlier partial match over a later complete match. For  this
-       reason,  the  assumption  is  made that the end of the supplied subject
-       string may not be the true end of the available data, and  so,  if  \z,
-       \Z,  \b, \B, or $ are encountered at the end of the subject, the result
-       is PCRE2_ERROR_PARTIAL, whether or not any  characters  have  been  in-
-       spected.
+       and  9, identifying "123dog" as the first partial match. (In this exam-
+       ple, there are two partial matches, because "dog" on its own  partially
+       matches the second alternative.)
 
-   Comparing hard and soft partial matching
+   How a partial match is processed by pcre2_match()
+
+       What happens when a partial match is identified depends on which of the
+       two partial matching options is set.
+
+       If PCRE2_PARTIAL_HARD is set, PCRE2_ERROR_PARTIAL is returned  as  soon
+       as  a partial match is found, without continuing to search for possible
+       complete matches. This option is "hard" because it prefers  an  earlier
+       partial match over a later complete match. For this reason, the assump-
+       tion is made that the end of the supplied subject  string  is  not  the
+       true  end of the available data, which is why \z, \Z, \b, \B, and $ al-
+       ways give a partial match.
+
+       If PCRE2_PARTIAL_SOFT is set, the  partial  match  is  remembered,  but
+       matching continues as normal, and other alternatives in the pattern are
+       tried. If no complete match can be found,  PCRE2_ERROR_PARTIAL  is  re-
+       turned instead of PCRE2_ERROR_NOMATCH. This option is "soft" because it
+       prefers a complete match over a partial match. All the various matching
+       items  in a pattern behave as if the subject string is potentially com-
+       plete; \z, \Z, and $ match at the end of the subject,  as  normal,  and
+       for \b and \B the end of the subject is treated as a non-alphanumeric.
 
        The  difference  between the two partial matching options can be illus-
        trated by a pattern such as:
@@ -5799,89 +5827,147 @@ PARTIAL MATCHING USING pcre2_match()
        The second pattern will never match "dogsbody", because it will  always
        find the shorter match first.
 
+   Example of partial matching using pcre2test
 
-PARTIAL MATCHING USING pcre2_dfa_match()
+       The  pcre2test data modifiers partial_hard (or ph) and partial_soft (or
+       ps) set PCRE2_PARTIAL_HARD and PCRE2_PARTIAL_SOFT,  respectively,  when
+       calling  pcre2_match(). Here is a run of pcre2test using a pattern that
+       matches the whole subject in the form of a date:
 
-       The DFA functions move along the subject string character by character,
-       without backtracking, searching for  all  possible  matches  simultane-
-       ously.  If the end of the subject is reached before the end of the pat-
-       tern, there is the possibility of a partial match, again provided  that
-       at least one character has been inspected.
+           re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
+         data> 25dec3\=ph
+         Partial match: 25dec3
+         data> 3ju\=ph
+         Partial match: 3ju
+         data> 3juj\=ph
+         No match
 
-       When PCRE2_PARTIAL_SOFT is set, PCRE2_ERROR_PARTIAL is returned only if
-       there have been no complete matches. Otherwise,  the  complete  matches
-       are  returned.   However, if PCRE2_PARTIAL_HARD is set, a partial match
-       takes precedence over any complete matches. The portion of  the  string
-       that was matched when the longest partial match was found is set as the
-       first matching string.
+       This example gives the same results for  both  hard  and  soft  partial
+       matching options. Here is an example where there is a difference:
+
+           re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
+         data> 25jun04\=ps
+          0: 25jun04
+          1: jun
+         data> 25jun04\=ph
+         Partial match: 25jun04
 
-       Because the DFA functions always search for all possible  matches,  and
-       there  is  no  difference between greedy and ungreedy repetition, their
-       behaviour is different from  the  standard  functions  when  PCRE2_PAR-
-       TIAL_HARD  is  set.  Consider  the string "dog" matched against the un-
-       greedy pattern shown above:
+       With   PCRE2_PARTIAL_SOFT,  the  subject  is  matched  completely.  For
+       PCRE2_PARTIAL_HARD, however, the subject is assumed not to be complete,
+       so there is only a partial match.
 
-         /dog(sbody)??/
 
-       Whereas the standard function stops as soon as it  finds  the  complete
-       match  for  "dog",  the  DFA  function also finds the partial match for
-       "dogsbody", and so returns that when PCRE2_PARTIAL_HARD is set.
+MULTI-SEGMENT MATCHING WITH pcre2_match()
 
+       PCRE  was  not originally designed with multi-segment matching in mind.
+       However, over time, features (including  partial  matching)  that  make
+       multi-segment matching possible have been added. The string is searched
+       segment by segment by calling pcre2_match() repeatedly, with the aim of
+       achieving  the same results that would happen if the entire string were
+       available for searching.
+
+       Special logic must be implemented to handle a  matched  substring  that
+       spans a segment boundary. PCRE2_PARTIAL_HARD should be used, because it
+       returns a partial match at the end of a segment whenever there  is  the
+       possibility  of  changing  the  match  by  adding  more characters. The
+       PCRE2_NOTBOL option should also be set for all but the first segment.
+
+       When a partial match occurs, the next segment must be added to the cur-
+       rent  subject  and  the match re-run, using the startoffset argument of
+       pcre2_match() to begin at the point where the  partial  match  started.
+       Multi-segment  matching is usually used to search for substrings in the
+       middle of very long sequences, so the patterns  are  normally  not  an-
+       chored. For example:
+
+           re> /\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d/
+         data> ...the date is 23ja\=ph
+         Partial match: 23ja
+         data> ...the date is 23jan19 and on that day...\=offset=15
+          0: 23jan19
+          1: jan
+
+       Note  the  use  of the offset modifier to start the new match where the
+       partial match was found.
+
+       In this simple example, the next segment was just added to the  one  in
+       which  the  partial  match was found. However, if there are memory con-
+       straints, it may be necessary to discard text that precedes the partial
+       match before adding the next segment. In cases such as the above, where
+       the pattern does not contain any lookbehinds, it is sufficient  to  re-
+       tain  only  the partially matched substring. However, if a pattern con-
+       tains a lookbehind assertion, characters that precede the start of  the
+       partial match may have been inspected during the matching process.
+
+       The  only lookbehind information that is available is the length of the
+       longest lookbehind in a pattern. This may not, of  course,  be  at  the
+       start  of  the  pattern,  but retaining that many characters before the
+       partial match is sufficient, if not always strictly necessary. The  way
+       to do this is as follows:
 
-PARTIAL MATCHING AND WORD BOUNDARIES
+       Before doing any matching, find the length of the longest lookbehind in
+       the    pattern    by    calling    pcre2_pattern_info()    with     the
+       PCRE2_INFO_MAXLOOKBEHIND  option.  Note  that the resulting count is in
+       characters, not code units. After a partial match, moving back from the
+       ovector[0]  offset in the subject by the number of characters given for
+       the maximum lookbehind gets you to the earliest character that must  be
+       retained.  In  a  non-UTF  or a 32-bit situation, moving back is just a
+       subtraction, but in UTF-8 or UTF-16 you have to count characters  while
+       moving  back  through  the  code units. Characters before the point you
+       have now reached can be discarded.
+
+       For example, if the pattern "(?<=123)abc" is partially matched  against
+       the string "xx123ab", the ovector offsets are 5 and 7 ("ab"). The maxi-
+       mum lookbehind count is 3, so all characters before  offset  2  can  be
+       discarded.  The  value  of  startoffset for the next match should be 3.
+       When pcre2test displays a partial match, it  indicates  the  lookbehind
+       characters with '<' characters if the allusedtext modifier is set:
 
-       If a pattern ends with one of sequences \b or \B, which test  for  word
-       boundaries,  partial matching with PCRE2_PARTIAL_SOFT can give counter-
-       intuitive results. Consider this pattern:
+           re> "(?<=123)abc"
+         data> xx123ab\=ph,allusedtext
+         Partial match: 123ab
+                        <<<
 
-         /\bcat\b/
+       Note  that  the allusedtext modifier is not available for JIT matching,
+       because JIT matching does not maintain the  first  and  last  consulted
+       characters.
 
-       This matches "cat", provided there is a word boundary at either end. If
-       the subject string is "the cat", the comparison of the final "t" with a
-       following character cannot take place, so a  partial  match  is  found.
-       However,  normal  matching carries on, and \b matches at the end of the
-       subject when the last character is a letter, so  a  complete  match  is
-       found.   The  result,  therefore,  is  not  PCRE2_ERROR_PARTIAL.  Using
-       PCRE2_PARTIAL_HARD in this case does yield PCRE2_ERROR_PARTIAL, because
-       then the partial match takes precedence.
 
+PARTIAL MATCHING USING pcre2_dfa_match()
 
-EXAMPLE OF PARTIAL MATCHING USING PCRE2TEST
+       The DFA function moves along the subject string character by character,
+       without backtracking, searching for  all  possible  matches  simultane-
+       ously.  If the end of the subject is reached before the end of the pat-
+       tern, there is the possibility of a partial match.
 
-       If  the  partial_soft  (or  ps) modifier is present on a pcre2test data
-       line, the PCRE2_PARTIAL_SOFT option is used for the match.  Here  is  a
-       run of pcre2test that uses the date example quoted above:
+       When PCRE2_PARTIAL_SOFT is set, PCRE2_ERROR_PARTIAL is returned only if
+       there  have  been  no complete matches. Otherwise, the complete matches
+       are returned.  If PCRE2_PARTIAL_HARD is  set,  a  partial  match  takes
+       precedence  over  any  complete matches. The portion of the string that
+       was matched when the longest partial match was  found  is  set  as  the
+       first matching string.
 
-           re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
-         data> 25jun04\=ps
-          0: 25jun04
-          1: jun
-         data> 25dec3\=ps
-         Partial match: 23dec3
-         data> 3ju\=ps
-         Partial match: 3ju
-         data> 3juj\=ps
-         No match
-         data> j\=ps
-         No match
+       Because  the DFA function always searches for all possible matches, and
+       there is no difference between greedy and ungreedy repetition, its
+       behaviour differs from that of pcre2_match(). Consider the string "dog"
+       matched against this ungreedy pattern:
 
-       The  first  data  string  is matched completely, so pcre2test shows the
-       matched substrings. The remaining four strings do not  match  the  com-
-       plete pattern, but the first two are partial matches. Similar output is
-       obtained if DFA matching is used.
+         /dog(sbody)??/
 
-       If the partial_hard (or ph) modifier is present  on  a  pcre2test  data
-       line, the PCRE2_PARTIAL_HARD option is set for the match.
+       Whereas the standard function stops as soon as it  finds  the  complete
+       match  for  "dog",  the  DFA  function also finds the partial match for
+       "dogsbody", and so returns that when PCRE2_PARTIAL_HARD is set.
 
 
 MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()
 
-       When  a  partial match has been found using a DFA matching function, it
-       is possible to continue the match by providing additional subject  data
-       and  calling  the function again with the same compiled regular expres-
+       When a partial match has been found using the DFA matching function, it
+       is  possible to continue the match by providing additional subject data
+       and calling the function again with the same compiled  regular  expres-
        sion, this time setting the PCRE2_DFA_RESTART option. You must pass the
        same working space as before, because this is where details of the pre-
-       vious partial match are stored. Here is an example using pcre2test:
+       vious  partial  match are stored. You can set the PCRE2_PARTIAL_SOFT or
+       PCRE2_PARTIAL_HARD options with PCRE2_DFA_RESTART to  continue  partial
+       matching over multiple segments. Here is an example using pcre2test:
 
            re> /^\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d$/
          data> 23ja\=dfa,ps
@@ -5889,146 +5975,15 @@ MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()
          data> n05\=dfa,dfa_restart
           0: n05
 
-       The first call has "23ja" as the subject, and requests  partial  match-
-       ing;  the  second  call  has  "n05"  as  the  subject for the continued
-       (restarted) match.  Notice that when the match is  complete,  only  the
-       last  part  is  shown;  PCRE2 does not retain the previously partially-
-       matched string. It is up to the calling program to do that if it  needs
-       to.
-
-       That means that, for an unanchored pattern, if a continued match fails,
-       it is not possible to try again at a new starting point. All  this  fa-
-       cility  is  capable  of doing is continuing with the previous match at-
-       tempt. In the previous example, if the second set of data is "ug23" the
-       result  is  no match, even though there would be a match for "aug23" if
-       the entire string were given at once.  Depending  on  the  application,
-       this may or may not be what you want.  The only way to allow for start-
-       ing again at the next character is to retain the matched  part  of  the
-       subject and try a new complete match.
-
-       You  can  set the PCRE2_PARTIAL_SOFT or PCRE2_PARTIAL_HARD options with
-       PCRE2_DFA_RESTART to continue partial matching over multiple  segments.
-       This  facility can be used to pass very long subject strings to the DFA
-       matching functions.
-
-
-MULTI-SEGMENT MATCHING WITH pcre2_match()
-
-       Unlike the DFA function, it is not possible  to  restart  the  previous
-       match with a new segment of data when using pcre2_match(). Instead, new
-       data must be added to the previous subject string, and the entire match
-       re-run,  starting from the point where the partial match occurred. Ear-
-       lier data can be discarded.
-
-       It is best to use PCRE2_PARTIAL_HARD in this situation, because it does
-       not  treat the end of a segment as the end of the subject when matching
-       \z, \Z, \b, \B, and $. Consider  an  unanchored  pattern  that  matches
-       dates:
-
-           re> /\d?\d(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\d\d/
-         data> The date is 23ja\=ph
-         Partial match: 23ja
-
-       At  this stage, an application could discard the text preceding "23ja",
-       add on text from the next  segment,  and  call  the  matching  function
-       again.  Unlike  the  DFA  matching function, the entire matching string
-       must always be available, and the complete matching process occurs  for
-       each call, so more memory and more processing time is needed.
-
-
-ISSUES WITH MULTI-SEGMENT MATCHING
-
-       Certain types of pattern may give problems with multi-segment matching,
-       whichever matching function is used.
-
-       1. If the pattern contains a test for the beginning of a line, you need
-       to  pass  the  PCRE2_NOTBOL option when the subject string for any call
-       does start at the beginning of a line. There is also a PCRE2_NOTEOL op-
-       tion,  but  in practice when doing multi-segment matching you should be
-       using PCRE2_PARTIAL_HARD, which includes the effect of PCRE2_NOTEOL.
-
-       2. If a pattern contains a lookbehind assertion, characters  that  pre-
-       cede  the start of the partial match may have been inspected during the
-       matching process.  When using pcre2_match(), sufficient characters must
-       be  retained  for  the  next  match attempt. You can ensure that enough
-       characters are retained by doing the following:
-
-       Before doing any matching, find the length of the longest lookbehind in
-       the     pattern    by    calling    pcre2_pattern_info()    with    the
-       PCRE2_INFO_MAXLOOKBEHIND option. Note that the resulting  count  is  in
-       characters, not code units. After a partial match, moving back from the
-       ovector[0] offset in the subject by the number of characters given  for
-       the  maximum lookbehind gets you to the earliest character that must be
-       retained. In a non-UTF or a 32-bit situation, moving  back  is  just  a
-       subtraction,  but in UTF-8 or UTF-16 you have to count characters while
-       moving back through the code units.
-
-       Characters before the point you have now reached can be discarded,  and
-       after  the  next segment has been added to what is retained, you should
-       run the next match with the startoffset argument set so that the  match
-       begins at the same point as before.
-
-       For  example, if the pattern "(?<=123)abc" is partially matched against
-       the string "xx123ab", the ovector offsets are 5 and 7 ("ab"). The maxi-
-       mum  lookbehind  count  is  3, so all characters before offset 2 can be
-       discarded. The value of startoffset for the next  match  should  be  3.
-       When  pcre2test  displays  a partial match, it indicates the lookbehind
-       characters with '<' characters if the "allusedtext" modifier is set:
-
-           re> "(?<=123)abc"
-         data> xx123ab\=ph,allusedtext
-         Partial match: 123ab
-                        <<< However, the "allusedtext" modifier is not  avail-
-       able for JIT matching, because JIT matching does not maintain the first
-       and last consulted characters.
-
-       3. Matching a subject string that is split into multiple  segments  may
-       not  always produce exactly the same result as matching over one single
-       long string when  PCRE2_PARTIAL_SOFT  is  used.  The  section  "Partial
-       Matching  and  Word Boundaries" above describes an issue that arises if
-       the pattern ends with \b or \B. Another kind of  difference  may  occur
-       when there are multiple matching possibilities, because (for PCRE2_PAR-
-       TIAL_SOFT) a partial match result is given only when there are no  com-
-       pleted  matches. This means that as soon as the shortest match has been
-       found, continuation to a new subject segment  is  no  longer  possible.
-       Consider this pcre2test example:
-
-           re> /dog(sbody)?/
-         data> dogsb\=ps
-          0: dog
-         data> do\=ps,dfa
-         Partial match: do
-         data> gsb\=ps,dfa,dfa_restart
-          0: g
-         data> dogsbody\=dfa
-          0: dogsbody
-          1: dog
-
-       The  first  data  line passes the string "dogsb" to a standard matching
-       function, setting the PCRE2_PARTIAL_SOFT option. Although the string is
-       a  partial match for "dogsbody", the result is not PCRE2_ERROR_PARTIAL,
-       because the shorter string "dog" is a complete match.  Similarly,  when
-       the  subject  is  presented to a DFA matching function in several parts
-       ("do" and "gsb" being the first two) the match  stops  when  "dog"  has
-       been  found, and it is not possible to continue.  On the other hand, if
-       "dogsbody" is presented as a single string,  a  DFA  matching  function
-       finds both matches.
-
-       Because  of  these  problems, it is best to use PCRE2_PARTIAL_HARD when
-       matching multi-segment data. The example  above  then  behaves  differ-
-       ently:
-
-           re> /dog(sbody)?/
-         data> dogsb\=ph
-         Partial match: dogsb
-         data> do\=ps,dfa
-         Partial match: do
-         data> gsb\=ph,dfa,dfa_restart
-         Partial match: gsb
-
-       4. Patterns that contain alternatives at the top level which do not all
-       start with the  same  pattern  item  may  not  work  as  expected  when
-       PCRE2_DFA_RESTART is used. For example, consider this pattern:
+       The  first  call has "23ja" as the subject, and requests partial match-
+       ing; the second call  has  "n05"  as  the  subject  for  the  continued
+       (restarted)  match.   Notice  that when the match is complete, only the
+       last part is shown; PCRE2 does not  retain  the  previously  partially-
+       matched  string. It is up to the calling program to do that if it needs
+       to. This means that, for an unanchored pattern, if  a  continued  match
+       fails,  it  is  not  possible to try again at a new starting point. All
+       this facility is capable of doing is continuing with the previous match
+       attempt. For example, consider this pattern:
 
          1234|3789
 
@@ -6037,29 +5992,16 @@ ISSUES WITH MULTI-SEGMENT MATCHING
        the second alternative, because such a match does not start at the same
        point in the subject string. Attempting to  continue  with  the  string
        "7890"  does  not  yield  a  match because only those alternatives that
-       match at one point in the subject are remembered.  The  problem  arises
-       because  the  start  of the second alternative matches within the first
-       alternative. There is no problem with  anchored  patterns  or  patterns
-       such as:
-
-         1234|ABCD
-
-       where  no  string can be a partial match for both alternatives. This is
-       not a problem if a standard matching function is used, because the  en-
-       tire match has to be rerun each time:
-
-           re> /1234|3789/
-         data> ABC123\=ph
-         Partial match: 123
-         data> 1237890
-          0: 3789
+       match at one point in the subject are remembered. Depending on the  ap-
+       plication, this may or may not be what you want.
 
-       Of  course,  instead  of using PCRE2_DFA_RESTART, the same technique of
-       re-running the entire match can also be  used  with  the  DFA  matching
-       function. Another possibility is to work with two buffers. If a partial
-       match at offset n in the first buffer is followed by  "no  match"  when
-       PCRE2_DFA_RESTART  is used on the second buffer, you can then try a new
-       match starting at offset n+1 in the first buffer.
+       If  you  do want to allow for starting again at the next character, one
+       way of doing it is to retain the matched part of the segment and try  a
+       new  complete match, as described for pcre2_match() above. Another pos-
+       sibility is to work with two buffers. If a partial match at offset n in
+       the  first  buffer  is followed by "no match" when PCRE2_DFA_RESTART is
+       used on the second buffer, you can then try a  new  match  starting  at
+       offset n+1 in the first buffer.
 
 
 AUTHOR
@@ -6071,7 +6013,7 @@ AUTHOR
 
 REVISION
 
-       Last updated: 22 July 2019
+       Last updated: 07 August 2019
        Copyright (c) 1997-2019 University of Cambridge.
 ------------------------------------------------------------------------------
  
diff --git a/doc/pcre2partial.3 b/doc/pcre2partial.3
index adb7814..92d5038 100644
--- a/doc/pcre2partial.3
+++ b/doc/pcre2partial.3
@@ -1,73 +1,107 @@
-.TH PCRE2PARTIAL 3 "22 July 2019" "PCRE2 10.34"
+.TH PCRE2PARTIAL 3 "07 August 2019" "PCRE2 10.34"
 .SH NAME
 PCRE2 - Perl-compatible regular expressions
 .SH "PARTIAL MATCHING IN PCRE2"
 .rs
 .sp
-In normal use of PCRE2, if the subject string that is passed to a matching
-function matches as far as it goes, but is too short to match the entire
-pattern, PCRE2_ERROR_NOMATCH is returned. There are circumstances where it
-might be helpful to distinguish this case from other cases in which there is no
-match.
+In normal use of PCRE2, if there is a match up to the end of a subject string,
+but more characters are needed to match the entire pattern, PCRE2_ERROR_NOMATCH
+is returned, just like any other failing match. There are circumstances where
+it might be helpful to distinguish this "partial match" case.
 .P
-Consider, for example, an application where a human is required to type in data
-for a field with specific formatting requirements. An example might be a date
-in the form \fIddmmmyy\fP, defined by this pattern:
-.sp
-  ^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$
-.sp
-If the application sees the user's keystrokes one by one, and can check that
-what has been typed so far is potentially valid, it is able to raise an error
-as soon as a mistake is made, by beeping and not reflecting the character that
-has been typed, for example. This immediate feedback is likely to be a better
-user interface than a check that is delayed until the entire string has been
-entered. Partial matching can also be useful when the subject string is very
-long and is not all available at once, as discussed below.
+One example is an application where the subject string is very long, and not
+all available at once. The requirement here is to be able to do the matching
+segment by segment, but special action is needed when a matched substring spans
+the boundary between two segments.
 .P
-PCRE2 supports partial matching by means of the PCRE2_PARTIAL_SOFT and
-PCRE2_PARTIAL_HARD options, which can be set when calling a matching function.
-The difference between the two options is whether or not a partial match is
-preferred to an alternative complete match, though the details differ between
-the two types of matching function. If both options are set, PCRE2_PARTIAL_HARD
-takes precedence.
+Another example is checking a user input string as it is typed, to ensure that
+it conforms to a required format. Invalid characters can be immediately
+diagnosed and rejected, giving instant feedback.
 .P
-If you want to use partial matching with just-in-time optimized code, you must
-call \fBpcre2_jit_compile()\fP with one or both of these options:
+Partial matching is a PCRE2-specific feature; it is not Perl-compatible. It is
+requested by setting one of the PCRE2_PARTIAL_HARD or PCRE2_PARTIAL_SOFT
+options when calling a matching function. The difference between the two
+options is whether or not a partial match is preferred to an alternative
+complete match, though the details differ between the two types of matching
+function. If both options are set, PCRE2_PARTIAL_HARD takes precedence.
+.P
+If you want to use partial matching with just-in-time optimized code, as well 
+as setting a partial match option for the matching function, you must also call
+\fBpcre2_jit_compile()\fP with one or both of these options:
 .sp
-  PCRE2_JIT_PARTIAL_SOFT
   PCRE2_JIT_PARTIAL_HARD
+  PCRE2_JIT_PARTIAL_SOFT
 .sp
 PCRE2_JIT_COMPLETE should also be set if you are going to run non-partial
-matches on the same pattern. If the appropriate JIT mode has not been compiled,
-interpretive matching code is used.
+matches on the same pattern. Separate code is compiled for each mode. If the
+appropriate JIT mode has not been compiled, interpretive matching code is used.
 .P
 Setting a partial matching option disables two of PCRE2's standard
-optimizations. PCRE2 remembers the last literal code unit in a pattern, and
-abandons matching immediately if it is not present in the subject string. This
-optimization cannot be used for a subject string that might match only
-partially. PCRE2 also knows the minimum length of a matching string, and does
+optimization hints. PCRE2 remembers the last literal code unit in a pattern,
+and abandons matching immediately if it is not present in the subject string.
+This optimization cannot be used for a subject string that might match only
+partially. PCRE2 also remembers a minimum length of a matching string, and does
 not bother to run the matching function on shorter strings. This optimization
 is also disabled for partial matching.
 .
 .
-.SH "PARTIAL MATCHING USING pcre2_match()"
+.SH "REQUIREMENTS FOR A PARTIAL MATCH"
 .rs
 .sp
-A partial match occurs during a call to \fBpcre2_match()\fP when the end of the
-subject string is reached successfully, but matching cannot continue because
-more characters are needed, and in addition, either at least one character in
-the subject has been inspected or the pattern contains a lookbehind, or (when 
-PCRE2_PARTIAL_HARD is set) the pattern could match an empty string. An
-inspected character need not form part of the final matched string; lookbehind
-assertions and the \eK escape sequence provide ways of inspecting characters
-before the start of a matched string.
+A possible partial match occurs during matching when the end of the subject
+string is reached successfully, but either more characters are needed to
+complete the match, or the addition of more characters might change what is
+matched.
+.P
+Example 1: if the pattern is /abc/ and the subject is "ab", more characters are
+definitely needed to complete a match. In this case both hard and soft matching
+options yield a partial match.
+.P
+Example 2: if the pattern is /ab+/ and the subject is "ab", a complete match
+can be found, but the addition of more characters might change what is
+matched. In this case, only PCRE2_PARTIAL_HARD returns a partial match;
+PCRE2_PARTIAL_SOFT returns the complete match.
+.P
+On reaching the end of the subject, when PCRE2_PARTIAL_HARD is set, if the next
+pattern item is \ez, \eZ, \eb, \eB, or $ there is always a partial match.
+Otherwise, for both options, the next pattern item must be one that inspects a
+character, and at least one of the following must be true:
+.P
+(1) At least one character has already been inspected. An inspected character
+need not form part of the final matched string; lookbehind assertions and the
+\eK escape sequence provide ways of inspecting characters before the start of a
+matched string.
 .P
-The three additional requirements define the cases where adding more characters
-to the existing subject may complete the same match that would occur if they
-had all been present in the first place. Without these conditions there would
-be a partial match of an empty string at the end of the subject for all
-unanchored patterns (and also for anchored patterns if the subject itself is
-empty).
+(2) The pattern contains one or more lookbehind assertions. This condition
+exists in case there is a lookbehind that inspects characters before the start 
+of the match.
+.P
+(3) There is a special case when the whole pattern can match an empty string.
+When the starting point is at the end of the subject, the empty string match is
+a possibility, and if PCRE2_PARTIAL_SOFT is set and neither of the above
+conditions is true, it is returned. However, because adding more characters
+might result in a non-empty match, PCRE2_PARTIAL_HARD returns a partial match,
+which in this case means "there is going to be a match at this point, but until
+some more characters are added, we do not know if it will be an empty string or
+something longer".
+.
+.
+.
+.SH "PARTIAL MATCHING USING pcre2_match()"
+.rs
+.sp
+When a partial matching option is set, the result of calling
+\fBpcre2_match()\fP can be one of the following:
+.TP 2
+\fBA successful match\fP
+A complete match has been found, starting and ending within this subject.
+.TP
+\fBPCRE2_ERROR_NOMATCH\fP
+No match can start anywhere in this subject.
+.TP
+\fBPCRE2_ERROR_PARTIAL\fP
+Adding more characters may result in a complete match that uses one or more
+characters from the end of this subject.
 .P
 When a partial match is returned, the first two elements in the ovector point
 to the portion of the subject that was matched, but the values in the rest of
@@ -83,24 +117,6 @@ is "456abc12", a partial match is found for the string "abc12", because all
 these characters are needed for a subsequent re-match with additional
 characters.
 .P
-What happens when a partial match is identified depends on which of the two
-partial matching options is set.
-.
-.
-.SS "PCRE2_PARTIAL_SOFT WITH pcre2_match()"
-.rs
-.sp
-If PCRE2_PARTIAL_SOFT is set when \fBpcre2_match()\fP identifies a partial
-match, the partial match is remembered, but matching continues as normal, and
-other alternatives in the pattern are tried. If no complete match can be found,
-PCRE2_ERROR_PARTIAL is returned instead of PCRE2_ERROR_NOMATCH.
-.P
-This option is "soft" because it prefers a complete match over a partial match.
-All the various matching items in a pattern behave as if the subject string is
-potentially complete. For example, \ez, \eZ, and $ match at the end of the
-subject, as normal, and for \eb and \eB the end of the subject is treated as a
-non-alphanumeric.
-.P
 If there is more than one partial match, the first one that was found provides
 the data that is returned. Consider this pattern:
 .sp
@@ -109,27 +125,32 @@ the data that is returned. Consider this pattern:
 If this is matched against the subject string "abc123dog", both alternatives
 fail to match, but the end of the subject is reached during matching, so
 PCRE2_ERROR_PARTIAL is returned. The offsets are set to 3 and 9, identifying
-"123dog" as the first partial match that was found. (In this example, there are
-two partial matches, because "dog" on its own partially matches the second
-alternative.)
+"123dog" as the first partial match. (In this example, there are two partial
+matches, because "dog" on its own partially matches the second alternative.)
 .
 .
-.SS "PCRE2_PARTIAL_HARD WITH pcre2_match()"
-.rs
-.sp
-If PCRE2_PARTIAL_HARD is set for \fBpcre2_match()\fP, PCRE2_ERROR_PARTIAL is
-returned as soon as a partial match is found, without continuing to search for
-possible complete matches. This option is "hard" because it prefers an earlier
-partial match over a later complete match. For this reason, the assumption is
-made that the end of the supplied subject string may not be the true end of the
-available data, and so, if \ez, \eZ, \eb, \eB, or $ are encountered at the end
-of the subject, the result is PCRE2_ERROR_PARTIAL, whether or not any 
-characters have been inspected.
-.
-.
-.SS "Comparing hard and soft partial matching"
+.SS "How a partial match is processed by pcre2_match()"
 .rs
 .sp
+What happens when a partial match is identified depends on which of the two
+partial matching options is set.
+.P
+If PCRE2_PARTIAL_HARD is set, PCRE2_ERROR_PARTIAL is returned as soon as a
+partial match is found, without continuing to search for possible complete
+matches. This option is "hard" because it prefers an earlier partial match over
+a later complete match. For this reason, the assumption is made that the end of
+the supplied subject string is not the true end of the available data, which is 
+why \ez, \eZ, \eb, \eB, and $ always give a partial match.
+.P
+If PCRE2_PARTIAL_SOFT is set, the partial match is remembered, but matching
+continues as normal, and other alternatives in the pattern are tried. If no
+complete match can be found, PCRE2_ERROR_PARTIAL is returned instead of
+PCRE2_ERROR_NOMATCH. This option is "soft" because it prefers a complete match
+over a partial match. All the various matching items in a pattern behave as if
+the subject string is potentially complete; \ez, \eZ, and $ match at the end of
+the subject, as normal, and for \eb and \eB the end of the subject is treated
+as a non-alphanumeric.
+.P
 The difference between the two partial matching options can be illustrated by a
 pattern such as:
 .sp
@@ -154,157 +175,83 @@ The second pattern will never match "dogsbody", because it will always find the
 shorter match first.
 .
 .
-.SH "PARTIAL MATCHING USING pcre2_dfa_match()"
+.SS "Example of partial matching using pcre2test"
 .rs
 .sp
-The DFA functions move along the subject string character by character, without
-backtracking, searching for all possible matches simultaneously. If the end of
-the subject is reached before the end of the pattern, there is the possibility
-of a partial match, again provided that at least one character has been
-inspected.
-.P
-When PCRE2_PARTIAL_SOFT is set, PCRE2_ERROR_PARTIAL is returned only if there
-have been no complete matches. Otherwise, the complete matches are returned.
-However, if PCRE2_PARTIAL_HARD is set, a partial match takes precedence over
-any complete matches. The portion of the string that was matched when the
-longest partial match was found is set as the first matching string.
-.P
-Because the DFA functions always search for all possible matches, and there is
-no difference between greedy and ungreedy repetition, their behaviour is
-different from the standard functions when PCRE2_PARTIAL_HARD is set. Consider
-the string "dog" matched against the ungreedy pattern shown above:
-.sp
-  /dog(sbody)??/
-.sp
-Whereas the standard function stops as soon as it finds the complete match for
-"dog", the DFA function also finds the partial match for "dogsbody", and so
-returns that when PCRE2_PARTIAL_HARD is set.
-.
-.
-.SH "PARTIAL MATCHING AND WORD BOUNDARIES"
-.rs
-.sp
-If a pattern ends with one of sequences \eb or \eB, which test for word
-boundaries, partial matching with PCRE2_PARTIAL_SOFT can give counter-intuitive
-results. Consider this pattern:
-.sp
-  /\ebcat\eb/
-.sp
-This matches "cat", provided there is a word boundary at either end. If the
-subject string is "the cat", the comparison of the final "t" with a following
-character cannot take place, so a partial match is found. However, normal
-matching carries on, and \eb matches at the end of the subject when the last
-character is a letter, so a complete match is found. The result, therefore, is
-\fInot\fP PCRE2_ERROR_PARTIAL. Using PCRE2_PARTIAL_HARD in this case does yield
-PCRE2_ERROR_PARTIAL, because then the partial match takes precedence.
-.
-.
-.SH "EXAMPLE OF PARTIAL MATCHING USING PCRE2TEST"
-.rs
-.sp
-If the \fBpartial_soft\fP (or \fBps\fP) modifier is present on a
-\fBpcre2test\fP data line, the PCRE2_PARTIAL_SOFT option is used for the match.
-Here is a run of \fBpcre2test\fP that uses the date example quoted above:
+The \fBpcre2test\fP data modifiers \fBpartial_hard\fP (or \fBph\fP) and
+\fBpartial_soft\fP (or \fBps\fP) set PCRE2_PARTIAL_HARD and PCRE2_PARTIAL_SOFT,
+respectively, when calling \fBpcre2_match()\fP. Here is a run of
+\fBpcre2test\fP using a pattern that matches the whole subject in the form of a
+date:
 .sp
     re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
-  data> 25jun04\e=ps
-   0: 25jun04
-   1: jun
-  data> 25dec3\e=ps
+  data> 25dec3\e=ph
   Partial match: 25dec3
-  data> 3ju\e=ps
+  data> 3ju\e=ph
   Partial match: 3ju
-  data> 3juj\e=ps
-  No match
-  data> j\e=ps
+  data> 3juj\e=ph
   No match
 .sp
-The first data string is matched completely, so \fBpcre2test\fP shows the
-matched substrings. The remaining four strings do not match the complete
-pattern, but the first two are partial matches. Similar output is obtained
-if DFA matching is used.
-.P
-If the \fBpartial_hard\fP (or \fBph\fP) modifier is present on a
-\fBpcre2test\fP data line, the PCRE2_PARTIAL_HARD option is set for the match.
-.
-.
-.SH "MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()"
-.rs
-.sp
-When a partial match has been found using a DFA matching function, it is
-possible to continue the match by providing additional subject data and calling
-the function again with the same compiled regular expression, this time setting
-the PCRE2_DFA_RESTART option. You must pass the same working space as before,
-because this is where details of the previous partial match are stored. Here is
-an example using \fBpcre2test\fP:
+This example gives the same results for both hard and soft partial matching 
+options. Here is an example where there is a difference:
 .sp
     re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
-  data> 23ja\e=dfa,ps
-  Partial match: 23ja
-  data> n05\e=dfa,dfa_restart
-   0: n05
-.sp
-The first call has "23ja" as the subject, and requests partial matching; the
-second call has "n05" as the subject for the continued (restarted) match.
-Notice that when the match is complete, only the last part is shown; PCRE2 does
-not retain the previously partially-matched string. It is up to the calling
-program to do that if it needs to.
-.P
-That means that, for an unanchored pattern, if a continued match fails, it is
-not possible to try again at a new starting point. All this facility is capable
-of doing is continuing with the previous match attempt. In the previous
-example, if the second set of data is "ug23" the result is no match, even
-though there would be a match for "aug23" if the entire string were given at
-once. Depending on the application, this may or may not be what you want.
-The only way to allow for starting again at the next character is to retain the
-matched part of the subject and try a new complete match.
-.P
-You can set the PCRE2_PARTIAL_SOFT or PCRE2_PARTIAL_HARD options with
-PCRE2_DFA_RESTART to continue partial matching over multiple segments. This
-facility can be used to pass very long subject strings to the DFA matching
-functions.
+  data> 25jun04\e=ps
+   0: 25jun04
+   1: jun
+  data> 25jun04\e=ph
+  Partial match: 25jun04
+.sp
+With PCRE2_PARTIAL_SOFT, the subject is matched completely. For
+PCRE2_PARTIAL_HARD, however, the subject is assumed not to be complete, so
+there is only a partial match.
+.
 .
 .
 .SH "MULTI-SEGMENT MATCHING WITH pcre2_match()"
 .rs
 .sp
-Unlike the DFA function, it is not possible to restart the previous match with
-a new segment of data when using \fBpcre2_match()\fP. Instead, new data must be
-added to the previous subject string, and the entire match re-run, starting
-from the point where the partial match occurred. Earlier data can be discarded.
+PCRE was not originally designed with multi-segment matching in mind. However,
+over time, features (including partial matching) that make multi-segment
+matching possible have been added. The string is searched segment by segment by
+calling \fBpcre2_match()\fP repeatedly, with the aim of achieving the same 
+results that would happen if the entire string were available for searching.
+.P
+Special logic must be implemented to handle a matched substring that spans a
+segment boundary. PCRE2_PARTIAL_HARD should be used, because it returns a
+partial match at the end of a segment whenever there is the possibility of
+changing the match by adding more characters. The PCRE2_NOTBOL option should
+also be set for all but the first segment.
 .P
-It is best to use PCRE2_PARTIAL_HARD in this situation, because it does not
-treat the end of a segment as the end of the subject when matching \ez, \eZ,
-\eb, \eB, and $. Consider an unanchored pattern that matches dates:
+When a partial match occurs, the next segment must be added to the current 
+subject and the match re-run, using the \fIstartoffset\fP argument of 
+\fBpcre2_match()\fP to begin at the point where the partial match started.
+Multi-segment matching is usually used to search for substrings in the middle
+of very long sequences, so the patterns are normally not anchored. For example:
 .sp
     re> /\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed/
-  data> The date is 23ja\e=ph
+  data> ...the date is 23ja\e=ph
   Partial match: 23ja
+  data> ...the date is 23jan19 and on that day...\e=offset=15
+   0: 23jan19
+   1: jan
 .sp
-At this stage, an application could discard the text preceding "23ja", add on
-text from the next segment, and call the matching function again. Unlike the
-DFA matching function, the entire matching string must always be available,
-and the complete matching process occurs for each call, so more memory and more
-processing time is needed.
-.
-.
-.SH "ISSUES WITH MULTI-SEGMENT MATCHING"
-.rs
-.sp
-Certain types of pattern may give problems with multi-segment matching,
-whichever matching function is used.
+Note the use of the \fBoffset\fP modifier to start the new match where the 
+partial match was found.
 .P
-1. If the pattern contains a test for the beginning of a line, you need to pass
-the PCRE2_NOTBOL option when the subject string for any call does start at the
-beginning of a line. There is also a PCRE2_NOTEOL option, but in practice when
-doing multi-segment matching you should be using PCRE2_PARTIAL_HARD, which
-includes the effect of PCRE2_NOTEOL.
+In this simple example, the next segment was just added to the one in which the 
+partial match was found. However, if there are memory constraints, it may be 
+necessary to discard text that precedes the partial match before adding the 
+next segment. In cases such as the above, where the pattern does not contain
+any lookbehinds, it is sufficient to retain only the partially matched
+substring. However, if a pattern contains a lookbehind assertion, characters
+that precede the start of the partial match may have been inspected during the
+matching process.
 .P
-2. If a pattern contains a lookbehind assertion, characters that precede the
-start of the partial match may have been inspected during the matching process.
-When using \fBpcre2_match()\fP, sufficient characters must be retained for the
-next match attempt. You can ensure that enough characters are retained by doing
-the following:
+The only lookbehind information that is available is the length of the longest
+lookbehind in a pattern. This may not, of course, be at the start of the
+pattern, but retaining that many characters before the partial match is
+sufficient, if not always strictly necessary. The way to do this is as follows:
 .P
 Before doing any matching, find the length of the longest lookbehind in the
 pattern by calling \fBpcre2_pattern_info()\fP with the PCRE2_INFO_MAXLOOKBEHIND
@@ -313,71 +260,78 @@ partial match, moving back from the ovector[0] offset in the subject by the
 number of characters given for the maximum lookbehind gets you to the earliest
 character that must be retained. In a non-UTF or a 32-bit situation, moving
 back is just a subtraction, but in UTF-8 or UTF-16 you have to count characters
-while moving back through the code units.
-.P
-Characters before the point you have now reached can be discarded, and after
-the next segment has been added to what is retained, you should run the next
-match with the \fBstartoffset\fP argument set so that the match begins at the
-same point as before.
+while moving back through the code units. Characters before the point you have
+now reached can be discarded.
 .P
 For example, if the pattern "(?<=123)abc" is partially matched against the
 string "xx123ab", the ovector offsets are 5 and 7 ("ab"). The maximum
 lookbehind count is 3, so all characters before offset 2 can be discarded. The
 value of \fBstartoffset\fP for the next match should be 3. When \fBpcre2test\fP
 displays a partial match, it indicates the lookbehind characters with '<'
-characters if the "allusedtext" modifier is set:
+characters if the \fBallusedtext\fP modifier is set:
 .sp
     re> "(?<=123)abc"
   data> xx123ab\e=ph,allusedtext
   Partial match: 123ab
                  <<<
-However, the "allusedtext" modifier is not available for JIT matching, because 
-JIT matching does not maintain the first and last consulted characters.
+.sp                  
+Note that the \fBallusedtext\fP modifier is not available for JIT matching,
+because JIT matching does not maintain the first and last consulted characters.
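The backing-up step described above (moving back from ovector[0] by the maximum lookbehind count, counting characters rather than code units) can be sketched in C for the UTF-8 case. This is only an illustration under stated assumptions: the function name retain_from_utf8 is hypothetical and not part of the PCRE2 API, the subject is assumed to be valid UTF-8 held in 8-bit code units, and max_lookbehind is assumed to be the value previously obtained from pcre2_pattern_info() with PCRE2_INFO_MAXLOOKBEHIND.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not a PCRE2 function. Given a valid UTF-8 subject and
   the code-unit offset at which a partial match started (ovector[0]), move
   back by up to max_lookbehind characters and return the offset of the
   earliest code unit that must be retained. Everything before the returned
   offset can be discarded. */
size_t retain_from_utf8(const uint8_t *subject, size_t match_start,
                        uint32_t max_lookbehind)
{
    size_t pos = match_start;

    while (max_lookbehind > 0 && pos > 0) {
        pos--;  /* step back one code unit */

        /* Continuation bytes have the form 10xxxxxx; keep stepping back
           until the first byte of the character is reached. */
        while (pos > 0 && (subject[pos] & 0xC0) == 0x80)
            pos--;

        max_lookbehind--;
    }
    return pos;
}
```

With the subject "xx123ab", a partial match starting at offset 5, and a maximum lookbehind of 3, the function returns 2. After the first two code units are discarded, the startoffset for the next call is 5 - 2 = 3, which agrees with the worked example above. In a non-UTF or 32-bit library the same result is obtained by a plain subtraction, clamped at zero.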
+.
+.
+.
+.SH "PARTIAL MATCHING USING pcre2_dfa_match()"
+.rs
+.sp
+The DFA function moves along the subject string character by character, without
+backtracking, searching for all possible matches simultaneously. If the end of
+the subject is reached before the end of the pattern, there is the possibility
+of a partial match.
 .P
-3. Matching a subject string that is split into multiple segments may not
-always produce exactly the same result as matching over one single long string
-when PCRE2_PARTIAL_SOFT is used. The section "Partial Matching and Word
-Boundaries" above describes an issue that arises if the pattern ends with \eb
-or \eB. Another kind of difference may occur when there are multiple matching
-possibilities, because (for PCRE2_PARTIAL_SOFT) a partial match result is given
-only when there are no completed matches. This means that as soon as the
-shortest match has been found, continuation to a new subject segment is no
-longer possible. Consider this \fBpcre2test\fP example:
-.sp
-    re> /dog(sbody)?/
-  data> dogsb\e=ps
-   0: dog
-  data> do\e=ps,dfa
-  Partial match: do
-  data> gsb\e=ps,dfa,dfa_restart
-   0: g
-  data> dogsbody\e=dfa
-   0: dogsbody
-   1: dog
-.sp
-The first data line passes the string "dogsb" to a standard matching function,
-setting the PCRE2_PARTIAL_SOFT option. Although the string is a partial match
-for "dogsbody", the result is not PCRE2_ERROR_PARTIAL, because the shorter
-string "dog" is a complete match. Similarly, when the subject is presented to
-a DFA matching function in several parts ("do" and "gsb" being the first two)
-the match stops when "dog" has been found, and it is not possible to continue.
-On the other hand, if "dogsbody" is presented as a single string, a DFA
-matching function finds both matches.
+When PCRE2_PARTIAL_SOFT is set, PCRE2_ERROR_PARTIAL is returned only if there
+have been no complete matches. Otherwise, the complete matches are returned.
+If PCRE2_PARTIAL_HARD is set, a partial match takes precedence over any
+complete matches. The portion of the string that was matched when the longest
+partial match was found is set as the first matching string.
 .P
-Because of these problems, it is best to use PCRE2_PARTIAL_HARD when matching
-multi-segment data. The example above then behaves differently:
+Because the DFA function always searches for all possible matches, and there is
+no difference between greedy and ungreedy repetition, its behaviour is
+different from that of \fBpcre2_match()\fP. Consider the string "dog" matched
+against this ungreedy pattern:
+.sp
+  /dog(sbody)??/
+.sp
+Whereas the standard function stops as soon as it finds the complete match for
+"dog", the DFA function also finds the partial match for "dogsbody", and so
+returns that when PCRE2_PARTIAL_HARD is set.
+.
+.
+.SH "MULTI-SEGMENT MATCHING WITH pcre2_dfa_match()"
+.rs
 .sp
-    re> /dog(sbody)?/
-  data> dogsb\e=ph
-  Partial match: dogsb
-  data> do\e=ps,dfa
-  Partial match: do
-  data> gsb\e=ph,dfa,dfa_restart
-  Partial match: gsb
+When a partial match has been found using the DFA matching function, it is
+possible to continue the match by providing additional subject data and calling
+the function again with the same compiled regular expression, this time setting
+the PCRE2_DFA_RESTART option. You must pass the same working space as before,
+because this is where details of the previous partial match are stored. You can
+set the PCRE2_PARTIAL_SOFT or PCRE2_PARTIAL_HARD options with PCRE2_DFA_RESTART
+to continue partial matching over multiple segments. Here is an example using
+\fBpcre2test\fP:
 .sp
-4. Patterns that contain alternatives at the top level which do not all start
-with the same pattern item may not work as expected when PCRE2_DFA_RESTART is
-used. For example, consider this pattern:
+    re> /^\ed?\ed(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)\ed\ed$/
+  data> 23ja\e=dfa,ps
+  Partial match: 23ja
+  data> n05\e=dfa,dfa_restart
+   0: n05
+.sp
+The first call has "23ja" as the subject, and requests partial matching; the
+second call has "n05" as the subject for the continued (restarted) match.
+Notice that when the match is complete, only the last part is shown; PCRE2 does
+not retain the previously partially-matched string. It is up to the calling
+program to do that if it needs to. This means that, for an unanchored pattern,
+if a continued match fails, it is not possible to try again at a new starting
+point. All this facility is capable of doing is continuing with the previous
+match attempt. For example, consider this pattern:
 .sp
   1234|3789
 .sp
@@ -386,28 +340,15 @@ alternative is found at offset 3. There is no partial match for the second
 alternative, because such a match does not start at the same point in the
 subject string. Attempting to continue with the string "7890" does not yield a
 match because only those alternatives that match at one point in the subject
-are remembered. The problem arises because the start of the second alternative
-matches within the first alternative. There is no problem with anchored
-patterns or patterns such as:
-.sp
-  1234|ABCD
-.sp
-where no string can be a partial match for both alternatives. This is not a
-problem if a standard matching function is used, because the entire match has
-to be rerun each time:
-.sp
-    re> /1234|3789/
-  data> ABC123\e=ph
-  Partial match: 123
-  data> 1237890
-   0: 3789
-.sp
-Of course, instead of using PCRE2_DFA_RESTART, the same technique of re-running
-the entire match can also be used with the DFA matching function. Another
-possibility is to work with two buffers. If a partial match at offset \fIn\fP
-in the first buffer is followed by "no match" when PCRE2_DFA_RESTART is used on
-the second buffer, you can then try a new match starting at offset \fIn+1\fP in
-the first buffer.
+are remembered. Depending on the application, this may or may not be what you
+want.
+.P
+If you do want to allow for starting again at the next character, one way of
+doing it is to retain the matched part of the segment and try a new complete
+match, as described for \fBpcre2_match()\fP above. Another possibility is to
+work with two buffers. If a partial match at offset \fIn\fP in the first buffer
+is followed by "no match" when PCRE2_DFA_RESTART is used on the second buffer,
+you can then try a new match starting at offset \fIn+1\fP in the first buffer.
 .
 .
 .SH AUTHOR
@@ -424,6 +365,6 @@ Cambridge, England.
 .rs
 .sp
 .nf
-Last updated: 22 July 2019
+Last updated: 07 August 2019
 Copyright (c) 1997-2019 University of Cambridge.
 .fi