author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2015-04-14 17:02:30 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2015-04-14 17:02:30 +0000
commit e2126bc127a8b4c3602dd28ebb2862ad6fbc27a6 (patch)
tree 6b27073ef6f658d52820a9b884a97cd00743263a
parent e75637b2efc61520e5bb4a99643620a27d4b036a (diff)
download pcre-e2126bc127a8b4c3602dd28ebb2862ad6fbc27a6.tar.gz
Documentation and tidies preparatory to 8.37 release.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1548 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- AUTHORS 6
-rw-r--r-- ChangeLog 36
-rw-r--r-- LICENCE 6
-rw-r--r-- NEWS 8
-rwxr-xr-x RunGrepTest 2
-rw-r--r-- configure.ac 8
-rw-r--r-- doc/html/NON-AUTOTOOLS-BUILD.txt 10
-rw-r--r-- doc/html/README.txt 13
-rw-r--r-- doc/html/pcre.html 35
-rw-r--r-- doc/pcre.txt 17
-rw-r--r-- pcre_compile.c 2
-rw-r--r-- pcre_internal.h 2
-rw-r--r-- pcre_study.c 44
-rw-r--r-- pcregrep.c 60
-rw-r--r-- pcretest.c 34
-rw-r--r-- testdata/testinput4 3
-rw-r--r-- testdata/testinput6 3
-rw-r--r-- testdata/testoutput4 4
-rw-r--r-- testdata/testoutput6 4
19 files changed, 172 insertions, 125 deletions
diff --git a/AUTHORS b/AUTHORS
index 5eee1af..d33723f 100644
--- a/AUTHORS
+++ b/AUTHORS
@@ -8,7 +8,7 @@ Email domain: cam.ac.uk
University of Cambridge Computing Service,
Cambridge, England.
-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
All rights reserved
@@ -19,7 +19,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
All rights reserved.
@@ -30,7 +30,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
All rights reserved.
diff --git a/ChangeLog b/ChangeLog
index 816a390..c79dfe3 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,8 +1,8 @@
ChangeLog for PCRE
------------------
-Version 8.37 xx-xxx-2015
-------------------------
+Version 8.37 14-April-2015
+--------------------------
1. When an (*ACCEPT) is triggered inside capturing parentheses, it arranges
for those parentheses to be closed with whatever has been captured so far.
@@ -65,7 +65,7 @@ Version 8.37 xx-xxx-2015
failed to allow the zero-repeat case if pcre2_exec() was called with an
ovector too small to capture the group.
-13. Fixed two bugs in pcretest that were discovered by fuzzing and reported by
+13. Fixed two bugs in pcretest that were discovered by fuzzing and reported by
Red Hat Product Security:
(a) A crash if /K and /F were both set with the option to save the compiled
@@ -74,20 +74,20 @@ Version 8.37 xx-xxx-2015
(b) Another crash if the option to print captured substrings in a callout
was combined with setting a null ovector, for example \O\C+ as a subject
string.
-
+
14. A pattern such as "((?2){0,1999}())?", which has a group containing a
forward reference repeated a large (but limited) number of times within a
repeated outer group that has a zero minimum quantifier, caused incorrect
code to be compiled, leading to the error "internal error:
previously-checked referenced subpattern not found" when an incorrect
memory address was read. This bug was reported as "heap overflow",
- discovered by Kai Lu of Fortinet's FortiGuard Labs and given the CVE number
- CVE-2015-2325.
-
+ discovered by Kai Lu of Fortinet's FortiGuard Labs and given the CVE number
+ CVE-2015-2325.
+
23. A pattern such as "((?+1)(\1))/" containing a forward reference subroutine
call within a group that also contained a recursive back reference caused
incorrect code to be compiled. This bug was reported as "heap overflow",
- discovered by Kai Lu of Fortinet's FortiGuard Labs, and given the CVE
+ discovered by Kai Lu of Fortinet's FortiGuard Labs, and given the CVE
number CVE-2015-2326.
24. Computing the size of the JIT read-only data in advance has been a source
@@ -102,7 +102,7 @@ Version 8.37 xx-xxx-2015
26. Fix JIT compilation of conditional blocks, which assertion
is converted to (*FAIL). E.g: /(?(?!))/.
-
+
27. The pattern /(?(?!)^)/ caused references to random memory. This bug was
discovered by the LLVM fuzzer.
@@ -110,19 +110,19 @@ Version 8.37 xx-xxx-2015
when this assertion was used as a condition, for example (?(?!)a|b). In
pcre2_match() it worked by luck; in pcre2_dfa_match() it gave an incorrect
error about an unsupported item.
-
+
29. For some types of pattern, for example /Z*(|d*){216}/, the auto-
- possessification code could take exponential time to complete. A recursion
- depth limit of 1000 has been imposed to limit the resources used by this
+ possessification code could take exponential time to complete. A recursion
+ depth limit of 1000 has been imposed to limit the resources used by this
optimization.
-
+
30. A pattern such as /(*UTF)[\S\V\H]/, which contains a negated special class
such as \S in non-UCP mode, explicit wide characters (> 255) can be ignored
because \S ensures they are all in the class. The code for doing this was
interacting badly with the code for computing the amount of space needed to
compile the pattern, leading to a buffer overflow. This bug was discovered
by the LLVM fuzzer.
-
+
31. A pattern such as /((?2)+)((?1))/ which has mutual recursion nested inside
other kinds of group caused stack overflow at compile time. This bug was
discovered by the LLVM fuzzer.
@@ -131,7 +131,7 @@ Version 8.37 xx-xxx-2015
between a subroutine call and its quantifier was incorrectly compiled,
leading to buffer overflow or other errors. This bug was discovered by the
LLVM fuzzer.
-
+
33. The illegal pattern /(?(?<E>.*!.*)?)/ was not being diagnosed as missing an
assertion after (?(. The code was failing to check the character after
(?(?< for the ! or = that would indicate a lookbehind assertion. This bug
@@ -145,7 +145,7 @@ Version 8.37 xx-xxx-2015
35. A mutual recursion within a lookbehind assertion such as (?<=((?2))((?1)))
caused a stack overflow instead of the diagnosis of a non-fixed length
lookbehind assertion. This bug was discovered by the LLVM fuzzer.
-
+
36. The use of \K in a positive lookbehind assertion in a non-anchored pattern
(e.g. /(?<=\Ka)/) could make pcregrep loop.
@@ -154,11 +154,11 @@ Version 8.37 xx-xxx-2015
38. If a greedy quantified \X was preceded by \C in UTF mode (e.g. \C\X*),
and a subsequent item in the pattern caused a non-match, backtracking over
the repeated \X did not stop, but carried on past the start of the subject,
- causing reference to random memory and/or a segfault. There were also some
+ causing reference to random memory and/or a segfault. There were also some
other cases where backtracking after \C could crash. This set of bugs was
discovered by the LLVM fuzzer.
-20. The function for finding the minimum length of a matching string could take
+39. The function for finding the minimum length of a matching string could take
a very long time if mutual recursion was present many times in a pattern,
for example, /((?2){73}(?2))((?1))/. A better mutual recursion detection
method has been implemented. This infelicity was discovered by the LLVM
diff --git a/LICENCE b/LICENCE
index 602e4ae..3b71353 100644
--- a/LICENCE
+++ b/LICENCE
@@ -24,7 +24,7 @@ Email domain: cam.ac.uk
University of Cambridge Computing Service,
Cambridge, England.
-Copyright (c) 1997-2014 University of Cambridge
+Copyright (c) 1997-2015 University of Cambridge
All rights reserved.
@@ -35,7 +35,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
-Copyright(c) 2010-2014 Zoltan Herczeg
+Copyright(c) 2010-2015 Zoltan Herczeg
All rights reserved.
@@ -46,7 +46,7 @@ Written by: Zoltan Herczeg
Email local part: hzmester
Emain domain: freemail.hu
-Copyright(c) 2009-2014 Zoltan Herczeg
+Copyright(c) 2009-2015 Zoltan Herczeg
All rights reserved.
diff --git a/NEWS b/NEWS
index 5b8c60c..c8e48a8 100644
--- a/NEWS
+++ b/NEWS
@@ -1,6 +1,14 @@
News about PCRE releases
------------------------
+Release 8.37 14-April-2015
+--------------------------
+
+This is bug-fix release. Note that this library (now called PCRE1) is now being
+maintained for bug fixes only. New projects are advised to use the new PCRE2
+libraries.
+
+
Release 8.36 26-September-2014
------------------------------
diff --git a/RunGrepTest b/RunGrepTest
index 766278b..62e2a9b 100755
--- a/RunGrepTest
+++ b/RunGrepTest
@@ -509,7 +509,7 @@ echo "RC=$?" >>testtrygrep
echo "---------------------------- Test 107 -----------------------------" >>testtrygrep
echo "a" >testtemp1grep
echo "aaaaa" >>testtemp1grep
-(cd $srcdir; $valgrind $pcregrep --line-offsets '(?<=\Ka)' testtemp1grep) >>testtrygrep 2>&1
+(cd $srcdir; $valgrind $pcregrep --line-offsets '(?<=\Ka)' $builddir/testtemp1grep) >>testtrygrep 2>&1
echo "RC=$?" >>testtrygrep
# Now compare the results.
diff --git a/configure.ac b/configure.ac
index ec8496e..7bbbac7 100644
--- a/configure.ac
+++ b/configure.ac
@@ -11,15 +11,15 @@ dnl be defined as -RC2, for example. For real releases, it should be empty.
m4_define(pcre_major, [8])
m4_define(pcre_minor, [37])
m4_define(pcre_prerelease, [-RC1])
-m4_define(pcre_date, [2015-02-03])
+m4_define(pcre_date, [2015-04-14])
# NOTE: The CMakeLists.txt file searches for the above variables in the first
# 50 lines of this file. Please update that if the variables above are moved.
# Libtool shared library interface versions (current:revision:age)
-m4_define(libpcre_version, [3:4:2])
-m4_define(libpcre16_version, [2:4:2])
-m4_define(libpcre32_version, [0:4:0])
+m4_define(libpcre_version, [3:5:2])
+m4_define(libpcre16_version, [2:5:2])
+m4_define(libpcre32_version, [0:5:0])
m4_define(libpcreposix_version, [0:3:0])
m4_define(libpcrecpp_version, [0:1:0])
diff --git a/doc/html/NON-AUTOTOOLS-BUILD.txt b/doc/html/NON-AUTOTOOLS-BUILD.txt
index cddf3e0..1c3da84 100644
--- a/doc/html/NON-AUTOTOOLS-BUILD.txt
+++ b/doc/html/NON-AUTOTOOLS-BUILD.txt
@@ -1,6 +1,14 @@
Building PCRE without using autotools
-------------------------------------
+NOTE: This document relates to PCRE releases that use the original API, with
+library names libpcre, libpcre16, and libpcre32. January 2015 saw the first
+release of a new API, known as PCRE2, with release numbers starting at 10.00
+and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old libraries
+(now called PCRE1) are still being maintained for bug fixes, but there will be
+no new development. New projects are advised to use the new PCRE2 libraries.
+
+
This document contains the following sections:
General
@@ -761,4 +769,4 @@ There is also a mirror here:
http://www.vsoft-software.com/downloads.html
==========================
-Last Updated: 14 May 2013
+Last Updated: 10 February 2015
diff --git a/doc/html/README.txt b/doc/html/README.txt
index e30bd0f..4887ebf 100644
--- a/doc/html/README.txt
+++ b/doc/html/README.txt
@@ -1,7 +1,16 @@
README file for PCRE (Perl-compatible regular expression library)
-----------------------------------------------------------------
-The latest release of PCRE is always available in three alternative formats
+NOTE: This set of files relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+
+
+The latest release of PCRE1 is always available in three alternative formats
from:
ftp://ftp.csx.cam.ac.uk/pub/software/programming/pcre/pcre-xxx.tar.gz
@@ -990,4 +999,4 @@ pcre_xxx, one with the name pcre16_xx, and a third with the name pcre32_xxx.
Philip Hazel
Email local part: ph10
Email domain: cam.ac.uk
-Last updated: 24 October 2014
+Last updated: 10 February 2015
diff --git a/doc/html/pcre.html b/doc/html/pcre.html
index c2b29aa..c87b106 100644
--- a/doc/html/pcre.html
+++ b/doc/html/pcre.html
@@ -13,13 +13,24 @@ from the original man page. If there is any nonsense in it, please consult the
man page, in case the conversion went wrong.
<br>
<ul>
-<li><a name="TOC1" href="#SEC1">INTRODUCTION</a>
-<li><a name="TOC2" href="#SEC2">SECURITY CONSIDERATIONS</a>
-<li><a name="TOC3" href="#SEC3">USER DOCUMENTATION</a>
-<li><a name="TOC4" href="#SEC4">AUTHOR</a>
-<li><a name="TOC5" href="#SEC5">REVISION</a>
+<li><a name="TOC1" href="#SEC1">PLEASE TAKE NOTE</a>
+<li><a name="TOC2" href="#SEC2">INTRODUCTION</a>
+<li><a name="TOC3" href="#SEC3">SECURITY CONSIDERATIONS</a>
+<li><a name="TOC4" href="#SEC4">USER DOCUMENTATION</a>
+<li><a name="TOC5" href="#SEC5">AUTHOR</a>
+<li><a name="TOC6" href="#SEC6">REVISION</a>
</ul>
-<br><a name="SEC1" href="#TOC1">INTRODUCTION</a><br>
+<br><a name="SEC1" href="#TOC1">PLEASE TAKE NOTE</a><br>
+<P>
+This document relates to PCRE releases that use the original API,
+with library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+first release of a new API, known as PCRE2, with release numbers starting at
+10.00 and library names libpcre2-8, libpcre2-16, and libpcre2-32. The old
+libraries (now called PCRE1) are still being maintained for bug fixes, but
+there will be no new development. New projects are advised to use the new PCRE2
+libraries.
+</P>
+<br><a name="SEC2" href="#TOC1">INTRODUCTION</a><br>
<P>
The PCRE library is a set of functions that implement regular expression
pattern matching using the same syntax and semantics as Perl, with just a few
@@ -115,7 +126,7 @@ clashes. In some environments, it is possible to control which external symbols
are exported when a shared library is built, and in these cases the
undocumented symbols are not exported.
</P>
-<br><a name="SEC2" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
+<br><a name="SEC3" href="#TOC1">SECURITY CONSIDERATIONS</a><br>
<P>
If you are using PCRE in a non-UTF application that permits users to supply
arbitrary patterns for compilation, you should be aware of a feature that
@@ -149,7 +160,7 @@ against this: see the PCRE_EXTRA_MATCH_LIMIT feature in the
<a href="pcreapi.html"><b>pcreapi</b></a>
page.
</P>
-<br><a name="SEC3" href="#TOC1">USER DOCUMENTATION</a><br>
+<br><a name="SEC4" href="#TOC1">USER DOCUMENTATION</a><br>
<P>
The user documentation for PCRE comprises a number of different sections. In
the "man" format, each of these is a separate "man page". In the HTML format,
@@ -188,7 +199,7 @@ follows:
In the "man" and HTML formats, there is also a short page for each C library
function, listing its arguments and results.
</P>
-<br><a name="SEC4" href="#TOC1">AUTHOR</a><br>
+<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
@@ -202,11 +213,11 @@ Putting an actual email address here seems to have been a spam magnet, so I've
taken it away. If you want to email me, use my two initials, followed by the
two digits 10, at the domain cam.ac.uk.
</P>
-<br><a name="SEC5" href="#TOC1">REVISION</a><br>
+<br><a name="SEC6" href="#TOC1">REVISION</a><br>
<P>
-Last updated: 08 January 2014
+Last updated: 10 February 2015
<br>
-Copyright &copy; 1997-2014 University of Cambridge.
+Copyright &copy; 1997-2015 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE index page</a>.
diff --git a/doc/pcre.txt b/doc/pcre.txt
index ce27f4b..bc92e4f 100644
--- a/doc/pcre.txt
+++ b/doc/pcre.txt
@@ -13,7 +13,18 @@ PCRE(3) Library Functions Manual PCRE(3)
NAME
- PCRE - Perl-compatible regular expressions
+ PCRE - Perl-compatible regular expressions (original API)
+
+PLEASE TAKE NOTE
+
+ This document relates to PCRE releases that use the original API, with
+ library names libpcre, libpcre16, and libpcre32. January 2015 saw the
+ first release of a new API, known as PCRE2, with release numbers start-
+ ing at 10.00 and library names libpcre2-8, libpcre2-16, and
+ libpcre2-32. The old libraries (now called PCRE1) are still being main-
+ tained for bug fixes, but there will be no new development. New
+ projects are advised to use the new PCRE2 libraries.
+
INTRODUCTION
@@ -179,8 +190,8 @@ AUTHOR
REVISION
- Last updated: 08 January 2014
- Copyright (c) 1997-2014 University of Cambridge.
+ Last updated: 10 February 2015
+ Copyright (c) 1997-2015 University of Cambridge.
------------------------------------------------------------------------------
diff --git a/pcre_compile.c b/pcre_compile.c
index 5d819f0..0efad26 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -5524,13 +5524,13 @@ for (;; ptr++)
PUT(previous, 1, (int)(code - previous));
break; /* End of class handling */
}
-#endif
/* Even though any XCLASS list is now discarded, we must allow for
its memory. */
if (lengthptr != NULL)
*lengthptr += (int)(class_uchardata - class_uchardata_base);
+#endif
/* If there are no characters > 255, or they are all to be included or
excluded, set the opcode to OP_CLASS or OP_NCLASS, depending on whether the
diff --git a/pcre_internal.h b/pcre_internal.h
index 871520e..dd0ac7f 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -2446,7 +2446,7 @@ typedef struct compile_data {
BOOL had_pruneorskip; /* (*PRUNE) or (*SKIP) encountered */
BOOL check_lookbehind; /* Lookbehinds need later checking */
BOOL dupnames; /* Duplicate names exist */
- BOOL iscondassert; /* Next assert is a condition */
+ BOOL iscondassert; /* Next assert is a condition */
int nltype; /* Newline type */
int nllen; /* Newline string length */
pcre_uchar nl[4]; /* Newline string when fixed length */
diff --git a/pcre_study.c b/pcre_study.c
index a4c7428..998fe23 100644
--- a/pcre_study.c
+++ b/pcre_study.c
@@ -401,24 +401,24 @@ for (;;)
break;
}
else
- {
- recurse_check *r = recurses;
+ {
+ recurse_check *r = recurses;
for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
if (r != NULL) /* Mutual recursion */
- {
- d = 0;
- had_recurse = TRUE;
- break;
- }
+ {
+ d = 0;
+ had_recurse = TRUE;
+ break;
+ }
else
{
int dd;
this_recurse.prev = recurses;
- this_recurse.group = cs;
+ this_recurse.group = cs;
dd = find_minlength(re, cs, startcode, options, &this_recurse);
if (dd < d) d = dd;
}
- }
+ }
slot += re->name_entry_size;
}
}
@@ -439,20 +439,20 @@ for (;;)
had_recurse = TRUE;
}
else
- {
- recurse_check *r = recurses;
+ {
+ recurse_check *r = recurses;
for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
if (r != NULL) /* Mutual recursion */
- {
- d = 0;
- had_recurse = TRUE;
- }
+ {
+ d = 0;
+ had_recurse = TRUE;
+ }
else
{
this_recurse.prev = recurses;
- this_recurse.group = cs;
+ this_recurse.group = cs;
d = find_minlength(re, cs, startcode, options, &this_recurse);
- }
+ }
}
}
else d = 0;
@@ -504,18 +504,18 @@ for (;;)
if (cc > cs && cc < ce) /* Simple recursion */
had_recurse = TRUE;
else
- {
- recurse_check *r = recurses;
+ {
+ recurse_check *r = recurses;
for (r = recurses; r != NULL; r = r->prev) if (r->group == cs) break;
if (r != NULL) /* Mutual recursion */
- had_recurse = TRUE;
+ had_recurse = TRUE;
else
{
this_recurse.prev = recurses;
- this_recurse.group = cs;
+ this_recurse.group = cs;
branchlength += find_minlength(re, cs, startcode, options,
&this_recurse);
- }
+ }
}
cc += 1 + LINK_SIZE;
break;
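Reviewer note on the pcre_study.c hunks above: the mutual recursion check walks a chain of recurse_check frames linked through prev, and treats a call to any group already on the chain as contributing zero length. The following is a minimal sketch of that idea, not the PCRE code. The integer group ids, the calls[] table, and the helpers on_chain, min_length and min_length_from are all assumptions made for illustration; the real find_minlength() works on compiled pattern code.

```c
#include <assert.h>
#include <stddef.h>

/* One frame per group currently being measured, linked to its caller. */
typedef struct recurse_check {
  struct recurse_check *prev;  /* enclosing frame, or NULL at the outermost level */
  int group;                   /* hypothetical group id (the real code stores a code pointer) */
} recurse_check;

/* Return 1 if `group` is already on the chain, 0 otherwise. */
static int on_chain(const recurse_check *recurses, int group)
{
  const recurse_check *r;
  for (r = recurses; r != NULL; r = r->prev)
    if (r->group == group) return 1;
  return 0;
}

/* Toy model (an assumption of this sketch): every group matches one
   character of its own and then optionally calls calls[group], where a
   negative entry means "no call". */
static int min_length(const int *calls, int ngroups, int group,
                      recurse_check *recurses)
{
  recurse_check this_recurse;
  int target;

  assert(group >= 0 && group < ngroups);
  target = calls[group];
  if (target < 0) return 1;                  /* no subroutine call */
  if (on_chain(recurses, target)) return 1;  /* recursion: the call adds 0 */

  /* Push the called group before descending, as the hunks above do. */
  this_recurse.prev = recurses;
  this_recurse.group = target;
  return 1 + min_length(calls, ngroups, target, &this_recurse);
}

/* Entry point: the starting group is placed on the chain first. */
static int min_length_from(const int *calls, int ngroups, int start)
{
  recurse_check top;
  top.prev = NULL;
  top.group = start;
  return min_length(calls, ngroups, start, &top);
}
```

In this toy model, two groups that call each other terminate after one descent instead of recursing without bound, which is the property the chain is there to guarantee.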
diff --git a/pcregrep.c b/pcregrep.c
index b1af129..c49a74f 100644
--- a/pcregrep.c
+++ b/pcregrep.c
@@ -1582,14 +1582,14 @@ while (ptr < endptr)
int endlinelength;
int mrc = 0;
int startoffset = 0;
- int prevoffsets[2];
+ int prevoffsets[2];
unsigned int options = 0;
BOOL match;
char *matchptr = ptr;
char *t = ptr;
size_t length, linelength;
-
- prevoffsets[0] = prevoffsets[1] = -1;
+
+ prevoffsets[0] = prevoffsets[1] = -1;
/* At this point, ptr is at the start of a line. We need to find the length
of the subject string to pass to pcre_exec(). In multiline mode, it is the
@@ -1733,42 +1733,42 @@ while (ptr < endptr)
if (!invert)
{
int oldstartoffset = startoffset;
-
- /* It is possible, when a lookbehind assertion contains \K, for the
- same string to be found again. The code below advances startoffset, but
+
+ /* It is possible, when a lookbehind assertion contains \K, for the
+ same string to be found again. The code below advances startoffset, but
until it is past the "bumpalong" offset that gave the match, the same
substring will be returned. The PCRE1 library does not return the
bumpalong offset, so all we can do is ignore repeated strings. (PCRE2
does this better.) */
-
+
if (prevoffsets[0] != offsets[0] || prevoffsets[1] != offsets[1])
{
prevoffsets[0] = offsets[0];
- prevoffsets[1] = offsets[1];
-
+ prevoffsets[1] = offsets[1];
+
if (printname != NULL) fprintf(stdout, "%s:", printname);
if (number) fprintf(stdout, "%d:", linenumber);
-
+
/* Handle --line-offsets */
-
+
if (line_offsets)
fprintf(stdout, "%d,%d\n", (int)(matchptr + offsets[0] - ptr),
offsets[1] - offsets[0]);
-
+
/* Handle --file-offsets */
-
+
else if (file_offsets)
fprintf(stdout, "%d,%d\n",
(int)(filepos + matchptr + offsets[0] - ptr),
offsets[1] - offsets[0]);
-
+
/* Handle --only-matching, which may occur many times */
-
+
else
{
BOOL printed = FALSE;
omstr *om;
-
+
for (om = only_matching; om != NULL; om = om->next)
{
int n = om->groupnum;
@@ -1785,19 +1785,19 @@ while (ptr < endptr)
}
}
}
-
+
if (printed || printname != NULL || number) fprintf(stdout, "\n");
}
- }
-
- /* Prepare to repeat to find the next match. If the patterned contained
- a lookbehind tht included \K, it is possible that the end of the match
- might be at or before the actual strting offset we have just used. We
- need to start one character further on. Unfortunately, for unanchored
- patterns, the actual start offset can be greater that the one that was
- set as a result of "bumpalong". PCRE1 does not return the actual start
- offset, so we have to check against the original start offset. This may
- lead to duplicates - we we need the fudge above to avoid printing them.
+ }
+
+ /* Prepare to repeat to find the next match. If the patterned contained
+ a lookbehind tht included \K, it is possible that the end of the match
+ might be at or before the actual strting offset we have just used. We
+ need to start one character further on. Unfortunately, for unanchored
+ patterns, the actual start offset can be greater that the one that was
+ set as a result of "bumpalong". PCRE1 does not return the actual start
+ offset, so we have to check against the original start offset. This may
+ lead to duplicates - we we need the fudge above to avoid printing them.
(PCRE2 does this better.) */
match = FALSE;
@@ -1806,12 +1806,12 @@ while (ptr < endptr)
startoffset = offsets[1]; /* Restart after the match */
if (startoffset <= oldstartoffset)
{
- if ((size_t)startoffset >= length)
+ if ((size_t)startoffset >= length)
goto END_ONE_MATCH; /* We were at the end */
startoffset = oldstartoffset + 1;
if (utf8)
- while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
- }
+ while ((matchptr[startoffset] & 0xc0) == 0x80) startoffset++;
+ }
goto ONLY_MATCHING_RESTART;
}
}
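Reviewer note on the pcregrep.c hunks above: when a lookbehind contains \K, the end of a match can fall at or before the offset the search started from, so the restart offset has to be forced forward by one character. The sketch below isolates that step under stated assumptions. The function name next_start_offset, its parameters, and the -1 return value are hypothetical, and the bounds test inside the loop is an addition of this sketch, not something the diff shows.

```c
#include <stddef.h>

/* Compute where the next match attempt should start.
   subject, length   the line being searched
   oldstartoffset    the start offset used for the match just found
   match_end         the end offset of that match (offsets[1] in the diff)
   utf8              non-zero when the subject is UTF-8
   Returns the new start offset, or -1 when there is nothing left to scan. */
static int next_start_offset(const char *subject, size_t length,
                             int oldstartoffset, int match_end, int utf8)
{
  int startoffset = match_end;          /* normally restart after the match */

  if (startoffset <= oldstartoffset)    /* no forward progress was made */
  {
    if ((size_t)startoffset >= length)
      return -1;                        /* already at the end of the subject */
    startoffset = oldstartoffset + 1;   /* step one code unit further on */
    if (utf8)
    {
      /* Skip UTF-8 continuation bytes (10xxxxxx) so that the offset lands
         on a character boundary. The length test is this sketch's own guard. */
      while ((size_t)startoffset < length &&
             ((unsigned char)subject[startoffset] & 0xc0) == 0x80)
        startoffset++;
    }
  }
  return startoffset;
}
```

Because the forced restart can return the same substring again, the commit pairs this step with the prevoffsets comparison, which suppresses a match whose offsets equal those of the previous one.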
diff --git a/pcretest.c b/pcretest.c
index 725c847..27107ca 100644
--- a/pcretest.c
+++ b/pcretest.c
@@ -2258,7 +2258,7 @@ if (callout_extra)
cb->callout_number, cb->capture_last);
if (cb->offset_vector != NULL)
- {
+ {
for (i = 0; i < cb->capture_top * 2; i += 2)
{
if (cb->offset_vector[i] < 0)
@@ -2271,7 +2271,7 @@ if (callout_extra)
fprintf(f, "\n");
}
}
- }
+ }
}
/* Re-print the subject in canonical form, the first time or if giving full
@@ -5605,12 +5605,12 @@ while (!done)
/* If not /g or /G we are done */
if (!do_g && !do_G) break;
-
+
if (use_offsets == NULL)
{
fprintf(outfile, "Cannot do global matching without an ovector\n");
break;
- }
+ }
/* If we have matched an empty string, first check to see if we are at
the end of the subject. If so, the /g loop is over. Otherwise, mimic what
@@ -5627,21 +5627,21 @@ while (!done)
g_notempty = PCRE_NOTEMPTY_ATSTART | PCRE_ANCHORED;
}
- /* For /g, update the start offset, leaving the rest alone. There is a
- tricky case when \K is used in a positive lookbehind assertion. This can
- cause the end of the match to be less than or equal to the start offset.
- In this case we restart at one past the start offset. This may return the
- same match if the original start offset was bumped along during the
- match, but eventually the new start offset will hit the actual start
- offset. (In PCRE2 the true start offset is available, and this can be
- done better. It is not worth doing more than making sure we do not loop
+ /* For /g, update the start offset, leaving the rest alone. There is a
+ tricky case when \K is used in a positive lookbehind assertion. This can
+ cause the end of the match to be less than or equal to the start offset.
+ In this case we restart at one past the start offset. This may return the
+ same match if the original start offset was bumped along during the
+ match, but eventually the new start offset will hit the actual start
+ offset. (In PCRE2 the true start offset is available, and this can be
+ done better. It is not worth doing more than making sure we do not loop
at this stage in the life of PCRE1.) */
- if (do_g)
+ if (do_g)
{
if (g_notempty == 0 && use_offsets[1] <= start_offset)
{
- if (start_offset >= len) break; /* End of subject */
+ if (start_offset >= len) break; /* End of subject */
start_offset++;
if (use_utf)
{
@@ -5651,9 +5651,9 @@ while (!done)
start_offset++;
}
}
- }
+ }
else start_offset = use_offsets[1];
- }
+ }
/* For /G, update the pointer and length */
@@ -5668,7 +5668,7 @@ while (!done)
} /* End of loop for data lines */
CONTINUE:
-
+
#if !defined NOPOSIX
if ((posix || do_posix) && preg.re_pcre != 0) regfree(&preg);
#endif
diff --git a/testdata/testinput4 b/testdata/testinput4
index f139c62..8bdbdac 100644
--- a/testdata/testinput4
+++ b/testdata/testinput4
@@ -724,9 +724,6 @@
"[\S\V\H]"8
-/\C\X*QT/8
- Ӆ\x0aT
-
/\C(\W?ſ)'?{{/8
\\C(\\W?ſ)'?{{
diff --git a/testdata/testinput6 b/testdata/testinput6
index 7c492ad..02cef0d 100644
--- a/testdata/testinput6
+++ b/testdata/testinput6
@@ -1499,4 +1499,7 @@
/[A-`]/i8
abcdefghijklmno
+/\C\X*QT/8
+ Ӆ\x0aT
+
/-- End of testinput6 --/
diff --git a/testdata/testoutput4 b/testdata/testoutput4
index ffd02ea..d43c123 100644
--- a/testdata/testoutput4
+++ b/testdata/testoutput4
@@ -1273,10 +1273,6 @@ No match
"[\S\V\H]"8
-/\C\X*QT/8
- Ӆ\x0aT
-No match
-
/\C(\W?ſ)'?{{/8
\\C(\\W?ſ)'?{{
No match
diff --git a/testdata/testoutput6 b/testdata/testoutput6
index a5de32a..3f035b8 100644
--- a/testdata/testoutput6
+++ b/testdata/testoutput6
@@ -2465,4 +2465,8 @@ No match
abcdefghijklmno
0: a
+/\C\X*QT/8
+ Ӆ\x0aT
+No match
+
/-- End of testinput6 --/