-rw-r--r--  ChangeLog               28
-rw-r--r--  Makefile.am             11
-rw-r--r--  NEWS                     6
-rw-r--r--  NON-UNIX-USE            22
-rwxr-xr-x  PrepareRelease           6
-rw-r--r--  README                  68
-rwxr-xr-x  RunGrepTest              5
-rwxr-xr-x  RunTest                 40
-rw-r--r--  configure.ac             2
-rw-r--r--  pcre_compile.c          10
-rw-r--r--  pcre_exec.c             22
-rw-r--r--  pcre_fullinfo.c          4
-rw-r--r--  pcre_printint.c          2
-rw-r--r--  pcretest.c              55
-rw-r--r--  testdata/testinput14     2
-rw-r--r--  testdata/testinput17     6
-rw-r--r--  testdata/testinput18     4
-rw-r--r--  testdata/testoutput14    4
-rw-r--r--  testdata/testoutput17   16
-rw-r--r--  testdata/testoutput18   12
20 files changed, 173 insertions, 152 deletions
diff --git a/ChangeLog b/ChangeLog
index 14ca682..e2d36bc 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -6,30 +6,30 @@ Version 8.30
1. Renamed "isnumber" as "is_a_number" because in some Mac environments this
name is defined in ctype.h.
-
+
2. Fixed a bug in the code for calculating the fixed length of lookbehind
assertions.
-
+
3. Removed the function pcre_info(), which has been obsolete and deprecated
- since it was replaced by pcre_fullinfo() in February 2000.
-
+ since it was replaced by pcre_fullinfo() in February 2000.
+
4. For a non-anchored pattern, if (*SKIP) was given with a name that did not
- match a (*MARK), and the match failed at the start of the subject, a
- reference to memory before the start of the subject could occur. This bug
+ match a (*MARK), and the match failed at the start of the subject, a
+ reference to memory before the start of the subject could occur. This bug
was introduced by fix 17 of release 8.21.
-
+
5. A reference to an unset group with zero minimum repetition was giving
totally wrong answers (in non-JavaScript-compatibility mode). For example,
- /(another)?(\1?)test/ matched against "hello world test". This bug was
+ /(another)?(\1?)test/ matched against "hello world test". This bug was
introduced in release 8.13.
-
-6. Add support for 16-bit character strings (a large amount of work involving
- many changes and refactorings).
-
+
+6. Add support for 16-bit character strings (a large amount of work involving
+ many changes and refactorings).
+
7. RunGrepTest failed on msys because \r\n was replaced by whitespace when the
- command "pattern=`printf 'xxx\r\njkl'`" was run. The pattern is now taken
+ command "pattern=`printf 'xxx\r\njkl'`" was run. The pattern is now taken
from a file.
-
+
Version 8.21 12-Dec-2011
------------------------
diff --git a/Makefile.am b/Makefile.am
index 4598ca6..2a8e972 100644
--- a/Makefile.am
+++ b/Makefile.am
@@ -362,6 +362,13 @@ EXTRA_DIST += \
testdata/grepoutput \
testdata/grepoutput8 \
testdata/grepoutputN \
+ testdata/greppatN4 \
+ testdata/saved16 \
+ testdata/saved16BE-1 \
+ testdata/saved16BE-2 \
+ testdata/saved16LE-1 \
+ testdata/saved16LE-2 \
+ testdata/saved8 \
testdata/testinput1 \
testdata/testinput2 \
testdata/testinput3 \
@@ -381,6 +388,7 @@ EXTRA_DIST += \
testdata/testinput17 \
testdata/testinput18 \
testdata/testinput19 \
+ testdata/testinput20 \
testdata/testoutput1 \
testdata/testoutput2 \
testdata/testoutput3 \
@@ -391,8 +399,8 @@ EXTRA_DIST += \
testdata/testoutput8 \
testdata/testoutput9 \
testdata/testoutput10 \
- testdata/testoutput11-8 \
testdata/testoutput11-16 \
+ testdata/testoutput11-8 \
testdata/testoutput12 \
testdata/testoutput13 \
testdata/testoutput14 \
@@ -401,6 +409,7 @@ EXTRA_DIST += \
testdata/testoutput17 \
testdata/testoutput18 \
testdata/testoutput19 \
+ testdata/testoutput20 \
testdata/wintestinput3 \
testdata/wintestoutput3 \
perltest.pl
diff --git a/NEWS b/NEWS
index 1d8d1a9..b792f87 100644
--- a/NEWS
+++ b/NEWS
@@ -4,9 +4,9 @@ News about PCRE releases
Release 8.30
------------
-Release 8.30 introduces a major new feature: support for 16-bit character
-strings, compiled as a separate library. There are no new features in the 8-bit
-library, but some bugs have been mended. However, note that the pcre_info()
+Release 8.30 introduces a major new feature: support for 16-bit character
+strings, compiled as a separate library. There are no new features in the 8-bit
+library, but some bugs have been mended. However, note that the pcre_info()
function, which has been obsolete for over 10 years, has been removed.
diff --git a/NON-UNIX-USE b/NON-UNIX-USE
index fef073b..26ed0ee 100644
--- a/NON-UNIX-USE
+++ b/NON-UNIX-USE
@@ -106,7 +106,7 @@ hand":
pcre_newline.c
pcre_ord2utf8.c
pcre_refcount.c
- pcre_string_utils.c
+ pcre_string_utils.c
pcre_study.c
pcre_tables.c
pcre_ucd.c
@@ -118,7 +118,7 @@ hand":
an unusual compiler) so that all included PCRE header files are first
sought in the current directory. Otherwise you run the risk of picking up
a previously-installed file from somewhere else.
-
+
(6) If you have defined SUPPORT_JIT in config.h, you must also compile
pcre_jit_compile.c
@@ -130,8 +130,8 @@ hand":
your system keeps such libraries. This is the basic PCRE C 8-bit library.
If your system has static and shared libraries, you may have to do this
once for each type.
-
- (8) If you want to build a 16-bit library (as well as, or instead of the 8-bit
+
+ (8) If you want to build a 16-bit library (as well as, or instead of the 8-bit
library) repeat steps 5-7 with the following files:
pcre16_byte_order.c
@@ -148,16 +148,16 @@ hand":
pcre16_newline.c
pcre16_ord2utf16.c
pcre16_refcount.c
- pcre16_string_utils.c
+ pcre16_string_utils.c
pcre16_study.c
pcre16_tables.c
pcre16_ucd.c
- pcre16_utf16_utils.c
+ pcre16_utf16_utils.c
pcre16_valid_utf16.c
pcre16_version.c
pcre16_xclass.c
- (9) If you want to build the POSIX wrapper functions (which apply only to the
+ (9) If you want to build the POSIX wrapper functions (which apply only to the
8-bit library), ensure that you have the pcreposix.h file and then compile
pcreposix.c (remembering -DHAVE_CONFIG_H if necessary). Link the result
(on its own) as the pcreposix library.
@@ -170,10 +170,10 @@ hand":
you compiled it with -DNOPOSIX.
(11) Run pcretest on the testinput files in the testdata directory, and check
- that the output matches the corresponding testoutput files. If you
- compiled both an 8-bit and a 16-bit library, you need to run pcretest with
+ that the output matches the corresponding testoutput files. If you
+ compiled both an 8-bit and a 16-bit library, you need to run pcretest with
the -16 option to do 16-bit tests.
-
+
Some tests are relevant only when certain build-time options are selected.
For example, test 4 is for UTF-8 or UTF-16 support, and will not run if
you have built PCRE without it. See the comments at the start of each
@@ -199,7 +199,7 @@ hand":
THE C++ WRAPPER FUNCTIONS
-The PCRE distribution also contains some C++ wrapper functions and tests,
+The PCRE distribution also contains some C++ wrapper functions and tests,
applicable to the 8-bit library, which were contributed by Google Inc. On a
system that can use "configure" and "make", the functions are automatically
built into a library called pcrecpp. It should be straightforward to compile
diff --git a/PrepareRelease b/PrepareRelease
index 4f57cfa..40d25ab 100755
--- a/PrepareRelease
+++ b/PrepareRelease
@@ -209,10 +209,10 @@ files="\
pcre_maketables.c \
pcre_newline.c \
pcre_ord2utf8.c \
- pcre_ord2utf16.c \
+ pcre16_ord2utf16.c \
pcre_printint.c \
pcre_refcount.c \
- pcre_stringutils.c \
+ pcre_string_utils.c \
pcre_study.c \
pcre_tables.c \
pcre_ucp_searchfuncs.c \
@@ -220,7 +220,7 @@ files="\
pcre_version.c \
pcre_xclass.c \
pcre16_utf16_utils.c \
- pcre16_valid_utf16.c \
+ pcre16_valid_utf16.c \
pcre_scanner.cc \
pcre_scanner.h \
pcre_scanner_unittest.cc \
diff --git a/README b/README
index f75cfa0..1a72ead 100644
--- a/README
+++ b/README
@@ -34,14 +34,14 @@ The contents of this README file are:
The PCRE APIs
-------------
-PCRE is written in C, and it has its own API. There are two sets of functions,
+PCRE is written in C, and it has its own API. There are two sets of functions,
one for the 8-bit library, which processes strings of bytes, and one for the
16-bit library, which processes strings of 16-bit values. The distribution also
includes a set of C++ wrapper functions (see the pcrecpp man page for details),
courtesy of Google Inc., which can be used to call the 8-bit PCRE library from
C++.
-In addition, there is a set of C wrapper functions (again, just for the 8-bit
+In addition, there is a set of C wrapper functions (again, just for the 8-bit
library) that are based on the POSIX regular expression API (see the pcreposix
man page). These end up in the library called libpcreposix. Note that this just
provides a POSIX calling interface to PCRE; the regular expressions themselves
@@ -171,12 +171,12 @@ library. They are also documented in the pcrebuild man page.
--disable-static
(See also "Shared libraries on Unix-like systems" below.)
-
-. By default, only the 8-bit library is built. If you add --enable-pcre16 to
- the "configure" command, the 16-bit library is also built. If you want only
+
+. By default, only the 8-bit library is built. If you add --enable-pcre16 to
+ the "configure" command, the 16-bit library is also built. If you want only
the 16-bit library, use "./configure --enable-pcre16 --disable-pcre8".
-. If you are building the 8-bit library and want to suppress the building of
+. If you are building the 8-bit library and want to suppress the building of
the C++ wrapper library, you can add --disable-cpp to the "configure"
command. Otherwise, when "configure" is run without --disable-pcre8, it will
try to find a C++ compiler and C++ header files, and if it succeeds, it will
@@ -200,13 +200,13 @@ library. They are also documented in the pcrebuild man page.
can only either be ASCII or UTF-8/16, even when running on EBCDIC platforms.
It is not possible to use both --enable-utf and --enable-ebcdic at the same
time.
-
-. The option --enable-utf8 is retained for backwards compatibility with earlier
- releases that did not support 16-bit character strings. It is synonymous with
- --enable-utf. It is not possible to configure one library with UTF support
- and the other without in the same configuration.
-. If, in addition to support for UTF-8/16 character strings, you want to
+. The option --enable-utf8 is retained for backwards compatibility with earlier
+ releases that did not support 16-bit character strings. It is synonymous with
+ --enable-utf. It is not possible to configure one library with UTF support
+ and the other without in the same configuration.
+
+. If, in addition to support for UTF-8/16 character strings, you want to
include support for the \P, \p, and \X sequences that recognize Unicode
character properties, you must add --enable-unicode-properties to the
"configure" command. This adds about 30K to the size of the library (in the
@@ -264,10 +264,10 @@ library. They are also documented in the pcrebuild man page.
sizes in the pcrestack man page.
. The default maximum compiled pattern size is around 64K. You can increase
- this by adding --with-link-size=3 to the "configure" command. In the 8-bit
- library, PCRE then uses three bytes instead of two for offsets to different
+ this by adding --with-link-size=3 to the "configure" command. In the 8-bit
+ library, PCRE then uses three bytes instead of two for offsets to different
parts of the compiled pattern. In the 16-bit library, --with-link-size=3 is
- the same as --with-link-size=4, which (in both libraries) uses four-byte
+ the same as --with-link-size=4, which (in both libraries) uses four-byte
offsets. Increasing the internal link size reduces performance.
. You can build PCRE so that its internal match() function that is called from
@@ -305,7 +305,7 @@ library. They are also documented in the pcrebuild man page.
when PCRE is built this way, it always operates in EBCDIC. It cannot support
both EBCDIC and UTF-8/16.
-. The pcregrep program currently supports only 8-bit data files, and so
+. The pcregrep program currently supports only 8-bit data files, and so
requires the 8-bit PCRE library. It is possible to compile pcregrep to use
libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by
specifying one or both of
@@ -397,13 +397,13 @@ system. The following are installed (file names are all relative to the
pcre-config
Libraries (lib):
- libpcre16 (if 16-bit support is enabled)
+ libpcre16 (if 16-bit support is enabled)
libpcre (if 8-bit support is enabled)
libpcreposix (if 8-bit support is enabled)
libpcrecpp (if 8-bit and C++ support is enabled)
Configuration information (lib/pkgconfig):
- libpcre16.pc
+ libpcre16.pc
libpcre.pc
libpcreposix.pc
libpcrecpp.pc (if C++ support is enabled)
@@ -592,17 +592,17 @@ tests that are marked "never study" (see the pcretest program for how this is
done). If JIT support is available, the non-DFA tests are run a third time,
this time with a forced pcre_study() with the PCRE_STUDY_JIT_COMPILE option.
-When both 8-bit and 16-bit support is enabled, the entire set of tests is run
-twice, once for each library. If you want to run just one set of tests, call
+When both 8-bit and 16-bit support is enabled, the entire set of tests is run
+twice, once for each library. If you want to run just one set of tests, call
RunTest with either the -8 or -16 option.
-RunTest uses a file called testtry to hold the main output from pcretest
-(testsavedregex is also used as a working file). To run pcretest on just one or
-more specific test files, give their numbers as arguments to RunTest, for
-example:
+RunTest uses a file called testtry to hold the main output from pcretest.
+Other files whose names begin with "test" are used as working files in some
+tests. To run pcretest on just one or more specific test files, give their
+numbers as arguments to RunTest, for example:
RunTest 2 7 11
-
+
The first test file can be fed directly into the perltest.pl script to check
that Perl gives the same results. The only difference you should see is in the
first few lines, where the Perl version is given instead of the PCRE version.
@@ -658,12 +658,12 @@ The twelfth test is run only when JIT support is available, and the thirteenth
test is run only when JIT support is not available. They test some JIT-specific
features such as information output from pcretest about JIT compilation.
-The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
-the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode.
-These are tests that generate different output in the two modes. They are for
+The fourteenth, fifteenth, and sixteenth tests are run only in 8-bit mode, and
+the seventeenth, eighteenth, and nineteenth tests are run only in 16-bit mode.
+These are tests that generate different output in the two modes. They are for
general cases, UTF-8/16 support, and Unicode property support, respectively.
-The twentieth test is run only in 16-bit mode. It tests some specific 16-bit
+The twentieth test is run only in 16-bit mode. It tests some specific 16-bit
features of the DFA matching engine.
@@ -724,8 +724,8 @@ will cause PCRE to malfunction.
File manifest
-------------
-The distribution should contain the files listed below. Where a file name is
-given as pcre[16]_xxx it means that there are two files, one with the name
+The distribution should contain the files listed below. Where a file name is
+given as pcre[16]_xxx it means that there are two files, one with the name
pcre_xxx and the other with the name pcre16_xxx.
(A) Source files of the PCRE library functions and their headers:
@@ -761,10 +761,10 @@ pcre_xxx and the other with the name pcre16_xxx.
pcre16_ord2utf16.c )
pcre16_utf16_utils.c )
pcre16_valid_utf16.c )
-
+
pcre[16]_printint.c ) debugging function that is used by pcretest,
) and can also be #included in pcre_compile()
-
+
pcre.h.in template for pcre.h when built by "configure"
pcreposix.h header for the external POSIX wrapper API
pcre_internal.h header for internal use
@@ -843,7 +843,7 @@ pcre_xxx and the other with the name pcre16_xxx.
testdata/testinput* test data for main library tests
testdata/testoutput* expected test results
testdata/grep* input and output for pcregrep tests
- testdata/* other supporting test files
+ testdata/* other supporting test files
(D) Auxiliary files for cmake support
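
Note on the README hunk about --with-link-size: the text says that the 8-bit library uses two bytes for offsets by default (hence the limit of around 64K) and three bytes with --with-link-size=3. The following is a minimal sketch of why the offset width caps the pattern size. The helper names put_link, get_link, and max_link are hypothetical, and the most-significant-byte-first layout is an assumption made for illustration; this is not PCRE's own code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a PCRE macro): store an offset in link_size
   bytes, most significant byte first. Bits that do not fit are lost. */
void put_link(uint8_t *code, size_t pos, int link_size, uint32_t value)
{
  for (int i = link_size - 1; i >= 0; i--)
    {
    code[pos + i] = (uint8_t)(value & 0xff);
    value >>= 8;
    }
}

/* Hypothetical helper: read back an offset stored by put_link(). */
uint32_t get_link(const uint8_t *code, size_t pos, int link_size)
{
  uint32_t value = 0;
  for (int i = 0; i < link_size; i++)
    value = (value << 8) | code[pos + i];
  return value;
}

/* Largest offset representable in link_size bytes (2, 3, or 4). */
uint32_t max_link(int link_size)
{
  return (link_size >= 4) ? UINT32_MAX
                          : (((uint32_t)1 << (8 * link_size)) - 1);
}
```

With two bytes the largest offset is 65535, which matches the "around 64K" default in the README; three bytes raise it to 16777215. An offset of 70000 written with a link size of 2 does not survive the round trip, whereas it does with a link size of 3.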
diff --git a/RunGrepTest b/RunGrepTest
index 79c9fe6..fecc06e 100755
--- a/RunGrepTest
+++ b/RunGrepTest
@@ -9,9 +9,10 @@
LC_ALL=C
export LC_ALL
-# Remove any non-default colouring that the caller may have set.
+# Remove any non-default colouring and aliases that the caller may have set.
unset PCREGREP_COLOUR PCREGREP_COLOR
+unset cp ls mv rm
# Set the program to be tested, and valgrind settings when requested.
@@ -454,7 +455,7 @@ pattern=`printf 'def\rjkl'`
$valgrind $pcregrep -n --newline=cr -F "$pattern" testNinput >>testtry
printf "%c--------------------------- Test N4 ------------------------------\r\n" - >>testtry
-$valgrind $pcregrep -n --newline=crlf -F -f testdata/greppatN4 testNinput >>testtry
+$valgrind $pcregrep -n --newline=crlf -F -f $srcdir/testdata/greppatN4 testNinput >>testtry
printf "%c--------------------------- Test N5 ------------------------------\r\n" - >>testtry
$valgrind $pcregrep -n --newline=any "^(abc|def|ghi|jkl)" testNinput >>testtry
diff --git a/RunTest b/RunTest
index d595549..e91b027 100755
--- a/RunTest
+++ b/RunTest
@@ -18,7 +18,7 @@
# two tests for JIT-specific features, one to be run when JIT support is
# available, and one when it is not.
-# Whichever of the 8-bit and 16-bit libraries exist are tested. It is also
+# Whichever of the 8-bit and 16-bit libraries exist are tested. It is also
# possible to select which to test by the arguments -8 or -16.
# Other arguments for this script can be individual test numbers, or the word
@@ -32,6 +32,9 @@ sim=
arg8=
arg16=
+# This is in case the caller has set aliases (as I do - PH)
+unset cp ls mv rm
+
# Select which tests to run; for those that are explicitly requested, check
# that the necessary optional facilities are available.
@@ -77,7 +80,7 @@ while [ $# -gt 0 ] ; do
17) do17=yes;;
18) do18=yes;;
19) do19=yes;;
- 20) do20=yes;;
+ 20) do20=yes;;
-8) arg8=yes;;
-16) arg16=yes;;
valgrind) valgrind="valgrind -q --smc-check=all";;
@@ -190,7 +193,7 @@ if [ $utf -eq 0 ] ; then
fi
if [ $do18 = yes ] ; then
echo "Can't run test 18 because UTF support is not configured"
- fi
+ fi
fi
if [ $ucp -eq 0 ] ; then
@@ -262,7 +265,7 @@ if [ $do1 = no -a $do2 = no -a $do3 = no -a $do4 = no -a \
do17=yes
do18=yes
do19=yes
- do20=yes
+ do20=yes
fi
# Show which release and which test data
@@ -277,8 +280,8 @@ for bmode in "$test8" "$test16"; do
-16) if [ "$test8" != "skip" ] ; then echo ""; fi
bits=16; echo "---- Testing 16-bit library ----"; echo "";;
*) bits=8; echo "---- Testing 8-bit library ----"; echo "";;
- esac
-
+ esac
+
# Primary test, compatible with JIT and all versions of Perl >= 5.8
if [ $do1 = yes ] ; then
@@ -581,9 +584,8 @@ if [ "$do14" = yes ] ; then
echo "Test 14: specials for the basic 8-bit library"
if [ "$bits" = "16" ] ; then
echo " Skipped when running 16-bit tests"
- elif [ $utf -eq 0 ] ; then
- echo " Skipped because UTF-$bits support is not available"
- else
+ else
+ cp -f $testdata/saved16 testsaved16
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput14 testtry
if [ $? = 0 ] ; then
@@ -607,7 +609,7 @@ if [ "$do15" = yes ] ; then
echo " Skipped when running 16-bit tests"
elif [ $utf -eq 0 ] ; then
echo " Skipped because UTF-$bits support is not available"
- else
+ else
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput15 testtry
if [ $? = 0 ] ; then
@@ -631,7 +633,7 @@ if [ $do16 = yes ] ; then
echo " Skipped when running 16-bit tests"
elif [ $ucp -eq 0 ] ; then
echo " Skipped because Unicode property support is not available"
- else
+ else
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput16 testtry
if [ $? = 0 ] ; then
@@ -653,7 +655,10 @@ if [ $do17 = yes ] ; then
echo "Test 17: specials for the basic 16-bit library"
if [ "$bits" = "8" ] ; then
echo " Skipped when running 8-bit tests"
- else
+ else
+ cp -f $testdata/saved8 testsaved8
+ cp -f $testdata/saved16LE-1 testsaved16LE-1
+ cp -f $testdata/saved16BE-1 testsaved16BE-1
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput17 testtry
if [ $? = 0 ] ; then
@@ -677,7 +682,9 @@ if [ $do18 = yes ] ; then
echo " Skipped when running 8-bit tests"
elif [ $utf -eq 0 ] ; then
echo " Skipped because UTF-$bits support is not available"
- else
+ else
+ cp -f $testdata/saved16LE-2 testsaved16LE-2
+ cp -f $testdata/saved16BE-2 testsaved16BE-2
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput18 testtry
if [ $? = 0 ] ; then
@@ -701,7 +708,7 @@ if [ $do19 = yes ] ; then
echo " Skipped when running 8-bit tests"
elif [ $ucp -eq 0 ] ; then
echo " Skipped because Unicode property support is not available"
- else
+ else
for opt in "" "-s" $jitopt; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput19 testtry
if [ $? = 0 ] ; then
@@ -723,7 +730,7 @@ if [ $do20 = yes ] ; then
echo "Test 20: DFA specials for the basic 16-bit library"
if [ "$bits" = "8" ] ; then
echo " Skipped when running 8-bit tests"
- else
+ else
for opt in "" "-s"; do
$sim $valgrind ./pcretest -q $bmode $opt $testdata/testinput20 testtry
if [ $? = 0 ] ; then
@@ -741,4 +748,7 @@ fi
# End of loop for 8-bit/16-bit tests
done
+# Clean up local working files
+rm -f test3input test3output testNinput testsaved* teststderr teststdout testtry
+
# End
diff --git a/configure.ac b/configure.ac
index f2a6b54..d1f98b4 100644
--- a/configure.ac
+++ b/configure.ac
@@ -741,7 +741,7 @@ AC_SUBST(EXTRA_LIBPCREPOSIX_LDFLAGS)
AC_SUBST(EXTRA_LIBPCRECPP_LDFLAGS)
# When we run 'make distcheck', use these arguments.
-DISTCHECK_CONFIGURE_FLAGS="--enable-jit --enable-cpp --enable-unicode-properties"
+DISTCHECK_CONFIGURE_FLAGS="--enable-pcre16 --enable-jit --enable-cpp --enable-unicode-properties"
AC_SUBST(DISTCHECK_CONFIGURE_FLAGS)
# Check that, if --enable-pcregrep-libz or --enable-pcregrep-libbz2 is
diff --git a/pcre_compile.c b/pcre_compile.c
index 78398fd..2b076cd 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -55,7 +55,7 @@ supporting internal functions that are not used by other modules. */
/* When PCRE_DEBUG is defined, we need the pcre(16)_printint() function, which
is also used by pcretest. PCRE_DEBUG is not defined when building a production
-library. We do not need to select pcre16_printint.c specially, because the
+library. We do not need to select pcre16_printint.c specially, because the
COMPILE_PCREx macro will already be appropriately set. */
#ifdef PCRE_DEBUG
@@ -1708,7 +1708,7 @@ for (;;)
int d;
pcre_uchar *ce, *cs;
register int op = *cc;
-
+
switch (op)
{
/* We only need to continue for OP_CBRA (normal capturing bracket) and
@@ -1769,7 +1769,7 @@ for (;;)
case OP_ASSERTBACK_NOT:
do cc += GET(cc, 1); while (*cc == OP_ALT);
cc += PRIV(OP_lengths)[*cc];
- break;
+ break;
/* Skip over things that don't match chars */
@@ -3526,7 +3526,7 @@ for (;; ptr++)
*lengthptr += (int)(code - last_code);
DPRINTF(("length=%d added %d c=%c (0x%x)\n", *lengthptr,
(int)(code - last_code), c, c));
-
+
/* If "previous" is set and it is not at the start of the work space, move
it back to there, in order to avoid filling up the work space. Otherwise,
if "previous" is NULL, reset the current code pointer to the start. */
@@ -4550,7 +4550,7 @@ for (;; ptr++)
#ifdef SUPPORT_UTF
#ifndef COMPILE_PCRE8
/* In non 8 bit mode, we can get here even if we are not in UTF mode. */
- if (!utf)
+ if (!utf)
*class_uchardata++ = c;
else
#endif
diff --git a/pcre_exec.c b/pcre_exec.c
index d89f36c..fb0822d 100644
--- a/pcre_exec.c
+++ b/pcre_exec.c
@@ -468,7 +468,7 @@ Returns: MATCH_MATCH if matched ) these values are >= 0
static int
match(REGISTER PCRE_PUCHAR eptr, REGISTER const pcre_uchar *ecode,
- PCRE_PUCHAR mstart, int offset_top, match_data *md, eptrblock *eptrb,
+ PCRE_PUCHAR mstart, int offset_top, match_data *md, eptrblock *eptrb,
unsigned int rdepth)
{
/* These variables do not need to be preserved over recursion in this function,
@@ -2631,8 +2631,8 @@ for (;;)
/* Handle repeated back references. If the length of the reference is
zero, just continue with the main loop. If the length is negative, it
- means the reference is unset in non-Java-compatible mode. If the minimum is
- zero, we can continue at the same level without recursion. For any other
+ means the reference is unset in non-Java-compatible mode. If the minimum is
+ zero, we can continue at the same level without recursion. For any other
minimum, carrying on will result in NOMATCH. */
if (length == 0) continue;
@@ -6030,10 +6030,10 @@ switch (frame->Xwhere)
LBL(53) LBL(54) LBL(55) LBL(56) LBL(57) LBL(58) LBL(63) LBL(64)
LBL(65) LBL(66)
#if defined SUPPORT_UTF || !defined COMPILE_PCRE8
- LBL(21)
+ LBL(21)
#endif
-#ifdef SUPPORT_UTF
- LBL(16) LBL(18) LBL(20)
+#ifdef SUPPORT_UTF
+ LBL(16) LBL(18) LBL(20)
LBL(22) LBL(23) LBL(28) LBL(30)
LBL(32) LBL(34) LBL(42) LBL(46)
#ifdef SUPPORT_UCP
@@ -6043,7 +6043,7 @@ switch (frame->Xwhere)
#endif /* SUPPORT_UTF */
default:
DPRINTF(("jump error in pcre match: label %d non-existent\n", frame->Xwhere));
-
+
printf("+++jump error in pcre match: label %d non-existent\n", frame->Xwhere);
return PCRE_ERROR_INTERNAL;
@@ -6209,7 +6209,7 @@ if (utf && (options & PCRE_NO_UTF8_CHECK) == 0)
#else
return (errorcode <= PCRE_UTF8_ERR5 && md->partial > 1)?
PCRE_ERROR_SHORTUTF8 : PCRE_ERROR_BADUTF8;
-#endif
+#endif
}
/* Check that a start_offset points to the start of a UTF character. */
@@ -6765,9 +6765,9 @@ for(;;)
/* If we have just passed a CR and we are now at a LF, and the pattern does
not contain any explicit matches for \r or \n, and the newline option is CRLF
- or ANY or ANYCRLF, advance the match position by one more character. In
- normal matching start_match will aways be greater than the first position at
- this stage, but a failed *SKIP can cause a return at the same point, which is
+ or ANY or ANYCRLF, advance the match position by one more character. In
+ normal matching start_match will aways be greater than the first position at
+ this stage, but a failed *SKIP can cause a return at the same point, which is
why the first test exists. */
if (start_match > (PCRE_PUCHAR)subject + start_offset &&
diff --git a/pcre_fullinfo.c b/pcre_fullinfo.c
index 58cdc7f..481a2c0 100644
--- a/pcre_fullinfo.c
+++ b/pcre_fullinfo.c
@@ -91,9 +91,9 @@ means that the pattern is likely compiled with different endianness. */
if (re->magic_number != MAGIC_NUMBER)
return re->magic_number == REVERSED_MAGIC_NUMBER?
PCRE_ERROR_BADENDIANNESS:PCRE_ERROR_BADMAGIC;
-
+
/* Check that this pattern was compiled in the correct bit mode */
-
+
if ((re->flags & PCRE_MODE) == 0) return PCRE_ERROR_BADMODE;
switch (what)
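
Note on the pcre_fullinfo.c hunk: the context lines distinguish a pattern saved with the opposite byte order (PCRE_ERROR_BADENDIANNESS) from one whose magic number is simply wrong (PCRE_ERROR_BADMAGIC). A minimal stand-alone sketch of that check follows. The constant values, the enum, and the function names are assumptions chosen for illustration and are not taken from this diff.

```c
#include <stdint.h>

/* Illustrative values, assumed for this sketch. */
#define SKETCH_MAGIC           0x50435245u   /* ASCII "PCRE" */
#define SKETCH_REVERSED_MAGIC  0x45524350u   /* same bytes, reversed */

/* Hypothetical result codes standing in for the PCRE_ERROR_xxx values. */
enum check_result { CHECK_OK = 0, CHECK_BAD_ENDIANNESS, CHECK_BAD_MAGIC };

/* Reverse the byte order of a 32-bit value. */
uint32_t swap32(uint32_t v)
{
  return (v >> 24) | ((v >> 8) & 0x0000ff00u) |
         ((v << 8) & 0x00ff0000u) | (v << 24);
}

/* Classify the magic number read from a saved pattern header. */
enum check_result check_magic(uint32_t stored)
{
  if (stored == SKETCH_MAGIC) return CHECK_OK;
  return (stored == SKETCH_REVERSED_MAGIC) ? CHECK_BAD_ENDIANNESS
                                           : CHECK_BAD_MAGIC;
}
```

The bit-mode test in the hunk (PCRE_ERROR_BADMODE) is a separate, later check on the flags word, so a pattern can pass the magic-number test and still be rejected for having been compiled by the other library.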
diff --git a/pcre_printint.c b/pcre_printint.c
index 7c29fc0..7d8f62d 100644
--- a/pcre_printint.c
+++ b/pcre_printint.c
@@ -478,7 +478,7 @@ for(;;)
if (PRINTABLE(c)) fprintf(f, " %s [^%c]", flag, c);
else if (utf || c > 0xff)
fprintf(f, " %s [^\\x{%02x}]", flag, c);
- else
+ else
fprintf(f, " %s [^\\x%02x]", flag, c);
break;
diff --git a/pcretest.c b/pcretest.c
index 6d254d3..0b5f4b2 100644
--- a/pcretest.c
+++ b/pcretest.c
@@ -1126,8 +1126,7 @@ for (j = i; j > 0; j--)
*utf8bytes = utf8_table2[i] | cvalue;
return i + 1;
}
-#endif /* NOUTF || SUPPORT_PCRE16 */
-
+#endif
#ifdef SUPPORT_PCRE16
@@ -1145,7 +1144,7 @@ Note that this function does not object to surrogate values. This is
deliberate; it makes it possible to construct UTF-16 strings that are invalid,
for the purpose of testing that they are correctly faulted.
-Patterns to be converted are either plain ASCII or UTF-8; data lines are always
+Patterns to be converted are either plain ASCII or UTF-8; data lines are always
in UTF-8 so that values greater than 255 can be handled.
Arguments:
@@ -1157,7 +1156,7 @@ Arguments:
Returns: number of 16-bit data items used (excluding trailing zero)
OR -1 if a UTF-8 string is malformed
OR -2 if a value > 0x10ffff is encountered
- OR -3 if a value > 0xffff is encountered when not in UTF mode
+ OR -3 if a value > 0xffff is encountered when not in UTF mode
*/
static int
@@ -2336,7 +2335,7 @@ while (argc > 1 && argv[op][0] == '-')
goto EXIT;
}
if (strcmp(argv[op + 1], "newline") == 0)
- {
+ {
(void)PCRE_CONFIG(PCRE_CONFIG_NEWLINE, &rc);
/* Note that these values are always the ASCII values, even
in EBCDIC environments. CR is 13 and NL is 10. */
@@ -2345,7 +2344,7 @@ while (argc > 1 && argv[op][0] == '-')
(rc == -2)? "ANYCRLF" :
(rc == -1)? "ANY" : "???");
goto EXIT;
- }
+ }
printf("Unknown -C option: %s\n", argv[op + 1]);
goto EXIT;
}
@@ -2869,11 +2868,11 @@ while (!done)
fprintf(outfile, "**Failed: character value greater than 0x10ffff "
"cannot be converted to UTF-16\n");
goto SKIP_DATA;
-
+
case -3: /* "Impossible error" when to16 is called arg1 FALSE */
fprintf(outfile, "**Failed: character value greater than 0xffff "
"cannot be converted to 16-bit in non-UTF mode\n");
- goto SKIP_DATA;
+ goto SKIP_DATA;
default:
break;
@@ -3386,23 +3385,23 @@ while (!done)
{
int i = 0;
int n = 0;
-
+
/* In UTF mode, input can be UTF-8, so just copy all non-backslash bytes.
In non-UTF mode, allow the value of the byte to fall through to later,
where values greater than 127 are turned into UTF-8 when running in
16-bit mode. */
-
+
if (c != '\\')
{
if (use_utf)
{
*q++ = c;
continue;
- }
- }
-
+ }
+ }
+
/* Handle backslash escapes */
-
+
else switch ((c = *p++))
{
case 'a': c = 7; break;
@@ -3442,10 +3441,10 @@ while (!done)
/* Not correct form for \x{...}; fall through */
}
- /* \x without {} always defines just one byte in 8-bit mode. This
- allows UTF-8 characters to be constructed byte by byte, and also allows
- invalid UTF-8 sequences to be made. Just copy the byte in UTF mode.
- Otherwise, pass it down to later code so that it can be turned into
+ /* \x without {} always defines just one byte in 8-bit mode. This
+ allows UTF-8 characters to be constructed byte by byte, and also allows
+ invalid UTF-8 sequences to be made. Just copy the byte in UTF mode.
+ Otherwise, pass it down to later code so that it can be turned into
UTF-8 when running in 16-bit mode. */
c = 0;
@@ -3455,10 +3454,10 @@ while (!done)
p++;
}
if (use_utf)
- {
+ {
*q++ = c;
- continue;
- }
+ continue;
+ }
break;
case 0: /* \ followed by EOF allows for an empty line */
@@ -3663,13 +3662,14 @@ while (!done)
continue;
}
- /* We now have a character value in c that may be greater than 255. In
- 16-bit mode, we always convert characters to UTF-8 so that values greater
+ /* We now have a character value in c that may be greater than 255. In
+ 16-bit mode, we always convert characters to UTF-8 so that values greater
than 255 can be passed to non-UTF 16-bit strings. In 8-bit mode we
- convert to UTF-8 if we are in UTF mode. Values greater than 127 in UTF
+ convert to UTF-8 if we are in UTF mode. Values greater than 127 in UTF
mode must have come from \x{...} or octal constructs because values from
\x.. get this far only in non-UTF mode. */
+#if !defined NOUTF || defined SUPPORT_PCRE16
if (use_pcre16 || use_utf)
{
pcre_uint8 buff8[8];
@@ -3678,6 +3678,7 @@ while (!done)
for (ii = 0; ii < utn; ii++) *q++ = buff8[ii];
}
else
+#endif
{
if (c > 255)
{
@@ -3689,9 +3690,9 @@ while (!done)
*q++ = c;
}
}
-
+
/* Reached end of subject string */
-
+
*q = 0;
len = (int)(q - dbuffer);
@@ -3793,7 +3794,7 @@ while (!done)
case -3:
fprintf(outfile, "**Failed: character value greater than 0xffff "
"cannot be converted to 16-bit in non-UTF mode\n");
- goto NEXT_DATA;
+ goto NEXT_DATA;
default:
break;
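
Note on the pcretest.c hunks: the comment for the 16-bit conversion function lists the return codes -2 (value greater than 0x10ffff) and -3 (value greater than 0xffff when not in UTF mode). The sketch below shows how a single character value might be written as 16-bit units under those rules. The function name put16 and its signature are hypothetical; the real function in pcretest.c converts whole UTF-8 strings and may order its checks differently. Like the original, this sketch does not reject surrogate values.

```c
#include <stdint.h>

/* Hypothetical helper: write character value c to out[] as 16-bit units.
   Returns the number of units written (1 or 2),
        or -2 if c is greater than 0x10ffff,
        or -3 if c is greater than 0xffff and utf is zero. */
int put16(uint32_t c, int utf, uint16_t *out)
{
  if (c > 0x10ffffu) return -2;
  if (c <= 0xffffu)
    {
    out[0] = (uint16_t)c;          /* fits in a single unit */
    return 1;
    }
  if (!utf) return -3;             /* no surrogate pairs outside UTF mode */
  c -= 0x10000u;
  out[0] = (uint16_t)(0xd800u | (c >> 10));     /* high surrogate */
  out[1] = (uint16_t)(0xdc00u | (c & 0x3ffu));  /* low surrogate */
  return 2;
}
```

For example, U+1F600 in UTF mode becomes the pair 0xd83d 0xde00, while the same value in non-UTF mode yields -3, which corresponds to the "cannot be converted to 16-bit in non-UTF mode" message in the hunks above.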
diff --git a/testdata/testinput14 b/testdata/testinput14
index 083e42e..8c5b940 100644
--- a/testdata/testinput14
+++ b/testdata/testinput14
@@ -283,7 +283,7 @@
\) )* # optional trailing comment
/xSI
-<!testdata/saved16
+<!testsaved16
/\h/SI
diff --git a/testdata/testinput17 b/testdata/testinput17
index e479f9d..2dbc741 100644
--- a/testdata/testinput17
+++ b/testdata/testinput17
@@ -213,7 +213,7 @@
\) )* # optional trailing comment
/xSI
-<!testdata/saved8
+<!testsaved8
/[\h]/BZ
>\x09<
@@ -276,8 +276,8 @@
/-- Generated from: ^[aL](?P<name>(?:[AaLl]+)[^xX-]*?)(?P<other>[\x{150}-\x{250}\x{300}]|[^\x{800}aAs-uS-U\x{d800}-\x{dfff}])++[^#\b\x{500}\x{1000}]{3,5}$ --/
-<!testdata/saved16LE-1
+<!testsaved16LE-1
-<!testdata/saved16BE-1
+<!testsaved16BE-1
/-- End of testinput17 --/
diff --git a/testdata/testinput18 b/testdata/testinput18
index 41b1e2d..f075d8a 100644
--- a/testdata/testinput18
+++ b/testdata/testinput18
@@ -240,8 +240,8 @@ correctly, but that messes up comparisons). --/
/-- Generated from: (?P<cbra1>[aZ\x{400}-\x{10ffff}]{4,}[\x{f123}\x{10039}\x{20000}-\x{21234}]?|[A-Cx-z\x{100000}-\x{1000a7}\x{101234}])(?<cb2>[^az]) --/
-<!testdata/saved16LE-2
+<!testsaved16LE-2
-<!testdata/saved16BE-2
+<!testsaved16BE-2
/-- End of testinput18 --/
diff --git a/testdata/testoutput14 b/testdata/testoutput14
index db1ea7e..643e899 100644
--- a/testdata/testoutput14
+++ b/testdata/testoutput14
@@ -355,8 +355,8 @@ Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f
-<!testdata/saved16
-Compiled pattern loaded from testdata/saved16
+<!testsaved16
+Compiled pattern loaded from testsaved16
No study data
Error -28 from pcre_fullinfo(0)
Running in 8-bit mode but pattern was compiled in 16-bit mode
diff --git a/testdata/testoutput17 b/testdata/testoutput17
index 1c90aa0..c356034 100644
--- a/testdata/testoutput17
+++ b/testdata/testoutput17
@@ -240,8 +240,8 @@ Starting byte set: \x09 \x20 ! " # $ % & ' ( * + - / 0 1 2 3 4 5 6 7 8
9 = ? A B C D E F G H I J K L M N O P Q R S T U V W X Y Z ^ _ ` a b c d e
f g h i j k l m n o p q r s t u v w x y z { | } ~ \x7f \xff
-<!testdata/saved8
-Compiled pattern loaded from testdata/saved8
+<!testsaved8
+Compiled pattern loaded from testsaved8
No study data
Error -28 from pcre16_fullinfo(0)
Running in 16-bit mode but pattern was compiled in 8-bit mode
@@ -456,9 +456,9 @@ Need char = \x{dd00}
/-- Generated from: ^[aL](?P<name>(?:[AaLl]+)[^xX-]*?)(?P<other>[\x{150}-\x{250}\x{300}]|[^\x{800}aAs-uS-U\x{d800}-\x{dfff}])++[^#\b\x{500}\x{1000}]{3,5}$ --/
-<!testdata/saved16LE-1
-Compiled pattern loaded from testdata/saved16LE-1
-Study data loaded from testdata/saved16LE-1
+<!testsaved16LE-1
+Compiled pattern loaded from testsaved16LE-1
+Study data loaded from testsaved16LE-1
------------------------------------------------------------------
0 134 Bra
2 ^
@@ -489,9 +489,9 @@ No need char
Subject length lower bound = 6
No set of starting bytes
-<!testdata/saved16BE-1
-Compiled pattern loaded from testdata/saved16BE-1
-Study data loaded from testdata/saved16BE-1
+<!testsaved16BE-1
+Compiled pattern loaded from testsaved16BE-1
+Study data loaded from testsaved16BE-1
------------------------------------------------------------------
0 134 Bra
2 ^
diff --git a/testdata/testoutput18 b/testdata/testoutput18
index 943d6e4..f1013eb 100644
--- a/testdata/testoutput18
+++ b/testdata/testoutput18
@@ -845,9 +845,9 @@ Error -24 (bad offset value)
/-- Generated from: (?P<cbra1>[aZ\x{400}-\x{10ffff}]{4,}[\x{f123}\x{10039}\x{20000}-\x{21234}]?|[A-Cx-z\x{100000}-\x{1000a7}\x{101234}])(?<cb2>[^az]) --/
Failed: character value in \x{...} sequence is too large at offset 49
-<!testdata/saved16LE-2
-Compiled pattern loaded from testdata/saved16LE-2
-Study data loaded from testdata/saved16LE-2
+<!testsaved16LE-2
+Compiled pattern loaded from testsaved16LE-2
+Study data loaded from testsaved16LE-2
------------------------------------------------------------------
0 101 Bra
2 45 CBra 1
@@ -872,9 +872,9 @@ No need char
Subject length lower bound = 2
No set of starting bytes
-<!testdata/saved16BE-2
-Compiled pattern loaded from testdata/saved16BE-2
-Study data loaded from testdata/saved16BE-2
+<!testsaved16BE-2
+Compiled pattern loaded from testsaved16BE-2
+Study data loaded from testsaved16BE-2
------------------------------------------------------------------
0 101 Bra
2 45 CBra 1