author ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-03-30 13:41:47 +0000
committer ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2007-03-30 13:41:47 +0000
commit 1240bf7573cc0d87b6614571b1670ab887de595b
tree 1dc4c7dd806911cd2262710e476c59aa45194e1c
parent 50dcde8c8eac3579ac0fe78d5cd298555fe9fa29
Documentation notes that fr_FR locale is "french" in Windows; add an
out-of-tree built test to maint/ManyConfigTests.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@139 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r--  README                   3
-rwxr-xr-x  RunTest                  4
-rw-r--r--  doc/pcreapi.3           25
-rw-r--r--  doc/pcrepattern.3        7
-rwxr-xr-x  maint/ManyConfigTests  132
5 files changed, 126 insertions, 45 deletions
diff --git a/README b/README
index ec0ef3e..177a7e6 100644
--- a/README
+++ b/README
@@ -489,6 +489,9 @@ is output to say why. If running this test produces instances of the error
in the comparison output, it means that locale is not available on your system,
despite being listed by "locale". This does not mean that PCRE is broken.
+[If you are trying to run this test on Windows, you may be able to get it to
+work by changing "fr_FR" to "french" everywhere it occurs.]
+
The fourth test checks the UTF-8 support. It is not run automatically unless
PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when
running "configure". This file can be also fed directly to the perltest script,
diff --git a/RunTest b/RunTest
index 4e35c71..431a093 100755
--- a/RunTest
+++ b/RunTest
@@ -157,7 +157,9 @@ if [ $do2 = yes ] ; then
fi
fi
-# Locale-specific tests, provided the "fr_FR" locale is available
+# Locale-specific tests, provided the "fr_FR" locale is available.
+# TODO: Try the locale name "french" instead - as used on Windows - but
+# this will mean modifying the input and output data.
if [ $do3 = yes ] ; then
locale -a | grep '^fr_FR$' >/dev/null
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index 879170f..02c1016 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -729,19 +729,25 @@ bytes is created.
.SH "LOCALE SUPPORT"
.rs
.sp
-PCRE handles caseless matching, and determines whether characters are letters
+PCRE handles caseless matching, and determines whether characters are letters,
digits, or whatever, by reference to a set of tables, indexed by character
value. When running in UTF-8 mode, this applies only to characters with codes
less than 128. Higher-valued codes never match escapes such as \ew or \ed, but
can be tested with \ep if PCRE is built with Unicode character property
-support. The use of locales with Unicode is discouraged.
+support. The use of locales with Unicode is discouraged. If you are handling
+characters with codes greater than 128, you should either use UTF-8 and
+Unicode, or use locales, but not try to mix the two.
.P
-An internal set of tables is created in the default C locale when PCRE is
-built. This is used when the final argument of \fBpcre_compile()\fP is NULL,
-and is sufficient for many applications. An alternative set of tables can,
-however, be supplied. These may be created in a different locale from the
-default. As more and more applications change to using Unicode, the need for
-this locale support is expected to die away.
+PCRE contains an internal set of tables that are used when the final argument
+of \fBpcre_compile()\fP is NULL. These are sufficient for many applications.
+Normally, the internal tables recognize only ASCII characters. However, when
+PCRE is built, it is possible to cause the internal tables to be rebuilt in the
+default "C" locale of the local system, which may cause them to be different.
+.P
+The internal tables can always be overridden by tables supplied by the
+application that calls PCRE. These may be created in a different locale from
+the default. As more and more applications change to using Unicode, the need
+for this locale support is expected to die away.
.P
External tables are built by calling the \fBpcre_maketables()\fP function,
which has no arguments, in the relevant locale. The result can then be passed
@@ -754,6 +760,9 @@ the following code could be used:
tables = pcre_maketables();
re = pcre_compile(..., tables);
.sp
+The locale name "fr_FR" is used on Linux and other Unix-like systems; if you
+are using Windows, the name for the French locale is "french".
+.P
When \fBpcre_maketables()\fP runs, the tables are built in memory that is
obtained via \fBpcre_malloc\fP. It is the caller's responsibility to ensure
that the memory containing the tables remains available for as long as it is
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index 47b6008..60d1f1f 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -291,8 +291,9 @@ in the
.\" HREF
\fBpcreapi\fP
.\"
-page). For example, in the "fr_FR" (French) locale, some character codes
-greater than 128 are used for accented letters, and these are matched by \ew.
+page). For example, in a French locale such as "fr_FR" in Unix-like systems,
+or "french" in Windows, some character codes greater than 128 are used for
+accented letters, and these are matched by \ew.
.P
In UTF-8 mode, characters with values greater than 128 never match \ed, \es, or
\ew, and always match \eD, \eS, and \eW. This is true even when Unicode
@@ -740,7 +741,7 @@ example [\ex{100}-\ex{2ff}].
If a range that includes letters is used when caseless matching is set, it
matches the letters in either case. For example, [W-c] is equivalent to
[][\e\e^_`wxyzabc], matched caselessly, and in non-UTF-8 mode, if character
-tables for the "fr_FR" locale are in use, [\exc8-\excb] matches accented E
+tables for a French locale are in use, [\exc8-\excb] matches accented E
characters in both cases. In UTF-8 mode, PCRE supports the concept of case for
characters with values greater than 128 only when it is compiled with Unicode
property support.
diff --git a/maint/ManyConfigTests b/maint/ManyConfigTests
index fd9c33e..cdf5358 100755
--- a/maint/ManyConfigTests
+++ b/maint/ManyConfigTests
@@ -3,14 +3,13 @@
# This is a script for the use of PCRE maintainers. It configures and rebuilds
# PCRE with a variety of configuration options, and in each case runs the tests
# to ensure that all goes well. Every possible combination would take far too
-# long, so we use a representative sample. As well as testing that they work,
-# we use --disable-shared or --disable-static after the first test (which
-# builds both) to save a bit of time by building only one version for the
-# subsequent tests.
+# long, so we use a representative sample. This script should be run in the
+# PCRE source directory.
# Some of the tests have to be skipped when PCRE is built with non-Unix newline
# recognition. I am planning to reduce this as much as possible in due course.
+
# This is in case the caller has set aliases (as I do - PH)
unset cp ls mv rm
@@ -20,35 +19,31 @@ unset cp ls mv rm
verbose=0
if [ "$1" = "-v" ] ; then verbose=1; fi
-# The first (empty) configuration builds with all the default settings.
+# This is a temporary directory for testing out-of-line builds
-for opts in \
- "" \
- "--enable-utf8 --disable-static" \
- "--enable-unicode-properties --disable-shared" \
- "--enable-unicode-properties --disable-stack-for-recursion --disable-shared" \
- "--enable-unicode-properties --disable-cpp --with-link-size=3 --disable-shared" \
- "--enable-rebuild-chartables --disable-shared" \
- "--enable-newline-is-any --disable-shared" \
- "--enable-newline-is-cr --disable-shared" \
- "--enable-newline-is-crlf --disable-shared" \
- "--enable-utf8 --enable-newline-is-any --enable-unicode-properties --disable-stack-for-recursion --disable-static --disable-cpp"
-do
+tmp=/tmp/pcretesting
+
+
+# This function runs a single test with the set of configuration options that
+# are in $opts. The source directory must be set in srcdir.
+
+function runtest()
+ {
rm -f *_unittest
if [ "$opts" = "" ] ; then
- echo "===> Configuring with: default settings"
+ echo "Configuring with: default settings"
else
olen=`expr length "$opts"`
- if [ $olen -gt 56 ] ; then
- echo "===> Configuring with:"
- echo " $opts"
+ if [ $olen -gt 53 ] ; then
+ echo "Configuring with:"
+ echo " $opts"
else
- echo "===> Configuring with: $opts"
+ echo "Configuring with: $opts"
fi
fi
- ./configure $opts >/dev/null 2>teststderr
+ $srcdir/configure $opts >/dev/null 2>teststderr
if [ $? -ne 0 ]; then
echo " "
echo "**** Error while configuring ****"
@@ -56,7 +51,7 @@ do
exit 1
fi
- echo "===> Making"
+ echo "Making"
make >/dev/null 2>teststderr
if [ $? -ne 0 ]; then
echo " "
@@ -73,8 +68,8 @@ do
nl=`expr match "$conf" ".*Newline sequence is \([A-Z]*\)"`
if [ "$nl" = "LF" -o "$nl" = "ANY" ]; then
- echo "===> Running C library tests"
- ./RunTest >teststdout
+ echo "Running C library tests"
+ $srcdir/RunTest >teststdout
if [ $? -ne 0 ]; then
echo " "
echo "**** Test failed ****"
@@ -82,12 +77,12 @@ do
exit 1
fi
else
- echo "===> Skipping C library tests: newline is $nl"
+ echo "Skipping C library tests: newline is $nl"
fi
if [ "$nl" = "LF" ]; then
- echo "===> Running pcregrep tests"
- ./RunGrepTest >teststdout 2>teststderr
+ echo "Running pcregrep tests"
+ $srcdir/RunGrepTest >teststdout 2>teststderr
if [ $? -ne 0 ]; then
echo " "
echo "**** Test failed ****"
@@ -96,7 +91,7 @@ do
exit 1
fi
else
- echo "===> Skipping pcregrep tests: newline is $nl"
+ echo "Skipping pcregrep tests: newline is $nl"
fi
if [ "$nl" = "LF" -o "$nl" = "ANY" ]; then
@@ -105,7 +100,7 @@ do
pcre_scanner_unittest \
pcre_stringpiece_unittest
do
- echo "===> Running $utest"
+ echo "Running $utest"
$utest >teststdout
if [ $? -ne 0 ]; then
echo " "
@@ -114,12 +109,83 @@ do
exit 1
fi
done
+ else
+ echo "Skipping C++ tests: pcrecpp_unittest does not exist"
fi
else
- echo "===> Skipping C++ tests: newline is $nl"
+ echo "Skipping C++ tests: newline is $nl"
fi
+ }
+
+# This set of tests builds PCRE and runs the tests with a variety of configure
+# options, in the current (source) directory. The first (empty) configuration
+# builds with all the default settings. As well as testing that these options
+# work, we use --disable-shared or --disable-static after the first test (which
+# builds both) to save a bit of time by building only one version of the
+# library for the subsequent tests.
+
+echo "Tests in the current directory"
+srcdir=.
+for opts in \
+ "" \
+ "--enable-utf8 --disable-static" \
+ "--enable-unicode-properties --disable-shared" \
+ "--enable-unicode-properties --disable-stack-for-recursion --disable-shared" \
+ "--enable-unicode-properties --disable-cpp --with-link-size=3 --disable-shared" \
+ "--enable-rebuild-chartables --disable-shared" \
+ "--enable-newline-is-any --disable-shared" \
+ "--enable-newline-is-cr --disable-shared" \
+ "--enable-newline-is-crlf --disable-shared" \
+ "--enable-utf8 --enable-newline-is-any --enable-unicode-properties --disable-stack-for-recursion --disable-static --disable-cpp"
+do
+ runtest
done
-echo "===> All done"
+
+
+# Clean up the distribution and then do at least one build and test in a
+# directory other than the source directory. It doesn't work unless the
+# source directory is cleaned up first - and anyway, it's best to leave it
+# in a clean state after all this reconfiguring.
+
+if [ -f Makefile ]; then
+ echo "Running 'make distclean'"
+ make distclean >/dev/null 2>&1
+ if [ $? -ne 0 ]; then
+ echo "** 'make distclean' failed"
+ exit 1
+ fi
+fi
+
+echo "Tests in the $tmp directory"
+srcdir=`pwd`
+export srcdir
+
+if [ ! -e $tmp ]; then
+ mkdir $tmp
+fi
+
+if [ ! -d $tmp ]; then
+ echo "** Failed to create $tmp or it is not a directory"
+ exit 1
+fi
+
+cd $tmp
+if [ $? -ne 0 ]; then
+ echo "** Failed to cd to $tmp"
+ exit 1
+fi
+
+for opts in \
+ "--enable-unicode-properties --disable-shared"
+do
+ runtest
+done
+
+echo "Removing $tmp"
+
+rm -rf $tmp
+
+echo "All done"
# End
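
The TODO added to RunTest in this commit notes that the French locale is named "french" on Windows rather than "fr_FR". As a minimal sketch only, and not part of this commit, a helper along the following lines could pick whichever of the two names is present. The function name pick_french_locale is hypothetical, and the sketch assumes the list of locale names arrives on standard input, one per line, in the form printed by "locale -a" (a command that may not exist on Windows at all).

```shell
# Hypothetical helper, not part of PCRE: read locale names on standard input
# (one per line) and print the first French locale name that is listed.
# "fr_FR" is tried first, then "french" (the Windows spelling).
pick_french_locale() {
  # Slurp the whole list once so that it can be searched for each candidate.
  available=$(cat)
  for name in fr_FR french; do
    # -F: treat the name as a fixed string; -x: match the whole line only,
    # so that "fr_FR.utf8" is not mistaken for "fr_FR".
    if printf '%s\n' "$available" | grep -Fx "$name" >/dev/null; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  # Neither name was found.
  return 1
}
```

It could be called as "locale -a | pick_french_locale". As the TODO itself says, using the resulting name in the locale tests would still mean modifying the test input and output data, which this sketch does not attempt.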