author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-03-30 13:41:47 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2007-03-30 13:41:47 +0000 |
commit | 1240bf7573cc0d87b6614571b1670ab887de595b (patch) | |
tree | 1dc4c7dd806911cd2262710e476c59aa45194e1c | |
parent | 50dcde8c8eac3579ac0fe78d5cd298555fe9fa29 (diff) | |
download | pcre-1240bf7573cc0d87b6614571b1670ab887de595b.tar.gz |
Documentation notes that fr_FR locale is "french" in Windows; add an
out-of-tree built test to maint/ManyConfigTests.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@139 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- | README | 3 |
-rwxr-xr-x | RunTest | 4 |
-rw-r--r-- | doc/pcreapi.3 | 25 |
-rw-r--r-- | doc/pcrepattern.3 | 7 |
-rwxr-xr-x | maint/ManyConfigTests | 132 |
5 files changed, 126 insertions, 45 deletions
diff --git a/README b/README
--- a/README
+++ b/README
@@ -489,6 +489,9 @@ is output to say why. If running this test produces instances of the error
 in the comparison output, it means that locale is not available on your system,
 despite being listed by "locale". This does not mean that PCRE is broken.
 
+[If you are trying to run this test on Windows, you may be able to get it to
+work by changing "fr_FR" to "french" everywhere it occurs.]
+
 The fourth test checks the UTF-8 support. It is not run automatically unless
 PCRE is built with UTF-8 support. To do this you must set --enable-utf8 when
 running "configure". This file can be also fed directly to the perltest script,
diff --git a/RunTest b/RunTest
--- a/RunTest
+++ b/RunTest
@@ -157,7 +157,9 @@ if [ $do2 = yes ] ; then
   fi
 fi
 
-# Locale-specific tests, provided the "fr_FR" locale is available
+# Locale-specific tests, provided the "fr_FR" locale is available.
+# TODO: Try the locale name "french" instead - as used on Windows - but
+# this will mean modifying the input and output data.
 
 if [ $do3 = yes ] ; then
   locale -a | grep '^fr_FR$' >/dev/null
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index 879170f..02c1016 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -729,19 +729,25 @@ bytes is created.
 .SH "LOCALE SUPPORT"
 .rs
 .sp
-PCRE handles caseless matching, and determines whether characters are letters
+PCRE handles caseless matching, and determines whether characters are letters,
 digits, or whatever, by reference to a set of tables, indexed by character
 value. When running in UTF-8 mode, this applies only to characters with codes
 less than 128. Higher-valued codes never match escapes such as \ew or \ed, but
 can be tested with \ep if PCRE is built with Unicode character property
-support. The use of locales with Unicode is discouraged.
+support. The use of locales with Unicode is discouraged. If you are handling
+characters with codes greater than 128, you should either use UTF-8 and
+Unicode, or use locales, but not try to mix the two.
 .P
-An internal set of tables is created in the default C locale when PCRE is
-built. This is used when the final argument of \fBpcre_compile()\fP is NULL,
-and is sufficient for many applications. An alternative set of tables can,
-however, be supplied. These may be created in a different locale from the
-default. As more and more applications change to using Unicode, the need for
-this locale support is expected to die away.
+PCRE contains an internal set of tables that are used when the final argument
+of \fBpcre_compile()\fP is NULL. These are sufficient for many applications.
+Normally, the internal tables recognize only ASCII characters. However, when
+PCRE is built, it is possible to cause the internal tables to be rebuilt in the
+default "C" locale of the local system, which may cause them to be different.
+.P
+The internal tables can always be overridden by tables supplied by the
+application that calls PCRE. These may be created in a different locale from
+the default. As more and more applications change to using Unicode, the need
+for this locale support is expected to die away.
 .P
 External tables are built by calling the \fBpcre_maketables()\fP function,
 which has no arguments, in the relevant locale. The result can then be passed
@@ -754,6 +760,9 @@ the following code could be used:
   tables = pcre_maketables();
   re = pcre_compile(..., tables);
 .sp
+The locale name "fr_FR" is used on Linux and other Unix-like systems; if you
+are using Windows, the name for the French locale is "french".
+.P
 When \fBpcre_maketables()\fP runs, the tables are built in memory that is
 obtained via \fBpcre_malloc\fP. It is the caller's responsibility to ensure
 that the memory containing the tables remains available for as long as it is
diff --git a/doc/pcrepattern.3 b/doc/pcrepattern.3
index 47b6008..60d1f1f 100644
--- a/doc/pcrepattern.3
+++ b/doc/pcrepattern.3
@@ -291,8 +291,9 @@ in the
 .\" HREF
 \fBpcreapi\fP
 .\"
-page). For example, in the "fr_FR" (French) locale, some character codes
-greater than 128 are used for accented letters, and these are matched by \ew.
+page). For example, in a French locale such as "fr_FR" in Unix-like systems,
+or "french" in Windows, some character codes greater than 128 are used for
+accented letters, and these are matched by \ew.
 .P
 In UTF-8 mode, characters with values greater than 128 never match \ed, \es, or
 \ew, and always match \eD, \eS, and \eW. This is true even when Unicode
@@ -740,7 +741,7 @@ example [\ex{100}-\ex{2ff}].
 If a range that includes letters is used when caseless matching is set, it
 matches the letters in either case. For example, [W-c] is equivalent to
 [][\e\e^_`wxyzabc], matched caselessly, and in non-UTF-8 mode, if character
-tables for the "fr_FR" locale are in use, [\exc8-\excb] matches accented E
+tables for a French locale are in use, [\exc8-\excb] matches accented E
 characters in both cases. In UTF-8 mode, PCRE supports the concept of case for
 characters with values greater than 128 only when it is compiled with Unicode
 property support.
diff --git a/maint/ManyConfigTests b/maint/ManyConfigTests
index fd9c33e..cdf5358 100755
--- a/maint/ManyConfigTests
+++ b/maint/ManyConfigTests
@@ -3,14 +3,13 @@
 # This is a script for the use of PCRE maintainers. It configures and rebuilds
 # PCRE with a variety of configuration options, and in each case runs the tests
 # to ensure that all goes well. Every possible combination would take far too
-# long, so we use a representative sample. As well as testing that they work,
-# we use --disable-shared or --disable-static after the first test (which
-# builds both) to save a bit of time by building only one version for the
-# subsequent tests.
+# long, so we use a representative sample. This script should be run in the
+# PCRE source directory.
 
 # Some of the tests have to be skipped when PCRE is built with non-Unix newline
 # recognition. I am planning to reduce this as much as possible in due course.
 
+
 # This is in case the caller has set aliases (as I do - PH)
 
 unset cp ls mv rm
@@ -20,35 +19,31 @@ unset cp ls mv rm
 verbose=0
 if [ "$1" = "-v" ] ; then verbose=1; fi
 
-# The first (empty) configuration builds with all the default settings.
+# This is a temporary directory for testing out-of-line builds
 
-for opts in \
-  "" \
-  "--enable-utf8 --disable-static" \
-  "--enable-unicode-properties --disable-shared" \
-  "--enable-unicode-properties --disable-stack-for-recursion --disable-shared" \
-  "--enable-unicode-properties --disable-cpp --with-link-size=3 --disable-shared" \
-  "--enable-rebuild-chartables --disable-shared" \
-  "--enable-newline-is-any --disable-shared" \
-  "--enable-newline-is-cr --disable-shared" \
-  "--enable-newline-is-crlf --disable-shared" \
-  "--enable-utf8 --enable-newline-is-any --enable-unicode-properties --disable-stack-for-recursion --disable-static --disable-cpp"
-do
+tmp=/tmp/pcretesting
+
+
+# This function runs a single test with the set of configuration options that
+# are in $opts. The source directory must be set in srcdir.
+
+function runtest()
+  {
   rm -f *_unittest
 
   if [ "$opts" = "" ] ; then
-    echo "===> Configuring with: default settings"
+    echo "Configuring with: default settings"
   else
     olen=`expr length "$opts"`
-    if [ $olen -gt 56 ] ; then
-      echo "===> Configuring with:"
-      echo " $opts"
+    if [ $olen -gt 53 ] ; then
+      echo "Configuring with:"
+      echo " $opts"
     else
-      echo "===> Configuring with: $opts"
+      echo "Configuring with: $opts"
     fi
   fi
 
-  ./configure $opts >/dev/null 2>teststderr
+  $srcdir/configure $opts >/dev/null 2>teststderr
   if [ $? -ne 0 ]; then
     echo " "
     echo "**** Error while configuring ****"
@@ -56,7 +51,7 @@ do
     exit 1
   fi
 
-  echo "===> Making"
+  echo "Making"
   make >/dev/null 2>teststderr
   if [ $? -ne 0 ]; then
     echo " "
@@ -73,8 +68,8 @@ do
   nl=`expr match "$conf" ".*Newline sequence is \([A-Z]*\)"`
 
   if [ "$nl" = "LF" -o "$nl" = "ANY" ]; then
-    echo "===> Running C library tests"
-    ./RunTest >teststdout
+    echo "Running C library tests"
+    $srcdir/RunTest >teststdout
     if [ $? -ne 0 ]; then
       echo " "
       echo "**** Test failed ****"
@@ -82,12 +77,12 @@ do
       exit 1
     fi
   else
-    echo "===> Skipping C library tests: newline is $nl"
+    echo "Skipping C library tests: newline is $nl"
   fi
 
   if [ "$nl" = "LF" ]; then
-    echo "===> Running pcregrep tests"
-    ./RunGrepTest >teststdout 2>teststderr
+    echo "Running pcregrep tests"
+    $srcdir/RunGrepTest >teststdout 2>teststderr
     if [ $? -ne 0 ]; then
       echo " "
       echo "**** Test failed ****"
@@ -96,7 +91,7 @@ do
       exit 1
     fi
   else
-    echo "===> Skipping pcregrep tests: newline is $nl"
+    echo "Skipping pcregrep tests: newline is $nl"
   fi
 
   if [ "$nl" = "LF" -o "$nl" = "ANY" ]; then
@@ -105,7 +100,7 @@ do
                    pcre_scanner_unittest \
                    pcre_stringpiece_unittest
       do
-        echo "===> Running $utest"
+        echo "Running $utest"
         $utest >teststdout
         if [ $? -ne 0 ]; then
           echo " "
@@ -114,12 +109,83 @@ do
           exit 1
         fi
       done
+    else
+      echo "Skipping C++ tests: pcrecpp_unittest does not exist"
     fi
   else
-    echo "===> Skipping C++ tests: newline is $nl"
+    echo "Skipping C++ tests: newline is $nl"
   fi
+  }
+
+# This set of tests builds PCRE and runs the tests with a variety of configure
+# options, in the current (source) directory. The first (empty) configuration
+# builds with all the default settings. As well as testing that these options
+# work, we use --disable-shared or --disable-static after the first test (which
+# builds both) to save a bit of time by building only one version of the
+# library for the subsequent tests.
+
+echo "Tests in the current directory"
+srcdir=.
+for opts in \
+  "" \
+  "--enable-utf8 --disable-static" \
+  "--enable-unicode-properties --disable-shared" \
+  "--enable-unicode-properties --disable-stack-for-recursion --disable-shared" \
+  "--enable-unicode-properties --disable-cpp --with-link-size=3 --disable-shared" \
+  "--enable-rebuild-chartables --disable-shared" \
+  "--enable-newline-is-any --disable-shared" \
+  "--enable-newline-is-cr --disable-shared" \
+  "--enable-newline-is-crlf --disable-shared" \
+  "--enable-utf8 --enable-newline-is-any --enable-unicode-properties --disable-stack-for-recursion --disable-static --disable-cpp"
+do
+  runtest
 done
 
-echo "===> All done"
+
+
+# Clean up the distribution and then do at least one build and test in a
+# directory other than the source directory. It doesn't work unless the
+# source directory is cleaned up first - and anyway, it's best to leave it
+# in a clean state after all this reconfiguring.
+
+if [ -f Makefile ]; then
+  echo "Running 'make distclean'"
+  make distclean >/dev/null 2>&1
+  if [ $? -ne 0 ]; then
+    echo "** 'make distclean' failed"
+    exit 1
+  fi
+fi
+
+echo "Tests in the $tmp directory"
+srcdir=`pwd`
+export srcdir
+
+if [ ! -e $tmp ]; then
+  mkdir $tmp
+fi
+
+if [ ! -d $tmp ]; then
+  echo "** Failed to create $tmp or it is not a directory"
+  exit 1
+fi
+
+cd $tmp
+if [ $? -ne 0 ]; then
+  echo "** Failed to cd to $tmp"
+  exit 1
+fi
+
+for opts in \
+  "--enable-unicode-properties --disable-shared"
+do
+  runtest
+done
+
+echo "Removing $tmp"
+
+rm -rf $tmp
+
+echo "All done"
 
 # End
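The TODO added to RunTest in this commit suggests trying the locale name "french" when "fr_FR" is not available. A minimal sketch of how that name selection might look is given below. The function `pick_french_locale` is hypothetical and is not part of RunTest or of this commit; it only chooses a name from a list such as the output of `locale -a`. As the TODO notes, the test input and output data would still have to be modified before the chosen name could actually be used.

```shell
#!/bin/sh
# Hypothetical helper (an assumption for illustration, not code from PCRE).
# $1 is a newline-separated list of locale names, for example the output of
# `locale -a`. Prints the first of "fr_FR" (Unix-like systems) or "french"
# (Windows) that appears in the list as a whole line. Returns 1 if neither
# name is present.
pick_french_locale() {
  for name in fr_FR french; do
    # -F treats the name as a fixed string; -x requires a whole-line match,
    # so "fr_FR.UTF-8" does not count as "fr_FR".
    if printf '%s\n' "$1" | grep -F -x "$name" >/dev/null; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Possible use, assuming a `locale` command exists on the system:
#   if loc=$(pick_french_locale "$(locale -a 2>/dev/null)"); then
#     echo "French locale name: $loc"
#   fi
```

Passing the list as an argument, rather than calling `locale -a` inside the function, keeps the selection logic independent of the system it runs on.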