Prepare for release candidate.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@535 2f5784b3-3f2a-0410-8824-cb99058d5e15
author: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2010-06-03 19:18:24 +0000
committer: ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> 2010-06-03 19:18:24 +0000
commit: c8b8f5074c8e0f3ccf5621bf55a5b13b8c32043f (patch)
tree: 1c305bfeea11677c8369a04f363841e5ccc2d7fa /maint
parent: fb40fb6ad1eff9249f36732b6628ef6285ea9a39 (diff)
download: pcre-c8b8f5074c8e0f3ccf5621bf55a5b13b8c32043f.tar.gz
1 files changed, 62 insertions, 60 deletions
diff --git a/maint/README b/maint/README
index 82062ea..f6c9102 100644
--- a/maint/README
+++ b/maint/README
@@ -25,20 +25,20 @@ Builducptable    A Perl script that creates the contents of the ucptable.h file
 
 GenerateUtt.py   A Python script to generate part of the pcre_tables.c file
                  that contains Unicode script names in a long string with
-                 offsets, which is tedious to maintain by hand. 
+                 offsets, which is tedious to maintain by hand.
 
 ManyConfigTests  A shell script that runs "configure, make, test" a number of
                  times with different configuration settings.
-                 
+
 MultiStage2.py   A Python script that generates the file pcre_ucd.c from three
                  Unicode data tables, which are themselves downloaded from the
-                 Unicode web site. Run this script in the "maint" directory. 
+                 Unicode web site. Run this script in the "maint" directory.
                  The generated file contains the tables for a 2-stage lookup
-                 of Unicode properties.  
+                 of Unicode properties.
 
 README           This file.
 
-Unicode.tables   The files in this directory, DerivedGeneralCategory.txt, 
+Unicode.tables   The files in this directory, DerivedGeneralCategory.txt,
                  Scripts.txt and UnicodeData.txt, were downloaded from the
                  Unicode web site. They contain information about Unicode
                  characters and scripts.
@@ -71,9 +71,9 @@ can be run to generate a new version of pcre_ucd.c, and GenerateUtt.py can be
 run to generate the tricky tables for inclusion in pcre_tables.c.
 
 If MultiStage2.py gives the error "ValueError: list.index(x): x not in list",
-the cause is usually a missing (or misspelt) name in the list of scripts. I 
-couldn't find a straightforward list of scripts on the Unicode site, but 
-there's a useful Wikipedia page that list them, and notes the Unicode version 
+the cause is usually a missing (or misspelt) name in the list of scripts. I
+couldn't find a straightforward list of scripts on the Unicode site, but
+there's a useful Wikipedia page that list them, and notes the Unicode version
 in which they were introduced:
 
 http://en.wikipedia.org/wiki/Unicode_scripts#Table_of_Unicode_scripts
@@ -83,7 +83,7 @@ pcre_ucd.c work properly, using the data files in ucptestdata to check a number
 of test characters. The source file ucptest.c must be updated whenever new
 Unicode script names are added.
 
-Note also that both the pcresyntax.3 and pcrepattern.3 man pages contain lists 
+Note also that both the pcresyntax.3 and pcrepattern.3 man pages contain lists
 of Unicode script names.
 
 
@@ -94,20 +94,20 @@ This section contains a checklist of things that I consult before building a
 distribution for a new release.
 
 . Ensure that the version number and version date are correct in configure.ac.
-  
+
 . If new build options have been added, ensure that they are added to the CMake
-  files as well as to the autoconf files. 
+  files as well as to the autoconf files.
 
 . Run ./autogen.sh to ensure everything is up-to-date.
 
 . Compile and test with many different config options, and combinations of
   options. The maint/ManyConfigTests script now encapsulates this testing.
 
-. Run perltest.pl on the test data for tests 1, 4, 6, and 11. The first two can 
-  be run with Perl 5.8 or 5.10; the last two require Perl 5.10. The output
-  should match the PCRE test output, apart from the version identification at
-  the start of each test. The other tests are not Perl-compatible (they use
-  various PCRE-specific features or options).
+. Run perltest.pl on the test data for tests 1, 4, 6, and 11. The first two can
+  be run with Perl 5.8 or >= 5.10; the last two require Perl >= 5.10. The
+  output should match the PCRE test output, apart from the version
+  identification at the start of each test. The other tests are not
+  Perl-compatible (they use various PCRE-specific features or options).
 
 . Test with valgrind by running "RunTest valgrind". There is also "RunGrepTest
   valgrind", though that takes quite a long time.
@@ -130,14 +130,14 @@ distribution for a new release.
   used" warnings for the modules in which there is no call to memmove(). These
   can be ignored.
 
-. Documentation: check AUTHORS, COPYING, ChangeLog (check version and date), 
+. Documentation: check AUTHORS, COPYING, ChangeLog (check version and date),
   INSTALL, LICENCE, NEWS (check version and date), NON-UNIX-USE, and README.
   Many of these won't need changing, but over the long term things do change.
 
 . Man pages: Check all man pages for \ not followed by e or f or " because
-  that indicates a markup error. However, there is one exception: pcredemo.3, 
+  that indicates a markup error. However, there is one exception: pcredemo.3,
   which is created from the pcredemo.c program. It contains three instances
-  of \\n. 
+  of \\n.
 
 . When the release is built, test it on a number of different operating
   systems if possible, and using different compilers as well. For example,
@@ -154,10 +154,10 @@ spaces). Then run "make distcheck" to create the tarballs and the zipball.
 Double-check with "svn status", then create an SVN tagged copy:
 
   svn copy svn://vcs.exim.org/pcre/code/trunk \
-           svn://vcs.exim.org/pcre/code/tags/pcre-8.xx 
+           svn://vcs.exim.org/pcre/code/tags/pcre-8.xx
 
 Don't forget to update Freshmeat when the new release is out, and to tell
-webmaster@pcre.org and the mailing list. Also, update the list of version 
+webmaster@pcre.org and the mailing list. Also, update the list of version
 numbers in Bugzilla (edit products).
 
 
@@ -186,7 +186,7 @@ others are relatively new.
     over the existing "required byte" (reqbyte) feature that just remembers one
     byte.
 
-  * These probably need to go in study():
+  * These probably need to go in pcre_study():
 
     o Remember an initial string rather than just 1 char?
 
@@ -194,7 +194,14 @@ others are relatively new.
       earlier one if common to all alternatives.
 
     o Friedl contains other ideas.
-    
+
+  * pcre_study() does not set initial byte flags for Unicode property types
+    such as \p; I don't know how much benefit there would be for, for example,
+    setting the bits for 0-9 and all bytes >= xC0 when a pattern starts with
+    \p{N}.
+
+  * There is scope for more "auto-possessifying" in connection with \p and \P.
+
 . If Perl gets to a consistent state over the settings of capturing sub-
   patterns inside repeats, see if we can match it. One example of the
   difference is the matching of /(main(O)?)+/ against mainOmain, where PCRE
@@ -205,11 +212,6 @@ others are relatively new.
 
 . Unicode
 
-  * Note that in Perl, \s matches \pZ and similarly for \d, \w and the POSIX
-    character classes. For the moment, I've chosen not to support this for
-    backward compatibility, for speed, and because it would be messy to
-    implement.
-
   * A different approach to Unicode might be to use a typedef to do everything
     in unsigned shorts instead of unsigned chars. Actually, we'd have to have a
     new typedef to distinguish data from bits of compiled pattern that are in
@@ -271,54 +273,54 @@ others are relatively new.
 
 . Someone suggested --disable-callout to save code space when callouts are
   never wanted. This seems rather marginal.
-  
-. Check names that consist entirely of digits: PCRE allows, but do Perl and 
-  Python, etc? 
-  
-. A user suggested a parameter to limit the length of string matched, for 
-  example if the parameter is N, the current match should fail if the matched 
-  substring exceeds N. This could apply to both match functions. The value 
+
+. Check names that consist entirely of digits: PCRE allows, but do Perl and
+  Python, etc?
+
+. A user suggested a parameter to limit the length of string matched, for
+  example if the parameter is N, the current match should fail if the matched
+  substring exceeds N. This could apply to both match functions. The value
   could be a new field in the extra block.
-  
+
 . Callouts with arguments: (?Cn:ARG) for instance.
 
-. A user is going to supply a patch to generalize the API for user-specific 
+. A user is going to supply a patch to generalize the API for user-specific
   memory allocation so that it is more flexible in threaded environments. This
   was promised a long time ago, and never appeared...
-  
+
 . Write a function that generates random matching strings for a compiled regex.
 
-. Write a wrapper to maintain a structure with specified runtime parameters, 
-  such as recurse limit, and pass these to PCRE each time it is called. Also 
+. Write a wrapper to maintain a structure with specified runtime parameters,
+  such as recurse limit, and pass these to PCRE each time it is called. Also
   maybe malloc and free. A user sent a prototype.
-  
-. Pcregrep: an option to specify the output line separator, either as a string 
-  or select from a fixed list. This is not dead easy, because at the moment it 
+
+. Pcregrep: an option to specify the output line separator, either as a string
+  or select from a fixed list. This is not dead easy, because at the moment it
   outputs whatever is in the input file.
-  
-. Improve the code for duplicate checking in pcre_dfa_exec(). An incomplete, 
-  non-thread-safe patch showed that this can help performance for patterns 
-  where there are many alternatives. However, a simple thread-safe 
-  implementation that I tried made things worse in many simple cases, so this 
+
+. Improve the code for duplicate checking in pcre_dfa_exec(). An incomplete,
+  non-thread-safe patch showed that this can help performance for patterns
+  where there are many alternatives. However, a simple thread-safe
+  implementation that I tried made things worse in many simple cases, so this
   is not an obviously good thing.
-  
-. Make the longest lookbehind available via pcre_fullinfo(). This is not 
-  straightforward because lookbehinds can be nested inside lookbehinds. This 
-  case will have to be identified, and the amounts added. This should then give 
-  the maximum possible lookbehind length. The reason for wanting this is to 
+
+. Make the longest lookbehind available via pcre_fullinfo(). This is not
+  straightforward because lookbehinds can be nested inside lookbehinds. This
+  case will have to be identified, and the amounts added. This should then give
+  the maximum possible lookbehind length. The reason for wanting this is to
   help when implementing multi-segment matching using pcre_exec() with partial
   matching and overlapping segments.
-  
+
 . PCRE cannot at present distinguish between subpatterns with different names,
-  but the same number (created by the use of ?|). In order to do so, a way of 
+  but the same number (created by the use of ?|). In order to do so, a way of
   remembering *which* subpattern numbered n matched is needed. Bugzilla #760.
-  
-. Instead of having #ifdef HAVE_CONFIG_H in each module, put #include 
+  Now that (*MARK) has been implemented, it can perhaps be used as a way round
+  this problem.
+
+. Instead of having #ifdef HAVE_CONFIG_H in each module, put #include
   "something" and the the #ifdef appears only in one place, in "something".
-  
-. Support for (*MARK) and arguments for (*PRUNE) and friends. 
 
 Philip Hazel
 Email local part: ph10
 Email domain: cam.ac.uk
-Last updated: 10 March 2010
+Last updated: 03 June 2010
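
The man-page item in the release checklist above (look for a \ that is not followed by e, f or ") could be automated along these lines. This is only a sketch: the function name find_suspect_backslashes is made up for illustration and is not part of the maint directory. It reports every backslash that breaks the quoted rule, so it will also report the known \\n instances in pcredemo.3, which the checklist treats as an exception.

```python
import re

# A backslash that is NOT followed by 'e', 'f' or '"' (the rule from the
# checklist). The pattern name and the function below are hypothetical.
SUSPECT = re.compile(r'\\(?![ef"])')


def find_suspect_backslashes(text):
    """Return 1-based (line_number, column) pairs for each suspect backslash."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for match in SUSPECT.finditer(line):
            hits.append((line_number, match.start() + 1))
    return hits


if __name__ == "__main__":
    import sys

    # Usage: python check_backslashes.py doc/*.3
    for path in sys.argv[1:]:
        with open(path, encoding="latin-1") as handle:
            for line_number, column in find_suspect_backslashes(handle.read()):
                print(f"{path}:{line_number}:{column}: suspect backslash")
```

Reading the files as latin-1 is an assumption chosen so that any byte value decodes without error; adjust it if the man pages use another encoding.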