author | ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> | 2016-06-05 16:05:34 +0000 |
---|---|---|
committer | ph10 <ph10@6239d852-aaf2-0410-a92c-79f79f948069> | 2016-06-05 16:05:34 +0000 |
commit | 74550ef81518b2052fa9d57a12ee59541ae7b95a (patch) | |
tree | 91fbda016c843666111790da13d6b59c25c2ff7a /doc/html | |
parent | b69f2c210fbe21e22eb139cd616e811c37c10035 (diff) | |
Implement PCRE2_NO_JIT, update HTML docs as well.
git-svn-id: svn://vcs.exim.org/pcre2/code/trunk@522 6239d852-aaf2-0410-a92c-79f79f948069
Diffstat (limited to 'doc/html')
-rw-r--r-- | doc/html/README.txt | 19 |
-rw-r--r-- | doc/html/pcre2api.html | 56 |
-rw-r--r-- | doc/html/pcre2build.html | 52 |
-rw-r--r-- | doc/html/pcre2jit.html | 10 |
-rw-r--r-- | doc/html/pcre2serialize.html | 40 |
-rw-r--r-- | doc/html/pcre2test.html | 3 |
6 files changed, 112 insertions, 68 deletions
diff --git a/doc/html/README.txt b/doc/html/README.txt index 48d2ffd..6cb1bbb 100644 --- a/doc/html/README.txt +++ b/doc/html/README.txt @@ -168,15 +168,12 @@ library. They are also documented in the pcre2build man page. built. If you want only the 16-bit or 32-bit library, use --disable-pcre2-8 to disable building the 8-bit library. -. If you want to include support for just-in-time compiling, which can give - large performance improvements on certain platforms, add --enable-jit to the - "configure" command. This support is available only for certain hardware +. If you want to include support for just-in-time (JIT) compiling, which can + give large performance improvements on certain platforms, add --enable-jit to + the "configure" command. This support is available only for certain hardware architectures. If you try to enable it on an unsupported architecture, there will be a compile time error. -. When JIT support is enabled, pcre2grep automatically makes use of it, unless - you add --disable-pcre2grep-jit to the "configure" command. - . If you do not want to make use of the support for UTF-8 Unicode character strings in the 8-bit library, UTF-16 Unicode character strings in the 16-bit library, or UTF-32 Unicode character strings in the 32-bit library, you can @@ -324,6 +321,14 @@ library. They are also documented in the pcre2build man page. running "make" to build PCRE2. There is more information about coverage reporting in the "pcre2build" documentation. +. When JIT support is enabled, pcre2grep automatically makes use of it, unless + you add --disable-pcre2grep-jit to the "configure" command. + +. On non-Windows sytems there is support for calling external scripts during + matching in the pcre2grep command via PCRE2's callout facility with string + arguments. This support can be disabled by adding --disable-pcre2grep-callout + to the "configure" command. + . 
The pcre2grep program currently supports only 8-bit data files, and so requires the 8-bit PCRE2 library. It is possible to compile pcre2grep to use libz and/or libbz2, in order to read .gz and .bz2 files (respectively), by @@ -840,4 +845,4 @@ The distribution should contain the files listed below. Philip Hazel Email local part: ph10 Email domain: cam.ac.uk -Last updated: 16 October 2015 +Last updated: 01 April 2016 diff --git a/doc/html/pcre2api.html b/doc/html/pcre2api.html index a037d4b..db4e7c1 100644 --- a/doc/html/pcre2api.html +++ b/doc/html/pcre2api.html @@ -417,9 +417,10 @@ More complicated programs might need to make use of the specialist functions <b>pcre2_jit_stack_assign()</b> in order to control the JIT code's memory usage. </P> <P> -JIT matching is automatically used by <b>pcre2_match()</b> if it is available. -There is also a direct interface for JIT matching, which gives improved -performance. The JIT-specific functions are discussed in the +JIT matching is automatically used by <b>pcre2_match()</b> if it is available, +unless the PCRE2_NO_JIT option is set. There is also a direct interface for JIT +matching, which gives improved performance. The JIT-specific functions are +discussed in the <a href="pcre2jit.html"><b>pcre2jit</b></a> documentation. </P> @@ -555,7 +556,7 @@ least until a pattern has been compiled. The logic can be something like this: Get a write (unique) lock for pointer pointer = pcre2_compile(... } - Release the lock + Release the lock Use pointer in pcre2_match() </pre> Of course, testing for compilation errors should also be included in the code. @@ -563,9 +564,9 @@ Of course, testing for compilation errors should also be included in the code. <P> If JIT is being used, but the JIT compilation is not being done immediately, (perhaps waiting to see if the pattern is used often enough) similar logic is -required. 
JIT compilation updates a pointer within the compiled code block, so -a thread must gain unique write access to the pointer before calling -<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> can be used +required. JIT compilation updates a pointer within the compiled code block, so +a thread must gain unique write access to the pointer before calling +<b>pcre2_jit_compile()</b>. Alternatively, <b>pcre2_code_copy()</b> can be used to obtain a private copy of the compiled code. </P> <br><b> @@ -1062,7 +1063,7 @@ The <b>pcre2_compile()</b> function compiles a pattern into an internal form. The pattern is defined by a pointer to a string of code units and a length. If the pattern is zero-terminated, the length can be specified as PCRE2_ZERO_TERMINATED. The function returns a pointer to a block of memory that -contains the compiled pattern and related data. +contains the compiled pattern and related data. </P> <P> If the compile context argument <i>ccontext</i> is NULL, memory for the compiled @@ -1071,12 +1072,12 @@ the same memory function that was used for the compile context. The caller must free the memory by calling <b>pcre2_code_free()</b> when it is no longer needed. </P> <P> -The function <b>pcre2_code_copy()</b> makes a copy of the compiled code in new -memory, using the same memory allocator as was used for the original. However, +The function <b>pcre2_code_copy()</b> makes a copy of the compiled code in new +memory, using the same memory allocator as was used for the original. However, if the code has been processed by the JIT compiler (see <a href="#jitcompiling">below),</a> -the JIT information cannot be copied (because it is position-dependent). -The new copy can initially be used only for non-JIT matching, though it can be +the JIT information cannot be copied (because it is position-dependent). +The new copy can initially be used only for non-JIT matching, though it can be passed to <b>pcre2_jit_compile()</b> if required. 
The <b>pcre2_code_copy()</b> function provides a way for individual threads in a multithreaded application to acquire a private copy of shared compiled code. @@ -1630,10 +1631,15 @@ are as follows: Return a copy of the pattern's options. The third argument should point to a <b>uint32_t</b> variable. PCRE2_INFO_ARGOPTIONS returns exactly the options that were passed to <b>pcre2_compile()</b>, whereas PCRE2_INFO_ALLOPTIONS returns -the compile options as modified by any top-level option settings such as (*UTF) -at the start of the pattern itself. For example, if the pattern /(*UTF)abc/ is -compiled with the PCRE2_EXTENDED option, the result is PCRE2_EXTENDED and -PCRE2_UTF. +the compile options as modified by any top-level (*XXX) option settings such as +(*UTF) at the start of the pattern itself. +</P> +<P> +For example, if the pattern /(*UTF)abc/ is compiled with the PCRE2_EXTENDED +option, the result for PCRE2_INFO_ALLOPTIONS is PCRE2_EXTENDED and PCRE2_UTF. +Option settings such as (?i) that can change within a pattern do not affect the +result of PCRE2_INFO_ALLOPTIONS, even if they appear right at the start of the +pattern. (This was different in some earlier releases.) </P> <P> A pattern compiled without PCRE2_ANCHORED is automatically anchored by PCRE2 if @@ -2088,14 +2094,15 @@ Option bits for <b>pcre2_match()</b> <P> The unused bits of the <i>options</i> argument for <b>pcre2_match()</b> must be zero. The only bits that may be set are PCRE2_ANCHORED, PCRE2_NOTBOL, -PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_UTF_CHECK, -PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is described below. +PCRE2_NOTEOL, PCRE2_NOTEMPTY, PCRE2_NOTEMPTY_ATSTART, PCRE2_NO_JIT, +PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. Their action is +described below. </P> <P> Setting PCRE2_ANCHORED at match time is not supported by the just-in-time (JIT) compiler. 
If it is set, JIT matching is disabled and the normal interpretive -code in <b>pcre2_match()</b> is run. The remaining options are supported for JIT -matching. +code in <b>pcre2_match()</b> is run. Apart from PCRE2_NO_JIT (obviously), the +remaining options are supported for JIT matching. <pre> PCRE2_ANCHORED </pre> @@ -2143,6 +2150,13 @@ the starting offset. An empty string match later in the subject is permitted. If the pattern is anchored, such a match can occur only if the pattern contains \K. <pre> + PCRE2_NO_JIT +</pre> +By default, if a pattern has been successfully processed by +<b>pcre2_jit_compile()</b>, JIT is automatically used when <b>pcre2_match()</b> +is called with options that JIT supports. Setting PCRE2_NO_JIT disables the use +of JIT; it forces matching to be done by the interpreter. +<pre> PCRE2_NO_UTF_CHECK </pre> When PCRE2_UTF is set at compile time, the validity of the subject as a UTF @@ -3184,7 +3198,7 @@ Cambridge, England. </P> <br><a name="SEC40" href="#TOC1">REVISION</a><br> <P> -Last updated: 26 February 2016 +Last updated: 05 June 2016 <br> Copyright © 1997-2016 University of Cambridge. <br> diff --git a/doc/html/pcre2build.html b/doc/html/pcre2build.html index 1e5f737..ac55598 100644 --- a/doc/html/pcre2build.html +++ b/doc/html/pcre2build.html @@ -27,15 +27,16 @@ please consult the man page, in case the conversion went wrong. 
<li><a name="TOC12" href="#SEC12">LIMITING PCRE2 RESOURCE USAGE</a> <li><a name="TOC13" href="#SEC13">CREATING CHARACTER TABLES AT BUILD TIME</a> <li><a name="TOC14" href="#SEC14">USING EBCDIC CODE</a> -<li><a name="TOC15" href="#SEC15">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a> -<li><a name="TOC16" href="#SEC16">PCRE2GREP BUFFER SIZE</a> -<li><a name="TOC17" href="#SEC17">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a> -<li><a name="TOC18" href="#SEC18">INCLUDING DEBUGGING CODE</a> -<li><a name="TOC19" href="#SEC19">DEBUGGING WITH VALGRIND SUPPORT</a> -<li><a name="TOC20" href="#SEC20">CODE COVERAGE REPORTING</a> -<li><a name="TOC21" href="#SEC21">SEE ALSO</a> -<li><a name="TOC22" href="#SEC22">AUTHOR</a> -<li><a name="TOC23" href="#SEC23">REVISION</a> +<li><a name="TOC15" href="#SEC15">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a> +<li><a name="TOC16" href="#SEC16">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a> +<li><a name="TOC17" href="#SEC17">PCRE2GREP BUFFER SIZE</a> +<li><a name="TOC18" href="#SEC18">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a> +<li><a name="TOC19" href="#SEC19">INCLUDING DEBUGGING CODE</a> +<li><a name="TOC20" href="#SEC20">DEBUGGING WITH VALGRIND SUPPORT</a> +<li><a name="TOC21" href="#SEC21">CODE COVERAGE REPORTING</a> +<li><a name="TOC22" href="#SEC22">SEE ALSO</a> +<li><a name="TOC23" href="#SEC23">AUTHOR</a> +<li><a name="TOC24" href="#SEC24">REVISION</a> </ul> <br><a name="SEC1" href="#TOC1">BUILDING PCRE2</a><br> <P> @@ -349,7 +350,16 @@ The options that select newline behaviour, such as --enable-newline-is-cr, and equivalent run-time options, refer to these character values in an EBCDIC environment. 
</P> -<br><a name="SEC15" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br> +<br><a name="SEC15" href="#TOC1">PCRE2GREP SUPPORT FOR EXTERNAL SCRIPTS</a><br> +<P> +By default, on non-Windows systems, <b>pcre2grep</b> supports the use of +callouts with string arguments within the patterns it is matching, in order to +run external scripts. For details, see the +<a href="pcre2grep.html"><b>pcre2grep</b></a> +documentation. This support can be disabled by adding +--disable-pcre2grep-callout to the <b>configure</b> command. +</P> +<br><a name="SEC16" href="#TOC1">PCRE2GREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br> <P> By default, <b>pcre2grep</b> reads all files as plain text. You can build it so that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads @@ -362,7 +372,7 @@ to the <b>configure</b> command. These options naturally require that the relevant libraries are installed on your system. Configuration will fail if they are not. </P> -<br><a name="SEC16" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br> +<br><a name="SEC17" href="#TOC1">PCRE2GREP BUFFER SIZE</a><br> <P> <b>pcre2grep</b> uses an internal buffer to hold a "window" on the file it is scanning, in order to be able to output "before" and "after" lines when it @@ -375,9 +385,9 @@ parameter value by adding, for example, --with-pcre2grep-bufsize=50K </pre> to the <b>configure</b> command. The caller of \fPpcre2grep\fP can override this -value by using --buffer-size on the command line.. +value by using --buffer-size on the command line. </P> -<br><a name="SEC17" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br> +<br><a name="SEC18" href="#TOC1">PCRE2TEST OPTION FOR LIBREADLINE SUPPORT</a><br> <P> If you add one of <pre> @@ -411,7 +421,7 @@ automatically included, you may need to add something like </pre> immediately before the <b>configure</b> command. 
</P> -<br><a name="SEC18" href="#TOC1">INCLUDING DEBUGGING CODE</a><br> +<br><a name="SEC19" href="#TOC1">INCLUDING DEBUGGING CODE</a><br> <P> If you add <pre> @@ -420,7 +430,7 @@ If you add to the <b>configure</b> command, additional debugging code is included in the build. This feature is intended for use by the PCRE2 maintainers. </P> -<br><a name="SEC19" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br> +<br><a name="SEC20" href="#TOC1">DEBUGGING WITH VALGRIND SUPPORT</a><br> <P> If you add <pre> @@ -430,7 +440,7 @@ to the <b>configure</b> command, PCRE2 will use valgrind annotations to mark certain memory regions as unaddressable. This allows it to detect invalid memory accesses, and is mostly useful for debugging PCRE2 itself. </P> -<br><a name="SEC20" href="#TOC1">CODE COVERAGE REPORTING</a><br> +<br><a name="SEC21" href="#TOC1">CODE COVERAGE REPORTING</a><br> <P> If your C compiler is gcc, you can build a version of PCRE2 that can generate a code coverage report for its test suite. To enable this, you must install @@ -487,11 +497,11 @@ This cleans all coverage data including the generated coverage report. For more information about code coverage, see the <b>gcov</b> and <b>lcov</b> documentation. </P> -<br><a name="SEC21" href="#TOC1">SEE ALSO</a><br> +<br><a name="SEC22" href="#TOC1">SEE ALSO</a><br> <P> <b>pcre2api</b>(3), <b>pcre2-config</b>(3). </P> -<br><a name="SEC22" href="#TOC1">AUTHOR</a><br> +<br><a name="SEC23" href="#TOC1">AUTHOR</a><br> <P> Philip Hazel <br> @@ -500,11 +510,11 @@ University Computing Service Cambridge, England. <br> </P> -<br><a name="SEC23" href="#TOC1">REVISION</a><br> +<br><a name="SEC24" href="#TOC1">REVISION</a><br> <P> -Last updated: 16 October 2015 +Last updated: 01 April 2016 <br> -Copyright © 1997-2015 University of Cambridge. +Copyright © 1997-2016 University of Cambridge. <br> <p> Return to the <a href="index.html">PCRE2 index page</a>. 
diff --git a/doc/html/pcre2jit.html b/doc/html/pcre2jit.html index 48ee122..b1aa326 100644 --- a/doc/html/pcre2jit.html +++ b/doc/html/pcre2jit.html @@ -152,6 +152,10 @@ PCRE2_NO_UTF_CHECK, PCRE2_PARTIAL_HARD, and PCRE2_PARTIAL_SOFT. The PCRE2_ANCHORED option is not supported at match time. </P> <P> +If the PCRE2_NO_JIT option is passed to <b>pcre2_match()</b> it disables the +use of JIT, forcing matching by the interpreter code. +</P> +<P> The only unsupported pattern items are \C (match a single data unit) when running in a UTF mode, and a callout immediately before an assertion condition in a conditional group. @@ -403,7 +407,7 @@ The fast path function is called <b>pcre2_jit_match()</b>, and it takes exactly the same arguments as <b>pcre2_match()</b>. The return values are also the same, plus PCRE2_ERROR_JIT_BADOPTION if a matching mode (partial or complete) is requested that was not compiled. Unsupported option bits (for example, -PCRE2_ANCHORED) are ignored. +PCRE2_ANCHORED) are ignored, as is the PCRE2_NO_JIT option. </P> <P> When you call <b>pcre2_match()</b>, as well as testing for invalid options, a @@ -432,9 +436,9 @@ Cambridge, England. </P> <br><a name="SEC13" href="#TOC1">REVISION</a><br> <P> -Last updated: 14 November 2015 +Last updated: 05 June 2016 <br> -Copyright © 1997-2015 University of Cambridge. +Copyright © 1997-2016 University of Cambridge. <br> <p> Return to the <a href="index.html">PCRE2 index page</a>. diff --git a/doc/html/pcre2serialize.html b/doc/html/pcre2serialize.html index 3747c0a..97ee138 100644 --- a/doc/html/pcre2serialize.html +++ b/doc/html/pcre2serialize.html @@ -14,10 +14,11 @@ please consult the man page, in case the conversion went wrong. 
<br> <ul> <li><a name="TOC1" href="#SEC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a> -<li><a name="TOC2" href="#SEC2">SAVING COMPILED PATTERNS</a> -<li><a name="TOC3" href="#SEC3">RE-USING PRECOMPILED PATTERNS</a> -<li><a name="TOC4" href="#SEC4">AUTHOR</a> -<li><a name="TOC5" href="#SEC5">REVISION</a> +<li><a name="TOC2" href="#SEC2">SECURITY CONCERNS</a> +<li><a name="TOC3" href="#SEC3">SAVING COMPILED PATTERNS</a> +<li><a name="TOC4" href="#SEC4">RE-USING PRECOMPILED PATTERNS</a> +<li><a name="TOC5" href="#SEC5">AUTHOR</a> +<li><a name="TOC6" href="#SEC6">REVISION</a> </ul> <br><a name="SEC1" href="#TOC1">SAVING AND RE-USING PRECOMPILED PCRE2 PATTERNS</a><br> <P> @@ -48,7 +49,15 @@ and PCRE2_SIZE type. For example, patterns compiled on a 32-bit system using PCRE2's 16-bit library cannot be reloaded on a 64-bit system, nor can they be reloaded using the 8-bit library. </P> -<br><a name="SEC2" href="#TOC1">SAVING COMPILED PATTERNS</a><br> +<br><a name="SEC2" href="#TOC1">SECURITY CONCERNS</a><br> +<P> +The facility for saving and restoring compiled patterns is intended for use +within individual applications. As such, the data supplied to +<b>pcre2_serialize_decode()</b> is expected to be trusted data, not data from +arbitrary external sources. There is only some simple consistency checking, not +complete validation of what is being re-loaded. +</P> +<br><a name="SEC3" href="#TOC1">SAVING COMPILED PATTERNS</a><br> <P> Before compiled patterns can be saved they must be serialized, that is, converted to a stream of bytes. A single byte stream may contain any number of @@ -110,7 +119,7 @@ still be used for matching. Their memory must eventually be freed in the usual way by calling <b>pcre2_code_free()</b>. When you have finished with the byte stream, it too must be freed by calling <b>pcre2_serialize_free()</b>. 
</P> -<br><a name="SEC3" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br> +<br><a name="SEC4" href="#TOC1">RE-USING PRECOMPILED PATTERNS</a><br> <P> In order to re-use a set of saved patterns you must first make the serialized byte stream available in main memory (for example, by reading from a file). The @@ -142,11 +151,12 @@ is filled with those that fit, and the remainder are ignored. The yield of the function is the number of decoded patterns, or one of the following negative error codes: <pre> - PCRE2_ERROR_BADDATA second argument is zero or less - PCRE2_ERROR_BADMAGIC mismatch of id bytes in the data - PCRE2_ERROR_BADMODE mismatch of variable unit size or PCRE2 version - PCRE2_ERROR_MEMORY memory allocation failed - PCRE2_ERROR_NULL first or third argument is NULL + PCRE2_ERROR_BADDATA second argument is zero or less + PCRE2_ERROR_BADMAGIC mismatch of id bytes in the data + PCRE2_ERROR_BADMODE mismatch of code unit size or PCRE2 version + PCRE2_ERROR_BADSERIALIZEDDATA other sanity check failure + PCRE2_ERROR_MEMORY memory allocation failed + PCRE2_ERROR_NULL first or third argument is NULL </pre> PCRE2_ERROR_BADMAGIC may mean that the data is corrupt, or that it was compiled on a system with different endianness. @@ -169,7 +179,7 @@ serialized, the JIT data is discarded and so is no longer available after a save/restore cycle. You can, however, process a restored pattern with <b>pcre2_jit_compile()</b> if you wish. </P> -<br><a name="SEC4" href="#TOC1">AUTHOR</a><br> +<br><a name="SEC5" href="#TOC1">AUTHOR</a><br> <P> Philip Hazel <br> @@ -178,11 +188,11 @@ University Computing Service Cambridge, England. <br> </P> -<br><a name="SEC5" href="#TOC1">REVISION</a><br> +<br><a name="SEC6" href="#TOC1">REVISION</a><br> <P> -Last updated: 03 November 2015 +Last updated: 24 May 2016 <br> -Copyright © 1997-2015 University of Cambridge. +Copyright © 1997-2016 University of Cambridge. <br> <p> Return to the <a href="index.html">PCRE2 index page</a>. 
diff --git a/doc/html/pcre2test.html b/doc/html/pcre2test.html index d0cc2ec..bbe8fa5 100644 --- a/doc/html/pcre2test.html +++ b/doc/html/pcre2test.html @@ -962,6 +962,7 @@ for a description of their effects. anchored set PCRE2_ANCHORED dfa_restart set PCRE2_DFA_RESTART dfa_shortest set PCRE2_DFA_SHORTEST + no_jit set PCRE2_NO_JIT no_utf_check set PCRE2_NO_UTF_CHECK notbol set PCRE2_NOTBOL notempty set PCRE2_NOTEMPTY @@ -1697,7 +1698,7 @@ Cambridge, England. </P> <br><a name="SEC21" href="#TOC1">REVISION</a><br> <P> -Last updated: 06 February 2016 +Last updated: 05 June 2016 <br> Copyright © 1997-2016 University of Cambridge. <br> |
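The documentation changes above describe when pcre2_match() takes the JIT path: the pattern must have been processed by pcre2_jit_compile(), PCRE2_NO_JIT must not be set, and PCRE2_ANCHORED must not be set at match time. The sketch below is a toy model of that decision only, not PCRE2's actual implementation. The DEMO_ constants and the function demo_would_use_jit() are hypothetical names invented for illustration, and the bit values are arbitrary rather than the real values from pcre2.h.

```c
#include <stdint.h>

/* Hypothetical stand-ins for match-time option bits. The values are
   arbitrary and do NOT correspond to the real PCRE2_* values. */
#define DEMO_ANCHORED 0x1u
#define DEMO_NOTBOL   0x2u
#define DEMO_NO_JIT   0x4u

/* Toy model of the dispatch rule described in the docs above.
   Returns 1 if the JIT path would be taken, 0 for the interpreter. */
int demo_would_use_jit(int jit_compiled, uint32_t options)
{
    if (!jit_compiled)           return 0; /* no JIT code exists for the pattern */
    if (options & DEMO_NO_JIT)   return 0; /* caller explicitly disabled JIT */
    if (options & DEMO_ANCHORED) return 0; /* not supported by JIT at match time */
    return 1;                              /* remaining options are JIT-compatible */
}
```

In real code the equivalent step is simply passing PCRE2_NO_JIT in the options argument of pcre2_match(). According to the pcre2jit.html change above, pcre2_jit_match() ignores that option.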