Diffstat (limited to 'doc/pcreapi.3')
-rw-r--r-- | doc/pcreapi.3 | 26 |
1 files changed, 25 insertions, 1 deletions
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index fbd3d5d..0149f50 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -371,6 +371,18 @@ in the main .\" page.
+  PCRE_NO_UTF8_CHECK
+
+When PCRE_UTF8 is set, the validity of the pattern as a UTF-8 string is
+automatically checked. If an invalid UTF-8 sequence of bytes is found,
+\fBpcre_compile()\fR returns an error. If you already know that your pattern is
+valid, and you want to skip this check for performance reasons, you can set the
+PCRE_NO_UTF8_CHECK option. When it is set, the effect of passing an invalid
+UTF-8 string as a pattern is undefined. It may cause your program to crash.
+Note that there is a similar option for suppressing the checking of subject
+strings passed to \fBpcre_exec()\fR.
+
+
 .SH STUDYING A PATTERN
 .rs
 .sp
@@ -698,6 +710,14 @@ first matching position. However, if a pattern was compiled with PCRE_ANCHORED, or turned out to be anchored by virtue of its contents, it cannot be made unachored at matching time.
+When PCRE_UTF8 was set at compile time, the validity of the subject as a UTF-8
+string is automatically checked. If an invalid UTF-8 sequence of bytes is
+found, \fBpcre_exec()\fR returns the error PCRE_ERROR_BADUTF8. If you already
+know that your subject is valid, and you want to skip this check for
+performance reasons, you can set the PCRE_NO_UTF8_CHECK option when calling
+\fBpcre_exec()\fR. When this option is set, the effect of passing an invalid
+UTF-8 string as a subject is undefined. It may cause your program to crash.
+
 There are also three further options that can be set only at matching time:
   PCRE_NOTBOL
@@ -872,6 +892,10 @@ This error is never generated by \fBpcre_exec()\fR itself. It is provided for use by callout functions that want to yield a distinctive error code. See the \fBpcrecallout\fR documentation for details.
+  PCRE_ERROR_BADUTF8 (-10)
+
+A string that contains an invalid UTF-8 byte sequence was passed as a subject.
+
 .SH EXTRACTING CAPTURED SUBSTRINGS BY NUMBER
 .rs
 .sp
@@ -1011,6 +1035,6 @@ then call \fIpcre_copy_substring()\fR or \fIpcre_get_substring()\fR, as appropriate.
 .in 0
-Last updated: 03 February 2003
+Last updated: 20 August 2003
 .br
 Copyright (c) 1997-2003 University of Cambridge.
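
The added text says PCRE_NO_UTF8_CHECK is only safe when the caller already knows the pattern or subject is valid UTF-8. A minimal sketch of how a caller might establish that once, before matching the same subject repeatedly, is given below. The helper name is_valid_utf8 is hypothetical and is not part of PCRE. It applies the RFC 3629 rules (at most four bytes per character, no overlong forms, no surrogates), which is an assumption on my part: the diff does not say exactly which rules PCRE's own check applies.

```c
#include <stddef.h>

/* Hypothetical helper, not part of PCRE: returns 1 if the first len bytes
   of s are well-formed UTF-8 under RFC 3629, and 0 otherwise. */
int is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        size_t extra;        /* continuation bytes expected after c */
        unsigned long cp;    /* code point being decoded */
        unsigned long min;   /* smallest code point legal at this length */
        size_t k;

        if (c < 0x80) {      /* plain ASCII byte */
            i++;
            continue;
        } else if (c >= 0xC2 && c <= 0xDF) {
            extra = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            extra = 2; cp = c & 0x0F; min = 0x800;
        } else if (c >= 0xF0 && c <= 0xF4) {
            extra = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;        /* stray continuation byte or illegal lead byte */
        }

        if (len - i - 1 < extra)
            return 0;        /* sequence is cut off by the end of the buffer */

        for (k = 1; k <= extra; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;    /* not a continuation byte */
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }

        if (cp < min)
            return 0;        /* overlong encoding */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;        /* UTF-16 surrogate */
        if (cp > 0x10FFFF)
            return 0;        /* beyond the Unicode range */

        i += extra + 1;
    }
    return 1;
}
```

A caller could run this once on a subject and, only if it returns 1, pass PCRE_NO_UTF8_CHECK in the options argument of later pcre_exec() calls on that same unmodified buffer. If it returns 0, leaving the option unset lets pcre_exec() report PCRE_ERROR_BADUTF8 as described in the diff.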