author | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2014-02-09 18:55:03 +0000 |
---|---|---|
committer | ph10 <ph10@2f5784b3-3f2a-0410-8824-cb99058d5e15> | 2014-02-09 18:55:03 +0000 |
commit | 8546a4442ddba4fa7ac6143f8c88ebce46fa3dd2 (patch) | |
tree | ac1a23f96b3b5d2671387e4e066486bbb88fa16e | |
parent | 4cb4c9cc746011f77bb8d1b6f5a8480b86e4c8ee (diff) | |
Implement pcre_stack_guard.
git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1454 2f5784b3-3f2a-0410-8824-cb99058d5e15
-rw-r--r-- | ChangeLog | 6 | ||||
-rw-r--r-- | doc/pcreapi.3 | 24 | ||||
-rw-r--r-- | doc/pcretest.1 | 14 | ||||
-rw-r--r-- | pcre.h.in | 8 | ||||
-rw-r--r-- | pcre_compile.c | 12 | ||||
-rw-r--r-- | pcre_globals.c | 4 | ||||
-rw-r--r-- | pcre_internal.h | 2 | ||||
-rw-r--r-- | pcreposix.c | 6 | ||||
-rw-r--r-- | pcretest.c | 71 | ||||
-rw-r--r-- | testdata/testinput2 | 8 | ||||
-rw-r--r-- | testdata/testoutput2 | 10 |
11 files changed, 146 insertions, 19 deletions
@@ -99,6 +99,12 @@ Version 8.35-RC1 xx-xxxx-201x
 20. The fast forward newline mechanism could enter to an infinite loop on
 certain invalid UTF-8 input. Although we don't support these cases
 this issue can be fixed by a performance optimization.
+
+21. Change 33 of 8.34 is not sufficient to ensure stack safety because it does
+ not take account if existing stack usage. There is now a new global
+ variable called pcre_stack_guard that can be set to point to an external
+ function to check stack availability. It is called at the start of
+ processing every parenthesized group.
 Version 8.34 15-December-2013
diff --git a/doc/pcreapi.3 b/doc/pcreapi.3
index 0404939..8ffa9b7 100644
--- a/doc/pcreapi.3
+++ b/doc/pcreapi.3
@@ -1,4 +1,4 @@
-.TH PCREAPI 3 "03 January 2014" "PCRE 8.35"
+.TH PCREAPI 3 "09 February 2014" "PCRE 8.35"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .sp
@@ -116,6 +116,8 @@ PCRE - Perl-compatible regular expressions
 .B void (*pcre_stack_free)(void *);
 .sp
 .B int (*pcre_callout)(pcre_callout_block *);
+.sp
+.B int (*pcre_stack_guard)(void);
 .fi
 .
 .
@@ -286,6 +288,14 @@ points during a matching operation. Details are given in the
 \fBpcrecallout\fP
 .\"
 documentation.
+.P
+The global variable \fBpcre_stack_guard\fP initially contains NULL. It can be
+set by the caller to a function that is called by PCRE whenever it starts
+to compile a parenthesized part of a pattern. When parentheses are nested, PCRE
+uses recursive function calls, which use up the system stack. This function is
+provided so that applications with restricted stacks can force a compilation
+error if the stack runs out. The function should return zero if all is well, or
+non-zero to force an error.
 .
 .
 .\" HTML <a name="newlines"></a>
@@ -337,7 +347,8 @@ controlled in a similar way, but by separate options.
 The PCRE functions can be used in multi-threading applications, with the
 proviso that the memory management functions pointed to by \fBpcre_malloc\fP,
 \fBpcre_free\fP, \fBpcre_stack_malloc\fP, and \fBpcre_stack_free\fP, and the
-callout function pointed to by \fBpcre_callout\fP, are shared by all threads.
+callout and stack-checking functions pointed to by \fBpcre_callout\fP and
+\fBpcre_stack_guard\fP, are shared by all threads.
 .P
 The compiled form of a regular expression is not altered during matching, so
 the same compiled pattern can safely be used by several threads at once.
@@ -465,7 +476,10 @@ documentation.
 The output is a long integer that gives the maximum depth of nesting of
 parentheses (of any kind) in a pattern. This limit is imposed to cap the amount
 of system stack used when a pattern is compiled. It is specified when PCRE is
-built; the default is 250.
+built; the default is 250. This limit does not take into account the stack that
+may already be used by the calling application. For finer control over
+compilation stack usage, you can set a pointer to an external checking function
+in \fBpcre_stack_guard\fP.
 .sp
 PCRE_CONFIG_MATCH_LIMIT
 .sp
@@ -991,6 +1005,8 @@ have fallen out of use. To avoid confusion, they have not been re-used.
 81 missing opening brace after \eo
 82 parentheses are too deeply nested
 83 invalid range in character class
+ 84 group name must start with a non-digit
+ 85 parentheses are too deeply nested (stack check)
 .sp
 The numbers 32 and 10000 in errors 48 and 49 are defaults; different values may
 be used if the limits were changed when PCRE was built.
@@ -2898,6 +2914,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 03 January 2014
+Last updated: 09 February 2014
 Copyright (c) 1997-2014 University of Cambridge.
 .fi
diff --git a/doc/pcretest.1 b/doc/pcretest.1
index acdfcc7..733fc6f 100644
--- a/doc/pcretest.1
+++ b/doc/pcretest.1
@@ -1,4 +1,4 @@
-.TH PCRETEST 1 "17 January 2014" "PCRE 8.35"
+.TH PCRETEST 1 "09 February 2014" "PCRE 8.35"
 .SH NAME
 pcretest - a program for testing Perl-compatible regular expressions.
 .SH SYNOPSIS
@@ -333,6 +333,7 @@ sections.
 \fB/N\fP set PCRE_NO_AUTO_CAPTURE
 \fB/O\fP set PCRE_NO_AUTO_POSSESS
 \fB/P\fP use the POSIX wrapper
+ \fB/Q\fP test external stack check function
 \fB/S\fP study the pattern after compilation
 \fB/s\fP set PCRE_DOTALL
 \fB/T\fP select character tables
@@ -519,6 +520,15 @@ the compiled pattern to be output. This does not include the size of the
 successfully studied with the PCRE_STUDY_JIT_COMPILE option, the size of the
 JIT compiled code is also output.
 .P
+The \fB/Q\fP modifier is used to test the use of \fBpcre_stack_guard\fP. It
+must be followed by '0' or '1', specifying the return code to be given from an
+external function that is passed to PCRE and used for stack checking during
+compilation (see the
+.\" HREF
+\fBpcreapi\fP
+.\"
+documentation for details).
+.P
 The \fB/S\fP modifier causes \fBpcre[16|32]_study()\fP to be called after the
 expression has been compiled, and the results used when the expression is
 matched. There are a number of qualifying characters that may follow \fB/S\fP.
@@ -1141,6 +1151,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 17 January 2014
+Last updated: 09 February 2014
 Copyright (c) 1997-2014 University of Cambridge.
 .fi
@@ -5,7 +5,7 @@
 /* This is the public header file for the PCRE library, to be #included by
 applications that call the PCRE functions.
- Copyright (c) 1997-2013 University of Cambridge
+ Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -491,36 +491,42 @@ PCRE_EXP_DECL void (*pcre_free)(void *);
 PCRE_EXP_DECL void *(*pcre_stack_malloc)(size_t);
 PCRE_EXP_DECL void (*pcre_stack_free)(void *);
 PCRE_EXP_DECL int (*pcre_callout)(pcre_callout_block *);
+PCRE_EXP_DECL int (*pcre_stack_guard)(void);
 PCRE_EXP_DECL void *(*pcre16_malloc)(size_t);
 PCRE_EXP_DECL void (*pcre16_free)(void *);
 PCRE_EXP_DECL void *(*pcre16_stack_malloc)(size_t);
 PCRE_EXP_DECL void (*pcre16_stack_free)(void *);
 PCRE_EXP_DECL int (*pcre16_callout)(pcre16_callout_block *);
+PCRE_EXP_DECL int (*pcre16_stack_guard)(void);
 PCRE_EXP_DECL void *(*pcre32_malloc)(size_t);
 PCRE_EXP_DECL void (*pcre32_free)(void *);
 PCRE_EXP_DECL void *(*pcre32_stack_malloc)(size_t);
 PCRE_EXP_DECL void (*pcre32_stack_free)(void *);
 PCRE_EXP_DECL int (*pcre32_callout)(pcre32_callout_block *);
+PCRE_EXP_DECL int (*pcre32_stack_guard)(void);
 #else /* VPCOMPAT */
 PCRE_EXP_DECL void *pcre_malloc(size_t);
 PCRE_EXP_DECL void pcre_free(void *);
 PCRE_EXP_DECL void *pcre_stack_malloc(size_t);
 PCRE_EXP_DECL void pcre_stack_free(void *);
 PCRE_EXP_DECL int pcre_callout(pcre_callout_block *);
+PCRE_EXP_DECL int pcre_stack_guard(void);
 PCRE_EXP_DECL void *pcre16_malloc(size_t);
 PCRE_EXP_DECL void pcre16_free(void *);
 PCRE_EXP_DECL void *pcre16_stack_malloc(size_t);
 PCRE_EXP_DECL void pcre16_stack_free(void *);
 PCRE_EXP_DECL int pcre16_callout(pcre16_callout_block *);
+PCRE_EXP_DECL int pcre16_stack_guard(void);
 PCRE_EXP_DECL void *pcre32_malloc(size_t);
 PCRE_EXP_DECL void pcre32_free(void *);
 PCRE_EXP_DECL void *pcre32_stack_malloc(size_t);
 PCRE_EXP_DECL void pcre32_stack_free(void *);
 PCRE_EXP_DECL int pcre32_callout(pcre32_callout_block *);
+PCRE_EXP_DECL int pcre32_stack_guard(void);
 #endif /* VPCOMPAT */
 /* User defined callback which provides a stack just before the match starts. */
diff --git a/pcre_compile.c b/pcre_compile.c
index 99017c4..e220799 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -547,6 +547,8 @@ static const char error_texts[] =
 "parentheses are too deeply nested\0"
 "invalid range in character class\0"
 "group name must start with a non-digit\0"
+ /* 85 */
+ "parentheses are too deeply nested (stack check)\0"
 ;
 /* Table to identify digits and hex digits. This is used when compiling
@@ -8033,6 +8035,16 @@ unsigned int orig_bracount;
 unsigned int max_bracount;
 branch_chain bc;
+/* If set, call the external function that checks for stack availability. */
+
+if (PUBL(stack_guard) != NULL && PUBL(stack_guard)())
+ {
+ *errorcodeptr= ERR85;
+ return FALSE;
+ }
+
+/* Miscellaneous initialization */
+
 bc.outer = bcptr;
 bc.current_branch = code;
diff --git a/pcre_globals.c b/pcre_globals.c
index 36e6ddb..0f106aa 100644
--- a/pcre_globals.c
+++ b/pcre_globals.c
@@ -6,7 +6,7 @@ and semantics are as close as possible to those of the Perl 5 language.
 Written by Philip Hazel
- Copyright (c) 1997-2012 University of Cambridge
+ Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -72,6 +72,7 @@ PCRE_EXP_DATA_DEFN void (*PUBL(free))(void *) = LocalPcreFree;
 PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = LocalPcreMalloc;
 PCRE_EXP_DATA_DEFN void (*PUBL(stack_free))(void *) = LocalPcreFree;
 PCRE_EXP_DATA_DEFN int (*PUBL(callout))(PUBL(callout_block) *) = NULL;
+PCRE_EXP_DATA_DEFN int (*PUBL(stack_guard))(void) = NULL;
 #elif !defined VPCOMPAT
 PCRE_EXP_DATA_DEFN void *(*PUBL(malloc))(size_t) = malloc;
@@ -79,6 +80,7 @@ PCRE_EXP_DATA_DEFN void (*PUBL(free))(void *) = free;
 PCRE_EXP_DATA_DEFN void *(*PUBL(stack_malloc))(size_t) = malloc;
 PCRE_EXP_DATA_DEFN void (*PUBL(stack_free))(void *) = free;
 PCRE_EXP_DATA_DEFN int (*PUBL(callout))(PUBL(callout_block) *) = NULL;
+PCRE_EXP_DATA_DEFN int (*PUBL(stack_guard))(void) = NULL;
 #endif
 /* End of pcre_globals.c */
diff --git a/pcre_internal.h b/pcre_internal.h
index 7e07d63..2c7f5f8 100644
--- a/pcre_internal.h
+++ b/pcre_internal.h
@@ -2281,7 +2281,7 @@ enum { ERR0, ERR1, ERR2, ERR3, ERR4, ERR5, ERR6, ERR7, ERR8, ERR9,
 ERR50, ERR51, ERR52, ERR53, ERR54, ERR55, ERR56, ERR57, ERR58, ERR59,
 ERR60, ERR61, ERR62, ERR63, ERR64, ERR65, ERR66, ERR67, ERR68, ERR69,
 ERR70, ERR71, ERR72, ERR73, ERR74, ERR75, ERR76, ERR77, ERR78, ERR79,
- ERR80, ERR81, ERR82, ERR83, ERR84, ERRCOUNT };
+ ERR80, ERR81, ERR82, ERR83, ERR84, ERR85, ERRCOUNT };
 /* JIT compiling modes. The function list is indexed by them. */
diff --git a/pcreposix.c b/pcreposix.c
index 7cf4a4a..ed50619 100644
--- a/pcreposix.c
+++ b/pcreposix.c
@@ -6,7 +6,7 @@ and semantics are as close as possible to those of the Perl 5 language.
 Written by Philip Hazel
- Copyright (c) 1997-2012 University of Cambridge
+ Copyright (c) 1997-2014 University of Cambridge
 -----------------------------------------------------------------------------
 Redistribution and use in source and binary forms, with or without
@@ -170,7 +170,9 @@ static const int eint[] = {
 REG_BADPAT, /* missing opening brace after \o */
 REG_BADPAT, /* parentheses too deeply nested */
 REG_BADPAT, /* invalid range in character class */
- REG_BADPAT /* group name must start with a non-digit */
+ REG_BADPAT, /* group name must start with a non-digit */
+ /* 85 */
+ REG_BADPAT /* parentheses too deeply nested (stack check) */
 };
 /* Table of texts corresponding to POSIX error codes */
@@ -233,6 +233,9 @@ argument, the casting might be incorrectly applied. */
 #define SET_PCRE_CALLOUT8(callout) \
 pcre_callout = callout
+#define SET_PCRE_STACK_GUARD8(stack_guard) \
+ pcre_stack_guard = stack_guard
+
 #define PCRE_ASSIGN_JIT_STACK8(extra, callback, userdata) \
 pcre_assign_jit_stack(extra, callback, userdata)
@@ -317,6 +320,9 @@ argument, the casting might be incorrectly applied. */
 #define SET_PCRE_CALLOUT16(callout) \
 pcre16_callout = (int (*)(pcre16_callout_block *))callout
+#define SET_PCRE_STACK_GUARD16(stack_guard) \
+ pcre16_stack_guard = (int (*)(void))stack_guard
+
 #define PCRE_ASSIGN_JIT_STACK16(extra, callback, userdata) \
 pcre16_assign_jit_stack((pcre16_extra *)extra, \
 (pcre16_jit_callback)callback, userdata)
@@ -406,6 +412,9 @@ argument, the casting might be incorrectly applied. */
 #define SET_PCRE_CALLOUT32(callout) \
 pcre32_callout = (int (*)(pcre32_callout_block *))callout
+#define SET_PCRE_STACK_GUARD32(stack_guard) \
+ pcre32_stack_guard = (int (*)(void))stack_guard
+
 #define PCRE_ASSIGN_JIT_STACK32(extra, callback, userdata) \
 pcre32_assign_jit_stack((pcre32_extra *)extra, \
 (pcre32_jit_callback)callback, userdata)
@@ -533,6 +542,14 @@ cases separately. */
 else \
 SET_PCRE_CALLOUT8(callout)
+#define SET_PCRE_STACK_GUARD(stack_guard) \
+ if (pcre_mode == PCRE32_MODE) \
+ SET_PCRE_STACK_GUARD32(stack_guard); \
+ else if (pcre_mode == PCRE16_MODE) \
+ SET_PCRE_STACK_GUARD16(stack_guard); \
+ else \
+ SET_PCRE_STACK_GUARD8(stack_guard)
+
 #define STRLEN(p) (pcre_mode == PCRE32_MODE ? STRLEN32(p) : pcre_mode == PCRE16_MODE ? STRLEN16(p) : STRLEN8(p))
 #define PCRE_ASSIGN_JIT_STACK(extra, callback, userdata) \
@@ -756,6 +773,12 @@ the three different cases. */
 else \
 G(SET_PCRE_CALLOUT,BITTWO)(callout)
+#define SET_PCRE_STACK_GUARD(stack_guard) \
+ if (pcre_mode == G(G(PCRE,BITONE),_MODE)) \
+ G(SET_PCRE_STACK_GUARD,BITONE)(stack_guard); \
+ else \
+ G(SET_PCRE_STACK_GUARD,BITTWO)(stack_guard)
+
 #define STRLEN(p) ((pcre_mode == G(G(PCRE,BITONE),_MODE)) ? \
 G(STRLEN,BITONE)(p) : G(STRLEN,BITTWO)(p))
@@ -897,6 +920,7 @@ the three different cases. */
 #define PCHARSV PCHARSV8
 #define READ_CAPTURE_NAME READ_CAPTURE_NAME8
 #define SET_PCRE_CALLOUT SET_PCRE_CALLOUT8
+#define SET_PCRE_STACK_GUARD SET_PCRE_STACK_GUARD8
 #define STRLEN STRLEN8
 #define PCRE_ASSIGN_JIT_STACK PCRE_ASSIGN_JIT_STACK8
 #define PCRE_COMPILE PCRE_COMPILE8
@@ -927,6 +951,7 @@ the three different cases. */
 #define PCHARSV PCHARSV16
 #define READ_CAPTURE_NAME READ_CAPTURE_NAME16
 #define SET_PCRE_CALLOUT SET_PCRE_CALLOUT16
+#define SET_PCRE_STACK_GUARD SET_PCRE_STACK_GUARD16
 #define STRLEN STRLEN16
 #define PCRE_ASSIGN_JIT_STACK PCRE_ASSIGN_JIT_STACK16
 #define PCRE_COMPILE PCRE_COMPILE16
@@ -957,6 +982,7 @@ the three different cases. */
 #define PCHARSV PCHARSV32
 #define READ_CAPTURE_NAME READ_CAPTURE_NAME32
 #define SET_PCRE_CALLOUT SET_PCRE_CALLOUT32
+#define SET_PCRE_STACK_GUARD SET_PCRE_STACK_GUARD32
 #define STRLEN STRLEN32
 #define PCRE_ASSIGN_JIT_STACK PCRE_ASSIGN_JIT_STACK32
 #define PCRE_COMPILE PCRE_COMPILE32
@@ -1015,6 +1041,7 @@ static int first_callout;
 static int jit_was_used;
 static int locale_set = 0;
 static int show_malloc;
+static int stack_guard_return;
 static int use_utf;
 static const unsigned char *last_callout_mark = NULL;
@@ -2201,6 +2228,18 @@ return p;
 /*************************************************
+* Stack guard function *
+*************************************************/
+
+/* Called from PCRE when set in pcre_stack_guard. We give an error (non-zero)
+return when a count overflows. */
+
+static int stack_guard(void)
+{
+return stack_guard_return;
+}
+
+/*************************************************
 * Callout function *
 *************************************************/
@@ -3445,6 +3484,7 @@ while (!done)
 use_utf = 0;
 debug_lengths = 1;
+ SET_PCRE_STACK_GUARD(NULL);
 if (extend_inputline(infile, buffer, " re> ") == NULL) break;
 if (infile != stdin) fprintf(outfile, "%s", (char *)buffer);
@@ -3745,6 +3785,21 @@ while (!done)
 case 'P': do_posix = 1; break;
 #endif
+ case 'Q':
+ switch (*pp)
+ {
+ case '0':
+ case '1':
+ stack_guard_return = *pp++ - '0';
+ break;
+
+ default:
+ fprintf(outfile, "** Missing 0 or 1 after /Q\n");
+ goto SKIP_DATA;
+ }
+ SET_PCRE_STACK_GUARD(stack_guard);
+ break;
+
 case 'S':
 do_study = 1;
 for (;;)
@@ -5198,7 +5253,7 @@ while (!done)
 if (count * 2 > use_size_offsets)
 count = use_size_offsets/2;
 }
- /* Output the captured substrings. Note that, for the matched string,
+ /* Output the captured substrings. Note that, for the matched string,
 the use of \K in an assertion can make the start later than the end. */
 for (i = 0; i < count * 2; i += 2)
@@ -5217,23 +5272,23 @@ while (!done)
 {
 int start = use_offsets[i];
 int end = use_offsets[i+1];
-
+
 if (start > end)
 {
 start = use_offsets[i+1];
 end = use_offsets[i];
- fprintf(outfile, "Start of matched string is beyond its end - "
- "displaying from end to start.\n");
- }
-
+ fprintf(outfile, "Start of matched string is beyond its end - "
+ "displaying from end to start.\n");
+ }
+
 fprintf(outfile, "%2d: ", i/2);
 PCHARSV(bptr, start, end - start, outfile);
 if (verify_jit && jit_was_used) fprintf(outfile, " (JIT)");
 fprintf(outfile, "\n");
-
+
 /* Note: don't use the start/end variables here because we want to
 show the text from what is reported as the end. */
-
+
 if (do_showcaprest || (i == 0 && do_showrest))
 {
 fprintf(outfile, "%2d+ ", i/2);
diff --git a/testdata/testinput2 b/testdata/testinput2
index c1b1db1..71df1a8 100644
--- a/testdata/testinput2
+++ b/testdata/testinput2
@@ -4050,5 +4050,13 @@ backtracking verbs. --/
 /abcd/f<lf>
 xx\nxabcd
+
+/ -- Test stack check external calls --/
+
+/(((((a)))))/Q0
+
+/(((((a)))))/Q1
+
+/(((((a)))))/Q
 /-- End of testinput2 --/
diff --git a/testdata/testoutput2 b/testdata/testoutput2
index e3e2f58..e9d3265 100644
--- a/testdata/testoutput2
+++ b/testdata/testoutput2
@@ -14134,5 +14134,15 @@ Start of matched string is beyond its end - displaying from end to start.
 /abcd/f<lf>
 xx\nxabcd
 No match
+
+/ -- Test stack check external calls --/
+
+/(((((a)))))/Q0
+
+/(((((a)))))/Q1
+Failed: parentheses are too deeply nested (stack check) at offset 0
+
+/(((((a)))))/Q
+** Missing 0 or 1 after /Q
 /-- End of testinput2 --/
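
The check added to pcre_compile.c calls the function held in pcre_stack_guard at the start of every parenthesized group and gives error 85 when it returns non-zero. Below is a minimal sketch of what an application-side guard might look like. It does not link against PCRE: demo_stack_guard stands in for the real pcre_stack_guard global, demo_compile_group is a mock of the recursive group compiler, and every demo_ name and the 64 KiB budget are assumptions made for illustration. Estimating stack use by comparing the addresses of local variables is also an assumption (it presumes a single contiguous stack) and is not something this commit prescribes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the pcre_stack_guard global added by this
commit: same type, int (*)(void), and initially NULL. */
static int (*demo_stack_guard)(void) = NULL;

static uintptr_t demo_stack_base;             /* address noted before compiling */
static size_t demo_stack_budget = 64 * 1024;  /* assumed limit, in bytes */

/* Record roughly where the stack is now. */
static void demo_mark_stack_base(void)
{
volatile char marker = 0;
demo_stack_base = (uintptr_t)&marker;
}

/* Guard function: zero if all is well, non-zero to force an error.
Works whether the stack grows up or down. */
static int demo_guard(void)
{
volatile char marker = 0;
uintptr_t here = (uintptr_t)&marker;
uintptr_t used = (here > demo_stack_base) ?
  here - demo_stack_base : demo_stack_base - here;
return used > demo_stack_budget;
}

/* Mock of the recursive group compiler. One call per nesting level; the
guard is consulted first, mirroring the check in the diff. Returns 1 on
success and 0 when the guard vetoes the compilation. */
static int demo_compile_group(int depth)
{
volatile char pad[1024];   /* makes each level use a known amount of stack */
int ok;

if (demo_stack_guard != NULL && demo_stack_guard()) return 0;

pad[0] = 0;
ok = (depth <= 1) ? 1 : demo_compile_group(depth - 1);
pad[1] = (char)ok;         /* keeps the frame live across the call */
return ok;
}

/* Install the guard and "compile" a pattern nested to the given depth. */
static int demo_try(int depth)
{
demo_mark_stack_base();
demo_stack_guard = demo_guard;
return demo_compile_group(depth);
}
```

With the real library the corresponding step would be assigning the guard function to pcre_stack_guard before calling pcre_compile(); a non-zero return then shows up as the "parentheses are too deeply nested (stack check)" error, as in the /Q1 test output above.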