From da94ea30b05d1eb6bb763a51708e3e72cdbb4158 Mon Sep 17 00:00:00 2001
From: ph10
Date: Fri, 6 Feb 2015 16:10:27 +0000
Subject: Catch auto-possessification potential loop for bad UTF pattern with NO_UTF_CHECK.

git-svn-id: svn://vcs.exim.org/pcre/code/trunk@1518 2f5784b3-3f2a-0410-8824-cb99058d5e15
---
 ChangeLog      | 4 ++++
 pcre_compile.c | 8 ++++++++
 2 files changed, 12 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index fe2efb2..8425b08 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -23,6 +23,10 @@ Version 8.37 xx-xxx-2015
 5. Fixed a memory leak during matching that could occur for a subpattern
    subroutine call (recursive or otherwise) if the number of captured groups
    that had to be saved was greater than ten.
+
+6. Catch a bad opcode during auto-possessification after compiling a bad UTF
+   string with NO_UTF_CHECK. This is a tidyup, not a bug fix, as passing bad
+   UTF with NO_UTF_CHECK is documented as having an undefined outcome.
 
 
 Version 8.36 26-September-2014
diff --git a/pcre_compile.c b/pcre_compile.c
index efc0b21..03f5e56 100644
--- a/pcre_compile.c
+++ b/pcre_compile.c
@@ -3610,6 +3610,14 @@ for (;;)
   {
   c = *code;
 
+  /* When a pattern with bad UTF-8 encoding is compiled with NO_UTF_CHECK,
+  it may compile without complaining, but may get into a loop here if the code
+  pointer points to a bad value. This is, of course a documentated possibility,
+  when NO_UTF_CHECK is set, so it isn't a bug, but we can detect this case and
+  just give up on this optimization. */
+
+  if (c >= OP_TABLE_LENGTH) return;
+
   if (c >= OP_STAR && c <= OP_TYPEPOSUPTO)
     {
     c -= get_repeat_base(c) - OP_STAR;
-- 
cgit v1.2.1
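
Note: the guard added to pcre_compile.c can be shown in isolation. What follows is a minimal sketch and not PCRE code. The toy_ names, the three-entry opcode table, the explicit size argument and the -1 return value are all assumptions made for illustration. It shows only the general idea: a scanner that advances by a per-opcode length must reject any byte outside the opcode table before using it, otherwise it may index past the table or fail to advance.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature opcode set. These names are illustrative only and
   are not PCRE's real opcodes. */
enum { TOY_END, TOY_CHAR, TOY_STAR, TOY_TABLE_LENGTH };

/* Length in bytes of each toy opcode, including its one-byte operand. */
static const unsigned char toy_op_lengths[TOY_TABLE_LENGTH] = { 1, 2, 2 };

/* Walk the compiled code and count the items before TOY_END. Returns -1 if a
   byte in opcode position is not a valid opcode, or if the end of the buffer
   is reached without seeing TOY_END. */
static int toy_scan(const unsigned char *code, size_t size)
{
  size_t i = 0;
  int count = 0;

  assert(code != NULL);

  while (i < size)
    {
    unsigned char c = code[i];

    /* Same shape as the check in the patch: an out-of-table value means the
       code is corrupt, so give up instead of indexing toy_op_lengths. */
    if (c >= TOY_TABLE_LENGTH) return -1;

    if (c == TOY_END) return count;
    count++;
    i += toy_op_lengths[c];
    }

  return -1;
}
```

In this sketch the caller treats -1 as "skip the optimization", which mirrors the patch's plain return from the auto-possessification pass.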