path: root/ext/pcre/php_pcre.c
Commit message [Author, Age, Files, Lines]
* Fixed bug #79257 [Nikita Popov, 2020-02-11, 1 file, -5/+17]
|   Replace an existing entry for a given name only if we have a match.
* PCRE: Only remember valid UTF-8 if start offset zero [Nikita Popov, 2020-02-07, 1 file, -4/+7]
|   PCRE only validates the string starting from the start offset (minus maximum look-behind, but let's ignore that), so we can only remember that the string is fully valid UTF-8 if the original start offset is zero.
* PCRE: Check whether start offset is on char boundary [Nikita Popov, 2020-02-07, 1 file, -1/+17]
|   We need not just the whole string to be UTF-8, but the start position to be on a character boundary as well. Check this by looking for a continuation byte.
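The continuation-byte check described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from php_pcre.c: the function names `is_utf8_continuation` and `offset_on_char_boundary` are hypothetical, and the sketch assumes the subject has already been validated as UTF-8.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
bool is_utf8_continuation(unsigned char byte)
{
    return (byte & 0xC0) == 0x80;
}

/* Hypothetical helper: returns true if `offset` falls on a character
 * boundary of a subject that is already known to be valid UTF-8.
 * The end of the string counts as a boundary; anything past it does not. */
bool offset_on_char_boundary(const char *subject, size_t len, size_t offset)
{
    assert(subject != NULL);
    if (offset >= len) {
        return offset == len;
    }
    /* In valid UTF-8, a position is inside a character exactly when
     * the byte at that position is a continuation byte. */
    return !is_utf8_continuation((unsigned char) subject[offset]);
}
```

Because the check only inspects a single byte, it stays cheap even when a caller matches at many different offsets of the same subject.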
* Merge branch 'PHP-7.3' into PHP-7.4 [Nikita Popov, 2020-02-05, 1 file, -18/+16]
|\
| |   * PHP-7.3: Fixed bug #79188
| * Fixed bug #79188 [Nikita Popov, 2020-02-05, 1 file, -18/+16]
| |
* | Merge branch 'PHP-7.3' into PHP-7.4 [Christoph M. Becker, 2019-11-22, 1 file, -1/+5]
|\ \
| |/
| |   * PHP-7.3: Fix #78853: preg_match() may return integer > 1
| * Fix #78853: preg_match() may return integer > 1 [Christoph M. Becker, 2019-11-22, 1 file, -1/+5]
| |   Commit 54ebebd[1] optimized the match loop, but in this case it was overlooked that we must only loop if we're doing global matching.
| |   [1] <http://git.php.net/?p=php-src.git;a=commit;h=54ebebd686255c5f124af718c966edb392782d4a>
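The rule stated in this fix (keep looping only for global matching) can be illustrated with a small stand-alone sketch. It is not the loop from php_pcre.c: `demo_match` is a hypothetical function, and a plain `strstr()` substring search stands in for the regex engine.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: counts matches of `needle` in `subject`.
 * With global == false the result is at most 1, mirroring the
 * preg_match() contract; with global == true every non-overlapping
 * match is counted, as preg_match_all() would do. */
int demo_match(const char *subject, const char *needle, bool global)
{
    size_t needle_len = strlen(needle);
    int matched = 0;
    const char *pos = subject;

    if (needle_len == 0) {
        return 0; /* empty-match handling is left out of this sketch */
    }
    while ((pos = strstr(pos, needle)) != NULL) {
        matched++;
        if (!global) {
            break; /* the step the fix is about: do not continue the loop */
        }
        pos += needle_len;
    }
    return matched;
}
```

Without the `break`, a non-global call on a subject containing several matches would report a count greater than 1, which is the symptom named in the bug title.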
* | Merge branch 'PHP-7.3' into PHP-7.4 [Nikita Popov, 2019-11-07, 1 file, -2/+4]
|\ \
| |/
| |   * PHP-7.3: Fix php_pcre_mutex_free()
| * Fix php_pcre_mutex_free() [Nikita Popov, 2019-11-07, 1 file, -2/+4]
| |   We should only set the mutex to NULL if we actually freed it. Due to missing braces, non-main threads may currently set it to NULL first.
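A sketch of the braces issue, under stated assumptions: `demo_mutex_t`, `demo_pcre_mutex`, `demo_mutex_alloc` and `demo_mutex_free` are hypothetical stand-ins, and the "is this the main thread" decision is passed in as a plain flag instead of being queried from the thread manager.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in type; not the TSRM mutex used by php_pcre.c. */
typedef struct { int locked; } demo_mutex_t;

demo_mutex_t *demo_pcre_mutex = NULL;

void demo_mutex_alloc(void)
{
    if (demo_pcre_mutex == NULL) {
        demo_pcre_mutex = calloc(1, sizeof(*demo_pcre_mutex));
    }
}

/* Both statements sit inside the braces, so the pointer is reset only
 * when the mutex was actually freed. Without the braces, the `if`
 * would guard just the free() call and the NULL assignment would run
 * for every caller, including non-main threads. */
void demo_mutex_free(bool is_main_thread)
{
    if (demo_pcre_mutex != NULL && is_main_thread) {
        free(demo_pcre_mutex);
        demo_pcre_mutex = NULL;
    }
}
```

With this shape, a call from a non-main thread leaves the pointer untouched, and the later call from the main thread still finds a mutex to free.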
* | Merge branch 'PHP-7.3' into PHP-7.4 [Nikita Popov, 2019-10-08, 1 file, -2/+10]
|\ \
| |/
| * Merge branch 'PHP-7.2' into PHP-7.3 [Nikita Popov, 2019-10-08, 1 file, -2/+10]
| |\
| | * Add pcre_get_compiled_regex_cache_ex() with local_aware flag [Sergei Turchanov, 2019-10-08, 1 file, -2/+10]
| | |   A new function `pcre_get_compiled_regex_cache_ex()` is introduced, which allows compiling a regexp pattern using the "C" locale instead of the current locale. This will be needed to replace setlocale() usage in fileinfo, which is not thread-safe.
* | | Merge branch 'PHP-7.3' into PHP-7.4 [Nikita Popov, 2019-10-04, 1 file, -0/+6]
|\ \ \
| |/ /
| * | Improve diagnostic on PCRE JIT mmap failure [Nikita Popov, 2019-10-04, 1 file, -0/+6]
| | |   Print a more informative message that indicates that this is likely a permission issue, and also indicate that pcre.jit=0 can be used to work around it. Also automatically disable the JIT, so that this message is only shown once. See bug #78630.
* | | Mark PCRE locale key as local persistent [Nikita Popov, 2019-08-13, 1 file, -0/+1]
| | |
* | | Split destructor [Dmitry Stogov, 2019-07-04, 1 file, -2/+11]
| | |
* | | Merge branch 'PHP-7.3' into PHP-7.4 [Nikita Popov, 2019-06-17, 1 file, -1/+1]
|\ \ \
| |/ /
| * | Merge branch 'PHP-7.2' into PHP-7.3 [Nikita Popov, 2019-06-17, 1 file, -1/+1]
| |\ \
| | |/
| | * Accept null for preg_quote delimiter argument [Nikita Popov, 2019-06-17, 1 file, -1/+1]
| | |   Related to bug #78163.
* | | Add specialized pair construction API [Nikita Popov, 2019-06-11, 1 file, -20/+13]
| | |   Closes GH-3990.
* | | Allow exceptions in __toString() [Nikita Popov, 2019-06-05, 1 file, -0/+5]
| | |   RFC: https://wiki.php.net/rfc/tostring_exceptions
| | |   And convert some object to string conversion related recoverable fatal errors into Error exceptions. Improve exception safety of internal code performing string conversions.
* | | Use ZEND_TRY_ASSIGN_REF_... macros for arguments passed to internal function by reference [Dmitry Stogov, 2019-04-24, 1 file, -3/+3]
| | |
* | | Remove checks for locale.h, setlocale, localeconv [Peter Kokot, 2019-04-07, 1 file, -35/+1]
| | |   The `<locale.h>` header file, setlocale, and localeconv are part of the standard C89 [1] and on current systems can be used unconditionally. Since PHP 7.4 requires at least C89, the `HAVE_LOCALE_H`, `HAVE_SETLOCALE`, and `HAVE_LOCALECONV` symbols defined by Autoconf in configure.ac [2] can be omitted and simplified.
| | |   The bundled libmagic (file) has already been patched upstream in version 5.35 and later; until that patch is also applied in php-src, the check for the locale.h header is left in configure.ac and in the Windows headers definition file.
| | |   [1] https://port70.net/~nsz/c/c89/c89-draft.html#4.4
| | |   [2] https://git.savannah.gnu.org/cgit/autoconf.git/tree/lib/autoconf/headers.m4
| | |   Omit the bundled libmagic files
* | | Merge branch 'PHP-7.3' into PHP-7.4 [Christoph M. Becker, 2019-03-31, 1 file, -0/+1]
|\ \ \
| |/ /
| | |   * PHP-7.3: Fix #77827: preg_match does not ignore \r in regex flags
| * | Merge branch 'PHP-7.2' into PHP-7.3 [Christoph M. Becker, 2019-03-31, 1 file, -0/+1]
| |\ \
| | |/
| | |   * PHP-7.2: Fix #77827: preg_match does not ignore \r in regex flags
| | * Fix #77827: preg_match does not ignore \r in regex flags [Christoph M. Becker, 2019-03-31, 1 file, -0/+1]
| | |
| | * Fixed possible incorrect "mark" usage [Dmitry Stogov, 2018-01-09, 1 file, -0/+5]
| | |
| | * year++ [Xinchen Hui, 2018-01-02, 1 file, -1/+1]
| | |
| | * Merge branch 'PHP-7.1' into PHP-7.2 [Anatol Belski, 2017-12-06, 1 file, -1/+3]
| | |\
| | | |   * PHP-7.1: Fix yet one data race in PCRE
| | | * Fix yet one data race in PCRE [Anatol Belski, 2017-12-06, 1 file, -1/+3]
| | | |   PCRE 8.x initializes the pattern compiler on demand during the first pcre_study call. It could be worse, but since the compiled patterns are cached, the locking impact is minimal. PCRE 10.x always compiles the pattern and thread sanitizer doesn't complain about the compiler initialization, thus the newer PCRE version seems to be unaffected.
* | | | Make PCRE cache per-request on CLI [Nikita Popov, 2019-03-26, 1 file, -16/+29]
| | | |   There will only be one request on the CLI SAPI, so there is no advantage to having a persistent PCRE cache. Using a non-persistent cache allows us to use arbitrary strings as cache keys.
* | | | Remove HAVE_PCRE/HAVE_BUNDLED_PCRE checks [Nikita Popov, 2019-03-22, 1 file, -4/+0]
| | | |   PCRE is always available.
* | | | Try to create interned strings in preg_split as well [Nikita Popov, 2019-03-21, 1 file, -16/+12]
| | | |   And convert last_match to last_match_offset, which is more convenient now.
* | | | Cleanup add_offset_pair API [Nikita Popov, 2019-03-21, 1 file, -61/+64]
| | | |   Accept the two offsets directly, rather than doing length calculations at all callsites. Also extract the logic to create a possibly interned string.
| | | |   Switch the split implementation to work on a char* subject internally, because ZSTR_VAL(subject_str) is a mouthful...
* | | | Fix bug #73948 [Nikita Popov, 2019-03-21, 1 file, -6/+32]
| | | |   If PREG_UNMATCHED_AS_NULL is used, make sure that unmatched capturing groups at the end are also set to null, rather than just those in the middle.
* | | | Respect OFFSET_CAPTURE when padding preg_match_all() results [Nikita Popov, 2019-03-19, 1 file, -1/+5]
| | | |   This issue was mentioned in bug #73948. The PREG_PATTERN_ORDER padding was performed without respecting the PREG_OFFSET_CAPTURE flag, which resulted in unmatched subpatterns being either null or [null, -1] depending on where they occur. Now they will always be [null, -1], consistent with other usages.
* | | | Merge branch 'PHP-7.3' into PHP-7.4 [Nikita Popov, 2019-03-19, 1 file, -1/+7]
|\ \ \ \
| |/ / /
| * | | Fixed bug #76127 [Nikita Popov, 2019-03-19, 1 file, -1/+7]
| | | |   Per documentation, and consistent with other preg functions, we should return false if an error occurred.
* | | | Don't create a new array for empty/null match every time [Nikita Popov, 2019-03-19, 1 file, -19/+59]
| | | |   If PREG_OFFSET_CAPTURE is used, unmatched subpatterns will be either [null, -1] or ['', -1] depending on PREG_UNMATCHED_AS_NULL mode. Instead of creating a new array like this every time, cache it inside a global (per-request -- could make it immutable though).
| | | |   Additionally check whether the subpattern is an empty string or single character string and use an existing interned string in that case. Empty / single-char subpatterns are common, so let's avoid allocating strings for them.
* | | | Revert unintended change [Nikita Popov, 2019-03-19, 1 file, -1/+0]
| | | |   I wanted to cache subpat names, but we can't do that because the cache outlives request boundaries.
* | | | Use zend_string for subpat_names table [Nikita Popov, 2019-03-19, 1 file, -24/+35]
| | | |   When used with preg_match_all or preg_replace_callback(_array), subpattern names can be used in the matches array many times. Switch the subpat_names table to use zend_string, so we don't have to allocate a new string every time.
| | | |   Also don't bother creating the table if no $matches were passed. This might be a regression for the case where preg_match() is used with many trailing named subpatterns that are skipped in the result array, but that seems rather contrived.
* | | | Avoid copying subpat twice if named subpats are used [Nikita Popov, 2019-03-19, 1 file, -30/+23]
| | | |
* | | | Fix #77094: Add flags support for pcre_replace_callback(_array) [Nikita Popov, 2019-03-19, 1 file, -161/+99]
| | | |
* | | | Fixed bug #72685 [Nikita Popov, 2019-03-18, 1 file, -2/+7]
| | | |   We currently have a large performance problem when implementing lexers working on UTF-8 strings in PHP. This kind of code tends to perform a large number of matches at different offsets on a single string. This is generally fast. However, if /u mode is used, the full string will be UTF-8 validated on each match. This results in quadratic runtime.
| | | |   This patch fixes the issue by adding a IS_STR_VALID_UTF8 flag, which is set when we have determined that the string is valid UTF8 and further validation is skipped.
| | | |   A limitation of this approach is that we can't set the flag for interned strings. I think this is not a problem for this use-case which will generally work on dynamic data. If we want to use this flag for other purposes as well (mbstring?) then it might be worthwhile to UTF-8 validate strings during interning. But right now this doesn't seem useful.
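The flag-based caching described in this commit can be sketched as below. This is an illustration only and does not reproduce php-src: `demo_string`, `DEMO_STR_VALID_UTF8`, `demo_utf8_valid`, `demo_check_subject` and the `demo_validation_runs` counter are all hypothetical names, and the interned-string limitation mentioned above is not modelled.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_STR_VALID_UTF8 (1u << 0) /* hypothetical flag bit */

typedef struct {
    const unsigned char *val;
    size_t len;
    unsigned flags;
} demo_string;

size_t demo_validation_runs = 0; /* counts full validation passes */

/* Full pass over the bytes, following the well-formed ranges of RFC 3629. */
bool demo_utf8_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i], lo = 0x80, hi = 0xBF;
        size_t need, k;
        if (c < 0x80) { i++; continue; }
        if (c >= 0xC2 && c <= 0xDF) { need = 1; }
        else if (c >= 0xE0 && c <= 0xEF) {
            need = 2;
            if (c == 0xE0) lo = 0xA0; /* reject overlong forms */
            if (c == 0xED) hi = 0x9F; /* reject surrogates */
        } else if (c >= 0xF0 && c <= 0xF4) {
            need = 3;
            if (c == 0xF0) lo = 0x90; /* reject overlong forms */
            if (c == 0xF4) hi = 0x8F; /* reject values above U+10FFFF */
        } else { return false; }
        if (len - i <= need) return false; /* truncated sequence */
        if (s[i + 1] < lo || s[i + 1] > hi) return false;
        for (k = 2; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return false;
        }
        i += need + 1;
    }
    return true;
}

/* Validate at most once per string: later calls see the flag and skip
 * the linear scan, so repeated matching no longer re-validates. */
bool demo_check_subject(demo_string *str)
{
    if (str->flags & DEMO_STR_VALID_UTF8) {
        return true;
    }
    demo_validation_runs++;
    if (!demo_utf8_valid(str->val, str->len)) {
        return false;
    }
    str->flags |= DEMO_STR_VALID_UTF8;
    return true;
}
```

A lexer that calls `demo_check_subject()` before each of its n matches pays for one scan of the subject instead of n scans, which is what removes the quadratic behaviour described in the commit message.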
* | | | Accept zend_string* instead of char* in php_pcre_match_impl() [Nikita Popov, 2019-03-18, 1 file, -2/+5]
| | | |
* | | | Remove local variables [Peter Kokot, 2019-02-03, 1 file, -9/+0]
| | | |   This patch removes the so-called local variables defined on a per-file basis for certain editors to properly show tab width and similar settings. These are mainly used by the Vim and Emacs editors, yet with recent changes the once-working definitions no longer work in Vim without custom plugins or additional configuration. Nor are these settings synced across the PHP code base.
| | | |   A simpler and better approach is to use EditorConfig and to fix code with code style fixing tools in the future instead.
| | | |   This patch also removes the so-called modelines for Vim. Modelines allow the Vim editor specifically to set some editor configuration, such as syntax highlighting, indentation style and tab width, in the first line or the last 5 lines of each file. Since the php test files already have syntax highlighting set properly in most editors and EditorConfig takes care of the indentation settings, this patch removes these as well for Vim 6.0 and newer versions.
| | | |   With the removal of local variables for certain editors such as Emacs and Vim, the footer is probably also no longer needed when creating extensions using the ext_skel.php script.
| | | |   Additionally, Vim modelines for setting php syntax and some editor settings have been removed from some *.phpt files. These are mostly not relevant for phpt files, nor do they work properly in the middle of the file.
* | | | Remove yearly range from copyright notice [Zeev Suraski, 2019-01-30, 1 file, -1/+1]
| | | |
* | | | Implement typed properties [Nikita Popov, 2019-01-11, 1 file, -12/+11]
| | | |   RFC: https://wiki.php.net/rfc/typed_properties_v2
| | | |   This is a squash of PR #3734, which is a squash of PR #3313.
| | | |   Co-authored-by: Bob Weinand <bobwei9@hotmail.com>
| | | |   Co-authored-by: Joe Watkins <krakjoe@php.net>
| | | |   Co-authored-by: Dmitry Stogov <dmitry@zend.com>
* | | | Use ZEND_PARSE_PARAMETERS_NONE in pcre [Nikita Popov, 2019-01-02, 1 file, -2/+1]
| | | |   Instead of the manual ZEND_PARSE_PARAMETERS_START(0, 0) form.
* | | | Remove preg_options param from pcre_get_compiled_regex() [Nikita Popov, 2018-12-26, 1 file, -5/+2]
|/ / /
|   This parameter is always zero and not necessary to call pcre2_match. I'm leaving the parameter behind on the _ex() variant, so the preg_flags are still accessible in some way.