path: root/lib/regexec.c
Commit message (Author, Age, Files, Lines)
* maint: run 'make update-copyright' (Simon Josefsson, 2023-01-01, 1 file, -1/+1)
* regex: fix minor over-allocation (Paul Eggert, 2022-03-11, 1 file, -1/+1)
  * lib/regexec.c (push_fail_stack): Fix off-by-one error that over-allocated the stack.
* regex: fix free_fail_stack undefined behavior (Paul Eggert, 2022-03-11, 1 file, -2/+3)
  * lib/regexec.c (push_fail_stack): Don’t increment number of re_fail_stack_t entries until after successful allocation. This prevents a crash if re_realloc or re_malloc fails here, and a later free_fail_stack examines regs or a later pop_fail_stack examines node.
  Problem discovered by Coverity scan sent 2022-03-11 11:03:52Z.
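The ordering described in this entry can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from lib/regexec.c: `struct fail_entry`, `struct fail_stack` and `push_entry` are hypothetical, simplified stand-ins, and plain `realloc` is used in place of `re_realloc`.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the real regex types.  */
struct fail_entry { int idx; int node; };
struct fail_stack { int num; int alloc; struct fail_entry *stack; };

/* Push an entry; return 0 on success, -1 on allocation failure.
   fs->num is bumped only after the storage is known to be valid, so a
   failed realloc leaves the stack describing only initialized entries.  */
static int
push_entry (struct fail_stack *fs, int idx, int node)
{
  assert (fs != NULL);
  if (fs->num == fs->alloc)
    {
      int new_alloc = fs->alloc ? 2 * fs->alloc : 2;
      struct fail_entry *p = realloc (fs->stack, new_alloc * sizeof *p);
      if (p == NULL)
        return -1;      /* fs->num and fs->stack are unchanged here.  */
      fs->stack = p;
      fs->alloc = new_alloc;
    }
  fs->stack[fs->num].idx = idx;
  fs->stack[fs->num].node = node;
  fs->num++;            /* Count the entry only once it is fully set up.  */
  return 0;
}
```

With this ordering, cleanup code that walks entries 0 .. num-1 after a failed push never sees an uninitialized slot.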
* maint: run 'make update-copyright' (Paul Eggert, 2022-01-01, 1 file, -1/+1)
* regex: pacify Coverity clean_state_log_if_needed (Paul Eggert, 2021-12-07, 1 file, -0/+1)
  Problem reported by Robbie Harwood in: https://lists.gnu.org/r/bug-gnulib/2021-12/msg00005.html
  * lib/regexec.c (clean_state_log_if_needed): Add a DEBUG_ASSERT; this both pacifies Coverity and will help to debug in case some other change mistakenly causes the assertion to become false.
* regex: merge from glibc (Paul Eggert, 2021-11-24, 1 file, -2/+1)
  The main change here, imported from Glibc, is for the regex code to stop using nested functions when _LIBC is defined. The intent is for the result to be copied back to Glibc so that the two implementations can resync.
  * lib/regcomp.c (re_set_fastmap, seek_collating_symbol_entry) (lookup_collation_sequence_value, build_range_exp) (build_collating_symbol):
  * lib/regexec.c (acquire_init_state_context): Declare with __always_inline instead of with ‘inline __attribute__ ((always_inline))’.
  * lib/regexec.c (init_word_char): Move uint64_t comment to regex_internal.h.
  (parse_byte): Change multibyte-detecting arg from re_charset_t * to re_dfa_t const *. All callers changed.
  (build_range_exp, build_collating_symbol) [!_LIBC]: Change signature to match _LIBC well enough so that the caller can be simplified to assume _LIBC.
  (parse_bracket_exp): Pull its nested functions seek_collating_symbol_entry, lookup_collation_sequence_value, build_range_exp, build_collating_symbol out to the top level, adding args to pass the information instead of having them access nonlocal vars. Use types in local vars that do not assume glibc.
  * lib/regex_internal.h: Explain uint64_t etc. here.
* regex: assume RE_ENABLE_I18N (Paul Eggert, 2021-11-24, 1 file, -58/+15)
  These days there is no longer any need to port to platforms lacking iswctype etc., since Gnulib now has substitutes.
  * config/srclist.txt: Comment out regex_internal.c and regex_internal.h for now, since they no longer match glibc. The intent is to merge them again soon.
  * lib/regex_internal.h (RE_ENABLE_I18N): Remove. All uses changed to assume that RE_ENABLE_I18N is 1.
  * modules/regex (Depends-on): Add iswctype.
* regex: redo style of previous change (Paul Eggert, 2021-10-19, 1 file, -3/+3)
* regex: improve comment (Paul Eggert, 2021-10-19, 1 file, -2/+1)
* regex: fix buffer read overrun (Paul Eggert, 2021-10-18, 1 file, -1/+1)
  * config/srclist.txt: Remove posix/regexec.c for now.
  * lib/regexec.c (re_search_internal): Fix buffer read overrun reported by Benno Schulenberg in: https://lists.gnu.org/r/bug-gnulib/2021-10/msg00035.html
* regex: revert much of previous change (Paul Eggert, 2021-08-26, 1 file, -44/+34)
  Use a more-conservative change that syncs closer with glibc, and then merely marks regexec and __compat_regexec.
* regex: use C99-style array arg syntax (Paul Eggert, 2021-08-26, 1 file, -34/+44)
  This should help with some static checking. Derived from a suggestion by Martin Sebor in: https://sourceware.org/pipermail/libc-alpha/2021-August/130336.html
  This also ports recent and relevant Glibc changes to Gnulib and prepares to copy back.
  * lib/cdefs.h (__ARG_NELTS): New macro.
  * lib/regex.c: Ignore -Wvla for the whole file.
  * lib/regex.h (_ARG_NELTS_, _Attr_access_): New macros. Ignore -Wvla when declaring regexec.
  * lib/regex.h (re_compile_pattern, re_search, re_search_2) (re_match, re_match_2, regcomp, regerror): Use _Attr_access_ where that could help static checking.
  * lib/regexec.c (regexec, __compat_regexec, re_copy_regs) (re_search_internal, proceed_next_node, push_fail_stack) (pop_fail_stack, set_regs, update_regs): Use __ARG_NELTS for each array parameter whose size is another arg, but which might be null.
* regex: fix undefined behavior (Egor Ignatov, 2021-06-22, 1 file, -3/+7)
  Problem reported by Paul Eggert in: https://lists.gnu.org/r/bug-gnulib/2021-06/msg00115.html
  * lib/regexec.c (proceed_next_node): Don’t insert already-inserted node.
  2021-06-06 Egor Ignatov <egori@altlinux.org> (tiny change)
* regex: fix match with possessive quantifier (Egor Ignatov, 2021-06-06, 1 file, -1/+1)
  Fix behaviour introduced in 70b673eb7, where regexps with possessive quantifier("*+") didn't match.
  * lib/regexec.c (set_regs): Pop if CUR_NODE has already been checked only when we have a fail stack.
  Fixes: 70b673eb7 ("regex: fix longstanding backref match bug")
  Signed-off-by: Egor Ignatov <egori@altlinux.org>
* regex: fix comment location (Paul Eggert, 2021-02-05, 1 file, -1/+1)
  * lib/regexec.c (update_regs): Move comment.
* regex: fix longstanding backref match bug (Paul Eggert, 2021-02-05, 1 file, -9/+17)
  This fixes a longstanding glibc bug concerning backreferences <https://sourceware.org/11053> (2009-12-04).
  * lib/regexec.c (proceed_next_node, push_fail_stack) (pop_fail_stack): Push and pop the previous registers as well as the current ones. All callers changed.
  (set_regs): Also pop if CUR_NODE has already been checked, so that it does not get added as a duplicate set entry.
  (update_regs): Fix comment location.
  * tests/test-regex.c (tests): New constant.
  (bug_regex11): New test function.
  (main): Bump alarm value. Call new test function.
* regex: minor refactoring (Paul Eggert, 2021-02-05, 1 file, -8/+6)
  * lib/regexec.c (proceed_next_node): Use more-local decls.
* regex: avoid undefined behavior (Paul Eggert, 2021-02-05, 1 file, -16/+14)
  * lib/regexec.c (pop_fail_stack): If the stack is empty, return -1 instead of indulging in undefined behavior. This simplifies callers, and avoids undefined behavior in some cases (see glibc bug 11053, though this change does not fix that overall bug).
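A minimal sketch of the empty-stack convention described in this entry, assuming that valid node numbers are nonnegative so that -1 is free to act as the "nothing to pop" result. The names `struct fail_entry`, `struct fail_stack` and `pop_entry` are hypothetical and much simpler than the real pop_fail_stack, which also restores registers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the real regex types.  */
struct fail_entry { int idx; int node; };
struct fail_stack { int num; struct fail_entry *stack; };

/* Pop the top entry.  Store its index in *PIDX and return its node,
   or return -1 without touching *PIDX if the stack is empty.  */
static int
pop_entry (struct fail_stack *fs, int *pidx)
{
  assert (fs != NULL && pidx != NULL);
  if (fs->num == 0)
    return -1;          /* Empty: report it instead of reading stack[-1].  */
  fs->num--;
  *pidx = fs->stack[fs->num].idx;
  return fs->stack[fs->num].node;
}
```

Callers can then test the result for a negative value rather than checking the stack depth before every pop.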
* regex: improve comments (Paul Eggert, 2021-02-05, 1 file, -8/+13)
  * lib/regexec.c: Add and correct comments about return values.
* regexec: remove alloca usage in build_trtable (Paul Eggert, 2021-01-08, 1 file, -62/+13)
  Prompted by this different change proposed by Adhemerval Zanella: https://sourceware.org/pipermail/libc-alpha/2021-January/121373.html
  * lib/regexec.c (build_trtable): Prevent inlining, so that it doesn’t bloat the caller’s stack. Use auto variables instead of alloca/malloc. After these changes, build_trtable’s total stack allocation is only 20 KiB on a 64-bit machine, and this is less than glibc’s 64 KiB cutoff so there’s little point to using alloca to shrink it. Although Gnulib traditionally has used a 4 KiB cutoff, going to 20 KiB here should not be a significant problem in practice; Gnulib-using packages concerned about overflow of tiny stacks can compile with something like gcc -fstack-clash-protection.
  * config/srclist.txt: Comment out regexec.c for now.
* regex: remove alloca usage on regex set_regs (Paul Eggert, 2021-01-08, 1 file, -22/+18)
  Derived from this patch by Adhemerval Zanella: https://sourceware.org/pipermail/libc-alpha/2021-January/121372.html
  * lib/regex_internal.h: Include dynarray.h, for Gnulib.
  * lib/regexec.c (DYNARRAY_STRUCT, DYNARRAY_ELEMENT) (DYNARRAY_PREFIX): New macros. Include malloc/dynarray-skeleton.c.
  (set_regs): Use dynarray rather than alloca.
  * modules/regex (Depends-on): Add dynarray.
* autoupdate (Karl Berry, 2021-01-03, 1 file, -1/+1)
* autoupdate (Paul Eggert, 2019-12-31, 1 file, -1/+1)
* autoupdate (Paul Eggert, 2019-11-11, 1 file, -2/+5)
* Simplify and regularize regex use of ‘assert’ (Paul Eggert, 2019-10-11, 1 file, -50/+23)
  Also, tell GCC about the asserts even when compiling without debugging, to give it further optimization opportunities.
  * lib/regex_internal.h (DEBUG_ASSERT): New macro.
  * lib/regcomp.c (link_nfa_nodes, calc_eclosure) (parse_expression, parse_bracket_exp):
  * lib/regex_internal.c (build_wcs_buffer) (build_wcs_upper_buffer, re_string_reconstruct) (re_string_context_at):
  * lib/regexec.c (re_search_stub, re_copy_regs) (re_search_internal, prune_impossible_nodes, check_matching) (check_halt_state_context, set_regs, sift_states_backward) (build_sifted_states, transit_state_mb, transit_state_bkref) (check_arrival_add_next_nodes, check_arrival_expand_ecl) (match_ctx_add_subtop): Use it instead of plain ‘assert’.
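The idea of an assertion that checks at run time when debugging and otherwise only informs the optimizer can be sketched like this. The macro below is an illustration only: `SKETCH_ASSERT` and `last_index` are hypothetical names, the definition is not copied from regex_internal.h, and `__builtin_unreachable` is a GCC/Clang builtin chosen here as one plausible way to express the hint.

```c
#include <assert.h>

/* SKETCH_ASSERT is a hypothetical name, not the macro from
   regex_internal.h.  With DEBUG defined it checks at run time;
   otherwise it only tells GCC/Clang that the condition holds.  */
#ifdef DEBUG
# define SKETCH_ASSERT(x) assert (x)
#elif defined __GNUC__
# define SKETCH_ASSERT(x) ((x) ? (void) 0 : __builtin_unreachable ())
#else
# define SKETCH_ASSERT(x) ((void) 0)
#endif

/* Return the index of the last element of an array of LEN elements.
   The caller must pass LEN > 0.  */
static int
last_index (int len)
{
  SKETCH_ASSERT (len > 0);
  return len - 1;
}
```

The trade-off is that in a non-debug build a false condition is undefined behavior rather than a diagnosed failure, so such a macro suits only invariants the code really guarantees.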
* regex: omit debug assignment when not debugging (Paul Eggert, 2019-10-09, 1 file, -0/+2)
  * lib/regexec.c (re_search_internal) [!DEBUG]: Remove unnecessary assignment to pacify Coverity.
* regex: tell compiler there’s at most 256 arcs out (Paul Eggert, 2019-10-09, 1 file, -0/+1)
  Partly this is to help the reader (and maybe help GCC); partly this is to pacify Coverity.
  * lib/regex_internal.h: Include verify.h.
  * lib/regexec.c (group_nodes_into_DFAstates): Tell the compiler that ndests cannot exceed SBC_MAX.
  * modules/regex (Depends-on): Add ‘verify’.
* regex: simplify by assuming C99 (Paul Eggert, 2019-10-09, 1 file, -11/+0)
  * config/srclist.txt: Comment out regex_internal.h and regexec.c temporarily.
  * lib/regex_internal.h (lock_define, re_match_context_t): Simplify by assuming C99 macros and const.
  * lib/regexec.c (re_search_internal): Simplify by assuming C99 initializers. Remove unnecessary assignment, as mctx is now safely initialized earlier.
* autoupdate (Paul Eggert, 2019-03-17, 1 file, -7/+7)
* autoupdate (Paul Eggert, 2019-01-31, 1 file, -2/+4)
* Fix typos found by codespell. (Tim Rühsen, 2019-01-12, 1 file, -1/+1)
  * lib/*.[hc]: Fix typos in comments.
  * pygnulib/*.py: Fix typos in error messages and comments.
* autoupdate (Paul Eggert, 2018-12-31, 1 file, -1/+1)
* autoupdate (Paul Eggert, 2018-12-27, 1 file, -3/+3)
* autoupdate (Paul Eggert, 2018-12-16, 1 file, -0/+3)
* autoupdate (Paul Eggert, 2018-10-14, 1 file, -178/+185)
* autoupdate (Paul Eggert, 2018-07-13, 1 file, -23/+20)
* maint: Run 'make update-copyright' (Paul Eggert, 2018-01-01, 1 file, -1/+1)
* regex: use re_malloc etc. consistently (Paul Eggert, 2017-12-19, 1 file, -10/+9)
  Problem and original patch reported by Arnold Robbins in: https://sourceware.org/ml/libc-alpha/2017-12/msg00241.html
  * lib/regcomp.c (re_comp):
  * lib/regexec.c (push_fail_stack, build_trtable, match_ctx_clean): Use re_malloc/re_realloc/re_free instead of malloc/realloc/free.
* regex: merge from glibc (Paul Eggert, 2017-11-20, 1 file, -137/+72)
  * lib/regcomp.c (__regcomp, __regfree) [_LIBC]: Now hidden.
  * lib/regex_internal.h (internal_function): Remove. All uses removed.
* all: prefer https: URLs (Paul Eggert, 2017-09-13, 1 file, -1/+1)
* regex: work with GCC7's -Werror=implicit-fallthrough= (Paul Eggert, 2017-07-26, 1 file, -1/+1)
  * lib/regex_internal.h (FALLTHROUGH): New macro.
  * lib/regcomp.c (peek_token_bracket, parse_expression):
  * lib/regexec.c (check_node_accept): Use it.
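How a FALLTHROUGH-style macro silences -Wimplicit-fallthrough can be sketched as below. The macro body is an assumption based on the GCC 7 `fallthrough` statement attribute, not a quotation of regex_internal.h, and `steps_from` is a hypothetical function made up for the example.

```c
#include <assert.h>

/* Assumed definition for illustration: mark an intentional fall-through
   on GCC 7 or later, and expand to a harmless no-op elsewhere.  */
#if defined __GNUC__ && __GNUC__ >= 7
# define FALLTHROUGH __attribute__ ((__fallthrough__))
#else
# define FALLTHROUGH ((void) 0)
#endif

/* Hypothetical example: count how many steps remain from STAGE.  */
static int
steps_from (int stage)
{
  int steps = 0;
  switch (stage)
    {
    case 0:
      steps++;
      FALLTHROUGH;      /* Deliberately continue into case 1.  */
    case 1:
      steps++;
      FALLTHROUGH;      /* Deliberately continue into case 2.  */
    case 2:
      steps++;
      break;
    default:
      break;
    }
  return steps;
}
```

Writing the marker as a statement followed by a semicolon keeps it valid in both expansions.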
* version-etc: new year (Paul Eggert, 2017-01-01, 1 file, -1/+1)
  * build-aux/gendocs.sh (version):
  * doc/gendocs_template:
  * doc/gendocs_template_min:
  * doc/gnulib.texi:
  * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright dates by hand in templates and the like.
  * all files: Run 'make update-copyright'.
* regex: fix integer-overflow bug in never-used code (Paul Eggert, 2016-12-16, 1 file, -2/+4)
  Problem reported by Clément Pit–Claudel in: http://lists.gnu.org/archive/html/emacs-devel/2016-12/msg00654.html
  * lib/regex_internal.h: Include intprops.h.
  * lib/regexec.c (re_search_2_stub): Use it to avoid undefined behavior on integer overflow.
  * modules/regex (Depends-on): Add intprops.
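The kind of check this entry describes, adding two lengths without ever evaluating an overflowing signed sum, can be sketched as follows. This is an assumption-laden illustration: `checked_total_length` is a hypothetical helper, and the GCC/Clang builtin `__builtin_add_overflow` stands in for the intprops.h facility that the commit actually uses.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Store LENGTH1 + LENGTH2 in *LEN and return true, or return false if
   either length is negative or the sum does not fit in an int.  */
static bool
checked_total_length (int length1, int length2, int *len)
{
  assert (len != NULL);
  if (length1 < 0 || length2 < 0)
    return false;
  /* __builtin_add_overflow (GCC/Clang) is a stand-in for the intprops.h
     macros; it returns true when the exact sum does not fit in *LEN.  */
  return !__builtin_add_overflow (length1, length2, len);
}
```

The point of the pattern is that the overflow test happens as part of the addition itself, so no signed overflow is ever computed and then inspected.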
* regex: make it closer to libc (Paul Eggert, 2016-02-19, 1 file, -47/+47)
  Make Idx a signed type, rather than possibly unsigned. The unsignedness was not really buying us anything, since the code overflows for other reasons before getting to PTRDIFF_MAX. Making it signed allows us to use -1 and -2 with abandon, like libc does, thus lessening the number of differences between gnulib and libc. Also, it should help avoid gratuitous warnings like the one reported by Nelson H. F. Beebe in: http://bugs.gnu.org/22702
  * lib/regex.h (__re_idx_t): Remove. All uses changed to regoff_t.
  * lib/regex_internal.h (SSIZE_MAX): Define if <limits.h> doesn't.
  (IDX_MAX) [_REGEX_LARGE_OFFSETS]: Now SSIZE_MAX.
  (REG_MISSING, REG_ERROR, REG_VALID_INDEX, REG_VALID_NONZERO_INDEX): Remove. Revert all uses to their libc versions.
* regex: merge patches from libc (Paul Eggert, 2016-02-19, 1 file, -48/+27)
  2015-10-21 Joseph Myers <joseph@codesourcery.com>
  2015-10-20 Joseph Myers <joseph@codesourcery.com>
    Convert miscellaneous function definitions to prototype style.
    * lib/regcomp.c (re_compile_pattern, re_set_syntax) (re_compile_fastmap, regcomp, regerror, regfree, re_comp):
    * lib/regexec.c (regexec, re_match, re_search, re_match_2, re_search_2) (re_search_2_stub, re_search_stub, re_set_registers, re_exec) (re_search_internal): Convert to prototype-style function definition. Use internal_function for internal functions.
* version-etc: new year (Paul Eggert, 2016-01-01, 1 file, -1/+1)
  * build-aux/gendocs.sh (version):
  * doc/gendocs_template:
  * doc/gendocs_template_min:
  * doc/gnulib.texi:
  * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright dates by hand in templates and the like.
  * all files: Run 'make update-copyright'.
* Revert previous patch, as it did not fix the bug after all. (Paul Eggert, 2015-09-19, 1 file, -1/+1)
* regex: fix dangling-backreference bug (Paul Eggert, 2015-09-19, 1 file, -1/+1)
  This should fix the following bugs:
  http://bugs.gnu.org/21513 assertion error in pop_fail_stack
  http://bugs.gnu.org/19775 Test failing after the CVE fix
  https://sourceware.org/bugzilla/show_bug.cgi?id=11053 Wrong results with backreferences
  https://sourceware.org/bugzilla/show_bug.cgi?id=17356 regex assertion violation with triple backreferences
  * lib/regexec.c (set_regs): Don't pop an empty failure stack.
* regex: merge patches from libc (Paul Eggert, 2015-09-19, 1 file, -3/+5)
  2015-09-08 Joseph Myers <joseph@codesourcery.com>
    Move bits/libc-lock.h and bits/libc-lockP.h out of bits/ (bug 14912).
    * lib/regex_internal.h: Include <libc-lock.h> instead of <bits/libc-lock.h>.
  2015-06-09 Joseph Myers <joseph@codesourcery.com>
    Fix regcomp wcscoll, wcscmp namespace (bug 18497).
    * lib/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll.
    * lib/regexec.c (check_node_accept_bytes): Likewise.
  2015-06-05 Joseph Myers <joseph@codesourcery.com>
    Fix regex wcrtomb namespace (bug 18496).
    * lib/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb instead of wcrtomb.
  2015-06-05 Joseph Myers <joseph@codesourcery.com>
    Fix regex wctype namespace (bug 18495).
    * lib/regcomp.c (re_compile_fastmap_iter): Call __towlower instead of towlower.
    * lib/regex_internal.c (build_wcs_upper_buffer): Call __iswlower instead of iswlower. Call __towupper instead of towupper.
    * lib/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum instead of iswalnum.
  2015-01-07 Chris Metcalf <cmetcalf@ezchip.com>
    * lib/regcomp.c (parse_bracket_exp): Initialize type to COLL_SYM in a couple of places to avoid uninitialized variable warnings on tilegx gcc 4.8.2.
  2014-11-24 Siddhesh Poyarekar <siddhesh@redhat.com>
    * lib/regex_internal.h: Remove NOT_IN_libc.
  2014-11-17 Andreas Schwab <schwab@suse.de>
    * lib/regex_internal.h: Don't include <locale/elem-hash.h>.
  2014-09-11 Roland McGrath <roland@hack.frob.com>
    Move findidx nested functions to top-level.
    * lib/regcomp.c [_LIBC]: #include <locale/weight.h>.
    (build_equiv_class) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx.
    * lib/regexec.c [RE_ENABLE_I18N] [_LIBC]: #include <locale/weight.h>.
    [RE_ENABLE_I18N] (check_node_accept_bytes) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx.
    * lib/regex_internal.h: [!NOT_IN_libc] [_LIBC]: #include <locale/weight.h>.
    (re_string_elem_size_at): Don't #include it inside the function. Pass new arguments to findidx.
  2014-08-01 Siddhesh Poyarekar <siddhesh@redhat.com>
    Check if DEBUG is defined in regex_internal.c
    * lib/regex_internal.c: Check if DEBUG is defined and is set.
* version-etc: new year (Paul Eggert, 2014-12-31, 1 file, -1/+1)
  * doc/gnulib.texi:
  * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright date.
  * all files: Run 'make update-copyright'.