path: root/lib/regcomp.c
Commit message | Author | Age | Files | Lines
* maint: run 'make update-copyright' | Simon Josefsson | 2023-01-01 | 1 | -1/+1
* regex: match [...---...] like V7 grep | Paul Eggert | 2022-04-21 | 1 | -3/+13
  Problem reported by Arnold Robbins in:
  https://bugs.gnu.org/20657
  https://lists.gnu.org/r/bug-gnulib/2022-04/msg00053.html
  * lib/regcomp.c (peek_token_bracket): Let [...---...] match '-'.
  This is an extension to POSIX, and matches V7 Unix grep.
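  To see how a given regex implementation treats a bracket expression such as [a---z], a small probe along the following lines may help. This is a minimal sketch using only the POSIX <regex.h> interface; the helper name bre_matches is hypothetical, and the result for [a---z] depends on which regex implementation is linked in.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.  Returns 1 if TEXT
   matches the basic regular expression PATTERN, 0 if it does not,
   and -1 if PATTERN fails to compile under the regex implementation
   linked into the program.  */
int
bre_matches (const char *pattern, const char *text)
{
  assert (pattern != NULL && text != NULL);

  regex_t re;
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;

  int status = regexec (&re, text, 0, NULL, 0);
  regfree (&re);
  return status == 0 ? 1 : 0;
}
```

  With the change described above, bre_matches ("[a---z]", "-") would be expected to report a match; an implementation without the extension may instead reject the pattern or report no match.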
* maint: run 'make update-copyright' | Paul Eggert | 2022-01-01 | 1 | -1/+1
* regex: merge from glibc | Paul Eggert | 2021-11-24 | 1 | -261/+265
  The main change here, imported from Glibc, is for the regex code to stop using nested functions when _LIBC is defined. The intent is for the result to be copied back to Glibc so that the two implementations can resync.
  * lib/regcomp.c (re_set_fastmap, seek_collating_symbol_entry) (lookup_collation_sequence_value, build_range_exp) (build_collating_symbol):
  * lib/regexec.c (acquire_init_state_context): Declare with __always_inline instead of with ‘inline __attribute__ ((always_inline))’.
  * lib/regexec.c (init_word_char): Move uint64_t comment to regex_internal.h.
  (parse_byte): Change multibyte-detecting arg from re_charset_t * to re_dfa_t const *. All callers changed.
  (build_range_exp, build_collating_symbol) [!_LIBC]: Change signature to match _LIBC well enough so that the caller can be simplified to assume _LIBC.
  (parse_bracket_exp): Pull its nested functions seek_collating_symbol_entry, lookup_collation_sequence_value, build_range_exp, build_collating_symbol out to the top level, adding args to pass the information instead of having them access nonlocal vars. Use types in local vars that do not assume glibc.
  * lib/regex_internal.h: Explain uint64_t etc. here.
* regex: assume RE_ENABLE_I18N | Paul Eggert | 2021-11-24 | 1 | -220/+77
  These days there is no longer any need to port to platforms lacking iswctype etc., since Gnulib now has substitutes.
  * config/srclist.txt: Comment out regex_internal.c and regex_internal.h for now, since they no longer match glibc. The intent is to merge them again soon.
  * lib/regex_internal.h (RE_ENABLE_I18N): Remove. All uses changed to assume that RE_ENABLE_I18N is 1.
  * modules/regex (Depends-on): Add iswctype.
* regex: break regcomp.c link with glibc | Paul Eggert | 2021-11-07 | 1 | -241/+223
  Problem reported by Bruno Haible in:
  https://lists.gnu.org/r/bug-gnulib/2021-11/msg00005.html
  * config/srclist.txt: Comment out regcomp.c for now.
  * lib/regcomp.c: Revert previous change.
* autoupdate | Karl Berry | 2021-11-03 | 1 | -223/+241
* regex: avoid duplicate in epsilon closure | Paul Eggert | 2021-02-05 | 1 | -5/+3
  * lib/regcomp.c (calc_eclosure_iter): Insert NODE into epsilon closure first rather than last. Otherwise, the epsilon closure might contain a duplicate of NODE.
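  The ordering issue can be illustrated with a toy closure computation. The sketch below is not the code in calc_eclosure_iter; the types toy_nfa and node_set and the function toy_eclosure are hypothetical. It only shows why adding the node before following its epsilon edges keeps a cycle from adding the node a second time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_NODES = 8 };

/* Toy NFA: eps[i][j] is true when node i has an epsilon edge to j.  */
struct toy_nfa
{
  size_t nnodes;
  bool eps[MAX_NODES][MAX_NODES];
};

/* A node set stored as a plain array, so a duplicate would be
   visible as the same index appearing twice.  */
struct node_set
{
  size_t nelem;
  size_t elems[MAX_NODES];
};

static bool
set_contains (const struct node_set *set, size_t node)
{
  for (size_t i = 0; i < set->nelem; i++)
    if (set->elems[i] == node)
      return true;
  return false;
}

/* Add the epsilon closure of NODE to SET.  NODE itself is inserted
   first, so a cycle leading back to NODE finds it already present
   and stops instead of inserting it again.  */
void
toy_eclosure (const struct toy_nfa *nfa, size_t node, struct node_set *set)
{
  assert (node < nfa->nnodes && nfa->nnodes <= MAX_NODES);

  if (set_contains (set, node))
    return;
  set->elems[set->nelem++] = node;

  for (size_t next = 0; next < nfa->nnodes; next++)
    if (nfa->eps[node][next])
      toy_eclosure (nfa, next, set);
}
```

  For a three-node NFA with epsilon edges 0 -> 1, 1 -> 0 and 1 -> 2, calling toy_eclosure on node 0 with an empty set visits each node once despite the cycle.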
* maint: run 'make update-copyright' | Paul Eggert | 2020-12-31 | 1 | -1/+1
* Revert autoupdate's revert. | Bruno Haible | 2020-08-17 | 1 | -1/+1
  * config/srclist.txt: Mark regcomp.c as needing sync with glibc.
* autoupdate | Karl Berry | 2020-08-17 | 1 | -1/+1
* regex: Use initializer shorthand syntax also with clang. | Bruno Haible | 2020-08-16 | 1 | -1/+1
  * lib/regcomp.c (utf8_sb_map): Use the initializer shorthand syntax also with clang.
* autoupdate | Paul Eggert | 2019-12-31 | 1 | -1/+1
* Simplify and regularize regex use of ‘assert’ | Paul Eggert | 2019-10-11 | 1 | -14/+8
  Also, tell GCC about the asserts even when compiling without debugging, to give it further optimization opportunities.
  * lib/regex_internal.h (DEBUG_ASSERT): New macro.
  * lib/regcomp.c (link_nfa_nodes, calc_eclosure) (parse_expression, parse_bracket_exp):
  * lib/regex_internal.c (build_wcs_buffer) (build_wcs_upper_buffer, re_string_reconstruct) (re_string_context_at):
  * lib/regexec.c (re_search_stub, re_copy_regs) (re_search_internal, prune_impossible_nodes, check_matching) (check_halt_state_context, set_regs, sift_states_backward) (build_sifted_states, transit_state_mb, transit_state_bkref) (check_arrival_add_next_nodes, check_arrival_expand_ecl) (match_ctx_add_subtop): Use it instead of plain ‘assert’.
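  A macro of this kind might look roughly like the sketch below. This is an assumption about the general shape, not the definition in lib/regex_internal.h: with DEBUG set it checks at run time, and otherwise it only tells a GNU-compatible compiler that the condition holds. The caller last_index is hypothetical.

```c
#include <assert.h>

/* Sketch only; the real macro in lib/regex_internal.h may differ.  */
#if defined DEBUG && DEBUG != 0
/* Debug builds: check the condition at run time.  */
# define DEBUG_ASSERT(x) assert (x)
#elif defined __GNUC__
/* Non-debug builds: no runtime check, but the optimizer may assume
   that X is true, because the other branch is unreachable.  */
# define DEBUG_ASSERT(x) ((x) ? (void) 0 : __builtin_unreachable ())
#else
# define DEBUG_ASSERT(x) ((void) 0)
#endif

/* Hypothetical caller: LEN is known to be positive here.  */
int
last_index (int len)
{
  DEBUG_ASSERT (len > 0);
  return len - 1;
}
```

  Because the non-debug form makes a false condition undefined behavior, a macro like this is suitable only for invariants that really cannot fail.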
* regex: avoid copying of uninitialized storage | Paul Eggert | 2019-10-09 | 1 | -11/+2
  * config/srclist.txt: Comment out regcomp.c temporarily.
  * lib/regcomp.c (build_charclass_op, create_tree) [! (GCC_LINT||lint)]: Initialize even when not checking for lint, as the behavior is arguably undefined otherwise and Coverity warns about it.
* autoupdate | Paul Eggert | 2019-03-17 | 1 | -10/+10
* autoupdate | Paul Eggert | 2018-12-31 | 1 | -1/+1
* autoupdate | Paul Eggert | 2018-12-27 | 1 | -14/+2
* autoupdate | Paul Eggert | 2018-10-14 | 1 | -147/+158
* autoupdate | Paul Eggert | 2018-08-10 | 1 | -5/+4
* autoupdate | Paul Eggert | 2018-08-01 | 1 | -13/+5
* regex: glibc does not use intprops.h | Paul Eggert | 2018-06-29 | 1 | -4/+0
  Maybe we can talk glibc into using intprops.h someday, but now doesn’t seem to be a good time.
  * lib/regcomp.c (TYPE_SIGNED): Remove; regex_internal.h now defines.
  * lib/regex_internal.h [_LIBC]: Do not include intprops.h.
  (TYPE_SIGNED, INT_ADD_WRAPV): New macros.
* maint: Run 'make update-copyright' | Paul Eggert | 2018-01-01 | 1 | -1/+1
* regex: use re_malloc etc. consistently | Paul Eggert | 2017-12-19 | 1 | -2/+2
  Problem and original patch reported by Arnold Robbins in:
  https://sourceware.org/ml/libc-alpha/2017-12/msg00241.html
  * lib/regcomp.c (re_comp):
  * lib/regexec.c (push_fail_stack, build_trtable, match_ctx_clean): Use re_malloc/re_realloc/re_free instead of malloc/realloc/free.
* regex: merge from glibc | Paul Eggert | 2017-11-22 | 1 | -0/+3
  * lib/regcomp.c (init_word_char): Add comments.
* regex: merge from glibc | Paul Eggert | 2017-11-20 | 1 | -8/+3
  * lib/regcomp.c (__regcomp, __regfree) [_LIBC]: Now hidden.
  * lib/regex_internal.h (internal_function): Remove. All uses removed.
* all: prefer https: URLs | Paul Eggert | 2017-09-13 | 1 | -1/+1
* regex: work with GCC7's -Werror=implicit-fallthrough= | Paul Eggert | 2017-07-26 | 1 | -4/+17
  * lib/regex_internal.h (FALLTHROUGH): New macro.
  * lib/regcomp.c (peek_token_bracket, parse_expression):
  * lib/regexec.c (check_node_accept): Use it.
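  A fallthrough marker of this kind is commonly written along the following lines. This is a sketch under the assumption that GCC 7 or later provides the fallthrough statement attribute; the exact conditions in lib/regex_internal.h may differ, and count_from is a hypothetical function used only to show the macro in a switch.

```c
#include <assert.h>

/* Sketch only; the real definition may test other compilers too.  */
#ifndef FALLTHROUGH
# if defined __GNUC__ && __GNUC__ >= 7
#  define FALLTHROUGH __attribute__ ((__fallthrough__))
# else
#  define FALLTHROUGH ((void) 0)
# endif
#endif

/* Hypothetical example: count how many cases run when entering the
   switch at START.  Each deliberate fall-through is marked, so
   -Wimplicit-fallthrough has nothing to warn about.  */
int
count_from (int start)
{
  assert (start >= 0);

  int steps = 0;
  switch (start)
    {
    case 0:
      steps++;
      FALLTHROUGH;
    case 1:
      steps++;
      FALLTHROUGH;
    case 2:
      steps++;
      break;
    default:
      break;
    }
  return steps;
}
```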
* version-etc: new year | Paul Eggert | 2017-01-01 | 1 | -1/+1
  * build-aux/gendocs.sh (version):
  * doc/gendocs_template:
  * doc/gendocs_template_min:
  * doc/gnulib.texi:
  * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright dates by hand in templates and the like.
  * all files: Run 'make update-copyright'.
* regex: port to Sun C | Paul Eggert | 2016-06-08 | 1 | -2/+2
  Reported by Daiki Ueno.
  * lib/regcomp.c (regcomp, regerror): Use _Restrict_, not __restrict, in prototype. This fixes a problem I introduced in the 2016-02-19 merge from glibc.
* Use GCC_LINT, not lint | Paul Eggert | 2016-05-30 | 1 | -2/+2
  FreeBSD and Cygwin #define _Noreturn to empty if 'lint' is defined.
  Problem reported by Ken Brown in: http://bugs.gnu.org/23640
  * doc/posix-headers/stdnoreturn.texi (stdnoreturn.h): Document problem with lint and _Noreturn.
  * lib/diffseq.h (IF_LINT, IF_LINT2):
  * lib/fts.c (sccsid):
  * lib/getndelim2.c (IF_LINT):
  * lib/gl_anylinked_list2.h (gl_linked_iterator) (gl_linked_iterator_from_to):
  * lib/gl_anytree_list2.h (gl_tree_iterator) (gl_tree_iterator_from_to):
  * lib/gl_anytree_oset.h (gl_tree_iterator):
  * lib/gl_array_list.c (gl_array_iterator) (gl_array_iterator_from_to):
  * lib/gl_array_oset.c (gl_array_iterator):
  * lib/gl_carray_list.c (gl_carray_iterator) (gl_carray_iterator_from_to):
  * lib/idcache.c:
  * lib/inet_ntop.c (IF_LINT):
  * lib/regcomp.c (build_charclass_op, create_tree):
  * lib/regex_internal.c (re_acquire_state) (re_acquire_state_context):
  * lib/trigl.c (rcsid):
  * lib/trim.c (IF_LINT):
  * lib/vasnprintf.c (IF_LINT):
  * lib/verify.h (assume): Treat GCC_LINT like lint.
* regex: make it closer to libc | Paul Eggert | 2016-02-19 | 1 | -33/+32
  Make Idx a signed type, rather than possibly unsigned. The unsignedness was not really buying us anything, since the code overflows for other reasons before getting to PTRDIFF_MAX. Making it signed allows us to use -1 and -2 with abandon, like libc does, thus lessening the number of differences between gnulib and libc. Also, it should help avoid gratuitous warnings like the one reported by Nelson H. F. Beebe in: http://bugs.gnu.org/22702
  * lib/regex.h (__re_idx_t): Remove. All uses changed to regoff_t.
  * lib/regex_internal.h (SSIZE_MAX): Define if <limits.h> doesn't.
  (IDX_MAX) [_REGEX_LARGE_OFFSETS]: Now SSIZE_MAX.
  (REG_MISSING, REG_ERROR, REG_VALID_INDEX, REG_VALID_NONZERO_INDEX): Remove. Revert all uses to their libc versions.
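  The convenience of a signed index type can be seen in a small search routine. The sketch below is illustrative only: the typedef uses ptrdiff_t as a stand-in for the signed index type, and find_first is a hypothetical function, not part of the regex code.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration: a signed index type.  */
typedef ptrdiff_t Idx;

/* Return the index of the first element of ARR equal to KEY, or -1
   if there is none.  With a signed Idx, -1 is an ordinary value and
   callers can test "result < 0" without casts or special macros.  */
Idx
find_first (const int *arr, Idx len, int key)
{
  assert (len >= 0);

  for (Idx i = 0; i < len; i++)
    if (arr[i] == key)
      return i;
  return -1;
}
```

  With an unsigned index type the same sentinel would need a named constant and a dedicated validity test, which is the role the removed REG_MISSING and REG_VALID_INDEX macros appear to have played.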
* regex: merge patches from libc | Paul Eggert | 2016-02-19 | 1 | -31/+7
  2015-10-21 Joseph Myers <joseph@codesourcery.com>
  2015-10-20 Joseph Myers <joseph@codesourcery.com>
  Convert miscellaneous function definitions to prototype style.
  * lib/regcomp.c (re_compile_pattern, re_set_syntax) (re_compile_fastmap, regcomp, regerror, regfree, re_comp):
  * lib/regexec.c (regexec, re_match, re_search, re_match_2, re_search_2) (re_search_2_stub, re_search_stub, re_set_registers, re_exec) (re_search_internal): Convert to prototype-style function definition.
  Use internal_function for internal functions.
* regex: treat [x] as x if x is a unibyte encoding error | Paul Eggert | 2016-01-24 | 1 | -2/+15
  Problem reported by Aharon Robbins in:
  http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00091.html
  * lib/regcomp.c (parse_byte) [!_LIBC && RE_ENABLE_I18N]: New function.
  (build_range_exp) [!_LIBC && RE_ENABLE_I18N]: Use it.
* regex: pacify static checkers | Paul Eggert | 2016-01-18 | 1 | -0/+6
  Problem and draft fix reported by Aharon Robbins in:
  http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00082.html
  * lib/regcomp.c (build_charclass_op, create_tree) [lint]: Clear memory to pacify static checkers.
* regex: fix [ diagnostic | Paul Eggert | 2016-01-18 | 1 | -2/+2
  Problem and fix reported by Aharon Robbins in:
  http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00082.html
  * lib/regcomp.c (REG_EBRACK_IDX): Fix misleading diagnostic about [.
  * lib/regcomp.c (build_range_exp, build_charclass_op)
* regex: fix memory leaks | Paul Eggert | 2016-01-18 | 1 | -14/+13
  Problem and draft fix reported by Aharon Robbins in:
  http://lists.gnu.org/archive/html/bug-gnulib/2016-01/msg00082.html
  * lib/regcomp.c (build_range_exp, build_charclass_op):
  * lib/regex_internal.c (re_dfa_add_node): Fix memory leak on failure.
* version-etc: new year | Paul Eggert | 2016-01-01 | 1 | -1/+1
  * build-aux/gendocs.sh (version):
  * doc/gendocs_template:
  * doc/gendocs_template_min:
  * doc/gnulib.texi:
  * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright dates by hand in templates and the like.
  * all files: Run 'make update-copyright'.
* Diagnose ERE '()|\1' | Paul Eggert | 2015-09-19 | 1 | -0/+4
  Problem reported by Hanno Böck in: http://bugs.gnu.org/21513
  * lib/regcomp.c (parse_reg_exp): While parsing alternatives, keep track of the set of previously-completed subexpressions available before the first alternative, and restore this set just before parsing each subsequent alternative. This lets us diagnose the invalid back-reference in the ERE '()|\1'.
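  Whether a particular regex implementation diagnoses such a pattern can be checked with a probe like the one below. This is a minimal sketch using the POSIX <regex.h> interface; the helper name ere_compile_status is hypothetical, and back-references in EREs are an extension, so results vary between implementations.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only.  Compile PATTERN as an
   extended regular expression and return regcomp's status: 0 on
   success, otherwise one of the REG_* error codes.  */
int
ere_compile_status (const char *pattern)
{
  assert (pattern != NULL);

  regex_t re;
  int err = regcomp (&re, pattern, REG_EXTENDED | REG_NOSUB);
  if (err == 0)
    regfree (&re);
  return err;
}
```

  With the change described above, ere_compile_status ("()|\\1") would be expected to return a nonzero error code, because the group is not available inside the second alternative.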
* regex: merge patches from libc | Paul Eggert | 2015-09-19 | 1 | -6/+10
  2015-09-08 Joseph Myers <joseph@codesourcery.com>
  Move bits/libc-lock.h and bits/libc-lockP.h out of bits/ (bug 14912).
  * lib/regex_internal.h: Include <libc-lock.h> instead of <bits/libc-lock.h>.
  2015-06-09 Joseph Myers <joseph@codesourcery.com>
  Fix regcomp wcscoll, wcscmp namespace (bug 18497).
  * lib/regcomp.c (build_range_exp): Call __wcscoll instead of wcscoll.
  * lib/regexec.c (check_node_accept_bytes): Likewise.
  2015-06-05 Joseph Myers <joseph@codesourcery.com>
  Fix regex wcrtomb namespace (bug 18496).
  * lib/regex_internal.c (build_wcs_upper_buffer): Call __wcrtomb instead of wcrtomb.
  2015-06-05 Joseph Myers <joseph@codesourcery.com>
  Fix regex wctype namespace (bug 18495).
  * lib/regcomp.c (re_compile_fastmap_iter): Call __towlower instead of towlower.
  * lib/regex_internal.c (build_wcs_upper_buffer): Call __iswlower instead of iswlower. Call __towupper instead of towupper.
  * lib/regex_internal.h (IS_WIDE_WORD_CHAR): Call __iswalnum instead of iswalnum.
  2015-01-07 Chris Metcalf <cmetcalf@ezchip.com>
  * lib/regcomp.c (parse_bracket_exp): Initialize type to COLL_SYM in a couple of places to avoid uninitialized variable warnings on tilegx gcc 4.8.2.
  2014-11-24 Siddhesh Poyarekar <siddhesh@redhat.com>
  * lib/regex_internal.h: Remove NOT_IN_libc.
  2014-11-17 Andreas Schwab <schwab@suse.de>
  * lib/regex_internal.h: Don't include <locale/elem-hash.h>.
  2014-09-11 Roland McGrath <roland@hack.frob.com>
  Move findidx nested functions to top-level.
  * lib/regcomp.c [_LIBC]: #include <locale/weight.h>.
  (build_equiv_class) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx.
  * lib/regexec.c [RE_ENABLE_I18N] [_LIBC]: #include <locale/weight.h>.
  [RE_ENABLE_I18N] (check_node_accept_bytes) [_LIBC]: Don't #include it inside the function. Pass new arguments to findidx.
  * lib/regex_internal.h: [!NOT_IN_libc] [_LIBC]: #include <locale/weight.h>.
  (re_string_elem_size_at): Don't #include it inside the function. Pass new arguments to findidx.
  2014-08-01 Siddhesh Poyarekar <siddhesh@redhat.com>
  Check if DEBUG is defined in regex_internal.c
  * lib/regex_internal.c: Check if DEBUG is defined and is set.
* version-etc: new year | Paul Eggert | 2014-12-31 | 1 | -1/+1
  * doc/gnulib.texi:
  * lib/version-etc.c (COPYRIGHT_YEAR): Update copyright date.
  * all files: Run 'make update-copyright'.
* regex: don't deref NULL upon heap allocation failure | Jim Meyering | 2014-07-12 | 1 | -0/+2
  * lib/regcomp.c (parse_dup_op): Handle duplicate_tree failure in one more place.
  To trigger the segfault, configure grep -with-included-regex, build it, and run these commands:
  ( ulimit -v 300000; echo a|src/grep -E a+++++++++++++++++++++ )
  I discovered this while replying to a private report from Jens Schleusener about excessive memory consumption by grep when using a regular expression like the one above.
* regex: fix memory leak in compiler | Paul Eggert | 2014-07-11 | 1 | -1/+5
  Fix by Andreas Schwab in:
  https://sourceware.org/ml/libc-alpha/2014-06/msg00503.html
  * lib/regcomp.c (parse_reg_exp): Deallocate partially constructed tree before returning error.
* regex: fix memory leak in compiler | Paul Eggert | 2014-06-19 | 1 | -3/+11
  Fix by Andreas Schwab in:
  https://sourceware.org/ml/libc-alpha/2014-06/msg00462.html
  * lib/regcomp.c (parse_expression): Deallocate partially constructed tree before returning error.
* maint: update copyright | Eric Blake | 2014-01-01 | 1 | -1/+1
  I ran 'make update-copyright'.
  Signed-off-by: Eric Blake <eblake@redhat.com>
* c-ctype, regex, verify: port to gcc -std=c90 -pedantic | Paul Eggert | 2013-05-29 | 1 | -1/+1
  Avoid constructions that are rejected by gcc -std=c90 -pedantic. This fixes a porting bug I recently reintroduced in regex, and some other instances that I discovered while testing the fix.
  * lib/c-ctype.h [__STRICT_ANSI__]: Avoid ({ ... }).
  * lib/regcomp.c (utf8_sb_map) [__STRICT_ANSI__]: Avoid [0 ... N] = E.
  * lib/regex_internal.h [!_LIBC && GNULIB_LOCK]: Do not use a macro with an empty argument if this is a pedantic pre-C99 GCC.
  * lib/verify.h: Do not use _Static_assert if this is a pedantic pre-C11 GCC.
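  The [0 ... N] = E form mentioned above is a GNU C range designator. The sketch below shows one way to guard it, assuming that __STRICT_ANSI__ is the relevant test; the table, its sizes, and the accessor table_at are hypothetical and are not the utf8_sb_map code.

```c
#include <assert.h>
#include <limits.h>

enum { TABLE_SIZE = 8, FILLED = 4 };

#if defined __GNUC__ && !defined __STRICT_ANSI__
/* GNU C extension: one range designator sets elements 0..FILLED-1.  */
static const unsigned char table[TABLE_SIZE] =
  { [0 ... FILLED - 1] = UCHAR_MAX };
#else
/* Strict ISO C: spell out the same elements one by one.  */
static const unsigned char table[TABLE_SIZE] =
  { UCHAR_MAX, UCHAR_MAX, UCHAR_MAX, UCHAR_MAX };
#endif

/* Hypothetical accessor so the table can be inspected.  */
unsigned char
table_at (int i)
{
  assert (0 <= i && i < TABLE_SIZE);
  return table[i];
}
```

  Both branches produce the same table; the remaining elements are zero because of the usual rule for partially initialized arrays.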
* regex: fix dfa race in multithreaded uses | Paul Eggert | 2013-05-19 | 1 | -3/+8
  Problem reported by Ludovic Courtès in <http://lists.gnu.org/archive/html/bug-gnulib/2013-05/msg00058.html>.
  * lib/regex_internal.h (lock_define, lock_init, lock_fini): New macros. All uses of __libc_lock_define, __libc_lock_init changed to use the first two of these.
  (__libc_lock_lock, __libc_lock_unlock): New macros, for non-glibc platforms.
  (struct re_dfa_t): Define the lock unconditionally.
  * lib/regexec.c (regexec, re_search_stub): Remove some now-incorrect '#ifdef _LIBC's.
  * modules/regex (Depends-on): Add pthread, if we use the included regex.
  * lib/regcomp.c: Do actions that are not needed for glibc, but may be needed elsewhere.
  (regfree, re_compile_internal): Destroy the lock.
  (re_compile_internal): Check for lock-initialization failure.
* regex: rename remaining __attribute calls to __attribute__. | Gary V. Vaughan | 2013-03-08 | 1 | -5/+5
  Commit 930b85b changed definition of __attribute, but left some uses unchanged, preventing compilation of regex module on most non-gcc environments:
  * lib/regcomp.c (re_set_fastmap, seek_collating_symbol_entry) (lookup_collation_sequence_value, build_range_exp) (build_collating_symbol): Set attributes with newly renamed __attribute__ decorator.
  * lib/regex_internal.c (re_string_peek_byte_case) (re_node_set_compare, re_node_set_contains): Likewise.
  * lib/regexec.c (acquire_init_state_context): Likewise.
  Signed-off-by: Gary V. Vaughan <gary@gnu.org>
* regex: merge patches from libc | Paul Eggert | 2013-02-25 | 1 | -45/+27
  2013-02-26 Siddhesh Poyarekar <siddhesh@redhat.com>
  * lib/regex_internal.h (__attribute__): Rename from __attribute. All uses changed.
  (bitset_not, bitset_merge, bitset_mask, re_string_char_size_at) (re_string_wchar_at, re_string_elem_size_at): Mark function as possibly unused.
  2013-02-12 Andreas Schwab <schwab@suse.de>
  [BZ #11561]
  * lib/regcomp.c (parse_bracket_exp) [_LIBC]: When looking up collating elements compare against the byte sequence of it, not its name.
* regex: conform to strict C | Paul Eggert | 2013-01-05 | 1 | -1/+2
  * lib/regcomp.c (parse_bracket_exp): Add cast to conform to strict C. From Aharon Robbins.