Commit message | Author | Age | Files | Lines
* version 3.4 (tag: v3.4) | Jim Meyering | 2020-01-02 | 1 | -1/+1
| | | | * NEWS: Record release date.
* build: update gnulib to latest, for mbrtowc-vs-Irix build fix | Jim Meyering | 2020-01-02 | 1 | -0/+0
|
* doc: fix --exclude description in man page | Paul Eggert | 2020-01-02 | 1 | -2/+2
| | | | | | | Problem reported by Duncan Moore (Bug#37212). * src/grep.c (usage): Fix incorrect statement about --exclude and directories. Standardize on “that match GLOB” instead of “matching GLOB”.
* doc: fix missing “more” in man page | Paul Eggert | 2020-01-02 | 1 | -1/+1
| | | | | Problem reported by Philippe Schnoebelen (Bug#34078). * doc/grep.in.1: Add missing “more”.
* doc: add [:blank:] to man page | Paul Eggert | 2020-01-01 | 1 | -0/+1
| | | | * doc/grep.in.1: Mention [:blank:] (Bug#33291).
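As a small illustration of the [:blank:] class this commit documents, the sketch below counts lines that contain a space or a tab. The sample input is made up for the example. Note that the class name has to sit inside a bracket expression, hence the doubled brackets.

```shell
# Hypothetical sample input: one line with a tab, one with a space, one with neither.
sample=$(printf 'a\tb\nc d\nef\n')

# [[:blank:]] matches a space or a tab; -c prints the number of matching lines.
blank_count=$(printf '%s\n' "$sample" | LC_ALL=C grep -c '[[:blank:]]')
echo "$blank_count"
```

Only the first two sample lines contain a blank, so they are the ones counted.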
* maint: update all copyright year number ranges | Jim Meyering | 2020-01-01 | 116 | -115/+115
| | | | | | | | Run "make update-copyright" and then... * gnulib: Update to latest with copyright year adjusted. * tests/init.sh: Sync with gnulib to pick up copyright year. * bootstrap: Likewise. * doc/grep.in.1: Use "-" in copyright year ranges, not \en.
* tests: avoid unwarranted failure in a netbsd 8.1 VM | Jim Meyering | 2019-12-31 | 1 | -0/+7
| | | | | * tests/mb-non-UTF8-perf-Fw: Run twice, to avoid first-read penalty. Reported by Nelson H.F. Beebe.
* build: update gnulib to latest (for localeinfo perf fix) | Jim Meyering | 2019-12-30 | 1 | -0/+0
|
* maint: add syntax-check rule to prohibit "backreference" spelling | Jim Meyering | 2019-12-30 | 1 | -0/+6
| | | | * cfg.mk (sc_prohibit_backref): New rule.
* maint: remove too-long line from AUTHORS | Paul Eggert | 2019-12-30 | 1 | -1/+0
| | | | * AUTHORS: Remove URL that’s too long.
* maint: update AUTHORS | Paul Eggert | 2019-12-30 | 1 | -14/+17
| | | | * AUTHORS: Update to better reflect current authorship.
* avoid new syntax-check failures | Jim Meyering | 2019-12-30 | 1 | -1/+1
| | | | * cfg.mk (old_NEWS_hash): When updating old news, we must also update this.
* doc: don’t encourage back-references | Paul Eggert | 2019-12-30 | 1 | -25/+0
| | | | | | | | * doc/grep.texi (Usage): Remove palindrome question. Bondioni’s RE makes grep issue a ‘grep: stack overflow’ diagnostic, and we shouldn’t be encouraging fancy back-references anyway, due to all the bugs in this area (Bug#26864). Plus, the allusion to “GNU extensions” doesn't seem to be correct here.
* doc: robustify some examples | Paul Eggert | 2019-12-30 | 1 | -17/+26
| | | | | Prompted by suggestions by Stephane Chazelas (Bug#38792#20). * doc/grep.texi (Usage): Make examples more robust.
* doc: fix bug# typo | Paul Eggert | 2019-12-30 | 1 | -1/+1
|
* doc: spell "back-reference" more consistently | Paul Eggert | 2019-12-30 | 7 | -18/+18
|
* doc: mention back-reference bugs | Paul Eggert | 2019-12-30 | 1 | -1/+18
| | | | | | Inspired by Bug#26864. * doc/grep.texi (Known Bugs): New section. Mention back-reference issues.
* doc: Add -- to more-complex example | Paul Eggert | 2019-12-29 | 2 | -6/+12
| | | | | Suggested by Stephane Chazelas (Bug#38792). * doc/grep.in.1, doc/grep.texi: Add ‘--’ to recently-added example.
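To show why ‘--’ matters in such examples, here is a minimal sketch. It is not the example from the manual; the file name -v and the scratch directory are hypothetical. Without ‘--’, grep would parse -v as an option rather than as a file name.

```shell
# Scratch directory so the hypothetical file name does no harm.
tmpdir=$(mktemp -d)

# A file whose name begins with '-' looks like an option on the command line.
printf 'hello\n' > "$tmpdir/-v"

# -e marks the pattern; '--' ends option processing, so -v is read as a file.
dashdash_out=$(cd "$tmpdir" && grep -e 'hello' -- -v)
echo "$dashdash_out"

rm -rf "$tmpdir"
```

The same habit protects scripts that pass file names from a glob or a variable, where a leading dash cannot be ruled out.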
* doc: improve subsection title (Bug#26132) | Paul Eggert | 2019-12-29 | 1 | -1/+1
| | | | * doc/grep.in.1: Rename "Matcher Selection" to "Pattern Syntax".
* doc: fix typo in previous patch | Paul Eggert | 2019-12-29 | 1 | -1/+1
|
* doc: document quoting better | Paul Eggert | 2019-12-29 | 2 | -34/+101
| | | | | | | | | | Problem reported by Martin Simons (Bug#38792). * doc/grep.texi: Fix quoting used in examples. Say that patterns should be quoted, use quoting more consistently in examples, and give an example illustrating the difference between patterns and globbing. Don’t assume zgrep expertise in example. * doc/grep.in.1: Likewise. Also, reorder sections to match GNU/Linux man-pages style.
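The difference between patterns and globbing can be sketched as follows. This is an illustration written for this log, not the text added to the manual, and the file names are hypothetical. The quoted pattern reaches grep unchanged and is read as a regular expression, while the unquoted *.txt is expanded by the shell into file names before grep runs.

```shell
# Scratch directory with two hypothetical files.
tmpdir=$(mktemp -d)
printf 'abc\n' > "$tmpdir/a.txt"
printf 'xyz\n' > "$tmpdir/b.txt"

# Quoted 'a.*c': a regular expression, passed to grep as-is.
# Unquoted *.txt: a glob, expanded by the shell to the matching file names.
glob_out=$(grep -l -e 'a.*c' -- "$tmpdir"/*.txt)
echo "$glob_out"

rm -rf "$tmpdir"
```

Only a.txt contains a line matching the regular expression, so -l lists that file alone.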
* maint: tweak NEWS wording | Jim Meyering | 2019-12-26 | 1 | -2/+2
| | | | * NEWS: Minor wording change.
* build: update gnulib to latest; and sync tests/init.sh | Jim Meyering | 2019-12-26 | 2 | -2/+8
| | | | | * gnulib: update * tests/init.sh: Sync from gnulib (this removes the LC_ALL=C setting).
* tests: avoid spurious failure due to 1-second timeout | Jim Meyering | 2019-12-26 | 1 | -1/+1
| | | | | | | * tests/grep-dev-null-out: Use a 10-second timeout, rather than a 1-second one. This avoids false failure on slow systems. Reported by Assaf Gordon in https://lists.gnu.org/r/grep-devel/2019-12/msg00018.html
* build: update gnulib submodule to latest | Paul Eggert | 2019-12-26 | 1 | -0/+0
|
* maint: adjust surrogate-pair for 16-bit wchar_t | Paul Eggert | 2019-12-26 | 1 | -2/+5
| | | | | | * tests/surrogate-pair: Adjust to match fixed behavior on AIX 7.2, where wchar_t is 16 bits and cannot represent the test case data.
* tests: fix typo in name of test file | Jim Meyering | 2019-12-25 | 2 | -1/+1
| | | | | | * tests/backslash-s-vs-invalid-multitype: Rename to... * tests/backslash-s-vs-invalid-multibyte: ...this. * tests/Makefile.am (TESTS): Reflect renaming.
* tests: ensure we use require_timeout_ when needed | Jim Meyering | 2019-12-25 | 1 | -0/+13
| | | | * cfg.mk (sc_timeout_prereq): New syntax-check rule.
* tests: require timeout | Jim Meyering | 2019-12-25 | 1 | -0/+1
| | | | | | | | * tests/mb-non-UTF8-perf-Fw: This test uses "timeout", so must first call require_timeout_. This avoids a spurious test failure when running with no timeout program. Reported by Bruno Haible in https://lists.gnu.org/r/grep-devel/2019-12/msg00008.html
* tests: work around AIX 7.2 sh printf bug | Paul Eggert | 2019-12-25 | 3 | -5/+7
| | | | | | | | | | AIX 7.2 /bin/sh’s printf command mishandles octal escapes in multibyte locales: it treats them as characters, not bytes. * tests/backslash-s-vs-invalid-multitype, tests/encoding-error: Use the C locale when employing the printf command with an octal escape that AIX 7.2 sh might mishandle. * tests/init.sh (setup_): Use the C locale for tests. This has the side benefit of making them more reproducible.
* maint: adjust new comments | Jim Meyering | 2019-12-22 | 1 | -7/+7
| | | | | * src/dfasearch.c (possible_backrefs_in_pattern): Remove a duplicate "a", insert a "be" and a comma, and reformat.
* build: update gnulib to latest | Jim Meyering | 2019-12-22 | 3 | -227/+294
| | | | | | * gnulib: Update submodule to latest. * bootstrap: Copy from gnulib. * tests/init.sh: Likewise.
* grep: fix some bugs in pattern-grouping speedup | Paul Eggert | 2019-12-22 | 2 | -47/+82
| | | | | | | | | | | | | | | | | | | This fixes some bugs in the previous commit, and should finish the fix for Bug#33249. * NEWS: Mention fix for Bug#33249. * src/dfasearch.c (possible_backrefs_in_pattern, regex_compile) (GEAcompile): In new code, prefer ptrdiff_t to size_t when either will do, since ptrdiff_t has better error checking. At some point we should adjust the old code too. (possible_backrefs_in_pattern): Rename from find_backref_in_pattern. New arg BS_SAFE. All uses changed. Fix false negative if a multibyte character ends in a single '\\' byte, followed by the two bytes '\\', '1'. (regex_compile): Simplify. (GEAcompile): Avoid quadratic behavior when reallocating growing buffers. Fix a couple of bugs in copying pattern data involving backreferences. Fix another bug in copying pattern metadata involving backreferences, by removing the need to copy it.
* grep: grouping of a pattern with multiple lines | Norihiro Tanaka | 2019-12-22 | 1 | -20/+107
| | | | | | | | | | | | | | | | | | | | When grep uses regex, it splits a pattern with multiple lines at each newline character into fragments. Compilation and execution run separately for each fragment, which causes a slowdown. With this change, fragments are divided into groups by whether they include back references. Each fragment with back references constitutes a group, and all fragments that lack back references together constitute a group. This change greatly speeds up the following case. $ seq -f '%040g' 0 9999 | sed '1s/$/\\(0\\)\\1/' >pat $ yes 00000000000000000000000000000000000000000x | head -10000 >in $ time -p env LC_ALL=C src/grep -f pat in * src/dfasearch.c (find_backref_in_pattern, regex_compile): New functions. (GEAcompile): Use the new functions to group fragments as mentioned above.
* maint: add NEWS for Bug#34951 fix | Paul Eggert | 2019-12-19 | 1 | -0/+4
| | | | * NEWS: Mention Bug#34951.
* dfa: separate parse and compile phase | Norihiro Tanaka | 2019-12-19 | 1 | -1/+2
| | | | | | | | | | DFAMUST() must be called after parsing and before the token reordering that was introduced in commit 5c7a0371823876cca7a1347fa09ca26bbbff0c98, but both are executed in the compilation phase. * lib/dfa.c (dfaparse): Change it to a global function. (dfacomp): If the first argument is NULL, skip parsing. * lib/dfa.h (dfaparse): Add a prototype.
* build: update gnulib submodule to latest | Paul Eggert | 2019-12-19 | 1 | -0/+0
|
* grep: speed up multiple word matching | Norihiro Tanaka | 2019-12-19 | 1 | -0/+18
| | | | | | | | | | | | | | grep uses its KWset matcher for multiple word matching, but that is very slow when most of the parts matched to a pattern are not words. So, if the first match to a pattern is not a word, use the grep matcher to match for its line. Note that when START_PTR is set, the grep matcher uses the regex matcher, which is very slow to match words. Therefore, we use the grep matcher only when START_PTR is NULL. * src/kwsearch.c (Fexecute): If an initial match is incomplete because not on a word boundary, use the grep matcher to find a matching line.
* maint: sort test names | Jim Meyering | 2019-12-18 | 1 | -1/+1
| | | | | * tests/Makefile.am (TESTS): Alphabetize the new addition, mb-non-UTF8-perf-Fw to placate syntax-check's sc_sorted_tests.
* maint: adjust to recent Gnulib change | Paul Eggert | 2019-12-18 | 1 | -1/+0
| | | | * po/POTFILES.in: Remove lib/xstrtol-error.c.
* grep: do not match invalid UTF-8 | Paul Eggert | 2019-12-17 | 5 | -2/+35
| | | | | | | | Update Gnulib to latest. Also: * src/dfasearch.c (EGexecute): Use ptrdiff_t, not size_t, to match new Gnulib API. * tests/Makefile.am (TESTS): Add dfa-invalid-utf8. * tests/dfa-invalid-utf8: New file.
* tests: add test that would have detected -Fw perf regression | Jim Meyering | 2019-12-01 | 2 | -0/+32
| | | | | | * tests/mb-non-UTF8-perf-Fw: New file. Detect v3.3-22-g090a4db's performance regression. * tests/Makefile.am (TESTS): Add it.
* maint: fix test comment | Jim Meyering | 2019-11-30 | 1 | -1/+1
| | | | | * tests/mb-non-UTF8-word-boundary: Also correct "introduced-in" version number in a comment here.
* maint: correct NEWS blurb | Jim Meyering | 2019-11-26 | 1 | -1/+1
| | | | | * NEWS (Bug fixes): Correction: the -Fw bug was introduced in 2.28, not in 3.0. Reported by Paul Eggert.
* grep: improve grep -Fw performance in non-UTF8 multibyte locales | Norihiro Tanaka | 2019-11-17 | 4 | -22/+29
| | | | | | | * src/searchutils.c (mb_goback): New parameter. All callers changed. * src/search.h (mb_goback): Update prototype. * src/kwsearch.c (Fexecute): Use mb_goback's MBCLEN to detect a word-boundary even more efficiently.
* grep: fix performance regression with previous patch | Norihiro Tanaka | 2019-11-17 | 1 | -3/+12
| | | | | * src/kwsearch.c (Fexecute): Avoid unnecessary back-up in non-UTF8 multibyte locales.
* maint: rename a variable: bol -> nl | Jim Meyering | 2019-11-16 | 1 | -2/+2
| | | | * src/kwsearch.c (Fexecute): Change misleading name: s/bol/nl/
* build: update gnulib to latest | Jim Meyering | 2019-11-16 | 1 | -0/+0
|
* maint: correct and clarify a comment | Jim Meyering | 2019-11-16 | 1 | -3/+3
| | | | * src/kwsearch.c (Fexecute): Logic was reversed.
* grep: avoid false -Fw match in non-UTF8 multibyte locales | Jim Meyering | 2019-11-16 | 4 | -3/+40
| | | | | | | | | | | | | | For example, this command would erroneously print its input line: echo ab | LC_CTYPE=ja_JP.eucjp grep -Fw b This arose when the "memrchr" search for a preceding newline failed: in that case, MB_START was not adjusted and was initially the same as BEG, so wordchar_prev mistakenly returned 0. * src/kwsearch.c (Fexecute): Set MB_START also when there is no preceding newline. * NEWS (Bug fixes): Mention it. * tests/mb-non-UTF8-word-boundary: New file. Test for the bug. * tests/Makefile.am (TESTS): Add it. Reported by NIDE, Naoyuki in https://bugs.gnu.org/38223.
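The intended -Fw behavior can be checked with a sketch like the one below. It runs in the C locale as an assumption, since the ja_JP.eucjp locale from the report may not be installed, so it shows the expected result rather than reproducing the bug.

```shell
# 'b' is not a whole word in "ab", so -Fw should not select the line.
if printf 'ab\n' | LC_ALL=C grep -Fw b >/dev/null; then
  fw_joined=match
else
  fw_joined=nomatch
fi

# In "a b", 'b' stands alone as a word, so the line is selected and counted.
fw_separate=$(printf 'a b\n' | LC_ALL=C grep -Fwc b)

echo "$fw_joined $fw_separate"
```

To exercise the reported case itself, the same two commands could be rerun with LC_CTYPE=ja_JP.eucjp on a system where that locale exists.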