author | Hamza Mahfooz <someguy@effective-light.com> | 2021-10-15 12:13:56 -0400 |
---|---|---|
committer | Junio C Hamano <gitster@pobox.com> | 2021-10-15 12:45:39 -0700 |
commit | ae39ba431ab861548eb60b4bd2e1d8b8813db76f (patch) | |
tree | a2c09de1515827b3a657064b689a5280f8b51c8b /grep.c | |
parent | 6a5c337922a5221d1f6d025d84e18b526df9944c (diff) | |
grep/pcre2: fix an edge case concerning ascii patterns and UTF-8 data

If we attempt to grep non-ascii log message text with an ascii pattern, we
run into the following issue:

    $ git log --color --author='.var.*Bjar' -1 origin/master | grep ^Author
    grep: (standard input): binary file matches

So, to fix this teach the grep code to use PCRE2_UTF, as long as the log
output is encoded in UTF-8.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Hamza Mahfooz <someguy@effective-light.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Diffstat (limited to 'grep.c')
-rw-r--r-- | grep.c | 6 |
1 files changed, 4 insertions, 2 deletions
@@ -382,8 +382,10 @@ static void compile_pcre2_pattern(struct grep_pat *p, const struct grep_opt *opt
 		}
 		options |= PCRE2_CASELESS;
 	}
-	if (!opt->ignore_locale && is_utf8_locale() && has_non_ascii(p->pattern) &&
-	    !(!opt->ignore_case && (p->fixed || p->is_fixed)))
+	if ((!opt->ignore_locale && !has_non_ascii(p->pattern)) ||
+	    (!opt->ignore_locale && is_utf8_locale() &&
+	     has_non_ascii(p->pattern) && !(!opt->ignore_case &&
+					    (p->fixed || p->is_fixed))))
 		options |= (PCRE2_UTF | PCRE2_MATCH_INVALID_UTF);
 
 #ifdef GIT_PCRE2_VERSION_10_36_OR_HIGHER
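
To make the change in the condition easier to follow, here is a minimal sketch that models the old and new predicates as standalone C functions. It is not code from grep.c: the names `old_use_utf`, `new_use_utf`, and `pattern_has_non_ascii` are hypothetical, the locale check is passed in as a plain boolean instead of calling `is_utf8_locale()`, and `p->fixed || p->is_fixed` is collapsed into a single `fixed` parameter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for has_non_ascii(): true if any byte is > 0x7f. */
static bool pattern_has_non_ascii(const char *s)
{
	if (s == NULL)
		return false;
	for (; *s != '\0'; s++) {
		if ((unsigned char)*s > 0x7f)
			return true;
	}
	return false;
}

/*
 * Condition before the patch: UTF mode is requested only for non-ASCII
 * patterns in a UTF-8 locale, except for case-sensitive fixed strings.
 */
static bool old_use_utf(const char *pattern, bool ignore_locale,
			bool utf8_locale, bool ignore_case, bool fixed)
{
	return !ignore_locale && utf8_locale &&
	       pattern_has_non_ascii(pattern) &&
	       !(!ignore_case && fixed);
}

/*
 * Condition after the patch: a pure-ASCII pattern also requests UTF mode
 * whenever ignore_locale is unset; the non-ASCII branch is unchanged.
 */
static bool new_use_utf(const char *pattern, bool ignore_locale,
			bool utf8_locale, bool ignore_case, bool fixed)
{
	return (!ignore_locale && !pattern_has_non_ascii(pattern)) ||
	       (!ignore_locale && utf8_locale &&
		pattern_has_non_ascii(pattern) &&
		!(!ignore_case && fixed));
}
```

Under these assumptions, an ASCII pattern such as `.var.*Bjar` with `ignore_locale` unset yields false from `old_use_utf` and true from `new_use_utf`, which corresponds to the case described in the commit message. The result for non-ASCII patterns is the same in both versions.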