summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'bc/vcs-svn-cleanup'Junio C Hamano2017-08-263-17/+10
|\ | | | | | | | | | | | | | | Code clean-up. * bc/vcs-svn-cleanup: vcs-svn: rename repo functions to "svn_repo" vcs-svn: remove unused prototypes
| * vcs-svn: rename repo functions to "svn_repo"bc/vcs-svn-cleanupbrian m. carlson2017-08-203-10/+10
| | | | | | | | | | | | | | | | | | | | There were several functions in the Subversion code that started with "repo_". This namespace is also used by the Git struct repository code. Rename these functions to start with "svn_repo" to avoid any future conflicts. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * vcs-svn: remove unused prototypesbrian m. carlson2017-08-201-7/+0
| | | | | | | | | | | | | | | | | | | | The Subversion code had prototypes for several functions which were not ever defined or used. These functions all had names starting with "repo_", some of which conflict with those in repository.h. To avoid the conflict, remove those unused prototypes. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'tb/apply-with-crlf'Junio C Hamano2017-08-264-16/+71
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git apply" that is used as a better "patch -p1" failed to apply a taken from a file with CRLF line endings to a file with CRLF line endings. The root cause was because it misused convert_to_git() that tried to do "safe-crlf" processing by looking at the index entry at the same path, which is a nonsense---in that mode, "apply" is not working on the data in (or derived from) the index at all. This has been fixed. * tb/apply-with-crlf: apply: file commited with CRLF should roundtrip diff and apply convert: add SAFE_CRLF_KEEP_CRLF
| * | apply: file commited with CRLF should roundtrip diff and applytb/apply-with-crlfTorsten Bögershausen2017-08-192-11/+63
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a file had been commited with CRLF but now .gitattributes say "* text=auto" (or core.autocrlf is true), the following does not roundtrip, `git apply` fails: printf "Added line\r\n" >>file && git diff >patch && git checkout -- . && git apply patch Before applying the patch, the file from working tree is converted into the index format (clean filter, CRLF conversion, ...). Here, when commited with CRLF, the line endings should not be converted. Note that `git apply --index` or `git apply --cache` doesn't call convert_to_git() because the source material is already in index format. Analyze the patch if there is a) any context line with CRLF, or b) if any line with CRLF is to be removed. In this case the patch file `patch` has mixed line endings, for a) it looks like this: diff --git a/one b/one index 533790e..c30dea8 100644 --- a/one +++ b/one @@ -1 +1,2 @@ a\r +b\r And for b) it looks like this: diff --git a/one b/one index 533790e..485540d 100644 --- a/one +++ b/one @@ -1 +1 @@ -a\r +b\r If `git apply` detects that the patch itself has CRLF, (look at the line " a\r" or "-a\r" above), the new flag crlf_in_old is set in "struct patch" and two things will happen: - read_old_data() will not convert CRLF into LF by calling convert_to_git(..., SAFE_CRLF_KEEP_CRLF); - The WS_CR_AT_EOL bit is set in the "white space rule", CRLF are no longer treated as white space. While at there, make it clear that read_old_data() in apply.c knows what it wants convert_to_git() to do with respect to CRLF. In fact, this codepath is about applying a patch to a file in the filesystem, which may not exist in the index, or may exist but may not match what is recorded in the index, or in the extreme case, we may not even be in a Git repository. If convert_to_git() peeked at the index while doing its work, it *would* be a bug. Pass NULL instead of &the_index to convert_to_git() to make sure we catch future bugs to clarify this. Update the test in t4124: split one test case into 3: - Detect the " a\r" line in the patch - Detect the "-a\r" line in the patch - Use LF in repo and CLRF in the worktree. Reported-by: Anthony Sottile <asottile@umich.edu> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | convert: add SAFE_CRLF_KEEP_CRLFTorsten Bögershausen2017-08-162-5/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When convert_to_git() is called, the caller may want to keep CRLF to be kept as CRLF (and not converted into LF). This will be used in the next commit, when apply works with files that have CRLF and patches are applied onto these files. Add the new value "SAFE_CRLF_KEEP_CRLF" to safe_crlf. Prepare convert_to_git() to be able to run the clean filter, skip the CRLF conversion and run the ident filter. Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jt/stash-tests'Junio C Hamano2017-08-261-0/+34
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Test update to improve coverage for "git stash" operations. * jt/stash-tests: stash: add a test for stashing in a detached state stash: add a test for when apply fails during stash branch stash: add a test for stash create with no files
| * | | stash: add a test for stashing in a detached statejt/stash-testsJoel Teichroeb2017-08-191-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All that we are really testing here is that the message is correct when we are not on any branch. All other functionality is already tested elsewhere. Signed-off-by: Joel Teichroeb <joel@teichroeb.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | stash: add a test for when apply fails during stash branchJoel Teichroeb2017-08-191-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If the return value of merge recursive is not checked, the stash could end up being dropped even though it was not applied properly Signed-off-by: Joel Teichroeb <joel@teichroeb.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | stash: add a test for stash create with no filesJoel Teichroeb2017-08-191-0/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Ensure the command suceeds and outputs nothing Signed-off-by: Joel Teichroeb <joel@teichroeb.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'jk/trailers-parse'Junio C Hamano2017-08-268-52/+314
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git interpret-trailers" has been taught a "--parse" and a few other options to make it easier for scripts to grab existing trailer lines from a commit log message. * jk/trailers-parse: doc/interpret-trailers: fix "the this" typo pretty: support normalization options for %(trailers) t4205: refactor %(trailers) tests pretty: move trailer formatting to trailer.c interpret-trailers: add --parse convenience option interpret-trailers: add an option to unfold values interpret-trailers: add an option to show only existing trailers interpret-trailers: add an option to show only the trailers trailer: put process_trailers() options into a struct
| * | | | doc/interpret-trailers: fix "the this" typojk/trailers-parseMartin Ã…gren2017-08-201-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | pretty: support normalization options for %(trailers)Jeff King2017-08-154-6/+79
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The interpret-trailers command recently learned some options to make its output easier to parse (for a caller whose only interested in picking out the trailer values). But it's not very efficient for asking for the trailers of many commits in a single invocation. We already have "%(trailers)" to do that, but it doesn't know about unfolding or omitting non-trailers. Let's plumb those options through, so you can have the best of both. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | t4205: refactor %(trailers) testsJeff King2017-08-151-8/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We currently have one test for %(trailers). In preparation for more, let's refactor a few bits: - move the commit creation to its own setup step so it can be reused by multiple tests - add a trailer with whitespace continuation (to confirm that it is left untouched) - fix the sample text which claims the placeholder is %bT. This was switched long ago to %(trailers) - replace one "cat" with an "echo" when generating the expected output. This saves a process (and sets a better pattern for future tests to follow). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | pretty: move trailer formatting to trailer.cJeff King2017-08-153-11/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The next commit will add many features to the %(trailer) placeholder in pretty.c. We'll need to access some internal functions of trailer.c for that, so our options are either: 1. expose those functions publicly or 2. make an entry point into trailer.c to do the formatting Doing (2) ends up exposing less surface area, though do note that caveats in the docstring of the new function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | interpret-trailers: add --parse convenience optionJeff King2017-08-152-7/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The last few commits have added command line options that can turn interpret-trailers into a parsing tool. Since they'd most often be used together, let's provide a convenient single option for callers to invoke this mode. This is implemented as a callback rather than a boolean so that its effect is applied immediately, as if those options had been specified. Later options can then override them. E.g.: git interpret-trailers --parse --no-unfold would work. Let's also update the documentation to make clear that this parsing mode behaves quite differently than the normal "add trailers to the input" mode. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | interpret-trailers: add an option to unfold valuesJeff King2017-08-155-0/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The point of "--only-trailers" is to give a caller an output that's easy for them to parse. Getting rid of the non-trailer material helps, but we still may see more complicated syntax like whitespace continuation. Let's add an option to unfold any continuation, giving the output as a single "key: value" line per trailer. As a bonus, this could be used even without --only-trailers to clean up unusual formatting in the incoming data. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | interpret-trailers: add an option to show only existing trailersJeff King2017-08-155-4/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It can be useful to invoke interpret-trailers for the primary purpose of parsing existing trailers. But in that case, we don't want to apply existing ifMissing or ifExists rules from the config. Let's add a special mode where we avoid applying those rules. Coupled with --only-trailers, this gives us a reasonable parsing tool. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | interpret-trailers: add an option to show only the trailersJeff King2017-08-155-9/+58
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In theory it's easy for any reader who wants to parse trailers to do so. But there are a lot of subtle corner cases around what counts as a trailer, when the trailer block begins and ends, etc. Since interpret-trailers already has our parsing logic, let's let callers ask it to just output the trailers. They still have to parse the "key: value" lines, but at least they can ignore all of the other corner cases. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | trailer: put process_trailers() options into a structJeff King2017-08-103-12/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We already have two options and are about to add a few more. To avoid having a huge number of boolean arguments, let's convert to an options struct which can be passed in. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'pb/trailers-from-command-line'Junio C Hamano2017-08-265-53/+274
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git interpret-trailers" learned to take the trailer specifications from the command line that overrides the configured values. * pb/trailers-from-command-line: interpret-trailers: fix documentation typo interpret-trailers: add options for actions trailers: introduce struct new_trailer_item trailers: export action enums and corresponding lookup functions
| * | | | | interpret-trailers: fix documentation typopb/trailers-from-command-linePaolo Bonzini2017-08-141-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Self-explanatory... trailer.ifexists is documented with the right name, but after a while it switches to ifexist. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | interpret-trailers: add options for actionsPaolo Bonzini2017-08-145-6/+156
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Allow using non-default values for trailers without having to set them up in .gitconfig first. For example, if you have the following configuration trailer.signed-off-by.where = end you may use "--where before" when a patch author forgets his Signed-off-by and provides it in a separate email. Likewise for --if-exists and --if-missing Reverting to the behavior specified by .gitconfig is done with --no-where, --no-if-exists and --no-if-missing. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | trailers: introduce struct new_trailer_itemPaolo Bonzini2017-08-143-13/+61
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This will provide a place to store the current state of the --where, --if-exists and --if-missing options. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | trailers: export action enums and corresponding lookup functionsPaolo Bonzini2017-07-252-32/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Separate the mechanical changes out of the next patch. The functions are changed to take a pointer to enum, because struct conf_info is not going to be public. Set the default values explicitly in default_conf_info, since they are not anymore close to default_conf_info and it's not obvious which constant has value 0. With the next patches, in fact, the values will not be zero anymore! Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'jt/diff-color-move-fix'Junio C Hamano2017-08-264-82/+236
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A handful of bugfixes and an improvement to "diff --color-moved". * jt/diff-color-move-fix: diff: define block by number of alphanumeric chars diff: respect MIN_BLOCK_LENGTH for last block diff: avoid redundantly clearing a flag
| * | | | | | diff: define block by number of alphanumeric charsjt/diff-color-move-fixJonathan Tan2017-08-164-81/+183
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The existing behavior of diff --color-moved=zebra does not define the minimum size of a block at all, instead relying on a heuristic applied later to filter out sets of adjacent moved lines that are shorter than 3 lines long. This can be confusing, because a block could thus be colored as moved at the source but not at the destination (or vice versa), depending on its neighbors. Instead, teach diff that the minimum size of a block is 20 alphanumeric characters, the same heuristic used by "git blame". This allows diff to still exclude uninteresting lines appearing on their own (such as those solely consisting of one or a few closing braces), as was the intention of the adjacent-moved-line heuristic. This requires a change in some tests in that some of their lines are no longer considered to be part of a block, because they are too short. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff: respect MIN_BLOCK_LENGTH for last blockJonathan Tan2017-08-162-7/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Currently, MIN_BLOCK_LENGTH is only checked when diff encounters a line that does not belong to the current block. In particular, this means that MIN_BLOCK_LENGTH is not checked after all lines are encountered. Perform that check. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff: avoid redundantly clearing a flagJonathan Tan2017-08-141-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | No code in diff.c sets DIFF_SYMBOL_MOVED_LINE except in mark_color_as_moved(), so it is redundant to clear it for the current line. Therefore, clear it only for previous lines. This makes a refactoring in a subsequent patch easier. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'sb/diff-color-move'Junio C Hamano2017-08-269-316/+1618
|\ \ \ \ \ \ \ | |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git diff" has been taught to optionally paint new lines that are the same as deleted lines elsewhere differently from genuinely new lines. * sb/diff-color-move: (25 commits) diff: document the new --color-moved setting diff.c: add dimming to moved line detection diff.c: color moved lines differently, plain mode diff.c: color moved lines differently diff.c: buffer all output if asked to diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP diff.c: convert word diffing to use emit_diff_symbol diff.c: convert show_stats to use emit_diff_symbol diff.c: convert emit_binary_diff_body to use emit_diff_symbol submodule.c: migrate diff output to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS} diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN] diff.c: migrate emit_line_checked to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO ...
| * | | | | | diff: document the new --color-moved settingStefan Beller2017-06-302-2/+49
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: add dimming to moved line detectionStefan Beller2017-06-304-13/+254
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Any lines inside a moved block of code are not interesting. Boundaries of blocks are only interesting if they are next to another block of moved code. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: color moved lines differently, plain modeStefan Beller2017-06-303-3/+59
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add the 'plain' mode for move detection of code. This omits the checking for adjacent blocks, so it is not as useful. If you have a lot of the same blocks moved in the same patch, the 'Zebra' would end up slow as it is O(n^2) (n is number of same blocks). So this may be useful there and is generally easy to add. Instead be very literal at the move detection, do not skip over short blocks here. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: color moved lines differentlyStefan Beller2017-06-303-15/+602
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a patch consists mostly of moving blocks of code around, it can be quite tedious to ensure that the blocks are moved verbatim, and not undesirably modified in the move. To that end, color blocks that are moved within the same patch differently. For example (OM, del, add, and NM are different colors): [OM] -void sensitive_stuff(void) [OM] -{ [OM] - if (!is_authorized_user()) [OM] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OM] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NM] + sensitive_stuff(spanning, [NM] + multiple, [NM] + lines); [NM] +} However adjacent blocks may be problematic. For example, in this potentially malicious patch, the swapping of blocks can be spotted: [OM] -void sensitive_stuff(void) [OM] -{ [OMA] - if (!is_authorized_user()) [OMA] - die("unauthorized"); [OM] - sensitive_stuff(spanning, [OM] - multiple, [OM] - lines); [OMA] -} void another_function() { [del] - printf("foo"); [add] + printf("bar"); } [NM] +void sensitive_stuff(void) [NM] +{ [NMA] + sensitive_stuff(spanning, [NMA] + multiple, [NMA] + lines); [NM] + if (!is_authorized_user()) [NM] + die("unauthorized"); [NMA] +} If the moved code is larger, it is easier to hide some permutation in the code, which is why some alternative coloring is needed. This patch implements the first mode: * basic alternating 'Zebra' mode This conveys all information needed to the user. Defer customization to later patches. First I implemented an alternative design, which would try to fingerprint a line by its neighbors to detect if we are in a block or at the boundary. This idea iss error prone as it inspected each line and its neighboring lines to determine if the line was (a) moved and (b) if was deep inside a hunk by having matching neighboring lines. This is unreliable as the we can construct hunks which have equal neighbors that just exceed the number of lines inspected. (Think of 'AXYZBXYZCXYZD..' with each letter as a line, that is permutated to AXYZCXYZBXYZD..'). Instead this provides a dynamic programming greedy algorithm that finds the largest moved hunk and then has several modes on highlighting bounds. A note on the options '--submodule=diff' and '--color-words/--word-diff': In the conversion to use emit_line in the prior patches both submodules as well as word diff output carefully chose to call emit_line with sign=0. All output with sign=0 is ignored for move detection purposes in this patch, such that no weird looking output will be generated for these cases. This leads to another thought: We could pass on '--color-moved' to submodules such that they color up moved lines for themselves. If we'd do so only line moves within a repository boundary are marked up. It is useful to have moved lines colored, but there are annoying corner cases, such as a single line moved, that is very common. For example in a typical patch of C code, we have closing braces that end statement blocks or functions. While it is technically true that these lines are moved as they show up elsewhere, it is harmful for the review as the reviewers attention is drawn to such a minor side annoyance. For now let's have a simple solution of hardcoding the number of moved lines to be at least 3 before coloring them. Note, that the length is applied across all blocks to find the 'lonely' blocks that pollute new code, but do not interfere with a permutated block where each permutation has less lines than 3. Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: buffer all output if asked toStefan Beller2017-06-302-2/+109
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Introduce a new option 'emitted_symbols' in the struct diff_options which controls whether all output is buffered up until all output is available. It is set internally in diff.c when necessary. We'll have a new struct 'emitted_string' in diff.c which will be used to buffer each line. The emitted_string will duplicate the memory of the line to buffer as that is easiest to reason about for now. In a future patch we may want to decrease the memory usage by not duplicating all output for buffering but rather we may want to store offsets into the file or in case of hunk descriptions such as the similarity score, we could just store the relevant number and reproduce the text later on. This approach was chosen as a first step because it is quite simple compared to the alternative with less memory footprint. emit_diff_symbol factors out the emission part and depending on the diff_options->emitted_symbols the emission will be performed directly when calling emit_diff_symbol or after the whole process is done, i.e. by buffering we have add the possibility for a second pass over the whole output before doing the actual output. In 6440d34 (2012-03-14, diff: tweak a _copy_ of diff_options with word-diff) we introduced a duplicate diff options struct for word emissions as we may have different regex settings in there. When buffering the output, we need to operate on just one buffer, so we have to copy back the emissions of the word buffer into the main buffer. Unconditionally enable output via buffer in this patch as it yields a great opportunity for testing, i.e. all the diff tests from the test suite pass without having reordering issues (i.e. only parts of the output got buffered, and we forgot to buffer other parts). The test suite passes, which gives confidence that we converted all functions to use emit_string for output. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARYStefan Beller2017-06-301-30/+41
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEPStefan Beller2017-06-301-3/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: convert word diffing to use emit_diff_symbolStefan Beller2017-06-301-33/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The word diffing is not line oriented and would need some serious effort to be transformed into a line oriented approach, so just go with a symbol DIFF_SYMBOL_WORD_DIFF that is a partial line. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: convert show_stats to use emit_diff_symbolStefan Beller2017-06-302-44/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We call print_stat_summary from builtin/apply, so we still need the version with a file pointer, so introduce print_stat_summary_0 that uses emit_string machinery and keep print_stat_summary with the same arguments around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: convert emit_binary_diff_body to use emit_diff_symbolStefan Beller2017-06-301-17/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | submodule.c: migrate diff output to use emit_diff_symbolStefan Beller2017-06-304-68/+121
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | As the submodule process is no longer attached to the same file pointer 'o->file' as the superprojects process, there is a different result in color.c::check_auto_color. That is why we need to pass coloring explicitly, such that the submodule coloring decision will be made by the child process processing the submodule. Only DIFF_SYMBOL_SUBMODULE_PIPETHROUGH contains color, the other symbols are for embedding the submodule output into the superprojects output. Remove the colors from the function signatures, as all the coloring decisions will be made either inside the child process or the final emit_diff_symbol, but not in the functions driving the submodule diff. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFFStefan Beller2017-06-301-14/+21
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILESStefan Beller2017-06-301-5/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | we could save a little bit of memory when buffering in a later mode by just passing the inner part ("%s and %s", file1, file 2), but those a just a few bytes, so instead let's reuse the implementation from DIFF_SYMBOL_HEADER and keep the whole line around. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADERStefan Beller2017-06-301-8/+20
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The header is constructed lazily including line breaks, so just emit the raw string as is. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS}Stefan Beller2017-06-301-21/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have to use fprintf instead of emit_line, because we want to emit the tab after the color. This is important for ancient versions of gnu patch AFAICT, although we probably do not want to feed colored output to the patch utility, such that it would not matter if the trailing tab is colored. Keep the corner case as-is though. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETEStefan Beller2017-06-301-2/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The context marker use the exact same output pattern, so reuse it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN]Stefan Beller2017-06-301-16/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: migrate emit_line_checked to use emit_diff_symbolStefan Beller2017-06-303-44/+80
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add a new flags field to emit_diff_symbol, that will be used by context lines for: * white space rules that are applicable (The first 12 bits) Take a note in cahe.c as well, when this ws rules are extended we have to fix the bits in the flags field. * how the rules are evaluated (actually this double encodes the sign of the line, but the code is easier to keep this way, bits 13,14,15) * if the line a blank line at EOF (bit 16) The check if new lines need to be marked up as extra lines at the end of file, is now done unconditionally. That should be ok, as 'new_blank_line_at_eof' has a quick early return. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOFStefan Beller2017-06-301-8/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFOStefan Beller2017-06-301-2/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>