summaryrefslogtreecommitdiff
path: root/git-add--interactive.perl
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'va/i18n-perl-scripts'Junio C Hamano2017-04-191-1/+1
|\ | | | | | | | | | | | | Message fix. * va/i18n-perl-scripts: git-add--interactive.perl: add missing dot in a message
| * git-add--interactive.perl: add missing dot in a messageva/i18n-perl-scriptsRalf Thielow2017-04-131-1/+1
| | | | | | | | | | | | | | | | | | One message appears twice in the translations and the only difference is a dot at the end. So add this dot to make the messages being identical. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'jk/add-i-use-pathspecs'Junio C Hamano2017-03-171-11/+2
|\ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git add -p <pathspec>" unnecessarily expanded the pathspec to a list of individual files that matches the pathspec by running "git ls-files <pathspec>", before feeding it to "git diff-index" to see which paths have changes, because historically the pathspec language supported by "diff-index" was weaker. These days they are equivalent and there is no reason to internally expand it. This helps both performance and avoids command line argument limit on some platforms. * jk/add-i-use-pathspecs: add--interactive: do not expand pathspecs with ls-files
| * | add--interactive: do not expand pathspecs with ls-filesjk/add-i-use-pathspecsJeff King2017-03-141-11/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we want to get the list of modified files, we first expand any user-provided pathspecs with "ls-files", and then feed the resulting list of paths as arguments to "diff-index" and "diff-files". If your pathspec expands into a large number of paths, you may run into one of two problems: 1. The OS may complain about the size of the argument list, and refuse to run. For example: $ (ulimit -s 128 && git add -p drivers) Can't exec "git": Argument list too long at .../git-add--interactive line 177. Died at .../git-add--interactive line 177. That's on the linux.git repository, which has about 20K files in the "drivers" directory (none of them modified in this case). The "ulimit -s" trick is necessary to show the problem on Linux even for such a gigantic set of paths. Other operating systems have much smaller limits (e.g., a real-world case was seen with only 5K files on OS X). 2. Even when it does work, it's really slow. The pathspec code is not optimized for huge numbers of paths. Here's the same case without the ulimit: $ time git add -p drivers No changes. real 0m16.559s user 0m53.140s sys 0m0.220s We can improve this by skipping "ls-files" completely, and just feeding the original pathspecs to the diff commands. This solution was discussed in 2010: http://public-inbox.org/git/20100105041438.GB12574@coredump.intra.peff.net/ but at the time the diff code's pathspecs were more primitive than those used by ls-files (e.g., they did not support globs). Making the change would have caused a user-visible regression, so we didn't. Since then, the pathspec code has been unified, and the diff commands natively understand pathspecs like '*.c'. This patch implements that solution. That skips the argument-list limits, and the result runs much faster: $ time git add -p drivers No changes. real 0m0.149s user 0m0.116s sys 0m0.080s There are two new tests. The first just exercises the globbing behavior to confirm that we are not causing a regression there. The second checks the actual argument behavior using GIT_TRACE. We _could_ do it with the "ulimit -s" trick, as above. But that would mean the test could only run where "ulimit -s" works. And tests of that sort are expensive, because we have to come up with enough files to actually bust the limit (we can't just shrink the "128" down infinitely, since it is also the in-program stack size). Finally, two caveats and possibilities for future work: a. This fixes one argument-list expansion, but there may be others. In fact, it's very likely that if you run "git add -i" and select a large number of modified files that the script would try to feed them all to a single git command. In practice this is probably fine. The real issue here is that the argument list was growing with the _total_ number of files, not the number of modified or selected files. b. If the repository contains filenames with literal wildcard characters (e.g., "foo*"), the original code expanded them via "ls-files" and then fed those wildcard names to "diff-index", which would have treated them as wildcards. This was a bug, which is now fixed (though unless you really go through some contortions with ":(literal)", it's likely that your original pathspec would match whatever the accidentally-expanded wildcard would anyway). So this takes us one step closer to working correctly with files whose names contain wildcard characters, but it's likely that others remain (e.g., if "git add -i" feeds the selected paths to "git add"). Reported-by: Wincent Colaiuta <win@wincent.com> Reported-by: Mislav Marohnić <mislav.marohnic@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | add--interactive: fix missing file prompt for patch mode with "-i"jk/add-i-patch-do-promptJeff King2017-03-021-4/+4
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When invoked as "git add -i", each menu interactive menu option prompts the user to select a list of files. This includes the "patch" option, which gets the list before starting the hunk-selection loop. As "git add -p", it behaves differently, and jumps straight to the hunk selection loop. Since 0539d5e6d (i18n: add--interactive: mark patch prompt for translation, 2016-12-14), the "add -i" case mistakenly jumps to straight to the hunk-selection loop. Prior to that commit the distinction between the two cases was managed by the $patch_mode variable. That commit used $patch_mode for something else, and moved the old meaning to the "$cmd" variable. But it forgot to update the $patch_mode check inside patch_update_cmd() which controls the file-list behavior. The simplest fix would be to change that line to check $cmd. But while we're here, let's use a less obscure name for this flag: $patch_mode_only, a boolean which tells whether we are in full-interactive mode or only in patch-mode. Reported-by: Henrik Grubbström <grubba@grubba.org> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'rt/align-add-i-help-text'Junio C Hamano2017-02-241-1/+1
|\ \ | | | | | | | | | | | | | | | | | | Doc update. * rt/align-add-i-help-text: git add -i: replace \t with blanks in the help message
| * | git add -i: replace \t with blanks in the help messagert/align-add-i-help-textRalf Thielow2017-02-221-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Within the help message of 'git add -i', the 'diff' command uses one tab character and blanks to create the space between the name and the description while the others use blanks only. So if the tab size is not at 4 characters, this description will not be in range. Replace the tab character with blanks. Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | Merge branch 'jc/retire-compaction-heuristics'Junio C Hamano2017-01-101-3/+0
|\ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git diff" and its family had two experimental heuristics to shift the contents of a hunk to make the patch easier to read. One of them turns out to be better than the other, so leave only the "--indent-heuristic" option and remove the other one. * jc/retire-compaction-heuristics: diff: retire "compaction" heuristics
| * | | diff: retire "compaction" heuristicsjc/retire-compaction-heuristicsJunio C Hamano2016-12-231-3/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When a patch inserts a block of lines, whose last lines are the same as the existing lines that appear before the inserted block, "git diff" can choose any place between these existing lines as the boundary between the pre-context and the added lines (adjusting the end of the inserted block as appropriate) to come up with variants of the same patch, and some variants are easier to read than others. We have been trying to improve the choice of this boundary, and Git 2.11 shipped with an experimental "compaction-heuristic". Since then another attempt to improve the logic further resulted in a new "indent-heuristic" logic. It is agreed that the latter gives better result overall, and the former outlived its usefulness. Retire "compaction", and keep "indent" as an experimental feature. The latter hopefully will be turned on by default in a future release, but that should be done as a separate step. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'va/i18n-perl-scripts'Junio C Hamano2016-12-271-114/+215
|\ \ \ \ | |/ / / |/| | / | | |/ | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Porcelain scripts written in Perl are getting internationalized. * va/i18n-perl-scripts: i18n: difftool: mark warnings for translation i18n: send-email: mark composing message for translation i18n: send-email: mark string with interpolation for translation i18n: send-email: mark warnings and errors for translation i18n: send-email: mark strings for translation i18n: add--interactive: mark status words for translation i18n: add--interactive: remove %patch_modes entries i18n: add--interactive: mark edit_hunk_manually message for translation i18n: add--interactive: i18n of help_patch_cmd i18n: add--interactive: mark patch prompt for translation i18n: add--interactive: mark plural strings i18n: clean.c: match string with git-add--interactive.perl i18n: add--interactive: mark strings with interpolation for translation i18n: add--interactive: mark simple here-documents for translation i18n: add--interactive: mark strings for translation Git.pm: add subroutines for commenting lines
| * | i18n: add--interactive: mark status words for translationVasco Almeida2016-12-141-6/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark words 'nothing', 'unchanged' and 'binary' used to display what has been staged or not, in "git add -i" status command. Alternatively one could mark N__('nothing') no-op in order to xgettext(1) extract the string and then trigger the translation at run time only with __($print->{FILE}), but that has the side effect of triggering retrieval of translations for the changes indicator too (e.g. +2/-1) which may or may not be a problem. To avoid that potential problem, mark only where there is certain to trigger translation only of those words but in this case we must also retrieve the translation for the eq tests, since the value assigned was of the translation, not the English source. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: remove %patch_modes entriesVasco Almeida2016-12-141-21/+0
| | | | | | | | | | | | | | | | | | | | | | | | Remove unnecessary entries from %patch_modes. After the i18n conversion, these entries are not used anymore. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: mark edit_hunk_manually message for translationVasco Almeida2016-12-141-13/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark message of edit_hunk_manually displayed in the editing file when user chooses 'e' option. The message had to be unfolded to allow translation of the $participle verb. Some messages end up being exactly the same for some use cases, but left it for easier change in the future, e.g., wanting to change wording of one particular use case. The comment character is now used according to the git configuration core.commentchar. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: i18n of help_patch_cmdVasco Almeida2016-12-141-8/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | Mark help message of help_patch_cmd for translation. The message must be unfolded to be free of variables so we can have high quality translations. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: mark patch prompt for translationVasco Almeida2016-12-141-8/+46
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark prompt message assembled in place for translation, unfolding each use case for each entry in the %patch_modes hash table. Previously, this script relied on whether $patch_mode was set to run the command patch_update_cmd() or show status and loop the main loop. Now, it uses $cmd to indicate we must run patch_update_cmd() and $patch_mode is used to tell which flavor of the %patch_modes are we on. This is introduced in order to be able to mark and unfold the message prompt knowing in which context we are. The tracking of context was done previously by point %patch_mode_flavour hash table to the correct entry of %patch_modes, focusing only on value of %patch_modes. Now, we are also interested in the key ('staged', 'stash', 'checkout_head', ...). Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: mark plural stringsVasco Almeida2016-12-141-9/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark plural strings for translation. Unfold each action case in one entire sentence. Pass new keyword for xgettext to extract. Update test to include new subroutine __n() for plural strings handling. Update documentation to include a description of the new __n() subroutine. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: mark strings with interpolation for translationVasco Almeida2016-12-141-12/+13
| | | | | | | | | | | | | | | | | | | | | | | | | | | Since at this point Git::I18N.perl lacks support for Perl i18n placeholder substitution, use of sprintf following die or error_msg is necessary for placeholder substitution take place. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: mark simple here-documents for translationVasco Almeida2016-12-141-3/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark messages in here-documents without interpolation for translation. The here-document delimiter \EOF, which is the same as 'EOF', indicates that the text is to be treated literally without interpolation of its content. Unfortunately xgettext is not able to extract here-documents delimited with \EOF but it is with delimiter enclosed in single quotes. So change \EOF to 'EOF', although in this case does not make difference what variation of here-document to use since there is nothing to interpolate. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | i18n: add--interactive: mark strings for translationVasco Almeida2016-12-141-34/+42
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Mark simple strings (without interpolation) for translation. Brackets around first parameter of ternary operator is necessary because otherwise xgettext fails to extract strings marked for translation from the rest of the file. Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | diff: improve positioning of add/delete blocks in diffsMichael Haggerty2016-09-191-1/+4
|/ / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) { } if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} +if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { $smtp_server = $_; The following diff is equivalent, but is obviously preferable from an aesthetic point of view: --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) { $initial_reply_to =~ s/(^\s+|\s+$)//g; } +if (!$smtp_server) { + $smtp_server = $repo->config('sendemail.smtpserver'); +} if (!$smtp_server) { foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) { if (-x $_) { This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account. The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs. Commit d634d61ed6 improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs. This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach: First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits. Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail: 1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`: * the number of blank lines above the proposed split * whether the line directly after the split is blank * the number of blank lines following that line * the indentation of the nearest non-blank line above the split * the indentation of the line directly below the split * the indentation of the nearest non-blank line after that line 2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position. 3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`. I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders. I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`: | repository | count | Git 2.9.0 | compaction | compaction-fixed | indent-1 | indent-2 | | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | afnetworking | 109 | 89 (81.7%) | 37 (33.9%) | 37 (33.9%) | 2 (1.8%) | 2 (1.8%) | | alamofire | 30 | 18 (60.0%) | 14 (46.7%) | 15 (50.0%) | 0 (0.0%) | 0 (0.0%) | | angular | 184 | 127 (69.0%) | 39 (21.2%) | 23 (12.5%) | 5 (2.7%) | 5 (2.7%) | | animate | 313 | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | 2 (0.6%) | | ant | 380 | 356 (93.7%) | 152 (40.0%) | 148 (38.9%) | 15 (3.9%) | 15 (3.9%) | * | bugzilla | 306 | 263 (85.9%) | 109 (35.6%) | 99 (32.4%) | 14 (4.6%) | 15 (4.9%) | * | corefx | 126 | 91 (72.2%) | 22 (17.5%) | 21 (16.7%) | 6 (4.8%) | 6 (4.8%) | | couchdb | 78 | 44 (56.4%) | 26 (33.3%) | 28 (35.9%) | 6 (7.7%) | 6 (7.7%) | * | cpython | 937 | 158 (16.9%) | 50 (5.3%) | 49 (5.2%) | 5 (0.5%) | 5 (0.5%) | * | discourse | 160 | 95 (59.4%) | 42 (26.2%) | 36 (22.5%) | 18 (11.2%) | 13 (8.1%) | | docker | 307 | 194 (63.2%) | 198 (64.5%) | 253 (82.4%) | 8 (2.6%) | 8 (2.6%) | * | electron | 163 | 132 (81.0%) | 38 (23.3%) | 39 (23.9%) | 6 (3.7%) | 6 (3.7%) | | git | 536 | 470 (87.7%) | 73 (13.6%) | 78 (14.6%) | 16 (3.0%) | 16 (3.0%) | * | gitflow | 127 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | ionic | 133 | 89 (66.9%) | 29 (21.8%) | 38 (28.6%) | 1 (0.8%) | 1 (0.8%) | | ipython | 482 | 362 (75.1%) | 167 (34.6%) | 169 (35.1%) | 11 (2.3%) | 11 (2.3%) | * | junit | 161 | 147 (91.3%) | 67 (41.6%) | 66 (41.0%) | 1 (0.6%) | 1 (0.6%) | * | lighttable | 15 | 5 (33.3%) | 0 (0.0%) | 2 (13.3%) | 0 (0.0%) | 0 (0.0%) | | magit | 88 | 75 (85.2%) | 11 (12.5%) | 9 (10.2%) | 1 (1.1%) | 0 (0.0%) | | neural-style | 28 | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | | nodejs | 781 | 649 (83.1%) | 118 (15.1%) | 111 (14.2%) | 4 (0.5%) | 5 (0.6%) | * | phpmyadmin | 491 | 481 (98.0%) | 75 (15.3%) | 48 (9.8%) | 2 (0.4%) | 2 (0.4%) | * | react-native | 168 | 130 (77.4%) | 79 (47.0%) | 81 (48.2%) | 0 (0.0%) | 0 (0.0%) | | rust | 171 | 128 (74.9%) | 30 (17.5%) | 27 (15.8%) | 16 (9.4%) | 14 (8.2%) | | spark | 186 | 149 (80.1%) | 52 (28.0%) | 52 (28.0%) | 2 (1.1%) | 2 (1.1%) | | tensorflow | 115 | 66 (57.4%) | 48 (41.7%) | 48 (41.7%) | 5 (4.3%) | 5 (4.3%) | | test-more | 19 | 15 (78.9%) | 2 (10.5%) | 2 (10.5%) | 1 (5.3%) | 1 (5.3%) | * | test-unit | 51 | 34 (66.7%) | 14 (27.5%) | 8 (15.7%) | 2 (3.9%) | 2 (3.9%) | * | xmonad | 23 | 22 (95.7%) | 2 (8.7%) | 2 (8.7%) | 1 (4.3%) | 1 (4.3%) | * | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- | | totals | 6668 | 4391 (65.9%) | 1496 (22.4%) | 1491 (22.4%) | 150 (2.2%) | 144 (2.2%) | | totals (training set) | 4552 | 3195 (70.2%) | 1053 (23.1%) | 1061 (23.3%) | 86 (1.9%) | 88 (1.9%) | | totals (test set) | 2116 | 1196 (56.5%) | 443 (20.9%) | 430 (20.3%) | 64 (3.0%) | 56 (2.6%) | In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got *wrong*. The columns are * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository. * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers. * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0. * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0. * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`. * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above. * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below. * `*` - indicates that repo was part of training set used to determine the first set of weighting factors. The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch. The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.) The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1]. This patch adds a new command-line option `--indent-heuristic`, and a new configuration setting `diff.indentHeuristic`, that activate this heuristic. This interface is only meant for testing purposes, and should be finalized before including this change in any release. [1] https://github.com/mhagger/diff-slider-tools Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | add--interactive: respect diff.compactionHeuristicjk/add-i-diff-compact-heuristicsJeff King2016-06-161-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | We use plumbing to generate the diff, so it doesn't automatically pick up UI config like compactionHeuristic. Let's forward it on, since interactive adding is porcelain. Note that we only need to handle the "true" case. There's no point in passing --no-compaction-heuristic when the variable is false, since nothing else could have turned it on. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | add--interactive: allow custom diff highlighting programsjk/add-i-highlightJeff King2016-02-281-2/+10
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The patch hunk selector of add--interactive knows how ask git for colorized diffs, and correlate them with the uncolored diffs we apply. But there's not any way for somebody who uses a diff-filter tool like contrib's diff-highlight to see their normal highlighting. This patch lets users define an arbitrary shell command to pipe the colorized diff through. The exact output shouldn't matter (since we just show the result to humans) as long as it is line-compatible with the original diff (so that hunk-splitting can split the colorized version, too). I left two minor issues with the new system that I don't think are worth fixing right now, but could be done later: 1. We only filter colorized diffs. Theoretically a user could want to filter a non-colorized diff, but I find it unlikely in practice. Users who are doing things like diff-highlighting are likely to want color, too. 2. add--interactive will re-colorize a diff which has been hand-edited, but it won't have run through the filter. Fixing this is conceptually easy (just pipe the diff through the filter), but practically hard to do without using tempfiles (it would need to feed data to and read the result from the filter without deadlocking; this raises portability questions with respect to Windows). I've punted on both issues for now, and if somebody really cares later, they can do a patch on top. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'ak/add-i-empty-candidates'Junio C Hamano2015-02-171-0/+5
|\ | | | | | | | | | | | | | | | | | | The interactive "show a list and let the user choose from it" interface "add -i" used showed and prompted to the user even when the candidate list was empty, against which the only "choice" the user could have made was to choose nothing. * ak/add-i-empty-candidates: add -i: return from list_and_choose if there is no candidate
| * add -i: return from list_and_choose if there is no candidateak/add-i-empty-candidatesAlexander Kuleshov2015-01-221-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The list_and_choose() helper is given a prompt and a list, asks the user to make selection from the list, and then returns a list of items chosen. Even when it is given an empty list as the original candidate set to choose from, it gave a prompt to the user, who can only say "I am done choosing". Return an empty result when the input is an empty list without bothering the user. The existing caller must already have a logic to say "Nothing to do" or an equivalent when the returned list is empty (i.e. the user chose to select nothing) if it is necessary, so no change to the callers is necessary. This fixes the case where "add untracked" is asked in "git add -i" and there is no untracked files in the working tree. We used to give an empty list of files to choose from with a prompt, but with this change, we no longer do. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | add--interactive: leave main loop on read errorjk/add-i-read-errorJeff King2014-12-151-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The main hunk loop for add--interactive will loop if it does not get a known input. This is a good thing if the user typed some invalid input. However, if we have an uncorrectable read error, we'll end up looping infinitely. We can fix this by noticing read errors (i.e., <STDIN> returns undef) and breaking out of the loop. One easy way to trigger this is if you have an editor that does not take over the terminal (e.g., one that spawns a window in an existing process and waits), start the editor with the hunk-edit command, and hit ^C to send SIGINT. The editor process dies due to SIGINT, but the perl add--interactive process does not (perl suspends SIGINT for the duration of our system() call). We return to the main loop, but further reads from stdin don't work. The SIGINT _also_ killed our parent git process, which orphans our process group, meaning that further reads from the terminal will always fail. We loop infinitely, getting EIO on each read. Note that there are several other spots where we read from stdin, too. However, in each of those cases, we do something sane when the read returns undef (breaking out of the loop, taking the input as "no", etc). They don't need similar treatment. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jl/nor-or-nand-and'Junio C Hamano2014-04-081-2/+2
|\ | | | | | | | | | | | | | | | | | | | | | | Eradicate mistaken use of "nor" (that is, essentially "nor" used not in "neither A nor B" ;-)) from in-code comments, command output strings, and documentations. * jl/nor-or-nand-and: code and test: fix misuses of "nor" comments: fix misuses of "nor" contrib: fix misuses of "nor" Documentation: fix misuses of "nor"
| * code and test: fix misuses of "nor"jl/nor-or-nand-andJustin Lebar2014-03-311-2/+2
| | | | | | | | | | Signed-off-by: Justin Lebar <jlebar@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | git-add--interactive: warn if module for interactive.singlekey is missingsr/add--interactive-term-readkeySimon Ruderich2014-03-031-0/+3
|/ | | | | | Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Simon Ruderich <simon@ruderich.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add-interactive: handle unborn branch in patch modeJeff King2013-10-251-9/+13
| | | | | | | | | | The list_modified function already knows how to handle an unborn branch by diffing against the empty tree. However, the diff we perform to get the actual hunks does not. Let's use the same logic for both diffs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add--interactive: fix external command invocation on Windowsjs/add-i-mingwJohannes Sixt2013-09-041-1/+1
| | | | | | | | | | | | | | Back in 21e9757e (Hack git-add--interactive to make it work with ActiveState Perl, 2007-08-01), the invocation of external commands was changed to use qx{} on Windows. The rationale was that the command interpreter on Windows is not a POSIX shell, but rather Windows's CMD. That patch was wrong to include 'msys' in the check whether to use qx{} or not: 'msys' identifies MSYS perl as shipped with Git for Windows, which does not need the special treatment; qx{} should be used only with ActiveState perl, which is identified by 'MSWin32'. Signed-off-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add -i: add extra options at the right place in "diff" command lineJunio C Hamano2013-06-231-1/+1
| | | | | | | | | | | | | | | | Appending "--diff-algorithm=histogram" at the end of canned command line for various modes of "diff" is correct for most of them but not for "stash" that has a non-option already wired in, like so: 'stash' => { DIFF => 'diff-index -p HEAD', Appending an extra option after non-option may happen to work due to overly lax command line parser, but that is not something we should rely on. Instead, splice in the extra argument immediately after the command name (i.e. 'diff-index', 'diff-files', etc.). Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add--interactive: respect diff.algorithmJohn Keeping2013-06-121-0/+5
| | | | | | | | | | | | | | | | | | | | | | When staging hunks interactively it is sometimes useful to use an alternative diff algorithm which splits the changes into hunks in a more logical manner. This is not possible because the plumbing commands called by add--interactive ignore the "diff.algorithm" configuration option (as they should). Since add--interactive is a porcelain command it should respect this configuration variable. To do this, make it read diff.algorithm and pass its value to the underlying diff-index and diff-files invocations. At this point, do not add options to "git add", "git reset" or "git checkout" (all of which can call git-add--interactive). If a user wants to override the value on the command line they can use: git -c diff.algorithm=$ALGO ... Signed-off-by: John Keeping <john@keeping.me.uk> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Correct common spelling mistakes in comments and testsStefano Lattarini2013-04-121-1/+1
| | | | | | | | | Most of these were found using Lucas De Marchi's codespell tool. Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* git-add--interactive.perl: Remove two unused variablesThomas Badie2012-06-251-2/+0
| | | | | | | | | The patch 8f0bef6 refactored this script and made the variable $fh unneeded in subs diff_applies and patch_update_file, but forgot to remove them. Signed-off-by: Thomas Badie <badie@lrde.epita.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add--interactive: ignore unmerged entries in patch modeJeff King2012-04-051-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When "add -p" sees an unmerged entry, it shows the combined diff and then immediately skips the hunk. This can be confusing in a variety of ways, depending on whether there are other changes to stage (in which case you get the superfluous combined diff output in between other hunks) or not (in which case you get the combined diff and the program exits immediately, rather than seeing "No changes"). The current behavior was not planned, and is just what the implementation happens to do. Instead, let's explicitly remove unmerged entries from our list of modified files, and print a warning that we are ignoring them. We can cheaply find which entries are unmerged by adding "--raw" output to the "diff-files --numstat" we already run. There is one non-obvious thing we must change when parsing this combined output. Before this patch, when we saw a numstat line for a file that did not have index changes, we would create a new record with 'unchanged' in the 'INDEX' field. Because "--raw" comes before "--numstat", we must move this special-case down to the raw-line case (and it is sufficient to move it rather than handle it in both places, since any file which has a --numstat will also have a --raw entry). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add -i: ignore terminal escape sequencesThomas Rast2011-05-171-0/+19
| | | | | | | | | | | | | | | | | | | | | | On the author's terminal, the up-arrow input sequence is ^[[A, and thus fat-fingering an up-arrow into 'git checkout -p' is quite dangerous: git-add--interactive.perl will ignore the ^[ and [ characters and happily treat A as "discard everything". As a band-aid fix, use Term::Cap to get all terminal capabilities. Then use the heuristic that any capability value that starts with ^[ (i.e., \e in perl) must be a key input sequence. Finally, given an input that starts with ^[, read more characters until we have read a full escape sequence, then return that to the caller. We use a timeout of 0.5 seconds on the subsequent reads to avoid getting stuck if the user actually input a lone ^[. Since none of the currently recognized keys start with ^[, the net result is that the sequence as a whole will be ignored and the help displayed. Signed-off-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* "add -p": work-around an old laziness that does not coalesce hunksJunio C Hamano2011-04-291-1/+1
| | | | | | | | | | | | | | | | | Since 0beee4c (git-add--interactive: remove hunk coalescing, 2008-07-02), "git add--interactive" behaves lazily and passes overlapping hunks to the underlying "git apply" without coalescing. This was partially corrected by 7a26e65 (its partial revert, 2009-05-16), but overlapping hunks are still passed when the patch is edited. Teach "git apply" the --allow-overlap option that disables a safety feature that avoids misapplication of patches by not applying patches to overlapping hunks, and pass this option form "add -p" codepath. Do not even advertise the option, as this is merely a workaround, and the correct fix is to make "add -p" correctly coalesce adjacent patch hunks. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add--interactive.perl: factor out repeated --recount optionJunio C Hamano2011-04-291-8/+8
| | | | | | | | | | | | | Depending on the direction and the target of patch application, we would need to pass --cached and --reverse to underlying "git apply". Also we only pass --check when we are not applying but just checking. But we always pass --recount since 8cbd431 (git-add--interactive: replace hunk recounting with apply --recount, 2008-07-02). Instead of repeating the same --recount over and over again, move it to a single place that actually runs the command, namely, "run_git_apply" subroutine. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add -p: 'q' should really quitJunio C Hamano2011-04-291-5/+4
| | | | | | | | | | | | | The "quit" command was added in 9a7a1e0 (git add -p: new "quit" command at the prompt, 2009-04-10) to allow the user to say that hunks other than what have already been chosen are undesirable, and exit the interactive loop immediately. It forgot that there may be an undecided hunk before the current one. In such a case, the interactive loop still goes back to the beginning. Clear all the USE bit for undecided hunks and exit the loop. Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jl/add-p-reverse-message'Junio C Hamano2010-11-291-2/+11
|\ | | | | | | | | * jl/add-p-reverse-message: Correct help blurb in checkout -p and friends
| * Correct help blurb in checkout -p and friendsJonathan "Duke" Leto2010-10-281-2/+11
| | | | | | | | | | | | | | | | | | | | When git checkout -p from the index or HEAD is run in edit mode, the help message about removing '-' and '+' lines was backwards. Because it is reverse applying the patch, the meanings of '-' and '+' are reversed. Signed-off-by: Jonathan "Duke" Leto <jonathan@leto.net> Acked-by: Thomas Rast <trast@student.ethz.ch> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | perl: use "use warnings" instead of -wÆvar Arnfjörð Bjarmason2010-09-271-1/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the Perl scripts to turn on lexical warnings instead of setting the global $^W variable via the -w switch. The -w sets warnings for all code that interpreter runs, while "use warnings" is lexically scoped. The former is probably not what the authors wanted. As an auxiliary benefit it's now possible to build Git with: PERL_PATH='/usr/bin/env perl' Which would previously result in failures, since "#!/usr/bin/env perl -w" doesn't work as a shebang. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | perl: bump the required Perl version to 5.8 from 5.6.[21]Ævar Arnfjörð Bjarmason2010-09-271-0/+1
|/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Formalize our dependency on perl 5.8, bumped from 5.6.[12]. We already used the three-arg form of open() which was introduced in 5.6.1, but t/t9700/test.pl explicitly depended on 5.6.2. However git-add--interactive.pl has been failing on the 5.6 line since it was introduced in v1.5.0-rc0~12^2~2 back in 2006 due to this open syntax: sub run_cmd_pipe { my $fh = undef; open($fh, '-|', @_) or die; return <$fh>; } Which when executed dies on "Can't use an undefined value as filehandle reference". Several of our tests also fail on 5.6 (even more when compiled with NO_PERL_MAKEMAKER=1): t2016-checkout-patch.sh t3904-stash-patch.sh t3701-add-interactive.sh t7105-reset-patch.sh t7501-commit.sh t9700-perl-git.sh Our code is bitrotting on 5.6 with no-one interested in fixing it, and pinning us to such an ancient release of Perl is keeping us from using useful features introduced in the 5.8 release. The 5.6 series is now over 10 years old, and the 5.6.2 maintenance release almost 7. 5.8 on the other hand is more than 8 years old. All the modern Unix-like operating systems have now upgraded to it or a later version, and 5.8 packages are available for old IRIX, AIX Solaris and Tru64 systems. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Acked-by: Tor Arntsen <tor@spacetec.no> Acked-by: Randal L. Schwartz <merlyn@stonehenge.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* add-interactive: Clarify “remaining hunks in the file”Jonathan Nieder2010-06-131-3/+3
| | | | | | | | | | | | | | | | The "a" and "d" commands to ‘add --patch’ (accept/reject rest of file) interact with "j", "g", and "/" (skip some hunks) in a perhaps confusing way: after accepting or rejecting all _later_ hunks in the file, they return to the earlier, skipped hunks and prompt the user about them again. This behavior can be very useful in practice. One can still accept or reject _all_ undecided hunks in a file by using the "g" command to move to hunk #1 first. Reported-by: Frédéric Brière <fbriere@fbriere.net> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'jk/maint-add--interactive-delete'Junio C Hamano2010-03-071-1/+23
|\ | | | | | | | | * jk/maint-add--interactive-delete: add-interactive: fix bogus diff header line ordering
| * add-interactive: fix bogus diff header line orderingJeff King2010-02-221-1/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we look at a patch for adding hunks interactively, we first split it into a header and a list of hunks. Some of the header lines, such as mode changes and deletion, however, become their own selectable hunks. Later when we reassemble the patch, we simply concatenate the header and the selected hunks. This leads to patches like this: diff --git a/file b/file index d95f3ad..0000000 --- a/file +++ /dev/null deleted file mode 100644 @@ -1 +0,0 @@ -content Notice how the deletion comes _after_ the ---/+++ lines, when it should come before. In many cases, we can get away with this as git-apply accepts the slightly bogus input. However, in the specific case of a deletion line that is being applied via "apply -R", this malformed patch triggers an assert in git-apply. This comes up when discarding a deletion via "git checkout -p". Rather than try to make git-apply accept our odd input, let's just reassemble the patch in the correct order. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'maint'Junio C Hamano2009-12-081-1/+5
|\ \ | |/ | | | | | | | | * maint: add-interactive: fix deletion of non-empty files pull: clarify advice for the unconfigured error case
| * Merge branch 'jk/maint-add-p-delete-fix' into maintJunio C Hamano2009-12-081-1/+5
| |\ | | | | | | | | | | | | * jk/maint-add-p-delete-fix: add-interactive: fix deletion of non-empty files
| | * add-interactive: fix deletion of non-empty filesJeff King2009-12-071-1/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 24ab81a fixed the deletion of empty files, but broke deletion of non-empty files. The approach it took was to factor out the "deleted" line from the patch header into its own hunk, the same way we do for mode changes. However, unlike mode changes, we only showed the special "delete this file" hunk if there were no other hunks. Otherwise, the user would annoyingly be presented with _two_ hunks: one for deleting the file and one for deleting the content. This meant that in the non-empty case, we forgot about the deleted line entirely, and we submitted a bogus patch to git-apply (with "/dev/null" as the destination file, but not marked as a deletion). Instead, this patch combines the file deletion hunk and the content deletion hunk (if there is one) into a single deletion hunk which is either staged or not. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | Merge branch 'jk/maint-add-p-empty' into maintJunio C Hamano2009-11-161-5/+13
| |\ \ | | |/ | | | | | | | | | * jk/maint-add-p-empty: add-interactive: handle deletion of empty files