Commit message    Author    Age    Files    Lines
* merge: don't leak the index during reloads [ethomson/issue-4203]    Edward Thomson    2018-10-20    1    -3/+4
|
* merge: add error handling for index reload    Etiene Dalcol    2017-11-11    1    -3/+4
|     Clean up if git_repository_index or git_index_read fails.
* tests: add test case for index reloads on merge    Etiene Dalcol    2017-11-11    1    -0/+29
|     Adds a test case for issue #4203, where diverging indexes in memory and on disk cause git merge to abort with GIT_ECONFLICT.
* merge: reload index before git_merge    Greg Collinge    2017-11-11    1    -0/+3
|     If the index in memory is different from the index on disk, merge would previously abort with GIT_ECONFLICT. Reload the index before merging to fix this. Fixes #4203.
* Merge pull request #4403 from hkleynhans/select_bundled_zlib    Edward Thomson    2017-11-11    2    -12/+17
|\
| |     cmake: Allow user to select bundled zlib
| * cmake: Allow user to select bundled zlib    Henry Kleynhans    2017-11-11    2    -12/+17
|/
|     Under some circumstances the installed / system version of zlib may not be desirable due to being too old or buggy. This patch adds the option `USE_BUNDLED_ZLIB` that will cause the bundled version of zlib to be used.
|
|     We may also want to add similar functionality to allow the user to select other bundled 3rd-party dependencies instead of using the system versions.
|
|     /cc @pks-t @ethomson
* Merge pull request #4308 from pks-t/pks/header-state-machine    Edward Thomson    2017-11-11    2    -46/+103
|\
| |     patch_parse: implement state machine for parsing patch headers
| * patch_parse: fix parsing patches only containing exact renames    Patrick Steinhardt    2017-09-01    2    -0/+22
| |     Patches which contain exact renames only will not contain an actual diff body, but only a list of files that were renamed. Thus, the patch header is immediately followed by the terminating sequence "-- ". We currently do not recognize this character sequence as a possible terminating sequence. Add it and create a test to catch the failure.
| * patch_parse: implement state machine for parsing patch headers    Patrick Steinhardt    2017-08-25    1    -46/+81
| |     Our code parsing Git patch headers is rather lax in parsing headers of a Git-style patch. Most notably, we do not care for the exact order in which header lines appear and as such, we may parse patch files which are not really valid after all. Furthermore, the state transitions inside of the parser are not as obvious as they could be, making it harder than required to follow its logic.
| |
| |     To improve upon this situation, this patch introduces a real state machine to parse the patches. Instead of simply parsing each line without caring for previous state and the exact ordering, we define a set of states with their allowed transitions. This makes the patch parser more strict in only allowing valid successions of header lines. As the transition table is defined inside of a single structure with the expected line, required state as well as the state that we end up in, all state transitions are immediately obvious from just having a look at this structure. This improves both maintainability and eases reasoning about the patch parser.
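The table-driven approach described in the commit message above can be illustrated with a minimal sketch. This is not libgit2's actual parser: the state names, the `header_transition` struct, the `header_step` function and the set of recognized header lines are all hypothetical and chosen only to show how a single table of (expected line, required state, resulting state) rows rejects out-of-order headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical parser states; the names are illustrative only. */
typedef enum {
	STATE_START,
	STATE_DIFF,     /* saw "diff --git" */
	STATE_INDEX,    /* saw "index" */
	STATE_RENAME,   /* saw "rename from", expecting "rename to" */
	STATE_RENAMED,  /* saw "rename to" */
	STATE_PATH,     /* saw "--- ", expecting "+++ " */
	STATE_END
} header_state;

/* One row per allowed transition: expected line prefix,
 * required state, and the state we end up in. */
typedef struct {
	const char *prefix;
	header_state required;
	header_state next;
} header_transition;

/* Assumed subset of header lines, not the real libgit2 table. */
static const header_transition transitions[] = {
	{ "diff --git ",  STATE_START,   STATE_DIFF },
	{ "index ",       STATE_DIFF,    STATE_INDEX },
	{ "rename from ", STATE_DIFF,    STATE_RENAME },
	{ "rename to ",   STATE_RENAME,  STATE_RENAMED },
	{ "--- ",         STATE_DIFF,    STATE_PATH },
	{ "--- ",         STATE_INDEX,   STATE_PATH },
	{ "+++ ",         STATE_PATH,    STATE_END },
	/* A patch with only exact renames ends directly with "-- ". */
	{ "-- ",          STATE_RENAMED, STATE_END },
};

/* Returns 0 and advances *state if `line` is allowed in the current
 * state; returns -1 and leaves *state untouched otherwise. */
static int header_step(header_state *state, const char *line)
{
	size_t i;

	assert(state != NULL && line != NULL);

	for (i = 0; i < sizeof(transitions) / sizeof(transitions[0]); i++) {
		const header_transition *t = &transitions[i];

		if (t->required == *state &&
		    strncmp(line, t->prefix, strlen(t->prefix)) == 0) {
			*state = t->next;
			return 0;
		}
	}

	return -1;
}
```

Feeding the header lines of a patch to `header_step` one at a time either ends in `STATE_END` or fails at the first line that is not allowed in the current state, which is the stricter behaviour the commit message argues for.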
* | Merge pull request #4401 from ktdreyer/describe-h-spelling    Edward Thomson    2017-11-10    1    -1/+1
|\ \
| | |     describe.h: fix spelling in comments
| * | describe.h: fix spelling in comments    Ken Dreyer    2017-11-10    1    -1/+1
|/ /
| |     optios -> options
* | Merge pull request #4283 from tiennou/generic-tls    Patrick Steinhardt    2017-11-09    23    -109/+207
|\ \
| | |     CMake: make HTTPS support more generic
| * | cmake: move Darwin-specific block around    Etienne Samson    2017-10-23    2    -15/+12
| | |     This allows us to only link against CoreFoundation when using the SecureTransport backend
| * | cmake: Add USE_HTTPS as a CMake option    Etienne Samson    2017-10-23    3    -30/+57
| | |     It defaults to ON, i.e. "pick whatever default is appropriate for the platform".
| | |     It accepts one of SecureTransport, OpenSSL, WinHTTP, or OFF.
| | |     It errors if the backend library couldn't be found.
| * | cmake: braces are not needed here    Etienne Samson    2017-10-23    1    -2/+2
| | |
| * | cmake: use FeatureSummary to display which features we end up using    Etienne Samson    2017-10-23    2    -0/+24
| | |
| * | cmake: make our macOS helpers more CMake-y    Etienne Samson    2017-10-23    3    -28/+58
| | |
| * | cmake: fix indentation before enhancing    Etienne Samson    2017-10-23    2    -12/+12
| | |
| * | https: correct some error messages    Etienne Samson    2017-10-23    1    -2/+2
| | |
| * | clar: exit immediately on initialization failure    Etienne Samson    2017-10-23    1    -1/+6
| | |
| * | https: Prevent OpenSSL from namespace-leaking    Etienne Samson    2017-10-23    4    -10/+23
| | |
| * | stream: Gather streams to src/streams    Etienne Samson    2017-10-23    16    -29/+32
| | |
| * | cmake: simplify some HTTPS tests    Etienne Samson    2017-10-23    2    -3/+2
| | |
* | | Merge pull request #4394 from libgit2/cmn/macos-ramdisk    Edward Thomson    2017-11-06    1    -0/+9
|\ \ \
| | | |     travis: put clar's sandbox in a ramdisk on macOS
| * | | travis: let's try a 5GB ramdisk [cmn/macos-ramdisk]    Carlos Martín Nieto    2017-10-31    1    -2/+2
| | | |
| * | | travis: put clar's sandbox in a ramdisk on macOS    Carlos Martín Nieto    2017-10-31    1    -0/+9
| | | |     The macOS tests are by far the slowest right now. This attempts to remedy the situation somewhat by asking clar to put its test data on a ramdisk.
* | | | Merge pull request #4397 from pks-t/pks/appveyor-examples    Edward Thomson    2017-11-06    3    -4/+41
|\ \ \ \
| | | | |     appveyor: build examples
| * | | | examples: network: fix Win32 linking errors due to getline    Patrick Steinhardt    2017-11-06    1    -2/+39
| | | | |     The getline(3) function call is not part of ISO C and, most importantly, it is not implemented on Microsoft Windows platforms. As our networking example code makes use of getline, this breaks builds on MSVC and MinGW. As this code wasn't built prior to the previous commit, this was never noticed.
| | | | |
| | | | |     Fix the error by instead implementing a `readline` function, which simply reads the password from stdin until it reads a newline character.
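A portable replacement for getline(3) along the lines described above can be sketched using only ISO C. This is an illustration rather than the code from the commit: the name `read_line`, its `FILE *` parameter and its return convention are assumptions made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: reads characters from `in` until a newline or
 * end of input. Returns a heap-allocated string without the newline,
 * or NULL on allocation failure or if nothing could be read. The
 * caller frees the result. */
static char *read_line(FILE *in)
{
	size_t len = 0, cap = 64;
	char *buf, *tmp;
	int c;

	assert(in != NULL);

	if ((buf = malloc(cap)) == NULL)
		return NULL;

	while ((c = fgetc(in)) != EOF && c != '\n') {
		/* Always keep one byte free for the terminating NUL. */
		if (len + 1 == cap) {
			cap *= 2;
			if ((tmp = realloc(buf, cap)) == NULL) {
				free(buf);
				return NULL;
			}
			buf = tmp;
		}
		buf[len++] = (char)c;
	}

	/* End of input (or a read error) before any character was read. */
	if (c == EOF && len == 0) {
		free(buf);
		return NULL;
	}

	buf[len] = '\0';
	return buf;
}
```

Calling `read_line(stdin)` matches the use case in the commit message, and taking a `FILE *` keeps the helper easy to exercise against a temporary file. Only `fgetc`, `malloc` and `realloc` are used, all of which are available on MSVC and MinGW.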
| * | | | appveyor: build examples    Patrick Steinhardt    2017-11-06    2    -2/+2
|/ / / /
| | | |     By default, CMake will not build our examples directory. As we do not instruct either the MinGW or MSVC builds on AppVeyor to enable building these examples, we cannot verify that those examples at least build on Windows systems.
| | | |
| | | |     Fix that by passing `-DBUILD_EXAMPLES=ON` to AppVeyor's CMake invocation.
* | | | Merge pull request #4386 from novalis/gitignore-ignore-space    Carlos Martín Nieto    2017-11-04    2    -0/+20
|\ \ \ \
| | | | |     ignore spaces in .gitignore files
| * | | | Ignore trailing whitespace in .gitignore files (as git itself does)    David Turner    2017-10-29    2    -0/+20
| | | | |
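The trailing-whitespace rule mentioned in the commit above can be sketched as follows. This follows the behaviour documented for gitignore patterns (trailing spaces are ignored unless quoted with a backslash) and is not the code from the pull request; the function name `trim_trailing_spaces` is hypothetical, and whether libgit2 also strips other whitespace characters is not stated in this log.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: strips unescaped trailing spaces from a single
 * ignore pattern in place. A space directly preceded by a backslash is
 * kept. Simplification: an escaped backslash followed by a space
 * ("\\\\ " in C source) is not treated specially here. */
static void trim_trailing_spaces(char *pattern)
{
	size_t len;

	assert(pattern != NULL);

	len = strlen(pattern);
	while (len > 0 && pattern[len - 1] == ' ') {
		/* "\ " is an escaped space and belongs to the pattern. */
		if (len > 1 && pattern[len - 2] == '\\')
			break;
		len--;
	}
	pattern[len] = '\0';
}
```

With this sketch, a pattern written as `build/` followed by stray spaces matches the same paths as `build/`, while a pattern ending in a backslash-escaped space keeps that one space.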
* | | | | CHANGELOG: add note about supporting conditional includes    Carlos Martín Nieto    2017-11-04    1    -0/+2
| | | | |
* | | | | Merge pull request #4332 from pks-t/pks/conditional-includes    Carlos Martín Nieto    2017-11-04    17    -113/+359
|\ \ \ \ \
| |_|/ / /     Conditional includes
|/| | | |
| * | | | Merge remote-tracking branch 'upstream/master' into pks/conditional-includes    Carlos Martín Nieto    2017-11-04    17    -187/+649
| |\ \ \ \ \
| |/ / / /
|/| | | |
* | | | | Merge pull request #4393 from libgit2/ethomson/pgpkey    Carlos Martín Nieto    2017-10-31    1    -1/+1
|\ \ \ \ \
| | | | | |     travis: grab pgp key from www.edwardthomson.com
| * | | | | travis: grab pgp key from www.edwardthomson.com    Edward Thomson    2017-10-30    1    -1/+1
| | | | | |     Getting the key from the MIT keyserver is surprisingly unreliable. Try getting it from my website instead...
* | | | | | Merge pull request #4392 from libgit2/cmn/config-write-preserve-case    Carlos Martín Nieto    2017-10-31    3    -9/+47
|\ \ \ \ \ \
| |/ / / / /     Preserve the input casing when writing config files
|/| | | | |
| * | | | | config: check for OOM when writing [cmn/config-write-preserve-case]    Carlos Martín Nieto    2017-10-30    1    -0/+2
| | | | | |
| * | | | | CHANGELOG: add note about config writing changes    Carlos Martín Nieto    2017-10-30    1    -0/+4
| | | | | |
| * | | | | config: preserve the original case when writing out new sections and vars    Carlos Martín Nieto    2017-10-30    1    -9/+18
| | | | | |     For sections we will still use the existing one even if the case disagrees, but the variable always gets written with the case given by the caller.
| * | | | | config: add failing test for preserving case when writing keys    Carlos Martín Nieto    2017-10-30    1    -0/+23
|/ / / / /
| | | | |     While most parts of a configuration key are case-insensitive, we should still be case-preserving and write down whatever string the caller provided.
* | | | | Merge pull request #4373 from cjhoward92/examples/log-show-log-size    Carlos Martín Nieto    2017-10-29    2    -7/+10
|\ \ \ \ \
| | | | | |     example-log: add support for --log-size
| * | | | | examples: log: pass options pointer to print_commit    Carson Howard    2017-10-13    1    -7/+7
| | | | | |     Cleaned up the PR to address styling issues.
| * | | | | PROJECTS: remove example for --log-size    Carson Howard    2017-10-11    1    -4/+0
| | | | | |
| * | | | | example-log: add support for --log-size    Carson Howard    2017-10-11    1    -4/+11
| | | | | |
* | | | | | Merge pull request #3944 from mhagger/diff-indent-heuristic    Carlos Martín Nieto    2017-10-29    2    -78/+523
|\ \ \ \ \ \
| | | | | | |     Implement a diff indent heuristic
| * \ \ \ \ \ Merge remote-tracking branch 'upstream/master' into diff-indent-heuristic    Carlos Martín Nieto    2017-10-29    552    -4452/+15648
| |\ \ \ \ \ \ \
| * | | | | | | Introduce a new `XDL_INLINE` macro and use it instead of `inline`    Michael Haggerty    2017-10-14    1    -3/+8
| | | | | | | |     `inline` is not portable enough, and the `xdiff` code doesn't import the `GIT_INLINE` macro. So introduce a new `XDL_INLINE` macro (with the same definition as `GIT_INLINE`). Use the new macro to inline two functions in `xdiffi.c`.
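A portability macro of the kind described above typically selects a compiler-specific spelling of `inline`. The sketch below is an assumption about the general shape of such a macro and is not the definition of `GIT_INLINE` or `XDL_INLINE`, which this log does not show; the names `SKETCH_INLINE` and `sketch_min` are hypothetical.

```c
/* Hypothetical portable inline macro: picks a spelling that the
 * compiler is known to accept and falls back to a plain static
 * function when nothing better is available. */
#if defined(_MSC_VER)
# define SKETCH_INLINE(type) static __inline type
#elif defined(__GNUC__)
# define SKETCH_INLINE(type) static __inline__ type
#else
# define SKETCH_INLINE(type) static type
#endif

/* The macro wraps the return type, so a declaration reads like a
 * normal function definition. */
SKETCH_INLINE(int) sketch_min(int a, int b)
{
	return a < b ? a : b;
}
```

Wrapping the return type in the macro keeps call sites unchanged while letting a single header decide how, or whether, the function is inlined on a given compiler.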
| * | | | | | | xdiff: rename "struct group" to "struct xdlgroup"    Jeff King    2016-10-03    1    -7/+7
| | | | | | | |     Commit a49895b593 (xdl_change_compact(): introduce the concept of a change group, 2016-08-22) added a "struct group" type to xdiff/xdiffi.c. But the POSIX system header "grp.h" already defines "struct group" (it is part of the getgrnam interface). Let's resolve by giving the xdiff variant a scoped name, which is closer to other xdiff types anyway (e.g., xdlfile_t, though note that xdiff is fond of typedefs when Git usually is not).
| * | | | | | | diff: improve positioning of add/delete blocks in diffs    Michael Haggerty    2016-09-29    2    -0/+327
    Some groups of added/deleted lines in diffs can be slid up or down, because lines at the edges of the group are not unique. Picking good shifts for such groups is not a matter of correctness but definitely has a big effect on aesthetics. For example, consider the following two diffs. The first is what standard Git emits:

        --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
        +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
        @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
         }
         
         if (!$smtp_server) {
        +       $smtp_server = $repo->config('sendemail.smtpserver');
        +}
        +if (!$smtp_server) {
                foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                        if (-x $_) {
                                $smtp_server = $_;

    The following diff is equivalent, but is obviously preferable from an aesthetic point of view:

        --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
        +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
        @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
                $initial_reply_to =~ s/(^\s+|\s+$)//g;
         }
         
        +if (!$smtp_server) {
        +       $smtp_server = $repo->config('sendemail.smtpserver');
        +}
         if (!$smtp_server) {
                foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                        if (-x $_) {

    This patch teaches Git to pick better positions for such "diff sliders" using heuristics that take the positions of nearby blank lines and the indentation of nearby lines into account.

    The existing Git code basically always shifts such "sliders" as far down in the file as possible. The only exception is when the slider can be aligned with a group of changed lines in the other file, in which case Git favors depicting the change as one add+delete block rather than one add and a slightly offset delete block. This naive algorithm often yields ugly diffs.

    Commit d634d61ed6 improved the situation somewhat by preferring to position add/delete groups to make their last line a blank line, when that is possible. This heuristic does more good than harm, but (1) it can only help if there are blank lines in the right places, and (2) always picks the last blank line, even if there are others that might be better. The end result is that it makes perhaps 1/3 as many errors as the default Git algorithm, but that still leaves a lot of ugly diffs.

    This commit implements a new and much better heuristic for picking optimal "slider" positions using the following approach:

    First observe that each hypothetical positioning of a diff slider introduces two splits: one between the context lines preceding the group and the first added/deleted line, and the other between the last added/deleted line and the first line of context following it. It tries to find the positioning that creates the least bad splits.

    Splits are evaluated based only on the presence and locations of nearby blank lines, and the indentation of lines near the split. Basically, it prefers to introduce splits adjacent to blank lines, between lines that are indented less, and between lines with the same level of indentation. In more detail:

    1. It measures the following characteristics of a proposed splitting position in a `struct split_measurement`:

       * the number of blank lines above the proposed split
       * whether the line directly after the split is blank
       * the number of blank lines following that line
       * the indentation of the nearest non-blank line above the split
       * the indentation of the line directly below the split
       * the indentation of the nearest non-blank line after that line

    2. It combines the measured attributes using a bunch of empirically-optimized weighting factors to derive a `struct split_score` that measures the "badness" of splitting the text at that position.

    3. It combines the `split_score` for the top and the bottom of the slider at each of its possible positions, and selects the position that has the best `split_score`.

    I determined the initial set of weighting factors by collecting a corpus of Git histories from 29 open-source software projects in various programming languages. I generated many diffs from this corpus, and determined the best positioning "by eye" for about 6600 diff sliders.
    I used about half of the repositories in the corpus (corresponding to about 2/3 of the sliders) as a training set, and optimized the weights against this corpus using a crude automated search of the parameter space to get the best agreement with the manually-determined values. Then I tested the resulting heuristic against the full corpus. The results are summarized in the following table, in column `indent-1`:

    | repository            | count | Git 2.9.0      | compaction     | compaction-fixed | indent-1       | indent-2       |
    | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
    | afnetworking          | 109   | 89 (81.7%)     | 37 (33.9%)     | 37 (33.9%)       | 2 (1.8%)       | 2 (1.8%)       |
    | alamofire             | 30    | 18 (60.0%)     | 14 (46.7%)     | 15 (50.0%)       | 0 (0.0%)       | 0 (0.0%)       |
    | angular               | 184   | 127 (69.0%)    | 39 (21.2%)     | 23 (12.5%)       | 5 (2.7%)       | 5 (2.7%)       |
    | animate               | 313   | 2 (0.6%)       | 2 (0.6%)       | 2 (0.6%)         | 2 (0.6%)       | 2 (0.6%)       |
    | ant                   | 380   | 356 (93.7%)    | 152 (40.0%)    | 148 (38.9%)      | 15 (3.9%)      | 15 (3.9%)      | *
    | bugzilla              | 306   | 263 (85.9%)    | 109 (35.6%)    | 99 (32.4%)       | 14 (4.6%)      | 15 (4.9%)      | *
    | corefx                | 126   | 91 (72.2%)     | 22 (17.5%)     | 21 (16.7%)       | 6 (4.8%)       | 6 (4.8%)       |
    | couchdb               | 78    | 44 (56.4%)     | 26 (33.3%)     | 28 (35.9%)       | 6 (7.7%)       | 6 (7.7%)       | *
    | cpython               | 937   | 158 (16.9%)    | 50 (5.3%)      | 49 (5.2%)        | 5 (0.5%)       | 5 (0.5%)       | *
    | discourse             | 160   | 95 (59.4%)     | 42 (26.2%)     | 36 (22.5%)       | 18 (11.2%)     | 13 (8.1%)      |
    | docker                | 307   | 194 (63.2%)    | 198 (64.5%)    | 253 (82.4%)      | 8 (2.6%)       | 8 (2.6%)       | *
    | electron              | 163   | 132 (81.0%)    | 38 (23.3%)     | 39 (23.9%)       | 6 (3.7%)       | 6 (3.7%)       |
    | git                   | 536   | 470 (87.7%)    | 73 (13.6%)     | 78 (14.6%)       | 16 (3.0%)      | 16 (3.0%)      | *
    | gitflow               | 127   | 0 (0.0%)       | 0 (0.0%)       | 0 (0.0%)         | 0 (0.0%)       | 0 (0.0%)       |
    | ionic                 | 133   | 89 (66.9%)     | 29 (21.8%)     | 38 (28.6%)       | 1 (0.8%)       | 1 (0.8%)       |
    | ipython               | 482   | 362 (75.1%)    | 167 (34.6%)    | 169 (35.1%)      | 11 (2.3%)      | 11 (2.3%)      | *
    | junit                 | 161   | 147 (91.3%)    | 67 (41.6%)     | 66 (41.0%)       | 1 (0.6%)       | 1 (0.6%)       | *
    | lighttable            | 15    | 5 (33.3%)      | 0 (0.0%)       | 2 (13.3%)        | 0 (0.0%)       | 0 (0.0%)       |
    | magit                 | 88    | 75 (85.2%)     | 11 (12.5%)     | 9 (10.2%)        | 1 (1.1%)       | 0 (0.0%)       |
    | neural-style          | 28    | 0 (0.0%)       | 0 (0.0%)       | 0 (0.0%)         | 0 (0.0%)       | 0 (0.0%)       |
    | nodejs                | 781   | 649 (83.1%)    | 118 (15.1%)    | 111 (14.2%)      | 4 (0.5%)       | 5 (0.6%)       | *
    | phpmyadmin            | 491   | 481 (98.0%)    | 75 (15.3%)     | 48 (9.8%)        | 2 (0.4%)       | 2 (0.4%)       | *
    | react-native          | 168   | 130 (77.4%)    | 79 (47.0%)     | 81 (48.2%)       | 0 (0.0%)       | 0 (0.0%)       |
    | rust                  | 171   | 128 (74.9%)    | 30 (17.5%)     | 27 (15.8%)       | 16 (9.4%)      | 14 (8.2%)      |
    | spark                 | 186   | 149 (80.1%)    | 52 (28.0%)     | 52 (28.0%)       | 2 (1.1%)       | 2 (1.1%)       |
    | tensorflow            | 115   | 66 (57.4%)     | 48 (41.7%)     | 48 (41.7%)       | 5 (4.3%)       | 5 (4.3%)       |
    | test-more             | 19    | 15 (78.9%)     | 2 (10.5%)      | 2 (10.5%)        | 1 (5.3%)       | 1 (5.3%)       | *
    | test-unit             | 51    | 34 (66.7%)     | 14 (27.5%)     | 8 (15.7%)        | 2 (3.9%)       | 2 (3.9%)       | *
    | xmonad                | 23    | 22 (95.7%)     | 2 (8.7%)       | 2 (8.7%)         | 1 (4.3%)       | 1 (4.3%)       | *
    | --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
    | totals                | 6668  | 4391 (65.9%)   | 1496 (22.4%)   | 1491 (22.4%)     | 150 (2.2%)     | 144 (2.2%)     |
    | totals (training set) | 4552  | 3195 (70.2%)   | 1053 (23.1%)   | 1061 (23.3%)     | 86 (1.9%)      | 88 (1.9%)      |
    | totals (test set)     | 2116  | 1196 (56.5%)   | 443 (20.9%)    | 430 (20.3%)      | 64 (3.0%)      | 56 (2.6%)      |

    In this table, the numbers are the count and percentage of human-rated sliders that the corresponding algorithm got *wrong*. The columns are

    * "repository" - the name of the repository used. I used the diffs between successive non-merge commits on the HEAD branch of the corresponding repository.

    * "count" - the number of sliders that were human-rated. I chose most, but not all, sliders to rate from those among which the various algorithms gave different answers.

    * "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0.

    * "compaction" - the heuristic used by `git diff --compaction-heuristic` in Git 2.9.0.

    * "compaction-fixed" - the heuristic used by `git diff --compaction-heuristic` after the fixes from earlier in this patch series. Note that the results are not dramatically different than those for "compaction". Both produce non-ideal diffs only about 1/3 as often as the default `git diff`.

    * "indent-1" - the new `--indent-heuristic` algorithm, using the first set of weighting factors, determined as described above.

    * "indent-2" - the new `--indent-heuristic` algorithm, using the final set of weighting factors, determined as described below.

    * `*` - indicates that repo was part of training set used to determine the first set of weighting factors.

    The fact that the heuristic performed nearly as well on the test set as on the training set in column "indent-1" is a good indication that the heuristic was not over-trained. Given that fact, I ran a second round of optimization, using the entire corpus as the training set. The resulting set of weights gave the results in column "indent-2". These are the weights included in this patch.

    The final result gives consistently and significantly better results across the whole corpus than either `git diff` or `git diff --compaction-heuristic`. It makes only about 1/30 as many errors as the former and about 1/10 as many errors as the latter. (And a good fraction of the remaining errors are for diffs that involve weirdly-formatted code, sometimes apparently machine-generated.)

    The tools that were used to do this optimization and analysis, along with the human-generated data values, are recorded in a separate project [1].

    [1] https://github.com/mhagger/diff-slider-tools

    Original Git commit: 433860f3d0beb0c6f205290bd16cda413148f098
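The measurement step of the indent heuristic (the characteristics recorded for a proposed split) can be sketched as follows. This is a simplified illustration and not the xdiff implementation: the struct layout, the field and function names, the use of -1 for "blank or absent", and the tab width of 8 are assumptions made for this example. The empirically optimized weights and the scoring step are deliberately left out, since their values are not given in the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counterpart of the measurements listed in the commit
 * message. An indent of -1 means "blank line or no such line". */
typedef struct {
	int pre_blank;   /* blank lines directly above the split */
	int pre_indent;  /* indent of nearest non-blank line above */
	int indent;      /* indent of the line directly below the split */
	int post_blank;  /* blank lines following that line */
	int post_indent; /* indent of nearest non-blank line after it */
} split_measurement_sketch;

/* Indentation in columns (tabs advance to the next multiple of 8,
 * an assumption), or -1 if the line contains only spaces and tabs. */
static int line_indent(const char *line)
{
	int indent = 0;

	for (; *line != '\0'; line++) {
		if (*line == ' ')
			indent++;
		else if (*line == '\t')
			indent += 8 - indent % 8;
		else
			return indent;
	}

	return -1;
}

/* Measures a split placed directly above lines[split]. Lines are
 * NUL-terminated strings without trailing newlines. */
static split_measurement_sketch measure_split(const char **lines,
	int nlines, int split)
{
	split_measurement_sketch m = { 0, -1, -1, 0, -1 };
	int i;

	assert(lines != NULL && split >= 0 && split <= nlines);

	if (split < nlines)
		m.indent = line_indent(lines[split]);

	/* Walk upwards until the first non-blank line. */
	for (i = split - 1; i >= 0; i--) {
		m.pre_indent = line_indent(lines[i]);
		if (m.pre_indent != -1)
			break;
		m.pre_blank++;
	}

	/* Walk downwards past the line directly below the split. */
	for (i = split + 1; i < nlines; i++) {
		m.post_indent = line_indent(lines[i]);
		if (m.post_indent != -1)
			break;
		m.post_blank++;
	}

	return m;
}
```

A scoring function in the spirit of step 2 would turn one such measurement into a badness value, and step 3 would compare the combined values for the top and bottom split at each possible slider position.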