summaryrefslogtreecommitdiff
Commit message (Collapse)AuthorAgeFilesLines
* Merge branch 'jk/path-name-safety-2.6' into jk/path-name-safety-2.7jk/path-name-safety-2.7Junio C Hamano2016-03-1610-142/+46
|\ | | | | | | | | | | | | | | | | | | | | * jk/path-name-safety-2.6: list-objects: pass full pathname to callbacks list-objects: drop name_path entirely list-objects: convert name_path to a strbuf show_object_with_name: simplify by using path_name() http-push: stop using name_path tree-diff: catch integer overflow in combine_diff_path allocation add helpers for detecting size_t overflow
| * Merge branch 'jk/path-name-safety-2.5' into jk/path-name-safety-2.6jk/path-name-safety-2.6Junio C Hamano2016-03-1613-146/+84
| |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jk/path-name-safety-2.5: list-objects: pass full pathname to callbacks list-objects: drop name_path entirely list-objects: convert name_path to a strbuf show_object_with_name: simplify by using path_name() http-push: stop using name_path tree-diff: catch integer overflow in combine_diff_path allocation add helpers for detecting size_t overflow
| | * Merge branch 'jk/path-name-safety-2.4' into jk/path-name-safety-2.5jk/path-name-safety-2.5Junio C Hamano2016-03-1613-146/+84
| | |\ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jk/path-name-safety-2.4: list-objects: pass full pathname to callbacks list-objects: drop name_path entirely list-objects: convert name_path to a strbuf show_object_with_name: simplify by using path_name() http-push: stop using name_path tree-diff: catch integer overflow in combine_diff_path allocation add helpers for detecting size_t overflow
| | | * list-objects: pass full pathname to callbacksjk/path-name-safety-2.4Jeff King2016-03-169-58/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When we find a blob at "a/b/c", we currently pass this to our show_object_fn callbacks as two components: "a/b/" and "c". Callbacks which want the full value then call path_name(), which concatenates the two. But this is an inefficient interface; the path is a strbuf, and we could simply append "c" to it temporarily, then roll back the length, without creating a new copy. So we could improve this by teaching the callsites of path_name() this trick (and there are only 3). But we can also notice that no callback actually cares about the broken-down representation, and simply pass each callback the full path "a/b/c" as a string. The callback code becomes even simpler, then, as we do not have to worry about freeing an allocated buffer, nor rolling back our modification to the strbuf. This is theoretically less efficient, as some callbacks would not bother to format the final path component. But in practice this is not measurable. Since we use the same strbuf over and over, our work to grow it is amortized, and we really only pay to memcpy a few bytes. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| | | * list-objects: drop name_path entirelyJeff King2016-03-169-25/+19
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | In the previous commit, we left name_path as a thin wrapper around a strbuf. This patch drops it entirely. As a result, every show_object_fn callback needs to be adjusted. However, none of their code needs to be changed at all, because the only use was to pass it to path_name(), which now handles the bare strbuf. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| | | * list-objects: convert name_path to a strbufJeff King2016-03-163-36/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The "struct name_path" data is examined in only two places: we generate it in process_tree(), and we convert it to a single string in path_name(). Everyone else just passes it through to those functions. We can further note that process_tree() already keeps a single strbuf with the leading tree path, for use with tree_entry_interesting(). Instead of building a separate name_path linked list, let's just use the one we already build in "base". This reduces the amount of code (especially tricky code in path_name() which did not check for integer overflows caused by deep or large pathnames). It is also more efficient in some instances. Any time we were using tree_entry_interesting, we were building up the strbuf anyway, so this is an immediate and obvious win there. In cases where we were not, we trade off storing "pathname/" in a strbuf on the heap for each level of the path, instead of two pointers and an int on the stack (with one pointer into the tree object). On a 64-bit system, the latter is 20 bytes; so if path components are less than that on average, this has lower peak memory usage. In practice it probably doesn't matter either way; we are already holding in memory all of the tree objects leading up to each pathname, and for normal-depth pathnames, we are only talking about hundreds of bytes. This patch leaves "struct name_path" as a thin wrapper around the strbuf, to avoid disrupting callbacks. We should fix them, but leaving it out makes this diff easier to view. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| | | * show_object_with_name: simplify by using path_name()Jeff King2016-03-161-34/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When "git rev-list" shows an object with its associated path name, it does so by walking the name_path linked list and printing each component (stopping at any embedded NULs or newlines). We'd like to eventually get rid of name_path entirely in favor of a single buffer, and dropping this custom printing code is part of that. As a first step, let's use path_name() to format the list into a single buffer, and print that. This is strictly less efficient than the original, but it's a temporary step in the refactoring; our end game will be to get the fully formatted name in the first place. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| | | * http-push: stop using name_pathJeff King2016-03-161-16/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The graph traversal code here passes along a name_path to build up the pathname at which we find each blob. But we never actually do anything with the resulting names, making it a waste of code and memory. This usage came in aa1dbc9 (Update http-push functionality, 2006-03-07), and originally the result was passed to "add_object" (which stored it, but didn't really use it, either). But we stopped using that function in 1f1e895 (Add "named object array" concept, 2006-06-19) in favor of storing just the objects themselves. Moreover, the generation of the name in process_tree() is buggy. It sticks "name" onto the end of the name_path linked list, and then passes it down again as it recurses (instead of "entry.path"). So it's a good thing this was unused, as the resulting path for "a/b/c/d" would end up as "a/a/a/a". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| | | * tree-diff: catch integer overflow in combine_diff_path allocationJeff King2016-03-162-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A combine_diff_path struct has two "flex" members allocated alongside the struct: a string to hold the pathname, and an array of parent pointers. We use an "int" to compute this, meaning we may easily overflow it if the pathname is extremely long. We can fix this by using size_t, and checking for overflow with the st_add helper. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| | | * add helpers for detecting size_t overflowJeff King2016-03-161-0/+34
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performing computations on size_t variables that we feed to xmalloc and friends can be dangerous, as an integer overflow can cause us to allocate a much smaller chunk than we realized. We already have unsigned_add_overflows(), but let's add unsigned_mult_overflows() to that. Furthermore, rather than have each site manually check and die on overflow, we can provide some helpers that will: - promote the arguments to size_t, so that we know we are doing our computation in the same size of integer that will ultimately be fed to xmalloc - check and die on overflow - return the result so that computations can be done in the parameter list of xmalloc. These functions are a lot uglier to use than normal arithmetic operators (you have to do "st_add(foo, bar)" instead of "foo + bar"). To at least limit the damage, we also provide multi-valued versions. So rather than: st_add(st_add(a, b), st_add(c, d)); you can write: st_add4(a, b, c, d); This isn't nearly as elegant as a varargs function, but it's a lot harder to get it wrong. You don't have to remember to add a sentinel value at the end, and the compiler will complain if you get the number of arguments wrong. This patch adds only the numbered variants required to convert the current code base; we can easily add more later if needed. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Git 2.7.3v2.7.3Junio C Hamano2016-03-104-3/+66
| | | | | | | | | | | | | | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'ma/update-hooks-sample-typofix' into maintJunio C Hamano2016-03-101-1/+1
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | * ma/update-hooks-sample-typofix: templates/hooks: fix minor typo in the sample update-hook
| * | | | templates/hooks: fix minor typo in the sample update-hookma/update-hooks-sample-typofixMartin Amdisen2016-02-251-1/+1
| | |_|/ | |/| | | | | | | | | | | | | | Signed-off-by: Martin Mosegaard Amdisen <martin.amdisen@praqma.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | Merge branch 'dt/initial-ref-xn-commit-doc' into maintJunio C Hamano2016-03-101-0/+12
|\ \ \ \ | | | | | | | | | | | | | | | | | | | | * dt/initial-ref-xn-commit-doc: refs: document transaction semantics
| * | | | refs: document transaction semanticsdt/initial-ref-xn-commit-docDavid Turner2016-02-251-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add some comments on ref transaction semantics to refs.h Signed-off-by: David Turner <dturner@twopensource.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'ps/plug-xdl-merge-leak' into maintJunio C Hamano2016-03-101-2/+7
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | * ps/plug-xdl-merge-leak: xdiff/xmerge: fix memory leak in xdl_merge
| * | | | | xdiff/xmerge: fix memory leak in xdl_mergeps/plug-xdl-merge-leakPatrick Steinhardt2016-02-231-2/+7
| | |_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When building the script for the second file that is to be merged we have already allocated memory for data structures related to the first file. When we encounter an error in building the second script we only free allocated memory related to the second file before erroring out. Fix this memory leak by also releasing allocated memory related to the first file. Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | Merge branch 'ak/git-strip-extension-from-dashed-command' into maintJunio C Hamano2016-03-102-15/+15
|\ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code simplification. * ak/git-strip-extension-from-dashed-command: git.c: simplify stripping extension of a file in handle_builtin()
| * | | | | git.c: simplify stripping extension of a file in handle_builtin()ak/git-strip-extension-from-dashed-commandAlexander Kuleshov2016-02-212-15/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The handle_builtin() starts from stripping of command extension if STRIP_EXTENSION is enabled. Actually STRIP_EXTENSION does not used anywhere else. This patch introduces strip_extension() helper to strip STRIP_EXTENSION extension from argv[0] with the strip_suffix() instead of manually stripping. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'ak/extract-argv0-last-dir-sep' into maintJunio C Hamano2016-03-101-4/+2
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Code simplification. * ak/extract-argv0-last-dir-sep: exec_cmd.c: use find_last_dir_sep() for code simplification
| * | | | | | exec_cmd.c: use find_last_dir_sep() for code simplificationak/extract-argv0-last-dir-sepAlexander Kuleshov2016-02-191-4/+2
| | |_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We are trying to extract dirname from argv0 in the git_extract_argv0_path(). But in the same time, the <git-compat-util.h> provides find_last_dir_sep() to get dirname from a given path. Let's use it instead of loop for the code simplification. Signed-off-by: Alexander Kuleshov <kuleshovmail@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'jk/pack-idx-corruption-safety' into maintJunio C Hamano2016-03-104-0/+207
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The code to read the pack data using the offsets stored in the pack idx file has been made more carefully check the validity of the data in the idx. * jk/pack-idx-corruption-safety: sha1_file.c: mark strings for translation use_pack: handle signed off_t overflow nth_packed_object_offset: bounds-check extended offset t5313: test bounds-checks of corrupted/malicious pack/idx files
| * | | | | | sha1_file.c: mark strings for translationNguyễn Thái Ngọc Duy2016-02-271-3/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | use_pack: handle signed off_t overflowJeff King2016-02-252-1/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A v2 pack index file can specify an offset within a packfile of up to 2^64-1 bytes. On a system with a signed 64-bit off_t, we can represent only up to 2^63-1. This means that a corrupted .idx file can end up with a negative offset in the pack code. Our bounds-checking use_pack function looks for too-large offsets, but not for ones that have wrapped around to negative. Let's do so, which fixes an out-of-bounds access demonstrated in t5313. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | nth_packed_object_offset: bounds-check extended offsetJeff King2016-02-254-1/+27
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If a pack .idx file has a corrupted offset for an object, we may try to access an offset in the .idx or .pack file that is larger than the file's size. For the .pack case, we have use_pack() to protect us, which realizes the access is out of bounds. But if the corrupted value asks us to look in the .idx file's secondary 64-bit offset table, we blindly add it to the mmap'd index data and access arbitrary memory. We can fix this with a simple bounds-check compared to the size we found when we opened the .idx file. Note that there's similar code in index-pack that is triggered only during "index-pack --verify". To support both, we pull the bounds-check into a separate function, which dies when it sees a corrupted file. It would be nice if we could return an error, so that the pack code could try to find a good copy of the object elsewhere. Currently nth_packed_object_offset doesn't have any way to return an error, but it could probably use "0" as a sentinel value (since no object can start there). This is the minimal fix, and we can improve the resilience later on top. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | t5313: test bounds-checks of corrupted/malicious pack/idx filesJeff King2016-02-251-0/+179
| |/ / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Our on-disk .pack and .idx files may reference other data by offset. We should make sure that we are not fooled by corrupt data into accessing memory outside of our mmap'd boundaries. This patch adds a series of tests for offsets found in .pack and .idx files. For the most part we get this right, but there are two tests of .idx files marked as failures: we do not bounds-check offsets in the v2 index's extended offset table, nor do we handle .idx offsets that overflow a signed off_t. With these tests, we should have good coverage of all offsets found in these files. Note that this doesn't cover .bitmap files, which may have similar bugs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | Merge branch 'js/config-set-in-non-repository' into maintJunio C Hamano2016-03-102-0/+14
|\ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | "git config section.var value" to set a value in per-repository configuration file failed when it was run outside any repository, but didn't say the reason correctly. * js/config-set-in-non-repository: git config: report when trying to modify a non-existing repo config
| * | | | | | git config: report when trying to modify a non-existing repo configjs/config-set-in-non-repositoryJohannes Schindelin2016-02-252-0/+14
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | It is a pilot error to call `git config section.key value` outside of any Git worktree. The message error: could not lock config file .git/config: No such file or directory is not very helpful in that situation, though. Let's print a helpful message instead. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'sb/submodule-module-list-fix' into maintJunio C Hamano2016-03-102-8/+27
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | A helper function "git submodule" uses since v2.7.0 to list the modules that match the pathspec argument given to its subcommands (e.g. "submodule add <repo> <path>") has been fixed. * sb/submodule-module-list-fix: submodule helper list: respect correct path prefix
| * | | | | | | submodule helper list: respect correct path prefixsb/submodule-module-list-fixStefan Beller2016-02-242-8/+27
| | |_|_|_|/ / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a regression introduced by 74703a1e4d (submodule: rewrite `module_list` shell function in C, 2015-09-02). Add a test to ensure we list the right submodule when giving a specific pathspec. Reported-By: Caleb Jorden <cjorden@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'jk/grep-binary-workaround-in-test' into maintJunio C Hamano2016-03-102-13/+17
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Recent versions of GNU grep are pickier when their input contains arbitrary binary data, which some of our tests uses. Rewrite the tests to sidestep the problem. * jk/grep-binary-workaround-in-test: t9200: avoid grep on non-ASCII data t8005: avoid grep on non-ASCII data
| * | | | | | | t9200: avoid grep on non-ASCII datajk/grep-binary-workaround-in-testJohn Keeping2016-02-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GNU grep 2.23 detects the input used in this test as binary data so it does not work for extracting lines from a file. We could add the "-a" option to force grep to treat the input as text, but not all implementations support that. Instead, use sed to extract the desired lines since it will always treat its input as text. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | t8005: avoid grep on non-ASCII dataJohn Keeping2016-02-231-12/+16
| | |_|/ / / / | |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GNU grep 2.23 detects the input used in this test as binary data so it does not work for extracting lines from a file. We could add the "-a" option to force grep to treat the input as text, but not all implementations support that. Instead, use sed to extract the desired lines since it will always treat its input as text. While touching these lines, modernize the test style to avoid hiding the exit status of "git blame" and remove a space following a redirection operator. Also swap the order of the expected and actual output files given to test_cmp; we compare expect and actual to show how actual output differs from what is expected. Signed-off-by: John Keeping <john@keeping.me.uk> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'mm/push-simple-doc' into maintJunio C Hamano2016-03-101-0/+7
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The documentation did not clearly state that the 'simple' mode is now the default for "git push" when push.default configuration is not set. * mm/push-simple-doc: Documentation/git-push: document that 'simple' is the default
| * | | | | | | Documentation/git-push: document that 'simple' is the defaultmm/push-simple-docMatthieu Moy2016-02-231-0/+7
| |/ / / / / / | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The default behavior is well documented already in git-config(1), but git-push(1) itself did not mention it at all. For users willing to learn how "git push" works but not how to configure it, this makes the documentation cumbersome to read. Make the git-push(1) page self-contained by adding a short summary of what 'push.default=simple' does, early in the page. Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | | | | | | Merge branch 'jk/tighten-alloc' into maintJunio C Hamano2016-03-1091-482/+441
|\ \ \ \ \ \ \ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | * jk/tighten-alloc: (23 commits) compat/mingw: brown paper bag fix for 50a6c8e ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow ...
| * | | | | | | compat/mingw: brown paper bag fix for 50a6c8eJeff King2016-02-291-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Commit 50a6c8e (use st_add and st_mult for allocation size computation, 2016-02-22) fixed up many xmalloc call-sites including ones in compat/mingw.c. But I screwed up one of them, which was half-converted to ALLOC_ARRAY, using a very early prototype of the function. And I never caught it because I don't build on Windows. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | ewah: convert to REALLOC_ARRAY, etcJeff King2016-02-223-19/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Now that we're built around xmalloc and friends, we can use helpers like REALLOC_ARRAY, ALLOC_GROW, and so on to make the code shorter and protect against integer overflow. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | convert ewah/bitmap code to use xmallocJeff King2016-02-224-30/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This code was originally written with the idea that it could be spun off into its own ewah library, and uses the overrideable ewah_malloc to do allocations. We plug in xmalloc as our ewah_malloc, of course. But over the years the ewah code itself has become more entangled with git, and the return value of many ewah_malloc sites is not checked. Let's just drop the level of indirection and use xmalloc and friends directly. This saves a few lines, and will let us adapt these sites to our more advanced malloc helpers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | diff_populate_gitlink: use a strbufJeff King2016-02-221-8/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We allocate 100 bytes to hold the "Submodule commit ..." text. This is enough, but it's not immediately obvious that this is the case, and we have to repeat the magic 100 twice. We could get away with xstrfmt here, but we want to know the size, as well, so let's use a real strbuf. And while we're here, we can clean up the logic around size_only. It currently sets and clears the "data" field pointlessly, and leaves the "should_free" flag on even after we have cleared the data. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | transport_anonymize_url: use xstrfmtJeff King2016-02-221-5/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function uses xcalloc and two memcpy calls to concatenate two strings. We can do this as an xstrfmt one-liner, and then it is more clear that we are allocating the correct amount of memory. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | git-compat-util: drop mempcpy compat codeJeff King2016-02-221-9/+0
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | There are no callers of this left, as the last one was dropped in the previous patch. And there are not likely to be new ones, as the function has been around since 2010 without gaining any new callers. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | sequencer: simplify memory allocation of get_messageJeff King2016-02-221-19/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For a commit with sha1 "1234abcd" and subject "foo", this function produces a struct with three strings: 1. "foo" 2. "1234abcd... foo" 3. "parent of 1234abcd... foo" It takes advantage of the fact that these strings are subsets of each other, and allocates only _one_ string, with pointers into the various parts. Unfortunately, this makes the string allocation complicated and hard to follow. Since we keep only one of these in memory at a time, we can afford to simply allocate three strings. This lets us build on tools like xstrfmt and avoid manual computation. While we're here, we can also drop the ad-hoc reimplementation of get_git_commit_encoding(), and simply call that function. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | test-path-utils: fix normalize_path_copy output buffer sizeJeff King2016-02-221-11/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The normalize_path_copy function needs an output buffer that is at least as long as its input (it may shrink the path, but never expand it). However, this test program feeds it static PATH_MAX-sized buffers, which have no relation to the input size. In the normalize_ceiling_entry case, we do at least check the size against PATH_MAX and die(), but that case is even more convoluted. We normalize into a fixed-size buffer, free the original, and then replace it with a strdup'd copy of the result. But normalize_path_copy explicitly allows normalizing in-place, so we can simply do that. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | fetch-pack: simplify add_sought_entryJeff King2016-02-221-18/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have two variants of this function, one that takes a string and one that takes a ptr/len combo. But we only call the latter with the length of a NUL-terminated string, so our first simplification is to drop it in favor of the string variant. Since we know we have a string, we can also replace the manual memory computation with a call to alloc_ref(). Furthermore, we can rely on get_oid_hex() to complain if it hits the end of the string. That means we can simplify the check for "<sha1> <ref>" versus just "<ref>". Rather than manage the ptr/len pair, we can just bump the start of our string forward. The original code over-allocated based on the original "namelen" (which wasn't _wrong_, but was simply wasteful and confusing). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | fast-import: simplify allocation in start_packfileJeff King2016-02-221-4/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This function allocate a packed_git flex-array, and adds a mysterious 2 bytes to the length of the pack_name field. One is for the trailing NUL, but the other has no purpose. This is probably cargo-culted from add_packed_git, which gets the ".idx" path and needed to allocate enough space to hold the matching ".pack" (though since 48bcc1c, we calculate the size there differently). This site, however, is using the raw path of a tempfile, and does not need the extra byte. We can just replace the allocation with FLEX_ALLOC_STR, which handles the allocation and the NUL for us. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | write_untracked_extension: use FLEX_ALLOC helperJeff King2016-02-221-5/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We perform unchecked additions when computing the size of a "struct ondisk_untracked_cache". This is unlikely to have an integer overflow in practice, but we'd like to avoid this dangerous pattern to make further audits easier. Note that there's one subtlety here, though. We protect ourselves against a NULL exclude_per_dir entry in our source, and avoid calling strlen() on it, keeping "len" at 0. But later, we unconditionally memcpy "len + 1" bytes to get the trailing NUL byte. If we did have a NULL exclude_per_dir, we would read from bogus memory. As it turns out, though, we always create this field pointing to a string literal, so there's no bug. We can just get rid of the pointless extra conditional. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | prepare_{git,shell}_cmd: use argv_arrayJeff King2016-02-223-53/+39
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | These functions transform an existing argv into one suitable for exec-ing or spawning via git or a shell. We can use an argv_array in each to avoid dealing with manual counting and allocation. This also makes the memory allocation more clear and fixes some leaks. In prepare_shell_cmd, we would sometimes allocate a new string with "$@" in it and sometimes not, meaning the caller could not correctly free it. On the non-Windows side, we are in a child process which will exec() or exit() immediately, so the leak isn't a big deal. On Windows, though, we use spawn() from the parent process, and leak a string for each shell command we run. On top of that, the Windows code did not free the allocated argv array at all (but does for the prepare_git_cmd case!). By switching both of these functions to write into an argv_array, we can consistently free the result as appropriate. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | use st_add and st_mult for allocation size computationJeff King2016-02-2225-53/+56
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | If our size computation overflows size_t, we may allocate a much smaller buffer than we expected and overflow it. It's probably impossible to trigger an overflow in most of these sites in practice, but it is easy enough convert their additions and multiplications into overflow-checking variants. This may be fixing real bugs, and it makes auditing the code easier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * | | | | | | convert trivial cases to FLEX_ARRAY macrosJeff King2016-02-2217-82/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Using FLEX_ARRAY macros reduces the amount of manual computation size we have to do. It also ensures we don't overflow size_t, and it makes sure we write the same number of bytes that we allocated. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>