Commit message | Author | Age | Files | Lines
* fuzzer: update for indexer changes (ethomson/fuzzer) | Edward Thomson | 2018-08-26 | 1 | -1/+1
* Merge pull request #4727 from libgit2/cmn/null-oid-existing-tree | Edward Thomson | 2018-08-26 | 3 | -6/+24
    tree: accept null ids in existing trees when updating
  * tree: rename from_tree to validate and clarify the tree in the test (cmn/null-oid-existing-tree) | Carlos Martín Nieto | 2018-07-27 | 2 | -6/+7
  * tree: accept null ids in existing trees when updating | Carlos Martín Nieto | 2018-07-18 | 3 | -6/+23
    When we add entries to a treebuilder we validate them. But we validate
    even those that we're adding because they exist in the base tree. This
    disables using the normal mechanisms on these trees, even to fix them.
    Keep track of whether the entry we're appending comes from an existing
    tree and bypass the name and id validation if it's from existing data.
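
    A minimal sketch of how this surfaces through the public treebuilder
    API; `repo`, `base_tree` and the entry name are illustrative
    assumptions, not part of the commit:

        git_treebuilder *bld = NULL;
        git_oid fixed_id;
        int error;

        /* Open a builder on an existing tree. With this change, entries
         * taken over from base_tree bypass name/id validation, so even a
         * tree containing an invalid entry can be loaded and repaired. */
        if ((error = git_treebuilder_new(&bld, repo, base_tree)) < 0)
            goto done;

        /* drop the offending entry, then write out the fixed tree */
        if ((error = git_treebuilder_remove(bld, "broken-entry")) < 0)
            goto done;
        error = git_treebuilder_write(&fixed_id, bld);

    done:
        git_treebuilder_free(bld);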
* Merge pull request #4374 from pks-t/pks/pack-file-verify | Edward Thomson | 2018-08-26 | 18 | -150/+529
    Pack file verification
  * odb_pack: fix passing partially initialized indexer options | Patrick Steinhardt | 2018-06-22 | 1 | -1/+1
  * indexer: correctly initialize struct with {0} | Patrick Steinhardt | 2018-06-22 | 1 | -1/+1
  * tests: indexer: add test to exercise our connectivity checking | Patrick Steinhardt | 2018-06-22 | 1 | -0/+58
    The new connectivity checks are currently not exercised at all, as
    they are turned off by default. Create two test cases: one with a
    pack file which fails our checks and one which succeeds.
  * indexer: add ability to select connectivity checks | Patrick Steinhardt | 2018-06-22 | 2 | -3/+6
    Right now, we simply turn on connectivity checks in the indexer as
    soon as we have access to an object database. But as the connectivity
    checks may incur additional overhead, we want users to decide for
    themselves whether to allow those checks. Furthermore, it may also be
    desirable to check connectivity when no object database is given at
    all, e.g. when a fully connected pack file is expected.

    Add a flag `verify` to `git_indexer_options` to enable the additional
    verification checks, and avoid querying the ODB when none is given so
    that users can enable the checks even without an ODB.
  * indexer: introduce options struct to `git_indexer_new` | Patrick Steinhardt | 2018-06-22 | 7 | -21/+62
    We strive to pass an options structure to many functions so that we
    are able to extend the options in the future without breaking the
    API. `git_indexer_new` doesn't have one right now, but we want to be
    able to add an option for enabling strict packfile verification. Add
    a new `git_indexer_options` structure and adjust callers to use that.
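
    A sketch of the resulting call shape, combining this options struct
    with the `verify` flag from the commit above; the path and mode
    values are illustrative:

        git_indexer *idx = NULL;
        git_indexer_options opts = GIT_INDEXER_OPTIONS_INIT;

        opts.verify = 1;  /* opt in to the connectivity checks */

        /* odb may be NULL; with verify set, the pack is then expected
         * to be fully connected on its own */
        if (git_indexer_new(&idx, ".", 0, odb, &opts) == 0) {
            /* feed data via git_indexer_append, finish with
             * git_indexer_commit */
            git_indexer_free(idx);
        }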
  * indexer: check pack file connectivity | Patrick Steinhardt | 2018-06-22 | 1 | -1/+151
    When passing `--strict` to `git-unpack-objects`, core git will verify
    the pack file that is currently being read. In addition to the
    typical checksum verification, this will especially cause it to
    verify the object connectivity of the received pack file: it checks,
    for every received object, that all the objects it references are
    either part of the local object database or part of the pack file. In
    libgit2, we currently have no such mechanism, which leaves us unable
    to verify received pack files prior to writing them into our local
    object database.

    This commit introduces the concept of `expected_oids` to the indexer.
    When pack file verification is turned on by a new flag, the indexer
    will try to parse each received object first. If the object has any
    links to other objects, the indexer checks whether those links are
    already satisfied by known objects, either in the object database or
    among the objects it has already seen in the pack file. If not, it
    adds them to the list of `expected_oids`. Furthermore, the indexer
    removes the current object from `expected_oids` if it is currently
    being expected. This way, we are able to verify whether all object
    links are satisfied: as soon as we hit the end of the object stream
    and have resolved all objects, including deltified ones, we assert
    that `expected_oids` is empty. This should always be the case for a
    valid pack file with full connectivity.
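
    In pseudocode, the check amounts to the following; the `oid_set_*`
    helpers and `foreach_link` are hypothetical stand-ins, not the actual
    indexer internals:

        /* for each object streamed from the pack: */
        foreach_link(obj, link_oid) {
            /* a link is satisfied by the ODB or by pack-local objects */
            if (!git_odb_exists(odb, link_oid) &&
                !seen_in_pack(idx, link_oid))
                oid_set_add(idx->expected_oids, link_oid);
        }

        /* the object itself may satisfy an earlier expectation */
        oid_set_remove(idx->expected_oids, git_object_id(obj));

        /* once the stream, deltas included, is fully resolved: */
        if (!oid_set_is_empty(idx->expected_oids))
            return packfile_not_connected_error();  /* hypothetical */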
  * indexer: extract function reading stream objects | Patrick Steinhardt | 2018-06-22 | 1 | -78/+91
    The loop inside of `git_indexer_append` iterates over every object
    that is to be stored as part of the index. While the logic to
    retrieve every object from the packfile stream is rather involved, it
    is currently just part of the loop, making it unnecessarily hard to
    follow. Move the logic into its own function `read_stream_object`,
    which unpacks a single object from the stream.

    Note that there is some subtlety here involving the special error
    `GIT_EBUFS`, which indicates to the indexer that no more data is
    currently available. So instead of returning an error and aborting
    the whole loop in that case, we have to catch that value and return
    successfully in order to wait for more data to be read.
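
    The resulting control flow looks roughly like this (a sketch, not
    the verbatim code):

        while (stats->indexed_objects < stats->total_objects) {
            error = read_stream_object(idx, stats);

            if (error == GIT_EBUFS)
                return 0;      /* no failure: wait for more data */
            if (error < 0)
                goto on_error; /* a real parsing or hashing failure */
        }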
  * indexer: remove useless local variable | Patrick Steinhardt | 2018-06-22 | 1 | -6/+3
    The `processed` variable local to `git_indexer_append` counts how
    many objects have already been processed. But whenever it gets
    assigned to, we also assign the same value to the
    `stats->indexed_objects` struct member. The variable is thus quite
    useless, always carrying the same value as the `indexed_objects`
    member, and only makes the code a bit harder to understand. We can
    just remove it.
  * object: implement function to parse raw data | Patrick Steinhardt | 2018-06-22 | 2 | -8/+68
    Now that we have implemented functions to parse all git objects from
    raw data, we can implement a generic function `git_object__from_raw`
    to create a structure of type `git_object`. This allows us to parse
    and interpret objects from raw data without having to touch the ODB
    at all, which is especially useful for object verification prior to
    accepting objects into the repository.
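
    A sketch of what the new entry point enables; the exact internal
    signature is an assumption here:

        git_object *obj = NULL;

        /* parse a commit straight from a raw buffer, never touching
         * the ODB */
        if (git_object__from_raw(&obj, data, size, GIT_OBJ_COMMIT) == 0) {
            /* obj can now be inspected for verification purposes */
            git_object_free(obj);
        }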
  * tree: implement function to parse raw data | Patrick Steinhardt | 2018-06-22 | 2 | -6/+20
    Currently, parsing objects is strictly tied to having an ODB object
    available. This makes it hard to parse an object when all that is
    available is its raw data and size. Furthermore, hacking around that
    limitation by directly creating an ODB structure, either on the stack
    or on the heap, does not really work that well due to ODB objects
    being reference counted and automatically freed when reaching a
    reference count of zero. Implement a function `git_tree__parse_raw`
    to parse a tree object from a pair of `data` and `size`.
  * tag: implement function to parse raw data | Patrick Steinhardt | 2018-06-22 | 2 | -0/+6
    Currently, parsing objects is strictly tied to having an ODB object
    available. This makes it hard to parse an object when all that is
    available is its raw data and size. Furthermore, hacking around that
    limitation by directly creating an ODB structure, either on the stack
    or on the heap, does not really work that well due to ODB objects
    being reference counted and automatically freed when reaching a
    reference count of zero. Implement a function `git_tag__parse_raw` to
    parse a tag object from a pair of `data` and `size`.
  * commit: implement function to parse raw data | Patrick Steinhardt | 2018-06-22 | 2 | -3/+11
    Currently, parsing objects is strictly tied to having an ODB object
    available. This makes it hard to parse an object when all that is
    available is its raw data and size. Furthermore, hacking around that
    limitation by directly creating an ODB structure, either on the stack
    or on the heap, does not really work that well due to ODB objects
    being reference counted and automatically freed when reaching a
    reference count of zero. Implement a function `git_commit__parse_raw`
    to parse a commit object from a pair of `data` and `size`.
  * blob: implement function to parse raw data | Patrick Steinhardt | 2018-06-22 | 2 | -7/+36
    Currently, parsing objects is strictly tied to having an ODB object
    available. This makes it hard to parse an object when all that is
    available is its raw data and size. Furthermore, hacking around that
    limitation by directly creating an ODB structure, either on the stack
    or on the heap, does not really work that well due to ODB objects
    being reference counted and automatically freed when reaching a
    reference count of zero. On some occasions, though, parsing raw
    objects without touching the ODB is actually required. One use case
    is object verification, where we want to assure that an object is
    valid before inserting it into the ODB or writing it into the git
    repository.

    As a first step towards that, introduce a distinction between raw and
    ODB objects for blobs. Creation of ODB objects stays the same by
    simply using `git_blob__parse`, but a new function
    `git_blob__parse_raw` has been added that creates a blob from a pair
    of data and size. By setting a new flag inside of the blob, we can
    now distinguish whether it is a raw or an ODB object and treat it
    accordingly in several places.

    Note that the blob data passed in is not being copied. Because of
    that, callers need to make sure to keep it alive during the blob's
    lifetime. This is being used to avoid unnecessarily increasing the
    memory footprint when parsing largish blobs.
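
    Because the raw parser borrows the buffer rather than copying it,
    callers must respect its lifetime; a sketch, with the signature
    assumed:

        git_blob *blob = NULL;

        if (git_blob__parse_raw(&blob, data, size) == 0) {
            /* `data` must stay valid for as long as `blob` is in use */
            inspect_blob(blob);  /* hypothetical caller-side work */
            git_blob_free(blob);
        }

        /* only after the blob is freed may the buffer be released */
        free(data);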
  * blob: use getters to get raw blob content and size | Patrick Steinhardt | 2018-06-22 | 1 | -4/+4
    Going forward, we will have to change how blob sizes are calculated
    based on whether the blob is a cached object that is part of the ODB
    or not. In order to not have to distinguish between those two object
    types repeatedly when accessing the blob's data or size, encapsulate
    all existing direct uses of those fields by instead using
    `git_blob_rawcontent` and `git_blob_rawsize`.
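
    The two getters are existing public API, so the encapsulated
    accesses look like:

        /* instead of reaching into the blob's fields directly: */
        const void *content = git_blob_rawcontent(blob);
        git_off_t size = git_blob_rawsize(blob);

        fwrite(content, 1, (size_t)size, stdout);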
  * pack-objects: make `git_walk_object` internal to pack-objects | Patrick Steinhardt | 2018-06-22 | 2 | -16/+16
    The `git_walk_object` structure is currently only used inside of the
    pack-objects.c file, but is declared in its header. This has actually
    been the case since its inception in 04a36feff (pack-objects: fill a
    packbuilder from a walk, 2014-10-11) and has never really changed.
    Move the struct declaration into pack-objects.c to improve code
    encapsulation.
* Merge pull request #4777 from pks-t/pks/cmake-iconv-via-libc | Edward Thomson | 2018-08-24 | 1 | -6/+11
    cmake: detect and use libc-provided iconv
  * cmake: detect and use libc-provided iconv | Patrick Steinhardt | 2018-08-24 | 1 | -6/+11
    While most systems provide a separate iconv library against which
    applications can link, musl based systems do not provide such a
    library. Instead, iconv functions are directly included in the C
    library. As our current CMake module to locate the iconv library only
    checks whether a library exists somewhere in the typical library
    directories, we will never build libgit2 with libiconv support on
    such systems.

    Extend the iconv module to also search whether libc provides iconv
    functions, which we do by checking whether the `iconv_open` function
    exists inside of libc. If this is the case, we default to the
    libc-provided iconv instead of trying to use a separate libiconv.
    While this changes which iconv we use on systems where both libc and
    an external libiconv exist, to the best of my knowledge common
    systems only provide either one or the other.

    Note that iconv support in musl is rather basic. To quote musl libc's
    page on functional differences from glibc [1]:

        The iconv implementation musl is very small and oriented towards
        being unobtrusive to static link. Its character set/encoding
        coverage is very strong for its size, but not comprehensive like
        glibc's.

    As we assume iconv to be a lot more capable than what musl provides,
    some of our tests will fail when using iconv on musl-based platforms.

    [1]: https://wiki.musl-libc.org/functional-differences-from-glibc.html
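
    The shape of such a check in CMake; the variable names here are
    illustrative, not necessarily the module's actual ones:

        include(CheckFunctionExists)
        check_function_exists(iconv_open ICONV_FOUND_IN_LIBC)

        if(ICONV_FOUND_IN_LIBC)
            # iconv lives in libc (e.g. on musl): nothing extra to link
            set(ICONV_LIBRARIES "")
            set(ICONV_FOUND TRUE)
        endif()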
* Merge pull request #4774 from tiennou/fix/clang-analyzer | Patrick Steinhardt | 2018-08-24 | 4 | -5/+5
    Coverity flavored clang analyzer fixes
  * transport/http: do not return success if we failed to get a scheme | Etienne Samson | 2018-08-21 | 1 | -1/+1
    Otherwise we return a NULL context, which will get dereferenced in
    apply_credentials.
  * remote: set the error before cleanup | Etienne Samson | 2018-08-21 | 1 | -2/+2
    Otherwise we'll return stack data to the caller.
  * mailmap: Undefined or garbage value returned to caller | Etienne Samson | 2018-08-21 | 1 | -1/+1
    In case there was nothing to parse in the buf, we'd return
    uninitialized stack data.
  * revwalk: The left operand of '<' is a garbage value | Etienne Samson | 2018-08-21 | 1 | -1/+1
    At line 594, we do this:

        if (error < 0)
            return error;

    but if nothing was pushed in a GIT_SORT_TIME revwalk, we'd return
    uninitialized stack data.
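
    The fix pattern, sketched rather than quoted from the revwalk code:

        int error = 0;  /* initialized: the loop below may never run */

        while (commits_remain(walk)) {  /* hypothetical condition */
            if ((error = process_next(walk)) < 0)
                break;
        }

        if (error < 0)
            return error;  /* no longer reads an uninitialized value */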
* Merge pull request #4776 from pks-t/pks/test-index-invalid-filemode | Edward Thomson | 2018-08-24 | 1 | -0/+42
    tests: verify adding index conflicts with invalid filemodes fails
  * tests: verify adding index conflicts with invalid filemodes fails | Patrick Steinhardt | 2018-08-24 | 1 | -0/+42
    Commit 581d5492f (Fix leak in index.c, 2018-08-16) fixed a memory
    leak in our code adding conflicts to the index when the added index
    entries have an invalid file mode. The memory leak previously went
    undiscovered as no tests covered this scenario; such a test is now
    being added by this commit.
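
    Such a test might look roughly like this in libgit2's clar harness;
    the entry values are illustrative:

        git_index_entry entry = {{0}};

        entry.path = "conflicting-file";
        entry.mode = 0100666;  /* not a valid git filemode */
        cl_git_pass(git_oid_fromstr(&entry.id,
            "1385f264afb75a56a5bec74243be9b367ba4ca08"));

        /* adding a conflict with an invalid filemode must fail
         * cleanly, without leaking the duplicated entries */
        cl_git_fail(git_index_conflict_add(index, &entry, &entry, &entry));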
* Merge pull request #4769 from tiennou/fix/worktree-unlock | Patrick Steinhardt | 2018-08-24 | 2 | -3/+3
    worktree: unlock should return 1 when the worktree isn't locked
  * worktree: unlock should return 1 when the worktree isn't locked | Etienne Samson | 2018-08-17 | 2 | -3/+3
    The documentation states that git_worktree_unlock returns 0 on
    success, and 1 if the worktree wasn't locked. It turns out we were
    returning 0 in both of those cases.
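
    Callers can now distinguish the three outcomes:

        switch (git_worktree_unlock(wt)) {
        case 0:  /* the lock was removed */         break;
        case 1:  /* wasn't locked, nothing done */  break;
        default: /* a real error occurred */        break;
        }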
* Merge pull request #4752 from nelhage/fuzz-config | Patrick Steinhardt | 2018-08-24 | 2 | -0/+86
    Add a fuzzer for config files
  * Add a proper write loop | Nelson Elhage | 2018-08-16 | 1 | -2/+10
  * Add a copyright header | Nelson Elhage | 2018-08-14 | 1 | -0/+9
  * Further review comments, fix the build | Nelson Elhage | 2018-08-14 | 1 | -11/+27
  * Reformat | Nelson Elhage | 2018-08-14 | 1 | -27/+28
  * Add a config file to the corpus | Nelson Elhage | 2018-08-05 | 1 | -0/+11
  * Add a config file fuzzer | Nelson Elhage | 2018-08-05 | 1 | -0/+41
* Merge pull request #4763 from cschlack/fix_ng_packets | Patrick Steinhardt | 2018-08-24 | 1 | -7/+9
    Fix 'invalid packet line' for ng packets containing errors
  * Fix 'invalid packet line' for ng packets containing errors | Christian Schlack | 2018-08-17 | 1 | -7/+9
* Merge pull request #4768 from abyss7/master | Edward Thomson | 2018-08-19 | 1 | -1/+2
    Fix leak in index.c
  * Fix leak in index.c | abyss7 | 2018-08-16 | 1 | -1/+2
* Merge pull request #4754 from libgit2/ethomson/threads | Edward Thomson | 2018-08-19 | 2 | -11/+21
    threads::diff: use separate git_repository objects
  * threads::iterator: use separate repository objects (ethomson/threads) | Edward Thomson | 2018-08-19 | 1 | -1/+4
    Our thread policies state that we cannot re-use the `git_repository`
    across threads. Our tests cannot deviate from that.

    Courtesy of Ximin Luo, https://github.com/infinity0:
    https://github.com/libgit2/libgit2/issues/4753#issuecomment-412247757
  * threads::diff: use separate git_repository objects | Edward Thomson | 2018-08-05 | 1 | -10/+17
    Our thread policies state that we cannot re-use the `git_repository`
    across threads. Our tests cannot deviate from that.
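
    Under that policy, each worker thread opens its own handle; a
    pthread-based sketch, with names assumed:

        static void *worker(void *payload)
        {
            const char *path = payload;
            git_repository *repo = NULL;

            /* every thread opens its own git_repository rather than
             * sharing one across threads */
            if (git_repository_open(&repo, path) < 0)
                return NULL;

            /* ... per-thread diff/iterator work ... */

            git_repository_free(repo);
            return NULL;
        }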
* Merge pull request #4766 from pks-t/pks/travis-remove-coverity | Edward Thomson | 2018-08-17 | 1 | -8/+1
    travis: remove Coverity cron job
  * travis: remove Coverity cron job | Patrick Steinhardt | 2018-08-16 | 1 | -8/+1
    With the recent addition of VSTS to our CI infrastructure, we now
    have two cron jobs running regular Coverity analysis. It doesn't
    really make a lot of sense to upload two different analyses of our
    sources to Coverity, though:

    - in the worst case, Coverity will be repeatedly confused when
      different sets of sources get analyzed and uploaded
    - in the best case, nothing is gained because the sources have
      already been analyzed via the other job

    Let's just use a single cron job for Coverity. Considering that VSTS
    seems to be the more beefy and flexible platform, it is more likely
    to be our future target CI platform. Thus, we retain its support for
    Coverity and remove it from Travis instead.
* Merge pull request #4749 from neithernut/fix-git__linenlen-ub | Patrick Steinhardt | 2018-08-16 | 1 | -4/+7
    parse: Do not initialize the content in context to NULL
  * parse: Do not initialize the content in context to NULL | Julian Ganz | 2018-08-04 | 1 | -4/+7
    String operations in libgit2 are supposed to never receive `NULL`,
    i.e. they are not `NULL`-safe. In the case of `git__linenlen()`,
    invocation with `NULL` leads to undefined behavior. In a
    `git_parse_ctx`, however, the `content` field used in these
    operations was initialized to `NULL` if `git_parse_ctx_init()` was
    called with `NULL` for `content` or `0` for `content_len`. For the
    latter case, the initialization function even contained some logic
    for initializing `content` with `NULL`.

    This commit mitigates triggering undefined behavior by rewriting the
    logic. Now `content` is always initialized to a non-null buffer:
    instead of a null buffer, an empty string is used for denoting an
    empty buffer.
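
    The rewritten initialization, sketched close to (but not verbatim
    from) the patch:

        void git_parse_ctx_init(git_parse_ctx *ctx,
                                const char *content, size_t content_len)
        {
            if (content && content_len) {
                ctx->content = content;
                ctx->content_len = content_len;
            } else {
                /* never NULL: an empty string denotes an empty buffer */
                ctx->content = "";
                ctx->content_len = 0;
            }
        }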
* Merge pull request #4750 from nelhage/nelhage-config-no-section | Patrick Steinhardt | 2018-08-16 | 3 | -2/+43
    config_file: Don't crash on options without a section