path: root/src/odb.c
Commit message | Author | Age | Files | Lines
* refactor: `src` is now `src/libgit2`Edward Thomson2022-02-221-1831/+0
|
* odb: initialize `object` before useEdward Thomson2022-02-221-1/+1
| | | | | Newer gcc is complaining about `object` being potentially not initialized; initialize it.
* diff: indicate when the file size is "valid"Edward Thomson2022-02-121-4/+2
| | | | | | | | When we know the file size (because we're producing it from a working directory iterator, or an index with an up-to-date cache) then set a flag indicating as such. This removes the ambiguity about a 0 file size, which could indicate that a file exists and is 0 bytes, or that we haven't read it yet.
* odb: check for write failuresEdward Thomson2022-02-071-3/+3
|
* Merge pull request #6104 from libgit2/ethomson/pathEdward Thomson2021-11-111-2/+2
|\ | | | | path: refactor utility path functions
| * path: separate git-specific path functions from utilEdward Thomson2021-11-091-2/+2
| | | | | | | | | | | | Introduce `git_fs_path`, which operates on generic filesystem paths. `git_path` will be kept for only git-specific path functionality (for example, checking for `.git` in a path).
* | Support checking for object existence without refreshJosh Triplett2021-11-081-1/+6
|/ | | | | | | | | | | | Looking up a non-existent object currently always invokes `git_odb_refresh`. If looking up a large batch of objects, many of which may legitimately not exist, this will repeatedly refresh the ODB to no avail. Add a `git_odb_exists_ext` that accepts flags controlling the ODB lookup, and add a flag to suppress the refresh. This allows the user to control if and when they refresh (for instance, refreshing once before starting the batch).
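The batching pattern this commit describes can be sketched without libgit2 at all. Everything below (`toy_odb`, `toy_exists_ext`, `LOOKUP_NO_REFRESH`) is a hypothetical stand-in, not the library's API; the point is only that a single up-front refresh plus a "no refresh" flag keeps misses from triggering a rescan each time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not libgit2's types or flags. */
#define LOOKUP_NO_REFRESH (1u << 0)

struct toy_odb {
    const int *ids;      /* object ids currently known */
    size_t count;
    unsigned refreshes;  /* how many times the store was rescanned */
};

static bool toy_find(const struct toy_odb *db, int id)
{
    for (size_t i = 0; i < db->count; i++)
        if (db->ids[i] == id)
            return true;
    return false;
}

/* Stand-in for rescanning the disk for new packs. */
static void toy_refresh(struct toy_odb *db)
{
    db->refreshes++;
}

/* Look up an id; on a miss, refresh and retry unless the flag forbids it. */
static bool toy_exists_ext(struct toy_odb *db, int id, unsigned flags)
{
    if (toy_find(db, id))
        return true;
    if (flags & LOOKUP_NO_REFRESH)
        return false;
    toy_refresh(db);
    return toy_find(db, id);
}

/* Count hits in a batch, refreshing once up front instead of once per miss. */
static size_t toy_count_existing(struct toy_odb *db, const int *batch, size_t n)
{
    size_t hits = 0;
    toy_refresh(db);
    for (size_t i = 0; i < n; i++)
        if (toy_exists_ext(db, batch[i], LOOKUP_NO_REFRESH))
            hits++;
    return hits;
}
```

With a batch containing three missing ids, the flagged path performs one refresh in total, whereas passing `0` as the flags would add one refresh per miss.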
* str: introduce `git_str` for internal, `git_buf` is externalethomson/gitstrEdward Thomson2021-10-171-17/+17
| | | | | | | | | | | | | | | | | | | | | | | | | | | libgit2 has two distinct requirements that were previously solved by `git_buf`. We require: 1. A general purpose string class that provides a number of utility APIs for manipulating data (eg, concatenating, truncating, etc). 2. A structure that we can use to return strings to callers that they can take ownership of. By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult. Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr"). The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.) Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, then converting it back again.
* hash: hash functions operate on byte arrays not git_oidsEdward Thomson2021-10-021-3/+3
| | | | | | Separate the concerns of the hash functions from the git_oid functions. The git_oid structure will need to understand either SHA1 or SHA256; the hash functions should only deal with the appropriate one of these.
* hash: accept the algorithm in inputsEdward Thomson2021-10-011-3/+3
|
* midx: Introduce git_odb_write_multi_pack_index()lhchavez2021-08-271-0/+29
| | | | | | | | This change introduces git_odb_write_multi_pack_index(), which creates a `multi-pack-index` file from all the `.pack` files that have been loaded in the ODB. Fixes: #5399
* Proof-of-concept for a more aggressive GIT_UNUSED()lhchavez2021-08-081-2/+2
| | | | | | This adds a `-Wunused-result`-proof `GIT_UNUSED()`, just to demonstrate that it works. With this, sortedcache.h is now completely `GIT_WARN_UNUSED_RESULT`-annotated!
* tests: reset odb backend priorityethomson/odb_tests_priorityEdward Thomson2021-07-301-2/+2
|
* odb: Implement option for overriding of default odb backend priorityTony De La Nuez2021-07-301-6/+6
| | | | | | | | | | Introduce GIT_OPT_SET_ODB_LOOSE_PRIORITY and GIT_OPT_SET_ODB_PACKED_PRIORITY to allow overriding the default priority values for the default ODB backends. Libgit2 has historically assumed that most objects for long-running operations will be packed, therefore GIT_LOOSE_PRIORITY is set to 1 by default, and GIT_PACKED_PRIORITY to 2. When a client allows libgit2 to set the default backends, they can specify an override for the two priority values in order to change the order in which each ODB backend is accessed.
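To illustrate how priority values could determine access order, here is a small sketch. It assumes that a higher priority means the backend is consulted first, which is what the defaults above imply (packed at 2 ahead of loose at 1); `toy_backend` and `toy_sort_backends` are hypothetical names, not libgit2 code.

```c
#include <stdlib.h>

/* Hypothetical backend record; not libgit2's internal structure. */
struct toy_backend {
    const char *name;
    int priority;
};

/* Higher priority sorts first, so it would be consulted first. */
static int by_priority_desc(const void *a, const void *b)
{
    const struct toy_backend *x = a, *y = b;
    return (y->priority > x->priority) - (y->priority < x->priority);
}

static void toy_sort_backends(struct toy_backend *backends, size_t n)
{
    qsort(backends, n, sizeof(*backends), by_priority_desc);
}
```

Sorting `{"loose", 1}, {"packed", 2}` puts the packed entry first; raising the loose priority above 2 and sorting again swaps the order, which is the effect an override is meant to have.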
* Merge pull request #5765 from lhchavez/cgraph-revwalksEdward Thomson2021-07-261-0/+53
|\ | | | | commit-graph: Use the commit-graph in revwalks
| * commit-graph: Create `git_commit_graph` as an abstraction for the filelhchavez2021-03-101-60/+39
| | | | | | | | | | | | | | | | | | | | This change does a medium-size refactor of the git_commit_graph_file and the interaction with the ODB. Now instead of the ODB owning a direct reference to the git_commit_graph_file, there will be an intermediate git_commit_graph. The main advantage of that is that now end users can explicitly set a git_commit_graph that is eagerly checked for errors, while still being able to lazily use the commit-graph in a regular ODB, if the file is present.
| * commit-graph: Use the commit-graph in revwalkslhchavez2021-03-101-0/+74
| | | | | | | | | | | | | | | | | | | | | | | | | | This change makes revwalks a bit faster by using the `commit-graph` file (if present). This is thanks to the `commit-graph` allowing much faster parsing of the commit information by requiring near-zero I/O (aside from reading a few dozen bytes off of a `mmap(2)`-ed file) for each commit, instead of having to read the ODB, inflate the commit, and parse it. This is done by modifying `git_commit_list_parse()` and letting it use the ODB-owned commit-graph file. Part of: #5757
* | Tolerate readlink size less than st_sizeDavid Tolnay2021-05-301-3/+4
| |
* | filter: internal git_buf filter handling functionEdward Thomson2021-05-061-3/+1
|/ | | | | | | | | | | Introduce `git_filter_list__convert_buf` which behaves like the old implementation of `git_filter_list__apply_data`, where it might move the input data buffer over into the output data buffer space for efficiency. This new implementation will do so in a more predictable way, always freeing the given input buffer (either moving it to the output buffer or filtering it into the output buffer first). Convert internal users to it.
* Make the odb race-freelhchavez2020-11-281-17/+153
| | | | | | | This change adds all the necessary locking to the odb to avoid races in the backends. Part of: #5592
* odb: use GIT_ASSERTEdward Thomson2020-11-271-20/+41
|
* Improve the support of atomicslhchavez2020-10-081-3/+3
| | | | | | | | | | | | | | | | | | | | | | | This change: * Starts using GCC's and clang's `__atomic_*` intrinsics instead of the `__sync_*` ones, since the former supersede the latter (and can be safely replaced by their equivalent `__atomic_*` version with the sequentially consistent model). * Makes `git_atomic64`'s value `volatile`. Otherwise, this will make ThreadSanitizer complain. * Adds ways to load the values from atomics. As it turns out, unsynchronized reads are okay only in some architectures, but if we want to be correct (and make ThreadSanitizer happy), those loads should also be performed with the atomic builtins. * Fixes two ThreadSanitizer warnings, as a proof-of-concept that this works: - Avoids accessing `git_refcount`'s `owner` directly, and instead makes all callers go through the `GIT_REFCOUNT_*()` macros, which also use the atomic utilities. - Makes `pool_system_page_size()` race-free. Part of: #5592
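As a rough illustration of the first three points, the sketch below wraps a 64-bit counter with the GCC/clang `__atomic_*` builtins in sequentially consistent mode, including an atomic load. `toy_atomic64` and its helpers are hypothetical names, not libgit2's `git_atomic64` API, and the code assumes a compiler that provides these builtins.

```c
#include <stdint.h>

/* Hypothetical counter type; not libgit2's git_atomic64. */
typedef struct {
    volatile int64_t val;
} toy_atomic64;

/* Atomically add n and return the new value (sequentially consistent). */
static int64_t toy_atomic64_add(toy_atomic64 *a, int64_t n)
{
    return __atomic_add_fetch(&a->val, n, __ATOMIC_SEQ_CST);
}

/* Read through the builtin rather than with a plain, unsynchronized load. */
static int64_t toy_atomic64_get(toy_atomic64 *a)
{
    return __atomic_load_n(&a->val, __ATOMIC_SEQ_CST);
}
```

Routing reads through `toy_atomic64_get` instead of touching `val` directly is the same idea as making all callers go through accessor macros.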
* Make the tests pass cleanly with MemorySanitizerlhchavez2020-06-301-3/+4
| | | | | | | | | This change: * Initializes a few variables that were being read before being initialized. * Includes https://github.com/madler/zlib/pull/393. As such, it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.
* tree-wide: do not compile deprecated functions with hard deprecationPatrick Steinhardt2020-06-091-0/+2
| | | | | | | | | | | | | | | | | | | | | | When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h" header to be empty. As a result, no function declarations are made available to callers, but the implementations are still available to link against. This has the problem that function declarations also aren't visible to the implementations, meaning that the symbol's visibility will not be set up correctly. As a result, the resulting library may not expose those deprecated symbols at all on some platforms and thus cause linking errors. Fix the issue by conditionally compiling deprecated functions, only. While it becomes impossible to link against such a library in case one uses deprecated functions, distributors of libgit2 aren't expected to pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real" hard deprecation still makes sense in the context of CI to test we don't use deprecated symbols ourselves and in case a dependant uses libgit2 in a vendored way and knows it won't ever use any of the deprecated symbols anyway.
* futils_filesize: use `uint64_t` for object sizeEdward Thomson2019-11-221-8/+14
| | | | | Instead of using a signed type (`off_t`) use `uint64_t` for the maximum size of files.
* odb: use `git_object_size_t` for object sizeEdward Thomson2019-11-221-5/+5
| | | | | Instead of using a signed type (`off_t`) use a new `git_object_size_t` for the sizes of objects.
* fileops: rename to "futils.h" to match function signaturesPatrick Steinhardt2019-07-201-1/+1
| | | | | | | | | Our file utils functions all have a "futils" prefix, e.g. `git_futils_touch`. One would thus naturally guess that their definitions and implementation would live in files "futils.h" and "futils.c", respectively, but in fact they live in "fileops.h". Rename the files to match expectations.
* configuration: cvar -> configmapPatrick Steinhardt2019-07-181-1/+1
| | | | | `cvar` is an unhelpful name. Refactor its usage to `configmap` for more clarity.
* docs: fixupsEtienne Samson2019-06-261-0/+1
|
* oid: `is_zero` instead of `iszero`Edward Thomson2019-06-161-5/+5
| | | | | | The only function that is named `issomething` (without underscore) was `git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with the rest of the library.
* odb: provide a free function for custom backendsethomson/odb_backend_allocationsEdward Thomson2019-02-231-0/+6
| | | | | | | | | | | | Custom backends can allocate memory when reading objects and providing them to libgit2. However, if an error occurs in the custom backend after the memory has been allocated for the custom object but before it's returned to libgit2, the custom backend has no way to free that memory and it must be leaked. Provide a free function that corresponds to the alloc function so that custom backends have an opportunity to free memory before they return an error.
* odb: rename git_odb_backend_malloc for consistencyEdward Thomson2019-02-231-1/+6
| | | | | | | | | | | | The `git_odb_backend_malloc` name is a system function that is provided for custom ODB backends and allows them to allocate memory for an ODB object in the read callback. This is important so that libgit2 can later free the memory used by an ODB object that was read from the custom backend. However, the name _suggests_ that it actually allocates a `git_odb_backend`. It does not; rename it to make it clear that it actually allocates backend _data_.
* indexer: use git_indexer_progress throughoutEdward Thomson2019-02-221-1/+1
| | | | | Update internal usage of `git_transfer_progress` to `git_indexer_progress`.
* cache: fix misnaming of `git_cache_free`Patrick Steinhardt2019-02-211-2/+2
| | | | | | | | | Functions that free a structure's contents but not the structure itself shall be named `dispose` in the libgit2 project, but the function `git_cache_free` does not follow this naming pattern. Fix this by renaming it to `git_cache_dispose` and adjusting all callers to make use of the new name.
* Fix a memory leak in odb_otype_fast()lhchavez2019-02-201-0/+1
| | | | This change frees a copy of a cached object in odb_otype_fast().
* Fix a _very_ improbable memory leak in git_odb_new()lhchavez2019-02-161-2/+6
| | | | | | This change fixes a mostly theoretical memory leak in git_odb_new() that can only manifest if git_cache_init() fails due to running out of memory or not being able to acquire its lock.
* blob: validate that blob sizes fit in a size_tEdward Thomson2019-01-251-6/+6
| | | | | | Our blob size is a `git_off_t`, which is a signed 64 bit int. This may be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob size fits into a `size_t` before casting.
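A minimal sketch of the check described here, using `int64_t` as a stand-in for `git_off_t`. The function name and return convention are assumptions for illustration, not the code this commit added.

```c
#include <stddef.h>
#include <stdint.h>

/* Returns 0 and stores the size on success, -1 if it cannot fit in a size_t. */
static int toy_size_to_size_t(size_t *out, int64_t size)
{
    /* Reject negative values first, then anything above SIZE_MAX
     * (possible where size_t is narrower than 64 bits). */
    if (size < 0 || (uint64_t)size > (uint64_t)SIZE_MAX)
        return -1;
    *out = (size_t)size;
    return 0;
}
```

On failure `*out` is left untouched, so a caller can bail out before any cast has happened.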
* git_error: use new names in internal APIs and usageEdward Thomson2019-01-221-29/+29
| | | | | Move to the `git_error` name in the internal API for error-related functions.
* Fix odb foreach to also close on positive error codeMarijan Šuflaj2019-01-201-1/+1
| | | | | | | | include/git2/odb.h states that the callback can also return a positive value, which should break the loop. Implementations of git_odb_foreach() and pack_backend__foreach() did not respect that.
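The behaviour being fixed can be shown with a toy iterator. The names below are hypothetical and the loop is not libgit2's implementation; it only demonstrates that testing for any nonzero result, rather than only negative ones, is what lets a positive return value stop the iteration. Returning the callback's value to the caller is an assumption of this sketch.

```c
#include <stddef.h>

typedef int (*toy_foreach_cb)(int id, void *payload);

/* Visit ids in order; any nonzero callback result stops the loop and is returned. */
static int toy_foreach(const int *ids, size_t n, toy_foreach_cb cb, void *payload)
{
    for (size_t i = 0; i < n; i++) {
        int error = cb(ids[i], payload);
        if (error != 0)   /* a check of `error < 0` would miss positive values */
            return error;
    }
    return 0;
}

/* Example callback: count visits and stop with a positive value at id 3. */
static int toy_stop_at_three(int id, void *payload)
{
    int *visited = payload;
    (*visited)++;
    return id == 3 ? 1 : 0;
}
```

Iterating over the ids 1 to 5 with `toy_stop_at_three` visits three entries and then stops.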
* Merge pull request #4940 from libgit2/ethomson/git_objEdward Thomson2019-01-191-3/+3
|\ | | | | More `git_obj` to `git_object` updates
| * object_type: GIT_OBJECT_BAD is now GIT_OBJECT_INVALIDEdward Thomson2019-01-171-3/+3
| | | | | | | | | | | | | | We use the term "invalid" to refer to bad or malformed data, eg `GIT_REF_INVALID` and `GIT_EINVALIDSPEC`. Since we're changing the names of the `git_object_t`s in this release, update it to be `GIT_OBJECT_INVALID` instead of `BAD`.
* | Fix a bunch of warningslhchavez2019-01-051-1/+1
|/ | | | | | | | | | | This change fixes a bunch of warnings that were discovered by compiling with `clang -target=i386-pc-linux-gnu`. It turned out that the intrinsics were not necessarily being used in all platforms! Especially in GCC, since it does not support __has_builtin. Some more warnings were gleaned from the Windows build, but I stopped when I saw that some third-party dependencies (e.g. zlib) have warnings of their own, so we might never be able to enable -Werror there.
* object_type: use new enumeration namesethomson/index_fixesEdward Thomson2018-12-011-29/+29
| | | | Use the new object_type enumeration names within the codebase.
* odb: fix use of wrong printf formattersPatrick Steinhardt2018-08-061-2/+2
| | | | | | | | | The `git_odb_stream` members `declared_size` and `received_bytes` are both of the type `git_off_t`, which we usually define to be a 64 bit signed integer. Thus, passing these members to "PRIdZ" formatters is not correct, as they are not guaranteed to accept big enough numbers. Instead, use the "PRId64" formatter, which is able to represent 64 bit signed integers.
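For reference, this is how the standard `PRId64` macro from `<inttypes.h>` is used with 64-bit signed values. The helper below is a hypothetical example, with `int64_t` standing in for `git_off_t`.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format "received/declared" into buf; returns snprintf's result. */
static int toy_format_progress(char *buf, size_t len,
                               int64_t received, int64_t declared)
{
    /* PRId64 expands to the correct conversion for int64_t on this platform. */
    return snprintf(buf, len, "%" PRId64 "/%" PRId64, received, declared);
}
```

Values above the 32-bit range, such as 5000000000, are formatted in full regardless of how wide `size_t` is on the target.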
* Convert usage of `git_buf_free` to new `git_buf_dispose`Patrick Steinhardt2018-06-101-7/+7
|
* odb: fix writing to fake write streamsPatrick Steinhardt2018-03-231-1/+1
| | | | | | | | | | | | | | | | | | In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects, 2018-02-01), the check for whether we are trying to overflow the fake stream buffer was changed from returning an error to raising an assert. The conversion forgot though that the logic around `assert`s is basically inverted. Previously, if the statement `stream->written + len > stream->size` evaluated to true, we would return a `-1`. Now we are asserting that this statement is true, and in case it is not we will raise an error. So the conversion to the `assert` in fact changed the behaviour to the complete opposite intention. Fix the assert by inverting its condition again and add a regression test.
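The inversion is easy to get wrong, so here is a small sketch of the corrected form. `toy_stream` is a hypothetical in-memory stream whose field names merely follow the commit message; it is not the library's fake write stream.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical in-memory stream; field names follow the commit message. */
struct toy_stream {
    char *buffer;
    size_t size;
    size_t written;
};

static void toy_stream_write(struct toy_stream *stream, const char *data, size_t len)
{
    /* The error-returning version bailed out when `written + len > size`.
     * An assert states what must hold, so the condition is negated. */
    assert(stream->written + len <= stream->size);
    memcpy(stream->buffer + stream->written, data, len);
    stream->written += len;
}
```

Two four-byte writes into an eight-byte buffer pass the assertion; a third write would trip it instead of overrunning the buffer.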
* odb: fix memory leaks due to not freeing hash contextPatrick Steinhardt2018-02-091-0/+2
|
* odb: error when we can't create object headerEdward Thomson2018-02-091-18/+38
| | | | | Return an error to the caller when we can't create an object header for some reason (printf failure) instead of simply asserting.
* odb: assert on logic errors when writing objectsEdward Thomson2018-02-091-2/+1
| | | | | There's no recovery possible if we're so confused or corrupted that we're trying to overwrite our memory. Simply assert.
* git_odb__hashfd: propagate error on failuresEdward Thomson2018-02-091-1/+1
|