| Commit message (Collapse) | Author | Age | Files | Lines |
| | |
|
| |
|
|
|
|
|
|
|
|
|
| |
Introduce GIT_OPT_SET_ODB_LOOSE_PRIORITY and GIT_OPT_SET_ODB_PACKED_PRIORITY
to allow overriding the default priority values for the default ODB
backends. Libgit2 has historically assumed that most objects for long-
running operations will be packed, therefore GIT_LOOSE_PRIORITY is
set to 1 by default, and GIT_PACKED_PRIORITY to 2.
When a client allows libgit2 to set the default backends, they can
specify an override for the two priority values in order to change
the order in which each ODB backend is accessed.
|
| |\
| |
| | |
commit-graph: Use the commit-graph in revwalks
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This change does a medium-size refactor of the git_commit_graph_file and
the interaction with the ODB. Now instead of the ODB owning a direct
reference to the git_commit_graph_file, there will be an intermediate
git_commit_graph. The main advantage of that is that now end users can
explicitly set a git_commit_graph that is eagerly checked for errors,
while still being able to lazily use the commit-graph in a regular ODB,
if the file is present.
|
| | |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This change makes revwalks a bit faster by using the `commit-graph` file
(if present). This is thanks to the `commit-graph` allow much faster
parsing of the commit information by requiring near-zero I/O (aside from
reading a few dozen bytes off of a `mmap(2)`-ed file) for each commit,
instead of having to read the ODB, inflate the commit, and parse it.
This is done by modifying `git_commit_list_parse()` and letting it use
the ODB-owned commit-graph file.
Part of: #5757
|
| | | |
|
| |/
|
|
|
|
|
|
|
|
|
|
| |
Introduce `git_filter_list__convert_buf` which behaves like the old
implementation of `git_filter_list__apply_data`, where it might move the
input data buffer over into the output data buffer space for efficiency.
This new implementation will do so in a more predictible way, always
freeing the given input buffer (either moving it to the output buffer or
filtering it into the output buffer first).
Convert internal users to it.
|
| |
|
|
|
|
|
| |
This change adds all the necessary locking to the odb to avoid races in
the backends.
Part of: #5592
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This change:
* Starts using GCC's and clang's `__atomic_*` intrinsics instead of the
`__sync_*` ones, since the former supercede the latter (and can be
safely replaced by their equivalent `__atomic_*` version with the
sequentially consistent model).
* Makes `git_atomic64`'s value `volatile`. Otherwise, this will make
ThreadSanitizer complain.
* Adds ways to load the values from atomics. As it turns out,
unsynchronized read are okay only in some architectures, but if we
want to be correct (and make ThreadSanitizer happy), those loads
should also be performed with the atomic builtins.
* Fixes two ThreadSanitizer warnings, as a proof-of-concept that this
works:
- Avoid directly accessing `git_refcount`'s `owner` directly, and
instead makes all callers go through the `GIT_REFCOUNT_*()` macros,
which also use the atomic utilities.
- Makes `pool_system_page_size()` race-free.
Part of: #5592
|
| |
|
|
|
|
|
|
|
| |
This change:
* Initializes a few variables that were being read before being
initialized.
* Includes https://github.com/madler/zlib/pull/393. As such,
it only works reliably with `-DUSE_BUNDLED_ZLIB=ON`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor
definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h"
header to be empty. As a result, no function declarations are made
available to callers, but the implementations are still available to
link against. This has the problem that function declarations also
aren't visible to the implementations, meaning that the symbol's
visibility will not be set up correctly. As a result, the resulting
library may not expose those deprecated symbols at all on some platforms
and thus cause linking errors.
Fix the issue by conditionally compiling deprecated functions, only.
While it becomes impossible to link against such a library in case one
uses deprecated functions, distributors of libgit2 aren't expected to
pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually
define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real"
hard deprecation still makes sense in the context of CI to test we don't
use deprecated symbols ourselves and in case a dependant uses libgit2 in
a vendored way and knows it won't ever use any of the deprecated symbols
anyway.
|
| |
|
|
|
| |
Instead of using a signed type (`off_t`) use `uint64_t` for the maximum
size of files.
|
| |
|
|
|
| |
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
|
| |
|
|
|
|
|
|
|
| |
Our file utils functions all have a "futils" prefix, e.g.
`git_futils_touch`. One would thus naturally guess that their
definitions and implementation would live in files "futils.h" and
"futils.c", respectively, but in fact they live in "fileops.h".
Rename the files to match expectations.
|
| |
|
|
|
| |
`cvar` is an unhelpful name. Refactor its usage to `configmap` for more
clarity.
|
| | |
|
| |
|
|
|
|
| |
The only function that is named `issomething` (without underscore) was
`git_oid_iszero`. Rename it to `git_oid_is_zero` for consistency with
the rest of the library.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Custom backends can allocate memory when reading objects and providing
them to libgit2. However, if an error occurs in the custom backend
after the memory has been allocated for the custom object but before
it's returned to libgit2, the custom backend has no way to free that
memory and it must be leaked.
Provide a free function that corresponds to the alloc function so that
custom backends have an opportunity to free memory before they return an
error.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The `git_odb_backend_malloc` name is a system function that is provided
for custom ODB backends and allows them to allocate memory for an ODB
object in the read callback. This is important so that libgit2 can
later free the memory used by an ODB object that was read from the
custom backend.
However, the name _suggests_ that it actually allocates a
`git_odb_backend`. It does not; rename it to make it clear that it
actually allocates backend _data_.
|
| |
|
|
|
| |
Update internal usage of `git_transfer_progress` to
`git_indexer_progreses`.
|
| |
|
|
|
|
|
|
|
| |
Functions that free a structure's contents but not the structure
itself shall be named `dispose` in the libgit2 project, but the
function `git_cache_free` does not follow this naming pattern.
Fix this by renaming it to `git_cache_dispose` and adjusting all
callers to make use of the new name.
|
| |
|
|
| |
This change frees a copy of a cached object in odb_otype_fast().
|
| |
|
|
|
|
| |
This change fixes a mostly theoretical memory leak in got_odb_new()
that can only manifest if git_cache_init() fails due to running out of
memory or not being able to acquire its lock.
|
| |
|
|
|
|
| |
Our blob size is a `git_off_t`, which is a signed 64 bit int. This may
be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob
size fits into a `size_t` before casting.
|
| |
|
|
|
| |
Move to the `git_error` name in the internal API for error-related
functions.
|
| |
|
|
|
|
|
|
| |
In include/git2/odb.h it states that callback can also return
positive value which should break looping.
Implementations of git_odb_foreach() and pack_backend__foreach()
did not respect that.
|
| |\
| |
| | |
More `git_obj` to `git_object` updates
|
| | |
| |
| |
| |
| |
| |
| | |
We use the term "invalid" to refer to bad or malformed data, eg
`GIT_REF_INVALID` and `GIT_EINVALIDSPEC`. Since we're changing the
names of the `git_object_t`s in this release, update it to be
`GIT_OBJECT_INVALID` instead of `BAD`.
|
| |/
|
|
|
|
|
|
|
|
|
| |
This change fixes a bunch of warnings that were discovered by compiling
with `clang -target=i386-pc-linux-gnu`. It turned out that the
intrinsics were not necessarily being used in all platforms! Especially
in GCC, since it does not support __has_builtin.
Some more warnings were gleaned from the Windows build, but I stopped
when I saw that some third-party dependencies (e.g. zlib) have warnings
of their own, so we might never be able to enable -Werror there.
|
| |
|
|
| |
Use the new object_type enumeration names within the codebase.
|
| |
|
|
|
|
|
|
|
| |
The `git_odb_stream` members `declared_size` and `received_bytes` are
both of the type `git_off_t`, which we usually defined to be a 64 bit
signed integer. Thus, passing these members to "PRIdZ" formatters is not
correct, as they are not guaranteed to accept big enough numbers.
Instead, use the "PRId64" formatter, which is able to represent 64 bit
signed integers.
|
| | |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 7ec7aa4a7 (odb: assert on logic errors when writing objects,
2018-02-01), the check for whether we are trying to overflowing the fake
stream buffer was changed from returning an error to raising an assert.
The conversion forgot though that the logic around `assert`s are
basically inverted. Previously, if the statement
stream->written + len > steram->size
evaluated to true, we would return a `-1`. Now we are asserting that
this statement is true, and in case it is not we will raise an error. So
the conversion to the `assert` in fact changed the behaviour to the
complete opposite intention.
Fix the assert by inverting its condition again and add a regression
test.
|
| | |
|
| |
|
|
|
| |
Return an error to the caller when we can't create an object header for
some reason (printf failure) instead of simply asserting.
|
| |
|
|
|
| |
There's no recovery possible if we're so confused or corrupted that
we're trying to overwrite our memory. Simply assert.
|
| | |
|
| |
|
|
|
|
| |
Provide error messages on hash failures: assert when given invalid
input instead of failing with a user error; provide error messages
on program errors.
|
| |
|
|
|
|
|
| |
It's unlikely that we'll fail to allocate a single byte, but let's check
for allocation failures for good measure. Untangle `-1` being a marker
of not having found the hardcoded odb object; use that to reflect actual
errors.
|
| |
|
|
|
| |
At the moment, we're swallowing the allocation failure. We need to
return the error to the caller.
|
| |
|
|
|
| |
The streaming read functionality should provide the length and the type
of the object, like the normal read functionality does.
|
| |
|
|
|
|
|
|
|
| |
The null OID (hash with all zeroes) indicates a missing object in
upstream git and is thus not a valid object ID. Add defensive
measurements to avoid writing such a hash to the object database in the
very unlikely case where some data results in the null OID. Furthermore,
add shortcuts when reading the null OID from the ODB to avoid ever
returning an object when a faulty repository may contain the null OID.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
|
| |
|
|
|
|
|
|
|
| |
When looking for an object by prefix, we query all the backends so that
we can ensure that there is no ambiguity. We need to reset the `error`
value between backends; otherwise the first backend may find an object
by prefix, but subsequent backends may not. If we do not reset the
`error` value then it will remain at `GIT_ENOTFOUND` and `read_prefix_1`
will fail, despite having actually found an object.
|
| |
|
|
|
|
|
|
|
|
| |
The fields `declared_size` and `received_bytes` of the `git_odb_stream`
are both of type `git_off_t` which is defined as a signed integer. When
passing these values to a printf-style string in
`git_odb_stream__invalid_length`, though, we format these as PRIuZ,
which is unsigned.
Fix the issue by using PRIdZ instead, silencing warnings on macOS.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The `error` variable is used as a return value in the out-section of
both `odb_read_1` and `read_prefix_1`. While the value will actually
always be initialized inside of this section, GCC fails to realize this
due to interactions with the `found` variable: if `found` is set, the
error will always be initialized. If it is not, we return early without
reaching the out-statements.
Shut up the warnings by initializing the error variable, even though it
is unnecessary.
|
| |
|
|
|
|
| |
While the function reading an object from the complete OID already
verifies OIDs, we do not yet do so for reading objects from a partial
OID. Do so when strict OID verification is enabled.
|
| |
|
|
|
|
|
|
|
|
|
|
| |
The read_prefix_1 function has several return statements springled
throughout the code. As we have to free memory upon getting an error,
the free code has to be repeated at every single retrun -- which it is
not, so we have a memory leak here.
Refactor the code to use the typical `goto out` pattern, which will free
data when an error has occurred. While we're at it, we can also improve
the error message thrown when multiple ambiguous prefixes are found. It
will now include the colliding prefixes.
|
| |
|
|
|
|
|
|
|
|
|
| |
Verifying hashsums of objects we are reading from the ODB may be costly
as we have to perform an additional hashsum calculation on the object.
Especially when reading large objects, the penalty can be as high as
35%, as can be seen when executing the equivalent of `git cat-file` with
and without verification enabled. To mitigate for this, we add a global
option for libgit2 which enables the developer to turn off the
verification, e.g. when he can be reasonably sure that the objects on
disk won't be corrupted.
|