|  | Commit message | Author | Age | Files | Lines | 
|---|
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Return `GIT_EINVALID` on parse errors so that direct callers of parse
functions can determine when there was a failure to parse the object.
The object parser functions will swallow this error code to prevent it
from propagating down the chain to end-users.  (`git_merge` should not
return `GIT_EINVALID` when a commit it tries to look up is not valid;
this would be too vague to be useful.)
The only public function that this affects is
`git_signature_from_buffer`, which is now documented as returning
`GIT_EINVALID` when appropriate. | 
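
The split described above can be sketched roughly as follows. This is a minimal illustration, not libgit2's code: the `SKETCH_*` codes, both function names, and the "tree " prefix check are hypothetical stand-ins. The low-level parser reports the specific code, and the object-level wrapper maps it to a generic failure before it can reach end-users.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical error codes for this sketch; not libgit2's actual values. */
enum { SKETCH_OK = 0, SKETCH_ERROR = -1, SKETCH_EINVALID = -2 };

/* Low-level parser: direct callers see the specific "invalid" code. */
int sketch_parse_header(const char *buf)
{
    if (buf == NULL || strncmp(buf, "tree ", 5) != 0)
        return SKETCH_EINVALID;
    return SKETCH_OK;
}

/* Object-level wrapper: swallows the specific code so that it does not
 * propagate to callers further down the chain. */
int sketch_object_parse(const char *buf)
{
    int error = sketch_parse_header(buf);
    if (error == SKETCH_EINVALID)
        return SKETCH_ERROR;
    return error;
}

/* Usage: the same bad input yields different codes at the two levels. */
void sketch_error_demo(void)
{
    assert(sketch_parse_header("bogus") == SKETCH_EINVALID);
    assert(sketch_object_parse("bogus") == SKETCH_ERROR);
}
```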
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This change makes calculations of merge-bases a bit faster when there
are complex graphs and the commit times cause visiting nodes multiple
times. This is done by visiting the nodes in the graph in reverse
generation order when the generation number is available instead of
commit timestamp. If the generation number is missing in any pair of
commits, it can safely fall back to the old heuristic with no negative
side-effects.
Part of: #5757 | 
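
A rough sketch of the ordering rule described above, under stated assumptions: the `sketch_commit` struct, the function name, and the use of `0` as the "generation number missing" sentinel are all hypothetical, and "newer timestamp first" is assumed to be the old heuristic.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int64_t time;        /* commit timestamp */
    uint32_t generation; /* assumption: 0 means "not available" */
} sketch_commit;

/* Returns nonzero when `a` should be visited before `b`. */
int sketch_visit_before(const sketch_commit *a, const sketch_commit *b)
{
    /* If either commit lacks a generation number, fall back to the
     * timestamp heuristic for this pair. */
    if (a->generation == 0 || b->generation == 0)
        return a->time > b->time;

    /* Reverse generation order: higher generation is visited first. */
    return a->generation > b->generation;
}

/* Usage: a skewed clock does not affect the order when generation
 * numbers are present on both commits. */
void sketch_visit_demo(void)
{
    sketch_commit child = { 100, 5 };  /* older timestamp, higher generation */
    sketch_commit parent = { 200, 4 }; /* newer timestamp, lower generation */
    assert(sketch_visit_before(&child, &parent));
}
```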
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | This change does a medium-size refactor of the git_commit_graph_file and
the interaction with the ODB. Now instead of the ODB owning a direct
reference to the git_commit_graph_file, there will be an intermediate
git_commit_graph. The main advantage of that is that now end users can
explicitly set a git_commit_graph that is eagerly checked for errors,
while still being able to lazily use the commit-graph in a regular ODB,
if the file is present. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | This change makes revwalks a bit faster by using the `commit-graph` file
(if present). This is thanks to the `commit-graph` allowing much faster
parsing of the commit information by requiring near-zero I/O (aside from
reading a few dozen bytes off of a `mmap(2)`-ed file) for each commit,
instead of having to read the ODB, inflate the commit, and parse it.
This is done by modifying `git_commit_list_parse()` and letting it use
the ODB-owned commit-graph file.
Part of: #5757 | 
| |\  
| | 
| | | DRY commit parsing | 
| | | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | | The commit list's in- and out-degrees are currently stored as `unsigned
short`. When assigning it the value of `git_array_size`, which returns
a `size_t`, this generates a warning on some Win32 platforms due to
losing precision.
We could just cast the returned value of `git_array_size`, which would
work fine for 99.99% of all cases as commits typically have fewer than
2^16 parents. For crafted commits though we might end up with a wrong
value, and thus we should definitely check whether the array size
actually fits into the field.
To ease the check, let's convert the fields to store the degrees as
`uint16_t`. We shouldn't rely on such unspecific types anyway, as it may
lead to different behaviour across platforms. Furthermore, this commit
introduces a new `git__is_uint16` function to check whether it actually
fits -- if not, we return an error. | 
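
The range check described above might look like the following sketch. The names `sketch_is_uint16` and `sketch_set_degree` are hypothetical stand-ins and do not claim to match the actual `git__is_uint16` signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Nonzero if `n` can be stored in a uint16_t without truncation. */
int sketch_is_uint16(size_t n)
{
    return n <= UINT16_MAX;
}

/* Store an array size into a 16-bit degree field, failing instead of
 * silently truncating when the value does not fit. */
int sketch_set_degree(uint16_t *degree, size_t count)
{
    if (!sketch_is_uint16(count))
        return -1;
    *degree = (uint16_t)count;
    return 0;
}

/* Usage: 2^16 parents would wrap to 0 with a plain cast, so reject it. */
void sketch_degree_demo(void)
{
    uint16_t degree = 0;
    assert(sketch_set_degree(&degree, 2) == 0 && degree == 2);
    assert(sketch_set_degree(&degree, 65536) == -1);
}
```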
| | | |  | 
| | | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | 
| | | The function `commit_quick_parse` provides a way to quickly parse
parts of a commit without storing or verifying most of its
metadata. The first thing it does is calculate the number of
parents by skipping "parent " lines until it finds the first
non-parent line. Afterwards, this parent count is passed to
`alloc_parents`, which will allocate an array to store all the
parents.
To calculate the amount of storage required for the parents
array, `alloc_parents` simply multiplies the number of parents
by the respective element's size. This already screams "buffer
overflow", and in fact the problem is made worse by the
result being cast to a `uint32_t`.
In fact, triggering this is possible: git-hash-object(1) will
happily write a commit with multiple millions of parents for you.
I've stopped at 67,108,864 parents as git-hash-object(1)
unfortunately soaks up the complete object without streaming
anything to disk and thus will cause an OOM situation at a later
point. The point here is: this commit was about 4.1GB in size but
compressed down to 24MB and thus easy to distribute.
The above doesn't yet trigger the buffer overflow, though. As the
array's elements are all pointers which are 8 bytes on 64 bit, we
need a total of 536,870,912 parents to trigger the overflow to
`0`. The effect is that we're now underallocating the array
and doing out-of-bounds writes. As the buffer is kindly provided
by the adversary, this may easily result in code execution.
Extrapolating from the test file with 67m parents to the one with
536m parents results in a factor of 8. Thus the uncompressed
contents would be about 32GB in size and the compressed ones
192MB. While still easily distributable via the network, only
servers will have that amount of RAM and not cause an
out-of-memory condition prior to triggering the overflow. This
at least makes this attack not an easy vector for client-side use
of libgit2. | 
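
The arithmetic above can be checked with a small sketch: 536,870,912 is 2^29, and 2^29 * 8 = 2^32, which truncates to `0` in 32 bits, while 67,108,864 (2^26) parents still give a nonzero size. Both helper names below are hypothetical, and the checked variant shows one common way to reject such a multiplication; it is not claimed to be the fix that was actually applied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mimics the problematic computation: multiply, then truncate to 32 bits. */
uint32_t sketch_truncated_size(uint64_t nparents, uint64_t elem_size)
{
    return (uint32_t)(nparents * elem_size);
}

/* Checked multiplication: returns 0 and stores the product on success,
 * or -1 if nparents * elem_size would not fit in a size_t. */
int sketch_checked_size(size_t *out, size_t nparents, size_t elem_size)
{
    if (elem_size != 0 && nparents > SIZE_MAX / elem_size)
        return -1;
    *out = nparents * elem_size;
    return 0;
}

/* Usage: the crafted parent count makes the truncated size collapse to 0. */
void sketch_overflow_demo(void)
{
    size_t size = 0;
    assert(sketch_truncated_size(UINT64_C(536870912), 8) == 0);
    assert(sketch_checked_size(&size, SIZE_MAX / 8 + 1, 8) == -1);
}
```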
| |/ |  | 
| | 
| 
| 
| 
| | Move to the `git_error` name in the internal API for error-related
functions. | 
| | 
| 
| 
| | Use the new object_type enumeration names within the codebase. | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| | When quick-parsing a commit, we use `git__strtol64` to parse the
commit's time. The buffer that's passed to `commit_quick_parse` is the
raw data of an ODB object, though, whose data may not be properly
formatted and also does not have to be `NUL` terminated. This may lead
to out-of-bounds reads.
Use `git__strntol64` to avoid this problem. | 
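
To illustrate why a length-bounded parser avoids the problem, here is a minimal sketch of parsing a decimal timestamp from a buffer that is not `NUL` terminated. The function `sketch_parse_time` and its signature are hypothetical and do not reproduce `git__strntol64`; the sketch only handles non-negative decimal values.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Parse a non-negative decimal number from at most `len` bytes of `buf`.
 * Never reads past buf[len - 1], so no NUL terminator is required.
 * Returns 0 on success, -1 if there are no digits or the value overflows. */
int sketch_parse_time(int64_t *out, const char *buf, size_t len, size_t *consumed)
{
    int64_t value = 0;
    size_t i = 0;

    while (i < len && buf[i] >= '0' && buf[i] <= '9') {
        int digit = buf[i] - '0';
        /* Reject values that would exceed INT64_MAX. */
        if (value > (INT64_MAX - digit) / 10)
            return -1;
        value = value * 10 + digit;
        i++;
    }

    if (i == 0)
        return -1;

    *out = value;
    *consumed = i;
    return 0;
}

/* Usage: the buffer holds exactly three bytes and no terminator. */
void sketch_parse_demo(void)
{
    const char raw[3] = { '4', '2', '7' };
    int64_t t = 0;
    size_t n = 0;
    assert(sketch_parse_time(&t, raw, sizeof(raw), &n) == 0);
    assert(t == 427 && n == 3);
}
```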
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as their first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on. | 
| | 
| 
| 
| 
| 
| 
| 
| | Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one | 
| | 
| 
| 
| 
| | This returns the integer-cast truth value comparing the dates. What we
want instead is a (-1, 0, 1) output depending on how they compare. | 
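
The difference can be sketched as follows. Both function names are hypothetical, and the ascending direction of the three-way result is an assumption of the sketch rather than the ordering the commit list actually uses.

```c
#include <assert.h>
#include <stdint.h>

/* Truth-value comparison: only ever yields 0 or 1, so a sort or priority
 * queue cannot tell "equal" apart from "greater". */
int sketch_time_lt(int64_t a, int64_t b)
{
    return a < b;
}

/* Three-way comparison: -1 if a < b, 0 if equal, 1 if a > b. */
int sketch_time_cmp(int64_t a, int64_t b)
{
    return (a > b) - (a < b);
}

/* Usage: the truth-value form gives the same answer for "equal" and
 * "greater", while the three-way form distinguishes all cases. */
void sketch_cmp_demo(void)
{
    assert(sketch_time_lt(5, 5) == sketch_time_lt(6, 5));
    assert(sketch_time_cmp(5, 5) != sketch_time_cmp(6, 5));
}
```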
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| | We moved the "main" parsing to use 64 bits for the timestamp, but the
quick parsing for the revwalk did not. This means that for large
timestamps we fail to parse the time and thus the walk.
Move this parser to use 64 bits as well. | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| 
| 
| 
| 
| | I accidentally wrote a separate priority queue implementation when
I was working on file rename detection as part of the file hash
signature calculation code.  To simplify licensing terms, I just
adapted that to a general purpose priority queue and replaced the
old priority queue implementation that was borrowed from elsewhere.
This also removes parts of the COPYING document that no longer
apply to libgit2. | 
| | 
| 
| | git-core prefers younger merge bases over older ones when multiple valid merge bases exist. | 
| | 
| 
| 
| 
| | This uses the odb object accessors so we can change the internals
more easily... | 
| | |  | 
| | |  | 
| | |  | 
| | 
| 
| 
| 
| 
| 
| | `revwalk.h:commit_lookup()` -> `git_revwalk__commit_lookup()`
and make `git_commit_list_parse()` do real error checking that
the item in the list is an actual commit object.  Also fixed an
apparent typo in a test name. | 
|  | In so doing, promote commit_list to git_commit_list,
with its own internal API header. |