| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Introduce `git_blob_data_is_binary` to examine a blob's data, instead of
the blob itself. A replacement for `git_buf_is_binary`.
|
|
|
|
|
|
| |
Introduce `git_fs_path`, which operates on generic filesystem paths.
`git_path` will be kept for only git-specific path functionality (for
example, checking for `.git` in a path).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgit2 has two distinct requirements that were previously solved by
`git_buf`. We require:
1. A general purpose string class that provides a number of utility APIs
for manipulating data (eg, concatenating, truncating, etc).
2. A structure that we can use to return strings to callers that they
can take ownership of.
By using a single class (`git_buf`) for both of these purposes, we have
confused the API to the point that refactorings are difficult and
reasoning about correctness is also difficult.
Move the utility class `git_buf` to be called `git_str`: this represents
its general purpose, as an internal string buffer class. The name also
is an homage to Junio Hamano ("gitstr").
The public API remains `git_buf`, and has a much smaller footprint. It
is generally only used as an "out" param with strict requirements that
follow the documentation. (Exceptions exist for some legacy APIs to
avoid breaking callers unnecessarily.)
Utility functions exist to convert a user-specified `git_buf` to a
`git_str` so that we can call internal functions, then converting it
back again.
|
|
|
|
|
|
| |
Resolve absolute paths to be working directory relative when looking up
attributes. Importantly, now we will _never_ pass an absolute path down
to attribute lookup functions.
|
|
|
|
|
|
|
| |
Using a `git_oid *` in filter options was a mistake; it is a deviation
from our typical pattern, and callers in some languages that GC may need
very special treatment in order to pass both an options structure and a
pointer outside of it.
|
|
|
|
|
|
| |
the filtering code to ensure the cached longpath setting is returned.
Fixes: #6054
|
|
|
|
|
| |
Provide a mechanism to filter using attribute data from a specific
commit (making use of `GIT_ATTR_CHECK_INCLUDE_COMMIT`).
|
|
|
|
|
| |
The `git_buf_text` namespace is unnecessary and strange. Remove it,
just keep the functions prefixed with `git_buf`.
|
|
|
|
|
| |
Use `git_repository_workdir_path` to generate workdir paths since it
will validate the length.
|
|\
| |
| | |
blob: fix name of `GIT_BLOB_FILTER_ATTRIBUTES_FROM_HEAD`
|
| |
| |
| |
| |
| |
| | |
`GIT_BLOB_FILTER_ATTTRIBUTES_FROM_HEAD` is misspelled, it should be
`GIT_BLOB_FILTER_ATTRIBUTES_FROM_HEAD`, and it would be if it were not
for the MacBook Pro keyboard and my inattentiveness.
|
|/
|
|
|
|
| |
The `git_blob_filter_options_init` function should be included, to allow
callers in FFI environments to let us initialize an options structure
for them.
|
|
|
|
|
|
| |
`git_buf_sanitize` is called with user-input, and wants to sanity-check
that input. Allow it to return a value if the input was malformed in a
way that we cannot cope.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiling libgit2 with -DDEPRECATE_HARD, we add a preprocessor
definition `GIT_DEPRECATE_HARD` which causes the "git2/deprecated.h"
header to be empty. As a result, no function declarations are made
available to callers, but the implementations are still available to
link against. This has the problem that function declarations also
aren't visible to the implementations, meaning that the symbol's
visibility will not be set up correctly. As a result, the resulting
library may not expose those deprecated symbols at all on some platforms
and thus cause linking errors.
Fix the issue by conditionally compiling deprecated functions, only.
While it becomes impossible to link against such a library in case one
uses deprecated functions, distributors of libgit2 aren't expected to
pass -DDEPRECATE_HARD anyway. Instead, users of libgit2 should manually
define GIT_DEPRECATE_HARD to hide deprecated functions. Using "real"
hard deprecation still makes sense in the context of CI to test we don't
use deprecated symbols ourselves and in case a dependant uses libgit2 in
a vendored way and knows it won't ever use any of the deprecated symbols
anyway.
|
|
|
|
|
| |
Instead of using a signed type (`off_t`) use a new `git_object_size_t`
for the sizes of objects.
|
|
|
|
|
|
|
| |
When `GIT_BLOB_FILTER_ATTTRIBUTES_FROM_HEAD` is passed to
`git_blob_filter`, read attributes from `gitattributes` files that
are checked in to the repository at the HEAD revision. This passes
the flag `GIT_FILTER_ATTRIBUTES_FROM_HEAD` to the filter functions.
|
|
|
|
|
|
|
|
| |
Introduce `GIT_BLOB_FILTER_NO_SYSTEM_ATTRIBUTES`, which tells
`git_blob_filter` to ignore the system-wide attributes file, usually
`/etc/gitattributes`.
This simply passes the appropriate flag to the attribute loading code.
|
|
|
|
| |
Users should now use `git_blob_filter`.
|
|
|
|
|
| |
Provide a function to filter blobs that allows for more functionality
than the existing `git_blob_filtered_content` function.
|
|
|
|
|
|
| |
The majority of functions are named `from_something` (with an
underscore) instead of `fromsomething`. Update the blob functions for
consistency with the rest of the library.
|
|
|
|
|
|
| |
Our blob size is a `git_off_t`, which is a signed 64 bit int. This may
be erroneously negative or larger than `SIZE_MAX`. Ensure that the blob
size fits into a `size_t` before casting.
|
|
|
|
|
| |
Move to the `git_error` name in the internal API for error-related
functions.
|
|
|
|
| |
Use the new object_type enumeration names within the codebase.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, parsing objects is strictly tied to having an ODB object
available. This makes it hard to parse an object when all that is
available is its raw object and size. Furthermore, hacking around that
limitation by directly creating an ODB structure either on stack or on
heap does not really work that well due to ODB objects being reference
counted and then automatically free'd when reaching a reference count of
zero.
In some occasions parsing raw objects without touching the ODB
is actually recuired, though. One use case is for example object
verification, where we want to assure that an object is valid before
inserting it into the ODB or writing it into the git repository.
Asa first step towards that, introduce a distinction between raw and ODB
objects for blobs. Creation of ODB objects stays the same by simply
using `git_blob__parse`, but a new function `git_blob__parse_raw` has
been added that creates a blob from a pair of data and size. By setting
a new flag inside of the blob, we can now distinguish whether it is a
raw or ODB object now and treat it accordingly in several places.
Note that the blob data passed in is not being copied. Because of that,
callers need to make sure to keep it alive during the blob's life time.
This is being used to avoid unnecessarily increasing the memory
footprint when parsing largish blobs.
|
|
|
|
|
|
|
|
|
| |
Going forward, we will have to change how blob sizes are calculated
based on whether the blob is a cahed object part of the ODB or not. In
order to not have to distinguish between those two object types
repeatedly when accessing the blob's data or size, encapsulate all
existing direct uses of those fields by instead using
`git_blob_rawcontent` and `git_blob_rawsize`.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recent introduction of the commondir variable of a repository
requires callers to distinguish whether their files are part of
the dot-git directory or the common directory shared between
multpile worktrees. In order to take the burden from callers and
unify knowledge on which files reside where, the
`git_repository_item_path` function has been introduced which
encapsulate this knowledge.
Modify most existing callers of `git_repository_path` to use
`git_repository_item_path` instead, thus making them implicitly
aware of the common directory.
|
|
|
|
|
|
|
|
| |
Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
|
|
|
|
|
|
| |
The callback mechanism makes it awkward to write data from an IO
source; move to `_fromstream()` which lets the caller remain in control,
in the same vein as we prefer iterators over foreach callbacks.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The pair of `git_blob_create_frombuffer()` and
`git_blob_create_frombuffer_commit()` is meant to replace
`git_blob_create_fromchunks()` by providing a way for a user to write a
new blob when they want filtering or they do not know the size.
This approach allows the caller to retain control over when to add data
to this buffer and a more natural fit into higher-level language's own
stream abstractions instead of having to handle IO wait in the callback.
The in-memory buffer size of 2MB is chosen somewhat arbitrarily to be a
round multiple of usual page sizes and a value where most blobs seem
likely to be either going to be way below or way over that size. It's
also a round number of pages.
This implementation re-uses the helper we have from `_fromchunks()` so
we end up writing everything to disk, but hopefully more efficiently
than with a default filebuf. A later optimisation can be to avoid
writing the in-memory contents to disk, with some extra complexity.
|
|
|
|
|
| |
This also affects `git_index_add_bypath()` by providing a better error
message and a specific error code when a directory is passed.
|
|
|
|
|
|
|
|
|
|
| |
Restricting files to size_t is a silly limitation. The loose backend
writes to a file directly, so there is no issue in using 63 bits for the
size.
We still assume that the header is going to fit in 64 bytes, which does
mean quite a bit smaller files due to the run-length encoding, but it's
still a much larger size than you would want Git to handle.
|
| |
|
|
|
|
|
| |
For consistency with the rest of the library, where an opt is an
options *structure*.
|
|
|
|
|
|
| |
Provide a convenience function that creates a buffer that can be provided
to callers but will not be freed via `git_buf_free`, so the buffer
creator maintains the allocation lifecycle of the buffer's contents.
|
| |
|
|
|
|
|
|
|
|
|
| |
Diff and status do not want core.safecrlf to actually raise an
error regardless of the setting, so this extends the filter API
with an additional options flags parameter and adds a flag so that
filters can be applied with GIT_FILTER_OPT_ALLOW_UNSAFE, indicating
that unsafe filter application should be downgraded from a failure
to a warning.
|
| |
|
| |
|
| |
|
|
|
|
|
|
|
| |
The callback to supply data chunks could return a negative value
to stop creation of the blob, but we were neither using GIT_EUSER
nor propagating the return value. This makes things use the new
behavior of returning the negative value back to the user.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the git_buf struct that was used internally into an
externally available structure and eliminates the git_buffer.
As part of that, some of the special cases that arose with the
externally used git_buffer were blended into the git_buf, such as
being careful about git_buf objects that may have a NULL ptr and
allowing for bufs with a valid ptr and size but zero asize as a
way of referring to externally owned data.
|
|
|
|
|
|
|
| |
This adds the ident filter (that knows how to replace $Id$) and
tweaks the filter APIs and code so that git_filter_source objects
actually have the updated OID of the object being filtered when
it is a known value.
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the git_filter_list into the public API so that users
can create, apply, and dispose of filter lists. This allows more
granular application of filters to user data outside of libgit2
internals.
This also converts all the internal usage of filters to the public
APIs along with a few small tweaks to make it easier to use the
public git_buffer stuff alongside the internal git_buf.
|
|
|
|
|
|
|
| |
This creates include/sys/filter.h with a basic definition of a
git_filter and then converts the internal code to use it. There
are related internal objects (git_filter_list) that we will want
to publish at some point, but this is a first step.
|
|
|
|
|
|
|
|
|
|
| |
This begins the process of exposing git_filter objects to the
public API. This includes:
* new public type and API for `git_buffer` through which an
allocated buffer can be passed to the user
* new API `git_blob_filtered_content`
* make the git_filter type and GIT_FILTER_TO_... constants public
|
|
|
|
|
|
| |
This is in preparation for moving the hashing to the frontend, which
requires us to handle the incoming data before passing it to the
backend's stream.
|