| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libgit2 has two distinct requirements that were previously solved by
`git_buf`. We require:
1. A general purpose string class that provides a number of utility APIs
for manipulating data (eg, concatenating, truncating, etc).
2. A structure that we can use to return strings to callers that they
can take ownership of.
By using a single class (`git_buf`) for both of these purposes, we have
confused the API to the point that refactorings are difficult and
reasoning about correctness is also difficult.
Move the utility class `git_buf` to be called `git_str`: this represents
its general purpose, as an internal string buffer class. The name also
is an homage to Junio Hamano ("gitstr").
The public API remains `git_buf`, and has a much smaller footprint. It
is generally only used as an "out" param with strict requirements that
follow the documentation. (Exceptions exist for some legacy APIs to
avoid breaking callers unnecessarily.)
Utility functions exist to convert a user-specified `git_buf` to a
`git_str` so that we can call internal functions, then converting it
back again.
|
|
|
|
| |
The `repo` argument is now unnecessary. Remove it.
|
|
|
|
| |
The attribute source object is now the type and the path.
|
|
|
|
|
|
| |
The enum `git_attr_file_source` is better suffixed with a `_t` since
it's a type-of source. Similarly, its members should have a matching
name.
|
|\
| |
| | |
fix check for ignoring of negate rules
|
| | |
|
| |
| |
| | |
Co-authored-by: lhchavez <lhchavez@lhchavez.com>
|
| |
| |
| | |
Co-authored-by: lhchavez <lhchavez@lhchavez.com>
|
| |
| |
| |
| |
| |
| | |
gitignore rule like "*.pdf" should be negated by "!dir/test.pdf" even though one is inside a directory since *.pdf isn't restricted to any directory
this is fixed by making wildcard match treat * like ** by excluding WM_PATHNAME flag when the rule isn't for a full path
|
| | |
|
|/
|
|
|
|
| |
We should allow attribute files - inside working directories - to have
names longer than MAX_PATH when core.longpaths is set.
`git_attr_path__init` takes a repository to validate the path with.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
problem:
filesystem_iterator loads .gitignore files in top-down order.
subsequently, ignore module evaluates them in the order they are loaded.
this creates a problem if we have unignored a rule (using a wild card)
in a sub dir and ignored it again in a level further below (see the test
included in this patch).
solution:
process ignores in reverse order.
closes #4963
|
|\
| |
| | |
ignore: fix determining whether a shorter pattern negates another
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When computing whether we need to store a negative pattern, we iterate
through all previously known patterns and check whether the negative
pattern undoes any of the previous ones. In doing so we call `wildmatch`
and check it's return for any negative error values. If there was a
negative return, we will abort and bubble up that error to the caller.
In fact, this check for negative values stems from the time where we
still used `fnmatch` instead of `wildmatch`. For `fnmatch`, negative
values indicate a "real" error, while for `wildmatch` a negative value
may be returned if the matching was prematurely aborted. A premature
abort may for example also happen if the pattern matches a prefix of the
haystack if the pattern is shorter. Returning an error in that case is
the wrong thing to do.
Fix the code to compare for equality with `WM_MATCH`, only. Negative
values returned by `wildmatch` are perfectly fine and thus should be
ignored. Add a test that verifies we do not see the error.
|
| |
| |
| |
| |
| | |
`cvar` is an unhelpful name. Refactor its usage to `configmap` for more
clarity.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now, we are unconditionally applying all macros found in a
gitatttributes file. But quoting gitattributes(5):
Custom macro attributes can be defined only in top-level
gitattributes files ($GIT_DIR/info/attributes, the .gitattributes
file at the top level of the working tree, or the global or
system-wide gitattributes files), not in .gitattributes files in
working tree subdirectories. The built-in macro attribute "binary"
is equivalent to:
So gitattribute files in subdirectories of the working tree may
explicitly _not_ contain macro definitions, but we do not currently
enforce this limitation.
This patch introduces a new parameter to the gitattributes parser that
tells whether macros are allowed in the current file or not. If set to
`false`, we will still parse macros, but silently ignore them instead of
adding them to the list of defined macros. Update all callers to
correctly determine whether the to-be-parsed file may contain macros or
not. Most importantly, when walking up the directory hierarchy, we will
only set it to `true` once it reaches the root directory of the repo
itself.
Add a test that verifies that we are indeed not applying macros from
subdirectories. Previous to these changes, the test would've failed.
|
|
|
|
|
| |
As with the preceding commit, the ignore code tries to load code from
info/exclude, and we fail to ignore a non-existent file here.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Upstream git has converted to use `wildmatch` instead of
`fnmatch`. Convert our gitattributes logic to use `wildmatch` as
the last user of `fnmatch`. Please, don't expect I know what I'm
doing here: the fnmatch parser is one of the most fun things to
play around with as it has a sh*tload of weird cases. In all
honesty, I'm simply relying on our tests that are by now rather
comprehensive in that area.
The conversion actually fixes compatibility with how git.git
parser "**" patterns when the given path does not contain any
directory separators. Previously, a pattern "**.foo" erroneously
wouldn't match a file "x.foo", while git.git would match.
Remove the new-unused LEADINGDIR/NOLEADINGDIR flags for
`git_attr_fnmatch`.
|
|
|
|
|
|
|
|
|
|
| |
Upstream git.git has converted its codebase to use wildcard in
favor of fnmatch in commit 70a8fc999d (stop using fnmatch (either
native or compat), 2014-02-15). To keep our own regex-matching in
line with what git does, convert all trivial instances of
`fnmatch` usage to use `wildcard`, instead. Trivial usage is
defined to be use of `fnmatch` with either no flags or flags that
have a 1:1 equivalent in wildmatch (PATHNAME, IGNORECASE).
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The function `git_ignore_path_is_ignored` is there to test the
ignore status of paths that need not necessarily exist inside of
a repository. This has the implication that for a given path, we
cannot always decide whether it references a directory or a file,
and we need to distinguish those cases because ignore rules may
treat those differently. E.g. given the following gitignore file:
*
!/**/
we'd only want to unignore directories, while keeping files
ignored. But still, calling `git_ignore_path_is_ignored("dir/")`
will say that this directory is ignored because it treats "dir/"
as a file path.
As said, the `is_ignored` function cannot always decide whether
the given path is a file or directory, and thus it may produce
wrong results in some cases. While this is unfixable in the
general case, we can do better when we are being passed a path
name with a trailing path separator (e.g. "dir/") and always
treat them as directories.
|
|
|
|
|
| |
Move to the `git_error` name in the internal API for error-related
functions.
|
|\
| |
| | |
pack: rename `git_packfile_stream_free`
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When checking whether a rule negates another rule, we were checking
whether a rule had the `GIT_ATTR_FNMATCH_LEADINGDIR` flag set and, if
so, added a "/*" to its end before passing it to `fnmatch`. Our code now
sets `GIT_ATTR_FNMATCH_NOLEADINGDIR`, thus the `LEADINGDIR` flag shall
never be set. Furthermore, due to the `NOLEADINGDIR` flag, trailing
globs do not get consumed by our ignore parser anymore.
Clean up code by just dropping this now useless logic.
|
|/
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When computing whether a file is ignored, we simply search for the first
matching rule and return whether it is a positive ignore rule (the file
is really ignored) or whether it is a negative ignore rule (the file is
being unignored). Each rule has a set of flags which are being passed to
`fnmatch`, depending on what kind of rule it is. E.g. in case it is a
negative ignore we add a flag `GIT_ATTR_FNMATCH_NEGATIVE`, in case it
contains a glob we set the `GIT_ATTR_FNMATCH_HASGLOB` flag.
One of these flags is the `GIT_ATTR_FNMATCH_LEADINGDIR` flag, which is
always set in case the pattern has a trailing "/*" or in case the
pattern is negative. The flag causes the `fnmatch` function to return a
match in case a string is a leading directory of another, e.g. "dir/"
matches "dir/foo/bar.c". In case of negative patterns, this is wrong in
certain cases.
Take the following simple example of a gitignore:
dir/
!dir/
The `LEADINGDIR` flag causes "!dir/" to match "dir/foo/bar.c", and we
correctly unignore the directory. But take this example:
*.test
!dir/*
We expect everything in "dir/" to be unignored, but e.g. a file in a
subdirectory of dir should be ignored, as the "*" does not cross
directory hierarchies. With `LEADINGDIR`, though, we would just see that
"dir/" matches and return that the file is unignored, even if it is
contained in a subdirectory. Instead, we want to ignore leading
directories here and check "*.test". Afterwards, we have to iterate up
to the parent directory and do the same checks.
To fix the issue, disallow matching against leading directories in
gitignore files. This can be trivially done by just adding the
`GIT_ATTR_FNMATCH_NOLEADINGDIR` to the spec passed to
`git_attr_fnmatch__parse`. Due to a bug in that function, though, this
flag is being ignored for negative patterns, which is fixed in this
commit, as well. As a last fix, we need to ignore rules that are
supposed to match a directory when our path itself is a file.
All together, these changes fix the described error case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When comparing whether a path matches a directory rule, we pass the
both the path and directory name to `fnmatch` with
`GIT_ATTR_FNMATCH_DIRECTORY` being set. `fnmatch` expects the pattern to
contain no trailing directory '/', which is why we try to always strip
patterns of trailing slashes. We do not handle that case correctly
though when the pattern itself has trailing spaces, causing the match to
fail.
Fix the issue by stripping trailing spaces and tabs for a rule previous
to checking whether the pattern is a directory pattern with a trailing
'/'. This replaces the whitespace-stripping in our ignore file parsing
code, which was stripping whitespaces too late. Add a test to catch
future breakage.
|
|
|
|
|
|
|
|
| |
The win32 C library is compiled cdecl, however when configured with
`STDCALL=ON`, our functions (and function pointers) will use the stdcall
calling convention. You cannot set a `__stdcall` function pointer to a
`__cdecl` function, so it's easier to just use our `git__strncmp`
instead of sorting that mess out.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Depending on whether the path we want to look up an attribute for is a
file or a directory, the fnmatch function will be called with different
flags. Because of this, we have to first stat(3) the path to determine
whether it is a file or directory in `git_attr_path__init`. This is
wasteful though in bare repositories, where we can already be assured
that the path will never exist at all due to there being no worktree. In
this case, we will execute an unnecessary syscall, which might be
noticeable on networked file systems.
What happens right now is that we always pass the `GIT_DIR_FLAG_UNKOWN`
flag to `git_attr_path__init`, which causes it to `stat` the file itself
to determine its type. As it is calling `git_path_isdir` on the path,
which will always return `false` in case the path does not exist, we end
up with the path always being treated as a file in case of a bare
repository. As such, we can just check the bare-repository case in all
callers and then pass in `GIT_DIR_FLAG_FALSE` ourselves, avoiding the
need to `stat`. While this may not always be correct, it at least is no
different from our current behavior.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When computing negative ignores, we throw away any rule which does not
undo a previous rule to optimize. But on case insensitive file systems,
we need to keep in mind that a negative ignore can also undo a previous
rule with different case, which we did not yet honor while determining
whether a rule undoes a previous one. So in the following example, we
fail to unignore the "/Case" directory:
/case
!/Case
Make both paths checking whether a plain- or wildcard-based rule undo a
previous rule aware of case-insensitivity. This fixes the described
issue.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Ignore rules allow for reverting a previously ignored rule by prefixing
it with an exclamation mark. As such, a negative rule can only override
previously ignored files. While computing all ignore patterns, we try to
use this fact to optimize away some negative rules which do not override
any previous patterns, as they won't change the outcome anyway.
In some cases, though, this optimization causes us to get the actual
ignores wrong for some files. This may happen whenever the pattern
contains a wildcard, as we are unable to reason about whether a pattern
overrides a previous pattern in a sane way. This happens for example in
the case where a gitignore file contains "*.c" and "!src/*.c", where we
wouldn't un-ignore files inside of the "src/" subdirectory.
In this case, the first solution coming to mind may be to just strip the
"src/" prefix and simply compare the basenames. While that would work
here, it would stop working as soon as the basename pattern itself is
different, like for example with "*x.c" and "!src/*.c. As such, we
settle for the easier fix of just not optimizing away rules that contain
a wildcard.
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
|
|
|
|
|
| |
Some implementation files were missing the license headers. This commit
adds them.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The recent introduction of the commondir variable of a repository
requires callers to distinguish whether their files are part of
the dot-git directory or the common directory shared between
multpile worktrees. In order to take the burden from callers and
unify knowledge on which files reside where, the
`git_repository_item_path` function has been introduced which
encapsulate this knowledge.
Modify most existing callers of `git_repository_path` to use
`git_repository_item_path` instead, thus making them implicitly
aware of the common directory.
|
|
|
| |
Otherwise we'll NULL-dereference in git_attr_cache__init
|
|
|
|
|
|
|
|
| |
Error messages should be sentence fragments, and therefore:
1. Should not begin with a capital letter,
2. Should not conclude with punctuation, and
3. Should not end a sentence and begin a new one
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The .gitignore file allows for patterns which unignore previous
ignore patterns. When unignoring a previous pattern, there are
basically three cases how this is matched when no globbing is
used:
1. when a previous file has been ignored, it can be unignored by
using its exact name, e.g.
foo/bar
!foo/bar
2. when a file in a subdirectory has been ignored, it can be
unignored by using its basename, e.g.
foo/bar
!bar
3. when all files with a basename are ignored, a specific file
can be unignored again by specifying its path in a
subdirectory, e.g.
bar
!foo/bar
The first problem in libgit2 is that we did not correctly treat
the second case. While we verified that the negative pattern
matches the tail of the positive one, we did not verify if it
only matches the basename of the positive pattern. So e.g. we
would have also negated a pattern like
foo/fruz_bar
!bar
Furthermore, we did not check for the third case, where a
basename is being unignored in a certain subdirectory again.
Both issues are fixed with this commit.
|
|
|
|
|
|
| |
If we're looking for a symlink, realpath will give us the resolved path,
which is not what we're after, but a canonicalized version of the path
the user asked for.
|
|
|
|
|
|
| |
The previous commit left the comment referencing the earlier state of
the code, change it to explain the current logic. While here, change the
logic to avoid repeating the copy of the base pattern.
|
| |
|
|
|
|
|
|
| |
When we discover that we want to keep a negative rule, make sure to
clear the error variable, as it we otherwise return whatever was left by
the previous loop iteration.
|
|
|
|
| |
Minimizing the number directory and file opens, minimizes the amount of IO thus reducing the overall cost of performing ignore operations.
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
During checkout, assume that the .gitattributes files aren't
modified during the checkout. Instead, create an "attribute session"
during checkout. Assume that attribute data read in the same
checkout "session" hasn't been modified since the checkout started.
(But allow subsequent checkouts to invalidate the cache.)
Further, cache nonexistent git_attr_file data even when .gitattributes
files are not found to prevent re-scanning for nonexistent files.
|
|
|
|
|
|
|
|
|
|
|
| |
A rule can only negate something which was explicitly mentioned in the
rules before it. Change our parsing to ignore a negative rule which does
not negate something mentioned in the rules above it.
While here, fix a wrong allocator usage. The memory for the match string
comes from pool allocator. We must not free it with the general
allocator. We can instead simply forget the string and it will be
cleaned up.
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The diff code was using an "ignored_prefix" directory to track if
a parent directory was ignored that contained untracked files
alongside tracked files. Unfortunately, when negative ignore rules
were used for directories inside ignored parents, the wrong rules
were applied to untracked files inside the negatively ignored
child directories.
This commit moves the logic for ignore containment into the workdir
iterator (which is a better place for it), so the ignored-ness of
a directory is contained in the frame stack during traversal. This
allows a child directory to override with a negative ignore and yet
still restore the ignored state of the parent when we traverse out
of the child.
Along with this, there are some problems with "directory only"
ignore rules on container directories. Given "a/*" and "!a/b/c/"
(where the second rule is a directory rule but the first rule is
just a generic prefix rule), then the directory only constraint
was having "a/b/c/d/file" match the first rule and not the second.
This was fixed by having ignore directory-only rules test a rule
against the prefix of a file with LEADINGDIR enabled.
Lastly, spot checks for ignores using `git_ignore_path_is_ignored`
were tested from the top directory down to the bottom to deal with
the containment problem, but this is wrong. We have to test bottom
to top so that negative subdirectory rules will be checked before
parent ignore rules.
This does change the behavior of some existing tests, but it seems
only to bring us more in line with core Git, so I think those
changes are acceptable.
|