| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our code parsing Git patch headers is rather lax in parsing headers of a
Git-style patch. Most notably, we do not care for the exact order in
which header lines appear and as such, we may parse patch files which
are not really valid after all. Furthermore, the state transitions
inside of the parser are not as obvious as they could be, making it
harder than required to follow its logic.
To improve upon this situation, this patch introduces a real state
machine to parse the patches. Instead of simply parsing each line
without caring for previous state and the exact ordering, we define a
set of states with their allowed transitions. This makes the patch
parser more strict in only allowing valid successions of header lines.
As the transition table is defined inside of a single structure with
the expected line, required state as well as the state that we end up
in, all state transitions are immediately obvious from just having a
look at this structure. This improves both maintainability and eases
reasoning about the patch parser.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Next to including several files, our "common.h" header also declares
various macros which are then used throughout the project. As such, we
have to make sure to always include this file first in all
implementation files. Otherwise, we might encounter problems or even
silent behavioural differences due to macros or defines not being
defined as they should be. So in fact, our header and implementation
files should make sure to always include "common.h" first.
This commit does so by establishing a common include pattern. Header
files inside of "src" will now always include "common.h" as its first
other file, separated by a newline from all the other includes to make
it stand out as special. There are two cases for the implementation
files. If they do have a matching header file, they will always include
this one first, leading to "common.h" being transitively included as
first file. If they do not have a matching header file, they instead
include "common.h" as first file themselves.
This fixes the outlined problems and will become our standard practice
for header and source files inside of the "src/" from now on.
|
|
|
|
|
|
|
|
| |
While parsing patch header lines, we iterate over each line and check if
the line has trailing garbage. What we do not check though is that the
line is actually a line ending with a trailing newline.
Fix this by checking the return code of `parse_advance_expected_str`.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In a diff, the shortest possible hunk with a modification (that is, no
deletion) results from a file with only one line with a single character
which is removed. Thus the following hunk
@@ -1 +1 @@
-a
+
is the shortest valid hunk modifying a line. The function parsing the
hunk body though assumes that there must always be at least 4 bytes
present to make up a valid hunk, which is obviously wrong in this case.
The absolute minimum number of bytes required for a modification is
actually 2 bytes, that is the "+" and the following newline. Note: if
there is no trailing newline, the assumption will not be offended as the
diff will have a line "\ No trailing newline" at its end.
This patch fixes the issue by lowering the amount of bytes required.
|
| |
|
| |
|
|
|
|
|
|
| |
When creating and printing diffs, deal with binary deltas that have
binary data specially, versus diffs that have a binary file but lack the
actual binary data.
|
|
|
|
|
| |
Ensure that `git_patch_from_diff` can return the patch for parsed diffs,
not just generate a patch for a generated diff.
|
|
|
|
|
| |
Move `id_abbrev` to a more reasonable place where it packs more nicely
(before anybody starts using it).
|
| |
|
|
|
|
|
|
|
| |
When showing copy information because we are duplicating contents,
for example, when performing a `diff --find-copies-harder -M100 -B100`,
then show copy from/to lines in a patch, and do not show context.
Ensure that we can also parse such patches.
|
|
|
|
|
| |
Patches may have no hunks when there's no modifications (for example,
in a rename). Handle them.
|
| |
|
| |
|
|
|
|
| |
Parse diff files into a `git_diff` structure.
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
Handle prefixes (in terms of number of path components) for patch
parsing.
|
| |
|
| |
|
|
|
|
|
|
| |
When a text file is added or deleted, use the file names from the
`diff --git` header instead of the `---` or `+++` lines. This is
for compatibility with git.
|
|
|
|
|
| |
We may have parsed binary data, set the `SHOW_BINARY` flag which
indicates that we have actually computed a binary diff.
|
| |
|
|
|
|
|
|
| |
Now that `git_diff_delta` data can be produced by reading patch
file data, which may have an abbreviated oid, allow consumers to
know that the id is abbreviated.
|
| |
|
|
|
|
|
| |
No need to replicate the old_file/new_file members, or plumb them
strangely up.
|
|
Patches can now come from a variety of sources - either internally
generated (from diffing two commits) or as the results of parsing
some external data.
|