path: root/src/patch_parse.c
Commit message | Author | Age | Files | Lines
...
* patch_parse: implement state machine for parsing patch headers | Patrick Steinhardt | 2017-08-25 | 1 | -46/+81

  Our code parsing Git patch headers is rather lax in parsing headers of a Git-style patch. Most notably, we do not care for the exact order in which header lines appear and as such, we may parse patch files which are not really valid after all. Furthermore, the state transitions inside of the parser are not as obvious as they could be, making it harder than required to follow its logic.

  To improve upon this situation, this patch introduces a real state machine to parse the patches. Instead of simply parsing each line without caring for previous state and the exact ordering, we define a set of states with their allowed transitions. This makes the patch parser more strict in only allowing valid successions of header lines. As the transition table is defined inside of a single structure with the expected line, required state as well as the state that we end up in, all state transitions are immediately obvious from just having a look at this structure. This improves both maintainability and eases reasoning about the patch parser.
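The table-driven approach described in this commit message can be sketched roughly as follows. This is only an illustration under stated assumptions: the state names, the `header_transition` structure, and the functions `header_step` and `headers_valid` are hypothetical and are not taken from libgit2, and the table covers only a handful of header lines.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical parser states; the names are illustrative, not libgit2's. */
typedef enum {
	STATE_START,
	STATE_DIFF,
	STATE_INDEX,
	STATE_OLD_PATH,
	STATE_NEW_PATH,
	STATE_END
} header_state;

/* One table row: the expected line prefix, the state required to accept
 * it, and the state the parser ends up in afterwards. */
typedef struct {
	const char *prefix;
	header_state expected;
	header_state next;
} header_transition;

/* Assumed, simplified subset of header lines. */
static const header_transition transitions[] = {
	{ "diff --git ", STATE_START,    STATE_DIFF },
	{ "index ",      STATE_DIFF,     STATE_INDEX },
	{ "--- ",        STATE_DIFF,     STATE_OLD_PATH },
	{ "--- ",        STATE_INDEX,    STATE_OLD_PATH },
	{ "+++ ",        STATE_OLD_PATH, STATE_NEW_PATH },
	{ "@@ -",        STATE_NEW_PATH, STATE_END },
};

/* Returns 0 and updates *state if `line` is allowed in the current state,
 * -1 if no table row permits it. */
int header_step(header_state *state, const char *line)
{
	size_t i;

	for (i = 0; i < sizeof(transitions) / sizeof(transitions[0]); i++) {
		const header_transition *t = &transitions[i];

		if (t->expected == *state &&
		    strncmp(line, t->prefix, strlen(t->prefix)) == 0) {
			*state = t->next;
			return 0;
		}
	}

	return -1;
}

/* Returns 0 only if every line is a valid successor of the previous one
 * and the sequence reaches the end state. */
int headers_valid(const char *const *lines, size_t count)
{
	header_state state = STATE_START;
	size_t i;

	for (i = 0; i < count; i++)
		if (header_step(&state, lines[i]) < 0)
			return -1;

	return state == STATE_END ? 0 : -1;
}
```

With this table, a header sequence in which `+++` precedes `---` is rejected, because no row accepts `+++ ` while the parser is still in the state reached after `diff --git`.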
* Make sure to always include "common.h" first | Patrick Steinhardt | 2017-07-03 | 1 | -1/+3

  Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first.

  This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as its first other file, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves.

  This fixes the outlined problems and will become our standard practice for header and source files inside of the "src/" from now on.
* patch_parse: check if advancing over header newline succeeds | Patrick Steinhardt | 2017-03-21 | 1 | -2/+2

  While parsing patch header lines, we iterate over each line and check if the line has trailing garbage. What we do not check though is that the line is actually a line ending with a trailing newline. Fix this by checking the return code of `parse_advance_expected_str`.
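A minimal sketch of the kind of check this commit message describes, assuming a hypothetical `line_ctx` cursor and a hypothetical `advance_expected` helper. The real `parse_advance_expected_str` and its signature are not shown in this log, so nothing here should be read as libgit2's code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical cursor over the unparsed remainder of one header line. */
typedef struct {
	const char *line;
	size_t len;
} line_ctx;

/* Hypothetical helper: consumes `expected` from the cursor.
 * Returns 0 on success, -1 if the remainder does not start with it. */
int advance_expected(line_ctx *ctx, const char *expected)
{
	size_t n = strlen(expected);

	if (ctx->len < n || memcmp(ctx->line, expected, n) != 0)
		return -1;

	ctx->line += n;
	ctx->len -= n;
	return 0;
}

/* Once the header value has been consumed, the rest of the line must be
 * exactly one newline. Ignoring the return value of advance_expected
 * would let a line without a trailing newline slip through. */
int finish_header_line(line_ctx *ctx)
{
	if (advance_expected(ctx, "\n") < 0)
		return -1;

	return ctx->len == 0 ? 0 : -1;
}
```

The point of the sketch is the first `if` in `finish_header_line`: the result of advancing over the newline is checked rather than discarded.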
* patch_parse: fix parsing minimal trailing diff line | Patrick Steinhardt | 2017-03-14 | 1 | -2/+3

  In a diff, the shortest possible hunk with a modification (that is, no deletion) results from a file with only one line with a single character which is removed. Thus the following hunk

      @@ -1 +1 @@
      -a
      +

  is the shortest valid hunk modifying a line. The function parsing the hunk body though assumes that there must always be at least 4 bytes present to make up a valid hunk, which is obviously wrong in this case. The absolute minimum number of bytes required for a modification is actually 2 bytes, that is the "+" and the following newline.

  Note: if there is no trailing newline, the assumption will not be offended as the diff will have a line "\ No trailing newline" at its end.

  This patch fixes the issue by lowering the amount of bytes required.
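The 2-byte minimum can be illustrated with a small sketch. The function name `count_hunk_body_lines` and the constant `MIN_BODY_LINE_LEN` are hypothetical; this is not the libgit2 hunk parser, only a demonstration of why the loop bound has to be 2 rather than 4.

```c
#include <stddef.h>
#include <string.h>

/* Smallest possible body line: an origin character plus '\n'. */
#define MIN_BODY_LINE_LEN 2

/* Counts the complete hunk body lines at the start of `buf`.
 * Stops at the first line that lacks a ' ', '+' or '-' origin
 * character or is not newline-terminated. */
size_t count_hunk_body_lines(const char *buf, size_t len)
{
	size_t count = 0;

	while (len >= MIN_BODY_LINE_LEN) {
		const char *eol;
		size_t line_len;

		if (buf[0] != ' ' && buf[0] != '+' && buf[0] != '-')
			break;

		eol = memchr(buf, '\n', len);
		if (eol == NULL)
			break;

		line_len = (size_t)(eol - buf) + 1;
		buf += line_len;
		len -= line_len;
		count++;
	}

	return count;
}
```

For the body of the hunk shown above, `"-a\n+\n"`, only 2 bytes remain once `-a` has been consumed. A loop that required 4 remaining bytes would stop there and miss the final `+` line.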
* patch_parse: fix memory leak | Patrick Steinhardt | 2016-11-15 | 1 | -1/+3
* common: use PRIuZ for size_t in `giterr_set` calls | Patrick Steinhardt | 2016-11-14 | 1 | -25/+25
* diff: treat binary patches with no data special [ethomson/diff-read-empty-binary] | Edward Thomson | 2016-09-05 | 1 | -12/+33

  When creating and printing diffs, deal with binary deltas that have binary data specially, versus diffs that have a binary file but lack the actual binary data.
* Teach `git_patch_from_diff` about parsed diffs [ethomson/patch_from_diff] | Edward Thomson | 2016-08-24 | 1 | -0/+15

  Ensure that `git_patch_from_diff` can return the patch for parsed diffs, not just generate a patch for a generated diff.
* git_diff_file: move `id_abbrev` [ethomson/diff_file] | Edward Thomson | 2016-08-03 | 1 | -3/+3

  Move `id_abbrev` to a more reasonable place where it packs more nicely (before anybody starts using it).
* apply: check allocation properly | Edward Thomson | 2016-07-24 | 1 | -1/+1
* patch: show copy information for identical copies | Edward Thomson | 2016-06-25 | 1 | -0/+16

  When showing copy information because we are duplicating contents, for example, when performing a `diff --find-copies-harder -M100 -B100`, then show copy from/to lines in a patch, and do not show context. Ensure that we can also parse such patches.
* patch::parse: handle patches with no hunks | Edward Thomson | 2016-06-25 | 1 | -1/+3

  Patches may have no hunks when there's no modifications (for example, in a rename). Handle them.
* patch: zero id and abbrev length for empty files | Edward Thomson | 2016-05-26 | 1 | -8/+20
* patch: identify non-binary patches as `NOT_BINARY` | Edward Thomson | 2016-05-26 | 1 | -4/+3
* introduce `git_diff_from_buffer` to parse diffs | Edward Thomson | 2016-05-26 | 1 | -32/+43

  Parse diff files into a `git_diff` structure.
* patch: differentiate not found and invalid patches | Edward Thomson | 2016-05-26 | 1 | -1/+2
* git_patch_parse_ctx: refcount the context | Edward Thomson | 2016-05-26 | 1 | -87/+144
* parse: introduce parse_ctx_contains_s | Edward Thomson | 2016-05-26 | 1 | -18/+25
* patch: `git_patch_from_patchfile` -> `git_patch_from_buffer` | Edward Thomson | 2016-05-26 | 1 | -1/+1
* patch: provide static string `advance_expected` | Edward Thomson | 2016-05-26 | 1 | -10/+13
* patch parse: dup the patch from the callers | Edward Thomson | 2016-05-26 | 1 | -5/+22
* patch parsing: squash some memory leaks | Edward Thomson | 2016-05-26 | 1 | -0/+7
* patch: drop some warnings | Edward Thomson | 2016-05-26 | 1 | -2/+2
* Introduce git_patch_options, handle prefixes | Edward Thomson | 2016-05-26 | 1 | -112/+163

  Handle prefixes (in terms of number of path components) for patch parsing.
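Counting a prefix in path components, as this commit message puts it, might look like the sketch below. The function `strip_path_prefix` is hypothetical and is not libgit2's implementation or part of `git_patch_options`; the idea is comparable to the `-p<n>` option of `git apply`, which removes a number of leading path components.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: skips `strip` leading path components.
 * Returns a pointer into `path` just past the stripped prefix, or
 * NULL if the path has fewer than `strip` separators. */
const char *strip_path_prefix(const char *path, size_t strip)
{
	while (strip > 0) {
		const char *slash = strchr(path, '/');

		if (slash == NULL)
			return NULL;

		path = slash + 1;

		/* Treat runs of slashes as a single separator. */
		while (*path == '/')
			path++;

		strip--;
	}

	return path;
}
```

With a prefix length of 1, the path `a/src/patch_parse.c` from a `diff --git` header becomes `src/patch_parse.c`. A prefix length of 0 leaves the path untouched.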
* patch printing: include rename information | Edward Thomson | 2016-05-26 | 1 | -2/+2
* patch_parse: don't set new mode when deleted | Edward Thomson | 2016-05-26 | 1 | -4/+4
* patch_parse: use names from `diff --git` header | Edward Thomson | 2016-05-26 | 1 | -17/+44

  When a text file is added or deleted, use the file names from the `diff --git` header instead of the `---` or `+++` lines. This is for compatibility with git.
* patch_parse: set binary flag | Edward Thomson | 2016-05-26 | 1 | -0/+1

  We may have parsed binary data, set the `SHOW_BINARY` flag which indicates that we have actually computed a binary diff.
* patch: when parsing, set nfiles correctly in delta | Edward Thomson | 2016-05-26 | 1 | -0/+3
* diff: include oid length in deltas | Edward Thomson | 2016-05-26 | 1 | -8/+4

  Now that `git_diff_delta` data can be produced by reading patch file data, which may have an abbreviated oid, allow consumers to know that the id is abbreviated.
* patch parse: unset path prefix | Edward Thomson | 2016-05-26 | 1 | -0/+4
* patch: use delta's old_file/new_file members | Edward Thomson | 2016-05-26 | 1 | -39/+26

  No need to replicate the old_file/new_file members, or plumb them strangely up.
* patch: abstract patches into diff'ed and parsed | Edward Thomson | 2016-05-26 | 1 | -0/+920

  Patches can now come from a variety of sources - either internally generated (from diffing two commits) or as the results of parsing some external data.