path: root/src/commit.c
Commit message [Author, Age, Files, Lines]
* object: return GIT_EINVALID on parse errors [Edward Thomson, 2021-11-30, 1 file, -8/+11]
|
| Return `GIT_EINVALID` on parse errors so that direct callers of parse functions can determine when there was a failure to parse the object.
|
| The object parser functions will swallow this error code to prevent it from propagating down the chain to end-users. (`git_merge` should not return `GIT_EINVALID` when a commit it tries to look up is not valid; this would be too vague to be useful.)
|
| The only public function that this affects is `git_signature_from_buffer`, which is now documented as returning `GIT_EINVALID` when appropriate.
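The split described in this commit message, where a low-level parser reports a precise code and a higher-level function swallows it, can be sketched in isolation. This is only an illustration under stated assumptions: the `*_sketch` names, the enum, and its values are hypothetical and are not libgit2's internal functions.

```c
#include <string.h>

/* Illustrative error codes; names and values are assumptions for this sketch. */
enum {
	SKETCH_OK = 0,
	SKETCH_ERROR = -1,    /* generic failure reported to end-users */
	SKETCH_EINVALID = -21 /* parse failure reported to direct callers */
};

/* Hypothetical low-level parser: reports malformed input precisely. */
static int parse_commit_sketch(const char *data)
{
	if (data == NULL || strncmp(data, "tree ", 5) != 0)
		return SKETCH_EINVALID;
	return SKETCH_OK;
}

/* Hypothetical higher-level lookup: swallows the specific code so that
 * it does not leak to callers further up the chain. */
static int lookup_commit_sketch(const char *data)
{
	int error = parse_commit_sketch(data);

	if (error == SKETCH_EINVALID)
		error = SKETCH_ERROR;
	return error;
}
```

A direct caller of the parser can branch on the specific code, while a caller of the lookup only ever sees the generic failure.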
* str: introduce `git_str` for internal, `git_buf` is external (ethomson/gitstr) [Edward Thomson, 2021-10-17, 1 file, -46/+102]
|
| libgit2 has two distinct requirements that were previously solved by `git_buf`. We require:
|
| 1. A general purpose string class that provides a number of utility APIs for manipulating data (e.g., concatenating, truncating, etc.).
| 2. A structure that we can use to return strings to callers that they can take ownership of.
|
| By using a single class (`git_buf`) for both of these purposes, we have confused the API to the point that refactorings are difficult and reasoning about correctness is also difficult.
|
| Move the utility class `git_buf` to be called `git_str`: this represents its general purpose, as an internal string buffer class. The name also is an homage to Junio Hamano ("gitstr").
|
| The public API remains `git_buf`, and has a much smaller footprint. It is generally only used as an "out" param with strict requirements that follow the documentation. (Exceptions exist for some legacy APIs to avoid breaking callers unnecessarily.)
|
| Utility functions exist to convert a user-specified `git_buf` to a `git_str` so that we can call internal functions, and then to convert it back again.
* commit: use GIT_ASSERT [Edward Thomson, 2020-11-27, 1 file, -28/+41]
|
* Make type mismatch errors consistent [Tobias Nießen, 2020-01-15, 1 file, -1/+1]
|
* commit: verify objects exist in git_commit_with_signature (cmn/create-with-signature-verification) [Carlos Martín Nieto, 2019-10-30, 1 file, -2/+25]
|
| There can be a significant difference between the system where we created the buffer (if at all) and when the caller provides us with the contents of a commit. Verify that the commit we are being asked to create references objects which do exist in the target repository.
* Merge pull request #4445 from tiennou/shallow/dry-commit-parsing [Patrick Steinhardt, 2019-10-03, 1 file, -11/+33]
|\
| | DRY commit parsing
| * commit: generic parse mechanism [Etienne Samson, 2019-10-03, 1 file, -11/+33]
| |
| | This allows us to pick which data from a commit we're interested in. This will be used by the revwalk code, which is only interested in parents' and committer data.
* | fixup: strange indentation [Tyler Ang-Wanek, 2019-08-07, 1 file, -5/+5]
| |
* | commit: git_commit_create_with_signature should support null signature [Tyler Ang-Wanek, 2019-07-02, 1 file, -8/+11]
|/
| If provided with a null signature, skip adding the signature header and create the commit anyway.
* git_error: use new names in internal APIs and usage [Edward Thomson, 2019-01-22, 1 file, -22/+22]
|
| Move to the `git_error` name in the internal API for error-related functions.
* object_type: use new enumeration names (ethomson/index_fixes) [Edward Thomson, 2018-12-01, 1 file, -5/+5]
|
| Use the new object_type enumeration names within the codebase.
* commit: fix out-of-bound reads when parsing truncated author fields [Patrick Steinhardt, 2018-11-21, 1 file, -1/+1]
|
| While commit objects usually should have only one author field, our commit parser actually handles the case where a commit has multiple author fields because some tools that exist in the wild actually write them. Detection of those additional author fields is done by using a simple `git__prefixcmp`, checking whether the current line starts with the string "author ". In the case where we are handed a non-NUL-terminated string that ends directly after the space, though, we may have an out-of-bounds read of one byte when trying to compare the expected final NUL byte.
|
| Fix the issue by using `git__prefixncmp` instead of `git__prefixcmp`. Unfortunately, a test cannot be easily written to catch this case. While we could test the last error message and verify that it didn't in fact fail parsing a signature (because that would indicate that it has in fact tried to parse the additional "author " field, which it shouldn't be able to detect in the first place), this doesn't work as the next line needs to be the "committer" field, which would error out with the same error message even if we hadn't done an out-of-bounds read.
|
| As objects read from the object database are always NUL terminated, this issue cannot be triggered in normal code and thus it's not security critical.
* commit: fix reading out of bounds when parsing encoding [Patrick Steinhardt, 2018-10-25, 1 file, -1/+1]
|
| The commit message encoding is currently being parsed by the `git__prefixcmp` function. As this function does not accept a buffer length, it will happily skip over a buffer's end if it is not `NUL` terminated. Fix the issue by using `git__prefixncmp` instead.
|
| Add a test that verifies that we are unable to parse the encoding field if it's cut off by the supplied buffer length.
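The difference between a NUL-terminated prefix check and a length-bounded one can be shown with a small stand-alone function. This is a sketch only: `prefixncmp_sketch` is a hypothetical stand-in written for this illustration, and its signature is an assumption, not libgit2's internal `git__prefixncmp`.

```c
#include <stddef.h>

/* Hypothetical stand-in for a length-aware prefix comparison.
 * Returns 0 when the first `len` bytes of `str` begin with `prefix`,
 * and non-zero otherwise. It never reads past str[len - 1]. */
static int prefixncmp_sketch(const char *str, size_t len, const char *prefix)
{
	size_t i;

	for (i = 0; prefix[i] != '\0'; i++) {
		if (i >= len)
			return -1; /* buffer ended before the prefix did */
		if (str[i] != prefix[i])
			return (unsigned char)str[i] < (unsigned char)prefix[i] ? -1 : 1;
	}
	return 0;
}
```

Because the bound is checked before each byte is read, a buffer that is cut off in the middle of "encoding " is rejected instead of being read past its end.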
* commit: implement function to parse raw data [Patrick Steinhardt, 2018-06-22, 1 file, -3/+10]
|
| Currently, parsing objects is strictly tied to having an ODB object available. This makes it hard to parse an object when all that is available is its raw object and size. Furthermore, hacking around that limitation by directly creating an ODB structure either on stack or on heap does not really work that well due to ODB objects being reference counted and then automatically free'd when reaching a reference count of zero.
|
| Implement a function `git_commit__parse_raw` to parse a commit object from a pair of `data` and `size`.
* mailmap: API and style cleanup [Nika Layzell, 2018-06-14, 1 file, -2/+3]
|
* mailmap: Integrate mailmaps with blame and signatures [Nika Layzell, 2018-06-14, 1 file, -0/+12]
|
* Convert usage of `git_buf_free` to new `git_buf_dispose` [Patrick Steinhardt, 2018-06-10, 1 file, -3/+3]
|
* Make sure to always include "common.h" first [Patrick Steinhardt, 2017-07-03, 1 file, -1/+2]
|
| Next to including several files, our "common.h" header also declares various macros which are then used throughout the project. As such, we have to make sure to always include this file first in all implementation files. Otherwise, we might encounter problems or even silent behavioural differences due to macros or defines not being defined as they should be. So in fact, our header and implementation files should make sure to always include "common.h" first.
|
| This commit does so by establishing a common include pattern. Header files inside of "src" will now always include "common.h" as their first include, separated by a newline from all the other includes to make it stand out as special. There are two cases for the implementation files. If they do have a matching header file, they will always include this one first, leading to "common.h" being transitively included as first file. If they do not have a matching header file, they instead include "common.h" as first file themselves.
|
| This fixes the outlined problems and will become our standard practice for header and source files inside of "src/" from now on.
* git_commit_create: freshen tree objects in commit (ethomson/freshen_trees) [Edward Thomson, 2017-03-03, 1 file, -0/+3]
|
| Freshen the tree object that a commit points to during commit time.
* commit: avoid possible use-after-free [Patrick Steinhardt, 2017-02-13, 1 file, -1/+2]
|
| When extracting a commit's signature, we first free the object and only afterwards put its signature contents into the result buffer. This works in most cases - the free'd object will normally be cached anyway, so we only end up decrementing its reference count without actually freeing its contents. But in some more exotic setups, where caching is disabled, this can definitely be a problem, as we might be the only instance currently holding a reference to this object.
|
| Fix this issue by first extracting the contents and only afterwards freeing the object.
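The ordering fix in this commit message (copy the contents out first, release the reference second) can be sketched with a toy reference-counted object. Everything here is hypothetical: `obj_sketch`, `obj_release`, and `extract_then_release` are names made up for illustration and do not correspond to libgit2's ODB object API.

```c
#include <stdlib.h>
#include <string.h>

/* Toy reference-counted object, assumed for this sketch only. */
typedef struct {
	int refcount;
	char *data;
} obj_sketch;

/* Drop one reference; free the object when the last one goes away. */
static void obj_release(obj_sketch *obj)
{
	if (--obj->refcount == 0) {
		free(obj->data);
		free(obj);
	}
}

/* Copy the object's contents into `out`, then drop our reference.
 * Returns 0 on success, -1 if `out` is too small. */
static int extract_then_release(obj_sketch *obj, char *out, size_t cap)
{
	size_t len = strlen(obj->data);
	int ok = len < cap;

	if (ok)
		memcpy(out, obj->data, len + 1); /* read while we still hold a reference */
	obj_release(obj); /* only now may the object go away */
	return ok ? 0 : -1;
}
```

With caching, the reversed order often appears to work because another reference keeps the data alive; when the caller holds the only reference, the reversed order reads freed memory.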
* commit: clear user-provided buffers [Patrick Steinhardt, 2017-02-13, 1 file, -3/+3]
|
| The functions `git_commit_header_field` and `git_commit_extract_signature` both receive buffers used to hand back the results to the user. While these functions called `git_buf_sanitize` on these buffers, this is not the right thing to do, as it will simply initialize or zero-terminate passed buffers. As we want to overwrite contents, we instead have to call `git_buf_clear` to completely reset them.
* giterr_set: consistent error messages [Edward Thomson, 2016-12-29, 1 file, -2/+2]
|
| Error messages should be sentence fragments, and therefore:
|
| 1. Should not begin with a capital letter,
| 2. Should not conclude with punctuation, and
| 3. Should not end a sentence and begin a new one
* commit: always initialize commit message [Patrick Steinhardt, 2016-10-09, 1 file, -3/+4]
|
| When parsing a commit, we will treat all bytes left after parsing the headers as the commit message. When no bytes are left, we leave the commit's message uninitialized. While uncommon to have a commit without message, this is the right behavior as Git unfortunately allows for empty commit messages.
|
| Given that this scenario is so uncommon, most programs acting on the commit message will never check if the message is actually set, which may lead to errors. To work around the error and not lay the burden of checking for empty commit messages to the developer, initialize the commit message with an empty string when no commit message is given.
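A minimal sketch of the behaviour described above: whatever remains after the headers becomes the message, and an empty remainder yields an empty string rather than an unset pointer. The function name `message_from_remainder` and its pointer-pair interface are assumptions for illustration, not the parser's actual code.

```c
#include <stdlib.h>
#include <string.h>

/* Copy the bytes in [rest, end) into a freshly allocated, NUL-terminated
 * message. Returns NULL only on allocation failure; an empty remainder
 * produces "" so callers never see an unset message. */
static char *message_from_remainder(const char *rest, const char *end)
{
	size_t len;
	char *msg;

	/* Skip the blank line(s) separating headers from the message. */
	while (rest < end && *rest == '\n')
		rest++;

	len = (size_t)(end - rest);
	msg = malloc(len + 1);
	if (msg == NULL)
		return NULL;
	if (len > 0)
		memcpy(msg, rest, len);
	msg[len] = '\0';
	return msg;
}
```

Callers can then pass the result straight to string functions without a NULL check for the "no message" case.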
* checkout: drop unused repo [Edward Thomson, 2016-06-01, 1 file, -4/+3]
|
* Fix `git_commit_create` for an initial commit [John Haley, 2016-05-03, 1 file, -1/+1]
|
| When calling `git_commit_create` with an empty array of `parents` and `parent_count == 0` the call will segfault at https://github.com/libgit2/libgit2/blob/master/src/commit.c#L107 when it's trying to compare `current_id` to a null parent oid.
|
| This just puts in a check to stop that segfault.
* git_object_dup: introduce typesafe versions [Edward Thomson, 2016-03-23, 1 file, -1/+1]
|
* Merge pull request #3673 from libgit2/cmn/commit-with-signature [Edward Thomson, 2016-03-17, 1 file, -0/+64]
|\
| | commit: add function to attach a signature to a commit
| * commit: add function to attach a signature to a commit (cmn/commit-with-signature) [Carlos Martín Nieto, 2016-03-15, 1 file, -0/+64]
| |
| | In combination with the function which creates a commit into a buffer, this allows us to more easily create signed commits.
* | commit: fix extraction of single-line signatures (cmn/extract-oneline-sig) [Carlos Martín Nieto, 2016-03-17, 1 file, -1/+1]
|/
| The function to extract signatures suffers from a similar bug to the header field finding one by having an unnecessary line feed check as a break condition of its loop.
|
| Fix that and add a test for this single-line signature situation.
* commit: split creating the commit and writing it out (cmn/commit-to-memory) [Carlos Martín Nieto, 2016-03-08, 1 file, -47/+128]
|
| Sometimes you want to create a commit but not write it out to the objectdb immediately. For these cases, provide a new function to retrieve the buffer instead of having to go through the db.
* git_commit: validate tree and parent ids [Edward Thomson, 2016-02-28, 1 file, -11/+37]
|
| When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the tree and parent ids given to commit creation functions.
* commit: expose the different kinds of errors [Carlos Martín Nieto, 2016-02-16, 1 file, -1/+7]
|
| We should be checking whether the object we're looking up is a commit, and we should let the caller know whether the not-found return code comes from a bad object type or just a missing signature.
* commit: don't forget the last header field [Carlos Martín Nieto, 2016-02-11, 1 file, -1/+1]
|
| When we moved the logic to handle the first one, wrong loop logic was kept in place which meant we still finished early. But we now notice it because we're not reading past the last LF we find.
|
| This was not noticed before as the last field in the tested commit was multi-line which does not trigger the early break.
* Merge pull request #3599 from libgit2/gpgsign [Vicent Marti, 2016-02-09, 1 file, -0/+86]
|\
| | Introduce git_commit_extract_signature
| * Introduce git_commit_extract_signature (gpgsign) [Carlos Martín Nieto, 2016-02-09, 1 file, -0/+86]
| |
| | This returns the GPG signature for a commit and its contents without the signature block, allowing for the verification of the commit's signature.
* | commit: also match the first header field when searching (cmn/header-field-2) [Carlos Martín Nieto, 2016-02-09, 1 file, -17/+22]
|/
| We were searching only past the first header field, which meant we were unable to find e.g. `tree` which is the first field.
|
| While here, make sure to set an error message in case we cannot find the field.
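The header-field fixes above (matching the first field, not dropping the last one) both come down to the shape of the search loop. The following is a sketch of a line-by-line scan that treats every header line the same way; `find_header_field` is a hypothetical helper written for this illustration and is not the implementation of `git_commit_header_field`.

```c
#include <stddef.h>
#include <string.h>

/* Scan a NUL-terminated commit header for `field`. On success, returns a
 * pointer to the first line of the field's value and stores that line's
 * length in `*value_len`. Returns NULL if the field is absent. The scan
 * stops at the blank line that separates the header from the message. */
static const char *find_header_field(const char *header, const char *field,
	size_t *value_len)
{
	size_t flen = strlen(field);
	const char *line = header;

	while (*line != '\0' && *line != '\n') {
		const char *eol = strchr(line, '\n');
		size_t line_len = eol ? (size_t)(eol - line) : strlen(line);

		/* Every line is tested, including the very first and the last. */
		if (line_len > flen && line[flen] == ' ' &&
		    strncmp(line, field, flen) == 0) {
			*value_len = line_len - flen - 1;
			return line + flen + 1;
		}
		if (eol == NULL)
			break;
		line = eol + 1;
	}
	return NULL;
}
```

Continuation lines of multi-line fields begin with a space, so they never match a field name in this scan; collecting them into the value is left out of the sketch.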
* commit: introduce `git_commit_body` [Patrick Steinhardt, 2015-12-01, 1 file, -0/+28]
|
| It is already possible to get a commit's summary with the `git_commit_summary` function. It is not possible to get the remaining part of the commit message, that is the commit message's body.
|
| Fix this by introducing a new function `git_commit_body`.
* Fix git_commit_summary to convert newlines to spaces even after whitespace [Stjepan Rajko, 2015-11-03, 1 file, -10/+25]
|
| Collapse spaces around newlines for the summary.
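One way to picture the summary rule described above: the summary is the first paragraph of the message, and any whitespace run containing a single newline collapses to one space. The sketch below follows that rule under its own assumptions; `summary_sketch` and its fixed-size output buffer are hypothetical and this is not the code of `git_commit_summary`.

```c
#include <ctype.h>
#include <stddef.h>

/* Write the summary of `msg` into `out` (capacity `cap`, NUL-terminated
 * when cap > 0). Returns the number of bytes written, excluding the NUL. */
static size_t summary_sketch(const char *msg, char *out, size_t cap)
{
	const char *space = NULL; /* start of the pending whitespace run */
	int run_has_newline = 0;
	size_t n = 0;

	if (cap == 0)
		return 0;

	/* Leading whitespace never reaches the summary. */
	while (*msg != '\0' && isspace((unsigned char)*msg))
		msg++;

	for (; *msg != '\0'; msg++) {
		if (*msg == '\n') {
			if (space != NULL && run_has_newline)
				break; /* second newline in one run: paragraph ends */
			if (space == NULL)
				space = msg;
			run_has_newline = 1;
		} else if (isspace((unsigned char)*msg)) {
			if (space == NULL) {
				space = msg;
				run_has_newline = 0;
			}
		} else {
			if (space != NULL) {
				if (run_has_newline) {
					if (n + 1 < cap)
						out[n++] = ' '; /* collapse the whole run */
				} else {
					for (; space < msg; space++)
						if (n + 1 < cap)
							out[n++] = *space; /* keep plain spacing */
				}
				space = NULL;
			}
			if (n + 1 < cap)
				out[n++] = *msg;
		}
	}
	out[n] = '\0';
	return n;
}
```

Trailing whitespace is dropped as a side effect, because a pending run is only emitted when a non-space character follows it.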
* commit: allow retrieving an arbitrary header field (cmn/commit-header-field) [Carlos Martín Nieto, 2015-06-22, 1 file, -0/+55]
|
| This allows the user to look up fields which we don't parse in libgit2, and allows them to access gpgsig or mergetag fields if they wish to check the signature.
* commit: ignore multiple author fields (cmn/double-author) [Carlos Martín Nieto, 2015-06-11, 1 file, -0/+10]
|
| Some tools create multiple author fields. git is rather lax when parsing them, although fsck does complain about them. This means that they exist in the wild.
|
| As it's not too taxing to check for them, and there shouldn't be a noticeable slowdown when dealing with correct commits, add logic to skip over these extra fields when parsing the commit.
* Remove the signature from ref-modifying functions [Carlos Martín Nieto, 2015-03-03, 1 file, -2/+2]
|
| The signature for the reflog is not something which changes dynamically. Almost all uses will be NULL, since we want for the repository's default identity to be used, making it noise.
|
| In order to allow for changing the identity, we instead provide git_repository_set_ident() and git_repository_ident() which allow a user to override the choice of signature.
* Remove extra semicolon outside of a function [Stefan Widgren, 2015-02-15, 1 file, -1/+1]
|
| Without this change, compiling with gcc and pedantic generates warning: ISO C does not allow extra ‘;’ outside of a function.
* git_rebase_commit: write HEAD's reflog appropriately [Edward Thomson, 2014-10-26, 1 file, -31/+5]
|
* commit: safer commit creation with reference update (cmn/commit-create-safe) [Carlos Martín Nieto, 2014-04-30, 1 file, -21/+78]
|
| The current version of the commit creation and amend function are unsafe to use when passing the update_ref parameter, as they do not check that the reference at the moment of update points to what the user expects.
|
| Make sure that we're moving history forward when we ask the library to update the reference for us by checking that the first parent of the new commit is the current value of the reference. We also make sure that the ref we're updating hasn't moved between the read and the write.
|
| Similarly, when amending a commit, make sure that the current tip of the branch is the commit we're amending.
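The "first parent must be the current tip" rule can be sketched as a small predicate. The type `oid_sketch` and the function `tip_is_first_parent` are hypothetical names for this illustration; the treatment of a missing reference as always safe is also an assumption of the sketch.

```c
#include <stddef.h>
#include <string.h>

#define OID_SKETCH_SIZE 20

/* Minimal object id type, assumed for this sketch only. */
typedef struct {
	unsigned char id[OID_SKETCH_SIZE];
} oid_sketch;

/* Returns 1 when updating the reference is safe, 0 otherwise.
 * `current_tip` is NULL when the reference does not exist yet. */
static int tip_is_first_parent(const oid_sketch *current_tip,
	const oid_sketch *parents, size_t parent_count)
{
	if (current_tip == NULL)
		return 1; /* nothing to move away from */
	if (parent_count == 0)
		return 0; /* guard first: never dereference an empty parent list */
	return memcmp(current_tip->id, parents[0].id, OID_SKETCH_SIZE) == 0;
}
```

Checking `parent_count` before touching `parents[0]` matters for initial commits, which have no parents at all.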
* commit: simplify and correct refcounting in nth_gen_ancestor [Carlos Martín Nieto, 2014-03-07, 1 file, -9/+8]
|
| We can make use of git_object_dup to use refcounting instead of pointer comparison to make sure we don't free the caller's object.
|
| This also lets us simplify the case for '~0' which is now just an assignment instead of looking up the object we have at hand.
* Remove now-duplicated stdarg.h include [Edward Thomson, 2014-02-24, 1 file, -2/+0]
|
* Add git_commit_amend API [Russell Belfer, 2014-02-07, 1 file, -66/+161]
|
| This adds an API to amend an existing commit, basically a shorthand for creating a new commit filling in missing parameters from the values of an existing commit. As part of this, I also added a new "sys" API to create a commit using a callback to get the parents. This allowed me to rewrite all the other commit creation APIs so that temporary allocations are no longer needed.
* Merge remote-tracking branch 'libgit2/development' into bs/more-reflog-stuff [Ben Straub, 2014-02-05, 1 file, -28/+12]
|\
| * commit: faster parsing [Carlos Martín Nieto, 2014-02-05, 1 file, -28/+12]
| |
| | The current code issues a lot of strncmp() calls in order to check for the end of the header, simply in order to copy it and start going through it again. These are a lot of calls for something we can check as we go along. Knowing the amount of parents beforehand to reduce allocations in extreme cases does not make up for them.
| |
| | Instead start parsing immediately and check for the double-newline after each header field, leaving the raw_header allocation for the end, which lets us go through the header once and reduces the amount of strncmp() calls significantly.
| |
| | In unscientific testing, this has reduced a shortlog-like usage (walking through the whole history of a branch and extracting data from the commits) of git.git from ~830ms to ~700ms and makes the time we spend in strncmp() negligible.
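The single-pass idea from this commit message, parsing fields as they are encountered and detecting the end of the header by the blank line, can be sketched as follows. `scan_header_sketch` is a hypothetical function that only counts parent lines; it is an illustration of the approach, not the parser in `src/commit.c`.

```c
#include <stddef.h>
#include <string.h>

/* Walk the header once. Returns the offset where the message starts
 * (just past the blank line), or `len` if no blank line is found.
 * `*parents` receives the number of "parent " lines seen on the way. */
static size_t scan_header_sketch(const char *buf, size_t len, size_t *parents)
{
	size_t pos = 0;

	*parents = 0;
	while (pos < len) {
		const char *line = buf + pos;
		const char *eol = memchr(line, '\n', len - pos);
		size_t line_len = eol ? (size_t)(eol - line) : len - pos;

		if (line_len == 0)
			return pos + 1; /* blank line: the header ends here */

		if (line_len >= 7 && memcmp(line, "parent ", 7) == 0)
			(*parents)++;

		if (eol == NULL)
			return len; /* unterminated last line */
		pos += line_len + 1;
	}
	return len;
}
```

There is no separate pre-scan for the end of the header, so each byte is examined once and the header copy can be deferred until its length is known.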
* | Fix reflog message when creating commits [Ben Straub, 2014-02-04, 1 file, -2/+21]
|/