path: root/src/index.c
Commit message (Author, Age, Files, Lines)
* index: verify we have enough space left when writing index entries (Patrick Steinhardt, 2017-06-06, 1 file, -4/+23)
| In our code writing index entries, we carry around a `disk_size` representing how much memory we have in total and pass this value to `git_encode_varint` to do bounds checks. This does not make much sense, as at the time when passing on this variable it is already out of date. Fix this by subtracting used memory from `disk_size` as we go along. Furthermore, assert we've actually got enough space left to do the final path memcpy.
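The bookkeeping described above can be sketched as follows. This is a minimal illustration, not libgit2's code: `encode_varint_bounded` and `write_compressed_path` are hypothetical helpers, and the varint layout is assumed to be git's offset-style encoding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for a bounds-checked varint encoder. Writes `value`
 * into `buf` (which has `bufsize` bytes available) and returns the number
 * of bytes written, or 0 if the encoding does not fit.
 */
static size_t encode_varint_bounded(unsigned char *buf, size_t bufsize, uint64_t value)
{
	unsigned char tmp[16];
	size_t pos = sizeof(tmp) - 1;
	size_t len;

	tmp[pos] = value & 127;
	while (value >>= 7)
		tmp[--pos] = 128 | (--value & 127);

	len = sizeof(tmp) - pos;
	if (len > bufsize)
		return 0;
	memcpy(buf, tmp + pos, len);
	return len;
}

/*
 * Writes "<varint><suffix>\0" into `out`. The remaining space shrinks after
 * every step, so each bounds check sees an up-to-date value. Returns the
 * number of bytes written, or 0 if `out_size` is too small.
 */
static size_t write_compressed_path(unsigned char *out, size_t out_size,
	uint64_t varint_value, const char *suffix)
{
	size_t remaining = out_size;
	size_t suffix_len = strlen(suffix);
	size_t varint_len;

	varint_len = encode_varint_bounded(out, remaining, varint_value);
	if (varint_len == 0)
		return 0;
	out += varint_len;
	remaining -= varint_len;

	/* The final copy needs room for the suffix and its NUL terminator. */
	if (remaining < suffix_len + 1)
		return 0;
	memcpy(out, suffix, suffix_len + 1);
	remaining -= suffix_len + 1;

	assert(remaining <= out_size);
	return out_size - remaining;
}
```

Passing the original total size to the encoder instead of `remaining` would let a write near the end of the buffer pass the check even though too little space is left.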
* index: fix shared prefix computation when writing index entry (Patrick Steinhardt, 2017-06-06, 1 file, -2/+1)
| When using compressed index entries, each entry's path is preceded by a varint encoding how long the shared prefix with the previous index entry actually is. We currently encode a length of `(path_len - same_len)`, which is doubly wrong. First, `path_len` is already set to `path_len - same_len` previously. Second, we want to encode the shared prefix rather than the un-shared suffix length. Fix this by using `same_len` as the varint value instead.
* index: also sanity check entry size with compressed entries (Patrick Steinhardt, 2017-06-06, 1 file, -4/+3)
| We have a check in place whether the index has enough data left for the required footer after reading an index entry, but this was only used for uncompressed entries. Move the check down a bit so that it is executed for both compressed and uncompressed index entries.
* index: remove file-scope entry size macros (Patrick Steinhardt, 2017-06-06, 1 file, -6/+4)
| All index entry size computations are now performed in `index_entry_size`. As such, we do not need the file-scope macros for computing these sizes anymore. Remove them and move the `entry_size` macro into the `index_entry_size` function.
* index: don't right-pad paths when writing compressed entries (Patrick Steinhardt, 2017-06-06, 1 file, -4/+3)
| Our code to write index entries to disk does not check whether the entry that is to be written should use prefix compression for the path. As such, we were overallocating memory and added bogus right-padding into the resulting index entries. As there is no padding allowed in the index version 4 format, this should actually result in an invalid index. Fix this by re-using the newly extracted `index_entry_size` function.
* index: move index entry size computation into its own function (Patrick Steinhardt, 2017-06-06, 1 file, -5/+17)
| Create a new function `index_entry_size` which encapsulates the logic to calculate how much space is needed for an index entry, whether it is simple/extended or compressed/uncompressed. This can later be re-used by our code writing index entries.
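A rough sketch of such a size computation is shown below. It is not the actual `index_entry_size`: the function name, the header constants (which assume SHA-1 object IDs), and the boolean `extended` parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed sizes of the fixed fields that precede the path. */
#define ENTRY_HEADER_SHORT 62 /* 40 bytes stat data + 20 bytes SHA-1 + 2 bytes flags */
#define ENTRY_HEADER_LONG  64 /* as above, plus 2 bytes of extended flags */

/*
 * Returns the on-disk size of one entry. A non-zero `varint_len` marks a
 * prefix-compressed (version 4) entry, which carries no padding.
 */
static size_t entry_size_sketch(size_t path_len, size_t varint_len, int extended)
{
	size_t header = extended ? ENTRY_HEADER_LONG : ENTRY_HEADER_SHORT;

	assert(path_len > 0);

	if (varint_len > 0) {
		/* Compressed: varint, then the path suffix and its NUL. */
		return header + varint_len + path_len + 1;
	}

	/* Uncompressed: pad with 1 to 8 NUL bytes up to a multiple of 8. */
	return (header + path_len + 8) & ~(size_t)7;
}
```

Keeping both cases in one function means the reader and the writer cannot disagree about whether an entry is padded.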
* index: set last written index entry in foreach-entry-loop (Patrick Steinhardt, 2017-06-06, 1 file, -7/+8)
| The last written disk entry is currently being set inside of the function `write_disk_entry`. Make the behavior a bit more obvious by instead setting it inside of `write_entries` while iterating all entries.
* index: set last entry when reading compressed entries (Patrick Steinhardt, 2017-06-06, 1 file, -4/+7)
| To calculate the path of a compressed index entry, we need to know the preceding entry's path. While we do actually set the first predecessor correctly to "", we fail to update this while reading the entries. Fix the issue by updating `last` inside of the loop. Previously, we've been passing a double-pointer to `read_entry`, which it didn't update. As it is more obvious to update the pointer inside the loop itself, though, we can simply convert it to a normal pointer.
* index: fix confusion with shared prefix in compressed path names (Patrick Steinhardt, 2017-06-06, 1 file, -9/+12)
| The index version 4 introduced compressed path names for the entries. From the git.git index-format documentation:
|
|     At the beginning of an entry, an integer N in the variable width encoding [...] is stored, followed by a NUL-terminated string S. Removing N bytes from the end of the path name for the previous entry, and replacing it with the string S yields the path name for this entry.
|
| But instead of stripping N bytes from the previous path's string and using the remaining prefix, we were instead simply concatenating the previous path with the current entry path, which is obviously wrong. Fix the issue by correctly copying the first N bytes of the previous entry only and concatenating the result with our current entry's path.
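The rule quoted from the index-format documentation can be sketched as below. `expand_path` is a hypothetical helper written for this illustration, not a libgit2 function; it follows the quoted wording, where N is the number of bytes removed from the end of the previous path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Rebuilds a path from its predecessor: drop `strip` bytes from the end of
 * `prev`, then append `suffix`. Returns 0 on success, or -1 if `strip`
 * exceeds the previous path's length or `out` is too small.
 */
static int expand_path(char *out, size_t out_size,
	const char *prev, size_t strip, const char *suffix)
{
	size_t prev_len, suffix_len, keep;

	assert(out != NULL && prev != NULL && suffix != NULL);

	prev_len = strlen(prev);
	suffix_len = strlen(suffix);

	if (strip > prev_len)
		return -1;
	keep = prev_len - strip;

	/* Need keep + suffix_len + 1 bytes, checked without overflowing. */
	if (suffix_len >= out_size || keep >= out_size - suffix_len)
		return -1;

	memcpy(out, prev, keep);
	memcpy(out + keep, suffix, suffix_len + 1);
	return 0;
}
```

For example, with a previous path of `src/index.c`, N = 1 and S = `h`, the result is `src/index.h`. A reader loop would then use that result as the previous path for the next entry.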
* idxmap: remove GIT__USE_IDXMAP (Patrick Steinhardt, 2017-02-17, 1 file, -3/+0)
|
* khash: avoid using `kh_resize` directly (Patrick Steinhardt, 2017-02-17, 1 file, -6/+6)
|
* khash: avoid using macro magic to get return address (Patrick Steinhardt, 2017-02-17, 1 file, -5/+5)
|
* giterr_set: consistent error messages (Edward Thomson, 2016-12-29, 1 file, -15/+15)
| Error messages should be sentence fragments, and therefore:
|
| 1. Should not begin with a capital letter,
| 2. Should not conclude with punctuation, and
| 3. Should not end a sentence and begin a new one
* use `giterr_set_str()` wherever possible (Pranit Bauva, 2016-11-17, 1 file, -1/+1)
| `giterr_set()` is used when it is required to format a string, and since we don't really require it for this case, it is better to stick to `giterr_set_str()`. This also suppresses a warning (-Wformat-security) raised by the compiler.
|
| Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com>
* index: support index v4 (David Turner, 2016-08-10, 1 file, -29/+118)
| Support reading and writing index v4. Index v4 uses a very simple compression scheme for pathnames, but is otherwise similar to index v3.
|
| Signed-off-by: David Turner <dturner@twitter.com>
* index: cast to avoid warning (Edward Thomson, 2016-07-24, 1 file, -2/+2)
|
* index: include conflicts in `git_index_read_index` [ethomson/read_index_conflicts] (Edward Thomson, 2016-06-29, 1 file, -6/+7)
| Ensure that we include conflicts when calling `git_index_read_index`, which will remove conflicts in the index that do not exist in the new target, and will add conflicts from the new target.
* index: refactor common `read_index` functionality (Edward Thomson, 2016-06-29, 1 file, -13/+36)
| Most of `git_index_read_index` is common to reading any iterator. Refactor it out in case we want to implement `read_tree` in terms of it in the future.
* index: fix NULL pointer access in index_remove_entry (Patrick Steinhardt, 2016-06-07, 1 file, -2/+3)
| When removing an entry from the index by its position, we first retrieve the position from the index's entries and then try to remove the retrieved value from the index map with `DELETE_IN_MAP`. When `index_remove_entry` returns `NULL` we try to feed it into the `DELETE_IN_MAP` macro, which will unconditionally call `idxentry_hash` and then happily dereference the `NULL` entry pointer. Fix the issue by not passing a `NULL` entry into `DELETE_IN_MAP`.
* index_read_index: invalidate new paths in tree cache (Edward Thomson, 2016-06-02, 1 file, -0/+6)
| When adding a new entry to an existing index via `git_index_read_index`, be sure to remove the tree cache entry for that new path. This will mark all parent trees as dirty.
* index_read_index: set flags for path_len correctly (Edward Thomson, 2016-06-02, 1 file, -0/+3)
| Update the flags to reset the path_len (to emulate `index_insert`).
* index_read_index: differentiate on mode (Edward Thomson, 2016-06-02, 1 file, -1/+2)
| Treat index entries with different modes as different, which they are, at least for the purposes of up-to-date calculations.
* index_read_index: reset error correctly (Edward Thomson, 2016-06-02, 1 file, -0/+2)
| Clear any error state upon each iteration. If one of the iterations ends (with an error of `GIT_ITEROVER`) we need to reset that error to 0, lest we stop the whole process prematurely.
* index: fix memory leak on error case (Patrick Steinhardt, 2016-05-02, 1 file, -1/+1)
|
* tree: re-use the id and filename in the odb object (Carlos Martín Nieto, 2016-03-20, 1 file, -1/+1)
| Instead of copying over the data into the individual entries, point to the originals, which are already in a format we can use.
* index: assert required OID are non-NULL (Patrick Steinhardt, 2016-03-11, 1 file, -3/+9)
|
* git_index_add: validate objects in index entries (optionally) (Edward Thomson, 2016-02-28, 1 file, -6/+20)
| When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate the index entries given to `git_index_add`.
* Merge pull request #3638 from ethomson/nsec (Carlos Martín Nieto, 2016-02-25, 1 file, -2/+2)
|\
| | USE_NSECS fixes
| * nsec: support NDK's crazy nanoseconds (Edward Thomson, 2016-02-25, 1 file, -2/+2)
| | Android NDK does not have a `struct timespec` in its `struct stat` for nanosecond support, instead it has a single nanosecond member inside the struct stat itself. We will use that and use a macro to expand to the `st_mtim` / `st_mtimespec` definition on other systems (much like the existing `st_mtime` backcompat definition).
* | index: fix contradicting comparison (Patrick Steinhardt, 2016-02-23, 1 file, -3/+3)
| | The overflow check in `read_reuc` tries to verify if the `git__strtol32` parses an integer bigger than UINT_MAX. The `tmp` variable is cast to an unsigned int for this and then checked for being greater than UINT_MAX, which obviously can never be true. Fix this by instead fixing the `mode` field's size in `struct git_index_reuc_entry` to `uint32_t`. We can now parse the int with `git__strtol64`, which can never return a value bigger than `UINT32_MAX`, and additionally check if the returned value is smaller than zero. We do not need to handle overflows explicitly here, as `git__strtol64` returns an error when the returned value would overflow.
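The pitfall described above, comparing a value against a limit after it has already been cast to the narrower type, can be avoided by range-checking in the wider type first. The sketch below uses the standard `strtoll` rather than libgit2's `git__strtol64`; the function name `parse_mode` and the octal base are assumptions for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Parses an octal mode string into a uint32_t. The value is range-checked
 * while still held in a signed 64-bit integer; once cast to an unsigned
 * 32-bit type it could never compare greater than that type's maximum.
 */
static int parse_mode(const char *str, uint32_t *out)
{
	char *end;
	long long tmp;

	assert(str != NULL && out != NULL);

	errno = 0;
	tmp = strtoll(str, &end, 8);
	if (errno != 0 || end == str)
		return -1;
	if (tmp < 0 || tmp > (long long)UINT32_MAX)
		return -1;

	*out = (uint32_t)tmp;
	return 0;
}
```

This version checks both bounds explicitly, which is a conservative choice for a standalone example.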
* | index: plug memory leak in `read_conflict_names` (Patrick Steinhardt, 2016-02-23, 1 file, -4/+14)
|/
* Merge pull request #3613 from ethomson/fixups (Carlos Martín Nieto, 2016-02-18, 1 file, -6/+6)
|\
| | Remove most of the silly warnings
| * index: explicitly cast the teeny index entry members (Edward Thomson, 2016-02-16, 1 file, -3/+3)
| |
| * index: don't use `seek` return as an error code (Edward Thomson, 2016-02-16, 1 file, -2/+2)
| |
| * index: explicitly cast new hash size to an int (Edward Thomson, 2016-02-16, 1 file, -1/+1)
| |
* | Merge pull request #3619 from ethomson/win32_forbidden (Carlos Martín Nieto, 2016-02-18, 1 file, -9/+23)
|\ \
| | | win32: allow us to read indexes with forbidden paths on win32
| * | index: allow read of index w/ illegal entries (Edward Thomson, 2016-02-17, 1 file, -9/+23)
| | | Allow `git_index_read` to handle reading existing indexes with illegal entries. Allow the low-level `git_index_add` to add properly formed `git_index_entry`s even if they contain paths that would be illegal for the current filesystem (eg, `AUX`). Continue to disallow `git_index_add_bypath` from adding entries that are universally illegal (eg, `.git`, `foo/../bar`).
* | | Horrible fix for #3173. (Arthur Schreiber, 2016-02-11, 1 file, -2/+2)
| |/
|/|
* | index: get rid of the locking [cmn/index-nolock] (Carlos Martín Nieto, 2015-12-28, 1 file, -130/+15)
|/
| We don't support using an index object from multiple threads at the same time, so the locking doesn't have any effect when following the rules. If not following the rules, things are going to break down anyway.
* index: Also size-hint the hash table [vmg/index-fill-2] (Vicent Marti, 2015-12-16, 1 file, -4/+2)
| Note that we're not checking whether the resize succeeds; in OOM cases, we let it run with a "small" vector and hash table and see if by chance we can grow it dynamically as we insert the new entries. Nothing to lose really.
* index: Preallocate the entries vector with size hint (Vicent Marti, 2015-12-16, 1 file, -0/+8)
|
* index: Adjust namemask & mode when filling (Vicent Marti, 2015-12-16, 1 file, -14/+17)
|
* merge: Use `git_index__fill` to populate the index [vmg/index-fill] (Vicent Marti, 2015-12-16, 1 file, -0/+37)
| Instead of calling `git_index_add` in a loop, use the new `git_index__fill` internal API to fill the index with the initial staged entries. The new `fill` helper assumes that all the entries will be unique and valid, so it can append them at the end of the entries vector and only sort it once at the end. It performs no validation checks. This prevents the quadratic behavior caused by having to sort the entries list once after every insertion.
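The append-then-sort-once approach can be sketched with plain C arrays and `qsort`. This is only an illustration of the technique, not the `git_index__fill` implementation; `fill_then_sort` and `path_cmp` are hypothetical names, and plain path strings stand in for index entries.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Compares two array elements, each of which is a pointer to a path. */
static int path_cmp(const void *a, const void *b)
{
	const char *const *pa = a;
	const char *const *pb = b;
	return strcmp(*pa, *pb);
}

/*
 * Copies all `n` paths into `dst` in their incoming order and sorts the
 * array a single time afterwards. The caller guarantees the paths are
 * unique and valid; nothing is checked here.
 */
static void fill_then_sort(const char **dst, const char *const *src, size_t n)
{
	size_t i;

	assert(dst != NULL || n == 0);

	for (i = 0; i < n; i++)
		dst[i] = src[i];
	qsort(dst, n, sizeof(*dst), path_cmp);
}
```

Sorting once costs a single O(n log n) pass, whereas keeping the array ordered after each of n insertions repeats that work n times.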
* Merge pull request #3538 from pks-t/pks/index-memory-leak (Carlos Martín Nieto, 2015-12-10, 1 file, -1/+1)
|\
| | index: always queue `remove_entry` for removal
| * index: always queue `remove_entry` for removal (Patrick Steinhardt, 2015-12-08, 1 file, -1/+1)
| | When replacing an index with a new one, we need to iterate through all index entries in order to determine which entries are equal. When it is not possible to re-use old entries for the new index, we move it into a list of entries that are to be removed and thus free'd. When we encounter a non-zero error code, though, we skip adding the current index entry to the remove-queue. `INSERT_MAP_EX`, which is the function last run before adding to the remove-queue, may return a positive non-zero code that indicates what exactly happened while inserting the element. In this case we skip adding the entry to the remove-queue but still continue the current operation, leading to a leak of the current entry. Fix this by checking for a negative return value instead of a non-zero one when we want to add the current index entry to the remove-queue.
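The difference between "non-zero" and "negative" checks can be illustrated with a small sketch. The return codes, the `remove_queue` type, and `finish_insert` are hypothetical and only model the convention described above, where positive values are status codes rather than errors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical return convention: negative is an error, positive is status. */
enum { INSERT_FAILED = -1, INSERT_REPLACED = 0, INSERT_ADDED = 1 };

/* Stand-in for the list of entries that will be freed later. */
struct remove_queue {
	size_t count;
};

/*
 * Handles the result `rc` of an insert attempt. Only negative codes abort;
 * positive status codes still queue the old entry so it is freed later.
 * Testing `rc != 0` here instead would skip the queue for INSERT_ADDED
 * and leak the entry.
 */
static int finish_insert(struct remove_queue *q, int rc)
{
	assert(q != NULL);

	if (rc < 0)
		return rc;

	q->count++;
	return 0;
}
```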
* | index: canonicalize inserted paths safely (Edward Thomson, 2015-12-03, 1 file, -1/+1)
|/
| When adding to the index, we look to see if a portion of the given path matches a portion of a path in the index. If so, we will use the existing path information. For example, when adding `foo/bar.c`, if there is an index entry to `FOO/other` and the filesystem is case insensitive, then we will put `bar.c` into the existing tree instead of creating a new one with a different case. Use `strncmp` to do that instead of `memcmp`. When we `bsearch` into the index, we locate the position where the new entry would go. The index entry at that position does not necessarily have a relation to the entry we're adding, so we cannot make assumptions and use `memcmp`. Instead, compare them as strings. When canonicalizing paths, we look for the first index entry that matches a given substring.
* checkout: only consider nsecs when built that way (Edward Thomson, 2015-11-23, 1 file, -17/+3)
| When examining the working directory and determining whether it's up-to-date, only consider the nanoseconds in the index entry when built with `GIT_USE_NSEC`. This prevents us from believing that the working directory is always dirty when the index was originally written with a git client that understands nsecs (like git 2.x).
* racy: make git_index_read_index handle raciness (Edward Thomson, 2015-11-16, 1 file, -30/+48)
| Ensure that `git_index_read_index` clears the uptodate bit on files that it modifies. Further, do not propagate the cache from an on-disk index into another on-disk index. Although this should not be done, as `git_index_read_index` is used to bring an in-memory index into another index (that may or may not be on-disk), ensure that we do not accidentally bring in these bits when misused.
* index: clear uptodate bit on save (Edward Thomson, 2015-11-16, 1 file, -1/+16)
| The uptodate bit should have a lifecycle of a single read->write on the index. Once the index is written, the files within it should be scanned for racy timestamps against the new index timestamp.
* index: don't detect raciness in uptodate entries (Edward Thomson, 2015-11-16, 1 file, -2/+7)
| Keep track of entries that we believe are up-to-date, because we added the index entries since the index was loaded. This prevents us from unnecessarily examining files that we wrote during the cleanup of racy entries (when we smudge racily clean files that have a timestamp newer than or equal to the index's timestamp when we read it). Without keeping track of this, we would examine every file that we just checked out for raciness, since all their timestamps would be newer than the index's timestamp.
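The check described above might look roughly like the following. The struct and function are simplified, hypothetical stand-ins (whole-second timestamps, a plain boolean for the uptodate bit) and do not mirror libgit2's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical index entry. */
struct sketch_entry {
	int64_t mtime_seconds;
	bool uptodate; /* set when we added this entry after loading the index */
};

/*
 * An entry is a candidate for the racy-clean check only if its timestamp
 * is newer than or equal to the index timestamp and we did not write it
 * ourselves since the index was loaded.
 */
static bool needs_racy_check(const struct sketch_entry *e, int64_t index_mtime)
{
	assert(e != NULL);

	if (e->uptodate)
		return false;
	return e->mtime_seconds >= index_mtime;
}
```

Without the `uptodate` shortcut, every file written during a checkout would qualify, since all of them carry timestamps at or after the index timestamp.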