summaryrefslogtreecommitdiff
path: root/reftable
Commit message (Collapse)AuthorAgeFilesLines
* reftable: ensure git-compat-util.h is the first (indirect) includeElijah Newren2023-04-244-2/+4
| | | | | Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* hash-ll.h: split out of hash.h to remove dependency on repository.hElijah Newren2023-04-242-2/+2
| | | | | | | | | | | | | | | | | | | hash.h depends upon and includes repository.h, due to the definition and use of the_hash_algo (defined as the_repository->hash_algo). However, most headers trying to include hash.h are only interested in the layout of the structs like object_id. Move the parts of hash.h that do not depend upon repository.h into a new file hash-ll.h (the "low level" parts of hash.h), and adjust other files to use this new header where the convenience inline functions aren't needed. This allows hash.h and object.h to be fairly small, minimal headers. It also exposes a lot of hidden dependencies on both path.h (which was brought in by repository.h) and repository.h (which was previously implicitly brought in by object.h), so also adjust other files to be more explicit about what they depend upon. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: use a pointer for pq_entry paramElijah Conners2022-09-154-6/+6
| | | | | | | | | | | | | | | | | | The speed of the merged_iter_pqueue_add() can be improved by using a pointer to the pq_entry struct, which is 96 bytes. Since the pq_entry param is worked directly on the stack and does not currently have a pointer to it, the merged_iter_pqueue_add() function is slightly slower. References to pq_entry in reftable have typically included pointers, such as both of the params for pq_less(). Since we are working with pointers in the pq_entry param, as keenly pointed out, the pq_entry param has also been made into a const since the contents of the pq_entry param are copied and not manipulated. Signed-off-by: Elijah Conners <business@elijahpepe.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: drop unused parameter from reader_seek_linear()Jeff King2022-08-201-3/+3
| | | | | | | | | The reader code passes around a "struct reftable_reader" context variable. But the seek function doesn't need it; the table iterator we already get is sufficient. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* Merge branch 'ep/maint-equals-null-cocci'Junio C Hamano2022-05-203-9/+9
|\ | | | | | | | | | | | | | | | | | | | | Introduce and apply coccinelle rule to discourage an explicit comparison between a pointer and NULL, and applies the clean-up to the maintenance track. * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci
| * tree-wide: apply equals-null.cocciJunio C Hamano2022-05-023-9/+9
| | | | | | | | Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'cm/reftable-0-length-memset'Junio C Hamano2022-05-041-3/+6
|\ \ | | | | | | | | | | | | | | | | | | Code clean-up. * cm/reftable-0-length-memset: reftable: avoid undefined behaviour breaking t0032
| * | reftable: avoid undefined behaviour breaking t0032Carlo Marcelo Arenas Belón2022-04-151-3/+6
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | 1214aa841bc (reftable: add blocksource, an abstraction for random access reads, 2021-10-07), makes the assumption that it is ok to free a reftable_block pointing to NULL if the size is also set to 0, but implements that using a memset call that at least in glibc based system will trigger a runtime exception if called with a NULL pointer as its first parameter. Avoid doing so by adding a conditional to check for the size in all three identically looking functions that were affected, and therefore, still allow memset to help catch callers that might incorrectly pass a NULL pointer with a non zero size, but avoiding the exception for the valid cases. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: make assignments portable to AIX xlc v12.01Ævar Arnfjörð Bjarmason2022-03-283-6/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change the assignment syntax introduced in 66c0dabab5e (reftable: make reftable_record a tagged union, 2022-01-20) to be portable to AIX xlc v12.1: avar@gcc111:[/home/avar]xlc -qversion IBM XL C/C++ for AIX, V12.1 (5765-J02, 5725-C72) Version: 12.01.0000.0000 The error emitted before this was e.g.: "reftable/generic.c", line 133.26: 1506-196 (S) Initialization between types "char*" and "struct reftable_ref_record" is not allowed. The syntax in the pre-image is supported by e.g. xlc 13.01 on a newer AIX version: avar@gcc119:[/home/avar]xlc -qversion IBM XL C/C++ for AIX, V13.1.3 (5725-C72, 5765-J07) Version: 13.01.0003.0006 But as we've otherwise supported this compiler let's not break it entirely if it's easy to work around it. Suggested-by: René Scharfe <l.s.r@web.de> Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: rename writer_stats to reftable_writer_statsHan-Wen Nienhuys2022-02-233-7/+7
| | | | | | | | | | | | | | | | This function is part of the reftable API, so it should use the reftable_ prefix Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: add test for length of disambiguating prefixHan-Wen Nienhuys2022-02-231-0/+38
| | | | | | | | | | | | | | | | The ID => ref map is trimming object IDs to a disambiguating prefix. Check that we are computing their length correctly. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: ensure that obj_id_len is >= 2 on writingHan-Wen Nienhuys2022-02-232-1/+40
| | | | | | | | | | | | | | | | | | When writing the same hash many times, we might decide to use a length-1 object ID prefix for the ObjectID => ref table, which is out of spec. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: avoid writing empty keys at the block layerHan-Wen Nienhuys2022-02-233-12/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | The public interface (reftable_writer) already ensures that keys are written in strictly increasing order, and an empty key by definition fails this check. However, by also enforcing this at the block layer, it is easier to verify that records (which are written into blocks) never have to consider the possibility of empty keys. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: add a test that verifies that writing empty keys failsHan-Wen Nienhuys2022-02-231-0/+24
| | | | | | | | | | | | | | | | | | Empty keys can only be written as ref records with empty names. The log record has a logical timestamp in the key, so the key is never empty. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: reject 0 object_id_lenHan-Wen Nienhuys2022-02-231-0/+5
| | | | | | | | | | | | | | | | The spec says 2 <= object_id_len <= 31. We are lenient and allow 1, but we forbid 0, so we can be sure that we never read a 0-length key. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'ab/auto-detect-zlib-compress2'Junio C Hamano2022-02-161-11/+0
|\ \ | | | | | | | | | | | | | | | | | | | | | The build procedure has been taught to notice older version of zlib and enable our replacement uncompress2() automatically. * ab/auto-detect-zlib-compress2: compat: auto-detect if zlib has uncompress2()
| * | compat: auto-detect if zlib has uncompress2()Ævar Arnfjörð Bjarmason2022-01-261-11/+0
| |/ | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a copy of uncompress2() implementation in compat/ so that we can build with an older version of zlib that lack the function, and the build procedure selects if it is used via the NO_UNCOMPRESS2 $(MAKE) variable. This is yet another "annoying" knob the porters need to tweak on platforms that are not common enough to have the default set in the config.mak.uname file. Attempt to instead ask the system header <zlib.h> to decide if we need the compatibility implementation. This is a deviation from the way we have been handling the "compatiblity" features so far, and if it can be done cleanly enough, it could work as a model for features that need compatibility definition we discover in the future. With that goal in mind, avoid expedient but ugly hacks, like shoving the code that is conditionally compiled into an unrelated .c file, which may not work in future cases---instead, take an approach that uses a file that is independently compiled and stands on its own. Compile and link compat/zlib-uncompress2.c file unconditionally, but conditionally hide the implementation behind #if/#endif when zlib version is 1.2.9 or newer, and unconditionally archive the resulting object file in the libgit.a to be picked up by the linker. There are a few things to note in the shape of the code base after this change: - We no longer use NO_UNCOMPRESS2 knob; if the system header <zlib.h> claims a version that is more cent than the library actually is, this would break, but it is easy to add it back when we find such a system. - The object file compat/zlib-uncompress2.o is always compiled and archived in libgit.a, just like a few other compat/ object files already are. - The inclusion of <zlib.h> is done in <git-compat-util.h>; we used to do so from <cache.h> which includes <git-compat-util.h> as the first thing it does, so from the *.c codes, there is no practical change. - Until objects in libgit.a that is already used gains a reference to the function, the reftable code will be the only one that wants it, so libgit.a on the linker command line needs to appear once more at the end to satisify the mutual dependency. - Beat found a trick used by OpenSSL to avoid making the conditionally-compiled object truly empty (apparently because they had to deal with compilers that do not want to see an effectively empty input file). Our compat/zlib-uncompress2.c file borrows the same trick for portabilty. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Beat Bolli <dev+git@drbeat.li> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | Merge branch 'hn/reftable-coverity-fixes'Junio C Hamano2022-02-1618-525/+615
|\ \ | |/ |/| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Problems identified by Coverity in the reftable code have been corrected. * hn/reftable-coverity-fixes: reftable: add print functions to the record types reftable: make reftable_record a tagged union reftable: remove outdated file reftable.c reftable: implement record equality generically reftable: make reftable-record.h function signatures const correct reftable: handle null refnames in reftable_ref_record_equal reftable: drop stray printf in readwrite_test reftable: order unittests by complexity reftable: all xxx_free() functions accept NULL arguments reftable: fix resource warning reftable: ignore remove() return value in stack_test.c reftable: check reftable_stack_auto_compact() return value reftable: fix resource leak blocksource.c reftable: fix resource leak in block.c error path reftable: fix OOB stack write in print functions
| * reftable: add print functions to the record typesHan-Wen Nienhuys2022-01-203-15/+95
| | | | | | | | | | | | | | | | This isn't used per se, but it is useful for debugging, especially Windows CI failures. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: make reftable_record a tagged unionHan-Wen Nienhuys2022-01-2012-337/+334
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reduces the amount of glue code, because we don't need a void pointer or vtable within the structure. The only snag is that reftable_index_record contain a strbuf, so it cannot be zero-initialized. To address this, use reftable_new_record() to return fresh instance, given a record type. Since reftable_new_record() doesn't cause heap allocation anymore, it should be balanced with reftable_record_release() rather than reftable_record_destroy(). Thanks to Peff for the suggestion. Helped-by: Jeff King <peff@peff.net> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: remove outdated file reftable.cHan-Wen Nienhuys2022-01-201-115/+0
| | | | | | | | | | | | | | This was renamed to generic.c, but the origin was never removed Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: implement record equality genericallyHan-Wen Nienhuys2022-01-203-22/+63
| | | | | | | | | | | | | | | | This simplifies unittests a little, and provides further coverage for reftable_record_copy(). Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: make reftable-record.h function signatures const correctHan-Wen Nienhuys2022-01-202-14/+14
| | | | | | | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: handle null refnames in reftable_ref_record_equalHan-Wen Nienhuys2022-01-201-3/+5
| | | | | | | | | | | | | | Spotted by Coverity. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: drop stray printf in readwrite_testHan-Wen Nienhuys2022-01-201-1/+0
| | | | | | | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: all xxx_free() functions accept NULL argumentsHan-Wen Nienhuys2022-01-202-0/+4
| | | | | | | | | | | | | | This fixes NULL derefs in error paths. Spotted by Coverity. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: fix resource warningHan-Wen Nienhuys2022-01-201-5/+5
| | | | | | | | | | | | | | | | This would trigger in the unlikely event that we are compacting, and the next available file handle is 0. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: ignore remove() return value in stack_test.cHan-Wen Nienhuys2022-01-201-1/+1
| | | | | | | | | | | | | | If the cleanup fails, there is nothing we can do. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: check reftable_stack_auto_compact() return valueHan-Wen Nienhuys2022-01-201-0/+1
| | | | | | | | | | | | | | Fixes a problem detected by Coverity. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: fix resource leak blocksource.cHan-Wen Nienhuys2022-01-201-2/+4
| | | | | | | | | | | | | | | | This would be triggered in the unlikely event of fstat() failing on an opened file. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: fix resource leak in block.c error pathHan-Wen Nienhuys2022-01-203-18/+97
| | | | | | | | | | | | | | | | | | | | Add test coverage for corrupt zlib data. Fix memory leaks demonstrated by unittest. This problem was discovered by a Coverity scan. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
| * reftable: fix OOB stack write in print functionsHan-Wen Nienhuys2022-01-201-2/+2
| | | | | | | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable tests: avoid "int" overflow, use "uint64_t"Ævar Arnfjörð Bjarmason2022-01-131-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Change code added in 1ae2b8cda84 (reftable: add merged table view, 2021-10-07) to consistently use the "uint64_t" type. These "min" and "max" variables get passed in the body of this function to a function whose prototype is: [...] reftable_writer_set_limits([...], uint64_t min, uint64_t max This avoids the following warning on SunCC 12.5 on gcc211.fsffrance.org: "reftable/merged_test.c", line 27: warning: initializer does not fit or is out of range: 0xffffffff Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: avoid initializing structs from structsHan-Wen Nienhuys2022-01-131-11/+11
| | | | | | | | | | | | | | Apparently, the IBM xlc compiler doesn't like this. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: support preset file mode for writingHan-Wen Nienhuys2021-12-233-10/+56
| | | | | | | | | | | | | | | | Create files with mode 0666, so umask works as intended. Provides an override, which is useful to support shared repos (test t1301-shared-repo.sh). Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: signal overflowHan-Wen Nienhuys2021-12-234-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | reflog entries have unbounded size. In theory, each log ('g') block in reftable can have an arbitrary size, so the format allows for arbitrarily sized reflog messages. However, in the implementation, we are not scaling the log blocks up with the message, and writing a large message fails. This triggers a failure for reftable in t7006-pager.sh. Until this is fixed more structurally, report an error from within the reftable library for easier debugging. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* | reftable: fix typo in headerHan-Wen Nienhuys2021-12-231-1/+1
|/ | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: add dump utilityHan-Wen Nienhuys2021-10-081-0/+107
| | | | | | | | | provide a command-line utility for inspecting individual tables, and inspecting a complete ref database Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: implement stack, a mutable database of reftable files.Han-Wen Nienhuys2021-10-084-0/+2518
| | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: implement refname validationHan-Wen Nienhuys2021-10-083-0/+340
| | | | | | | | | | The packed/loose format has restrictions on refnames: a and a/b cannot coexist. This limitation does not apply to reftable per se, but must be maintained for interoperability. This code adds validation routines to abort transactions that are trying to add invalid names. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: add merged table viewHan-Wen Nienhuys2021-10-084-0/+940
| | | | | | | | | | | | This adds an abstract, read-only interface to the ref database. This primitive is used to construct the read view of the ref database (the read view is constructed by merging several *.ref files). It also provides the mechanism to provide a unified view of the refs in the main repository and the per-worktree refs. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: add a heap-based priority queue for reftable recordsHan-Wen Nienhuys2021-10-084-0/+221
| | | | | | | This is needed to create a merged view multiple reftables Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: reftable file level testsHan-Wen Nienhuys2021-10-082-1/+653
| | | | | | | | | | | With support for reading and writing files in place, we can construct files (in memory) and attempt to read them back. Because some sections of the format are optional (eg. indices, log entries), we have to exercise this code using multiple sizes of input data Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: read reftable filesHan-Wen Nienhuys2021-10-085-0/+1229
| | | | | | | | | | | This supports reading a single reftable file. The commit introduces an abstract iterator type, which captures the usecases both of reading individual refs, and iterating over a segment of the ref namespace. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: generic interface to tablesHan-Wen Nienhuys2021-10-085-0/+402
| | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: write reftable filesHan-Wen Nienhuys2021-10-083-0/+888
| | | | | Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: a generic binary tree implementationHan-Wen Nienhuys2021-10-083-0/+158
| | | | | | | | | | | | | The reftable format includes support for an (OID => ref) map. This map can speed up visibility and reachability checks. In particular, various operations along the fetch/push path within Gerrit have ben sped up by using this structure. The map is constructed with help of a binary tree. Object IDs are hashes, so they are uniformly distributed. Hence, the tree does not attempt forced rebalancing. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: reading/writing blocksHan-Wen Nienhuys2021-10-083-0/+684
| | | | | | | | | | | | The reftable format is structured as a sequence of block. Within a block, records are prefix compressed, with an index of offsets for fully expand keys to enable binary search within blocks. This commit provides the logic to read and write these blocks. Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: (de)serialization for the polymorphic record type.Han-Wen Nienhuys2021-10-085-0/+1898
| | | | | | | | | | | The reftable format is structured as a sequence of blocks, and each block contains a sequence of prefix-compressed key-value records. There are 4 types of records, and they have similarities in how they must be handled. This is achieved by introducing a polymorphic 'record' type that encapsulates ref, log, index and object records. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
* reftable: add blocksource, an abstraction for random access readsHan-Wen Nienhuys2021-10-083-0/+219
| | | | | | | | | | | | | | The reftable format is usually used with files for storage. However, we abstract away this using the blocksource data structure. This has two advantages: * log blocks are zlib compressed, and handling them is simplified if we can discard byte segments from within the block layer. * for unittests, it is useful to read and write in-memory. The blocksource allows us to abstract the data away from on-disk files. Signed-off-by: Han-Wen Nienhuys <hanwen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>