From 99daf3ce03f4091c74400f895f9c82a1c046e645 Mon Sep 17 00:00:00 2001
From: Daan De Meyer
Date: Sat, 23 Oct 2021 22:36:47 +0100
Subject: journal: Use 32-bit entry array offsets in compact mode

Before:

OBJECT TYPE      ENTRIES SIZE
Unused           0       0B
Data             3610336 595.7M
Field            5310    285.2K
Entry            3498326 1.2G
Data Hash Table  29      103.1M
Field Hash Table 29      151.3K
Entry Array      605991  1011.6M
Tag              0       0B
Total            7720021 2.9G

After:

OBJECT TYPE      ENTRIES SIZE
Unused           0       0B
Data             3562667 591.0M
Field            3971    213.6K
Entry            3498566 1.2G
Data Hash Table  20      71.1M
Field Hash Table 20      104.3K
Entry Array      582647  505.0M
Tag              0       0B
Total            7647891 2.4G
---
 docs/JOURNAL_FILE_FORMAT.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

(limited to 'docs/JOURNAL_FILE_FORMAT.md')

diff --git a/docs/JOURNAL_FILE_FORMAT.md b/docs/JOURNAL_FILE_FORMAT.md
index d40688c440..c4484693af 100644
--- a/docs/JOURNAL_FILE_FORMAT.md
+++ b/docs/JOURNAL_FILE_FORMAT.md
@@ -71,7 +71,7 @@ thread](https://lists.freedesktop.org/archives/systemd-devel/2012-October/007054
 
 ## Basics
 
-* All offsets, sizes, time values, hashes (and most other numeric values) are 64bit unsigned integers in LE format.
+* All offsets, sizes, time values, hashes (and most other numeric values) are 32bit/64bit unsigned integers in LE format.
 * Offsets are always relative to the beginning of the file.
 * The 64bit hash function siphash24 is used for newer journal files. For older files [Jenkins lookup3](https://en.wikipedia.org/wiki/Jenkins_hash_function) is used, more specifically `jenkins_hashlittle2()` with the first 32bit integer it returns as higher 32bit part of the 64bit value, and the second one uses as lower 32bit part.
 * All structures are aligned to 64bit boundaries and padded to multiples of 64bit
@@ -552,7 +552,10 @@ creativity rather than runtime parameters.
 _packed_ struct EntryArrayObject {
         ObjectHeader object;
         le64_t next_entry_array_offset;
-        le64_t items[];
+        union {
+                le64_t regular[];
+                le32_t compact[];
+        } items;
 };
 ```
 
@@ -560,6 +563,9 @@ Entry Arrays are used to store a sorted array of offsets to entries. Entry
 arrays are strictly sorted by offsets on disk, and hence by their timestamps and
 sequence numbers (with some restrictions, see above).
 
+If the `HEADER_INCOMPATIBLE_COMPACT` flag is set, offsets are stored as 32-bit
+integers instead of 64bit.
+
 Entry Arrays are chained up. If one entry array is full another one is
 allocated and the **next_entry_array_offset** field of the old one pointed to
 it. An Entry Array with **next_entry_array_offset** set to 0 is the last in the
-- 
cgit v1.2.1
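
Note on the `items` union: a reader has to pick the item width from the file's compact flag before indexing into an entry array. The sketch below illustrates that in C under stated assumptions. The helper names (`read_le32`, `read_le64`, `entry_array_item_size`, `entry_array_item`) are hypothetical and not taken from the systemd sources. The `compact` argument stands in for "the `HEADER_INCOMPATIBLE_COMPACT` flag is set in the file header", and `items` is assumed to point at the first byte after `next_entry_array_offset`.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode a little-endian 32-bit value from raw bytes (host-endian independent). */
static uint32_t read_le32(const uint8_t *p) {
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Decode a little-endian 64-bit value as two 32-bit halves, low half first. */
static uint64_t read_le64(const uint8_t *p) {
    return (uint64_t)read_le32(p) | (uint64_t)read_le32(p + 4) << 32;
}

/* Hypothetical helper: width in bytes of one entry array item.
 * compact != 0 models HEADER_INCOMPATIBLE_COMPACT being set. */
static size_t entry_array_item_size(int compact) {
    return compact ? sizeof(uint32_t) : sizeof(uint64_t);
}

/* Hypothetical helper: return the i-th entry offset stored in an entry
 * array's item payload, widened to 64 bits in both modes. */
static uint64_t entry_array_item(const uint8_t *items, size_t i, int compact) {
    const uint8_t *p = items + i * entry_array_item_size(compact);
    return compact ? (uint64_t)read_le32(p) : read_le64(p);
}
```

Widening to `uint64_t` in both branches lets callers treat offsets the same way regardless of mode; whether the real implementation does this is an assumption of the sketch.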