From 99daf3ce03f4091c74400f895f9c82a1c046e645 Mon Sep 17 00:00:00 2001
From: Daan De Meyer <daan.j.demeyer@gmail.com>
Date: Sat, 23 Oct 2021 22:36:47 +0100
Subject: journal: Use 32-bit entry array offsets in compact mode

Before:

OBJECT TYPE      ENTRIES SIZE
Unused           0       0B
Data             3610336 595.7M
Field            5310    285.2K
Entry            3498326 1.2G
Data Hash Table  29      103.1M
Field Hash Table 29      151.3K
Entry Array      605991  1011.6M
Tag              0       0B
Total            7720021 2.9G

After:

OBJECT TYPE      ENTRIES SIZE
Unused           0       0B
Data             3562667 591.0M
Field            3971    213.6K
Entry            3498566 1.2G
Data Hash Table  20      71.1M
Field Hash Table 20      104.3K
Entry Array      582647  505.0M
Tag              0       0B
Total            7647891 2.4G
---
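
Note for reviewers: the Entry Array size in the tables above drops from
1011.6M to 505.0M, which is roughly what one would expect when each item
shrinks from 8 to 4 bytes. The sketch below illustrates how a reader might
pick the item width from the compact flag. It is a minimal illustration,
not systemd's implementation: read_le(), entry_array_item_size(),
entry_array_n_items() and entry_array_item() are hypothetical names, and
the "compact" argument stands in for a check of HEADER_INCOMPATIBLE_COMPACT.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode a little-endian unsigned integer of the given
 * width byte by byte, so the result does not depend on host endianness. */
static uint64_t read_le(const uint8_t *p, size_t width) {
        uint64_t v = 0;

        for (size_t i = 0; i < width; i++)
                v |= (uint64_t) p[i] << (8 * i);

        return v;
}

/* Width in bytes of one items[] slot: 4 in compact mode, 8 otherwise. */
static size_t entry_array_item_size(int compact) {
        return compact ? sizeof(uint32_t) : sizeof(uint64_t);
}

/* Number of slots in an items[] payload of payload_size bytes, i.e. the
 * object size minus ObjectHeader and next_entry_array_offset. */
static size_t entry_array_n_items(size_t payload_size, int compact) {
        return payload_size / entry_array_item_size(compact);
}

/* Entry offset stored in slot i, widened to 64 bits in both modes. */
static uint64_t entry_array_item(const uint8_t *items, size_t n_items,
                                 size_t i, int compact) {
        assert(i < n_items);

        size_t width = entry_array_item_size(compact);
        return read_le(items + i * width, width);
}
```

Because offsets are relative to the beginning of the file, a 32-bit item can
only point at entries located below 4 GiB, so this layout presumably only
makes sense for files that stay under that size.
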
 docs/JOURNAL_FILE_FORMAT.md | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/docs/JOURNAL_FILE_FORMAT.md b/docs/JOURNAL_FILE_FORMAT.md
index d40688c440..c4484693af 100644
--- a/docs/JOURNAL_FILE_FORMAT.md
+++ b/docs/JOURNAL_FILE_FORMAT.md
@@ -71,7 +71,7 @@ thread](https://lists.freedesktop.org/archives/systemd-devel/2012-October/007054
 
 ## Basics
 
-* All offsets, sizes, time values, hashes (and most other numeric values) are 64bit unsigned integers in LE format.
+* All offsets, sizes, time values, hashes (and most other numeric values) are 32bit/64bit unsigned integers in LE format.
 * Offsets are always relative to the beginning of the file.
 * The 64bit hash function siphash24 is used for newer journal files. For older files [Jenkins lookup3](https://en.wikipedia.org/wiki/Jenkins_hash_function) is used, more specifically `jenkins_hashlittle2()` with the first 32bit integer it returns as higher 32bit part of the 64bit value, and the second one uses as lower 32bit part.
 * All structures are aligned to 64bit boundaries and padded to multiples of 64bit
@@ -552,7 +552,10 @@ creativity rather than runtime parameters.
 _packed_ struct EntryArrayObject {
         ObjectHeader object;
         le64_t next_entry_array_offset;
-        le64_t items[];
+        union {
+                le64_t regular[];
+                le32_t compact[];
+        } items;
 };
 ```
 
@@ -560,6 +563,9 @@ Entry Arrays are used to store a sorted array of offsets to entries. Entry
 arrays are strictly sorted by offsets on disk, and hence by their timestamps
 and sequence numbers (with some restrictions, see above).
 
+If the `HEADER_INCOMPATIBLE_COMPACT` flag is set, the offsets in **items** are
+stored as 32bit integers instead of 64bit integers.
+
 Entry Arrays are chained up. If one entry array is full another one is
 allocated and the **next_entry_array_offset** field of the old one pointed to
 it. An Entry Array with **next_entry_array_offset** set to 0 is the last in the
-- 
cgit v1.2.1