summaryrefslogtreecommitdiff
path: root/src/rdb.h
diff options
context:
space:
mode:
authorantirez <antirez@gmail.com>2017-06-27 13:09:33 +0200
committerantirez <antirez@gmail.com>2017-06-27 13:19:16 +0200
commit365dd037dcc00249c7631caac82c49a9c0c8c0f6 (patch)
tree19da989003aab15c4b0a7535fa805905df199a09 /src/rdb.h
parentc3998728a2674ebdbc3c24851adece5ccc9d3363 (diff)
downloadredis-365dd037dcc00249c7631caac82c49a9c0c8c0f6.tar.gz
RDB modules values serialization format version 2.
The original RDB serialization format was not parsable without the module loaded, becuase the structure was managed only by the module itself. Moreover RDB is a streaming protocol in the sense that it is both produce di an append-only fashion, and is also sometimes directly sent to the socket (in the case of diskless replication). The fact that modules values cannot be parsed without the relevant module loaded is a problem in many ways: RDB checking tools must have loaded modules even for doing things not involving the value at all, like splitting an RDB into N RDBs by key or alike, or just checking the RDB for sanity. In theory module values could be just a blob of data with a prefixed length in order for us to be able to skip it. However prefixing the values with a length would mean one of the following: 1. To be able to write some data at a previous offset. This breaks stremaing. 2. To bufferize values before outputting them. This breaks performances. 3. To have some chunked RDB output format. This breaks simplicity. Moreover, the above solution, still makes module values a totally opaque matter, with the fowllowing problems: 1. The RDB check tool can just skip the value without being able to at least check the general structure. For datasets composed mostly of modules values this means to just check the outer level of the RDB not actually doing any checko on most of the data itself. 2. It is not possible to do any recovering or processing of data for which a module no longer exists in the future, or is unknown. So this commit implements a different solution. The modules RDB serialization API is composed if well defined calls to store integers, floats, doubles or strings. After this commit, the parts generated by the module API have a one-byte prefix for each of the above emitted parts, and there is a final EOF byte as well. So even if we don't know exactly how to interpret a module value, we can always parse it at an high level, check the overall structure, understand the types used to store the information, and easily skip the whole value. The change is backward compatible: older RDB files can be still loaded since the new encoding has a new RDB type: MODULE_2 (of value 7). The commit also implements the ability to check RDB files for sanity taking advantage of the new feature.
Diffstat (limited to 'src/rdb.h')
-rw-r--r--src/rdb.h12
1 files changed, 11 insertions, 1 deletions
diff --git a/src/rdb.h b/src/rdb.h
index efe932255..a22cb33ce 100644
--- a/src/rdb.h
+++ b/src/rdb.h
@@ -78,6 +78,8 @@
#define RDB_TYPE_HASH 4
#define RDB_TYPE_ZSET_2 5 /* ZSET version 2 with doubles stored in binary. */
#define RDB_TYPE_MODULE 6
+#define RDB_TYPE_MODULE_2 7 /* Module value with annotations for parsing without
+ the generating module being loaded. */
/* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType() BELOW */
/* Object types for encoded objects. */
@@ -90,7 +92,7 @@
/* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType() BELOW */
/* Test if a type is an object type. */
-#define rdbIsObjectType(t) ((t >= 0 && t <= 6) || (t >= 9 && t <= 14))
+#define rdbIsObjectType(t) ((t >= 0 && t <= 7) || (t >= 9 && t <= 14))
/* Special RDB opcodes (saved/loaded with rdbSaveType/rdbLoadType). */
#define RDB_OPCODE_AUX 250
@@ -100,6 +102,14 @@
#define RDB_OPCODE_SELECTDB 254
#define RDB_OPCODE_EOF 255
+/* Module serialized values sub opcodes */
+#define RDB_MODULE_OPCODE_EOF 0 /* End of module value. */
+#define RDB_MODULE_OPCODE_SINT 1 /* Signed integer. */
+#define RDB_MODULE_OPCODE_UINT 2 /* Unsigned integer. */
+#define RDB_MODULE_OPCODE_FLOAT 3 /* Float. */
+#define RDB_MODULE_OPCODE_DOUBLE 4 /* Double. */
+#define RDB_MODULE_OPCODE_STRING 5 /* String. */
+
/* rdbLoad...() functions flags. */
#define RDB_LOAD_NONE 0
#define RDB_LOAD_ENC (1<<0)