author | antirez <antirez@gmail.com> | 2017-06-27 13:09:33 +0200 |
---|---|---|
committer | antirez <antirez@gmail.com> | 2017-06-27 13:19:16 +0200 |
commit | 365dd037dcc00249c7631caac82c49a9c0c8c0f6 (patch) | |
tree | 19da989003aab15c4b0a7535fa805905df199a09 /src/rdb.h | |
parent | c3998728a2674ebdbc3c24851adece5ccc9d3363 (diff) | |
download | redis-365dd037dcc00249c7631caac82c49a9c0c8c0f6.tar.gz |
RDB modules values serialization format version 2.
The original RDB serialization format was not parsable without the
module loaded, because the structure was managed only by the module
itself. Moreover RDB is a streaming protocol in the sense that it is
both produced in an append-only fashion, and is also sometimes directly
sent to the socket (in the case of diskless replication).
The fact that modules values cannot be parsed without the relevant
module loaded is a problem in many ways: RDB checking tools must have
loaded modules even for doing things not involving the value at all,
like splitting an RDB into N RDBs by key or alike, or just checking the
RDB for sanity.
In theory module values could be just a blob of data with a prefixed
length in order for us to be able to skip it. However prefixing the values
with a length would mean one of the following:
1. To be able to write some data at a previous offset. This breaks
streaming.
2. To buffer values before outputting them. This breaks performance.
3. To have some chunked RDB output format. This breaks simplicity.
Moreover, the above solution still makes module values a totally opaque
matter, with the following problems:
1. The RDB check tool can just skip the value without being able to at
least check the general structure. For datasets composed mostly of
modules values this means checking only the outer level of the RDB, without
actually doing any check on most of the data itself.
2. It is not possible to do any recovering or processing of data for which a
module no longer exists in the future, or is unknown.
So this commit implements a different solution. The modules RDB
serialization API is composed of well-defined calls to store integers,
floats, doubles or strings. After this commit, the parts generated by
the module API have a one-byte prefix for each of the above emitted
parts, and there is a final EOF byte as well. So even if we don't know
exactly how to interpret a module value, we can always parse it at a
high level, check the overall structure, understand the types used to
store the information, and easily skip the whole value.
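As an illustration of why the per-part prefix and the final EOF byte make a value skippable, here is a minimal sketch of a walker over one tagged module value. The sub opcode values are the ones this commit adds to rdb.h. Everything else is an assumption made for the example: skipModuleValue() is a hypothetical helper, and the payload layout (8 raw bytes for integers and doubles, 4 for floats, a 1-byte length before string bytes) is a toy stand-in, not the real RDB encoding of these parts.

```c
#include <stddef.h>

/* Sub opcodes, with the values defined by this commit in src/rdb.h. */
#define RDB_MODULE_OPCODE_EOF 0
#define RDB_MODULE_OPCODE_SINT 1
#define RDB_MODULE_OPCODE_UINT 2
#define RDB_MODULE_OPCODE_FLOAT 3
#define RDB_MODULE_OPCODE_DOUBLE 4
#define RDB_MODULE_OPCODE_STRING 5

/* ASSUMPTION (toy payload layout, NOT the real RDB encoding):
 * integers and doubles are 8 raw bytes, floats are 4 raw bytes, and
 * strings are a 1-byte length followed by that many bytes. */

/* Hypothetical helper: walk one module value starting at buf[0].
 * Returns the number of bytes the value occupies, EOF byte included,
 * or 0 if the data is truncated or contains an unknown sub opcode. */
size_t skipModuleValue(const unsigned char *buf, size_t len) {
    size_t pos = 0;

    while (pos < len) {
        unsigned char opcode = buf[pos++];
        size_t payload;

        switch (opcode) {
        case RDB_MODULE_OPCODE_EOF:
            return pos; /* Whole value consumed. */
        case RDB_MODULE_OPCODE_SINT:
        case RDB_MODULE_OPCODE_UINT:
        case RDB_MODULE_OPCODE_DOUBLE:
            payload = 8;
            break;
        case RDB_MODULE_OPCODE_FLOAT:
            payload = 4;
            break;
        case RDB_MODULE_OPCODE_STRING:
            if (pos >= len) return 0; /* Missing length byte. */
            payload = (size_t)buf[pos++];
            break;
        default:
            return 0; /* Unknown sub opcode: structure is invalid. */
        }
        if (payload > len - pos) return 0; /* Truncated payload. */
        pos += payload;
    }
    return 0; /* Ran out of data before the EOF byte. */
}
```

The walker never needs to know what the module meant by each part: the prefix byte alone tells it how to advance, which is the property a check tool can rely on.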
The change is backward compatible: older RDB files can still be loaded
since the new encoding has a new RDB type: MODULE_2 (of value 7).
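A minimal sketch of how a loader could tell the two formats apart by type, assuming only the macros shown in the rdb.h diff of this commit. moduleValueFormat() is a hypothetical helper introduced for the example; it is not a function of the Redis source.

```c
/* Type values and range test as they appear in src/rdb.h after this commit. */
#define RDB_TYPE_MODULE 6
#define RDB_TYPE_MODULE_2 7
#define rdbIsObjectType(t) ((t >= 0 && t <= 7) || (t >= 9 && t <= 14))

/* Hypothetical helper: returns 1 for the original opaque module format,
 * 2 for the annotated format introduced here, and 0 for anything that is
 * not a module value (including types that are not object types at all). */
int moduleValueFormat(int rdbtype) {
    if (!rdbIsObjectType(rdbtype)) return 0;
    if (rdbtype == RDB_TYPE_MODULE) return 1;
    if (rdbtype == RDB_TYPE_MODULE_2) return 2;
    return 0;
}
```

Because the old type 6 keeps its meaning and the new encoding only ever appears under type 7, a file written before this commit takes the first branch unchanged.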
The commit also implements the ability to check RDB files for sanity
taking advantage of the new feature.
Diffstat (limited to 'src/rdb.h')
-rw-r--r-- | src/rdb.h | 12 |
1 files changed, 11 insertions, 1 deletions
@@ -78,6 +78,8 @@
 #define RDB_TYPE_HASH 4
 #define RDB_TYPE_ZSET_2 5 /* ZSET version 2 with doubles stored in binary. */
 #define RDB_TYPE_MODULE 6
+#define RDB_TYPE_MODULE_2 7 /* Module value with annotations for parsing without
+                               the generating module being loaded. */
 /* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType() BELOW */
 
 /* Object types for encoded objects. */
@@ -90,7 +92,7 @@
 /* NOTE: WHEN ADDING NEW RDB TYPE, UPDATE rdbIsObjectType() BELOW */
 
 /* Test if a type is an object type. */
-#define rdbIsObjectType(t) ((t >= 0 && t <= 6) || (t >= 9 && t <= 14))
+#define rdbIsObjectType(t) ((t >= 0 && t <= 7) || (t >= 9 && t <= 14))
 
 /* Special RDB opcodes (saved/loaded with rdbSaveType/rdbLoadType). */
 #define RDB_OPCODE_AUX 250
@@ -100,6 +102,14 @@
 #define RDB_OPCODE_SELECTDB 254
 #define RDB_OPCODE_EOF 255
 
+/* Module serialized values sub opcodes */
+#define RDB_MODULE_OPCODE_EOF 0 /* End of module value. */
+#define RDB_MODULE_OPCODE_SINT 1 /* Signed integer. */
+#define RDB_MODULE_OPCODE_UINT 2 /* Unsigned integer. */
+#define RDB_MODULE_OPCODE_FLOAT 3 /* Float. */
+#define RDB_MODULE_OPCODE_DOUBLE 4 /* Double. */
+#define RDB_MODULE_OPCODE_STRING 5 /* String. */
+
 /* rdbLoad...() functions flags. */
 #define RDB_LOAD_NONE 0
 #define RDB_LOAD_ENC (1<<0)