ovsdb: Use column diffs for ovsdb and raft log entries.

Currently, ovsdb-server stores complete value for the column in a database
file and in a raft log in case this column changed.  This means that
transaction that adds, for example, one new acl to a port group creates
a log entry with all UUIDs of all existing acls + one new.  Same for
ports in logical switches and routers and more other columns with sets
in Northbound DB.

There could be thousands of acls in one port group or thousands of ports
in a single logical switch.  And the typical use case is to add one new
if we're starting a new service/VM/container or adding one new node in a
kubernetes or OpenStack cluster.  This generates huge amount of traffic
within ovsdb raft cluster, grows overall memory consumption and hurts
performance since all these UUIDs are parsed and formatted to/from json
several times and stored on disks.  And more values we have in a set -
more space a single log entry will occupy and more time it will take to
process by ovsdb-server cluster members.

Simple test:

1. Start OVN sandbox with clustered DBs:
   # make sandbox SANDBOXFLAGS='--nbdb-model=clustered --sbdb-model=clustered'

2. Run a script that creates one port group and adds 4000 acls into it:
   # cat ../memory-test.sh
   pg_name=my_port_group
   export OVN_NB_DAEMON=$(ovn-nbctl --pidfile --detach --log-file -vsocket_util:off)
   ovn-nbctl pg-add $pg_name
   for i in $(seq 1 4000); do
     echo "Iteration: $i"
     ovn-nbctl --log acl-add $pg_name from-lport $i udp drop
   done
   ovn-nbctl acl-del $pg_name
   ovn-nbctl pg-del $pg_name
   ovs-appctl -t $(pwd)/sandbox/nb1 memory/show
   ovn-appctl -t ovn-nbctl exit
   ---

3. Check the current memory consumption of ovsdb-server processes and
   space occupied by database files:
   # ls sandbox/[ns]b*.db -alh
   # ps -eo vsz,rss,comm,cmd | egrep '=[ns]b[123].pid'

Test results with current ovsdb log format:

  On-disk Nb DB size     :  ~369 MB
  RSS of Nb ovsdb-servers:  ~2.7 GB
  Time to finish the test:  ~2m

In order to mitigate memory consumption issues and reduce computational
load on ovsdb-servers let's store diff between old and new values
instead.  This will make size of each log entry that adds single acl to
port group (or port to logical switch or anything else like that) very
small and independent from the number of already existing acls (ports,
etc.).

Added a new marker '_is_diff' into a file transaction to specify that
this transaction contains diffs instead of replacements for the existing
data.

One side effect is that this change will actually increase the size of
file transaction that removes more than a half of entries from the set,
because diff will be larger than the resulted new value.  However, such
operations are rare.

Test results with change applied:

  On-disk Nb DB size     :  ~2.7 MB  ---> reduced by 99%
  RSS of Nb ovsdb-servers:  ~580 MB  ---> reduced by 78%
  Time to finish the test:  ~1m27s   ---> reduced by 27%

After this change new ovsdb-server is still able to read old databases,
but old ovsdb-server will not be able to read new ones.
Since new servers could join ovsdb cluster dynamically it's hard to
implement any runtime mechanism to handle cases where different
versions of ovsdb-server joins the cluster.  However we still need to
handle cluster upgrades.  For this case added special command line
argument to disable new functionality.  Documentation updated with the
recommended way to upgrade the ovsdb cluster.

Acked-by: Dumitru Ceara <dceara@redhat.com>
Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
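
The size trade-off described in the commit message can be illustrated with a
small sketch.  It assumes that the diff of a set column behaves like a
symmetric difference (the elements present in exactly one of the old and new
values), which matches the remark that removing more than half of the entries
produces a diff larger than the new value.  The names ``column_diff`` and
``apply_diff`` are hypothetical and are not taken from the ovsdb sources.

```python
# Illustrative model only: a set column is a frozenset of strings, and the
# diff is ASSUMED to be the symmetric difference of the old and new values.

def column_diff(old: frozenset, new: frozenset) -> frozenset:
    """Return the elements present in exactly one of old and new."""
    return old ^ new


def apply_diff(old: frozenset, diff: frozenset) -> frozenset:
    """Rebuild the new value from the old value and a diff."""
    return old ^ diff


# A port group that already holds 4000 acls (placeholder names, not UUIDs).
acls = frozenset(f"acl-{i}" for i in range(4000))

# Adding one acl: the diff holds 1 element, the full value holds 4001.
added = acls | {"acl-new"}
diff_add = column_diff(acls, added)

# Removing 3000 of 4000 acls: the diff (3000 elements) is larger than
# the resulting value (1000 elements).
removed = frozenset(f"acl-{i}" for i in range(1000))
diff_remove = column_diff(acls, removed)

print(len(diff_add), len(added))        # 1 4001
print(len(diff_remove), len(removed))   # 3000 1000
```

In this model the entry for a single addition stays the same size no matter
how many acls already exist, while a bulk removal is the rare case in which
the full new value would have been smaller.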
author: Ilya Maximets <i.maximets@ovn.org> 2020-12-11 21:54:47 +0100
committer: Ilya Maximets <i.maximets@ovn.org> 2021-01-15 19:23:02 +0100
commit: 2ccd66f594f7a5fdc39028f8c7473e11d2329a11 (patch)
tree: 4226fe03c9c33033f65fa083071ede5f8428fefa /Documentation/ref
parent: 980bca70799da3d186c568f26b72a9774043d6ef (diff)
download: openvswitch-2ccd66f594f7a5fdc39028f8c7473e11d2329a11.tar.gz
1 files changed, 65 insertions, 0 deletions
diff --git a/Documentation/ref/ovsdb.7.rst b/Documentation/ref/ovsdb.7.rst
index da4dbedd2..e4f1bf766 100644
--- a/Documentation/ref/ovsdb.7.rst
+++ b/Documentation/ref/ovsdb.7.rst
@@ -204,6 +204,14 @@ split-brain.
 
 Open vSwitch 2.6 introduced support for the active-backup service model.
 
+.. important::
+
+   The database file format changed in version 2.15.
+   To upgrade or downgrade the ``ovsdb-server`` processes across this version,
+   follow the instructions described under
+   `Upgrading from version 2.14 and earlier to 2.15 and later`_ and
+   `Downgrading from version 2.15 and later to 2.14 and earlier`_.
+
 Clustered Database Service Model
 --------------------------------
 
@@ -270,11 +278,68 @@ vSwitch to another, upgrading them one at a time will keep the cluster healthy
 during the upgrade process.  (This is different from upgrading a database
 schema, which is covered later under `Upgrading or Downgrading a Database`_.)
 
+.. important::
+
+   The database file format changed in version 2.15.
+   To upgrade or downgrade the ``ovsdb-server`` processes across this version,
+   follow the instructions described under
+   `Upgrading from version 2.14 and earlier to 2.15 and later`_ and
+   `Downgrading from version 2.15 and later to 2.14 and earlier`_.
+
 Clustered OVSDB does not support the OVSDB "ephemeral columns" feature.
 ``ovsdb-tool`` and ``ovsdb-client`` change ephemeral columns into persistent
 ones when they work with schemas for clustered databases.  Future versions of
 OVSDB might add support for this feature.
 
+Upgrading from version 2.14 and earlier to 2.15 and later
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Version 2.15 changed the database file format, so older versions of
+``ovsdb-server`` cannot read a database file that has been modified by
+``ovsdb-server`` version 2.15 or later.  This also affects runtime
+communications between servers in **active-backup** and **cluster** service
+models.  To upgrade the ``ovsdb-server`` processes from one version of Open
+vSwitch (2.14 or earlier) to another (2.15 or later), follow the instructions
+below.  (This is different from upgrading a database schema, which is
+covered later under `Upgrading or Downgrading a Database`_.)
+
+For the **standalone** service model, no special handling is required during
+the upgrade.
+
+For the **active-backup** service model, the administrator needs to upgrade the
+backup ``ovsdb-server`` first and the active one after that, or shut down both
+servers and upgrade them at the same time.
+
+For the **cluster** service model, the recommended upgrade strategy is:
+
+1. Upgrade processes one at a time.  After upgrading, start each
+   ``ovsdb-server`` process with the ``--disable-file-column-diff`` command
+   line argument.
+
+2. When all ``ovsdb-server`` processes are upgraded, use ``ovs-appctl`` to
+   invoke the ``ovsdb/file/column-diff-enable`` command on each of them, or
+   restart all ``ovsdb-server`` processes one at a time without the
+   ``--disable-file-column-diff`` command line option.
+
+Downgrading from version 2.15 and later to 2.14 and earlier
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Like the upgrade covered under `Upgrading from version 2.14 and earlier to
+2.15 and later`_, downgrading ``ovsdb-server`` from version 2.15 or later
+to 2.14 or earlier requires additional steps.  (This is different from
+upgrading a database schema, which is covered later under
+`Upgrading or Downgrading a Database`_.)
+
+For all service models, the following steps are required:
+
+1. Stop all ``ovsdb-server`` processes (single process for **standalone**
+   service model, all involved processes for **active-backup** and **cluster**
+   service models).
+
+2. Compact all database files with the ``ovsdb-tool compact`` command.
+
+3. Downgrade and restart the ``ovsdb-server`` processes.
+
 Understanding Cluster Consistency
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~