author: Lingxian Kong <anlin.kong@gmail.com> 2021-07-22 16:38:08 +1200
committer: Lingxian Kong <anlin.kong@gmail.com> 2021-07-23 22:16:20 +1200
commit: 02971d850b57ac27a126ecb8ca4012f97ae856fd (patch)
tree: 0ae1bf0909bcbc13b74a4b7ba35f083c4fcbfb2a /doc
parent: 69f08ab470a0d1d1d4676b41ad29a9c19ce28648 (diff)
Add periodic task to remove postgres archived wal files

* Added a periodic task for postgresql datastore to clean up the archived
  WAL files.
* Added a check when creating incremental backups for postgresql.
* A new container image ``openstacktrove/db-backup-postgresql:1.1.2`` is
  uploaded to docker hub.

Story: 2009066
Task: 42871
Change-Id: I235e2abf8c0405e143ded6fb48017d596b8b41a1

Diffstat (limited to 'doc')
-rw-r--r-- doc/source/admin/database_management.rst 24
-rw-r--r-- doc/source/admin/index.rst 1
-rw-r--r-- doc/source/user/backup-db.rst 22
3 files changed, 47 insertions, 0 deletions
diff --git a/doc/source/admin/database_management.rst b/doc/source/admin/database_management.rst
new file mode 100644
index 00000000..58857384
--- /dev/null
+++ b/doc/source/admin/database_management.rst
@@ -0,0 +1,24 @@
+.. _database_management:
+
+===================
+Database Management
+===================
+
+PostgreSQL
+----------
+
+WAL(Write Ahead Log)
+~~~~~~~~~~~~~~~~~~~~
+
+By default, ``archive_mode`` is enabled so that users can create incremental database backups on demand. ``archive_command`` is also configured for continuous WAL archiving: the WAL files in the ``pg_wal`` subdirectory are copied to ``/var/lib/postgresql/data/wal_archive``.
+
+This becomes a problem if the WAL segment files in the archive folder keep accumulating. On a busy system in particular, several TBs of WAL files can pile up in the archive destination (which is part of the data volume) and eventually make the database service unavailable.
+
+The PostgreSQL manager of trove-guestagent runs a periodic task that cleans up the archive folder. Each time it runs, it checks the size of the archive folder. If the size is greater than half of the data volume size, it does the following in the archive folder:
+
+1. If there is no ``.backup`` file, the database has never been backed up, so all the WAL segment files except for the latest one are removed.
+2. If there are ``.backup`` files, all the files older than the backup file are removed. The size is then checked again; if the size condition is still met, all the WAL segment files except for the latest one are removed.
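
The two rules above can be sketched as a pure selection function. This is only an illustration of the documented behaviour, not the code in trove-guestagent: the names ``ArchiveFile`` and ``select_wal_files_to_remove`` are hypothetical, file age is assumed to be judged by modification time, "the backup file" is assumed to mean the most recent ``.backup`` file, and ``.backup`` files that survive the first pass are assumed to be kept.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArchiveFile:
    """Hypothetical description of one file in the WAL archive folder."""
    name: str
    size: int      # size in bytes
    mtime: float   # modification time, used here to order files by age


def _total_size(files: List[ArchiveFile]) -> int:
    return sum(f.size for f in files)


def select_wal_files_to_remove(files: List[ArchiveFile],
                               volume_size: int) -> List[str]:
    """Return the names of archive files the cleanup would delete."""
    threshold = volume_size / 2
    # Nothing to do unless the archive uses more than half of the volume.
    if _total_size(files) <= threshold:
        return []

    ordered = sorted(files, key=lambda f: f.mtime)
    backups = [f for f in ordered if f.name.endswith(".backup")]
    removed: List[ArchiveFile] = []
    remaining = ordered

    if backups:
        # Assumption: "the backup file" is the most recent .backup file.
        newest_backup = backups[-1]
        removed = [f for f in ordered if f.mtime < newest_backup.mtime]
        remaining = [f for f in ordered if f.mtime >= newest_backup.mtime]
        # Check the size again; stop if the archive is now small enough.
        if _total_size(remaining) <= threshold:
            return [f.name for f in removed]

    # Either no backup exists or the archive is still too large:
    # drop every WAL segment except the latest one.
    segments = [f for f in remaining if not f.name.endswith(".backup")]
    removed = removed + segments[:-1]
    return [f.name for f in removed]
```

Keeping the decision separate from the actual file deletion makes the rule easy to exercise with in-memory data before it touches a real archive folder.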
+
+When creating an incremental backup, Trove checks whether the parent backup file still exists in the archive folder; the backup creation fails if it is not found. The user can see the error message in the instance details and has to create a full backup instead.
+
+Another option is to archive WAL files to Swift (in the user's account), e.g. using WAL-G or other third-party tools, but that incurs charges for the object storage usage, which is not optimal. We leave it to the users to decide when and how the backups should be created.
diff --git a/doc/source/admin/index.rst b/doc/source/admin/index.rst
index 6e6f8b29..38a62b81 100644
--- a/doc/source/admin/index.rst
+++ b/doc/source/admin/index.rst
@@ -10,4 +10,5 @@
datastore
building_guest_images
secure_oslo_messaging
+ database_management
troubleshooting
diff --git a/doc/source/user/backup-db.rst b/doc/source/user/backup-db.rst
index d45272f6..d8450b04 100644
--- a/doc/source/user/backup-db.rst
+++ b/doc/source/user/backup-db.rst
@@ -333,3 +333,25 @@ object URL), the local datastore version and the backup data size are required.
| status | RESTORED |
| updated | 2021-02-22T01:44:06 |
+----------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+
+Troubleshooting
+---------------
+
+Failed to create incremental backup for PostgreSQL
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+One possible reason is that a long time has passed since the parent backup was created, and the parent backup WAL file has been removed internally because of disk pressure. This can be confirmed by checking the instance details, for example:
+
+.. code-block:: console
+
+   $ openstack database instance show e7231e46-ca3b-4dce-bf67-739b3af0ef85 -c fault
+   +-------+----------------------------------------------------------------------+
+   | Field | Value                                                                |
+   +-------+----------------------------------------------------------------------+
+   | fault | Failed to create backup c76de467-6587-4e27-bb8d-7c3d3b136663, error: |
+   |       | Cannot find parent backup WAL file.                                  |
+   +-------+----------------------------------------------------------------------+
+
+In this case, you have to create a full backup instead.
+
+To avoid this issue in the future, you can set up a cron job to create (incremental) backups regularly.
\ No newline at end of file