author Nick Vatamaniuc <vatamane@gmail.com> 2022-11-16 22:32:35 -0500
committer Nick Vatamaniuc <nickva@users.noreply.github.com> 2022-11-18 17:04:30 -0500
commit 3c24731a5e49bbb4a1d1f407f11f141ca8698e6c
tree e4107c1c74edec9d8037906c0cf7d78f1f40545e
parent 111f2616e1ef48fadb43dcc358bb8dabb69ad839
Update smoosh documentation

* Remove the state chart. With the activated/not-activated state gone, we don't need it any longer.
* Describe the cleanup channels.
* Add upgrade db and view channel references in a few places.
* Remove references to `external` or `data_size` and other previous compaction size metrics used for triggering compactions. Replace references with `active` size.
* Use double back-ticks in a few places instead of single back-ticks due to differences in RST vs MD. In RST, code literals need double back-ticks.

-rw-r--r-- src/docs/src/maintenance/compaction.rst 28
-rw-r--r-- src/smoosh/README.md 33
-rw-r--r-- src/smoosh/operator_guide.md 89
-rw-r--r-- src/smoosh/recovery_process_diagram.jpeg bin 51388 -> 0 bytes
4 files changed, 74 insertions, 76 deletions
diff --git a/src/docs/src/maintenance/compaction.rst b/src/docs/src/maintenance/compaction.rst
index c15344f11..5f3ff02cc 100644
--- a/src/docs/src/maintenance/compaction.rst
+++ b/src/docs/src/maintenance/compaction.rst
@@ -93,6 +93,7 @@ configuration setting in the ``[smoosh]`` block. The default configuration is
[smoosh]
db_channels = upgrade_dbs,ratio_dbs,slack_dbs
view_channels = upgrade_views,ratio_views,slack_views
+ cleanup_channels = index_cleanup
[smoosh.ratio_dbs]
priority = ratio
@@ -110,18 +111,23 @@ configuration setting in the ``[smoosh]`` block. The default configuration is
priority = slack
min_priority = 536870912
-The "upgrade" channels are a special pair of channels that only check whether
-the `disk_format_version` for the file matches the current version, and enqueue
-the file for compaction (which has the side effect of upgrading the file format)
-if that's not the case. There are several additional properties that can be
-configured for each channel; these are documented in the :ref:`configuration API
-<config/compactions>`
+The "upgrade" and "cleanup_channels" are special system channels. The "upgrade"
+ones check whether the ``disk_format_version`` for the file matches the current
+version, and enqueue the file for compaction (which has the side effect of
+upgrading the file format) if that's not the case. In addition to that, the
+``upgrade_views`` will enqueue views for compaction after the collation
+(libicu) library is upgraded. The "index_cleanup" channel is used for
+scheduling jobs used to remove stale index files and purge _local checkpoint
+documents after design documents are updated.
+
+There are several additional properties that can be configured for each channel;
+these are documented in the :ref:`configuration API <config/compactions>`.
Scheduling Windows
------------------
Each compaction channel can be configured to run only during certain hours of
-the day. The channel-specific `from`, `to`, and `strict_window` configuration
+the day. The channel-specific ``from``, ``to``, and ``strict_window`` configuration
settings control this behavior. For example
.. code-block:: ini
@@ -131,7 +137,7 @@ settings control this behavior. For example
to = 06:00
strict_window = true
-where `overnight_channel` is the name of the channel you want to configure.
+where ``overnight_channel`` is the name of the channel you want to configure.
Note: CouchDB determines time via the UTC (GMT) timezone, so these settings must be
expressed as UTC (GMT).
@@ -220,9 +226,9 @@ Manual Database Compaction
Database compaction compresses the database file by removing unused file
sections created during updates. Old document revisions are replaced with
-small amount of metadata called `tombstone` which are used for conflicts
+a small amount of metadata called ``tombstone`` which is used for conflict
resolution during replication. The number of stored revisions
-(and their `tombstones`) can be configured by using the :get:`_revs_limit
+(and their ``tombstones``) can be configured by using the :get:`_revs_limit
</{db}/_revs_limit>` URL endpoint.
Compaction can be manually triggered per database and runs as a background
@@ -326,7 +332,7 @@ is actually running. To track the compaction progress you may query the
Manual View Compaction
======================
-`Views` also need compaction. Unlike databases, views are compacted by groups
+Views also need compaction. Unlike databases, views are compacted by groups
per `design document`. To start their compaction, send the HTTP
:post:`/{db}/_compact/{ddoc}` request::
diff --git a/src/smoosh/README.md b/src/smoosh/README.md
index 9f9a48074..31d111ba3 100644
--- a/src/smoosh/README.md
+++ b/src/smoosh/README.md
@@ -24,6 +24,8 @@ The main settings one interacts with are:
databases.
<dt>view_channels<dd>A comma-separated list of channel names for
views.
+<dt>cleanup_channels<dd>A comma-separated list of channel names
+for cleaning old index files.
<dt>staleness<dd>The number of minutes that the (expensive) priority
calculation can be stale for before it is recalculated. Defaults to 5.
</dl>
@@ -32,9 +34,7 @@ Sometimes it's necessary to use the following:
<dl>
<dt>cleanup_index_files</dt><dd>Whether smoosh cleans up the files
-for indexes that have been deleted. Defaults to false and probably
-shouldn't be changed unless the cluster is running low on disk space,
-and only after considering the ramifications.</dd>
+for indexes that have been deleted. Defaults to true but may be switched to false.</dd>
<dt>wait_secs</dt><dd>The time a channel waits before starting compactions
to allow time to observe the system and make a smarter decision about what
to compact first. Hardly ever changed from the default. Default 30 (seconds).
@@ -68,11 +68,13 @@ properly managed by OTP yet.
Compaction Scheduling Algorithm
-------------------------------
-Smoosh decides whether to compact a database or view by evaluating the
-item against the selection criteria of each _channel_ in the order
-they are configured. By default there are two channels for databases
-("ratio_dbs" and "slack_dbs"), and two channels for views ("ratio_views"
-and "slack_views")
+Smoosh decides whether to compact a database or view by evaluating the item
+against the selection criteria of each _channel_ in the order they are
+configured. By default there are three channels for databases ("ratio_dbs",
+"slack_dbs" and "upgrade_dbs") and three channels for views ("ratio_views",
+"slack_views" and "upgrade_views"). The "cleanup_channels" list has only the
+"index_cleanup" channel. That channel is for enqueueing stale index file
+cleanup jobs.
Smoosh will enqueue the new item to the first channel that accepts
it. If none accept it, the item is not enqueued for compaction.
@@ -80,18 +82,9 @@ it. If none accept it, the item is not enqueued for compaction.
Notes on the data_size value
----------------------------
-Every database and view shard has a data_size value. In CouchDB this
-accurately reflects the post-compaction file size. In DbCore, it is
-the size of the file that we bill for. It excludes the b+tree and
-database footer overhead. We also bill customers for the uncompressed
-size of their documents, though we store them compressed on disk.
-These two systems were developed independently (ours predates
-CouchDB's) and DbCore only calculates the billing size value.
-
-Because of the way our data_size is currently calculated, it can
-sometimes be necessary to enqueue databases and views with very low
-ratios. Due to this, it is also currently impossible to tell how
-optimally compacted a cluster is.
+Every database and view shard has an active size value. In CouchDB this
+accurately reflects the post-compaction file size plus the b+tree metadata and
+database footer overhead.
Example config commands
-----------------------
diff --git a/src/smoosh/operator_guide.md b/src/smoosh/operator_guide.md
index fafee30d4..4764333bd 100644
--- a/src/smoosh/operator_guide.md
+++ b/src/smoosh/operator_guide.md
@@ -29,7 +29,7 @@ each node maintains and processes an independent set of compactions.
Each channel has a basic type for the algorithm it uses to select pending
compactions for its queue and how it prioritises them.
-The two queue types are:
+There are a few queue types:
* **ratio**: this uses the ratio `total_bytes / user_bytes` as its driving
calculation. The result _X_ must be greater than some configurable value _Y_ for a
@@ -45,14 +45,33 @@ calculation of _X_ is described in [Priority calculation](#priority-calculation)
Both algorithms operate on two main measures:
-* **user_bytes**: this is the amount of data the user has in the file. It
-doesn't include storage overhead: old revisions, on-disk btree structure and
-so on.
+* **active_bytes**: this is the amount of data used by the btree structure and
+the document bodies in the leaves of the revision tree of each document. It
+includes storage overhead and the on-disk btree structure, but does not include
+document bodies not in leaf nodes. So, for instance, after deleting a document,
+that document's body revision will become an intermediate revision tree node and
+its size won't be reflected in the **active_bytes** amount.
* **total_bytes**: the size of the file on disk.
Channel type is set using the `priority` configuration setting.
+There are also a few special "system" channels:
+
+* **upgrade_dbs** : this is used for enqueuing database shards which need to be
+  upgraded. This may happen when Apache CouchDB's data format changes.
+
+* **upgrade_views** : this is used for enqueuing views which need to be
+  upgraded. This may happen when the view disk format changes, or after a major
+  version upgrade of the operating system's collation library (libicu). Then,
+  view shards will be enqueued for recompaction, so their rows are re-ordered
+  according to the updated rules of the new collation library.
+
+* **cleanup_channels** : currently there is only a single **index_cleanup**
+  channel which is used to enqueue jobs used to remove stale view index files
+  and purge view client checkpoint _local documents after design documents get
+  updated.
+
#### Further configuration options
Beyond its basic type, there are several other configuration options which
@@ -86,8 +105,8 @@ currently [here][ss].
#### Background Detail
-`user_bytes` is called `data_size` in `db_info` blocks. It is the total of all bytes
-that are used to store docs and their attachments.
+`user_bytes` is called `sizes.active` in `db_info` blocks. It is the total of all bytes
+that are used to store docs and their attachments visible in the leaf nodes of document revision trees.
Since `.couch` files are append only, every update adds data to the file. When
you update a btree, a new leaf node is written and all the nodes back up the
@@ -95,48 +114,15 @@ root. In this update, old data is never overwritten and these parts of the
file are no longer live; this includes old btree nodes and document bodies.
Compaction takes this file and writes a new file that only contains live data.
-`total_data` is the number of bytes in the file as reported by `ls -al filename`.
-
-#### Flaws
-
-An important flaw in this calculation is that `total_data` takes into account
-the compression of data on disk, whereas `user_bytes` does not. This can give
-unexpected results to calculations, as the values are not directly comparable.
-
-However, it's the best measure we currently have.
-
-[Even more info](https://github.com/apache/couchdb-smoosh#notes-on-the-data_size-value).
-
-#### State diagram
-
-Below is a diagram of smoosh's initial state during the recovery process.
-
-```
-stateDiagram
- [*] --> init
- init --> start_recovery: send_after(?START_DELAY_IN_MSEC, self(), start_recovery)
- note right of start_recovery
- activated = false
- paused = true
- end note
- start_recovery --> activate: send_after(?ACTIVATE_DELAY_IN_MSEC, self(), activate)
- note right of activate
- state has been recovered
- activated = true
- paused = true
- end note
- activate --> schedule_unpause
- schedule_unpause --> [*]: after 30 sec, paused = false and compaction of new jobs begin
-```
-
-![Smoosh State Recovery Process Diagram](recovery_process_diagram.jpeg)
+`total_data` is the number of bytes in the file as reported by `ls -al
+filename`. In the `db_info` response this is the `sizes.file` value.
### Defining a channel
Defining a channel is done via normal dbcore configuration, with some
convention as to the parameter names.
-Channel configuration is defined using `smoosh.channel_name` top level config
+Channel configuration is defined using `smoosh.$channel_name` top level config
options. Defining a channel is just setting the various options you want
for the channel, then bringing it into smoosh's sets of active channels by
adding it to either `db_channels` or `view_channels`.
@@ -152,9 +138,14 @@ It's important to choose good channel names. There are some conventional ones:
* `ratio_views`: a ratio channel for views, usually using the default settings.
* `slack_views`: a slack channel for views, usually using the default settings.
-These four are defined by default if there are no others set ([source][source1]).
+These four are defined by default along with three **system** channels:
-[source1]: https://github.com/apache/couchdb-smoosh/blob/master/src/smoosh_server.erl#L75
+* `upgrade_dbs`: update channel for dbs, used when db file format changes
+* `upgrade_views` : update channel for views, used when view file format
+ changes or after the operating system's collation library undergoes a major
+ version change.
+* `index_cleanup` : a single channel in the `cleanup_channels` list used for
+ enqueueing jobs used to clean up stale index files.
And some standard names for ones we often have to add:
@@ -229,7 +220,6 @@ The same as defining a channel, you just need to set the new value:
It sometimes takes a little while to take effect.
-
## Standard operating procedures
There are a few standard things that operators often have to do when responding
@@ -387,6 +377,15 @@ Suspend is currently pretty literal: `erlang:suspend_process(Pid, [unless_suspen
is called for each compaction process in each channel. `resume_process` is called
for resume.
+### Disable a channel
+
+An alternative to pausing a channel is to disable it by setting its concurrency
+value to `"0"`.
+
+```
+rpc:multicall(config, set, ["smoosh.ratio_dbs", "concurrency", "0"]).
+```
+
### Restarting Smoosh
Restarting Smoosh is a long shot and is a brute force approach in the hope that
diff --git a/src/smoosh/recovery_process_diagram.jpeg b/src/smoosh/recovery_process_diagram.jpeg
deleted file mode 100644
index 300db5cd0..000000000
--- a/src/smoosh/recovery_process_diagram.jpeg
+++ /dev/null
Binary files differ
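
Reviewer note: as a reading aid for the `ratio` and `slack` measures that this change re-bases on the active size, here is a minimal Python sketch of how a channel might decide whether to accept a shard. It assumes the `sizes.file` and `sizes.active` values described in the operator guide. The function names are hypothetical, and the slack formula (file size minus active size) is an assumption taken from the wider CouchDB compaction documentation rather than from this diff. This is not smoosh's actual Erlang implementation.

```python
def ratio_priority(file_bytes: int, active_bytes: int) -> float:
    """Ratio measure: total_bytes / active_bytes (sizes.file / sizes.active)."""
    if active_bytes <= 0:
        # Assumption: treat an empty shard as having nothing to reclaim.
        return 0.0
    return file_bytes / active_bytes


def slack_priority(file_bytes: int, active_bytes: int) -> int:
    """Slack measure (assumed): bytes that compaction could reclaim."""
    return max(file_bytes - active_bytes, 0)


def accepts(priority: float, min_priority: float) -> bool:
    """A channel accepts an item only if its priority exceeds min_priority."""
    return priority > min_priority


# Hypothetical shard: 1 GB on disk, 400 MB of active data.
file_bytes, active_bytes = 1_000_000_000, 400_000_000

# 536870912 is the slack channel min_priority shown in the default config.
print(accepts(slack_priority(file_bytes, active_bytes), 536870912))
print(ratio_priority(file_bytes, active_bytes))
```

With these hypothetical numbers the slack is 600,000,000 bytes, which exceeds the `min_priority = 536870912` threshold from the default configuration, so a slack channel would accept the shard under the stated assumptions.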