summaryrefslogtreecommitdiff
path: root/src/smoosh/operator_guide.md
diff options
context:
space:
mode:
Diffstat (limited to 'src/smoosh/operator_guide.md')
-rw-r--r--src/smoosh/operator_guide.md398
1 files changed, 0 insertions, 398 deletions
diff --git a/src/smoosh/operator_guide.md b/src/smoosh/operator_guide.md
deleted file mode 100644
index fafee30d4..000000000
--- a/src/smoosh/operator_guide.md
+++ /dev/null
@@ -1,398 +0,0 @@
-# An operator's guide to smoosh
-
-Smoosh is the auto-compactor for the databases. It automatically selects and
-processes the compacting of database shards on each node.
-
-## Smoosh Channels
-
-Smoosh works using the concept of channels. A channel is essentially a queue of pending
-compactions. There are separate sets of channels for database and view compactions. Each
-channel is assigned a configuration which defines whether a compaction ends up in
-the channel's queue and how compactions are prioritised within that queue.
-
-Smoosh takes each channel and works through the compactions queued in each in priority
-order. Each channel is processed concurrently, so the priority levels only matter within
-a given channel.
-
-Finally, each channel has an assigned number of active compactions, which defines how
-many compactions happen for that channel in parallel. For example, a cluster with
-a lot of database churn but few views might require more active compactions to the
-database channel(s).
-
-It's important to remember that a channel is local to a dbcore node, that is
-each node maintains and processes an independent set of compactions.
-
-### Channel configuration options
-
-#### Channel types
-
-Each channel has a basic type for the algorithm it uses to select pending
-compactions for its queue and how it prioritises them.
-
-The two queue types are:
-
-* **ratio**: this uses the ratio `total_bytes / user_bytes` as its driving
-calculation. The result _X_ must be greater than some configurable value _Y_ for a
-compaction to be added to the queue. Compactions are then prioritised for
-higher values of _X_.
-
-* **slack**: this uses `total_bytes - user_bytes` as its driving calculation.
-The result _X_ must be greater than some configurable value _Y_ for a compaction
-to be added to the queue. Compactions are prioritised for higher values of _X_.
-
-In both cases, _Y_ is set using the `min_priority` configuration variable. The
-calculation of _X_ is described in [Priority calculation](#priority-calculation), below.
-
-Both algorithms operate on two main measures:
-
-* **user_bytes**: this is the amount of data the user has in the file. It
-doesn't include storage overhead: old revisions, on-disk btree structure and
-so on.
-
-* **total_bytes**: the size of the file on disk.
-
-Channel type is set using the `priority` configuration setting.
-
-#### Further configuration options
-
-Beyond its basic type, there are several other configuration options which
-can be applied to a queue.
-
-*All options MUST be set as strings.* See the [smoosh readme][srconfig] for
-all settings and their defaults.
-
-#### Priority calculation
-
-The algorithm type and certain configuration options feed into the priority
-calculation.
-
-The priority is calculated when a compaction is enqueued. As each channel
-has a different configuration, each channel will end up with a different
-priority value. The enqueue code checks each channel in turn to see whether the
-compaction passes its configured priority threshold (`min_priority`). Once
-a channel is found that can accept the compaction, the compaction is added
-to that channel's queue and the enqueue process stops. Therefore the
-ordering of channels has a bearing in what channel a compaction ends up in.
-
-If you want to follow this along, the call order is all in `smoosh_server`,
-`enqueue_request -> find_channel -> get_priority`.
-
-The priority calculation is probably the easiest way to understand the effects
-of configuration variables. It's defined in `smoosh_server#get_priority/3`,
-currently [here][ss].
-
-[ss]: https://github.com/apache/couchdb-smoosh/blob/master/src/smoosh_server.erl#L277
-[srconfig]: https://github.com/apache/couchdb-smoosh#channel-settings
-
-#### Background Detail
-
-`user_bytes` is called `data_size` in `db_info` blocks. It is the total of all bytes
-that are used to store docs and their attachments.
-
-Since `.couch` files are append only, every update adds data to the file. When
-you update a btree, a new leaf node is written and all the nodes back up the
-root. In this update, old data is never overwritten and these parts of the
-file are no longer live; this includes old btree nodes and document bodies.
-Compaction takes this file and writes a new file that only contains live data.
-
-`total_data` is the number of bytes in the file as reported by `ls -al filename`.
-
-#### Flaws
-
-An important flaw in this calculation is that `total_data` takes into account
-the compression of data on disk, whereas `user_bytes` does not. This can give
-unexpected results to calculations, as the values are not directly comparable.
-
-However, it's the best measure we currently have.
-
-[Even more info](https://github.com/apache/couchdb-smoosh#notes-on-the-data_size-value).
-
-#### State diagram
-
-Below is a diagram of smoosh's initial state during the recovery process.
-
-```
-stateDiagram
- [*] --> init
- init --> start_recovery: send_after(?START_DELAY_IN_MSEC, self(), start_recovery)
- note right of start_recovery
- activated = false
- paused = true
- end note
- start_recovery --> activate: send_after(?ACTIVATE_DELAY_IN_MSEC, self(), activate)
- note right of activate
- state has been recovered
- activated = true
- paused = true
- end note
- activate --> schedule_unpause
- schedule_unpause --> [*]: after 30 sec, paused = false and compaction of new jobs begin
-```
-
-![Smoosh State Recovery Process Diagram](recovery_process_diagram.jpeg)
-
-### Defining a channel
-
-Defining a channel is done via normal dbcore configuration, with some
-convention as to the parameter names.
-
-Channel configuration is defined using `smoosh.channel_name` top level config
-options. Defining a channel is just setting the various options you want
-for the channel, then bringing it into smoosh's sets of active channels by
-adding it to either `db_channels` or `view_channels`.
-
-This means that smoosh channels can be defined either for a single node or
-globally across a cluster, by setting the configuration either globally or
-locally. In the example, we set up a new global channel.
-
-It's important to choose good channel names. There are some conventional ones:
-
-* `ratio_dbs`: a ratio channel for dbs, usually using the default settings.
-* `slack_dbs`: a slack channel for dbs, usually using the default settings.
-* `ratio_views`: a ratio channel for views, usually using the default settings.
-* `slack_views`: a slack channel for views, usually using the default settings.
-
-These four are defined by default if there are no others set ([source][source1]).
-
-[source1]: https://github.com/apache/couchdb-smoosh/blob/master/src/smoosh_server.erl#L75
-
-And some standard names for ones we often have to add:
-
-* `big_dbs`: a ratio channel for only enqueuing large database shards. What
- _large_ means is very workload specific.
-
-Channels have certain defaults for their configuration, defined in the
-[smoosh readme][srconfig]. It's only neccessary to set up how this channel
-differs from those defaults. Below, we just need to set the `min_size` and
-`concurrency` settings, and allow the `priority` to default to `ratio`
-along with the other defaults.
-
-```bash
-# Define the new channel
-(couchdb@db1.foo.bar)3> rpc:multicall(config, set, ["smoosh.big_dbs", "min_size", "20000000000"]).
-{[ok,ok,ok],[]}
-(couchdb@db1.foo.bar)3> rpc:multicall(config, set, ["smoosh.big_dbs", "concurrency", "2"]).
-{[ok,ok,ok],[]}
-
-# Add the channel to the db_channels set -- note we need to get the original
-# value first so we can add the new one to the existing list!
-(couchdb@db1.foo.bar)5> rpc:multicall(config, get, ["smoosh", "db_channels"]).
-{["ratio_dbs","ratio_dbs","ratio_dbs"],[]}
-(couchdb@db1.foo.bar)6> rpc:multicall(config, set, ["smoosh", "db_channels", "ratio_dbs,big_dbs"]).
-{[ok,ok,ok],[]}
-```
-
-### Viewing active channels
-
-```bash
-(couchdb@db3.foo.bar)3> rpc:multicall(config, get, ["smoosh", "db_channels"]).
-{["ratio_dbs,big_dbs","ratio_dbs,big_dbs","ratio_dbs,big_dbs"],[]}
-(couchdb@db3.foo.bar)4> rpc:multicall(config, get, ["smoosh", "view_channels"]).
-{["ratio_views","ratio_views","ratio_views"],[]}
-```
-
-### Removing a channel
-
-```bash
-# Remove it from the active set
-(couchdb@db1.foo.bar)5> rpc:multicall(config, get, ["smoosh", "db_channels"]).
-{["ratio_dbs,big_dbs", "ratio_dbs,big_dbs", "ratio_dbs,big_dbs"],[]}
-(couchdb@db1.foo.bar)6> rpc:multicall(config, set, ["smoosh", "db_channels", "ratio_dbs"]).
-{[ok,ok,ok],[]}
-
-# Delete the config -- you need to do each value
-(couchdb@db1.foo.bar)3> rpc:multicall(config, delete, ["smoosh.big_dbs", "concurrency"]).
-{[ok,ok,ok],[]}
-(couchdb@db1.foo.bar)3> rpc:multicall(config, delete, ["smoosh.big_dbs", "min_size"]).
-{[ok,ok,ok],[]}
-```
-
-### Getting channel configuration
-
-As far as I know, you have to get each setting separately:
-
-```
-(couchdb@db1.foo.bar)1> rpc:multicall(config, get, ["smoosh.big_dbs", "concurrency"]).
-{["2","2","2"],[]}
-
-```
-
-### Setting channel configuration
-
-The same as defining a channel, you just need to set the new value:
-
-```
-(couchdb@db1.foo.bar)2> rpc:multicall(config, set, ["smoosh.ratio_dbs", "concurrency", "1"]).
-{[ok,ok,ok],[]}
-```
-
-It sometimes takes a little while to take affect.
-
-
-
-## Standard operating procedures
-
-There are a few standard things that operators often have to do when responding
-to pages.
-
-In addition to the below, in some circumstances it's useful to define new
-channels with certain properties (`big_dbs` is a common one) if smoosh isn't
-selecting and prioritising compactions that well.
-
-### Checking smoosh's status
-
-You can see the queued items for each channel by going into `remsh` on a node
-and using:
-
-```
-> smoosh:status().
-{ok,[{"ratio_dbs",
- [{active,1},
- {starting,0},
- {waiting,[{size,522},
- {min,{5.001569007970237,{1378,394651,323864}}},
- {max,{981756.5441159063,{1380,370286,655752}}}]}]},
- {"slack_views",
- [{active,1},
- {starting,0},
- {waiting,[{size,819},
- {min,{16839814,{1375,978920,326458}}},
- {max,{1541336279,{1380,370205,709896}}}]}]},
- {"slack_dbs",
- [{active,1},
- {starting,0},
- {waiting,[{size,286},
- {min,{19004944,{1380,295245,887295}}},
- {max,{48770817098,{1380,370185,876596}}}]}]},
- {"ratio_views",
- [{active,1},
- {starting,0},
- {waiting,[{size,639},
- {min,{5.0126340031149335,{1380,186581,445489}}},
- {max,{10275.555632057285,{1380,370411,421477}}}]}]}]}
-```
-
-This gives you the node-local status for each queue.
-
-Under each channel there is some information about the channel:
-
-* `active`: number of current compactions in the channel.
-* `starting`: number of compactions starting-up.
-* `waiting`: number of queued compactions.
- * `min` and `max` give an idea of the queued jobs' effectiveness. The values
- for these are obviously dependent on whether the queue is ratio or slack.
-
-For ratio queues, the default minimum for smoosh to enqueue a compaction is 5. In
-the example above, we can guess that 981,756 is quite high. This could be a
-small database, however, so it doesn't necessarily mean useful compactions
-from the point of view of reclaiming disk space.
-
-For this example, we can see that there are quite a lot of queued compactions,
-but we don't know which would be most effective to run to reclaim disk space.
-It's also worth noting that the waiting queue sizes are only meaningful
-related to other factors on the cluster (e.g., db number and size).
-
-
-### Smoosh IOQ priority
-
-This is a global setting which affects all channels. Increasing it allows each
-active compaction to (hopefully) proceed faster as the compaction work is of
-a higher priority relative to other jobs. Decreasing it (hopefully) has the
-converse effect.
-
-By this point you'll [know whether smoosh is backing up](#checking-smooshs-status).
-If it's falling behind (big queues), try increasing compaction priority.
-
-Smoosh's IOQ priority is controlled via the `ioq` -> `compaction` queue.
-
-```
-> rpc:multicall(config, get, ["ioq", "compaction"]).
-{[undefined,undefined,undefined],[]}
-
-```
-
-Priority by convention runs 0 to 1, though the priority can be any positive
-number. The default for compaction is 0.01; pretty low.
-
-If it looks like smoosh has a bunch of work that it's not getting
-through, priority can be increased. However, be careful that this
-doesn't adversely impact the customer experience. If it will, and
-it's urgent, at least drop them a warning.
-
-```
-> rpc:multicall(config, set, ["ioq", "compaction", "0.5"]).
-{[ok,ok,ok],[]}
-```
-
-In general, this should be a temporary measure. For some clusters,
-a change from the default may be required to help smoosh keep up
-with particular workloads.
-
-### Granting specific channels more workers
-
-Giving smoosh a higher concurrency for a given channel can allow a backlog
-in that channel to catch up.
-
-Again, some clusters run best with specific channels having more workers.
-
-From [assessing disk space](#assess-the-space-on-the-disk), you should
-know whether the biggest offenders are db or view files. From this,
-you can infer whether it's worth giving a specific smoosh channel a
-higher concurrency.
-
-The current setting can be seen for a channel like so:
-
-```
-> rpc:multicall(config, get, ["smoosh.ratio_dbs", "concurrency"]).
-{["2","2","2"], []}
-```
-
-`undefined` means the default is used.
-
-If we knew that disk space for DBs was the major user of disk space, we might
-want to increase a `_dbs` channel. Experience shows `ratio_dbs` is often best
-but evaluate this based on the current status.
-
-If we want to increase the ratio_dbs setting:
-
-```
-> rpc:multicall(config, set, ["smoosh.ratio_dbs", "concurrency", "2"]).
-{[ok,ok,ok],[]}
-```
-
-### Suspending smoosh
-
-If smoosh itself is causing issues, it's possible to suspend its operation.
-This differs from either `application:stop(smoosh).` or setting all channel's
-concurrency to zero because it both pauses on going compactions and maintains
-the channel queues intact.
-
-If, for example, a node's compactions are causing disk space issues, smoosh
-could be suspended while working out which channel is causing the problem. For
-example, a big_dbs channel might be creating huge compaction-in-progress
-files if there's not much in the shard to compact away.
-
-It's therefore useful to use when testing to see if smoosh is causing a
-problem.
-
-```
-# suspend
-smoosh:suspend().
-
-# resume a suspended smoosh
-smoosh:resume().
-```
-
-Suspend is currently pretty literal: `erlang:suspend_process(Pid, [unless_suspending])`
-is called for each compaction process in each channel. `resume_process` is called
-for resume.
-
-### Restarting Smoosh
-
-Restarting Smoosh is a long shot and is a brute force approach in the hope that
-when Smoosh rescans the DBs that it makes the right decisions. If required to take
-this step contact rnewson or davisp so that they can inspect Smoosh and see the bug.
-
-```
-> exit(whereis(smoosh_server), kill), smoosh:enqueue_all_dbs(), smoosh:enqueue_all_views().
-```