Diffstat (limited to 'doc/administration/operations/extra_sidekiq_routing.md')
-rw-r--r-- doc/administration/operations/extra_sidekiq_routing.md 164
1 files changed, 164 insertions, 0 deletions
diff --git a/doc/administration/operations/extra_sidekiq_routing.md b/doc/administration/operations/extra_sidekiq_routing.md
new file mode 100644
index 00000000000..93cf8bd4f43
--- /dev/null
+++ b/doc/administration/operations/extra_sidekiq_routing.md
@@ -0,0 +1,164 @@
+---
+stage: Enablement
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Queue routing rules **(FREE SELF)**
+
+When the number of Sidekiq jobs grows beyond a certain scale, the system runs
+into scalability issues. One of them is that queues tend to get longer, so
+high-urgency jobs have to wait until less urgent jobs ahead of them finish.
+This head-of-line blocking may eventually affect the responsiveness of the
+system, especially for critical actions. In another scenario, the performance
+of some jobs is degraded by other long-running or CPU-intensive jobs
+(computing or rendering ones) on the same machine.
+
+To counter these issues, one effective solution is to split Sidekiq jobs into
+different queues and assign machines to handle each queue exclusively. For
+example, all CPU-intensive jobs could be routed to the `cpu-bound` queue and
+handled by a fleet of CPU-optimized instances. The queue topology differs
+between companies depending on their workloads and usage patterns. Therefore,
+GitLab supports a flexible mechanism for the administrator to route jobs based
+on their characteristics.
+
+As an alternative to [Queue selector](extra_sidekiq_processes.md#queue-selector), which
+configures a Sidekiq cluster to listen to a specific set of workers or queues,
+GitLab also supports routing a job from a worker to the desired queue when it
+is scheduled. Sidekiq clients try to match a job against a configured list of
+routing rules. Rules are evaluated from first to last, and evaluation stops at
+the first rule that matches a given worker (first match wins). If the worker
+doesn't match any rule, it falls back to the queue name generated from the
+worker name.
+
+By default, if the routing rules are not configured (or are set to an empty
+array), all jobs are routed to the queue generated from the worker name.
+
+## Example configuration
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+sidekiq['routing_rules'] = [
+  # Route all non-CPU-bound workers that are high urgency to `high-urgency` queue
+  ['resource_boundary!=cpu&urgency=high', 'high-urgency'],
+  # Route all database, gitaly and global search workers that are throttled to `throttled` queue
+  ['feature_category=database,gitaly,global_search&urgency=throttled', 'throttled'],
+  # Route all workers that contact outside services to a `network-intensive` queue
+  ['has_external_dependencies=true|feature_category=hooks|tags=network', 'network-intensive'],
+  # Route all import workers to the queues generated by the worker name, for
+  # example, JiraImportWorker to `jira_import`, SVNWorker to `svn`
+  ['feature_category=import', nil],
+  # Wildcard matching, route the rest to `default` queue
+  ['*', 'default']
+]
+```
+
+The routing rules list is an ordered array of tuples, each pairing a query with
+its corresponding queue:
+
+- The query follows the [worker matching query](#worker-matching-query) syntax.
+- The `<queue_name>` must be a valid Sidekiq queue name. If the queue name
+  is `nil`, or an empty string, the worker is routed to the queue generated
+  by the name of the worker instead.
+
+The query supports wildcard matching `*`, which matches all workers. As a
+result, the wildcard query must stay at the end of the list or the rules after it
+are ignored.
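+
+As a sketch of why rule order matters, the following hypothetical configuration
+(the queue names are illustrative only) places the wildcard first, so the
+second rule can never match:
+
+```ruby
+sidekiq['routing_rules'] = [
+  # Matches every worker, so evaluation always stops here
+  ['*', 'default'],
+  # Never reached: the wildcard above has already matched
+  ['urgency=high', 'high-urgency']
+]
+```
+
+Moving the `['*', 'default']` rule to the end of the array lets the
+`urgency=high` rule take effect.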
+
+NOTE:
+Mixing queue routing rules and queue selectors requires care to
+ensure that all jobs are scheduled and picked up by the appropriate Sidekiq
+workers.
+
+## Worker matching query
+
+GitLab provides a simple query syntax to match a worker based on its
+attributes. This query syntax is employed by both [Queue routing
+rules](#queue-routing-rules) and [Queue
+selector](extra_sidekiq_processes.md#queue-selector). A query includes two
+components:
+
+- Attributes that can be selected.
+- Operators used to construct a query.
+
+### Available attributes
+
+> [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/261) in GitLab 13.1 (`tags`).
+
+The worker matching query works on the worker attributes described in the [Sidekiq
+style guide](../../development/sidekiq_style_guide.md). We support querying
+based on a subset of worker attributes:
+
+- `feature_category` - the [GitLab feature
+ category](https://about.gitlab.com/direction/maturity/#category-maturity) the
+ queue belongs to. For example, the `merge` queue belongs to the
+ `source_code_management` category.
+- `has_external_dependencies` - whether or not the queue connects to external
+ services. For example, all importers have this set to `true`.
+- `urgency` - how important it is that this queue's jobs run
+ quickly. Can be `high`, `low`, or `throttled`. For example, the
+ `authorized_projects` queue is used to refresh user permissions, and
+ is high urgency.
+- `worker_name` - the worker name. The other attributes are typically more useful as
+ they are more general, but this is available in case a particular worker needs
+ to be selected.
+- `name` - the queue name. The other attributes are typically more useful as
+ they are more general, but this is available in case a particular queue needs
+ to be selected.
+- `resource_boundary` - whether the queue is bound by `cpu` or `memory`, or is
+  `unknown`. For example, the `ProjectExportWorker` is memory bound as it has
+  to load data in memory before saving it for export.
+- `tags` - short-lived annotations for queues. These are expected to frequently
+ change from release to release, and may be removed entirely.
+
+`has_external_dependencies` is a boolean attribute: only the exact
+string `true` is considered true, and everything else is considered
+false.
+
+`tags` is a set, which means that `=` checks for intersecting sets, and
+`!=` checks for disjoint sets. For example, `tags=a,b` selects queues
+that have tags `a`, `b`, or both. `tags!=a,b` selects queues that have
+neither of those tags.
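+
+A minimal sketch of how these set semantics might look in routing rules. The
+tag names `a` and `b` are the placeholders used above, and the queue names are
+hypothetical:
+
+```ruby
+sidekiq['routing_rules'] = [
+  # Workers tagged `a`, `b`, or both (the tag sets intersect)
+  ['tags=a,b', 'tagged'],
+  # High-urgency workers that have neither tag (the tag sets are disjoint)
+  ['tags!=a,b&urgency=high', 'high-urgency'],
+  # Everything else
+  ['*', 'default']
+]
+```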
+
+The attributes of each worker are hard-coded in the source code. For
+convenience, we generate a [list of all available attributes in
+GitLab Community Edition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/all_queues.yml)
+and a [list of all available attributes in
+GitLab Enterprise Edition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/all_queues.yml).
+
+### Available operators
+
+The worker matching query supports the following operators, listed from
+loosest to tightest binding (each side of `|` is a complete query built from
+the operators below it, and so on down the list):
+
+- `|` - the logical OR operator. For example, `query_a|query_b` (where `query_a`
+  and `query_b` are queries made up of the other operators here) includes
+  queues that match either query.
+- `&` - the logical AND operator. For example, `query_a&query_b` (where
+  `query_a` and `query_b` are queries made up of the other operators here)
+  only includes queues that match both queries.
+- `!=` - the NOT IN operator. For example, `feature_category!=issue_tracking`
+  excludes all queues from the `issue_tracking` feature category.
+- `=` - the IN operator. For example, `resource_boundary=cpu` includes all
+  queues that are CPU bound.
+- `,` - the concatenate set operator. For example,
+  `feature_category=continuous_integration,pages` includes all queues from
+  either the `continuous_integration` category or the `pages` category. This
+  example is also possible using the OR operator, but `,` allows greater
+  brevity and binds more tightly than any other operator.
+
+The grouping for this syntax is fixed: there are no parentheses, so
+`query_a&query_b|query_c` always means `(query_a AND query_b) OR query_c`, and
+it's not possible to express `query_a AND (query_b OR query_c)` directly.
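+
+A sketch of how the operators combine in practice, assuming (as the operator
+list describes) that each side of `|` is a complete query made up of the other
+operators. The queue name `ci-and-pages` is hypothetical:
+
+```ruby
+sidekiq['routing_rules'] = [
+  # Read as: (feature_category=hooks AND urgency=high) OR (tags=network)
+  ['feature_category=hooks&urgency=high|tags=network', 'network-intensive'],
+  # `,` builds a set of values for one attribute:
+  # feature_category is continuous_integration OR pages
+  ['feature_category=continuous_integration,pages', 'ci-and-pages'],
+  # Everything else
+  ['*', 'default']
+]
+```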
+
+[In GitLab 12.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26594) and
+later, a query that consists of a single `*` selects all queues.
+
+## Migration
+
+After the Sidekiq routing rules are changed, administrators must take care
+with the migration to avoid losing jobs entirely, especially in a system with
+long queues of jobs. To migrate, follow the steps in [Sidekiq job
+migration](../../raketasks/sidekiq_job_migration.md).