summaryrefslogtreecommitdiff
path: root/doc/administration/sidekiq/extra_sidekiq_processes.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/sidekiq/extra_sidekiq_processes.md')
-rw-r--r--doc/administration/sidekiq/extra_sidekiq_processes.md362
1 files changed, 362 insertions, 0 deletions
diff --git a/doc/administration/sidekiq/extra_sidekiq_processes.md b/doc/administration/sidekiq/extra_sidekiq_processes.md
new file mode 100644
index 00000000000..1cd3771c94d
--- /dev/null
+++ b/doc/administration/sidekiq/extra_sidekiq_processes.md
@@ -0,0 +1,362 @@
+---
+stage: Systems
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Run multiple Sidekiq processes **(FREE SELF)**
+
+GitLab allows you to start multiple Sidekiq processes.
+These processes can be used to consume a dedicated set
+of queues. This can be used to ensure certain queues always have dedicated
+workers, no matter the number of jobs to be processed.
+
+NOTE:
+The information in this page applies only to Omnibus GitLab.
+
+## Available Sidekiq queues
+
+For a list of the existing Sidekiq queues, check the following files:
+
+- [Queues for both GitLab Community and Enterprise Editions](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/all_queues.yml)
+- [Queues for GitLab Enterprise Editions only](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/all_queues.yml)
+
+Each entry in the above files represents a queue on which Sidekiq processes
+can be started.
+
+## Start multiple processes
+
+> - [Introduced](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4006) in GitLab 12.10, starting multiple processes with Sidekiq cluster.
+> - [Sidekiq cluster moved](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/181) to GitLab Free in 12.10.
+> - [Sidekiq cluster became default](https://gitlab.com/gitlab-org/omnibus-gitlab/-/merge_requests/4140) in GitLab 13.0.
+
+When starting multiple processes, the number of processes should
+equal (and **not** exceed) the number of CPU cores you want to
+dedicate to Sidekiq. Each Sidekiq process can use only 1 CPU
+core, subject to the available workload and concurrency settings.
+
+To start multiple processes:
+
+1. Using the `sidekiq['queue_groups']` array setting, specify how many processes to
+ create using `sidekiq-cluster` and which queue they should handle.
+ Each item in the array equates to one additional Sidekiq
+ process, and values in each item determine the queues it works on.
+
+ For example, the following setting creates three Sidekiq processes, one to run on
+ `elastic_commit_indexer`, one to run on `mailers`, and one process running on all queues:
+
+ ```ruby
+ sidekiq['queue_groups'] = [
+ "elastic_commit_indexer",
+ "mailers",
+ "*"
+ ]
+ ```
+
+ To have an additional Sidekiq process handle multiple queues, add multiple
+ queue names to its item delimited by commas. For example:
+
+ ```ruby
+ sidekiq['queue_groups'] = [
+ "elastic_commit_indexer, elastic_association_indexer",
+ "mailers",
+ "*"
+ ]
+ ```
+
+ [In GitLab 12.9](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26594) and
+ later, the special queue name `*` means all queues. This starts two
+ processes, each handling all queues:
+
+ ```ruby
+ sidekiq['queue_groups'] = [
+ "*",
+ "*"
+ ]
+ ```
+
+ `*` cannot be combined with concrete queue names - `*, mailers`
+ just handles the `mailers` queue.
+
+ When `sidekiq-cluster` is only running on a single node, make sure that at least
+ one process is running on all queues using `*`. This ensures a process
+ automatically picks up jobs in queues created in the future,
+ including queues that have dedicated processes.
+
+ If `sidekiq-cluster` is running on more than one node, you can also use
+ [`--negate`](#negate-settings) and list all the queues that are already being
+ processed.
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+To view the Sidekiq processes in GitLab:
+
+1. On the top bar, select **Menu > Admin**.
+1. On the left sidebar, select **Monitoring > Background Jobs**.
+
+## Negate settings
+
+To have the Sidekiq process work on every queue **except** the ones
+you list. In this example, we exclude all import-related jobs from a Sidekiq node:
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['negate'] = true
+ sidekiq['queue_selector'] = true
+ sidekiq['queue_groups'] = [
+ "feature_category=importers"
+ ]
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Queue selector
+
+> - [Introduced](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/45) in GitLab 12.8.
+> - [Sidekiq cluster, including queue selector, moved](https://gitlab.com/groups/gitlab-com/gl-infra/-/epics/181) to GitLab Free in 12.10.
+> - [Renamed from `experimental_queue_selector` to `queue_selector`](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/147) in GitLab 13.6.
+
+In addition to selecting queues by name, as above, the `queue_selector` option
+allows queue groups to be selected in a more general way using a
+[worker matching query](extra_sidekiq_routing.md#worker-matching-query). After `queue_selector`
+is set, all `queue_groups` must follow the aforementioned syntax.
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+sidekiq['enable'] = true
+sidekiq['queue_selector'] = true
+sidekiq['queue_groups'] = [
+ # Run all non-CPU-bound queues that are high urgency
+ 'resource_boundary!=cpu&urgency=high',
+ # Run all continuous integration and pages queues that are not high urgency
+ 'feature_category=continuous_integration,pages&urgency!=high',
+ # Run all queues
+ '*'
+]
+```
+
+## Ignore all import queues
+
+When [importing from GitHub](../../user/project/import/github.md) or
+other sources, Sidekiq might use all of its resources to perform those
+operations. To set up two separate `sidekiq-cluster` processes, where
+one only processes imports and the other processes all other queues:
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['enable'] = true
+ sidekiq['queue_selector'] = true
+ sidekiq['queue_groups'] = [
+ "feature_category=importers",
+ "feature_category!=importers"
+ ]
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+## Number of threads
+
+By default each process defined under `sidekiq` starts with a
+number of threads that equals the number of queues, plus one spare thread.
+For example, a process that handles the `process_commit` and `post_receive`
+queues uses three threads in total.
+
+These thread run inside a single Ruby process, and each process
+can only use a single CPU core. The usefulness of threading depends
+on the work having some external dependencies to wait on, like database queries or
+HTTP requests. Most Sidekiq deployments benefit from this threading, and when
+running fewer queues in a process, increasing the thread count might be
+even more desirable to make the most effective use of CPU resources.
+
+### Manage thread counts explicitly
+
+The correct maximum thread count (also called concurrency) depends on the workload.
+Typical values range from `1` for highly CPU-bound tasks to `15` or higher for mixed
+low-priority work. A reasonable starting range is `15` to `25` for a non-specialized
+deployment.
+
+You can find example values used by GitLab.com by searching for `concurrency:` in
+[the Helm charts](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/master/releases/gitlab/values/gprd.yaml.gotmpl).
+The values vary according to the work each specific deployment of Sidekiq does.
+Any other specialized deployments with processes dedicated to specific queues should
+have the concurrency tuned according to:
+have the concurrency tuned according to:
+
+- The CPU usage of each type of process.
+- The throughput achieved.
+
+Each thread requires a Redis connection, so adding threads may increase Redis
+latency and potentially cause client timeouts. See the
+[Sidekiq documentation about Redis](https://github.com/mperham/sidekiq/wiki/Using-Redis) for more
+details.
+
+#### When running Sidekiq cluster (default)
+
+Running Sidekiq cluster is the default in GitLab 13.0 and later.
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['min_concurrency'] = 15
+ sidekiq['max_concurrency'] = 25
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+`min_concurrency` and `max_concurrency` are independent; one can be set without
+the other. Setting `min_concurrency` to `0` disables the limit.
+
+For each queue group, let `N` be one more than the number of queues. The
+concurrency is set to:
+
+1. `N`, if it's between `min_concurrency` and `max_concurrency`.
+1. `max_concurrency`, if `N` exceeds this value.
+1. `min_concurrency`, if `N` is less than this value.
+
+If `min_concurrency` is equal to `max_concurrency`, then this value is used
+regardless of the number of queues.
+
+When `min_concurrency` is greater than `max_concurrency`, it is treated as
+being equal to `max_concurrency`.
+
+#### When running a single Sidekiq process
+
+Running a single Sidekiq process is the default in GitLab 12.10 and earlier.
+
+WARNING:
+Running Sidekiq directly was removed in GitLab
+[14.0](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/240).
+
+1. Edit `/etc/gitlab/gitlab.rb` and add:
+
+ ```ruby
+ sidekiq['cluster'] = false
+ sidekiq['concurrency'] = 25
+ ```
+
+1. Save the file and reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ sudo gitlab-ctl reconfigure
+ ```
+
+This sets the concurrency (number of threads) for the Sidekiq process.
+
+## Modify the check interval
+
+To modify `sidekiq-cluster`'s health check interval for the additional Sidekiq processes:
+
+1. Edit `/etc/gitlab/gitlab.rb` and add (the value can be any integer number of seconds):
+
+ ```ruby
+ sidekiq['interval'] = 5
+ ```
+
+1. Save the file and [reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect.
+
+## Troubleshoot using the CLI
+
+WARNING:
+It's recommended to use `/etc/gitlab/gitlab.rb` to configure the Sidekiq processes.
+If you experience a problem, you should contact GitLab support. Use the command
+line at your own risk.
+
+For debugging purposes, you can start extra Sidekiq processes by using the command
+`/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster`. This command
+takes arguments using the following syntax:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster [QUEUE,QUEUE,...] [QUEUE, ...]
+```
+
+Each separate argument denotes a group of queues that have to be processed by a
+Sidekiq process. Multiple queues can be processed by the same process by
+separating them with a comma instead of a space.
+
+Instead of a queue, a queue namespace can also be provided, to have the process
+automatically listen on all queues in that namespace without needing to
+explicitly list all the queue names. For more information about queue namespaces,
+see the relevant section in the
+[Sidekiq development documentation](../../development/sidekiq/index.md#queue-namespaces).
+
+For example, say you want to start 2 extra processes: one to process the
+`process_commit` queue, and one to process the `post_receive` queue. This can be
+done as follows:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit post_receive
+```
+
+If you instead want to start one process processing both queues, you'd use the
+following syntax:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit,post_receive
+```
+
+If you want to have one Sidekiq process dealing with the `process_commit` and
+`post_receive` queues, and one process to process the `gitlab_shell` queue,
+you'd use the following:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster process_commit,post_receive gitlab_shell
+```
+
+### Monitor the `sidekiq-cluster` command
+
+The `sidekiq-cluster` command does not terminate once it has started the desired
+amount of Sidekiq processes. Instead, the process continues running and
+forwards any signals to the child processes. This allows you to stop all
+Sidekiq processes as you send a signal to the `sidekiq-cluster` process,
+instead of having to send it to the individual processes.
+
+If the `sidekiq-cluster` process crashes or receives a `SIGKILL`, the child
+processes terminate themselves after a few seconds. This ensures you don't
+end up with zombie Sidekiq processes.
+
+This allows you to monitor the processes by hooking up
+`sidekiq-cluster` to your supervisor of choice (for example, runit).
+
+If a child process died the `sidekiq-cluster` command signals all remaining
+process to terminate, then terminate itself. This removes the need for
+`sidekiq-cluster` to re-implement complex process monitoring/restarting code.
+Instead you should make sure your supervisor restarts the `sidekiq-cluster`
+process whenever necessary.
+
+### PID files
+
+The `sidekiq-cluster` command can store its PID in a file. By default no PID
+file is written, but this can be changed by passing the `--pidfile` option to
+`sidekiq-cluster`. For example:
+
+```shell
+/opt/gitlab/embedded/service/gitlab-rails/bin/sidekiq-cluster --pidfile /var/run/gitlab/sidekiq_cluster.pid process_commit
+```
+
+Keep in mind that the PID file contains the PID of the `sidekiq-cluster`
+command and not the PIDs of the started Sidekiq processes.
+
+### Environment
+
+The Rails environment can be set by passing the `--environment` flag to the
+`sidekiq-cluster` command, or by setting `RAILS_ENV` to a non-empty value. The
+default value can be found in `/opt/gitlab/etc/gitlab-rails/env/RAILS_ENV`.