summaryrefslogtreecommitdiff
path: root/doc/administration/redis/troubleshooting.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/redis/troubleshooting.md')
-rw-r--r--doc/administration/redis/troubleshooting.md158
1 files changed, 158 insertions, 0 deletions
diff --git a/doc/administration/redis/troubleshooting.md b/doc/administration/redis/troubleshooting.md
new file mode 100644
index 00000000000..402b60e5b7b
--- /dev/null
+++ b/doc/administration/redis/troubleshooting.md
@@ -0,0 +1,158 @@
+---
+type: reference
+stage: Enablement
+group: Distribution
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+---
+
+# Troubleshooting Redis
+
+There are a lot of moving parts that needs to be taken care carefully
+in order for the HA setup to work as expected.
+
+Before proceeding with the troubleshooting below, check your firewall rules:
+
+- Redis machines
+ - Accept TCP connection in `6379`
+ - Connect to the other Redis machines via TCP in `6379`
+- Sentinel machines
+ - Accept TCP connection in `26379`
+ - Connect to other Sentinel machines via TCP in `26379`
+ - Connect to the Redis machines via TCP in `6379`
+
+## Troubleshooting Redis replication
+
+You can check if everything is correct by connecting to each server using
+`redis-cli` application, and sending the `info replication` command as below.
+
+```shell
+/opt/gitlab/embedded/bin/redis-cli -h <redis-host-or-ip> -a '<redis-password>' info replication
+```
+
+When connected to a `Primary` Redis, you will see the number of connected
+`replicas`, and a list of each with connection details:
+
+```plaintext
+# Replication
+role:master
+connected_replicas:1
+replica0:ip=10.133.5.21,port=6379,state=online,offset=208037514,lag=1
+master_repl_offset:208037658
+repl_backlog_active:1
+repl_backlog_size:1048576
+repl_backlog_first_byte_offset:206989083
+repl_backlog_histlen:1048576
+```
+
+When it's a `replica`, you will see details of the primary connection and if
+its `up` or `down`:
+
+```plaintext
+# Replication
+role:replica
+master_host:10.133.1.58
+master_port:6379
+master_link_status:up
+master_last_io_seconds_ago:1
+master_sync_in_progress:0
+replica_repl_offset:208096498
+replica_priority:100
+replica_read_only:1
+connected_replicas:0
+master_repl_offset:0
+repl_backlog_active:0
+repl_backlog_size:1048576
+repl_backlog_first_byte_offset:0
+repl_backlog_histlen:0
+```
+
+## Troubleshooting Sentinel
+
+If you get an error like: `Redis::CannotConnectError: No sentinels available.`,
+there may be something wrong with your configuration files or it can be related
+to [this issue](https://github.com/redis/redis-rb/issues/531).
+
+You must make sure you are defining the same value in `redis['master_name']`
+and `redis['master_pasword']` as you defined for your sentinel node.
+
+The way the Redis connector `redis-rb` works with sentinel is a bit
+non-intuitive. We try to hide the complexity in omnibus, but it still requires
+a few extra configurations.
+
+---
+
+To make sure your configuration is correct:
+
+1. SSH into your GitLab application server
+1. Enter the Rails console:
+
+ ```shell
+ # For Omnibus installations
+ sudo gitlab-rails console
+
+ # For source installations
+ sudo -u git rails console -e production
+ ```
+
+1. Run in the console:
+
+ ```ruby
+ redis = Redis.new(Gitlab::Redis::SharedState.params)
+ redis.info
+ ```
+
+ Keep this screen open and try to simulate a failover below.
+
+1. To simulate a failover on primary Redis, SSH into the Redis server and run:
+
+ ```shell
+ # port must match your primary redis port, and the sleep time must be a few seconds bigger than defined one
+ redis-cli -h localhost -p 6379 DEBUG sleep 20
+ ```
+
+1. Then back in the Rails console from the first step, run:
+
+ ```ruby
+ redis.info
+ ```
+
+ You should see a different port after a few seconds delay
+ (the failover/reconnect time).
+
+## Troubleshooting a non-bundled Redis with an installation from source
+
+If you get an error in GitLab like `Redis::CannotConnectError: No sentinels available.`,
+there may be something wrong with your configuration files or it can be related
+to [this upstream issue](https://github.com/redis/redis-rb/issues/531).
+
+You must make sure that `resque.yml` and `sentinel.conf` are configured correctly,
+otherwise `redis-rb` will not work properly.
+
+The `master-group-name` (`gitlab-redis`) defined in (`sentinel.conf`)
+**must** be used as the hostname in GitLab (`resque.yml`):
+
+```conf
+# sentinel.conf:
+sentinel monitor gitlab-redis 10.0.0.1 6379 2
+sentinel down-after-milliseconds gitlab-redis 10000
+sentinel config-epoch gitlab-redis 0
+sentinel leader-epoch gitlab-redis 0
+```
+
+```yaml
+# resque.yaml
+production:
+ url: redis://:myredispassword@gitlab-redis/
+ sentinels:
+ -
+ host: 10.0.0.1
+ port: 26379 # point to sentinel, not to redis port
+ -
+ host: 10.0.0.2
+ port: 26379 # point to sentinel, not to redis port
+ -
+ host: 10.0.0.3
+ port: 26379 # point to sentinel, not to redis port
+```
+
+When in doubt, read the [Redis Sentinel documentation](https://redis.io/topics/sentinel).