summaryrefslogtreecommitdiff
path: root/doc/administration/high_availability/redis.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/high_availability/redis.md')
-rw-r--r--doc/administration/high_availability/redis.md902
1 files changed, 720 insertions, 182 deletions
diff --git a/doc/administration/high_availability/redis.md b/doc/administration/high_availability/redis.md
index bc424330656..b4e7bf21e35 100644
--- a/doc/administration/high_availability/redis.md
+++ b/doc/administration/high_availability/redis.md
@@ -1,265 +1,780 @@
# Configuring Redis for GitLab HA
-You can choose to install and manage Redis yourself, or you can use the one
-that comes bundled with GitLab Omnibus packages.
-
-> **Note:** Redis does not require authentication by default. See
+>
+Experimental Redis Sentinel support was [Introduced][ce-1877] in GitLab 8.11.
+Starting with 8.14, Redis Sentinel is no longer experimental.
+If you've used it with versions `< 8.14` before, please check the updated
+documentation here.
+
+High Availability with [Redis] is possible using a **Master** x **Slave**
+topology with a [Redis Sentinel][sentinel] service to watch and automatically
+start the failover procedure.
+
+You can choose to install and manage Redis and Sentinel yourself, use
+a hosted cloud solution or you can use the one that comes bundled with
+Omnibus GitLab packages.
+
+> **Notes:**
+- Redis requires authentication for High Availability. See
[Redis Security](http://redis.io/topics/security) documentation for more
information. We recommend using a combination of a Redis password and tight
firewall rules to secure your Redis service.
+- You are highly encouraged to read the [Redis Sentinel][sentinel] documentation
+ before configuring Redis HA with GitLab to fully understand the topology and
+ architecture.
+- This is the documentation for the Omnibus GitLab packages. For installations
+ from source, follow the [Redis HA source installation](redis_source.md) guide.
+- Redis Sentinel daemon is bundled with Omnibus GitLab Enterprise Edition only.
+ For configuring Sentinel with the Omnibus GitLab Community Edition and
+ installations from source, read the
+ [Available configuration setups](#available-configuration-setups) section
+ below.
+
+## Overview
+
+Before diving into the details of setting up Redis and Redis Sentinel for HA,
+make sure you read this Overview section to better understand how the components
+are tied together.
+
+You need at least `3` independent machines: physical, or VMs running into
+distinct physical machines. It is essential that all master and slaves Redis
+instances run in different machines. If you fail to provision the machines in
+that specific way, any issue with the shared environment can bring your entire
+setup down.
+
+It is OK to run a Sentinel along with a master or slave Redis instance.
+No more than one Sentinel in the same machine though.
+
+You also need to take in consideration the underlying network topology,
+making sure you have redundant connectivity between Redis / Sentinel and
+GitLab instances, otherwise the networks will become a single point of
+failure.
+
+Make sure that you read this document once as a whole before configuring the
+components below.
+
+### High Availability with Sentinel
+
+>**Notes:**
+- Starting with GitLab `8.11`, you can configure a list of Redis Sentinel
+ servers that will monitor a group of Redis servers to provide failover support.
+- Starting with GitLab `8.14`, the Omnibus GitLab Enterprise Edition package
+ comes with Redis Sentinel daemon built-in.
+
+High Availability with Redis requires a few things:
+
+- Multiple Redis instances
+- Run Redis in a **Master** x **Slave** topology
+- Multiple Sentinel instances
+- Application support and visibility to all Sentinel and Redis instances
+
+Redis Sentinel can handle the most important tasks in an HA environment and that's
+to help keep servers online with minimal to no downtime. Redis Sentinel:
+
+- Monitors **Master** and **Slaves** instances to see if they are available
+- Promotes a **Slave** to **Master** when the **Master** fails
+- Demotes a **Master** to **Slave** when the failed **Master** comes back online
+ (to prevent data-partitioning)
+- Can be queried by the application to always connect to the current **Master**
+ server
+
+When a **Master** fails to respond, it's the application's responsibility
+(in our case GitLab) to handle timeout and reconnect (querying a **Sentinel**
+for a new **Master**).
-## Configure your own Redis server
+To get a better understanding on how to correctly setup Sentinel, please read
+the [Redis Sentinel documentation](http://redis.io/topics/sentinel) first, as
+failing to configure it correctly can lead to data loss or can bring your
+whole cluster down, invalidating the failover effort.
-If you're hosting GitLab on a cloud provider, you can optionally use a
-managed service for Redis. For example, AWS offers a managed ElastiCache service
-that runs Redis.
+### Recommended setup
-## Configure Redis using Omnibus
+For a minimal setup, you will install the Omnibus GitLab package in `3`
+**independent** machines, both with **Redis** and **Sentinel**:
-If you don't want to bother setting up your own Redis server, you can use the
-one bundled with Omnibus. In this case, you should disable all services except
-Redis.
+- Redis Master + Sentinel
+- Redis Slave + Sentinel
+- Redis Slave + Sentinel
-1. Download/install GitLab Omnibus using **steps 1 and 2** from
- [GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
- steps on the download page.
-1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
- Be sure to change the `external_url` to match your eventual GitLab front-end
- URL:
+If you are not sure or don't understand why and where the amount of nodes come
+from, read [Redis setup overview](#redis-setup-overview) and
+[Sentinel setup overview](#sentinel-setup-overview).
- ```ruby
- external_url 'https://gitlab.example.com'
-
- # Disable all services except Redis
- redis['enable'] = true
- bootstrap['enable'] = false
- nginx['enable'] = false
- unicorn['enable'] = false
- sidekiq['enable'] = false
- postgresql['enable'] = false
- gitlab_workhorse['enable'] = false
- mailroom['enable'] = false
-
- # Redis configuration
- redis['port'] = 6379
- redis['bind'] = '0.0.0.0'
+For a recommended setup that can resist more failures, you will install
+the Omnibus GitLab package in `5` **independent** machines, both with
+**Redis** and **Sentinel**:
- # If you wish to use Redis authentication (recommended)
- redis['password'] = 'Redis Password'
- ```
+- Redis Master + Sentinel
+- Redis Slave + Sentinel
+- Redis Slave + Sentinel
+- Redis Slave + Sentinel
+- Redis Slave + Sentinel
-1. Run `sudo gitlab-ctl reconfigure` to install and configure PostgreSQL.
+### Redis setup overview
- > **Note**: This `reconfigure` step will result in some errors.
- That's OK - don't be alarmed.
+You must have at least `3` Redis servers: `1` Master, `2` Slaves, and they
+need to be each in a independent machine (see explanation above).
-1. Run `touch /etc/gitlab/skip-auto-migrations` to prevent database migrations
- from running on upgrade. Only the primary GitLab application server should
- handle migrations.
+You can have additional Redis nodes, that will help survive a situation
+where more nodes goes down. Whenever there is only `2` nodes online, a failover
+will not be initiated.
-## Experimental Redis Sentinel support
+As an example, if you have `6` Redis nodes, a maximum of `3` can be
+simultaneously down.
-> [Introduced][ce-1877] in GitLab 8.11.
+Please note that there are different requirements for Sentinel nodes.
+If you host them in the same Redis machines, you may need to take
+that restrictions into consideration when calculating the amount of
+nodes to be provisioned. See [Sentinel setup overview](#sentinel-setup-overview)
+documentation for more information.
-Since GitLab 8.11, you can configure a list of Redis Sentinel servers that
-will monitor a group of Redis servers to provide you with a standard failover
-support.
+All Redis nodes should be configured the same way and with similar server specs, as
+in a failover situation, any **Slave** can be promoted as the new **Master** by
+the Sentinel servers.
-There is currently one exception to the Sentinel support: `mail_room`, the
-component that processes incoming emails. It doesn't support Sentinel yet, but
-we hope to integrate a future release that does support it.
+The replication requires authentication, so you need to define a password to
+protect all Redis nodes and the Sentinels. They will all share the same
+password, and all instances must be able to talk to
+each other over the network.
-To get a better understanding on how to correctly setup Sentinel, please read
-the [Redis Sentinel documentation](http://redis.io/topics/sentinel) first, as
-failing to configure it correctly can lead to data loss.
+### Sentinel setup overview
-The configuration consists of three parts:
+Sentinels watch both other Sentinels and Redis nodes. Whenever a Sentinel
+detects that a Redis node is not responding, it will announce that to the
+other Sentinels. They have to reach the **quorum**, that is the minimum amount
+of Sentinels that agrees a node is down, in order to be able to start a failover.
-- Redis setup
-- Sentinel setup
-- GitLab setup
+Whenever the **quorum** is met, the **majority** of all known Sentinel nodes
+need to be available and reachable, so that they can elect the Sentinel **leader**
+who will take all the decisions to restore the service availability by:
-Read carefully how to configure those components below.
+- Promoting a new **Master**
+- Reconfiguring the other **Slaves** and make them point to the new **Master**
+- Announce the new **Master** to every other Sentinel peer
+- Reconfigure the old **Master** and demote to **Slave** when it comes back online
-### Redis setup
+You must have at least `3` Redis Sentinel servers, and they need to
+be each in a independent machine (that are believed to fail independently),
+ideally in different geographical areas.
-You must have at least 2 Redis servers: 1 Master, 1 or more Slaves.
-They should be configured the same way and with similar server specs, as
-in a failover situation, any Slave can be elected as the new Master by
-the Sentinel servers.
+You can configure them in the same machines where you've configured the other
+Redis servers, but understand that if a whole node goes down, you loose both
+a Sentinel and a Redis instance.
-In a minimal setup, the only required change for the slaves in `redis.conf`
-is the addition of a `slaveof` line pointing to the initial master.
-You can increase the security by defining a `requirepass` configuration in
-the master, and `masterauth` in slaves.
+The number of sentinels should ideally always be an **odd** number, for the
+consensus algorithm to be effective in the case of a failure.
----
+In a `3` nodes topology, you can only afford `1` Sentinel node going down.
+Whenever the **majority** of the Sentinels goes down, the network partition
+protection prevents destructive actions and a failover **will not be started**.
-**Configuring your own Redis server**
+Here are some examples:
-1. Add to the slaves' `redis.conf`:
+- With `5` or `6` sentinels, a maximum of `2` can go down for a failover begin.
+- With `7` sentinels, a maximum of `3` nodes can go down.
- ```conf
- # IP and port of the master Redis server
- slaveof 10.10.10.10 6379
- ```
+The **Leader** election can sometimes fail the voting round when **consensus**
+is not achieved (see the odd number of nodes requirement above). In that case,
+a new attempt will be made after the amount of time defined in
+`sentinel['failover_timeout']` (in milliseconds).
-1. Optionally, set up password authentication for increased security.
- Add the following to master's `redis.conf`:
+>**Note:**
+We will see where `sentinel['failover_timeout']` is defined later.
+
+The `failover_timeout` variable has a lot of different use cases. According to
+the official documentation:
+
+- The time needed to re-start a failover after a previous failover was
+ already tried against the same master by a given Sentinel, is two
+ times the failover timeout.
+
+- The time needed for a slave replicating to a wrong master according
+ to a Sentinel current configuration, to be forced to replicate
+ with the right master, is exactly the failover timeout (counting since
+ the moment a Sentinel detected the misconfiguration).
+
+- The time needed to cancel a failover that is already in progress but
+ did not produced any configuration change (SLAVEOF NO ONE yet not
+ acknowledged by the promoted slave).
+
+- The maximum time a failover in progress waits for all the slaves to be
+ reconfigured as slaves of the new master. However even after this time
+ the slaves will be reconfigured by the Sentinels anyway, but not with
+ the exact parallel-syncs progression as specified.
+
+### Available configuration setups
+
+Based on your infrastructure setup and how you have installed GitLab, there are
+multiple ways to configure Redis HA. Omnibus GitLab packages have Redis and/or
+Redis Sentinel bundled with them so you only need to focus on configuration.
+Pick the one that suits your needs.
+
+- [Installations from source][source]: You need to install Redis and Sentinel
+ yourself. Use the [Redis HA installation from source](redis_source.md)
+ documentation.
+- [Omnibus GitLab **Community Edition** (CE) package][ce]: Redis is bundled, so you
+ can use the package with only the Redis service enabled as described in steps
+ 1 and 2 of this document (works for both master and slave setups). To install
+ and configure Sentinel, jump directly to the Sentinel section in the
+ [Redis HA installation from source](redis_source.md#step-3-configuring-the-redis-sentinel-instances) documentation.
+- [Omnibus GitLab **Enterprise Edition** (EE) package][ee]: Both Redis and Sentinel
+ are bundled in the package, so you can use the EE package to setup the whole
+ Redis HA infrastructure (master, slave and Sentinel) which is described in
+ this document.
+- If you have installed GitLab using the Omnibus GitLab packages (CE or EE),
+ but you want to use your own external Redis server, follow steps 1-3 in the
+ [Redis HA installation from source](redis_source.md) documentation, then go
+ straight to step 4 in this guide to
+ [set up the GitLab application](#step-4-configuring-the-gitlab-application).
+
+## Configuring Redis HA
+
+This is the section where we install and setup the new Redis instances.
+
+>**Notes:**
+- We assume that you install GitLab and all HA components from scratch. If you
+ already have it installed and running, read how to
+ [switch from a single-machine installation to Redis HA](#switching-from-an-existing-single-machine-installation-to-redis-ha).
+- Redis nodes (both master and slaves) will need the same password defined in
+ `redis['password']`. At any time during a failover the Sentinels can
+ reconfigure a node and change its status from master to slave and vice versa.
+
+### Prerequisites
+
+The prerequisites for a HA Redis setup are the following:
+
+1. Provision the minimum required number of instances as specified in the
+ [recommended setup](#recommended-setup) section.
+1. **Do NOT** install Redis or Redis Sentinel in the same machines your
+ GitLab application is running on. You can however opt in to install Redis
+ and Sentinel in the same machine (each in independent ones is recommended
+ though).
+1. All Redis nodes must be able to talk to each other and accept incoming
+ connections over Redis (`6379`) and Sentinel (`26379`) ports (unless you
+ change the default ones).
+1. The server that hosts the GitLab application must be able to access the
+ Redis nodes.
+1. Protect the nodes from access from external networks ([Internet][it]), using
+ firewall.
+
+### Step 1. Configuring the master Redis instance
+
+1. SSH into the **master** Redis server.
+1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab
+ package you want using **steps 1 and 2** from the GitLab downloads page.
+ - Make sure you select the correct Omnibus package, with the same version
+ and type (Community, Enterprise editions) of your current install.
+ - Do not complete any other steps on the download page.
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
- ```conf
- # Optional password authentication for increased security
- requirepass "<password>"
+ ```ruby
+ # Enable the master role and disable all other services in the machine
+ # (you can still enable Sentinel).
+ redis_master_role['enable'] = true
+
+ # IP address pointing to a local IP that the other machines can reach to.
+ # You can also set bind to '0.0.0.0' which listen in all interfaces.
+ # If you really need to bind to an external accessible IP, make
+ # sure you add extra firewall rules to prevent unauthorized access.
+ redis['bind'] = '10.0.0.1'
+
+ # Define a port so Redis can listen for TCP requests which will allow other
+ # machines to connect to it.
+ redis['port'] = 6379
+
+ # Set up password authentication for Redis (use the same password in all nodes).
+ redis['password'] = 'redis-password-goes-here'
```
-1. Then add this line to all the slave servers' `redis.conf`:
+1. Only the primary GitLab application server should handle migrations. To
+ prevent database migrations from running on upgrade, add the following
+ configuration to your `/etc/gitlab/gitlab.rb` file:
- ```conf
- masterauth "<password>"
+ ```
+ gitlab_rails['auto_migrate'] = false
```
-1. Restart the Redis services for the changes to take effect.
+1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
----
+### Step 2. Configuring the slave Redis instances
-**Using Redis via Omnibus**
+1. SSH into the **slave** Redis server.
+1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab
+ package you want using **steps 1 and 2** from the GitLab downloads page.
+ - Make sure you select the correct Omnibus package, with the same version
+ and type (Community, Enterprise editions) of your current install.
+ - Do not complete any other steps on the download page.
-1. Edit `/etc/gitlab/gitlab.rb` of a master Redis machine (usualy a single machine):
+1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
```ruby
- ## Redis TCP support (will disable UNIX socket transport)
- redis['bind'] = '0.0.0.0' # or specify an IP to bind to a single one
+ # Enable the slave role and disable all other services in the machine
+ # (you can still enable Sentinel). This will also set automatically
+ # `redis['master'] = false`.
+ redis_slave_role['enable'] = true
+
+ # IP address pointing to a local IP that the other machines can reach to.
+ # You can also set bind to '0.0.0.0' which listen in all interfaces.
+ # If you really need to bind to an external accessible IP, make
+ # sure you add extra firewall rules to prevent unauthorized access.
+ redis['bind'] = '10.0.0.2'
+
+ # Define a port so Redis can listen for TCP requests which will allow other
+ # machines to connect to it.
redis['port'] = 6379
- ## Master redis instance
- redis['password'] = '<huge password string here>'
- ```
+ # The same password for Redeis authentication you set up for the master node.
+ redis['password'] = 'redis-password-goes-here'
-1. Edit `/etc/gitlab/gitlab.rb` of a slave Redis machine (should be one or more machines):
+ # The IP of the master Redis node.
+ redis['master_ip'] = '10.0.0.1'
- ```ruby
- ## Redis TCP support (will disable UNIX socket transport)
- redis['bind'] = '0.0.0.0' # or specify an IP to bind to a single one
- redis['port'] = 6379
+ # Port of master Redis server, uncomment to change to non default. Defaults
+ # to `6379`.
+ #redis['master_port'] = 6379
+ ```
- ## Slave redis instance
- redis['master_ip'] = '10.10.10.10' # IP of master Redis server
- redis['master_port'] = 6379 # Port of master Redis server
- redis['master_password'] = "<huge password string here>"
+1. To prevent database migrations from running on upgrade, run:
+
+ ```
+ sudo touch /etc/gitlab/skip-auto-migrations
```
-1. Reconfigure the GitLab for the changes to take effect: `sudo gitlab-ctl reconfigure`
+ Only the primary GitLab application server should handle migrations.
+
+1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+1. Go through the steps again for all the other slave nodes.
---
+These values don't have to be changed again in `/etc/gitlab/gitlab.rb` after
+a failover, as the nodes will be managed by the Sentinels, and even after a
+`gitlab-ctl reconfigure`, they will get their configuration restored by
+the same Sentinels.
+
+### Step 3. Configuring the Redis Sentinel instances
+
+>**Note:**
+Redis Sentinel is bundled with Omnibus GitLab Enterprise Edition only. The
+following section assumes you are using Omnibus GitLab Enterprise Edition.
+For the Omnibus Community Edition and installations from source, follow the
+[Redis HA source install](redis_source.md) guide.
+
Now that the Redis servers are all set up, let's configure the Sentinel
servers.
-### Sentinel setup
+If you are not sure if your Redis servers are working and replicating
+correctly, please read the [Troubleshooting Replication](#troubleshooting-replication)
+and fix it before proceeding with Sentinel setup.
-We don't provide yet an automated way to setup and run the Sentinel daemon
-from Omnibus installation method. You must follow the instructions below and
-run it by yourself.
+You must have at least `3` Redis Sentinel servers, and they need to
+be each in an independent machine. You can configure them in the same
+machines where you've configured the other Redis servers.
-The support for Sentinel in Ruby has some [caveats](https://github.com/redis/redis-rb/issues/531).
-While you can give any name for the `master-group-name` part of the
-configuration, as in this example:
+With GitLab Enterprise Edition, you can use the Omnibus package to setup
+multiple machines with the Sentinel daemon.
-```conf
-sentinel monitor <master-group-name> <ip> <port> <quorum>
-```
+---
-,for it to work in Ruby, you have to use the "hostname" of the master Redis
-server, otherwise you will get an error message like:
-`Redis::CannotConnectError: No sentinels available.`. Read
-[Sentinel troubleshooting](#sentinel-troubleshooting) for more information.
+1. SSH into the server that will host Redis Sentinel.
+1. **You can omit this step if the Sentinels will be hosted in the same node as
+ the other Redis instances.**
-Here is an example configuration file (`sentinel.conf`) for a Sentinel node:
+ [Download/install](https://about.gitlab.com/downloads-ee) the
+ Omnibus GitLab Enterprise Edition package using **steps 1 and 2** from the
+ GitLab downloads page.
+ - Make sure you select the correct Omnibus package, with the same version
+ the GitLab application is running.
+ - Do not complete any other steps on the download page.
-```conf
-port 26379
-sentinel monitor master-redis.example.com 10.10.10.10 6379 1
-sentinel down-after-milliseconds master-redis.example.com 10000
-sentinel config-epoch master-redis.example.com 0
-sentinel leader-epoch master-redis.example.com 0
-```
+1. Edit `/etc/gitlab/gitlab.rb` and add the contents (if you are installing the
+ Sentinels in the same node as the other Redis instances, some values might
+ be duplicate below):
----
+ ```ruby
+ redis_sentinel_role['enable'] = true
-The final part is to inform the main GitLab application server of the Redis
-master and the new sentinels servers.
+ # Must be the same in every sentinel node
+ redis['master_name'] = 'gitlab-redis'
-### GitLab setup
+ # The same password for Redis authentication you set up for the master node.
+ redis['password'] = 'redis-password-goes-here'
-You can enable or disable sentinel support at any time in new or existing
-installations. From the GitLab application perspective, all it requires is
-the correct credentials for the master Redis and for a few Sentinel nodes.
+ # The IP of the master Redis node.
+ redis['master_ip'] = '10.0.0.1'
-It doesn't require a list of all Sentinel nodes, as in case of a failure,
-the application will need to query only one of them.
+ # Define a port so Redis can listen for TCP requests which will allow other
+ # machines to connect to it.
+ redis['port'] = 6379
->**Note:**
-The following steps should be performed in the [GitLab application server](gitlab.md).
+ # Port of master Redis server, uncomment to change to non default. Defaults
+ # to `6379`.
+ #redis['master_port'] = 6379
+
+ ## Configure Sentinel
+ sentinel['bind'] = '10.0.0.1'
+
+ # Port that Sentinel listens on, uncomment to change to non default. Defaults
+ # to `26379`.
+ # sentinel['port'] = 26379
+
+ ## Quorum must reflect the amount of voting sentinels it take to start a failover.
+ ## Value must NOT be greater then the amount of sentinels.
+ ##
+ ## The quorum can be used to tune Sentinel in two ways:
+ ## 1. If a the quorum is set to a value smaller than the majority of Sentinels
+ ## we deploy, we are basically making Sentinel more sensible to master failures,
+ ## triggering a failover as soon as even just a minority of Sentinels is no longer
+ ## able to talk with the master.
+ ## 1. If a quorum is set to a value greater than the majority of Sentinels, we are
+ ## making Sentinel able to failover only when there are a very large number (larger
+ ## than majority) of well connected Sentinels which agree about the master being down.s
+ sentinel['quorum'] = 2
+
+ ## Consider unresponsive server down after x amount of ms.
+ # sentinel['down_after_milliseconds'] = 10000
+
+ ## Specifies the failover timeout in milliseconds. It is used in many ways:
+ ##
+ ## - The time needed to re-start a failover after a previous failover was
+ ## already tried against the same master by a given Sentinel, is two
+ ## times the failover timeout.
+ ##
+ ## - The time needed for a slave replicating to a wrong master according
+ ## to a Sentinel current configuration, to be forced to replicate
+ ## with the right master, is exactly the failover timeout (counting since
+ ## the moment a Sentinel detected the misconfiguration).
+ ##
+ ## - The time needed to cancel a failover that is already in progress but
+ ## did not produced any configuration change (SLAVEOF NO ONE yet not
+ ## acknowledged by the promoted slave).
+ ##
+ ## - The maximum time a failover in progress waits for all the slaves to be
+ ## reconfigured as slaves of the new master. However even after this time
+ ## the slaves will be reconfigured by the Sentinels anyway, but not with
+ ## the exact parallel-syncs progression as specified.
+ # sentinel['failover_timeout'] = 60000
+ ```
-**For source based installations**
+1. To prevent database migrations from running on upgrade, run:
-1. Edit `/home/git/gitlab/config/resque.yml` following the example in
- `/home/git/gitlab/config/resque.yml.example`, and uncomment the sentinels
- line, changing to the correct server credentials.
-1. Restart GitLab for the changes to take effect.
+ ```
+ sudo touch /etc/gitlab/skip-auto-migrations
+ ```
-**For Omnibus installations**
+ Only the primary GitLab application server should handle migrations.
+1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+1. Go through the steps again for all the other Sentinel nodes.
+
+### Step 4. Configuring the GitLab application
+
+The final part is to inform the main GitLab application server of the Redis
+Sentinels servers and authentication credentials.
+
+You can enable or disable Sentinel support at any time in new or existing
+installations. From the GitLab application perspective, all it requires is
+the correct credentials for the Sentinel nodes.
+
+While it doesn't require a list of all Sentinel nodes, in case of a failure,
+it needs to access at least one of the listed.
+
+>**Note:**
+The following steps should be performed in the [GitLab application server](gitlab.md)
+which ideally should not have Redis or Sentinels on it for a HA setup.
+
+1. SSH into the server where the GitLab application is installed.
1. Edit `/etc/gitlab/gitlab.rb` and add/change the following lines:
- ```ruby
- gitlab-rails['redis_host'] = "master-redis.example.com"
- gitlab-rails['redis_port'] = 6379
- gitlab-rails['redis_password'] = '<huge password string here>'
- gitlab-rails['redis_sentinels'] = [
- {'host' => '10.10.10.1', 'port' => 26379},
- {'host' => '10.10.10.2', 'port' => 26379},
- {'host' => '10.10.10.3', 'port' => 26379}
+ ```
+ ## Must be the same in every sentinel node
+ redis['master_name'] = 'gitlab-redis'
+
+ ## The same password for Redis authentication you set up for the master node.
+ redis['password'] = 'redis-password-goes-here'
+
+ ## A list of sentinels with `host` and `port`
+ gitlab_rails['redis_sentinels'] = [
+ {'host' => '10.0.0.1', 'port' => 26379},
+ {'host' => '10.0.0.2', 'port' => 26379},
+ {'host' => '10.0.0.3', 'port' => 26379}
]
```
-1. [Reconfigure] the GitLab for the changes to take effect.
+1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
-### Sentinel troubleshooting
+## Switching from an existing single-machine installation to Redis HA
-If you get an error like: `Redis::CannotConnectError: No sentinels available.`,
-there may be something wrong with your configuration files or it can be related
-to [this issue][gh-531] ([pull request][gh-534] that should make things better).
+If you already have a single-machine GitLab install running, you will need to
+replicate from this machine first, before de-activating the Redis instance
+inside it.
+
+Your single-machine install will be the initial **Master**, and the `3` others
+should be configured as **Slave** pointing to this machine.
+
+After replication catches up, you will need to stop services in the
+single-machine install, to rotate the **Master** to one of the new nodes.
-It's a bit rigid the way you have to config `resque.yml` and `sentinel.conf`,
-otherwise `redis-rb` will not work properly.
+Make the required changes in configuration and restart the new nodes again.
-The hostname ('my-primary-redis') of the primary Redis server (`sentinel.conf`)
-**must** match the one configured in GitLab (`resque.yml` for source installations
-or `gitlab-rails['redis_*']` in Omnibus) and it must be valid ex:
+To disable redis in the single install, edit `/etc/gitlab/gitlab.rb`:
-```conf
-# sentinel.conf:
-sentinel monitor my-primary-redis 10.10.10.10 6379 1
-sentinel down-after-milliseconds my-primary-redis 10000
-sentinel config-epoch my-primary-redis 0
-sentinel leader-epoch my-primary-redis 0
+```ruby
+redis['enable'] = false
```
-```yaml
-# resque.yaml
-production:
- url: redis://my-primary-redis:6378
- sentinels:
- -
- host: slave1
- port: 26380 # point to sentinel, not to redis port
- -
- host: slave2
- port: 26381 # point to sentinel, not to redis port
+If you fail to replicate first, you may loose data (unprocessed background jobs).
+
+## Example of a minimal configuration with 1 master, 2 slaves and 3 Sentinels
+
+>**Note:**
+Redis Sentinel is bundled with Omnibus GitLab Enterprise Edition only. For
+different setups, read the
+[available configuration setups](#available-configuration-setups) section.
+
+In this example we consider that all servers have an internal network
+interface with IPs in the `10.0.0.x` range, and that they can connect
+to each other using these IPs.
+
+In a real world usage, you would also setup firewall rules to prevent
+unauthorized access from other machines and block traffic from the
+outside (Internet).
+
+We will use the same `3` nodes with **Redis** + **Sentinel** topology
+discussed in [Redis setup overview](#redis-setup-overview) and
+[Sentinel setup overview](#sentinel-setup-overview) documentation.
+
+Here is a list and description of each **machine** and the assigned **IP**:
+
+* `10.0.0.1`: Redis Master + Sentinel 1
+* `10.0.0.2`: Redis Slave 1 + Sentinel 2
+* `10.0.0.3`: Redis Slave 2 + Sentinel 3
+* `10.0.0.4`: GitLab application
+
+Please note that after the initial configuration, if a failover is initiated
+by the Sentinel nodes, the Redis nodes will be reconfigured and the **Master**
+will change permanently (including in `redis.conf`) from one node to the other,
+until a new failover is initiated again.
+
+The same thing will happen with `sentinel.conf` that will be overridden after the
+initial execution, after any new sentinel node starts watching the **Master**,
+or a failover promotes a different **Master** node.
+
+### Example configuration for Redis master and Sentinel 1
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+redis_master_role['enable'] = true
+redis_sentinel_role['enable'] = true
+redis['bind'] = '10.0.0.1'
+redis['port'] = 6379
+redis['password'] = 'redis-password-goes-here'
+redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node
+redis['master_password'] = 'redis-password-goes-here' # the same value defined in redis['password'] in the master instance
+redis['master_ip'] = '10.0.0.1' # ip of the initial master redis instance
+#redis['master_port'] = 6379 # port of the initial master redis instance, uncomment to change to non default
+sentinel['bind'] = '10.0.0.1'
+# sentinel['port'] = 26379 # uncomment to change default port
+sentinel['quorum'] = 2
+# sentinel['down_after_milliseconds'] = 10000
+# sentinel['failover_timeout'] = 60000
+```
+
+[Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+
+### Example configuration for Redis slave 1 and Sentinel 2
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+redis_slave_role['enable'] = true
+redis_sentinel_role['enable'] = true
+redis['bind'] = '10.0.0.2'
+redis['port'] = 6379
+redis['password'] = 'redis-password-goes-here'
+redis['master_password'] = 'redis-password-goes-here'
+redis['master_ip'] = '10.0.0.1' # IP of master Redis server
+#redis['master_port'] = 6379 # Port of master Redis server, uncomment to change to non default
+redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node
+sentinel['bind'] = '10.0.0.2'
+# sentinel['port'] = 26379 # uncomment to change default port
+sentinel['quorum'] = 2
+# sentinel['down_after_milliseconds'] = 10000
+# sentinel['failover_timeout'] = 60000
+```
+
+[Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+
+### Example configuration for Redis slave 2 and Sentinel 3
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+redis_slave_role['enable'] = true
+redis_sentinel_role['enable'] = true
+redis['bind'] = '10.0.0.3'
+redis['port'] = 6379
+redis['password'] = 'redis-password-goes-here'
+redis['master_password'] = 'redis-password-goes-here'
+redis['master_ip'] = '10.0.0.1' # IP of master Redis server
+#redis['master_port'] = 6379 # Port of master Redis server, uncomment to change to non default
+redis['master_name'] = 'gitlab-redis' # must be the same in every sentinel node
+sentinel['bind'] = '10.0.0.3'
+# sentinel['port'] = 26379 # uncomment to change default port
+sentinel['quorum'] = 2
+# sentinel['down_after_milliseconds'] = 10000
+# sentinel['failover_timeout'] = 60000
```
-When in doubt, please read [Redis Sentinel documentation](http://redis.io/topics/sentinel)
+[Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+
+### Example configuration for the GitLab application
+
+In `/etc/gitlab/gitlab.rb`:
+
+```ruby
+redis['master_name'] = 'gitlab-redis'
+redis['password'] = 'redis-password-goes-here'
+gitlab_rails['redis_sentinels'] = [
+ {'host' => '10.0.0.1', 'port' => 26379},
+ {'host' => '10.0.0.2', 'port' => 26379},
+ {'host' => '10.0.0.3', 'port' => 26379}
+]
+```
+
+[Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+
+## Advanced configuration
+
+Omnibus GitLab configures some things behind the curtains to make the sysadmins'
+lives easier. If you want to know what happens underneath keep reading.
+
+### Control running services
+
+In the previous example, we've used `redis_sentinel_role` and
+`redis_master_role` which simplifies the amount of configuration changes.
+
+If you want more control, here is what each one sets for you automatically
+when enabled:
+
+```ruby
+## Redis Sentinel Role
+redis_sentinel_role['enable'] = true
+
+# When Sentinel Role is enabled, the following services are also enabled
+sentinel['enable'] = true
+
+# The following services are disabled
+redis['enable'] = false
+bootstrap['enable'] = false
+nginx['enable'] = false
+postgresql['enable'] = false
+gitlab_rails['enable'] = false
+mailroom['enable'] = false
+
+-------
+
+## Redis master/slave Role
+redis_master_role['enable'] = true # enable only one of them
+redis_slave_role['enable'] = true # enable only one of them
+
+# When Redis Master or Slave role are enabled, the following services are
+# enabled/disabled. Note that if Redis and Sentinel roles are combined, both
+# services will be enabled.
+
+# The following services are disabled
+sentinel['enable'] = false
+bootstrap['enable'] = false
+nginx['enable'] = false
+postgresql['enable'] = false
+gitlab_rails['enable'] = false
+mailroom['enable'] = false
+
+# For Redis Slave role, also change this setting from default 'true' to 'false':
+redis['master'] = false
+```
+
+You can find the relevant attributes defined in [gitlab_rails.rb][omnifile].
+
+## Troubleshooting
+
+There are a lot of moving parts that needs to be taken care carefully
+in order for the HA setup to work as expected.
+
+Before proceeding with the troubleshooting below, check your firewall rules:
+
+- Redis machines
+ - Accept TCP connection in `6379`
+ - Connect to the other Redis machines via TCP in `6379`
+- Sentinel machines
+ - Accept TCP connection in `26379`
+ - Connect to other Sentinel machines via TCP in `26379`
+ - Connect to the Redis machines via TCP in `6379`
+
+### Troubleshooting Redis replication
+
+You can check if everything is correct by connecting to each server using
+`redis-cli` application, and sending the `INFO` command.
+
+If authentication was correctly defined, it should fail with:
+`NOAUTH Authentication required` error. Try to authenticate with the
+previous defined password with `AUTH redis-password-goes-here` and
+try the `INFO` command again.
+
+Look for the `# Replication` section where you should see some important
+information like the `role` of the server.
+
+When connected to a `master` redis, you will see the number of connected
+`slaves`, and a list of each with connection details:
+
+```
+# Replication
+role:master
+connected_slaves:1
+slave0:ip=10.133.5.21,port=6379,state=online,offset=208037514,lag=1
+master_repl_offset:208037658
+repl_backlog_active:1
+repl_backlog_size:1048576
+repl_backlog_first_byte_offset:206989083
+repl_backlog_histlen:1048576
+```
+
+When it's a `slave`, you will see details of the master connection and if
+its `up` or `down`:
+
+```
+# Replication
+role:slave
+master_host:10.133.1.58
+master_port:6379
+master_link_status:up
+master_last_io_seconds_ago:1
+master_sync_in_progress:0
+slave_repl_offset:208096498
+slave_priority:100
+slave_read_only:1
+connected_slaves:0
+master_repl_offset:0
+repl_backlog_active:0
+repl_backlog_size:1048576
+repl_backlog_first_byte_offset:0
+repl_backlog_histlen:0
+```
+
+### Troubleshooting Sentinel
+
+If you get an error like: `Redis::CannotConnectError: No sentinels available.`,
+there may be something wrong with your configuration files or it can be related
+to [this issue][gh-531].
+
+You must make sure you are defining the same value in `redis['master_name']`
+and `redis['master_pasword']` as you defined for your sentinel node.
+
+The way the redis connector `redis-rb` works with sentinel is a bit
+non-intuitive. We try to hide the complexity in omnibus, but it still requires
+a few extra configs.
---
@@ -273,7 +788,7 @@ To make sure your configuration is correct:
sudo gitlab-rails console
# For source installations
- sudo -u git rails console RAILS_ENV=production
+ sudo -u git rails console production
```
1. Run in the console:
@@ -288,8 +803,8 @@ To make sure your configuration is correct:
1. To simulate a failover on master Redis, SSH into the Redis server and run:
```bash
- # port must match your master redis port
- redis-cli -h localhost -p 6379 DEBUG sleep 60
+ # port must match your master redis port, and the sleep time must be a few seconds bigger than defined one
+ redis-cli -h localhost -p 6379 DEBUG sleep 20
```
1. Then back in the Rails console from the first step, run:
@@ -301,10 +816,26 @@ To make sure your configuration is correct:
You should see a different port after a few seconds delay
(the failover/reconnect time).
----
-Read more on high-availability configuration:
+## Changelog
+
+Changes to Redis HA over time.
+
+**8.14**
+
+- Redis Sentinel support is production-ready and bundled in the Omnibus GitLab
+ Enterprise Edition package
+- Documentation restructure for better readability
+
+**8.11**
+
+- Experimental Redis Sentinel support was added
+
+## Further reading
+
+Read more on High Availability:
+1. [High Availability Overview](README.md)
1. [Configure the database](database.md)
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
@@ -315,3 +846,10 @@ Read more on high-availability configuration:
[reconfigure]: ../restart_gitlab.md#omnibus-gitlab-reconfigure
[gh-531]: https://github.com/redis/redis-rb/issues/531
[gh-534]: https://github.com/redis/redis-rb/issues/534
+[redis]: http://redis.io/
+[sentinel]: http://redis.io/topics/sentinel
+[omnifile]: https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/gitlab/libraries/gitlab_rails.rb
+[source]: ../../install/installation.md
+[ce]: https://about.gitlab.com/downloads
+[ee]: https://about.gitlab.com/downloads-ee
+[it]: https://gitlab.com/gitlab-org/gitlab-ce/uploads/c4cc8cd353604bd80315f9384035ff9e/The_Internet_IT_Crowd.png