summaryrefslogtreecommitdiff
path: root/doc/administration
diff options
context:
space:
mode:
authorMarcel Amirault <ravlen@gmail.com>2019-05-05 22:58:11 +0000
committerAchilleas Pipinellis <axil@gitlab.com>2019-05-05 22:58:11 +0000
commit4b972049f959ff70a197b7b15fc2439f0b0af342 (patch)
tree41f7e6ac12349af9139ec4bf1e0966f5f0a41d8a /doc/administration
parent65db7fb35cb477205d76bf8c99b32d9ece8fcc57 (diff)
downloadgitlab-ce-4b972049f959ff70a197b7b15fc2439f0b0af342.tar.gz
Docs: Merge EE doc/administration/high_availability to CE
Diffstat (limited to 'doc/administration')
-rw-r--r--doc/administration/high_availability/README.md180
-rw-r--r--doc/administration/high_availability/alpha_database.md6
-rw-r--r--doc/administration/high_availability/consul.md105
-rw-r--r--doc/administration/high_availability/database.md1158
-rw-r--r--doc/administration/high_availability/gitaly.md90
-rw-r--r--doc/administration/high_availability/gitlab.md6
-rw-r--r--doc/administration/high_availability/nfs_host_client_setup.md135
-rw-r--r--doc/administration/high_availability/pg_ha_architecture.pngbin0 -> 18412 bytes
-rw-r--r--doc/administration/high_availability/pgbouncer.md132
-rw-r--r--doc/administration/high_availability/redis.md160
10 files changed, 1854 insertions, 118 deletions
diff --git a/doc/administration/high_availability/README.md b/doc/administration/high_availability/README.md
index c69ac52a41b..de61c01991b 100644
--- a/doc/administration/high_availability/README.md
+++ b/doc/administration/high_availability/README.md
@@ -1,4 +1,4 @@
-# High Availability
+# Scaling and High Availability
GitLab supports several different types of clustering and high-availability.
The solution you choose will be based on the level of scalability and
@@ -13,47 +13,173 @@ of Git, developers can still commit code locally even when GitLab is not
available. However, some GitLab features such as the issue tracker and
Continuous Integration are not available when GitLab is down.
-**Keep in mind that all Highly Available solutions come with a trade-off between
+**Keep in mind that all highly-available solutions come with a trade-off between
cost/complexity and uptime**. The more uptime you want, the more complex the
solution. And the more complex the solution, the more work is involved in
setting up and maintaining it. High availability is not free and every HA
solution should balance the costs against the benefits.
-## Architecture
+There are many options when choosing a highly-available GitLab architecture. We
+recommend engaging with GitLab Support to choose the best architecture for your
+use-case. This page contains some various options and guidelines based on
+experience with GitLab.com and Enterprise Edition on-premises customers.
-There are two kinds of setups:
+For a detailed insight into how GitLab scales and configures GitLab.com, you can
+watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac)
+with [John Northrup](https://gitlab.com/northrup), one of our infrastructure
+engineers, and live questions coming in from some of our customers.
-- active/active
-- active/passive
+## GitLab Components
-### Active/Active
+The following components need to be considered for a scaled or highly-available
+environment. In many cases components can be combined on the same nodes to reduce
+complexity.
-This architecture scales easily because all application servers handle
-user requests simultaneously. The database, Redis, and GitLab application are
-all deployed on separate servers. The configuration is **only** highly-available
-if the database, Redis and storage are also configured as such.
+- Unicorn/Workhorse - Web-requests (UI, API, Git over HTTP)
+- Sidekiq - Asynchronous/Background jobs
+- PostgreSQL - Database
+ - Consul - Database service discovery and health checks/failover
+ - PGBouncer - Database pool manager
+- Redis - Key/Value store (User sessions, cache, queue for Sidekiq)
+ - Sentinel - Redis health check/failover manager
+- Gitaly - Provides high-level RPC access to Git repositories
-Follow the steps below to configure an active/active setup:
+## Scalable Architecture Examples
+
+When an organization reaches a certain threshold it will be necessary to scale
+the GitLab instance. Still, true high availability may not be necessary. There
+are options for scaling GitLab instances relatively easily without incurring the
+infrastructure and maintenance costs of full high availability.
+
+### Basic Scaling
+
+This is the simplest form of scaling and will work for the majority of
+cases. Backend components such as PostgreSQL, Redis and storage are offloaded
+to their own nodes while the remaining GitLab components all run on 2 or more
+application nodes.
+
+This form of scaling also works well in a cloud environment when it is more
+cost-effective to deploy several small nodes rather than a single
+larger one.
+
+- 1 PostgreSQL node
+- 1 Redis node
+- 2 or more GitLab application nodes (Unicorn, Workhorse, Sidekiq)
+- 1 NFS/Gitaly storage server
+
+#### Installation Instructions
+
+Complete the following installation steps in order. A link at the end of each
+section will bring you back to the Scalable Architecture Examples section so
+you can continue with the next step.
+
+1. [PostgreSQL](./database.md#postgresql-in-a-scaled-environment)
+1. [Redis](./redis.md#redis-in-a-scaled-environment)
+1. [Gitaly](./gitaly.md) (recommended) or [NFS](./nfs.md)
+1. [GitLab application nodes](./gitlab.md)
+
+### Full Scaling
+
+For very large installations it may be necessary to further split components
+for maximum scalability. In a fully-scaled architecture the application node
+is split into separate Sidekiq and Unicorn/Workhorse nodes. One indication that
+this architecture is required is if Sidekiq queues begin to periodically increase
+in size, indicating that there is contention or not enough resources.
+
+- 1 PostgreSQL node
+- 1 Redis node
+- 2 or more GitLab application nodes (Unicorn, Workhorse)
+- 2 or more Sidekiq nodes
+- 2 or more NFS/Gitaly storage servers
+
+## High Availability Architecture Examples
+
+When organizations require scaling *and* high availability the following
+architectures can be utilized. As the introduction section at the top of this
+page mentions, there is a tradeoff between cost/complexity and uptime. Be sure
+this complexity is absolutely required before taking the step into full
+high availability.
+
+For all examples below, we recommend running Consul and Redis Sentinel on
+dedicated nodes. If Consul is running on PostgreSQL nodes or Sentinel on
+Redis nodes there is a potential that high resource usage by PostgreSQL or
+Redis could prevent communication between the other Consul and Sentinel nodes.
+This may lead to the other nodes believing a failure has occurred and automated
+failover is necessary. Isolating them from the services they monitor reduces
+the chances of split-brain.
+
+The examples below do not really address high availability of NFS. Some enterprises
+have access to NFS appliances that manage availability. This is the best case
+scenario. In the future, GitLab may offer a more user-friendly solution to
+[GitLab HA Storage](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2472).
+
+There are many options in between each of these examples. Work with GitLab Support
+to understand the best starting point for your workload and adapt from there.
+
+### Horizontal
+
+This is the simplest form of high availability and scaling. It requires the
+fewest number of individual servers (virtual or physical) but does have some
+trade-offs and limits.
+
+This architecture will work well for many GitLab customers. Larger customers
+may begin to notice certain events cause contention/high load - for example,
+cloning many large repositories with binary files, high API usage, a large
+number of enqueued Sidekiq jobs, etc. If this happens you should consider
+moving to a hybrid or fully distributed architecture depending on what is causing
+the contention.
+
+- 3 PostgreSQL nodes
+- 2 Redis nodes
+- 3 Consul/Sentinel nodes
+- 2 or more GitLab application nodes (Unicorn, Workhorse, Sidekiq, PGBouncer)
+- 1 NFS/Gitaly server
+
+![Horizontal architecture diagram](https://docs.gitlab.com/ee/administration/img/high_availability/horizontal.png)
+
+### Hybrid
+
+In this architecture, certain components are split on dedicated nodes so high
+resource usage of one component does not interfere with others. In larger
+environments this is a good architecture to consider if you foresee or do have
+contention due to certain workloads.
+
+- 3 PostgreSQL nodes
+- 2 Redis nodes
+- 3 Consul/Sentinel nodes
+- 2 or more Sidekiq nodes
+- 2 or more Web nodes (Unicorn, Workhorse, PGBouncer)
+- 1 or more NFS/Gitaly servers
+
+![Hybrid architecture diagram](https://docs.gitlab.com/ee/administration/img/high_availability/hybrid.png)
+
+### Fully Distributed
+
+This architecture scales to hundreds of thousands of users and projects and is
+the basis of the GitLab.com architecture. While this scales well it also comes
+with the added complexity of many more nodes to configure, manage and monitor.
+
+- 3 PostgreSQL nodes
+- 4 or more Redis nodes (2 separate clusters for persistent and cache data)
+- 3 Consul nodes
+- 3 Sentinel nodes
+- Multiple dedicated Sidekiq nodes (Split into real-time, best effort, ASAP,
+ CI Pipeline and Pull Mirror sets)
+- 2 or more Git nodes (Git over SSH/Git over HTTP)
+- 2 or more API nodes (All requests to `/api`)
+- 2 or more Web nodes (All other web requests)
+- 2 or more NFS/Gitaly servers
+
+![Fully Distributed architecture diagram](https://docs.gitlab.com/ee/administration/img/high_availability/fully-distributed.png)
+
+The following pages outline the steps necessary to configure each component
+separately:
1. [Configure the database](database.md)
1. [Configure Redis](redis.md)
1. [Configure Redis for GitLab source installations](redis_source.md)
1. [Configure NFS](nfs.md)
+ 1. [NFS Client and Host setup](nfs_host_client_setup.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
-### Active/Passive
-
-For pure high-availability/failover with no scaling you can use an
-active/passive configuration. This utilizes DRBD (Distributed Replicated
-Block Device) to keep all data in sync. DRBD requires a low latency link to
-remain in sync. It is not advisable to attempt to run DRBD between data centers
-or in different cloud availability zones.
-
-> **Note:** GitLab recommends against choosing this HA method because of the
- complexity of managing DRBD and crafting automatic failover. This is
- *compatible* with GitLab, but not officially *supported*. If you are
- an EE customer, support will help you with GitLab related problems, but if the
- root cause is identified as DRBD, we will not troubleshoot further.
-
-Components/Servers Required: 2 servers/virtual machines (one active/one passive)
diff --git a/doc/administration/high_availability/alpha_database.md b/doc/administration/high_availability/alpha_database.md
new file mode 100644
index 00000000000..7bf20be60e6
--- /dev/null
+++ b/doc/administration/high_availability/alpha_database.md
@@ -0,0 +1,6 @@
+---
+redirect_to: 'database.md'
+---
+
+This documentation has been moved to the main
+[database documentation](database.md#configure_using_omnibus_for_high_availability).
diff --git a/doc/administration/high_availability/consul.md b/doc/administration/high_availability/consul.md
new file mode 100644
index 00000000000..056b7fc15d9
--- /dev/null
+++ b/doc/administration/high_availability/consul.md
@@ -0,0 +1,105 @@
+# Working with the bundled Consul service **[PREMIUM ONLY]**
+
+## Overview
+
+As part of its High Availability stack, GitLab Premium includes a bundled version of [Consul](http://consul.io) that can be managed through `/etc/gitlab/gitlab.rb`.
+
+A Consul cluster consists of multiple server agents, as well as client agents that run on other nodes which need to talk to the consul cluster.
+
+## Operations
+
+### Checking cluster membership
+
+To see which nodes are part of the cluster, run the following on any member in the cluster
+```
+# /opt/gitlab/embedded/bin/consul members
+Node Address Status Type Build Protocol DC
+consul-b XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul
+consul-c XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul
+consul-c XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul
+db-a XX.XX.X.Y:8301 alive client 0.9.0 2 gitlab_consul
+db-b XX.XX.X.Y:8301 alive client 0.9.0 2 gitlab_consul
+```
+
+Ideally all nodes will have a `Status` of `alive`.
+
+### Restarting the server cluster
+
+**Note**: This section only applies to server agents. It is safe to restart client agents whenever needed.
+
+If it is necessary to restart the server cluster, it is important to do this in a controlled fashion in order to maintain quorum. If quorum is lost, you will need to follow the consul [outage recovery](#outage-recovery) process to recover the cluster.
+
+To be safe, we recommend you only restart one server agent at a time to ensure the cluster remains intact.
+
+For larger clusters, it is possible to restart multiple agents at a time. See the [Consul consensus document](https://www.consul.io/docs/internals/consensus.html#deployment-table) for how many failures it can tolerate. This will be the number of simulateneous restarts it can sustain.
+
+## Troubleshooting
+
+### Consul server agents unable to communicate
+
+By default, the server agents will attempt to [bind](https://www.consul.io/docs/agent/options.html#_bind) to '0.0.0.0', but they will advertise the first private IP address on the node for other agents to communicate with them. If the other nodes cannot communicate with a node on this address, then the cluster will have a failed status.
+
+You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue:
+
+```
+2017-09-25_19:53:39.90821 2017/09/25 19:53:39 [WARN] raft: no known peers, aborting election
+2017-09-25_19:53:41.74356 2017/09/25 19:53:41 [ERR] agent: failed to sync remote state: No cluster leader
+```
+
+
+To fix this:
+
+1. Pick an address on each node that all of the other nodes can reach this node through.
+1. Update your `/etc/gitlab/gitlab.rb`
+
+ ```ruby
+ consul['configuration'] = {
+ ...
+ bind_addr: 'IP ADDRESS'
+ }
+ ```
+1. Run `gitlab-ctl reconfigure`
+
+If you still see the errors, you may have to [erase the consul database and reinitialize](#recreate-from-scratch) on the affected node.
+
+### Consul agents do not start - Multiple private IPs
+
+In the case that a node has multiple private IPs the agent be confused as to which of the private addresses to advertise, and then immediately exit on start.
+
+You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue:
+
+```
+2017-11-09_17:41:45.52876 ==> Starting Consul agent...
+2017-11-09_17:41:45.53057 ==> Error creating agent: Failed to get advertise address: Multiple private IPs found. Please configure one.
+```
+
+To fix this:
+
+1. Pick an address on the node that all of the other nodes can reach this node through.
+1. Update your `/etc/gitlab/gitlab.rb`
+
+ ```ruby
+ consul['configuration'] = {
+ ...
+ bind_addr: 'IP ADDRESS'
+ }
+ ```
+1. Run `gitlab-ctl reconfigure`
+
+### Outage recovery
+
+If you lost enough server agents in the cluster to break quorum, then the cluster is considered failed, and it will not function without manual intervenetion.
+
+#### Recreate from scratch
+By default, GitLab does not store anything in the consul cluster that cannot be recreated. To erase the consul database and reinitialize
+
+```
+# gitlab-ctl stop consul
+# rm -rf /var/opt/gitlab/consul/data
+# gitlab-ctl start consul
+```
+
+After this, the cluster should start back up, and the server agents rejoin. Shortly after that, the client agents should rejoin as well.
+
+#### Recover a failed cluster
+If you have taken advantage of consul to store other data, and want to restore the failed cluster, please follow the [Consul guide](https://www.consul.io/docs/guides/outage.html) to recover a failed cluster.
diff --git a/doc/administration/high_availability/database.md b/doc/administration/high_availability/database.md
index c1eeb40b98f..1648b6b848a 100644
--- a/doc/administration/high_availability/database.md
+++ b/doc/administration/high_availability/database.md
@@ -1,11 +1,6 @@
-# Configuring a Database for GitLab HA
+# Configuring PostgreSQL for Scaling and High Availability
-You can choose to install and manage a database server (PostgreSQL/MySQL)
-yourself, or you can use GitLab Omnibus packages to help. GitLab recommends
-PostgreSQL. This is the database that will be installed if you use the
-Omnibus package to manage your database.
-
-## Configure your own database server
+## Provide your own PostgreSQL instance **[CORE ONLY]**
If you're hosting GitLab on a cloud provider, you can optionally use a
managed service for PostgreSQL. For example, AWS offers a managed Relational
@@ -20,91 +15,1147 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring GitLab for HA](gitlab.md).
-## Configure using Omnibus
+## PostgreSQL in a Scaled Environment
-1. Download/install GitLab Omnibus using **steps 1 and 2** from
- [GitLab downloads](https://about.gitlab.com/downloads). Do not complete other
- steps on the download page.
-1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration.
- Be sure to change the `external_url` to match your eventual GitLab front-end
- URL. If there is a directive listed below that you do not see in the configuration, be sure to add it.
+This section is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples)
+environments including [Basic Scaling](./README.md#basic-scaling) and
+[Full Scaling](./README.md#full-scaling).
- ```ruby
- external_url 'https://gitlab.example.com'
+### Provide your own PostgreSQL instance **[CORE ONLY]**
+
+If you want to use your own deployed PostgreSQL instance(s),
+see [Provide your own PostgreSQL instance](#provide-your-own-postgresql-instance-core-only)
+for more details. However, you can use the GitLab Omnibus package to easily
+deploy the bundled PostgreSQL.
+
+### Standalone PostgreSQL using GitLab Omnibus **[CORE ONLY]**
+
+1. SSH into the PostgreSQL server.
+1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab
+ package you want using **steps 1 and 2** from the GitLab downloads page.
+ - Do not complete any other steps on the download page.
+1. Generate a password hash for PostgreSQL. This assumes you will use the default
+ username of `gitlab` (recommended). The command will request a password
+ and confirmation. Use the value that is output by this command in the next
+ step as the value of `POSTGRESQL_PASSWORD_HASH`.
+ ```sh
+ sudo gitlab-ctl pg-password-md5 gitlab
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the contents below, updating placeholder
+ values appropriately.
+
+ - `POSTGRESQL_PASSWORD_HASH` - The value output from the previous step
+ - `APPLICATION_SERVER_IP_BLOCKS` - A space delimited list of IP subnets or IP
+ addresses of the GitLab application servers that will connect to the
+ database. Example: `%w(123.123.123.123/32 123.123.123.234/32)`
+
+ ```ruby
# Disable all components except PostgreSQL
roles ['postgres_role']
+ repmgr['enable'] = false
+ consul['enable'] = false
+ prometheus['enable'] = false
+ alertmanager['enable'] = false
+ pgbouncer_exporter['enable'] = false
+ redis_exporter['enable'] = false
+ gitlab_monitor['enable'] = false
+
+ postgresql['listen_address'] = '0.0.0.0'
+ postgresql['port'] = 5432
+
+ # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'
+
+ # Replace XXX.XXX.XXX.XXX/YY with Network Address
+ # ????
+ postgresql['trust_auth_cidr_addresses'] = %w(APPLICATION_SERVER_IP_BLOCKS)
+
+ # Disable automatic database migrations
+ gitlab_rails['auto_migrate'] = false
+ ```
+
+ NOTE: **Note:** The role `postgres_role` was introduced with GitLab 10.3
+
+1. [Reconfigure GitLab] for the changes to take effect.
+1. Note the PostgreSQL node's IP address or hostname, port, and
+ plain text password. These will be necessary when configuring the GitLab
+ application servers later.
+
+Advanced configuration options are supported and can be added if
+needed.
+
+Continue configuration of other components by going
+[back to Scaled Architectures](./README.md#scalable-architecture-examples)
+
+## PostgreSQL with High Availability
+
+This section is relevant for [High Availability Architecture](./README.md#high-availability-architecture-examples)
+environments including [Horizontal](./README.md#horizontal),
+[Hybrid](./README.md#hybrid), and
+[Fully Distributed](./README.md#fully-distributed).
+
+### Provide your own PostgreSQL instance **[CORE ONLY]**
+
+If you want to use your own deployed PostgreSQL instance(s),
+see [Provide your own PostgreSQL instance](#provide-your-own-postgresql-instance-core-only)
+for more details. However, you can use the GitLab Omnibus package to easily
+deploy the bundled PostgreSQL.
+
+### High Availability with GitLab Omnibus **[PREMIUM ONLY]**
+
+> Important notes:
+> - This document will focus only on configuration supported with [GitLab Premium](https://about.gitlab.com/pricing/), using the Omnibus GitLab package.
+> - If you are a Community Edition or Starter user, consider using a cloud hosted solution.
+> - This document will not cover installations from source.
+>
+> - If HA setup is not what you were looking for, see the [database configuration document](http://docs.gitlab.com/omnibus/settings/database.html)
+> for the Omnibus GitLab packages.
+
+> Please read this document fully before attempting to configure PostgreSQL HA
+> for GitLab.
+>
+> This configuration is GA in EE 10.2.
+
+The recommended configuration for a PostgreSQL HA requires:
+
+- A minimum of three database nodes
+ - Each node will run the following services:
+ - `PostgreSQL` - The database itself
+ - `repmgrd` - A service to monitor, and handle failover in case of a failure
+ - `Consul` agent - Used for service discovery, to alert other nodes when failover occurs
+- A minimum of three `Consul` server nodes
+- A minimum of one `pgbouncer` service node
+
+You also need to take into consideration the underlying network topology,
+making sure you have redundant connectivity between all Database and GitLab instances,
+otherwise the networks will become a single point of failure.
+
+#### Architecture
+
+![PG HA Architecture](pg_ha_architecture.png)
+
+Database nodes run two services with PostgreSQL:
+
+- Repmgrd. Monitors the cluster and handles failover when issues with the master occur. The failover consists of:
+ - Selecting a new master for the cluster.
+ - Promoting the new node to master.
+ - Instructing remaining servers to follow the new master node.
+
+ On failure, the old master node is automatically evicted from the cluster, and should be rejoined manually once recovered.
+- Consul. Monitors the status of each node in the database cluster and tracks its health in a service definition on the consul cluster.
+
+Alongside pgbouncer, there is a consul agent that watches the status of the PostgreSQL service. If that status changes, consul runs a script which updates the configuration and reloads pgbouncer
+
+##### Connection flow
+
+Each service in the package comes with a set of [default ports](https://docs.gitlab.com/omnibus/package-information/defaults.html#ports). You may need to make specific firewall rules for the connections listed below:
+
+- Application servers connect to [PgBouncer default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#pgbouncer)
+- PgBouncer connects to the primary database servers [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
+- Repmgr connects to the database servers [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
+- Postgres secondaries connect to the primary database servers [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql)
+- Consul servers and agents connect to each others [Consul default ports](https://docs.gitlab.com/omnibus/package-information/defaults.html#consul)
+
+#### Required information
+
+Before proceeding with configuration, you will need to collect all the necessary
+information.
+
+##### Network information
+
+PostgreSQL does not listen on any network interface by default. It needs to know
+which IP address to listen on in order to be accessible to other services.
+Similarly, PostgreSQL access is controlled based on the network source.
+
+This is why you will need:
+
+> IP address of each nodes network interface
+> - This can be set to `0.0.0.0` to listen on all interfaces. It cannot
+> be set to the loopack address `127.0.0.1`
+>
+> Network Address
+> - This can be in subnet (i.e. `192.168.0.0/255.255.255.0`) or CIDR (i.e.
+> `192.168.0.0/24`) form.
+
+##### User information
+
+Various services require different configuration to secure
+the communication as well as information required for running the service.
+Bellow you will find details on each service and the minimum required
+information you need to provide.
+
+##### Consul information
+
+When using default setup, minimum configuration requires:
+
+- `CONSUL_USERNAME`. Defaults to `gitlab-consul`
+- `CONSUL_DATABASE_PASSWORD`. Password for the database user.
+- `CONSUL_PASSWORD_HASH`. This is a hash generated out of consul username/password pair.
+ Can be generated with:
+
+ ```sh
+ sudo gitlab-ctl pg-password-md5 CONSUL_USERNAME
+ ```
+
+- `CONSUL_SERVER_NODES`. The IP addresses or DNS records of the Consul server nodes.
+
+Few notes on the service itself:
+
+- The service runs under a system account, by default `gitlab-consul`.
+ - If you are using a different username, you will have to specify it. We
+will refer to it with `CONSUL_USERNAME`,
+- There will be a database user created with read only access to the repmgr
+database
+- Passwords will be stored in the following locations:
+ - `/etc/gitlab/gitlab.rb`: hashed
+ - `/var/opt/gitlab/pgbouncer/pg_auth`: hashed
+ - `/var/opt/gitlab/gitlab-consul/.pgpass`: plaintext
+
+##### PostgreSQL information
+
+When configuring PostgreSQL, we will set `max_wal_senders` to one more than
+the number of database nodes in the cluster.
+This is used to prevent replication from using up all of the
+available database connections.
+
+> Note:
+> - In this document we are assuming 3 database nodes, which makes this configuration:
+
+```
+postgresql['max_wal_senders'] = 4
+```
+
+As previously mentioned, you'll have to prepare the network subnets that will
+be allowed to authenticate with the database.
+You'll also need to supply the IP addresses or DNS records of Consul
+server nodes.
+
+We will need the following password information for the application's database user:
+
+- `POSTGRESQL_USERNAME`. Defaults to `gitlab`
+- `POSTGRESQL_USER_PASSWORD`. The password for the database user
+- `POSTGRESQL_PASSWORD_HASH`. This is a hash generated out of the username/password pair.
+ Can be generated with:
+
+ ```sh
+ sudo gitlab-ctl pg-password-md5 POSTGRESQL_USERNAME
+ ```
+
+##### Pgbouncer information
+
+When using default setup, minimum configuration requires:
+
+- `PGBOUNCER_USERNAME`. Defaults to `pgbouncer`
+- `PGBOUNCER_PASSWORD`. This is a password for pgbouncer service.
+- `PGBOUNCER_PASSWORD_HASH`. This is a hash generated out of pgbouncer username/password pair.
+ Can be generated with:
+
+ ```sh
+ sudo gitlab-ctl pg-password-md5 PGBOUNCER_USERNAME
+ ```
+
+- `PGBOUNCER_NODE`, is the IP address or a FQDN of the node running Pgbouncer.
+
+Few notes on the service itself:
+
+- The service runs as the same system account as the database
+ - In the package, this is by default `gitlab-psql`
+- If you use a non-default user account for Pgbouncer service (by default `pgbouncer`), you will have to specify this username. We will refer to this requirement with `PGBOUNCER_USERNAME`.
+- The service will have a regular database user account generated for it
+ - This defaults to `repmgr`
+- Passwords will be stored in the following locations:
+ - `/etc/gitlab/gitlab.rb`: hashed, and in plain text
+ - `/var/opt/gitlab/pgbouncer/pg_auth`: hashed
+
+##### Repmgr information
+
+When using default setup, you will only have to prepare the network subnets that will
+be allowed to authenticate with the service.
+
+Few notes on the service itself:
+
+- The service runs under the same system account as the database
+ - In the package, this is by default `gitlab-psql`
+- The service will have a superuser database user account generated for it
+ - This defaults to `gitlab_repmgr`
+
+#### Installing Omnibus GitLab
+
+First, make sure to [download/install](https://about.gitlab.com/installation)
+GitLab Omnibus **on each node**.
+
+Make sure you install the necessary dependencies from step 1,
+add GitLab package repository from step 2.
+When installing the GitLab package, do not supply `EXTERNAL_URL` value.
+
+#### Configuring the Consul nodes
+
+On each Consul node perform the following:
+
+1. Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information) before executing the next step.
+
+1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except Consul
+ roles ['consul_role']
+
+ # START user configuration
+ # Replace placeholders:
+ #
+ # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z
+ # with the addresses gathered for CONSUL_SERVER_NODES
+ consul['configuration'] = {
+ server: true,
+ retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z)
+ }
+
+ # Disable auto migrations
+ gitlab_rails['auto_migrate'] = false
+ #
+ # END user configuration
+ ```
+
+ > `consul_role` was introduced with GitLab 10.3
+
+1. [Reconfigure GitLab] for the changes to take effect.
+
+##### Consul Checkpoint
+
+Before moving on, make sure Consul is configured correctly. Run the following
+command to verify all server nodes are communicating:
+
+```
+/opt/gitlab/embedded/bin/consul members
+```
+
+The output should be similar to:
+
+```
+Node Address Status Type Build Protocol DC
+CONSUL_NODE_ONE XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_consul
+CONSUL_NODE_TWO XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_consul
+CONSUL_NODE_THREE XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_consul
+```
+
+If any of the nodes isn't `alive` or if any of the three nodes are missing,
+check the [Troubleshooting section](#troubleshooting) before proceeding.
+
+#### Configuring the Database nodes
+
+1. Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information), [`PGBOUNCER_PASSWORD_HASH`](#pgbouncer-information), [`POSTGRESQL_PASSWORD_HASH`](#postgresql-information), the [number of db nodes](#postgresql-information), and the [network address](#network-information) before executing the next step.
+
+1. On the master database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except PostgreSQL and Repmgr and Consul
+ roles ['postgres_role']
# PostgreSQL configuration
- gitlab_rails['db_password'] = 'DB password'
- postgresql['md5_auth_cidr_addresses'] = ['0.0.0.0/0']
postgresql['listen_address'] = '0.0.0.0'
+ postgresql['hot_standby'] = 'on'
+ postgresql['wal_level'] = 'replica'
+ postgresql['shared_preload_libraries'] = 'repmgr_funcs'
# Disable automatic database migrations
gitlab_rails['auto_migrate'] = false
+
+ # Configure the consul agent
+ consul['services'] = %w(postgresql)
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ #
+ # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value
+ postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH'
+ # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value
+ postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH'
+ # Replace X with value of number of db nodes + 1
+ postgresql['max_wal_senders'] = X
+
+ # Replace XXX.XXX.XXX.XXX/YY with Network Address
+ postgresql['trust_auth_cidr_addresses'] = %w(XXX.XXX.XXX.XXX/YY)
+ repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 XXX.XXX.XXX.XXX/YY)
+
+ # Replace placeholders:
+ #
+ # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z
+ # with the addresses gathered for CONSUL_SERVER_NODES
+ consul['configuration'] = {
+ retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z)
+ }
+ #
+ # END user configuration
```
-1. Run `sudo gitlab-ctl reconfigure` to install and configure PostgreSQL.
+ > `postgres_role` was introduced with GitLab 10.3
+
+1. On secondary nodes, add all the configuration specified above for primary node
+ to `/etc/gitlab/gitlab.rb`. In addition, append the following configuration
+ to inform gitlab-ctl that they are standby nodes initially and it need not
+ attempt to register them as primary node
+ ```
+ # HA setting to specify if a node should attempt to be master on initialization
+ repmgr['master_on_initialization'] = false
+ ```
- > **Note**: This `reconfigure` step will result in some errors.
- That's OK - don't be alarmed.
+1. [Reconfigure GitLab] for te changes to take effect.
+
+> Please note:
+> - If you want your database to listen on a specific interface, change the config:
+> `postgresql['listen_address'] = '0.0.0.0'`
+> - If your Pgbouncer service runs under a different user account,
+> you also need to specify: `postgresql['pgbouncer_user'] = PGBOUNCER_USERNAME` in
+> your configuration
+
+##### Database nodes post-configuration
+
+###### Primary node
+
+Select one node as a primary node.
1. Open a database prompt:
+ ```sh
+ gitlab-psql -d gitlabhq_production
```
- su - gitlab-psql
- /bin/bash
- psql -h /var/opt/gitlab/postgresql -d template1
- # Output:
+1. Enable the `pg_trgm` extension:
- psql (9.2.15)
- Type "help" for help.
+ ```sh
+ CREATE EXTENSION pg_trgm;
+ ```
+
+1. Exit the database prompt by typing `\q` and Enter.
+
+1. Verify the cluster is initialized with one node:
- template1=#
+ ```sh
+ gitlab-ctl repmgr cluster show
```
-1. Run the following command at the database prompt and you will be asked to
- enter the new password for the PostgreSQL superuser.
+ The output should be similar to the following:
```
- \password
+ Role | Name | Upstream | Connection String
+ ----------+----------|----------|----------------------------------------
+ * master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
+ ```
- # Output:
+1. Note down the hostname/ip in the connection string: `host=HOSTNAME`. We will
+ refer to the hostname in the next section as `MASTER_NODE_NAME`. If the value
+ is not an IP address, it will need to be a resolvable name (via DNS or
+ `/etc/hosts`)
- Enter new password:
- Enter it again:
+
+###### Secondary nodes
+
+1. Set up the repmgr standby:
+
+ ```sh
+ gitlab-ctl repmgr standby setup MASTER_NODE_NAME
```
-1. Similarly, set the password for the `gitlab` database user. Use the same
- password that you specified in the `/etc/gitlab/gitlab.rb` file for
- `gitlab_rails['db_password']`.
+ Do note that this will remove the existing data on the node. The command
+ has a wait time.
+ The output should be similar to the following:
+
+ ```console
+ # gitlab-ctl repmgr standby setup MASTER_NODE_NAME
+ Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data
+ If this is not what you want, hit Ctrl-C now to exit
+ To skip waiting, rerun with the -w option
+ Sleeping for 30 seconds
+ Stopping the database
+ Removing the data
+ Cloning the data
+ Starting the database
+ Registering the node with the cluster
+ ok: run: repmgrd: (pid 19068) 0s
```
- \password gitlab
- # Output:
+1. Verify the node now appears in the cluster:
- Enter new password:
- Enter it again:
+ ```sh
+ gitlab-ctl repmgr cluster show
```
-1. Exit from editing `template1` prompt by typing `\q` and Enter.
-1. Enable the `pg_trgm` extension within the `gitlabhq_production` database:
-
+
+ The output should be similar to the following:
+
+ ```
+ Role | Name | Upstream | Connection String
+ ----------+---------|-----------|------------------------------------------------
+ * master | MASTER | | host=MASTER_NODE_NAME user=gitlab_repmgr dbname=gitlab_repmgr
+ standby | STANDBY | MASTER | host=STANDBY_HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr
+ ```
+
+Repeat the above steps on all secondary nodes.
+
+##### Database checkpoint
+
+Before moving on, make sure the databases are configured correctly. Run the
+following command on the **primary** node to verify that replication is working
+properly:
+
+```
+gitlab-ctl repmgr cluster show
+```
+
+The output should be similar to:
+
+```
+Role | Name | Upstream | Connection String
+----------+--------------|--------------|--------------------------------------------------------------------
+* master | MASTER | | host=MASTER port=5432 user=gitlab_repmgr dbname=gitlab_repmgr
+ standby | STANDBY | MASTER | host=STANDBY port=5432 user=gitlab_repmgr dbname=gitlab_repmgr
+```
+
+If the 'Role' column for any node says "FAILED", check the
+[Troubleshooting section](#troubleshooting) before proceeding.
+
+Also, check that the check master command works successfully on each node:
+
+```
+su - gitlab-consul
+gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node'
+```
+
+This command relies on exit codes to tell Consul whether a particular node is a master
+or secondary. The most important thing here is that this command does not produce errors.
+If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions.
+Check the [Troubleshooting section](#troubleshooting) before proceeding.
+
+#### Configuring the Pgbouncer node
+
+1. Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information), [`CONSUL_PASSWORD_HASH`](#consul-information), and [`PGBOUNCER_PASSWORD_HASH`](#pgbouncer-information) before executing the next step.
+
+1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
+
+ ```ruby
+ # Disable all components except Pgbouncer and Consul agent
+ roles ['pgbouncer_role']
+
+ # Configure Pgbouncer
+ pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
+
+ # Configure Consul agent
+ consul['watchers'] = %w(postgresql)
+
+ # START user configuration
+ # Please set the real values as explained in Required Information section
+ # Replace CONSUL_PASSWORD_HASH with with a generated md5 value
+ # Replace PGBOUNCER_PASSWORD_HASH with with a generated md5 value
+ pgbouncer['users'] = {
+ 'gitlab-consul': {
+ password: 'CONSUL_PASSWORD_HASH'
+ },
+ 'pgbouncer': {
+ password: 'PGBOUNCER_PASSWORD_HASH'
+ }
+ }
+ # Replace placeholders:
+ #
+ # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z
+ # with the addresses gathered for CONSUL_SERVER_NODES
+ consul['configuration'] = {
+ retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z)
+ }
+ #
+ # END user configuration
```
+
+ > `pgbouncer_role` was introduced with GitLab 10.3
+
+1. [Reconfigure GitLab] for the changes to take effect.
+
+1. Create a `.pgpass` file so Consule is able to
+ reload pgbouncer. Enter the `PGBOUNCER_PASSWORD` twice when asked:
+
+ ```sh
+ gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer --user pgbouncer --hostuser gitlab-consul
+ ```
+
+##### PGBouncer Checkpoint
+
+1. Ensure the node is talking to the current master:
+
+ ```sh
+ gitlab-ctl pgb-console # You will be prompted for PGBOUNCER_PASSWORD
+ ```
+
+ If there is an error `psql: ERROR: Auth failed` after typing in the
+ password, ensure you previously generated the MD5 password hashes with the correct
+ format. The correct format is to concatenate the password and the username:
+ `PASSWORDUSERNAME`. For example, `Sup3rS3cr3tpgbouncer` would be the text
+ needed to generate an MD5 password hash for the `pgbouncer` user.
+
+1. Once the console prompt is available, run the following queries:
+
+ ```sh
+ show databases ; show clients ;
+ ```
+
+ The output should be similar to the following:
+
+ ```
+ name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections
+ ---------------------+-------------+------+---------------------+------------+-----------+--------------+-----------+-----------------+---------------------
+ gitlabhq_production | MASTER_HOST | 5432 | gitlabhq_production | | 20 | 0 | | 0 | 0
+ pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0 | statement | 0 | 0
+ (2 rows)
+
+ type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link | remote_pid | tls
+ ------+-----------+---------------------+---------+----------------+-------+------------+------------+---------------------+---------------------+-----------+------+------------+-----
+ C | pgbouncer | pgbouncer | active | 127.0.0.1 | 56846 | 127.0.0.1 | 6432 | 2017-08-21 18:09:59 | 2017-08-21 18:10:48 | 0x22b3880 | | 0 |
+ (2 rows)
+ ```
+
+#### Configuring the Application nodes
+
+These will be the nodes running the `gitlab-rails` service. You may have other
+attributes set, but the following need to be set.
+
+1. Edit `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ # Disable PostgreSQL on the application node
+ postgresql['enable'] = false
+
+ gitlab_rails['db_host'] = 'PGBOUNCER_NODE'
+ gitlab_rails['db_port'] = 6432
+ gitlab_rails['db_password'] = 'POSTGRESQL_USER_PASSWORD'
+ gitlab_rails['auto_migrate'] = false
+ ```
+
+1. [Reconfigure GitLab] for the changes to take effect.
+
+##### Application node post-configuration
+
+Ensure that all migrations ran:
+
+```sh
+gitlab-rake gitlab:db:configure
+```
+
+> **Note**: If you encounter a `rake aborted!` error stating that PGBouncer is failing to connect to
+PostgreSQL it may be that your PGBouncer node's IP address is missing from
+PostgreSQL's `trust_auth_cidr_addresses` in `gitlab.rb` on your database nodes. See
+[PGBouncer error `ERROR: pgbouncer cannot connect to server`](#pgbouncer-error-error-pgbouncer-cannot-connect-to-server)
+in the Troubleshooting section before proceeding.
+
+##### Ensure GitLab is running
+
+At this point, your GitLab instance should be up and running. Verify you are
+able to login, and create issues and merge requests. If you have troubles check
+the [Troubleshooting section](#troubleshooting).
+
+#### Example configuration
+
+Here we'll show you some fully expanded example configurations.
+
+##### Example recommended setup
+
+This example uses 3 consul servers, 3 postgresql servers, and 1 application node.
+
+We start with all servers on the same 10.6.0.0/16 private network range, they
+can connect to each freely other on those addresses.
+
+Here is a list and description of each machine and the assigned IP:
+
+* `10.6.0.11`: Consul 1
+* `10.6.0.12`: Consul 2
+* `10.6.0.13`: Consul 3
+* `10.6.0.21`: PostgreSQL master
+* `10.6.0.22`: PostgreSQL secondary
+* `10.6.0.23`: PostgreSQL secondary
+* `10.6.0.31`: GitLab application
+
+All passwords are set to `toomanysecrets`, please do not use this password or derived hashes.
+
+The external_url for GitLab is `http://gitlab.example.com`
+
+Please note that after the initial configuration, if a failover occurs, the PostgresSQL master will change to one of the available secondaries until it is failed back.
+
+##### Example recommended setup for Consul servers
+
+On each server edit `/etc/gitlab/gitlab.rb`:
+
+```ruby
+# Disable all components except Consul
+roles ['consul_role']
+
+consul['configuration'] = {
+ server: true,
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
+}
+```
+
+[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
+
+##### Example recommended setup for PostgreSQL servers
+
+###### Primary node
+
+On primary node edit `/etc/gitlab/gitlab.rb`:
+
+```ruby
+# Disable all components except PostgreSQL and Repmgr and Consul
+roles ['postgres_role']
+
+# PostgreSQL configuration
+postgresql['listen_address'] = '0.0.0.0'
+postgresql['hot_standby'] = 'on'
+postgresql['wal_level'] = 'replica'
+postgresql['shared_preload_libraries'] = 'repmgr_funcs'
+
+# Disable automatic database migrations
+gitlab_rails['auto_migrate'] = false
+
+# Configure the consul agent
+consul['services'] = %w(postgresql)
+
+postgresql['pgbouncer_user_password'] = '771a8625958a529132abe6f1a4acb19c'
+postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f'
+postgresql['max_wal_senders'] = 4
+
+postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16)
+repmgr['trust_auth_cidr_addresses'] = %w(10.6.0.0/16)
+
+consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
+}
+```
+
+[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
+
+###### Secondary nodes
+
+On secondary nodes, edit `/etc/gitlab/gitlab.rb` and add all the configuration
+added to primary node, noted above. In addition, append the following
+configuration
+
+```
+# HA setting to specify if a node should attempt to be master on initialization
+repmgr['master_on_initialization'] = false
+```
+
+[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
+
+##### Example recommended setup for application server
+
+On the server edit `/etc/gitlab/gitlab.rb`:
+
+```ruby
+external_url 'http://gitlab.example.com'
+
+gitlab_rails['db_host'] = '127.0.0.1'
+gitlab_rails['db_port'] = 6432
+gitlab_rails['db_password'] = 'toomanysecrets'
+gitlab_rails['auto_migrate'] = false
+
+postgresql['enable'] = false
+pgbouncer['enable'] = true
+consul['enable'] = true
+
+# Configure Pgbouncer
+pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
+
+# Configure Consul agent
+consul['watchers'] = %w(postgresql)
+
+pgbouncer['users'] = {
+ 'gitlab-consul': {
+ password: '5e0e3263571e3704ad655076301d6ebe'
+ },
+ 'pgbouncer': {
+ password: '771a8625958a529132abe6f1a4acb19c'
+ }
+}
+
+consul['configuration'] = {
+ retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13)
+}
+```
+
+[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
+
+##### Example recommended setup manual steps
+
+After deploying the configuration follow these steps:
+
+1. On `10.6.0.21`, our primary database
+
+ Enable the `pg_trgm` extension
+
+ ```sh
gitlab-psql -d gitlabhq_production
-
+ ```
+
+ ```
CREATE EXTENSION pg_trgm;
+ ```
+
+1. On `10.6.0.22`, our first standby database
- # Output:
+ Make this node a standby of the primary
- CREATE EXTENSION
+ ```sh
+ gitlab-ctl repmgr standby setup 10.6.0.21
```
-1. Exit the database prompt by typing `\q` and Enter.
-1. Exit the `gitlab-psql` user by running `exit` twice.
-1. Run `sudo gitlab-ctl reconfigure` a final time.
-1. Configure the GitLab application servers with the appropriate details.
- This step is covered in [Configuring GitLab for HA](gitlab.md).
+
+1. On `10.6.0.23`, our second standby database
+
+ Make this node a standby of the primary
+
+ ```sh
+ gitlab-ctl repmgr standby setup 10.6.0.21
+ ```
+
+1. On `10.6.0.31`, our application server
+
+ Set gitlab-consul's pgbouncer password to `toomanysecrets`
+
+ ```sh
+ gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer --user pgbouncer --hostuser gitlab-consul
+ ```
+
+ Run database migrations
+
+ ```sh
+ gitlab-rake gitlab:db:configure
+ ```
+
+#### Example minimal setup
+
+This example uses 3 postgresql servers, and 1 application node.
+
+It differs from the [recommended setup](#example-recommended-setup) by moving the consul servers into the same servers we use for PostgreSQL.
+The trade-off is between reducing server counts, against the increased operational complexity of needing to deal with postgres [failover](#failover-procedure) and [restore](#restore-procedure) procedures in addition to [consul outage recovery](consul.md#outage-recovery) on the same set of machines.
+
+In this example we start with all servers on the same 10.6.0.0/16 private network range, they can connect to each freely other on those addresses.
+
+Here is a list and description of each machine and the assigned IP:
+
+* `10.6.0.21`: PostgreSQL master
+* `10.6.0.22`: PostgreSQL secondary
+* `10.6.0.23`: PostgreSQL secondary
+* `10.6.0.31`: GitLab application
+
+All passwords are set to `toomanysecrets`, please do not use this password or derived hashes.
+
+The external_url for GitLab is `http://gitlab.example.com`
+
+Please note that after the initial configuration, if a failover occurs, the PostgresSQL master will change to one of the available secondaries until it is failed back.
+
+##### Example minimal configuration for database servers
+
+##### Primary node
+On primary database node edit `/etc/gitlab/gitlab.rb`:
+
+```ruby
+# Disable all components except PostgreSQL, Repmgr, and Consul
+roles ['postgres_role']
+
+# PostgreSQL configuration
+postgresql['listen_address'] = '0.0.0.0'
+postgresql['hot_standby'] = 'on'
+postgresql['wal_level'] = 'replica'
+postgresql['shared_preload_libraries'] = 'repmgr_funcs'
+
+# Disable automatic database migrations
+gitlab_rails['auto_migrate'] = false
+
+# Configure the consul agent
+consul['services'] = %w(postgresql)
+
+postgresql['pgbouncer_user_password'] = '771a8625958a529132abe6f1a4acb19c'
+postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f'
+postgresql['max_wal_senders'] = 4
+
+postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16)
+repmgr['trust_auth_cidr_addresses'] = %w(10.6.0.0/16)
+
+consul['configuration'] = {
+ server: true,
+ retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23)
+}
+```
+
+[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
+
+###### Secondary nodes
+
+On secondary nodes, edit `/etc/gitlab/gitlab.rb` and add all the information added
+to primary node, noted above. In addition, append the following configuration
+
+```
+# HA setting to specify if a node should attempt to be master on initialization
+repmgr['master_on_initialization'] = false
+```
+
+##### Example minimal configuration for application server
+
+On the server edit `/etc/gitlab/gitlab.rb`:
+
+```ruby
+external_url 'http://gitlab.example.com'
+
+gitlab_rails['db_host'] = '127.0.0.1'
+gitlab_rails['db_port'] = 6432
+gitlab_rails['db_password'] = 'toomanysecrets'
+gitlab_rails['auto_migrate'] = false
+
+postgresql['enable'] = false
+pgbouncer['enable'] = true
+consul['enable'] = true
+
+# Configure Pgbouncer
+pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul)
+
+# Configure Consul agent
+consul['watchers'] = %w(postgresql)
+
+pgbouncer['users'] = {
+ 'gitlab-consul': {
+ password: '5e0e3263571e3704ad655076301d6ebe'
+ },
+ 'pgbouncer': {
+ password: '771a8625958a529132abe6f1a4acb19c'
+ }
+}
+
+consul['configuration'] = {
+ retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23)
+}
+```
+
+[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect.
+
+##### Example minimal setup manual steps
+
+The manual steps for this configuration are the same as for the [example recommended setup](#example-recommended-setup-manual-steps).
+
+#### Failover procedure
+
+By default, if the master database fails, `repmgrd` should promote one of the
+standby nodes to master automatically, and consul will update pgbouncer with
+the new master.
+
+If you need to failover manually, you have two options:
+
+**Shutdown the current master database**
+
+Run:
+
+```sh
+gitlab-ctl stop postgresql
+```
+
+The automated failover process will see this and failover to one of the
+standby nodes.
+
+**Or perform a manual failover**
+
+1. Ensure the old master node is not still active.
+1. Login to the server that should become the new master and run:
+
+ ```sh
+ gitlab-ctl repmgr standby promote
+ ```
+
+1. If there are any other standby servers in the cluster, have them follow
+ the new master server:
+
+ ```sh
+ gitlab-ctl repmgr standby follow NEW_MASTER
+ ```
+
+#### Restore procedure
+
+If a node fails, it can be removed from the cluster, or added back as a standby
+after it has been restored to service.
+
+- If you want to remove the node from the cluster, on any other node in the
+ cluster, run:
+
+ ```sh
+ gitlab-ctl repmgr standby unregister --node=X
+ ```
+
+ where X is the value of node in `repmgr.conf` on the old server.
+
+ To find this, you can use:
+
+ ```sh
+ awk -F = '$1 == "node" { print $2 }' /var/opt/gitlab/postgresql/repmgr.conf
+ ```
+
+ It will output something like:
+
+ ```
+ 959789412
+ ```
+
+ Then you will use this id to unregister the node:
+
+ ```sh
+ gitlab-ctl repmgr standby unregister --node=959789412
+ ```
+
+- To add the node as a standby server:
+
+ ```sh
+ gitlab-ctl repmgr standby follow NEW_MASTER
+ gitlab-ctl restart repmgrd
+ ```
+
+ CAUTION: **Warning:** When the server is brought back online, and before
+ you switch it to a standby node, repmgr will report that there are two masters.
+ If there are any clients that are still attempting to write to the old master,
+ this will cause a split, and the old master will need to be resynced from
+ scratch by performing a `gitlab-ctl repmgr standby setup NEW_MASTER`.
+
+#### Alternate configurations
+
+##### Database authorization
+
+By default, we give any host on the database network the permission to perform
+repmgr operations using PostgreSQL's `trust` method. If you do not want this
+level of trust, there are alternatives.
+
+You can trust only the specific nodes that will be database clusters, or you
+can require md5 authentication.
+
+##### Trust specific addresses
+
+If you know the IP address, or FQDN of all database and pgbouncer nodes in the
+cluster, you can trust only those nodes.
+
+In `/etc/gitlab/gitlab.rb` on all of the database nodes, set
+`repmgr['trust_auth_cidr_addresses']` to an array of strings containing all of
+the addresses.
+
+If setting to a node's FQDN, they must have a corresponding PTR record in DNS.
+If setting to a node's IP address, specify it as `XXX.XXX.XXX.XXX/32`.
+
+For example:
+
+```ruby
+repmgr['trust_auth_cidr_addresses'] = %w(192.168.1.44/32 db2.example.com)
+```
+
+
+##### MD5 Authentication
+
+If you are running on an untrusted network, repmgr can use md5 authentication
+with a [.pgpass file](https://www.postgresql.org/docs/9.6/static/libpq-pgpass.html)
+to authenticate.
+
+You can specify by IP address, FQDN, or by subnet, using the same format as in
+the previous section:
+
+1. On the current master node, create a password for the `gitlab` and
+ `gitlab_repmgr` user:
+
+ ```sh
+ gitlab-psql -d template1
+ template1=# \password gitlab_repmgr
+ Enter password: ****
+ Confirm password: ****
+ template1=# \password gitlab
+ ```
+
+1. On each database node:
+
+ 1. Edit `/etc/gitlab/gitlab.rb`:
+ 1. Ensure `repmgr['trust_auth_cidr_addresses']` is **not** set
+ 1. Set `postgresql['md5_auth_cidr_addresses']` to the desired value
+ 1. Set `postgresql['sql_replication_user'] = 'gitlab_repmgr'`
+ 1. Reconfigure with `gitlab-ctl reconfigure`
+ 1. Restart postgresql with `gitlab-ctl restart postgresql`
+
+ 1. Create a `.pgpass` file. Enter the `gitlab_repmgr` password twice to
+ when asked:
+
+ ```sh
+ gitlab-ctl write-pgpass --user gitlab_repmgr --hostuser gitlab-psql --database '*'
+ ```
+
+1. On each pgbouncer node, edit `/etc/gitlab/gitlab.rb`:
+ 1. Ensure `gitlab_rails['db_password']` is set to the plaintext password for
+ the `gitlab` database user
+ 1. [Reconfigure GitLab] for the changes to take effect
+
+## Troubleshooting
+
+#### Consul and PostgreSQL changes not taking effect.
+
+Due to the potential impacts, `gitlab-ctl reconfigure` only reloads Consul and PostgreSQL, it will not restart the services. However, not all changes can be activated by reloading.
+
+To restart either service, run `gitlab-ctl restart SERVICE`
+
+For PostgreSQL, it is usually safe to restart the master node by default. Automatic failover defaults to a 1 minute timeout. Provided the database returns before then, nothing else needs to be done. To be safe, you can stop `repmgrd` on the standby nodes first with `gitlab-ctl stop repmgrd`, then start afterwards with `gitlab-ctl start repmgrd`.
+
+On the consul server nodes, it is important to restart the consul service in a controlled fashion. Read our [consul documentation](consul.md#restarting-the-server-cluster) for instructions on how to restart the service.
+
+#### `gitlab-ctl repmgr-check-master` command produces errors
+
+If this command displays errors about database permissions it is likely that something failed during
+install, resulting in the `gitlab-consul` database user getting incorrect permissions. Follow these
+steps to fix the problem:
+
+1. On the master database node, connect to the database prompt - `gitlab-psql -d template1`
+1. Delete the `gitlab-consul` user - `DROP USER "gitlab-consul";`
+1. Exit the database prompt - `\q`
+1. [Reconfigure GitLab] and the user will be re-added with the proper permissions.
+1. Change to the `gitlab-consul` user - `su - gitlab-consul`
+1. Try the check command again - `gitlab-ctl repmgr-check-master`.
+
+Now there should not be errors. If errors still occur then there is another problem.
+
+#### PGBouncer error `ERROR: pgbouncer cannot connect to server`
+
+You may get this error when running `gitlab-rake gitlab:db:configure` or you
+may see the error in the PGBouncer log file.
+
+```
+PG::ConnectionBad: ERROR: pgbouncer cannot connect to server
+```
+
+The problem may be that your PGBouncer node's IP address is not included in the
+`trust_auth_cidr_addresses` setting in `/etc/gitlab/gitlab.rb` on the database nodes.
+
+You can confirm that this is the issue by checking the PostgreSQL log on the master
+database node. If you see the following error then `trust_auth_cidr_addresses`
+is the problem.
+
+```
+2018-03-29_13:59:12.11776 FATAL: no pg_hba.conf entry for host "123.123.123.123", user "pgbouncer", database "gitlabhq_production", SSL off
+```
+
+To fix the problem, add the IP address to `/etc/gitlab/gitlab.rb`.
+
+```
+postgresql['trust_auth_cidr_addresses'] = %w(123.123.123.123/32 <other_cidrs>)
+```
+
+[Reconfigure GitLab] for the changes to take effect.
+
+#### Issues with other components
+
+If you're running into an issue with a component not outlined here, be sure to check the troubleshooting section of their specific documentation page.
+
+- [Consul](consul.md#troubleshooting)
+- [PostgreSQL](http://docs.gitlab.com/omnibus/settings/database.html#troubleshooting)
+- [GitLab application](gitlab.md#troubleshooting)
+
+## Configure using Omnibus
+
+**Note**: We recommend that you follow the instructions here for a full [PostgreSQL cluster](#high-availability-with-gitlab-omnibus-premium-only).
+If you are reading this section due to an old bookmark, you can find that old documentation [in the repository](https://gitlab.com/gitlab-org/gitlab-ce/blob/v10.1.4/doc/administration/high_availability/database.md#configure-using-omnibus).
---
@@ -114,3 +1165,6 @@ Read more on high-availability configuration:
1. [Configure NFS](nfs.md)
1. [Configure the GitLab application servers](gitlab.md)
1. [Configure the load balancers](load_balancer.md)
+1. [Manage the bundled Consul cluster](consul.md)
+
+[reconfigure GitLab]: ../restart_gitlab.md#omnibus-gitlab-reconfigure
diff --git a/doc/administration/high_availability/gitaly.md b/doc/administration/high_availability/gitaly.md
new file mode 100644
index 00000000000..d44744f2af8
--- /dev/null
+++ b/doc/administration/high_availability/gitaly.md
@@ -0,0 +1,90 @@
+# Configuring Gitaly for Scaled and High Availability
+
+Gitaly does not yet support full high availability. However, Gitaly is quite
+stable and is in use on GitLab.com. Scaled and highly available GitLab environments
+should consider using Gitaly on a separate node.
+
+See the [Gitaly HA Epic](https://gitlab.com/groups/gitlab-org/-/epics/289) to
+track plans and progress toward high availability support.
+
+This document is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples)
+environments and [High Availability Architecture](./README.md#high-availability-architecture-examples).
+
+## Running Gitaly on its own server
+
+Starting with GitLab 11.4, Gitaly is a replacement for NFS except
+when the [Elastic Search indexer](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer)
+is used.
+
+NOTE: **Note:** While Gitaly can be used as a replacement for NFS, we do not recommend using EFS as it may impact GitLab's performance. Please review the [relevant documentation](nfs.md#avoid-using-awss-elastic-file-system-efs) for more details.
+
+NOTE: **Note:** Gitaly network traffic is unencrypted so we recommend a firewall to
+restrict access to your Gitaly server.
+
+The steps below are the minimum necessary to configure a Gitaly server with
+Omnibus:
+
+1. SSH into the Gitaly server.
+1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab
+ package you want using **steps 1 and 2** from the GitLab downloads page.
+ - Do not complete any other steps on the download page.
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
+
+ Gitaly must trigger some callbacks to GitLab via GitLab Shell. As a result,
+ the GitLab Shell secret must be the same between the other GitLab servers and
+ the Gitaly server. The easiest way to accomplish this is to copy `/etc/gitlab/gitlab-secrets.json`
+ from an existing GitLab server to the Gitaly server. Without this shared secret,
+ Git operations in GitLab will result in an API error.
+
+ > **NOTE:** In most or all cases the storage paths below end in `repositories` which is
+ different than `path` in `git_data_dirs` of Omnibus installations. Check the
+ directory layout on your Gitaly server to be sure.
+
+ ```ruby
+ # Enable Gitaly
+ gitaly['enable'] = true
+
+ ## Disable all other services
+ sidekiq['enable'] = false
+ gitlab_workhorse['enable'] = false
+ unicorn['enable'] = false
+ postgresql['enable'] = false
+ nginx['enable'] = false
+ prometheus['enable'] = false
+ alertmanager['enable'] = false
+ pgbouncer_exporter['enable'] = false
+ redis_exporter['enable'] = false
+ gitlab_monitor['enable'] = false
+
+ # Prevent database connections during 'gitlab-ctl reconfigure'
+ gitlab_rails['rake_cache_clear'] = false
+ gitlab_rails['auto_migrate'] = false
+
+ # Configure the gitlab-shell API callback URL. Without this, `git push` will
+ # fail. This can be your 'front door' GitLab URL or an internal load
+ # balancer.
+ gitlab_rails['internal_api_url'] = 'https://gitlab.example.com'
+
+ # Make Gitaly accept connections on all network interfaces. You must use
+ # firewalls to restrict access to this address/port.
+ gitaly['listen_addr'] = "0.0.0.0:8075"
+ gitaly['auth_token'] = 'abc123secret'
+
+ gitaly['storage'] = [
+ { 'name' => 'default', 'path' => '/mnt/gitlab/default/repositories' },
+ { 'name' => 'storage1', 'path' => '/mnt/gitlab/storage1/repositories' },
+ ]
+
+ # To use tls for gitaly you need to add
+ gitaly['tls_listen_addr'] = "0.0.0.0:9999"
+ gitaly['certificate_path'] = "path/to/cert.pem"
+ gitaly['key_path'] = "path/to/key.pem"
+ ```
+
+Again, reconfigure (Omnibus) or restart (source).
+
+Continue configuration of other components by going back to:
+
+- [Scaled Architectures](./README.md#scalable-architecture-examples)
+- [High Availability Architectures](./README.md#high-availability-architecture-examples)
diff --git a/doc/administration/high_availability/gitlab.md b/doc/administration/high_availability/gitlab.md
index d95c3acec54..888426ece5c 100644
--- a/doc/administration/high_availability/gitlab.md
+++ b/doc/administration/high_availability/gitlab.md
@@ -1,8 +1,4 @@
-# Configuring GitLab for HA
-
-Assuming you have already configured a [database](database.md), [Redis](redis.md), and [NFS](nfs.md), you can
-configure the GitLab application server(s) now. Complete the steps below
-for each GitLab application server in your environment.
+# Configuring GitLab Scaling and High Availability
> **Note:** There is some additional configuration near the bottom for
additional GitLab application servers. It's important to read and understand
diff --git a/doc/administration/high_availability/nfs_host_client_setup.md b/doc/administration/high_availability/nfs_host_client_setup.md
new file mode 100644
index 00000000000..a8bc101dee6
--- /dev/null
+++ b/doc/administration/high_availability/nfs_host_client_setup.md
@@ -0,0 +1,135 @@
+# Configuring NFS for GitLab HA
+
+Setting up NFS for a GitLab HA setup allows all applications nodes in a cluster
+to share the same files and maintain data consistency. Application nodes in an HA
+setup act as clients while the NFS server plays host.
+
+> Note: The instructions provided in this documentation allow for setting a quick
+proof of concept but will leave NFS as potential single point of failure and
+therefore not recommended for use in production. Explore options such as [Pacemaker
+and Corosync](http://clusterlabs.org/) for highly available NFS in production.
+
+Below are instructions for setting up an application node(client) in an HA cluster
+to read from and write to a central NFS server(host).
+
+NOTE: **Note:**
+Using EFS may negatively impact performance. Please review the [relevant documentation](nfs.md#avoid-using-awss-elastic-file-system-efs) for additional details.
+
+## NFS Server Setup
+
+> Follow the instructions below to set up and configure your NFS server.
+
+### Step 1 - Install NFS Server on Host
+
+Installing the nfs-kernel-server package allows you to share directories with the clients running the GitLab application.
+
+```sh
+apt-get update
+apt-get install nfs-kernel-server
+```
+
+### Step 2 - Export Host's Home Directory to Client
+
+In this setup we will share the home directory on the host with the client. Edit the exports file as below to share the host's home directory with the client. If you have multiple clients running GitLab you must enter the client IP addresses in line in the `/etc/exports` file.
+
+```text
+#/etc/exports for one client
+/home <client-ip-address>(rw,sync,no_root_squash,no_subtree_check)
+
+#/etc/exports for three clients
+/home <client-ip-address>(rw,sync,no_root_squash,no_subtree_check) <client-2-ip-address>(rw,sync,no_root_squash,no_subtree_check) <client-3-ip-address>(rw,sync,no_root_squash,no_subtree_check)
+```
+
+Restart the NFS server after making changes to the `exports` file for the changes
+to take effect.
+
+```sh
+systemctl restart nfs-kernel-server
+```
+
+NOTE: **Note:**
+You may need to update your server's firewall. See the [firewall section](#nfs-in-a-firewalled-environment) at the end of this guide.
+
+## Client/ GitLab application node Setup
+
+> Follow the instructions below to connect any GitLab rails application node running
+inside your HA environment to the NFS server configured above.
+
+### Step 1 - Install NFS Common on Client
+
+The nfs-common provides NFS functionality without installing server components which
+we don't need running on the application nodes.
+
+```sh
+apt-get update
+apt-get install nfs-common
+```
+
+### Step 2 - Create Mount Points on Client
+
+Create a directroy on the client that we can mount the shared directory from the host.
+Please note that if your mount point directory contains any files they will be hidden
+once the remote shares are mounted. An empty/new directory on the client is recommended
+for this purpose.
+
+```sh
+mkdir -p /nfs/home
+```
+
+Confirm that the mount point works by mounting it on the client and checking that
+it is mounted with the command below:
+
+```sh
+mount <host_ip_address>:/home
+df -h
+```
+
+### Step 3 - Set up Automatic Mounts on Boot
+
+Edit `/etc/fstab` on client as below to mount the remote shares automatically at boot.
+Note that GitLab requires advisory file locking, which is only supported natively in
+NFS version 4. NFSv3 also supports locking as long as Linux Kernel 2.6.5+ is used.
+We recommend using version 4 and do not specifically test NFSv3.
+
+```text
+#/etc/fstab
+165.227.159.85:/home /nfs/home nfs4 defaults,soft,rsize=1048576,wsize=1048576,noatime,nofail,lookupcache=positive 0 2
+```
+
+Reboot the client and confirm that the mount point is mounted automatically.
+
+### Step 4 - Set up GitLab to Use NFS mounts
+
+When using the default Omnibus configuration you will need to share 5 data locations
+between all GitLab cluster nodes. No other locations should be shared. Changing the
+default file locations in `gitlab.rb` on the client allows you to have one main mount
+point and have all the required locations as subdirectories to use the NFS mount for
+git-data.
+
+```text
+git_data_dirs({"default" => {"path" => "/nfs/home/var/opt/gitlab-data/git-data"}})
+gitlab_rails['uploads_directory'] = '/nfs/home/var/opt/gitlab-data/uploads'
+gitlab_rails['shared_path'] = '/nfs/home/var/opt/gitlab-data/shared'
+gitlab_ci['builds_directory'] = '/nfs/home/var/opt/gitlab-data/builds'
+```
+
+Save the changes in `gitlab.rb` and run `gitlab-ctl reconfigure`.
+
+## NFS in a Firewalled Environment
+
+If the traffic between your NFS server and NFS client(s) is subject to port filtering
+by a firewall, then you will need to reconfigure that firewall to allow NFS communication.
+
+[This guide from TDLP](http://tldp.org/HOWTO/NFS-HOWTO/security.html#FIREWALLS)
+covers the basics of using NFS in a firewalled environment. Additionally, we encourage you to
+search for and review the specific documentation for your OS/distro and your firewall software.
+
+Example for Ubuntu:
+
+Check that NFS traffic from the client is allowed by the firewall on the host by running
+the command: `sudo ufw status`. If it's being blocked, then you can allow traffic from a specific
+client with the command below.
+
+```sh
+sudo ufw allow from <client-ip-address> to any port nfs
+```
diff --git a/doc/administration/high_availability/pg_ha_architecture.png b/doc/administration/high_availability/pg_ha_architecture.png
new file mode 100644
index 00000000000..ef870f652ae
--- /dev/null
+++ b/doc/administration/high_availability/pg_ha_architecture.png
Binary files differ
diff --git a/doc/administration/high_availability/pgbouncer.md b/doc/administration/high_availability/pgbouncer.md
new file mode 100644
index 00000000000..762179cf756
--- /dev/null
+++ b/doc/administration/high_availability/pgbouncer.md
@@ -0,0 +1,132 @@
+# Working with the bundle Pgbouncer service
+
+## Overview
+
+As part of its High Availability stack, GitLab Premium includes a bundled version of [Pgbouncer](https://pgbouncer.github.io/) that can be managed through `/etc/gitlab/gitlab.rb`.
+
+In a High Availability setup, Pgbouncer is used to seamlessly migrate database connections between servers in a failover scenario.
+
+Additionally, it can be used in a non-HA setup to pool connections, speeding up response time while reducing resource usage.
+
+It is recommended to run pgbouncer alongside the `gitlab-rails` service, or on its own dedicated node in a cluster.
+
+## Operations
+
+### Running Pgbouncer as part of an HA GitLab installation
+
+See our [HA documentation for PostgreSQL](database.md) for information on running pgbouncer as part of a HA setup
+
+### Running Pgbouncer as part of a non-HA GitLab installation
+
+1. Generate PGBOUNCER_USER_PASSWORD_HASH with the command `gitlab-ctl pg-password-md5 pgbouncer`
+
+1. Generate SQL_USER_PASSWORD_HASH with the command `gitlab-ctl pg-password-md5 gitlab`. We'll also need to enter the plaintext SQL_USER_PASSWORD later
+
+1. On your database node, ensure the following is set in your `/etc/gitlab/gitlab.rb`
+
+ ```ruby
+ postgresql['pgbouncer_user_password'] = 'PGBOUNCER_USER_PASSWORD_HASH'
+ postgresql['sql_user_password'] = 'SQL_USER_PASSWORD_HASH'
+ postgresql['listen_address'] = 'XX.XX.XX.Y' # Where XX.XX.XX.Y is the ip address on the node postgresql should listen on
+ postgresql['md5_auth_cidr_addresses'] = %w(AA.AA.AA.B/32) # Where AA.AA.AA.B is the IP address of the pgbouncer node
+ ```
+
+1. Run `gitlab-ctl reconfigure`
+
+ **Note:** If the database was already running, it will need to be restarted after reconfigure by running `gitlab-ctl restart postgresql`.
+
+1. On the node you are running pgbouncer on, make sure the following is set in `/etc/gitlab/gitlab.rb`
+
+ ```ruby
+ pgbouncer['enable'] = true
+ pgbouncer['databases'] = {
+ gitlabhq_production: {
+ host: 'DATABASE_HOST',
+ user: 'pgbouncer',
+ password: 'PGBOUNCER_USER_PASSWORD_HASH'
+ }
+ }
+ ```
+
+1. Run `gitlab-ctl reconfigure`
+
+1. On the node running unicorn, make sure the following is set in `/etc/gitlab/gitlab.rb`
+
+ ```ruby
+ gitlab_rails['db_host'] = 'PGBOUNCER_HOST'
+ gitlab_rails['db_port'] = '6432'
+ gitlab_rails['db_password'] = 'SQL_USER_PASSWORD'
+ ```
+
+1. Run `gitlab-ctl reconfigure`
+
+1. At this point, your instance should connect to the database through pgbouncer. If you are having issues, see the [Troubleshooting](#troubleshooting) section
+
+### Interacting with pgbouncer
+
+#### Administrative console
+
+As part of omnibus-gitlab, we provide a command `gitlab-ctl pgb-console` to automatically connect to the pgbouncer administrative console. Please see the [pgbouncer documentation](https://pgbouncer.github.io/usage.html#admin-console) for detailed instructions on how to interact with the console.
+
+To start a session, run
+
+```shell
+# gitlab-ctl pgb-console
+Password for user pgbouncer:
+psql (9.6.8, server 1.7.2/bouncer)
+Type "help" for help.
+
+pgbouncer=#
+```
+
+The password you will be prompted for is the PGBOUNCER_USER_PASSWORD
+
+To get some basic information about the instance, run
+```shell
+pgbouncer=# show databases; show clients; show servers;
+ name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections
+---------------------+-----------+------+---------------------+------------+-----------+--------------+-----------+-----------------+---------------------
+ gitlabhq_production | 127.0.0.1 | 5432 | gitlabhq_production | | 100 | 5 | | 0 | 1
+ pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0 | statement | 0 | 0
+(2 rows)
+
+ type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link
+| remote_pid | tls
+------+-----------+---------------------+--------+-----------+-------+------------+------------+---------------------+---------------------+-----------+------
++------------+-----
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44590 | 127.0.0.1 | 6432 | 2018-04-24 22:13:10 | 2018-04-24 22:17:10 | 0x12444c0 |
+| 0 |
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44592 | 127.0.0.1 | 6432 | 2018-04-24 22:13:10 | 2018-04-24 22:17:10 | 0x12447c0 |
+| 0 |
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44594 | 127.0.0.1 | 6432 | 2018-04-24 22:13:10 | 2018-04-24 22:17:10 | 0x1244940 |
+| 0 |
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44706 | 127.0.0.1 | 6432 | 2018-04-24 22:14:22 | 2018-04-24 22:16:31 | 0x1244ac0 |
+| 0 |
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44708 | 127.0.0.1 | 6432 | 2018-04-24 22:14:22 | 2018-04-24 22:15:15 | 0x1244c40 |
+| 0 |
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44794 | 127.0.0.1 | 6432 | 2018-04-24 22:15:15 | 2018-04-24 22:15:15 | 0x1244dc0 |
+| 0 |
+ C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44798 | 127.0.0.1 | 6432 | 2018-04-24 22:15:15 | 2018-04-24 22:16:31 | 0x1244f40 |
+| 0 |
+ C | pgbouncer | pgbouncer | active | 127.0.0.1 | 44660 | 127.0.0.1 | 6432 | 2018-04-24 22:13:51 | 2018-04-24 22:17:12 | 0x1244640 |
+| 0 |
+(8 rows)
+
+ type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link | rem
+ote_pid | tls
+------+--------+---------------------+-------+-----------+------+------------+------------+---------------------+---------------------+-----------+------+----
+--------+-----
+ S | gitlab | gitlabhq_production | idle | 127.0.0.1 | 5432 | 127.0.0.1 | 35646 | 2018-04-24 22:15:15 | 2018-04-24 22:17:10 | 0x124dca0 | |
+ 19980 |
+(1 row)
+```
+
+## Troubleshooting
+
+In case you are experiencing any issues connecting through pgbouncer, the first place to check is always the logs:
+
+```shell
+# gitlab-ctl tail pgbouncer
+```
+
+Additionally, you can check the output from `show databases` in the [Administrative console](#administrative-console). In the output, you would expect to see values in the `host` field for the `gitlabhq_production` database. Additionally, `current_connections` should be greater than 1.
diff --git a/doc/administration/high_availability/redis.md b/doc/administration/high_availability/redis.md
index 3daebc4d84b..46ad3ecd9bb 100644
--- a/doc/administration/high_availability/redis.md
+++ b/doc/administration/high_availability/redis.md
@@ -1,6 +1,103 @@
-# Configuring Redis for GitLab HA
+# Configuring Redis for Scaling and High Availability
-> Experimental Redis Sentinel support was [Introduced][ce-1877] in GitLab 8.11.
+## Provide your own Redis instance **[CORE ONLY]**
+
+The following are the requirements for providing your own Redis instance:
+
+- Redis version 2.8 or higher. Version 3.2 or higher is recommend as this is
+ what ships with the GitLab Omnibus package.
+- Standalone Redis or Redis high availability with Sentinel are supported. Redis
+ Cluster is not supported.
+- Managed Redis from cloud providers such as AWS Elasticache will work. If these
+ services support high availability, be sure it is not the Redis Cluster type.
+
+Note the Redis node's IP address or hostname, port, and password (if required).
+These will be necessary when configuring the GitLab application servers later.
+
+## Redis in a Scaled Environment
+
+This section is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples)
+environments including [Basic Scaling](./README.md#basic-scaling) and
+[Full Scaling](./README.md#full-scaling).
+
+### Provide your own Redis instance **[CORE ONLY]**
+
+If you want to use your own deployed Redis instance(s),
+see [Provide your own Redis instance](#provide-your-own-redis-instance-core-only)
+for more details. However, you can use the GitLab Omnibus package to easily
+deploy the bundled Redis.
+
+### Standalone Redis using GitLab Omnibus **[CORE ONLY]**
+
+The GitLab Omnibus package can be used to configure a standalone Redis server.
+In this configuration Redis is not highly available, and represents a single
+point of failure. However, in a scaled environment the objective is to allow
+the environment to handle more users or to increase throughput. Redis itself
+is generally stable and can handle many requests so it is an acceptable
+trade off to have only a single instance. See [Scaling and High Availability](./README.md)
+for an overview of GitLab scaling and high availability options.
+
+The steps below are the minimum necessary to configure a Redis server with
+Omnibus:
+
+1. SSH into the Redis server.
+1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab
+ package you want using **steps 1 and 2** from the GitLab downloads page.
+ - Do not complete any other steps on the download page.
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
+
+ ```ruby
+ ## Enable Redis
+ redis['enable'] = true
+
+ ## Disable all other services
+ sidekiq['enable'] = false
+ gitlab_workhorse['enable'] = false
+ unicorn['enable'] = false
+ postgresql['enable'] = false
+ nginx['enable'] = false
+ prometheus['enable'] = false
+ alertmanager['enable'] = false
+ pgbouncer_exporter['enable'] = false
+ gitlab_monitor['enable'] = false
+ gitaly['enable'] = false
+
+ redis['bind'] = '0.0.0.0'
+ redis['port'] = '6379'
+ redis['password'] = 'SECRET_PASSWORD_HERE'
+
+ gitlab_rails['auto_migrate'] = false
+ ```
+
+1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+1. Note the Redis node's IP address or hostname, port, and
+ Redis password. These will be necessary when configuring the GitLab
+ application servers later.
+
+Advanced configuration options are supported and can be added if
+needed.
+
+Continue configuration of other components by going
+[back to Scaled Architectures](./README.md#scalable-architecture-examples)
+
+## Redis with High Availability
+
+This section is relevant for [High Availability Architecture](./README.md#high-availability-architecture-examples)
+environments including [Horizontal](./README.md#horizontal),
+[Hybrid](./README.md#hybrid), and
+[Fully Distributed](./README.md#fully-distributed).
+
+### Provide your own Redis instance **[CORE ONLY]**
+
+If you want to use your own deployed Redis instance(s),
+see [Provide your own Redis instance](#provide-your-own-redis-instance-core-only)
+for more details. However, you can use the GitLab Omnibus package to easily
+deploy the bundled Redis.
+
+### High Availability with GitLab Omnibus **[PREMIUM ONLY]**
+
+> Experimental Redis Sentinel support was [introduced in GitLab 8.11][ce-1877].
Starting with 8.14, Redis Sentinel is no longer experimental.
If you've used it with versions `< 8.14` before, please check the updated
documentation here.
@@ -53,8 +150,6 @@ failure.
Make sure that you read this document once as a whole before configuring the
components below.
-### High Availability with Sentinel
-
> **Notes:**
>
> - Starting with GitLab `8.11`, you can configure a list of Redis Sentinel
@@ -270,10 +365,9 @@ The prerequisites for a HA Redis setup are the following:
1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
```ruby
- # Enable the master role and disable all other services in the machine
- # (you can still enable Sentinel).
- redis_master_role['enable'] = true
-
+ # Specify server role as 'redis_master_role'
+ roles ['redis_master_role']
+
# IP address pointing to a local IP that the other machines can reach to.
# You can also set bind to '0.0.0.0' which listen in all interfaces.
# If you really need to bind to an external accessible IP, make
@@ -287,6 +381,7 @@ The prerequisites for a HA Redis setup are the following:
# Set up password authentication for Redis (use the same password in all nodes).
redis['password'] = 'redis-password-goes-here'
```
+
1. Only the primary GitLab application server should handle migrations. To
prevent database migrations from running on upgrade, add the following
@@ -298,6 +393,10 @@ The prerequisites for a HA Redis setup are the following:
1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
+> Note: You can specify multiple roles like sentinel and redis as:
+> roles ['redis_sentinel_role', 'redis_master_role']. Read more about high
+> availability roles at https://docs.gitlab.com/omnibus/roles/
+
### Step 2. Configuring the slave Redis instances
1. SSH into the **slave** Redis server.
@@ -310,11 +409,9 @@ The prerequisites for a HA Redis setup are the following:
1. Edit `/etc/gitlab/gitlab.rb` and add the contents:
```ruby
- # Enable the slave role and disable all other services in the machine
- # (you can still enable Sentinel). This will also set automatically
- # `redis['master'] = false`.
- redis_slave_role['enable'] = true
-
+ # Specify server role as 'redis_slave_role'
+ roles ['redis_slave_role']
+
# IP address pointing to a local IP that the other machines can reach to.
# You can also set bind to '0.0.0.0' which listen in all interfaces.
# If you really need to bind to an external accessible IP, make
@@ -336,17 +433,19 @@ The prerequisites for a HA Redis setup are the following:
#redis['master_port'] = 6379
```
-1. To prevent database migrations from running on upgrade, run:
+1. To prevent reconfigure from running automatically on upgrade, run:
```
sudo touch /etc/gitlab/skip-auto-reconfigure
```
- Only the primary GitLab application server should handle migrations.
-
1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect.
1. Go through the steps again for all the other slave nodes.
+> Note: You can specify multiple roles like sentinel and redis as:
+> roles ['redis_sentinel_role', 'redis_slave_role']. Read more about high
+> availability roles at https://docs.gitlab.com/omnibus/roles/
+
---
These values don't have to be changed again in `/etc/gitlab/gitlab.rb` after
@@ -400,13 +499,13 @@ multiple machines with the Sentinel daemon.
be duplicate below):
```ruby
- redis_sentinel_role['enable'] = true
+ roles ['redis_sentinel_role']
# Must be the same in every sentinel node
redis['master_name'] = 'gitlab-redis'
# The same password for Redis authentication you set up for the master node.
- redis['password'] = 'redis-password-goes-here'
+ redis['master_password'] = 'redis-password-goes-here'
# The IP of the master Redis node.
redis['master_ip'] = '10.0.0.1'
@@ -573,8 +672,7 @@ or a failover promotes a different **Master** node.
In `/etc/gitlab/gitlab.rb`:
```ruby
-redis_master_role['enable'] = true
-redis_sentinel_role['enable'] = true
+roles ['redis_sentinel_role', 'redis_master_role']
redis['bind'] = '10.0.0.1'
redis['port'] = 6379
redis['password'] = 'redis-password-goes-here'
@@ -596,8 +694,7 @@ sentinel['quorum'] = 2
In `/etc/gitlab/gitlab.rb`:
```ruby
-redis_slave_role['enable'] = true
-redis_sentinel_role['enable'] = true
+roles ['redis_sentinel_role', 'redis_slave_role']
redis['bind'] = '10.0.0.2'
redis['port'] = 6379
redis['password'] = 'redis-password-goes-here'
@@ -619,8 +716,7 @@ sentinel['quorum'] = 2
In `/etc/gitlab/gitlab.rb`:
```ruby
-redis_slave_role['enable'] = true
-redis_sentinel_role['enable'] = true
+roles ['redis_sentinel_role', 'redis_slave_role']
redis['bind'] = '10.0.0.3'
redis['port'] = 6379
redis['password'] = 'redis-password-goes-here'
@@ -643,7 +739,7 @@ In `/etc/gitlab/gitlab.rb`:
```ruby
redis['master_name'] = 'gitlab-redis'
-redis['password'] = 'redis-password-goes-here'
+redis['master_password'] = 'redis-password-goes-here'
gitlab_rails['redis_sentinels'] = [
{'host' => '10.0.0.1', 'port' => 26379},
{'host' => '10.0.0.2', 'port' => 26379},
@@ -764,15 +860,11 @@ Before proceeding with the troubleshooting below, check your firewall rules:
### Troubleshooting Redis replication
You can check if everything is correct by connecting to each server using
-`redis-cli` application, and sending the `INFO` command.
+`redis-cli` application, and sending the `info replication` command as below.
-If authentication was correctly defined, it should fail with:
-`NOAUTH Authentication required` error. Try to authenticate with the
-previous defined password with `AUTH redis-password-goes-here` and
-try the `INFO` command again.
-
-Look for the `# Replication` section where you should see some important
-information like the `role` of the server.
+```
+/opt/gitlab/embedded/bin/redis-cli -a <redis-password> info replication
+```
When connected to a `master` redis, you will see the number of connected
`slaves`, and a list of each with connection details:
@@ -842,7 +934,7 @@ To make sure your configuration is correct:
1. Run in the console:
```ruby
- redis = Redis.new(Gitlab::Redis.params)
+ redis = Redis.new(Gitlab::Redis::SharedState.params)
redis.info
```