Diffstat (limited to 'doc/administration/high_availability/consul.md')
-rw-r--r--  doc/administration/high_availability/consul.md | 197
1 file changed, 2 insertions(+), 195 deletions(-)
diff --git a/doc/administration/high_availability/consul.md b/doc/administration/high_availability/consul.md
index 978ba08c4fa..362d6ee8ba7 100644
--- a/doc/administration/high_availability/consul.md
+++ b/doc/administration/high_availability/consul.md
@@ -1,198 +1,5 @@
---
-type: reference
+redirect_to: ../consul.md
---
-# Working with the bundled Consul service **(PREMIUM ONLY)**
-
-As part of its High Availability stack, GitLab Premium includes a bundled version of [Consul](https://www.consul.io/), a service networking solution, that can be managed through `/etc/gitlab/gitlab.rb`. Within the [GitLab Architecture](../../development/architecture.md), Consul is supported for configuring:
-
-1. [Monitoring in Scaled and Highly Available environments](monitoring_node.md)
-1. [PostgreSQL High Availability with Omnibus](../postgresql/replication_and_failover.md)
-
-A Consul cluster consists of multiple server agents, as well as client agents that run on other nodes which need to talk to the Consul cluster.
-
-## Prerequisites
-
-First, make sure to [download/install](https://about.gitlab.com/install/)
-Omnibus GitLab **on each node**.
-
-Choose an installation method, then make sure you complete the following steps:
-
-1. Install and configure the necessary dependencies.
-1. Add the GitLab package repository and install the package.
-
-When installing the GitLab package, do not supply the `EXTERNAL_URL` value.
-
-## Configuring the Consul nodes
-
-On each Consul node, perform the following:
-
-1. Before executing the next step, make sure you collect [`CONSUL_SERVER_NODES`](../postgresql/replication_and_failover.md#consul-information), the IP addresses or DNS records of the Consul server nodes.
-
-1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section:
-
- ```ruby
- # Disable all components except Consul
- roles ['consul_role']
-
- # START user configuration
- # Replace placeholders:
- #
- # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z
- # with the addresses gathered for CONSUL_SERVER_NODES
- consul['configuration'] = {
- server: true,
- retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z)
- }
-
- # Disable auto migrations
- gitlab_rails['auto_migrate'] = false
- #
- # END user configuration
- ```
-
- > `consul_role` was introduced with GitLab 10.3
-
-1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes
- to take effect.
-
-### Consul checkpoint
-
-Before moving on, make sure Consul is configured correctly. Run the following
-command to verify all server nodes are communicating:
-
-```shell
-/opt/gitlab/embedded/bin/consul members
-```
-
-The output should be similar to:
-
-```plaintext
-Node               Address               Status  Type    Build  Protocol  DC
-CONSUL_NODE_ONE    XXX.XXX.XXX.YYY:8301  alive   server  0.9.2  2         gitlab_consul
-CONSUL_NODE_TWO    XXX.XXX.XXX.YYY:8301  alive   server  0.9.2  2         gitlab_consul
-CONSUL_NODE_THREE  XXX.XXX.XXX.YYY:8301  alive   server  0.9.2  2         gitlab_consul
-```
-
-If any node isn't `alive`, or if any of the three nodes is missing,
-check the [Troubleshooting section](#troubleshooting) before proceeding.
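The check above is easy to automate. A minimal sketch, assuming the tabular `consul members` output format shown above (the `check_members` helper is hypothetical, not part of Omnibus GitLab):

```shell
# Hypothetical helper: read `consul members` output on stdin and fail
# if any member's Status column (field 3) is not "alive".
check_members() {
  awk 'NR > 1 && $3 != "alive" { print $1 " is " $3; bad = 1 }
       END { exit bad }'
}

# In production you would pipe the real output:
#   /opt/gitlab/embedded/bin/consul members | check_members
# Demonstration against sample output:
printf '%s\n' \
  'Node      Address        Status  Type    Build  Protocol  DC' \
  'consul-a  10.0.0.1:8301  alive   server  0.9.2  2         gitlab_consul' \
  'consul-b  10.0.0.2:8301  failed  server  0.9.2  2         gitlab_consul' \
  | check_members || echo "unhealthy members found"
```

A non-zero exit status signals at least one unhealthy member, which makes the helper usable as a gate in a pre-flight script.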
-
-## Operations
-
-### Checking cluster membership
-
-To see which nodes are part of the cluster, run the following on any member of the cluster:
-
-```shell
-$ /opt/gitlab/embedded/bin/consul members
-Node      Address         Status  Type    Build  Protocol  DC
-consul-a  XX.XX.X.Y:8301  alive   server  0.9.0  2         gitlab_consul
-consul-b  XX.XX.X.Y:8301  alive   server  0.9.0  2         gitlab_consul
-consul-c  XX.XX.X.Y:8301  alive   server  0.9.0  2         gitlab_consul
-db-a      XX.XX.X.Y:8301  alive   client  0.9.0  2         gitlab_consul
-db-b      XX.XX.X.Y:8301  alive   client  0.9.0  2         gitlab_consul
-```
-
-Ideally, all nodes will have a `Status` of `alive`.
-
-### Restarting the server cluster
-
-NOTE: **Note:**
-This section only applies to server agents. It is safe to restart client agents whenever needed.
-
-If it is necessary to restart the server cluster, it is important to do this in a controlled fashion in order to maintain quorum. If quorum is lost, you will need to follow the Consul [outage recovery](#outage-recovery) process to recover the cluster.
-
-To be safe, we recommend you only restart one server agent at a time to ensure the cluster remains intact.
-
-For larger clusters, it is possible to restart multiple agents at a time. See the [Consul consensus document](https://www.consul.io/docs/internals/consensus.html#deployment-table) for how many failures it can tolerate. This will be the number of simultaneous restarts it can sustain.
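The deployment table boils down to simple Raft arithmetic: a cluster of `n` servers needs a quorum of `floor(n/2) + 1`, and tolerates `n - quorum` simultaneous failures or restarts. A quick sketch:

```shell
# Raft quorum arithmetic: quorum = floor(n/2) + 1, tolerance = n - quorum.
# A 3-node cluster tolerates 1 simultaneous restart; a 5-node cluster, 2.
for n in 1 3 5 7; do
  quorum=$(( n / 2 + 1 ))
  tolerance=$(( n - quorum ))
  echo "servers=$n quorum=$quorum tolerance=$tolerance"
done
```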
-
-## Upgrades for bundled Consul
-
-Nodes running GitLab-bundled Consul should be:
-
-- Members of a healthy cluster prior to upgrading the Omnibus GitLab package.
-- Upgraded one node at a time.
-
-NOTE: **Note:**
-Running `curl http://127.0.0.1:8500/v1/health/state/critical` from any Consul node will identify existing health issues in the cluster. The command will return an empty array if the cluster is healthy.
-
-Consul clusters communicate using the raft protocol. If the current leader goes offline, there needs to be a leader election. A leader node must exist to facilitate synchronization across the cluster. If too many nodes go offline at the same time, the cluster will lose quorum and not elect a leader due to [broken consensus](https://www.consul.io/docs/internals/consensus.html).
-
-Consult the [troubleshooting section](#troubleshooting) if the cluster is not able to recover after the upgrade. The [outage recovery](#outage-recovery) may be of particular interest.
-
-NOTE: **Note:**
-GitLab only uses Consul to store transient data that is easily regenerated. If the bundled Consul was not used by any process other than GitLab itself, then [rebuilding the cluster from scratch](#recreate-from-scratch) is fine.
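The health check described in the note above can be wrapped in a small pre-upgrade script. A sketch under the stated assumption that the endpoint returns `[]` when no checks are critical (the `is_cluster_healthy` helper is hypothetical):

```shell
# Hypothetical wrapper around the critical-health check described above.
# The endpoint returns a JSON array; "[]" means no checks are critical.
is_cluster_healthy() {
  [ "$1" = "[]" ]
}

# In production, fetch the response from any Consul node first:
#   response=$(curl -s http://127.0.0.1:8500/v1/health/state/critical)
response='[]'   # sample response for demonstration
if is_cluster_healthy "$response"; then
  echo "no critical checks - safe to upgrade this node"
else
  echo "critical checks present - investigate before upgrading"
fi
```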
-
-## Troubleshooting
-
-### Consul server agents unable to communicate
-
-By default, the server agents will attempt to [bind](https://www.consul.io/docs/agent/options.html#_bind) to '0.0.0.0', but they will advertise the first private IP address on the node for other agents to communicate with them. If the other nodes cannot communicate with a node on this address, then the cluster will have a failed status.
-
-You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue:
-
-```plaintext
-2017-09-25_19:53:39.90821 2017/09/25 19:53:39 [WARN] raft: no known peers, aborting election
-2017-09-25_19:53:41.74356 2017/09/25 19:53:41 [ERR] agent: failed to sync remote state: No cluster leader
-```
-
-To fix this:
-
-1. Pick an address on each node that all of the other nodes can reach.
-1. Update your `/etc/gitlab/gitlab.rb`:
-
- ```ruby
- consul['configuration'] = {
- ...
- bind_addr: 'IP ADDRESS'
- }
- ```
-
-1. Run `gitlab-ctl reconfigure`
-
-If you still see the errors, you may have to [erase the Consul database and reinitialize](#recreate-from-scratch) on the affected node.
-
-### Consul agents do not start - Multiple private IPs
-
-If a node has multiple private IPs, the agent can be confused as to which of the private addresses to advertise, and will then immediately exit on start.
-
-You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue:
-
-```plaintext
-2017-11-09_17:41:45.52876 ==> Starting Consul agent...
-2017-11-09_17:41:45.53057 ==> Error creating agent: Failed to get advertise address: Multiple private IPs found. Please configure one.
-```
-
-To fix this:
-
-1. Pick an address on the node that all of the other nodes can reach.
-1. Update your `/etc/gitlab/gitlab.rb`:
-
- ```ruby
- consul['configuration'] = {
- ...
- bind_addr: 'IP ADDRESS'
- }
- ```
-
-1. Run `gitlab-ctl reconfigure`
-
-### Outage recovery
-
-If you have lost enough server agents in the cluster to break quorum, then the cluster is considered failed, and it will not function without manual intervention.
-
-#### Recreate from scratch
-
-By default, GitLab does not store anything in the Consul cluster that cannot be recreated. To erase the Consul database and reinitialize it:
-
-```shell
-gitlab-ctl stop consul
-rm -rf /var/opt/gitlab/consul/data
-gitlab-ctl start consul
-```
-
-After this, the cluster should start back up, and the server agents rejoin. Shortly after that, the client agents should rejoin as well.
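To confirm the cluster has actually re-formed, you can poll Consul's leader endpoint. A sketch under the assumption that `/v1/status/leader` returns a quoted `host:port` string once a leader is elected and `""` while there is none (the `has_leader` helper is hypothetical):

```shell
# Hypothetical helper: interpret the /v1/status/leader response.
# A quoted host:port string means a leader is elected; "" means none yet.
has_leader() {
  [ -n "$1" ] && [ "$1" != '""' ]
}

# In production: response=$(curl -s http://127.0.0.1:8500/v1/status/leader)
response='"10.0.0.1:8300"'   # sample response for demonstration
if has_leader "$response"; then
  echo "leader elected - cluster is back up"
else
  echo "no leader yet - keep waiting, or see the next section"
fi
```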
-
-#### Recover a failed cluster
-
-If you have taken advantage of Consul to store other data, and want to restore the failed cluster, please follow the [Consul guide](https://learn.hashicorp.com/consul/day-2-operations/outage) to recover a failed cluster.
+This document was moved to [another location](../consul.md).