summaryrefslogtreecommitdiff
path: root/doc/administration/high_availability/consul.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/high_availability/consul.md')
-rw-r--r--doc/administration/high_availability/consul.md17
1 files changed, 17 insertions, 0 deletions
diff --git a/doc/administration/high_availability/consul.md b/doc/administration/high_availability/consul.md
index b01419200cc..392b9b76c31 100644
--- a/doc/administration/high_availability/consul.md
+++ b/doc/administration/high_availability/consul.md
@@ -102,6 +102,23 @@ To be safe, we recommend you only restart one server agent at a time to ensure t
For larger clusters, it is possible to restart multiple agents at a time. See the [Consul consensus document](https://www.consul.io/docs/internals/consensus.html#deployment-table) for how many failures it can tolerate. This will be the number of simulateneous restarts it can sustain.
+## Upgrades for bundled Consul
+
+Nodes running GitLab-bundled Consul should be:
+
+- Members of a healthy cluster prior to upgrading the GitLab Omnibus package.
+- Upgraded one node at a time.
+
+NOTE: **NOTE:**
+Running `curl http://127.0.0.1:8500/v1/health/state/critical` from any Consul node will identify existing health issues in the cluster. The command will return an empty array if the cluster is healthy.
+
+Consul clusters communicate using the raft protocol. If the current leader goes offline, there needs to be a leader election. A leader node must exist to facilitate synchronization across the cluster. If too many nodes go offline at the same time, the cluster will lose quorum and not elect a leader due to [broken consensus](https://www.consul.io/docs/internals/consensus.html).
+
+Consult the [troubleshooting section](#troubleshooting) if the cluster is not able to recover after the upgrade. The [outage recovery](#outage-recovery) may be of particular interest.
+
+NOTE: **NOTE:**
+GitLab only uses Consul to store transient data that is easily regenerated. If the bundled Consul was not used by any process other than GitLab itself, then [rebuilding the cluster from scratch](#recreate-from-scratch) is fine.
+
## Troubleshooting
### Consul server agents unable to communicate