summaryrefslogtreecommitdiff
path: root/doc/administration/geo
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/geo')
-rw-r--r--doc/administration/geo/disaster_recovery/bring_primary_back.md8
-rw-r--r--doc/administration/geo/disaster_recovery/index.md43
-rw-r--r--doc/administration/geo/disaster_recovery/planned_failover.md8
-rw-r--r--doc/administration/geo/disaster_recovery/promotion_runbook.md269
-rw-r--r--doc/administration/geo/index.md295
-rw-r--r--doc/administration/geo/replication/configuration.md2
-rw-r--r--doc/administration/geo/replication/database.md468
-rw-r--r--doc/administration/geo/replication/datatypes.md62
-rw-r--r--doc/administration/geo/replication/external_database.md233
-rw-r--r--doc/administration/geo/replication/faq.md2
-rw-r--r--doc/administration/geo/replication/geo_validation_tests.md12
-rw-r--r--doc/administration/geo/replication/index.md315
-rw-r--r--doc/administration/geo/replication/location_aware_git_url.md2
-rw-r--r--doc/administration/geo/replication/multiple_servers.md6
-rw-r--r--doc/administration/geo/replication/object_storage.md4
-rw-r--r--doc/administration/geo/replication/security_review.md4
-rw-r--r--doc/administration/geo/replication/troubleshooting.md47
-rw-r--r--doc/administration/geo/replication/updating_the_geo_nodes.md30
-rw-r--r--doc/administration/geo/replication/using_a_geo_server.md2
-rw-r--r--doc/administration/geo/replication/version_specific_updates.md6
-rw-r--r--doc/administration/geo/setup/database.md473
-rw-r--r--doc/administration/geo/setup/external_database.md234
-rw-r--r--doc/administration/geo/setup/index.md32
23 files changed, 1432 insertions, 1125 deletions
diff --git a/doc/administration/geo/disaster_recovery/bring_primary_back.md b/doc/administration/geo/disaster_recovery/bring_primary_back.md
index 3b7c7fd549c..83081e2cef6 100644
--- a/doc/administration/geo/disaster_recovery/bring_primary_back.md
+++ b/doc/administration/geo/disaster_recovery/bring_primary_back.md
@@ -21,7 +21,7 @@ If you have any doubts about the consistency of the data on this node, we recomm
Since the former **primary** node will be out of sync with the current **primary** node, the first step is to bring the former **primary** node up to date. Note, deletion of data stored on disk like
repositories and uploads will not be replayed when bringing the former **primary** node back
into sync, which may result in increased disk usage.
-Alternatively, you can [set up a new **secondary** GitLab instance](../replication/index.md#setup-instructions) to avoid this.
+Alternatively, you can [set up a new **secondary** GitLab instance](../setup/index.md) to avoid this.
To bring the former **primary** node up to date:
@@ -37,7 +37,7 @@ To bring the former **primary** node up to date:
you need to undo those steps now. For Debian/Ubuntu you just need to run
`sudo systemctl enable gitlab-runsvdir`. For CentOS 6, you need to install
the GitLab instance from scratch and set it up as a **secondary** node by
- following [Setup instructions](../replication/index.md#setup-instructions). In this case, you don't need to follow the next step.
+ following [Setup instructions](../setup/index.md). In this case, you don't need to follow the next step.
NOTE: **Note:**
If you [changed the DNS records](index.md#step-4-optional-updating-the-primary-domain-dns-record)
@@ -45,12 +45,12 @@ To bring the former **primary** node up to date:
all the writes to this node](planned_failover.md#prevent-updates-to-the-primary-node)
during this procedure.
-1. [Setup database replication](../replication/database.md). Note that in this
+1. [Setup database replication](../setup/database.md). Note that in this
case, **primary** node refers to the current **primary** node, and **secondary** node refers to the
former **primary** node.
If you have lost your original **primary** node, follow the
-[setup instructions](../replication/index.md#setup-instructions) to set up a new **secondary** node.
+[setup instructions](../setup/index.md) to set up a new **secondary** node.
## Promote the **secondary** node to **primary** node
diff --git a/doc/administration/geo/disaster_recovery/index.md b/doc/administration/geo/disaster_recovery/index.md
index 2d837ebb369..8862776ee1b 100644
--- a/doc/administration/geo/disaster_recovery/index.md
+++ b/doc/administration/geo/disaster_recovery/index.md
@@ -11,11 +11,13 @@ Geo replicates your database, your Git repositories, and few other assets.
We will support and replicate more data in the future, that will enable you to
failover with minimal effort, in a disaster situation.
-See [Geo current limitations](../replication/index.md#current-limitations) for more information.
+See [Geo current limitations](../index.md#current-limitations) for more information.
CAUTION: **Warning:**
Disaster recovery for multi-secondary configurations is in **Alpha**.
-For the latest updates, check the multi-secondary [Disaster Recovery epic](https://gitlab.com/groups/gitlab-org/-/epics/65).
+For the latest updates, check the [Disaster Recovery epic for complete maturity](https://gitlab.com/groups/gitlab-org/-/epics/590).
+Multi-secondary configurations require the complete re-synchronization and re-configuration of all non-promoted secondaries and
+will cause downtime.
## Promoting a **secondary** Geo node in single-secondary configurations
@@ -96,6 +98,10 @@ must disable the **primary** node.
Note the following when promoting a secondary:
+- If replication was paused on the secondary node, for example as a part of upgrading,
+ while you were running a version of GitLab lower than 13.4, you _must_
+ [enable the node via the database](../replication/troubleshooting.md#while-promoting-the-secondary-i-got-an-error-activerecordrecordinvalid)
+ before proceeding.
- A new **secondary** should not be added at this time. If you want to add a new
**secondary**, do this after you have completed the entire process of promoting
the **secondary** to the **primary**.
@@ -123,28 +129,23 @@ Note the following when promoting a secondary:
```
1. Promote the **secondary** node to the **primary** node.
-
- Before promoting a secondary node to primary, preflight checks should be run. They can be run separately or along with the promotion script.
-
+
To promote the secondary node to primary along with preflight checks:
```shell
gitlab-ctl promote-to-primary-node
```
- CAUTION: **Warning:**
- Skipping preflight checks will promote the secondary to a primary without any further confirmation!
-
- If you have already run the [preflight checks](planned_failover.md#preflight-checks) or don't want to run them, you can skip preflight checks with:
+ If you have already run the [preflight checks](planned_failover.md#preflight-checks) separately or don't want to run them, you can skip preflight checks with:
```shell
gitlab-ctl promote-to-primary-node --skip-preflight-check
```
- You can also run preflight checks separately:
+ You can also promote the secondary node to primary **without any further confirmation**, even when preflight checks fail:
```shell
- gitlab-ctl promotion-preflight-checks
+ gitlab-ctl promote-to-primary-node --force
```
1. Verify you can connect to the newly promoted **primary** node using the URL used
@@ -321,7 +322,7 @@ secondary domain, like changing Git remotes and API URLs.
Promoting a **secondary** node to **primary** node using the process above does not enable
Geo on the new **primary** node.
-To bring a new **secondary** node online, follow the [Geo setup instructions](../replication/index.md#setup-instructions).
+To bring a new **secondary** node online, follow the [Geo setup instructions](../index.md#setup-instructions).
### Step 6. (Optional) Removing the secondary's tracking database
@@ -374,7 +375,7 @@ and after that you also need two extra steps.
gitlab_rails['auto_migrate'] = false
```
- (For more details about these settings you can read [Configure the primary server](../replication/database.md#step-1-configure-the-primary-server))
+ (For more details about these settings you can read [Configure the primary server](../setup/database.md#step-1-configure-the-primary-server))
1. Save the file and reconfigure GitLab for the database listen changes and
the replication slot changes to be applied.
@@ -407,21 +408,9 @@ and after that you also need two extra steps.
### Step 2. Initiate the replication process
Now we need to make each **secondary** node listen to changes on the new **primary** node. To do that you need
-to [initiate the replication process](../replication/database.md#step-3-initiate-the-replication-process) again but this time
+to [initiate the replication process](../setup/database.md#step-3-initiate-the-replication-process) again but this time
for another **primary** node. All the old replication settings will be overwritten.
## Troubleshooting
-### I followed the disaster recovery instructions and now two-factor auth is broken
-
-The setup instructions for Geo prior to 10.5 failed to replicate the
-`otp_key_base` secret, which is used to encrypt the two-factor authentication
-secrets stored in the database. If it differs between **primary** and **secondary**
-nodes, users with two-factor authentication enabled won't be able to log in
-after a failover.
-
-If you still have access to the old **primary** node, you can follow the
-instructions in the
-[Upgrading to GitLab 10.5](../replication/version_specific_updates.md#updating-to-gitlab-105)
-section to resolve the error. Otherwise, the secret is lost and you'll need to
-[reset two-factor authentication for all users](../../../security/two_factor_authentication.md#disabling-2fa-for-everyone).
+This section was moved to [another location](../replication/troubleshooting.md#fixing-errors-during-a-failover-or-when-promoting-a-secondary-to-a-primary-node).
diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md
index a0cf263a762..9b9c386652c 100644
--- a/doc/administration/geo/disaster_recovery/planned_failover.md
+++ b/doc/administration/geo/disaster_recovery/planned_failover.md
@@ -27,7 +27,7 @@ have a high degree of confidence in being able to perform them accurately.
## Not all data is automatically replicated
-If you are using any GitLab features that Geo [doesn't support](../replication/index.md#current-limitations),
+If you are using any GitLab features that Geo [doesn't support](../index.md#current-limitations),
you must make separate provisions to ensure that the **secondary** node has an
up-to-date copy of any data associated with that feature. This may extend the
required scheduled maintenance period significantly.
@@ -51,12 +51,6 @@ Run this command to list out all preflight checks and automatically check if rep
gitlab-ctl promotion-preflight-checks
```
-You can run this command in `force` mode to promote to primary even if preflight checks fail:
-
-```shell
-sudo gitlab-ctl promotion-preflight-checks --force
-```
-
Each step is described in more detail below.
### Object storage
diff --git a/doc/administration/geo/disaster_recovery/promotion_runbook.md b/doc/administration/geo/disaster_recovery/promotion_runbook.md
new file mode 100644
index 00000000000..fb2353513df
--- /dev/null
+++ b/doc/administration/geo/disaster_recovery/promotion_runbook.md
@@ -0,0 +1,269 @@
+---
+stage: Enablement
+group: Geo
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+type: howto
+---
+
+CAUTION: **Caution:**
+This runbook is in **alpha**. For complete, production-ready documentation, see the
+[disaster recovery documentation](index.md).
+
+# Disaster Recovery (Geo) promotion runbooks **(PREMIUM ONLY)**
+
+## Geo planned failover runbook 1
+
+| Component | Configuration |
+| ----------- | --------------- |
+| PostgreSQL | Omnibus-managed |
+| Geo site | Single-node |
+| Secondaries | One |
+
+This runbook will guide you through a planned failover of a single-node Geo site
+with one secondary. The following general architecture is assumed:
+
+```mermaid
+graph TD
+ subgraph main[Geo deployment]
+ subgraph Primary[Primary site]
+ Node_1[(GitLab node)]
+ end
+ subgraph Secondary1[Secondary site]
+ Node_2[(GitLab node)]
+ end
+ end
+```
+
+This guide will result in the following:
+
+1. An offline primary.
+1. A promoted secondary that is now the new primary.
+
+What is not covered:
+
+1. Re-adding the old **primary** as a secondary.
+1. Adding a new secondary.
+
+### Preparation
+
+NOTE: **Note:**
+Before following any of those steps, make sure you have `root` access to the
+**secondary** to promote it, since there isn't provided an automated way to
+promote a Geo replica and perform a failover.
+
+On the **secondary** node, navigate to the **Admin Area > Geo** dashboard to
+review its status. Replicated objects (shown in green) should be close to 100%,
+and there should be no failures (shown in red). If a large proportion of
+objects aren't yet replicated (shown in gray), consider giving the node more
+time to complete.
+
+![Replication status](img/replication-status.png)
+
+If any objects are failing to replicate, this should be investigated before
+scheduling the maintenance window. After a planned failover, anything that
+failed to replicate will be **lost**.
+
+You can use the
+[Geo status API](../../../api/geo_nodes.md#retrieve-project-sync-or-verification-failures-that-occurred-on-the-current-node)
+to review failed objects and the reasons for failure.
+A common cause of replication failures is the data being missing on the
+**primary** node - you can resolve these failures by restoring the data from backup,
+or removing references to the missing data.
+
+The maintenance window won't end until Geo replication and verification is
+completely finished. To keep the window as short as possible, you should
+ensure these processes are close to 100% as possible during active use.
+
+If the **secondary** node is still replicating data from the **primary** node,
+follow these steps to avoid unnecessary data loss:
+
+1. Until a [read-only mode](https://gitlab.com/gitlab-org/gitlab/-/issues/14609)
+ is implemented, updates must be prevented from happening manually to the
+ **primary**. Note that your **secondary** node still needs read-only
+ access to the **primary** node during the maintenance window:
+
+ 1. At the scheduled time, using your cloud provider or your node's firewall, block
+ all HTTP, HTTPS and SSH traffic to/from the **primary** node, **except** for your IP and
+ the **secondary** node's IP.
+
+ For instance, you can run the following commands on the **primary** node:
+
+ ```shell
+ sudo iptables -A INPUT -p tcp -s <secondary_node_ip> --destination-port 22 -j ACCEPT
+ sudo iptables -A INPUT -p tcp -s <your_ip> --destination-port 22 -j ACCEPT
+ sudo iptables -A INPUT --destination-port 22 -j REJECT
+
+ sudo iptables -A INPUT -p tcp -s <secondary_node_ip> --destination-port 80 -j ACCEPT
+ sudo iptables -A INPUT -p tcp -s <your_ip> --destination-port 80 -j ACCEPT
+ sudo iptables -A INPUT --tcp-dport 80 -j REJECT
+
+ sudo iptables -A INPUT -p tcp -s <secondary_node_ip> --destination-port 443 -j ACCEPT
+ sudo iptables -A INPUT -p tcp -s <your_ip> --destination-port 443 -j ACCEPT
+ sudo iptables -A INPUT --tcp-dport 443 -j REJECT
+ ```
+
+ From this point, users will be unable to view their data or make changes on the
+ **primary** node. They will also be unable to log in to the **secondary** node.
+ However, existing sessions will work for the remainder of the maintenance period, and
+ public data will be accessible throughout.
+
+ 1. Verify the **primary** node is blocked to HTTP traffic by visiting it in browser via
+ another IP. The server should refuse connection.
+
+ 1. Verify the **primary** node is blocked to Git over SSH traffic by attempting to pull an
+ existing Git repository with an SSH remote URL. The server should refuse
+ connection.
+
+ 1. On the **primary** node, disable non-Geo periodic background jobs by navigating
+ to **Admin Area > Monitoring > Background Jobs > Cron**, clicking `Disable All`,
+ and then clicking `Enable` for the `geo_sidekiq_cron_config_worker` cron job.
+ This job will re-enable several other cron jobs that are essential for planned
+ failover to complete successfully.
+
+1. Finish replicating and verifying all data:
+
+ CAUTION: **Caution:**
+ Not all data is automatically replicated. Read more about
+ [what is excluded](planned_failover.md#not-all-data-is-automatically-replicated).
+
+ 1. If you are manually replicating any
+ [data not managed by Geo](../replication/datatypes.md#limitations-on-replicationverification),
+ trigger the final replication process now.
+ 1. On the **primary** node, navigate to **Admin Area > Monitoring > Background Jobs > Queues**
+ and wait for all queues except those with `geo` in the name to drop to 0.
+ These queues contain work that has been submitted by your users; failing over
+ before it is completed will cause the work to be lost.
+ 1. On the **primary** node, navigate to **Admin Area > Geo** and wait for the
+ following conditions to be true of the **secondary** node you are failing over to:
+ - All replication meters to each 100% replicated, 0% failures.
+ - All verification meters reach 100% verified, 0% failures.
+ - Database replication lag is 0ms.
+ - The Geo log cursor is up to date (0 events behind).
+
+ 1. On the **secondary** node, navigate to **Admin Area > Monitoring > Background Jobs > Queues**
+ and wait for all the `geo` queues to drop to 0 queued and 0 running jobs.
+ 1. On the **secondary** node, use [these instructions](../../raketasks/check.md)
+ to verify the integrity of CI artifacts, LFS objects, and uploads in file
+ storage.
+
+ At this point, your **secondary** node will contain an up-to-date copy of everything the
+ **primary** node has, meaning nothing will be lost when you fail over.
+
+1. In this final step, you need to permanently disable the **primary** node.
+
+ CAUTION: **Caution:**
+ When the **primary** node goes offline, there may be data saved on the **primary** node
+ that has not been replicated to the **secondary** node. This data should be treated
+ as lost if you proceed.
+
+ TIP: **Tip:**
+ If you plan to [update the **primary** domain DNS record](index.md#step-4-optional-updating-the-primary-domain-dns-record),
+ you may wish to lower the TTL now to speed up propagation.
+
+ When performing a failover, we want to avoid a split-brain situation where
+ writes can occur in two different GitLab instances. So to prepare for the
+ failover, you must disable the **primary** node:
+
+ - If you have SSH access to the **primary** node, stop and disable GitLab:
+
+ ```shell
+ sudo gitlab-ctl stop
+ ```
+
+ Prevent GitLab from starting up again if the server unexpectedly reboots:
+
+ ```shell
+ sudo systemctl disable gitlab-runsvdir
+ ```
+
+ NOTE: **Note:**
+ (**CentOS only**) In CentOS 6 or older, there is no easy way to prevent GitLab from being
+ started if the machine reboots isn't available (see [Omnibus GitLab issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/3058)).
+ It may be safest to uninstall the GitLab package completely with `sudo yum remove gitlab-ee`.
+
+ NOTE: **Note:**
+ (**Ubuntu 14.04 LTS**) If you are using an older version of Ubuntu
+ or any other distribution based on the Upstart init system, you can prevent GitLab
+ from starting if the machine reboots as `root` with
+ `initctl stop gitlab-runsvvdir && echo 'manual' > /etc/init/gitlab-runsvdir.override && initctl reload-configuration`.
+
+ - If you do not have SSH access to the **primary** node, take the machine offline and
+ prevent it from rebooting. Since there are many ways you may prefer to accomplish
+ this, we will avoid a single recommendation. You may need to:
+
+ - Reconfigure the load balancers.
+ - Change DNS records (for example, point the **primary** DNS record to the **secondary**
+ node in order to stop usage of the **primary** node).
+ - Stop the virtual servers.
+ - Block traffic through a firewall.
+ - Revoke object storage permissions from the **primary** node.
+ - Physically disconnect a machine.
+
+### Promoting the **secondary** node
+
+Note the following when promoting a secondary:
+
+- A new **secondary** should not be added at this time. If you want to add a new
+ **secondary**, do this after you have completed the entire process of promoting
+ the **secondary** to the **primary**.
+- If you encounter an `ActiveRecord::RecordInvalid: Validation failed: Name has already been taken`
+ error during this process, read
+ [the troubleshooting advice](../replication/troubleshooting.md#fixing-errors-during-a-failover-or-when-promoting-a-secondary-to-a-primary-node).
+
+To promote the secondary node:
+
+1. SSH in to your **secondary** node and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` to reflect its new status as **primary** by
+ removing any lines that enabled the `geo_secondary_role`:
+
+ ```ruby
+ ## In pre-11.5 documentation, the role was enabled as follows. Remove this line.
+ geo_secondary_role['enable'] = true
+
+ ## In 11.5+ documentation, the role was enabled as follows. Remove this line.
+ roles ['geo_secondary_role']
+ ```
+
+1. Run the following command to list out all preflight checks and automatically
+ check if replication and verification are complete before scheduling a planned
+ failover to ensure the process will go smoothly:
+
+ ```shell
+ gitlab-ctl promotion-preflight-checks
+ ```
+
+1. Promote the **secondary**:
+
+ ```shell
+ gitlab-ctl promote-to-primary-node
+ ```
+
+ If you have already run the [preflight checks](planned_failover.md#preflight-checks)
+ or don't want to run them, you can skip them:
+
+ ```shell
+ gitlab-ctl promote-to-primary-node --skip-preflight-check
+ ```
+
+ You can also promote the secondary node to primary **without any further confirmation**, even when preflight checks fail:
+
+ ```shell
+ sudo gitlab-ctl promote-to-primary-node --force
+ ```
+
+1. Verify you can connect to the newly promoted **primary** node using the URL used
+ previously for the **secondary** node.
+
+ If successful, the **secondary** node has now been promoted to the **primary** node.
+
+### Next steps
+
+To regain geographic redundancy as quickly as possible, you should
+[add a new **secondary** node](../setup/index.md). To
+do that, you can re-add the old **primary** as a new secondary and bring it back
+online.
diff --git a/doc/administration/geo/index.md b/doc/administration/geo/index.md
new file mode 100644
index 00000000000..6fdf213ac78
--- /dev/null
+++ b/doc/administration/geo/index.md
@@ -0,0 +1,295 @@
+---
+stage: Enablement
+group: Geo
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+type: howto
+---
+
+# Geo **(PREMIUM ONLY)**
+
+> - Introduced in GitLab Enterprise Edition 8.9.
+> - Using Geo in combination with
+> [multi-node architectures](../reference_architectures/index.md)
+> is considered **Generally Available** (GA) in
+> [GitLab Premium](https://about.gitlab.com/pricing/) 10.4.
+
+Geo is the solution for widely distributed development teams and for providing a warm-standby as part of a disaster recovery strategy.
+
+## Overview
+
+CAUTION: **Caution:**
+Geo undergoes significant changes from release to release. Upgrades **are** supported and [documented](#updating-geo), but you should ensure that you're using the right version of the documentation for your installation.
+
+Fetching large repositories can take a long time for teams located far from a single GitLab instance.
+
+Geo provides local, read-only instances of your GitLab instances. This can reduce the time it takes
+to clone and fetch large repositories, speeding up development.
+
+For a video introduction to Geo, see [Introduction to GitLab Geo - GitLab Features](https://www.youtube.com/watch?v=-HDLxSjEh6w).
+
+To make sure you're using the right version of the documentation, navigate to [the source version of this page on GitLab.com](https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/geo/index.md) and choose the appropriate release from the **Switch branch/tag** dropdown. For example, [`v11.2.3-ee`](https://gitlab.com/gitlab-org/gitlab/blob/v11.2.3-ee/doc/administration/geo/index.md).
+
+## Use cases
+
+Implementing Geo provides the following benefits:
+
+- Reduce from minutes to seconds the time taken for your distributed developers to clone and fetch large repositories and projects.
+- Enable all of your developers to contribute ideas and work in parallel, no matter where they are.
+- Balance the read-only load between your **primary** and **secondary** nodes.
+
+In addition, it:
+
+- Can be used for cloning and fetching projects, in addition to reading any data available in the GitLab web interface (see [current limitations](#current-limitations)).
+- Overcomes slow connections between distant offices, saving time by improving speed for distributed teams.
+- Helps reducing the loading time for automated tasks, custom integrations, and internal workflows.
+- Can quickly fail over to a **secondary** node in a [disaster recovery](disaster_recovery/index.md) scenario.
+- Allows [planned failover](disaster_recovery/planned_failover.md) to a **secondary** node.
+
+Geo provides:
+
+- Read-only **secondary** nodes: Maintain one **primary** GitLab node while still enabling read-only **secondary** nodes for each of your distributed teams.
+- Authentication system hooks: **Secondary** nodes receives all authentication data (like user accounts and logins) from the **primary** instance.
+- An intuitive UI: **Secondary** nodes utilize the same web interface your team has grown accustomed to. In addition, there are visual notifications that block write operations and make it clear that a user is on a **secondary** node.
+
+## How it works
+
+Your Geo instance can be used for cloning and fetching projects, in addition to reading any data. This will make working with large repositories over large distances much faster.
+
+![Geo overview](replication/img/geo_overview.png)
+
+When Geo is enabled, the:
+
+- Original instance is known as the **primary** node.
+- Replicated read-only nodes are known as **secondary** nodes.
+
+Keep in mind that:
+
+- **Secondary** nodes talk to the **primary** node to:
+ - Get user data for logins (API).
+ - Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT).
+- Since GitLab Premium 10.0, the **primary** node no longer talks to **secondary** nodes to notify for changes (API).
+- Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+- There are [limitations](#current-limitations) in the current implementation.
+
+### Architecture
+
+The following diagram illustrates the underlying architecture of Geo.
+
+![Geo architecture](replication/img/geo_architecture.png)
+
+In this diagram:
+
+- There is the **primary** node and the details of one **secondary** node.
+- Writes to the database can only be performed on the **primary** node. A **secondary** node receives database
+ updates via PostgreSQL streaming replication.
+- If present, the [LDAP server](#ldap) should be configured to replicate for [Disaster Recovery](disaster_recovery/index.md) scenarios.
+- A **secondary** node performs different type of synchronizations against the **primary** node, using a special
+ authorization protected by JWT:
+ - Repositories are cloned/updated via Git over HTTPS.
+ - Attachments, LFS objects, and other files are downloaded via HTTPS using a private API endpoint.
+
+From the perspective of a user performing Git operations:
+
+- The **primary** node behaves as a full read-write GitLab instance.
+- **Secondary** nodes are read-only but proxy Git push operations to the **primary** node. This makes **secondary** nodes appear to support push operations themselves.
+
+To simplify the diagram, some necessary components are omitted. Note that:
+
+- Git over SSH requires [`gitlab-shell`](https://gitlab.com/gitlab-org/gitlab-shell) and OpenSSH.
+- Git over HTTPS required [`gitlab-workhorse`](https://gitlab.com/gitlab-org/gitlab-workhorse).
+
+Note that a **secondary** node needs two different PostgreSQL databases:
+
+- A read-only database instance that streams data from the main GitLab database.
+- [Another database instance](#geo-tracking-database) used internally by the **secondary** node to record what data has been replicated.
+
+In **secondary** nodes, there is an additional daemon: [Geo Log Cursor](#geo-log-cursor).
+
+## Requirements for running Geo
+
+The following are required to run Geo:
+
+- An operating system that supports OpenSSH 6.9+ (needed for
+ [fast lookup of authorized SSH keys in the database](../operations/fast_ssh_key_lookup.md))
+ The following operating systems are known to ship with a current version of OpenSSH:
+ - [CentOS](https://www.centos.org) 7.4+
+ - [Ubuntu](https://ubuntu.com) 16.04+
+- PostgreSQL 11+ with [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
+- Git 2.9+
+- All nodes must run the same GitLab version.
+
+Additionally, check GitLab's [minimum requirements](../../install/requirements.md),
+and we recommend you use:
+
+- At least GitLab Enterprise Edition 10.0 for basic Geo features.
+- The latest version for a better experience.
+
+### Firewall rules
+
+The following table lists basic ports that must be open between the **primary** and **secondary** nodes for Geo.
+
+| **Primary** node | **Secondary** node | Protocol |
+|:-----------------|:-------------------|:-------------|
+| 80 | 80 | HTTP |
+| 443 | 443 | TCP or HTTPS |
+| 22 | 22 | TCP |
+| 5432 | | PostgreSQL |
+
+See the full list of ports used by GitLab in [Package defaults](https://docs.gitlab.com/omnibus/package-information/defaults.html)
+
+NOTE: **Note:**
+[Web terminal](../../ci/environments/index.md#web-terminals) support requires your load balancer to correctly handle WebSocket connections.
+When using HTTP or HTTPS proxying, your load balancer must be configured to pass through the `Connection` and `Upgrade` hop-by-hop headers. See the [web terminal](../integration/terminal.md) integration guide for more details.
+
+NOTE: **Note:**
+When using HTTPS protocol for port 443, you will need to add an SSL certificate to the load balancers.
+If you wish to terminate SSL at the GitLab application server instead, use TCP protocol.
+
+### LDAP
+
+We recommend that if you use LDAP on your **primary** node, you also set up secondary LDAP servers on each **secondary** node. Otherwise, users will not be able to perform Git operations over HTTP(s) on the **secondary** node using HTTP Basic Authentication. However, Git via SSH and personal access tokens will still work.
+
+NOTE: **Note:**
+It is possible for all **secondary** nodes to share an LDAP server, but additional latency can be an issue. Also, consider what LDAP server will be available in a [disaster recovery](disaster_recovery/index.md) scenario if a **secondary** node is promoted to be a **primary** node.
+
+Check for instructions on how to set up replication in your LDAP service. Instructions will be different depending on the software or service used. For example, OpenLDAP provides [these instructions](https://www.openldap.org/doc/admin24/replication.html).
+
+### Geo Tracking Database
+
+The tracking database instance is used as metadata to control what needs to be updated on the disk of the local instance. For example:
+
+- Download new assets.
+- Fetch new LFS Objects.
+- Fetch changes from a repository that has recently been updated.
+
+Because the replicated database instance is read-only, we need this additional database instance for each **secondary** node.
+
+### Geo Log Cursor
+
+This daemon:
+
+- Reads a log of events replicated by the **primary** node to the **secondary** database instance.
+- Updates the Geo Tracking Database instance with changes that need to be executed.
+
+When something is marked to be updated in the tracking database instance, asynchronous jobs running on the **secondary** node will execute the required operations and update the state.
+
+This new architecture allows GitLab to be resilient to connectivity issues between the nodes. It doesn't matter how long the **secondary** node is disconnected from the **primary** node as it will be able to replay all the events in the correct order and become synchronized with the **primary** node again.
+
+## Setup instructions
+
+For setup instructions, see [Setting up Geo](setup/index.md).
+
+## Post-installation documentation
+
+After installing GitLab on the **secondary** nodes and performing the initial configuration, see the following documentation for post-installation information.
+
+### Configuring Geo
+
+For information on configuring Geo, see [Geo configuration](replication/configuration.md).
+
+### Updating Geo
+
+For information on how to update your Geo nodes to the latest GitLab version, see [Updating the Geo nodes](replication/updating_the_geo_nodes.md).
+
+### Pausing and resuming replication
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/35913) in [GitLab Premium](https://about.gitlab.com/pricing/) 13.2.
+
+In some circumstances, like during [upgrades](replication/updating_the_geo_nodes.md) or a [planned failover](disaster_recovery/planned_failover.md), it is desirable to pause replication between the primary and secondary.
+
+Pausing and resuming replication is done via a command line tool from the secondary node.
+
+**To Pause: (from secondary)**
+
+```shell
+gitlab-ctl geo-replication-pause
+```
+
+**To Resume: (from secondary)**
+
+```shell
+gitlab-ctl geo-replication-resume
+```
+
+### Configuring Geo for multiple nodes
+
+For information on configuring Geo for multiple nodes, see [Geo for multiple servers](replication/multiple_servers.md).
+
+### Configuring Geo with Object Storage
+
+For information on configuring Geo with object storage, see [Geo with Object storage](replication/object_storage.md).
+
+### Disaster Recovery
+
+For information on using Geo in disaster recovery situations to mitigate data-loss and restore services, see [Disaster Recovery](disaster_recovery/index.md).
+
+### Replicating the Container Registry
+
+For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** node](replication/docker_registry.md).
+
+### Security Review
+
+For more information on Geo security, see [Geo security review](replication/security_review.md).
+
+### Tuning Geo
+
+For more information on tuning Geo, see [Tuning Geo](replication/tuning.md).
+
+### Set up a location-aware Git URL
+
+For an example of how to set up a location-aware Git remote URL with AWS Route53, see [Location-aware Git remote URL with AWS Route53](replication/location_aware_git_url.md).
+
+## Remove Geo node
+
+For more information on removing a Geo node, see [Removing **secondary** Geo nodes](replication/remove_geo_node.md).
+
+## Disable Geo
+
+To find out how to disable Geo, see [Disabling Geo](replication/disable_geo.md).
+
+## Current limitations
+
+CAUTION: **Caution:**
+This list of limitations only reflects the latest version of GitLab. If you are using an older version, extra limitations may be in place.
+
+- Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/-/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
+- Cloning, pulling, or pushing repositories that exist on the **primary** node but not on the **secondary** nodes where [selective synchronization](replication/configuration.md#selective-synchronization) does not include the project is not supported over SSH [but support is planned](https://gitlab.com/groups/gitlab-org/-/epics/2562). HTTP(S) is supported.
+- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected. Support for the **secondary** node to use an OAuth provider independent from the primary is [being planned](https://gitlab.com/gitlab-org/gitlab/-/issues/208465).
+- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [Omnibus GitLab issue #2978](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/2978) for details.
+- Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** node.
+- [Selective synchronization](replication/configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism.
+- Object pools for forked project deduplication work only on the **primary** node, and are duplicated on the **secondary** node.
+- [External merge request diffs](../merge_request_diffs.md) will not be replicated if they are on-disk, and viewing merge requests will fail. However, external MR diffs in object storage **are** supported. The default configuration (in-database) does work.
+- GitLab Runners cannot register with a **secondary** node. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294).
+
+### Limitations on replication/verification
+
+You can keep track of the progress to implement the missing items in
+these epics/issues:
+
+- [Unreplicated Data Types](https://gitlab.com/groups/gitlab-org/-/epics/893)
+- [Verify all replicated data](https://gitlab.com/groups/gitlab-org/-/epics/1430)
+
+There is a complete list of all GitLab [data types](replication/datatypes.md) and [existing support for replication and verification](replication/datatypes.md#limitations-on-replicationverification).
+
+## Frequently Asked Questions
+
+For answers to common questions, see the [Geo FAQ](replication/faq.md).
+
+## Log files
+
+Since GitLab 9.5, Geo stores structured log messages in a `geo.log` file. For Omnibus installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`.
+
+This file contains information about when Geo attempts to sync repositories and files. Each line in the file contains a separate JSON entry that can be ingested into. For example, Elasticsearch or Splunk.
+
+For example:
+
+```json
+{"severity":"INFO","time":"2017-08-06T05:40:16.104Z","message":"Repository update","project_id":1,"source":"repository","resync_repository":true,"resync_wiki":true,"class":"Gitlab::Geo::LogCursor::Daemon","cursor_delay_s":0.038}
+```
+
+This message shows that Geo detected that a repository update was needed for project `1`.
+
+## Troubleshooting
+
+For troubleshooting steps, see [Geo Troubleshooting](replication/troubleshooting.md).
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 74fa8e3b8f2..8c818bcfbe2 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -12,7 +12,7 @@ type: howto
NOTE: **Note:**
This is the final step in setting up a **secondary** Geo node. Stages of the
setup process must be completed in the documented order.
-Before attempting the steps in this stage, [complete all prior stages](index.md#using-omnibus-gitlab).
+Before attempting the steps in this stage, [complete all prior stages](../setup/index.md#using-omnibus-gitlab).
The basic steps of configuring a **secondary** node are to:
diff --git a/doc/administration/geo/replication/database.md b/doc/administration/geo/replication/database.md
index 0bc37ce6438..da5376190b9 100644
--- a/doc/administration/geo/replication/database.md
+++ b/doc/administration/geo/replication/database.md
@@ -1,469 +1,5 @@
---
-stage: Enablement
-group: Geo
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
-type: howto
+redirect_to: '../setup/database.md'
---
-# Geo database replication **(PREMIUM ONLY)**
-
-NOTE: **Note:**
-If your GitLab installation uses external (not managed by Omnibus) PostgreSQL
-instances, the Omnibus roles will not be able to perform all necessary
-configuration steps. In this case,
-[follow the Geo with external PostgreSQL instances document instead](external_database.md).
-
-NOTE: **Note:**
-The stages of the setup process must be completed in the documented order.
-Before attempting the steps in this stage, [complete all prior stages](index.md#using-omnibus-gitlab).
-
-This document describes the minimal steps you have to take in order to
-replicate your **primary** GitLab database to a **secondary** node's database. You may
-have to change some values according to your database setup, how big it is, etc.
-
-You are encouraged to first read through all the steps before executing them
-in your testing/production environment.
-
-## PostgreSQL replication
-
-The GitLab **primary** node where the write operations happen will connect to
-the **primary** database server, and **secondary** nodes will
-connect to their own database servers (which are also read-only).
-
-We recommend using [PostgreSQL replication slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
-to ensure that the **primary** node retains all the data necessary for the **secondary** nodes to
-recover. See below for more details.
-
-The following guide assumes that:
-
-- You are using Omnibus and therefore you are using PostgreSQL 11 or later
- which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/11/app-pgbasebackup.html).
-- You have a **primary** node already set up (the GitLab server you are
- replicating from), running Omnibus' PostgreSQL (or equivalent version), and
- you have a new **secondary** server set up with the same versions of the OS,
- PostgreSQL, and GitLab on all nodes.
-
-CAUTION: **Warning:**
-Geo works with streaming replication. Logical replication is not supported at this time.
-There is an [issue where support is being discussed](https://gitlab.com/gitlab-org/gitlab/-/issues/7420).
-
-### Step 1. Configure the **primary** server
-
-1. SSH into your GitLab **primary** server and login as root:
-
- ```shell
- sudo -i
- ```
-
-1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your node:
-
- ```ruby
- # The unique identifier for the Geo node.
- gitlab_rails['geo_node_name'] = '<node_name_here>'
- ```
-
-1. Reconfigure the **primary** node for the change to take effect:
-
- ```shell
- gitlab-ctl reconfigure
- ```
-
-1. Execute the command below to define the node as **primary** node:
-
- ```shell
- gitlab-ctl set-geo-primary-node
- ```
-
- This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
-
-1. GitLab 10.4 and up only: Do the following to make sure the `gitlab` database user has a password defined:
-
- Generate a MD5 hash of the desired password:
-
- ```shell
- gitlab-ctl pg-password-md5 gitlab
- # Enter password: <your_password_here>
- # Confirm password: <your_password_here>
- # fca0b89a972d69f00eb3ec98a5838484
- ```
-
- Edit `/etc/gitlab/gitlab.rb`:
-
- ```ruby
- # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab`
- postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
-
- # Every node that runs Puma or Sidekiq needs to have the database
- # password specified as below. If you have a high-availability setup, this
- # must be present in all application nodes.
- gitlab_rails['db_password'] = '<your_password_here>'
- ```
-
-1. Omnibus GitLab already has a [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication)
- called `gitlab_replicator`. You must set the password for this user manually.
- You will be prompted to enter a password:
-
- ```shell
- gitlab-ctl set-replication-password
- ```
-
- This command will also read the `postgresql['sql_replication_user']` Omnibus
- setting in case you have changed `gitlab_replicator` username to something
- else.
-
- If you are using an external database not managed by Omnibus GitLab, you need
- to create the replicator user and define a password to it manually:
-
- ```sql
- --- Create a new user 'replicator'
- CREATE USER gitlab_replicator;
-
- --- Set/change a password and grants replication privilege
- ALTER USER gitlab_replicator WITH REPLICATION ENCRYPTED PASSWORD '<replication_password>';
- ```
-
-1. Configure PostgreSQL to listen on network interfaces:
-
- For security reasons, PostgreSQL does not listen on any network interfaces
- by default. However, Geo requires the **secondary** node to be able to
- connect to the **primary** node's database. For this reason, we need the address of
- each node.
-
- NOTE: **Note:**
- For external PostgreSQL instances, see [additional instructions](external_database.md).
-
- If you are using a cloud provider, you can lookup the addresses for each
- Geo node through your cloud provider's management console.
-
- To lookup the address of a Geo node, SSH in to the Geo node and execute:
-
- ```shell
- ##
- ## Private address
- ##
- ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}'
-
- ##
- ## Public address
- ##
- echo "External address: $(curl --silent ipinfo.io/ip)"
- ```
-
- In most cases, the following addresses will be used to configure GitLab
- Geo:
-
- | Configuration | Address |
- |:----------------------------------------|:------------------------------------------------------|
- | `postgresql['listen_address']` | **Primary** node's public or VPC private address. |
- | `postgresql['md5_auth_cidr_addresses']` | **Secondary** node's public or VPC private addresses. |
-
- If you are using Google Cloud Platform, SoftLayer, or any other vendor that
- provides a virtual private cloud (VPC) you can use the **primary** and **secondary** nodes
- private addresses (corresponds to "internal address" for Google Cloud Platform) for
- `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.
-
- The `listen_address` option opens PostgreSQL up to network connections with the interface
- corresponding to the given address. See [the PostgreSQL documentation](https://www.postgresql.org/docs/11/runtime-config-connection.html)
- for more details.
-
- NOTE: **Note:**
- If you need to use `0.0.0.0` or `*` as the listen_address, you will also need to add
- `127.0.0.1/32` to the `postgresql['md5_auth_cidr_addresses']` setting, to allow Rails to connect through
- `127.0.0.1`. For more information, see [omnibus-5258](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5258).
-
- Depending on your network configuration, the suggested addresses may not
- be correct. If your **primary** node and **secondary** nodes connect over a local
- area network, or a virtual network connecting availability zones like
- [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/)
- you should use the **secondary** node's private address for `postgresql['md5_auth_cidr_addresses']`.
-
- Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
- addresses with addresses appropriate to your network configuration:
-
- ```ruby
- ##
- ## Geo Primary role
- ## - configure dependent flags automatically to enable Geo
- ##
- roles ['geo_primary_role']
-
- ##
- ## Primary address
- ## - replace '<primary_node_ip>' with the public or VPC address of your Geo primary node
- ##
- postgresql['listen_address'] = '<primary_node_ip>'
-
- ##
- # Allow PostgreSQL client authentication from the primary and secondary IPs. These IPs may be
- # public or VPC addresses in CIDR format, for example ['198.51.100.1/32', '198.51.100.2/32']
- ##
- postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32']
-
- ##
- ## Replication settings
- ## - set this to be the number of Geo secondary nodes you have
- ##
- postgresql['max_replication_slots'] = 1
- # postgresql['max_wal_senders'] = 10
- # postgresql['wal_keep_segments'] = 10
-
- ##
- ## Disable automatic database migrations temporarily
- ## (until PostgreSQL is restarted and listening on the private address).
- ##
- gitlab_rails['auto_migrate'] = false
- ```
-
-1. Optional: If you want to add another **secondary** node, the relevant setting would look like:
-
- ```ruby
- postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32', '<another_secondary_node_ip>/32']
- ```
-
- You may also want to edit the `wal_keep_segments` and `max_wal_senders` to match your
- database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/11/runtime-config-replication.html)
- for more information.
-
-1. Save the file and reconfigure GitLab for the database listen changes and
- the replication slot changes to be applied:
-
- ```shell
- gitlab-ctl reconfigure
- ```
-
- Restart PostgreSQL for its changes to take effect:
-
- ```shell
- gitlab-ctl restart postgresql
- ```
-
-1. Re-enable migrations now that PostgreSQL is restarted and listening on the
- private address.
-
- Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`:
-
- ```ruby
- gitlab_rails['auto_migrate'] = true
- ```
-
- Save the file and reconfigure GitLab:
-
- ```shell
- gitlab-ctl reconfigure
- ```
-
-1. Now that the PostgreSQL server is set up to accept remote connections, run
- `netstat -plnt | grep 5432` to make sure that PostgreSQL is listening on port
- `5432` to the **primary** server's private address.
-
-1. A certificate was automatically generated when GitLab was reconfigured. This
- will be used automatically to protect your PostgreSQL traffic from
- eavesdroppers, but to protect against active ("man-in-the-middle") attackers,
- the **secondary** node needs a copy of the certificate. Make a copy of the PostgreSQL
- `server.crt` file on the **primary** node by running this command:
-
- ```shell
- cat ~gitlab-psql/data/server.crt
- ```
-
- Copy the output into a clipboard or into a local file. You
- will need it when setting up the **secondary** node! The certificate is not sensitive
- data.
-
-### Step 2. Configure the **secondary** server
-
-1. SSH into your GitLab **secondary** server and login as root:
-
- ```shell
- sudo -i
- ```
-
-1. Stop application server and Sidekiq
-
- ```shell
- gitlab-ctl stop puma
- gitlab-ctl stop sidekiq
- ```
-
- NOTE: **Note:**
- This step is important so we don't try to execute anything before the node is fully configured.
-
-1. [Check TCP connectivity](../../raketasks/maintenance.md) to the **primary** node's PostgreSQL server:
-
- ```shell
- gitlab-rake gitlab:tcp_check[<primary_node_ip>,5432]
- ```
-
- NOTE: **Note:**
- If this step fails, you may be using the wrong IP address, or a firewall may
- be preventing access to the server. Check the IP address, paying close
- attention to the difference between public and private addresses and ensure
- that, if a firewall is present, the **secondary** node is permitted to connect to the
- **primary** node on port 5432.
-
-1. Create a file `server.crt` in the **secondary** server, with the content you got on the last step of the **primary** node's setup:
-
- ```shell
- editor server.crt
- ```
-
-1. Set up PostgreSQL TLS verification on the **secondary** node:
-
- Install the `server.crt` file:
-
- ```shell
- install \
- -D \
- -o gitlab-psql \
- -g gitlab-psql \
- -m 0400 \
- -T server.crt ~gitlab-psql/.postgresql/root.crt
- ```
-
- PostgreSQL will now only recognize that exact certificate when verifying TLS
- connections. The certificate can only be replicated by someone with access
- to the private key, which is **only** present on the **primary** node.
-
-1. Test that the `gitlab-psql` user can connect to the **primary** node's database
- (the default Omnibus database name is `gitlabhq_production`):
-
- ```shell
- sudo \
- -u gitlab-psql /opt/gitlab/embedded/bin/psql \
- --list \
- -U gitlab_replicator \
- -d "dbname=gitlabhq_production sslmode=verify-ca" \
- -W \
- -h <primary_node_ip>
- ```
-
- When prompted enter the password you set in the first step for the
- `gitlab_replicator` user. If all worked correctly, you should see
- the list of **primary** node's databases.
-
- A failure to connect here indicates that the TLS configuration is incorrect.
- Ensure that the contents of `~gitlab-psql/data/server.crt` on the **primary** node
- match the contents of `~gitlab-psql/.postgresql/root.crt` on the **secondary** node.
-
-1. Configure PostgreSQL:
-
- This step is similar to how we configured the **primary** instance.
- We need to enable this, even if using a single node.
-
- Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
- addresses with addresses appropriate to your network configuration:
-
- ```ruby
- ##
- ## Geo Secondary role
- ## - configure dependent flags automatically to enable Geo
- ##
- roles ['geo_secondary_role']
-
- ##
- ## Secondary address
- ## - replace '<secondary_node_ip>' with the public or VPC address of your Geo secondary node
- ##
- postgresql['listen_address'] = '<secondary_node_ip>'
- postgresql['md5_auth_cidr_addresses'] = ['<secondary_node_ip>/32']
-
- ##
- ## Database credentials password (defined previously in primary node)
- ## - replicate same values here as defined in primary node
- ##
- postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
- gitlab_rails['db_password'] = '<your_password_here>'
- ```
-
- For external PostgreSQL instances, see [additional instructions](external_database.md).
- If you bring a former **primary** node back online to serve as a **secondary** node, then you also need to remove `roles ['geo_primary_role']` or `geo_primary_role['enable'] = true`.
-
-1. Reconfigure GitLab for the changes to take effect:
-
- ```shell
- gitlab-ctl reconfigure
- ```
-
-1. Restart PostgreSQL for the IP change to take effect:
-
- ```shell
- gitlab-ctl restart postgresql
- ```
-
-### Step 3. Initiate the replication process
-
-Below we provide a script that connects the database on the **secondary** node to
-the database on the **primary** node, replicates the database, and creates the
-needed files for streaming replication.
-
-The directories used are the defaults that are set up in Omnibus. If you have
-changed any defaults, configure it as you see fit replacing the directories and paths.
-
-CAUTION: **Warning:**
-Make sure to run this on the **secondary** server as it removes all PostgreSQL's
-data before running `pg_basebackup`.
-
-1. SSH into your GitLab **secondary** server and login as root:
-
- ```shell
- sudo -i
- ```
-
-1. Choose a database-friendly name to use for your **secondary** node to
- use as the replication slot name. For example, if your domain is
- `secondary.geo.example.com`, you may use `secondary_example` as the slot
- name as shown in the commands below.
-
-1. Execute the command below to start a backup/restore and begin the replication
-
- CAUTION: **Warning:**
- Each Geo **secondary** node must have its own unique replication slot name.
- Using the same slot name between two secondaries will break PostgreSQL replication.
-
- ```shell
- gitlab-ctl replicate-geo-database \
- --slot-name=<secondary_node_name> \
- --host=<primary_node_ip>
- ```
-
- NOTE: **Note:**
- Replication slot names must only contain lowercase letters, numbers, and the underscore character.
-
- When prompted, enter the _plaintext_ password you set up for the `gitlab_replicator`
- user in the first step.
-
- This command also takes a number of additional options. You can use `--help`
- to list them all, but here are a couple of tips:
-
- - If PostgreSQL is listening on a non-standard port, add `--port=` as well.
- - If your database is too large to be transferred in 30 minutes, you will need
- to increase the timeout, e.g., `--backup-timeout=3600` if you expect the
- initial replication to take under an hour.
- - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether
- (e.g., you know the network path is secure, or you are using a site-to-site
- VPN). This is **not** safe over the public Internet!
- - You can read more details about each `sslmode` in the
- [PostgreSQL documentation](https://www.postgresql.org/docs/11/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
- the instructions above are carefully written to ensure protection against
- both passive eavesdroppers and active "man-in-the-middle" attackers.
- - Change the `--slot-name` to the name of the replication slot
- to be used on the **primary** database. The script will attempt to create the
- replication slot automatically if it does not exist.
- - If you're repurposing an old server into a Geo **secondary** node, you'll need to
- add `--force` to the command line.
- - When not in a production machine you can disable backup step if you
- really sure this is what you want by adding `--skip-backup`
-
-The replication process is now complete.
-
-## PgBouncer support (optional)
-
-[PgBouncer](https://www.pgbouncer.org/) may be used with GitLab Geo to pool
-PostgreSQL connections. We recommend using PgBouncer if you use GitLab in a
-high-availability configuration with a cluster of nodes supporting a Geo
-**primary** node and another cluster of nodes supporting a Geo **secondary** node. For more
-information, see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md).
-
-## Troubleshooting
-
-Read the [troubleshooting document](troubleshooting.md).
+This document was moved to [another location](../setup/index.md).
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md
index b8d01e80371..166a724f9c1 100644
--- a/doc/administration/geo/replication/datatypes.md
+++ b/doc/administration/geo/replication/datatypes.md
@@ -45,8 +45,8 @@ verification methods:
| Blobs | Archived CI build traces _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
| Blobs | Container registry _(filesystem)_ | Geo with API/Docker API | _Not implemented_ |
| Blobs | Container registry _(object storage)_ | Geo with API/Managed/Docker API (*2*) | _Not implemented_ |
-| Blobs | Package registry _(filesystem)_ | Geo with API | _Not implemented_ |
-| Blobs | Package registry _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
+| Blobs | Package registry _(filesystem)_ | Geo with API | _Not implemented_ |
+| Blobs | Package registry _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ |
- (*1*): Redis replication can be used as part of HA with Redis sentinel. It's not used between Geo nodes.
- (*2*): Object storage replication can be performed by Geo or by your object storage provider/appliance
@@ -134,7 +134,7 @@ The replication for some data types is behind a corresponding feature flag:
> - They're enabled on GitLab.com.
> - They can't be enabled or disabled per-project.
> - They are recommended for production use.
-> - For GitLab self-managed instances, GitLab administrators can opt to [disable them](#enable-or-disable-replication-for-some-data-types-core-only). **(CORE ONLY)**
+> - For GitLab self-managed instances, GitLab administrators can opt to [disable them](#enable-or-disable-replication-for-some-data-types). **(CORE ONLY)**
#### Enable or disable replication (for some data types) **(CORE ONLY)**
@@ -160,34 +160,34 @@ replicating data from those features will cause the data to be **lost**.
If you wish to use those features on a **secondary** node, or to execute a failover
successfully, you must replicate their data using some other means.
-| Feature | Replicated (added in GitLab version) | Verified (added in GitLab version) | Notes |
-|:---------------------------------------------------------------------|:---------------------------------------------------------|:--------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|
-| Application data in PostgreSQL | **Yes** (10.2) | **Yes** (10.2) | |
-| Project repository | **Yes** (10.2) | **Yes** (10.7) | |
-| Project wiki repository | **Yes** (10.2) | **Yes** (10.7) | |
-| Project designs repository | **Yes** (12.7) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/32467) | |
-| Uploads | **Yes** (10.2) | [No](https://gitlab.com/groups/gitlab-org/-/epics/1817) | Verified only on transfer, or manually (*1*) |
-| LFS objects | **Yes** (10.2) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8922) | Verified only on transfer, or manually (*1*). Unavailable for new LFS objects in 11.11.x and 12.0.x (*2*). |
-| CI job artifacts (other than traces) | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Verified only manually (*1*) |
-| Archived traces | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Verified only on transfer, or manually (*1*) |
-| Personal snippets | **Yes** (10.2) | **Yes** (10.2) | |
-| [Versioned snippets](../../../user/snippets.md#versioned-snippets) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2810) | |
-| Project snippets | **Yes** (10.2) | **Yes** (10.2) | |
-| Object pools for forked project deduplication | **Yes** | No | |
-| [Server-side Git hooks](../../server_hooks.md) | No | No | |
-| [Elasticsearch integration](../../../integration/elasticsearch.md) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/1186) | No | |
-| [GitLab Pages](../../pages/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/589) | No | |
-| [Container Registry](../../packages/container_registry.md) | **Yes** (12.3) | No | |
-| [NPM Registry](../../../user/packages/npm_registry/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default | |
-| [Maven Repository](../../../user/packages/maven_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default | |
-| [Conan Repository](../../../user/packages/conan_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default | |
-| [NuGet Repository](../../../user/packages/nuget_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default | |
-| [PyPi Repository](../../../user/packages/pypi_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default | |
-| [Composer Repository](../../../user/packages/composer_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default | |
-| [External merge request diffs](../../merge_request_diffs.md) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/33817) | No | |
-| [Terraform State](../../terraform_state.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/3112)(*3*) | No | |
-| [Vulnerability Export](../../../user/application_security/security_dashboard/#export-vulnerabilities) | [No](https://gitlab.com/groups/gitlab-org/-/epics/3111)(*3*) | No | | |
-| Content in object storage | **Yes** (12.4) | No | |
+| Feature | Replicated (added in GitLab version) | Verified (added in GitLab version) | Notes |
+|:------------------------------------------------------------------------------------------------------|:-------------------------------------------------------------|:----------------------------------------------------------|:-----------------------------------------------------------------------------------------------------------|
+| Application data in PostgreSQL | **Yes** (10.2) | **Yes** (10.2) | |
+| Project repository | **Yes** (10.2) | **Yes** (10.7) | |
+| Project wiki repository | **Yes** (10.2) | **Yes** (10.7) | |
+| Project designs repository | **Yes** (12.7) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/32467) | |
+| Uploads | **Yes** (10.2) | [No](https://gitlab.com/groups/gitlab-org/-/epics/1817) | Verified only on transfer, or manually (*1*) |
+| LFS objects | **Yes** (10.2) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8922) | Verified only on transfer, or manually (*1*). Unavailable for new LFS objects in 11.11.x and 12.0.x (*2*). |
+| CI job artifacts (other than traces) | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Verified only manually (*1*) |
+| Archived traces | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Verified only on transfer, or manually (*1*) |
+| Personal snippets | **Yes** (10.2) | **Yes** (10.2) | |
+| [Versioned snippets](../../../user/snippets.md#versioned-snippets) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2810) | |
+| Project snippets | **Yes** (10.2) | **Yes** (10.2) | |
+| Object pools for forked project deduplication | **Yes** | No | |
+| [Server-side Git hooks](../../server_hooks.md) | No | No | |
+| [Elasticsearch integration](../../../integration/elasticsearch.md) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/1186) | No | |
+| [GitLab Pages](../../pages/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/589) | No | |
+| [Container Registry](../../packages/container_registry.md) | **Yes** (12.3) | No | |
+| [NPM Registry](../../../user/packages/npm_registry/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default |
+| [Maven Repository](../../../user/packages/maven_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default |
+| [Conan Repository](../../../user/packages/conan_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default |
+| [NuGet Repository](../../../user/packages/nuget_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default |
+| [PyPi Repository](../../../user/packages/pypi_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default |
+| [Composer Repository](../../../user/packages/composer_repository/index.md) | **Yes** (13.2) | No | Behind feature flag `geo_package_file_replication`, enabled by default |
+| [External merge request diffs](../../merge_request_diffs.md) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/33817) | No | |
+| [Terraform State](../../terraform_state.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/3112)(*3*) | No | |
+| [Vulnerability Export](../../../user/application_security/security_dashboard/#export-vulnerabilities) | [No](https://gitlab.com/groups/gitlab-org/-/epics/3111)(*3*) | No | |
+| Content in object storage | **Yes** (12.4) | No | |
- (*1*): The integrity can be verified manually using
[Integrity Check Rake Task](../../raketasks/check.md) on both nodes and comparing
diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md
index d860a3dd368..0d0cd8a37d5 100644
--- a/doc/administration/geo/replication/external_database.md
+++ b/doc/administration/geo/replication/external_database.md
@@ -1,234 +1,5 @@
---
-stage: Enablement
-group: Geo
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
-type: howto
+redirect_to: '../setup/external_database.md'
---
-# Geo with external PostgreSQL instances **(PREMIUM ONLY)**
-
-This document is relevant if you are using a PostgreSQL instance that is *not
-managed by Omnibus*. This includes cloud-managed instances like AWS RDS, or
-manually installed and configured PostgreSQL instances.
-
-NOTE: **Note:**
-We strongly recommend running Omnibus-managed instances as they are actively
-developed and tested. We aim to be compatible with most external
-(not managed by Omnibus) databases but we do not guarantee compatibility.
-
-## **Primary** node
-
-1. SSH into a GitLab **primary** application server and login as root:
-
- ```shell
- sudo -i
- ```
-
-1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** ID for your node (arbitrary value):
-
- ```ruby
- # The unique identifier for the Geo node.
- gitlab_rails['geo_node_name'] = '<node_name_here>'
- ```
-
-1. Reconfigure the **primary** node for the change to take effect:
-
- ```shell
- gitlab-ctl reconfigure
- ```
-
-1. Execute the command below to define the node as **primary** node:
-
- ```shell
- gitlab-ctl set-geo-primary-node
- ```
-
- This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
-
-### Configure the external database to be replicated
-
-To set up an external database, you can either:
-
-- Set up streaming replication yourself (for example, in AWS RDS).
-- Perform the Omnibus configuration manually as follows.
-
-#### Leverage your cloud provider's tools to replicate the primary database
-
-Given you have a primary node set up on AWS EC2 that uses RDS.
-You can now just create a read-only replica in a different region and the
-replication process will be managed by AWS. Make sure you've set Network ACL, Subnet, and
-Security Group according to your needs, so the secondary application node can access the database.
-
-The following instructions detail how to create a read-only replica for common
-cloud providers:
-
-- Amazon RDS - [Creating a Read Replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html#USER_ReadRepl.Create)
-- Azure Database for PostgreSQL - [Create and manage read replicas in Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/howto-read-replicas-portal)
-
-Once your read-only replica is set up, you can skip to [configure you secondary application node](#configure-secondary-application-nodes-to-use-the-external-read-replica).
-
-#### Manually configure the primary database for replication
-
-The [`geo_primary_role`](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
-configures the **primary** node's database to be replicated by making changes to
-`pg_hba.conf` and `postgresql.conf`. Make the following configuration changes
-manually to your external database configuration and ensure that you restart PostgreSQL
-afterwards for the changes to take effect:
-
-```plaintext
-##
-## Geo Primary Role
-## - pg_hba.conf
-##
-host all all <trusted primary IP>/32 md5
-host replication gitlab_replicator <trusted primary IP>/32 md5
-host all all <trusted secondary IP>/32 md5
-host replication gitlab_replicator <trusted secondary IP>/32 md5
-```
-
-```plaintext
-##
-## Geo Primary Role
-## - postgresql.conf
-##
-wal_level = hot_standby
-max_wal_senders = 10
-wal_keep_segments = 50
-max_replication_slots = 1 # number of secondary instances
-hot_standby = on
-```
-
-## **Secondary** nodes
-
-### Manually configure the replica database
-
-Make the following configuration changes manually to your `pg_hba.conf` and `postgresql.conf`
-of your external replica database and ensure that you restart PostgreSQL afterwards
-for the changes to take effect:
-
-```plaintext
-##
-## Geo Secondary Role
-## - pg_hba.conf
-##
-host all all <trusted secondary IP>/32 md5
-host replication gitlab_replicator <trusted secondary IP>/32 md5
-host all all <trusted primary IP>/24 md5
-```
-
-```plaintext
-##
-## Geo Secondary Role
-## - postgresql.conf
-##
-wal_level = hot_standby
-max_wal_senders = 10
-wal_keep_segments = 10
-hot_standby = on
-```
-
-### Configure **secondary** application nodes to use the external read-replica
-
-With Omnibus, the
-[`geo_secondary_role`](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
-has three main functions:
-
-1. Configure the replica database.
-1. Configure the tracking database.
-1. Enable the [Geo Log Cursor](index.md#geo-log-cursor) (not covered in this section).
-
-To configure the connection to the external read-replica database and enable Log Cursor:
-
-1. SSH into a GitLab **secondary** application server and login as root:
-
- ```shell
- sudo -i
- ```
-
-1. Edit `/etc/gitlab/gitlab.rb` and add the following
-
- ```ruby
- ##
- ## Geo Secondary role
- ## - configure dependent flags automatically to enable Geo
- ##
- roles ['geo_secondary_role']
-
- # note this is shared between both databases,
- # make sure you define the same password in both
- gitlab_rails['db_password'] = '<your_password_here>'
-
- gitlab_rails['db_username'] = 'gitlab'
- gitlab_rails['db_host'] = '<database_read_replica_host>'
-
- # Disable the bundled Omnibus PostgreSQL, since we are
- # using an external PostgreSQL
- postgresql['enable'] = false
- ```
-
-1. Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure)
-
-### Configure the tracking database
-
-**Secondary** nodes use a separate PostgreSQL installation as a tracking
-database to keep track of replication status and automatically recover from
-potential replication issues. Omnibus automatically configures a tracking database
-when `roles ['geo_secondary_role']` is set.
-If you want to run this database external to Omnibus, please follow the instructions below.
-
-If you are using a cloud-managed service for the tracking database, you may need
-to grant additional roles to your tracking database user (by default, this is
-`gitlab_geo`):
-
-- Amazon RDS requires the [`rds_superuser`](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html#Appendix.PostgreSQL.CommonDBATasks.Roles) role.
-- Azure Database for PostgreSQL requires the [`azure_pg_admin`](https://docs.microsoft.com/en-us/azure/postgresql/howto-create-users#how-to-create-additional-admin-users-in-azure-database-for-postgresql) role.
-
-If you have an external database ready to be used as the tracking database,
-follow the instructions below to use it:
-
-NOTE: **Note:**
-If you want to use AWS RDS as a tracking database, make sure it has access to
-the secondary database. Unfortunately, just assigning the same security group is not enough as
-outbound rules do not apply to RDS PostgreSQL databases. Therefore, you need to explicitly add an inbound
-rule to the read-replica's security group allowing any TCP traffic from
-the tracking database on port 5432.
-
-1. Ensure that your secondary node can communicate with your tracking database by
- manually changing the `pg_hba.conf` that is associated with your tracking database.
- Remember to restart PostgreSQL afterwards for the changes to take effect:
-
- ```plaintext
- ##
- ## Geo Tracking Database Role
- ## - pg_hba.conf
- ##
- host all all <trusted tracking IP>/32 md5
- host all all <trusted secondary IP>/32 md5
- ```
-
-1. SSH into a GitLab **secondary** server and login as root:
-
- ```shell
- sudo -i
- ```
-
-1. Edit `/etc/gitlab/gitlab.rb` with the connection parameters and credentials for
- the machine with the PostgreSQL instance:
-
- ```ruby
- geo_secondary['db_username'] = 'gitlab_geo'
- geo_secondary['db_password'] = '<your_password_here>'
-
- geo_secondary['db_host'] = '<tracking_database_host>'
- geo_secondary['db_port'] = <tracking_database_port> # change to the correct port
- geo_postgresql['enable'] = false # don't use internal managed instance
- ```
-
-1. Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure)
-
-1. Run the tracking database migrations:
-
- ```shell
- gitlab-rake geo:db:create
- gitlab-rake geo:db:migrate
- ```
+This document was moved to [another location](../setup/external_database.md).
diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md
index 522ad32c352..3892d73b465 100644
--- a/doc/administration/geo/replication/faq.md
+++ b/doc/administration/geo/replication/faq.md
@@ -9,7 +9,7 @@ type: howto
## What are the minimum requirements to run Geo?
-The requirements are listed [on the index page](index.md#requirements-for-running-geo)
+The requirements are listed [on the index page](../index.md#requirements-for-running-geo)
## How does Geo know which projects to sync?
diff --git a/doc/administration/geo/replication/geo_validation_tests.md b/doc/administration/geo/replication/geo_validation_tests.md
index 323f6f367b1..8247b8c6336 100644
--- a/doc/administration/geo/replication/geo_validation_tests.md
+++ b/doc/administration/geo/replication/geo_validation_tests.md
@@ -158,3 +158,15 @@ The following are PostgreSQL upgrade validation tests we performed.
- [`gitlab-ctl` reconfigure fails on Redis node in multi-node Geo setup](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4706).
- [Geo multi-node upgrade from 12.0.9 to 12.1.9 does not upgrade PostgreSQL](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4705).
- [Refresh foreign tables fails on app server in multi-node setup after upgrade to 12.1.9](https://gitlab.com/gitlab-org/gitlab/-/issues/32119).
+
+## Other tests
+
+The following are additional validation tests we performed.
+
+### August 2020
+
+[Test Gitaly Cluster on a Geo Deployment](https://gitlab.com/gitlab-org/gitlab/-/issues/223210):
+
+- Description: Tested a Geo deployment with Gitaly clusters configured on both the primary and secondary Geo sites. Triggered automatic Gitaly cluster failover on the primary Geo site, and ran end-to-end Geo tests. Then triggered Gitaly cluster failover on the secondary Geo site, and re-ran the end-to-end Geo tests.
+
+- Outcome: Successful end-to-end tests before and after Gitaly cluster failover on the primary site, and before and after Gitaly cluster failover on the secondary site.
diff --git a/doc/administration/geo/replication/index.md b/doc/administration/geo/replication/index.md
index f1cc9f0df8b..5e9dd73ecc5 100644
--- a/doc/administration/geo/replication/index.md
+++ b/doc/administration/geo/replication/index.md
@@ -1,316 +1,5 @@
---
-stage: Enablement
-group: Geo
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
-type: howto
+redirect_to: '../index.md'
---
-# Replication (Geo) **(PREMIUM ONLY)**
-
-> - Introduced in GitLab Enterprise Edition 8.9.
-> - Using Geo in combination with
-> [multi-node architectures](../../reference_architectures/index.md)
-> is considered **Generally Available** (GA) in
-> [GitLab Premium](https://about.gitlab.com/pricing/) 10.4.
-
-Replication with Geo is the solution for widely distributed development teams.
-
-## Overview
-
-Fetching large repositories can take a long time for teams located far from a single GitLab instance.
-
-Geo provides local, read-only instances of your GitLab instances. This can reduce the time it takes
-to clone and fetch large repositories, speeding up development.
-
-NOTE: **Note:**
-Check the [requirements](#requirements-for-running-geo) carefully before setting up Geo.
-
-For a video introduction to Geo, see [Introduction to GitLab Geo - GitLab Features](https://www.youtube.com/watch?v=-HDLxSjEh6w).
-
-CAUTION: **Caution:**
-Geo undergoes significant changes from release to release. Upgrades **are** supported and [documented](#updating-geo), but you should ensure that you're using the right version of the documentation for your installation.
-
-To make sure you're using the right version of the documentation, navigate to [the source version of this page on GitLab.com](https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/geo/replication/index.md) and choose the appropriate release from the **Switch branch/tag** dropdown. For example, [`v11.2.3-ee`](https://gitlab.com/gitlab-org/gitlab/blob/v11.2.3-ee/doc/administration/geo/replication/index.md).
-
-## Use cases
-
-Implementing Geo provides the following benefits:
-
-- Reduce from minutes to seconds the time taken for your distributed developers to clone and fetch large repositories and projects.
-- Enable all of your developers to contribute ideas and work in parallel, no matter where they are.
-- Balance the read-only load between your **primary** and **secondary** nodes.
-
-In addition, it:
-
-- Can be used for cloning and fetching projects, in addition to reading any data available in the GitLab web interface (see [current limitations](#current-limitations)).
-- Overcomes slow connections between distant offices, saving time by improving speed for distributed teams.
-- Helps reducing the loading time for automated tasks, custom integrations, and internal workflows.
-- Can quickly fail over to a **secondary** node in a [disaster recovery](../disaster_recovery/index.md) scenario.
-- Allows [planned failover](../disaster_recovery/planned_failover.md) to a **secondary** node.
-
-Geo provides:
-
-- Read-only **secondary** nodes: Maintain one **primary** GitLab node while still enabling read-only **secondary** nodes for each of your distributed teams.
-- Authentication system hooks: **Secondary** nodes receives all authentication data (like user accounts and logins) from the **primary** instance.
-- An intuitive UI: **Secondary** nodes utilize the same web interface your team has grown accustomed to. In addition, there are visual notifications that block write operations and make it clear that a user is on a **secondary** node.
-
-## How it works
-
-Your Geo instance can be used for cloning and fetching projects, in addition to reading any data. This will make working with large repositories over large distances much faster.
-
-![Geo overview](img/geo_overview.png)
-
-When Geo is enabled, the:
-
-- Original instance is known as the **primary** node.
-- Replicated read-only nodes are known as **secondary** nodes.
-
-Keep in mind that:
-
-- **Secondary** nodes talk to the **primary** node to:
- - Get user data for logins (API).
- - Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT).
-- Since GitLab Premium 10.0, the **primary** node no longer talks to **secondary** nodes to notify for changes (API).
-- Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
-- There are [limitations](#current-limitations) in the current implementation.
-
-### Architecture
-
-The following diagram illustrates the underlying architecture of Geo.
-
-![Geo architecture](img/geo_architecture.png)
-
-In this diagram:
-
-- There is the **primary** node and the details of one **secondary** node.
-- Writes to the database can only be performed on the **primary** node. A **secondary** node receives database
- updates via PostgreSQL streaming replication.
-- If present, the [LDAP server](#ldap) should be configured to replicate for [Disaster Recovery](../disaster_recovery/index.md) scenarios.
-- A **secondary** node performs different type of synchronizations against the **primary** node, using a special
- authorization protected by JWT:
- - Repositories are cloned/updated via Git over HTTPS.
- - Attachments, LFS objects, and other files are downloaded via HTTPS using a private API endpoint.
-
-From the perspective of a user performing Git operations:
-
-- The **primary** node behaves as a full read-write GitLab instance.
-- **Secondary** nodes are read-only but proxy Git push operations to the **primary** node. This makes **secondary** nodes appear to support push operations themselves.
-
-To simplify the diagram, some necessary components are omitted. Note that:
-
-- Git over SSH requires [`gitlab-shell`](https://gitlab.com/gitlab-org/gitlab-shell) and OpenSSH.
-- Git over HTTPS required [`gitlab-workhorse`](https://gitlab.com/gitlab-org/gitlab-workhorse).
-
-Note that a **secondary** node needs two different PostgreSQL databases:
-
-- A read-only database instance that streams data from the main GitLab database.
-- [Another database instance](#geo-tracking-database) used internally by the **secondary** node to record what data has been replicated.
-
-In **secondary** nodes, there is an additional daemon: [Geo Log Cursor](#geo-log-cursor).
-
-## Requirements for running Geo
-
-The following are required to run Geo:
-
-- An operating system that supports OpenSSH 6.9+ (needed for
- [fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md))
- The following operating systems are known to ship with a current version of OpenSSH:
- - [CentOS](https://www.centos.org) 7.4+
- - [Ubuntu](https://ubuntu.com) 16.04+
-- PostgreSQL 11+ with [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
-- Git 2.9+
-- All nodes must run the same GitLab version.
-
-Additionally, check GitLab's [minimum requirements](../../../install/requirements.md),
-and we recommend you use:
-
-- At least GitLab Enterprise Edition 10.0 for basic Geo features.
-- The latest version for a better experience.
-
-### Firewall rules
-
-The following table lists basic ports that must be open between the **primary** and **secondary** nodes for Geo.
-
-| **Primary** node | **Secondary** node | Protocol |
-|:-----------------|:-------------------|:-------------|
-| 80 | 80 | HTTP |
-| 443 | 443 | TCP or HTTPS |
-| 22 | 22 | TCP |
-| 5432 | | PostgreSQL |
-
-See the full list of ports used by GitLab in [Package defaults](https://docs.gitlab.com/omnibus/package-information/defaults.html)
-
-NOTE: **Note:**
-[Web terminal](../../../ci/environments/index.md#web-terminals) support requires your load balancer to correctly handle WebSocket connections.
-When using HTTP or HTTPS proxying, your load balancer must be configured to pass through the `Connection` and `Upgrade` hop-by-hop headers. See the [web terminal](../../integration/terminal.md) integration guide for more details.
-
-NOTE: **Note:**
-When using HTTPS protocol for port 443, you will need to add an SSL certificate to the load balancers.
-If you wish to terminate SSL at the GitLab application server instead, use TCP protocol.
-
-### LDAP
-
-We recommend that if you use LDAP on your **primary** node, you also set up secondary LDAP servers on each **secondary** node. Otherwise, users will not be able to perform Git operations over HTTP(s) on the **secondary** node using HTTP Basic Authentication. However, Git via SSH and personal access tokens will still work.
-
-NOTE: **Note:**
-It is possible for all **secondary** nodes to share an LDAP server, but additional latency can be an issue. Also, consider what LDAP server will be available in a [disaster recovery](../disaster_recovery/index.md) scenario if a **secondary** node is promoted to be a **primary** node.
-
-Check for instructions on how to set up replication in your LDAP service. Instructions will be different depending on the software or service used. For example, OpenLDAP provides [these instructions](https://www.openldap.org/doc/admin24/replication.html).
-
-### Geo Tracking Database
-
-The tracking database instance is used as metadata to control what needs to be updated on the disk of the local instance. For example:
-
-- Download new assets.
-- Fetch new LFS Objects.
-- Fetch changes from a repository that has recently been updated.
-
-Because the replicated database instance is read-only, we need this additional database instance for each **secondary** node.
-
-### Geo Log Cursor
-
-This daemon:
-
-- Reads a log of events replicated by the **primary** node to the **secondary** database instance.
-- Updates the Geo Tracking Database instance with changes that need to be executed.
-
-When something is marked to be updated in the tracking database instance, asynchronous jobs running on the **secondary** node will execute the required operations and update the state.
-
-This new architecture allows GitLab to be resilient to connectivity issues between the nodes. It doesn't matter how long the **secondary** node is disconnected from the **primary** node as it will be able to replay all the events in the correct order and become synchronized with the **primary** node again.
-
-## Setup instructions
-
-These instructions assume you have a working instance of GitLab. They guide you through:
-
-1. Making your existing instance the **primary** node.
-1. Adding **secondary** nodes.
-
-CAUTION: **Caution:**
-The steps below should be followed in the order they appear. **Make sure the GitLab version is the same on all nodes.**
-
-### Using Omnibus GitLab
-
-If you installed GitLab using the Omnibus packages (highly recommended):
-
-1. [Install GitLab Enterprise Edition](https://about.gitlab.com/install/) on the server that will serve as the **secondary** node. Do not create an account or log in to the new **secondary** node.
-1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** node to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher.
-1. [Set up the database replication](database.md) (`primary (read-write) <-> secondary (read-only)` topology).
-1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** nodes.
-1. [Configure GitLab](configuration.md) to set the **primary** and **secondary** nodes.
-1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** node. See [notes on LDAP](#ldap).
-1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
-
-## Post-installation documentation
-
-After installing GitLab on the **secondary** nodes and performing the initial configuration, see the following documentation for post-installation information.
-
-### Configuring Geo
-
-For information on configuring Geo, see [Geo configuration](configuration.md).
-
-### Updating Geo
-
-For information on how to update your Geo nodes to the latest GitLab version, see [Updating the Geo nodes](updating_the_geo_nodes.md).
-
-### Pausing and resuming replication
-
-> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/35913) in [GitLab Premium](https://about.gitlab.com/pricing/) 13.2.
-
-In some circumstances, like during [upgrades](updating_the_geo_nodes.md) or a [planned failover](../disaster_recovery/planned_failover.md), it is desirable to pause replication between the primary and secondary.
-
-Pausing and resuming replication is done via a command line tool from the secondary node.
-
-**To Pause: (from secondary)**
-
-```shell
-gitlab-ctl geo-replication-pause
-```
-
-**To Resume: (from secondary)**
-
-```shell
-gitlab-ctl geo-replication-resume
-```
-
-### Configuring Geo for multiple nodes
-
-For information on configuring Geo for multiple nodes, see [Geo for multiple servers](multiple_servers.md).
-
-### Configuring Geo with Object Storage
-
-For information on configuring Geo with object storage, see [Geo with Object storage](object_storage.md).
-
-### Disaster Recovery
-
-For information on using Geo in disaster recovery situations to mitigate data-loss and restore services, see [Disaster Recovery](../disaster_recovery/index.md).
-
-### Replicating the Container Registry
-
-For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** node](docker_registry.md).
-
-### Security Review
-
-For more information on Geo security, see [Geo security review](security_review.md).
-
-### Tuning Geo
-
-For more information on tuning Geo, see [Tuning Geo](tuning.md).
-
-### Set up a location-aware Git URL
-
-For an example of how to set up a location-aware Git remote URL with AWS Route53, see [Location-aware Git remote URL with AWS Route53](location_aware_git_url.md).
-
-## Remove Geo node
-
-For more information on removing a Geo node, see [Removing **secondary** Geo nodes](remove_geo_node.md).
-
-## Disable Geo
-
-To find out how to disable Geo, see [Disabling Geo](disable_geo.md).
-
-## Current limitations
-
-CAUTION: **Caution:**
-This list of limitations only reflects the latest version of GitLab. If you are using an older version, extra limitations may be in place.
-
-- Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/-/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
-- Cloning, pulling, or pushing repositories that exist on the **primary** node but not on the **secondary** nodes where [selective synchronization](configuration.md#selective-synchronization) does not include the project is not supported over SSH [but support is planned](https://gitlab.com/groups/gitlab-org/-/epics/2562). HTTP(S) is supported.
-- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected. Support for the **secondary** node to use an OAuth provider independent from the primary is [being planned](https://gitlab.com/gitlab-org/gitlab/-/issues/208465).
-- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [Omnibus GitLab issue #2978](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/2978) for details.
-- Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** node.
-- [Selective synchronization](configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism.
-- Object pools for forked project deduplication work only on the **primary** node, and are duplicated on the **secondary** node.
-- [External merge request diffs](../../merge_request_diffs.md) will not be replicated if they are on-disk, and viewing merge requests will fail. However, external MR diffs in object storage **are** supported. The default configuration (in-database) does work.
-- GitLab Runners cannot register with a **secondary** node. Support for this is [planned for the future](https://gitlab.com/gitlab-org/gitlab/-/issues/3294).
-
-### Limitations on replication/verification
-
-You can keep track of the progress to implement the missing items in
-these epics/issues:
-
-- [Unreplicated Data Types](https://gitlab.com/groups/gitlab-org/-/epics/893)
-- [Verify all replicated data](https://gitlab.com/groups/gitlab-org/-/epics/1430)
-
-There is a complete list of all GitLab [data types](datatypes.md) and [existing support for replication and verification](datatypes.md#limitations-on-replicationverification).
-
-## Frequently Asked Questions
-
-For answers to common questions, see the [Geo FAQ](faq.md).
-
-## Log files
-
-Since GitLab 9.5, Geo stores structured log messages in a `geo.log` file. For Omnibus installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`.
-
-This file contains information about when Geo attempts to sync repositories and files. Each line in the file contains a separate JSON entry that can be ingested into. For example, Elasticsearch or Splunk.
-
-For example:
-
-```json
-{"severity":"INFO","time":"2017-08-06T05:40:16.104Z","message":"Repository update","project_id":1,"source":"repository","resync_repository":true,"resync_wiki":true,"class":"Gitlab::Geo::LogCursor::Daemon","cursor_delay_s":0.038}
-```
-
-This message shows that Geo detected that a repository update was needed for project `1`.
-
-## Troubleshooting
-
-For troubleshooting steps, see [Geo Troubleshooting](troubleshooting.md).
+This document was moved to [another location](../index.md).
diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md
index 8b086e3ff5f..ddcdea736e7 100644
--- a/doc/administration/geo/replication/location_aware_git_url.md
+++ b/doc/administration/geo/replication/location_aware_git_url.md
@@ -44,7 +44,7 @@ In any case, you require:
- A Route53 Hosted Zone managing your domain.
If you have not yet setup a Geo **primary** node and **secondary** node, please consult
-[the Geo setup instructions](index.md#setup-instructions).
+[the Geo setup instructions](../index.md#setup-instructions).
## Create a traffic policy
diff --git a/doc/administration/geo/replication/multiple_servers.md b/doc/administration/geo/replication/multiple_servers.md
index 31d1ea2cd2b..cba41c375a3 100644
--- a/doc/administration/geo/replication/multiple_servers.md
+++ b/doc/administration/geo/replication/multiple_servers.md
@@ -147,7 +147,7 @@ The following documentation assumes the database will be run on
a single node only. Multi-node PostgreSQL on **secondary** nodes is
[not currently supported](https://gitlab.com/groups/gitlab-org/-/epics/2536).
-Configure the [**secondary** database](database.md) as a read-only replica of
+Configure the [**secondary** database](../setup/database.md) as a read-only replica of
the **primary** database. Use the following as a guide.
1. Generate an MD5 hash of the desired password for the database user that the
@@ -222,7 +222,7 @@ the **primary** database. Use the following as a guide.
After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect.
If using an external PostgreSQL instance, refer also to
-[Geo with external PostgreSQL instances](external_database.md).
+[Geo with external PostgreSQL instances](../setup/external_database.md).
### Step 3: Configure the tracking database on the **secondary** node
@@ -294,7 +294,7 @@ Configure the tracking database.
After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect.
If using an external PostgreSQL instance, refer also to
-[Geo with external PostgreSQL instances](external_database.md).
+[Geo with external PostgreSQL instances](../setup/external_database.md).
### Step 4: Configure the frontend application servers on the **secondary** node
diff --git a/doc/administration/geo/replication/object_storage.md b/doc/administration/geo/replication/object_storage.md
index 159e2524198..642d31f6298 100644
--- a/doc/administration/geo/replication/object_storage.md
+++ b/doc/administration/geo/replication/object_storage.md
@@ -26,7 +26,7 @@ To have:
> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/10586) in GitLab 12.4.
CAUTION: **Caution:**
-This is a [**beta** feature](https://about.gitlab.com/handbook/product/#beta) and is not ready yet for production use at any scale.
+This is a [**beta** feature](https://about.gitlab.com/handbook/product/#beta) and is not ready yet for production use at any scale. The main limitations are a lack of testing at scale and no verification of any replicated data.
**Secondary** nodes can replicate files stored on the **primary** node regardless of
whether they are stored on the local filesystem or in object storage.
@@ -44,7 +44,7 @@ For LFS, follow the documentation to
For CI job artifacts, there is similar documentation to configure
[jobs artifact object storage](../../job_artifacts.md#using-object-storage)
-For user uploads, there is similar documentation to configure [upload object storage](../../uploads.md#using-object-storage-core-only)
+For user uploads, there is similar documentation to configure [upload object storage](../../uploads.md#using-object-storage)
If you want to migrate the **primary** node's files to object storage, you can
configure the **secondary** in a few ways:
diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md
index f5edf79c6e4..28b8c8a9d51 100644
--- a/doc/administration/geo/replication/security_review.md
+++ b/doc/administration/geo/replication/security_review.md
@@ -37,7 +37,7 @@ from [owasp.org](https://owasp.org/).
private projects. Geo replicates them all indiscriminately. “Selective sync”
exists for files and repositories (but not database content), which would permit
only less-sensitive projects to be replicated to a **secondary** node if desired.
-- See also: [GitLab data classification policy](https://about.gitlab.com/handbook/engineering/security/data-classification-policy.html).
+- See also: [GitLab data classification policy](https://about.gitlab.com/handbook/engineering/security/data-classification-standard.html).
### What data backup and retention requirements have been defined for the application?
@@ -123,7 +123,7 @@ from [owasp.org](https://owasp.org/).
- Geo imposes no additional restrictions on operating system (see the
[GitLab installation](https://about.gitlab.com/install/) page for more
- details), however we recommend using the operating systems listed in the [Geo documentation](index.md#requirements-for-running-geo).
+ details), however we recommend using the operating systems listed in the [Geo documentation](../index.md#requirements-for-running-geo).
### What details regarding required OS components and lock‐down needs have been defined?
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index b8172322c10..f6d6f39fb19 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -251,7 +251,7 @@ sudo gitlab-rake gitlab:geo:check
When performing a PostgreSQL major version (9 > 10) update this is expected. Follow:
- - [initiate-the-replication-process](database.md#step-3-initiate-the-replication-process)
+ - [initiate-the-replication-process](../setup/database.md#step-3-initiate-the-replication-process)
## Fixing replication errors
@@ -268,7 +268,7 @@ default to 1. You may need to increase this value if you have more
Be sure to restart PostgreSQL for this to take
effect. See the [PostgreSQL replication
-setup](database.md#postgresql-replication) guide for more details.
+setup](../setup/database.md#postgresql-replication) guide for more details.
### Message: `FATAL: could not start WAL streaming: ERROR: replication slot "geo_secondary_my_domain_com" does not exist`?
@@ -276,11 +276,11 @@ This occurs when PostgreSQL does not have a replication slot for the
**secondary** node by that name.
You may want to rerun the [replication
-process](database.md) on the **secondary** node .
+process](../setup/database.md) on the **secondary** node .
### Message: "Command exceeded allowed execution time" when setting up replication?
-This may happen while [initiating the replication process](database.md#step-3-initiate-the-replication-process) on the **secondary** node,
+This may happen while [initiating the replication process](../setup/database.md#step-3-initiate-the-replication-process) on the **secondary** node,
and indicates that your initial dataset is too large to be replicated in the default timeout (30 minutes).
Re-run `gitlab-ctl replicate-geo-database`, but include a larger value for
@@ -603,9 +603,9 @@ or `gitlab-ctl promote-to-primary-node`, either:
bug](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/22021) was
fixed.
-If the above does not work, another possible reason is that you have paused replication
-from the original primary node before attempting to promote this node.
+### Message: ActiveRecord::RecordInvalid: Validation failed: Enabled Geo primary node cannot be disabled
+This error may occur if you have paused replication from the original primary node before attempting to promote this node.
To double check this, you can do the following:
- Get the current secondary node's ID using:
@@ -632,6 +632,23 @@ To double check this, you can do the following:
UPDATE geo_nodes SET enabled = 't' WHERE id = ID_FROM_ABOVE;
```
+### While Promoting the secondary, I got an error `ActiveRecord::RecordInvalid`
+
+If you disabled a secondary node, either with the [replication pause task](../index.md#pausing-and-resuming-replication)
+(13.2) or via the UI (13.1 and earlier), you must first re-enable the
+node before you can continue. This is fixed in 13.4.
+
+From `gitlab-psql`, execute the following, replacing `<your secondary url>`
+with the URL for your secondary server starting with `http` or `https` and ending with a `/`.
+
+```shell
+SECONDARY_URL="https://<secondary url>/"
+DATABASE_NAME="gitlabhq_production"
+sudo gitlab-psql -d "$DATABASE_NAME" -c "UPDATE geo_nodes SET enabled = true WHERE url = '$SECONDARY_URL';"
+```
+
+This should update 1 row.
+
### Message: ``NoMethodError: undefined method `secondary?' for nil:NilClass``
When [promoting a **secondary** node](../disaster_recovery/index.md#step-3-promoting-a-secondary-node),
@@ -674,6 +691,20 @@ sudo /opt/gitlab/embedded/bin/gitlab-pg-ctl promote
GitLab 12.9 and later are [unaffected by this error](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5147).
+### Two-factor authentication is broken after a failover
+
+The setup instructions for Geo prior to 10.5 failed to replicate the
+`otp_key_base` secret, which is used to encrypt the two-factor authentication
+secrets stored in the database. If it differs between **primary** and **secondary**
+nodes, users with two-factor authentication enabled won't be able to log in
+after a failover.
+
+If you still have access to the old **primary** node, you can follow the
+instructions in the
+[Upgrading to GitLab 10.5](../replication/version_specific_updates.md#updating-to-gitlab-105)
+section to resolve the error. Otherwise, the secret is lost and you'll need to
+[reset two-factor authentication for all users](../../../security/two_factor_authentication.md#disabling-2fa-for-everyone).
+
## Expired artifacts
If you notice for some reason there are more artifacts on the Geo
@@ -723,7 +754,7 @@ This error refers to a problem with the database replica on a **secondary** node
which Geo expects to have access to. It usually means, either:
- An unsupported replication method was used (for example, logical replication).
-- The instructions to setup a [Geo database replication](database.md) were not followed correctly.
+- The instructions to setup a [Geo database replication](../setup/database.md) were not followed correctly.
- Your database connection details are incorrect, that is you have specified the wrong
user in your `/etc/gitlab/gitlab.rb` file.
@@ -743,7 +774,7 @@ The most common problems that prevent the database from replicating correctly ar
- Database replication slot is misconfigured.
- Database is not using a replication slot or another alternative and cannot catch-up because WAL files were purged.
-Make sure you follow the [Geo database replication](database.md) instructions for supported configuration.
+Make sure you follow the [Geo database replication](../setup/database.md) instructions for supported configuration.
### Geo database version (...) does not match latest migration (...)
diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md
index c0b1bc6688c..b78aeb06ebf 100644
--- a/doc/administration/geo/replication/updating_the_geo_nodes.md
+++ b/doc/administration/geo/replication/updating_the_geo_nodes.md
@@ -7,33 +7,15 @@ type: howto
# Updating the Geo nodes **(PREMIUM ONLY)**
+CAUTION: **Warning:**
+Please ensure you read these sections carefully before updating your Geo nodes! Not following version-specific update steps may result in unexpected downtime. Please [contact support](https://about.gitlab.com/support/#contact-support) if you have any specific questions.
+
Updating Geo nodes involves performing:
-1. [Version-specific update steps](#version-specific-update-steps), depending on the
+1. [Version-specific update steps](version_specific_updates.md), depending on the
version being updated to or from.
1. [General update steps](#general-update-steps), for all updates.
-## Version specific update steps
-
-Depending on which version of Geo you are updating to/from, there may be
-different steps.
-
-- [Updating to GitLab 12.9](version_specific_updates.md#updating-to-gitlab-129)
-- [Updating to GitLab 12.7](version_specific_updates.md#updating-to-gitlab-127)
-- [Updating to GitLab 12.2](version_specific_updates.md#updating-to-gitlab-122)
-- [Updating to GitLab 12.1](version_specific_updates.md#updating-to-gitlab-121)
-- [Updating to GitLab 12.0](version_specific_updates.md#updating-to-gitlab-120)
-- [Updating to GitLab 11.11](version_specific_updates.md#updating-to-gitlab-1111)
-- [Updating to GitLab 10.8](version_specific_updates.md#updating-to-gitlab-108)
-- [Updating to GitLab 10.6](version_specific_updates.md#updating-to-gitlab-106)
-- [Updating to GitLab 10.5](version_specific_updates.md#updating-to-gitlab-105)
-- [Updating to GitLab 10.3](version_specific_updates.md#updating-to-gitlab-103)
-- [Updating to GitLab 10.2](version_specific_updates.md#updating-to-gitlab-102)
-- [Updating to GitLab 10.1](version_specific_updates.md#updating-to-gitlab-101)
-- [Updating to GitLab 10.0](version_specific_updates.md#updating-to-gitlab-100)
-- [Updating from GitLab 9.3 or older](version_specific_updates.md#updating-from-gitlab-93-or-older)
-- [Updating to GitLab 9.0](version_specific_updates.md#updating-to-gitlab-90)
-
## General update steps
NOTE: **Note:**
@@ -42,12 +24,12 @@ These general update steps are not intended for [high-availability deployments](
To update the Geo nodes when a new GitLab version is released, update **primary**
and all **secondary** nodes:
-1. **Optional:** [Pause replication on each **secondary** node.](./index.md#pausing-and-resuming-replication)
+1. **Optional:** [Pause replication on each **secondary** node.](../index.md#pausing-and-resuming-replication)
1. Log into the **primary** node.
1. [Update GitLab on the **primary** node using Omnibus](https://docs.gitlab.com/omnibus/update/README.html).
1. Log into each **secondary** node.
1. [Update GitLab on each **secondary** node using Omnibus](https://docs.gitlab.com/omnibus/update/README.html).
-1. If you paused replication in step 1, [resume replication on each **secondary**](./index.md#pausing-and-resuming-replication)
+1. If you paused replication in step 1, [resume replication on each **secondary**](../index.md#pausing-and-resuming-replication)
1. [Test](#check-status-after-updating) **primary** and **secondary** nodes, and check version in each.
### Check status after updating
diff --git a/doc/administration/geo/replication/using_a_geo_server.md b/doc/administration/geo/replication/using_a_geo_server.md
index 3f2895f1c71..aeed5a44b30 100644
--- a/doc/administration/geo/replication/using_a_geo_server.md
+++ b/doc/administration/geo/replication/using_a_geo_server.md
@@ -9,7 +9,7 @@ type: howto
# Using a Geo Server **(PREMIUM ONLY)**
-After you set up the [database replication and configure the Geo nodes](index.md#setup-instructions), use your closest GitLab node as you would a normal standalone GitLab instance.
+After you set up the [database replication and configure the Geo nodes](../index.md#setup-instructions), use your closest GitLab node as you would a normal standalone GitLab instance.
Pushing directly to a **secondary** node (for both HTTP, SSH including Git LFS) was [introduced](https://about.gitlab.com/releases/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md
index 900d09bdd34..1ae246e3e61 100644
--- a/doc/administration/geo/replication/version_specific_updates.md
+++ b/doc/administration/geo/replication/version_specific_updates.md
@@ -314,7 +314,7 @@ sudo gitlab-ctl reconfigure
```
If you do not perform this step, you may find that two-factor authentication
-[is broken following DR](../disaster_recovery/index.md#i-followed-the-disaster-recovery-instructions-and-now-two-factor-auth-is-broken).
+[is broken following DR](troubleshooting.md#two-factor-authentication-is-broken-after-a-failover).
To prevent SSH requests to the newly promoted **primary** node from failing
due to SSH host key mismatch when updating the **primary** node domain's DNS record
@@ -343,7 +343,7 @@ Support for TLS-secured PostgreSQL replication has been added. If you are
currently using PostgreSQL replication across the open internet without an
external means of securing the connection (e.g., a site-to-site VPN), then you
should immediately reconfigure your **primary** and **secondary** PostgreSQL instances
-according to the [updated instructions](database.md).
+according to the [updated instructions](../setup/database.md).
If you *are* securing the connections externally and wish to continue doing so,
ensure you include the new option `--sslmode=prefer` in future invocations of
@@ -441,7 +441,7 @@ Omnibus is the following:
1. Check the steps about defining `postgresql['sql_user_password']`, `gitlab_rails['db_password']`.
1. Make sure `postgresql['max_replication_slots']` matches the number of **secondary** Geo nodes locations.
1. Install GitLab on the **secondary** server.
-1. Re-run the [database replication process](database.md#step-3-initiate-the-replication-process).
+1. Re-run the [database replication process](../setup/database.md#step-3-initiate-the-replication-process).
## Updating to GitLab 9.0
diff --git a/doc/administration/geo/setup/database.md b/doc/administration/geo/setup/database.md
new file mode 100644
index 00000000000..aefa8a0e399
--- /dev/null
+++ b/doc/administration/geo/setup/database.md
@@ -0,0 +1,473 @@
+---
+stage: Enablement
+group: Geo
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+type: howto
+---
+
+# Geo database replication **(PREMIUM ONLY)**
+
+NOTE: **Note:**
+If your GitLab installation uses external (not managed by Omnibus) PostgreSQL
+instances, the Omnibus roles will not be able to perform all necessary
+configuration steps. In this case,
+[follow the Geo with external PostgreSQL instances document instead](external_database.md).
+
+NOTE: **Note:**
+The stages of the setup process must be completed in the documented order.
+Before attempting the steps in this stage, [complete all prior stages](../setup/index.md#using-omnibus-gitlab).
+
+This document describes the minimal steps you have to take in order to
+replicate your **primary** GitLab database to a **secondary** node's database. You may
+have to change some values according to your database setup, how big it is, etc.
+
+You are encouraged to first read through all the steps before executing them
+in your testing/production environment.
+
+## PostgreSQL replication
+
+The GitLab **primary** node where the write operations happen will connect to
+the **primary** database server, and **secondary** nodes will
+connect to their own database servers (which are also read-only).
+
+We recommend using [PostgreSQL replication slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75)
+to ensure that the **primary** node retains all the data necessary for the **secondary** nodes to
+recover. See below for more details.
+
+The following guide assumes that:
+
+- You are using Omnibus and therefore you are using PostgreSQL 11 or later
+ which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/11/app-pgbasebackup.html).
+- You have a **primary** node already set up (the GitLab server you are
+ replicating from), running Omnibus' PostgreSQL (or equivalent version), and
+ you have a new **secondary** server set up with the same versions of the OS,
+ PostgreSQL, and GitLab on all nodes.
+
+CAUTION: **Warning:**
+Geo works with streaming replication. Logical replication is not supported at this time.
+There is an [issue where support is being discussed](https://gitlab.com/gitlab-org/gitlab/-/issues/7420).
+
+### Step 1. Configure the **primary** server
+
+1. SSH into your GitLab **primary** server and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your node:
+
+ ```ruby
+ # The unique identifier for the Geo node.
+ gitlab_rails['geo_node_name'] = '<node_name_here>'
+ ```
+
+1. Reconfigure the **primary** node for the change to take effect:
+
+ ```shell
+ gitlab-ctl reconfigure
+ ```
+
+1. Execute the command below to define the node as **primary** node:
+
+ ```shell
+ gitlab-ctl set-geo-primary-node
+ ```
+
+ This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
+
+1. GitLab 10.4 and up only: Do the following to make sure the `gitlab` database user has a password defined:
+
+ NOTE: **Note:**
+ Until FDW settings are removed in GitLab version 14.0, avoid using single or double quotes in the
+ password for PostgreSQL as that will lead to errors when reconfiguring.
+
+ Generate a MD5 hash of the desired password:
+
+ ```shell
+ gitlab-ctl pg-password-md5 gitlab
+ # Enter password: <your_password_here>
+ # Confirm password: <your_password_here>
+ # fca0b89a972d69f00eb3ec98a5838484
+ ```
+
+ Edit `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab`
+ postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
+
+ # Every node that runs Puma or Sidekiq needs to have the database
+ # password specified as below. If you have a high-availability setup, this
+ # must be present in all application nodes.
+ gitlab_rails['db_password'] = '<your_password_here>'
+ ```
+
+1. Omnibus GitLab already has a [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication)
+ called `gitlab_replicator`. You must set the password for this user manually.
+ You will be prompted to enter a password:
+
+ ```shell
+ gitlab-ctl set-replication-password
+ ```
+
+ This command will also read the `postgresql['sql_replication_user']` Omnibus
+ setting in case you have changed `gitlab_replicator` username to something
+ else.
+
+ If you are using an external database not managed by Omnibus GitLab, you need
+ to create the replicator user and define a password to it manually:
+
+ ```sql
+ --- Create a new user 'replicator'
+ CREATE USER gitlab_replicator;
+
+ --- Set/change a password and grants replication privilege
+ ALTER USER gitlab_replicator WITH REPLICATION ENCRYPTED PASSWORD '<replication_password>';
+ ```
+
+1. Configure PostgreSQL to listen on network interfaces:
+
+ For security reasons, PostgreSQL does not listen on any network interfaces
+ by default. However, Geo requires the **secondary** node to be able to
+ connect to the **primary** node's database. For this reason, we need the address of
+ each node.
+
+ NOTE: **Note:**
+ For external PostgreSQL instances, see [additional instructions](external_database.md).
+
+ If you are using a cloud provider, you can lookup the addresses for each
+ Geo node through your cloud provider's management console.
+
+ To lookup the address of a Geo node, SSH in to the Geo node and execute:
+
+ ```shell
+ ##
+ ## Private address
+ ##
+ ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}'
+
+ ##
+ ## Public address
+ ##
+ echo "External address: $(curl --silent ipinfo.io/ip)"
+ ```
+
+ In most cases, the following addresses will be used to configure GitLab
+ Geo:
+
+ | Configuration | Address |
+ |:----------------------------------------|:------------------------------------------------------|
+ | `postgresql['listen_address']` | **Primary** node's public or VPC private address. |
+ | `postgresql['md5_auth_cidr_addresses']` | **Secondary** node's public or VPC private addresses. |
+
+ If you are using Google Cloud Platform, SoftLayer, or any other vendor that
+ provides a virtual private cloud (VPC) you can use the **primary** and **secondary** nodes
+ private addresses (corresponds to "internal address" for Google Cloud Platform) for
+ `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.
+
+ The `listen_address` option opens PostgreSQL up to network connections with the interface
+ corresponding to the given address. See [the PostgreSQL documentation](https://www.postgresql.org/docs/11/runtime-config-connection.html)
+ for more details.
+
+ NOTE: **Note:**
+ If you need to use `0.0.0.0` or `*` as the listen_address, you will also need to add
+ `127.0.0.1/32` to the `postgresql['md5_auth_cidr_addresses']` setting, to allow Rails to connect through
+ `127.0.0.1`. For more information, see [omnibus-5258](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5258).
+
+ Depending on your network configuration, the suggested addresses may not
+ be correct. If your **primary** node and **secondary** nodes connect over a local
+ area network, or a virtual network connecting availability zones like
+ [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/)
+ you should use the **secondary** node's private address for `postgresql['md5_auth_cidr_addresses']`.
+
+ Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
+ addresses with addresses appropriate to your network configuration:
+
+ ```ruby
+ ##
+ ## Geo Primary role
+ ## - configure dependent flags automatically to enable Geo
+ ##
+ roles ['geo_primary_role']
+
+ ##
+ ## Primary address
+ ## - replace '<primary_node_ip>' with the public or VPC address of your Geo primary node
+ ##
+ postgresql['listen_address'] = '<primary_node_ip>'
+
+ ##
+ # Allow PostgreSQL client authentication from the primary and secondary IPs. These IPs may be
+ # public or VPC addresses in CIDR format, for example ['198.51.100.1/32', '198.51.100.2/32']
+ ##
+ postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32']
+
+ ##
+ ## Replication settings
+ ## - set this to be the number of Geo secondary nodes you have
+ ##
+ postgresql['max_replication_slots'] = 1
+ # postgresql['max_wal_senders'] = 10
+ # postgresql['wal_keep_segments'] = 10
+
+ ##
+ ## Disable automatic database migrations temporarily
+ ## (until PostgreSQL is restarted and listening on the private address).
+ ##
+ gitlab_rails['auto_migrate'] = false
+ ```
+
+1. Optional: If you want to add another **secondary** node, the relevant setting would look like:
+
+ ```ruby
+ postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32', '<another_secondary_node_ip>/32']
+ ```
+
+ You may also want to edit the `wal_keep_segments` and `max_wal_senders` to match your
+ database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/11/runtime-config-replication.html)
+ for more information.
+
+1. Save the file and reconfigure GitLab for the database listen changes and
+ the replication slot changes to be applied:
+
+ ```shell
+ gitlab-ctl reconfigure
+ ```
+
+ Restart PostgreSQL for its changes to take effect:
+
+ ```shell
+ gitlab-ctl restart postgresql
+ ```
+
+1. Re-enable migrations now that PostgreSQL is restarted and listening on the
+ private address.
+
+ Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`:
+
+ ```ruby
+ gitlab_rails['auto_migrate'] = true
+ ```
+
+ Save the file and reconfigure GitLab:
+
+ ```shell
+ gitlab-ctl reconfigure
+ ```
+
+1. Now that the PostgreSQL server is set up to accept remote connections, run
+ `netstat -plnt | grep 5432` to make sure that PostgreSQL is listening on port
+ `5432` to the **primary** server's private address.
+
+1. A certificate was automatically generated when GitLab was reconfigured. This
+ will be used automatically to protect your PostgreSQL traffic from
+ eavesdroppers, but to protect against active ("man-in-the-middle") attackers,
+ the **secondary** node needs a copy of the certificate. Make a copy of the PostgreSQL
+ `server.crt` file on the **primary** node by running this command:
+
+ ```shell
+ cat ~gitlab-psql/data/server.crt
+ ```
+
+ Copy the output into a clipboard or into a local file. You
+ will need it when setting up the **secondary** node! The certificate is not sensitive
+ data.
+
+### Step 2. Configure the **secondary** server
+
+1. SSH into your GitLab **secondary** server and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Stop application server and Sidekiq
+
+ ```shell
+ gitlab-ctl stop puma
+ gitlab-ctl stop sidekiq
+ ```
+
+ NOTE: **Note:**
+ This step is important so we don't try to execute anything before the node is fully configured.
+
+1. [Check TCP connectivity](../../raketasks/maintenance.md) to the **primary** node's PostgreSQL server:
+
+ ```shell
+ gitlab-rake gitlab:tcp_check[<primary_node_ip>,5432]
+ ```
+
+ NOTE: **Note:**
+ If this step fails, you may be using the wrong IP address, or a firewall may
+ be preventing access to the server. Check the IP address, paying close
+ attention to the difference between public and private addresses and ensure
+ that, if a firewall is present, the **secondary** node is permitted to connect to the
+ **primary** node on port 5432.
+
+1. Create a file `server.crt` in the **secondary** server, with the content you got on the last step of the **primary** node's setup:
+
+ ```shell
+ editor server.crt
+ ```
+
+1. Set up PostgreSQL TLS verification on the **secondary** node:
+
+ Install the `server.crt` file:
+
+ ```shell
+ install \
+ -D \
+ -o gitlab-psql \
+ -g gitlab-psql \
+ -m 0400 \
+ -T server.crt ~gitlab-psql/.postgresql/root.crt
+ ```
+
+ PostgreSQL will now only recognize that exact certificate when verifying TLS
+ connections. The certificate can only be replicated by someone with access
+ to the private key, which is **only** present on the **primary** node.
+
+1. Test that the `gitlab-psql` user can connect to the **primary** node's database
+ (the default Omnibus database name is `gitlabhq_production`):
+
+ ```shell
+ sudo \
+ -u gitlab-psql /opt/gitlab/embedded/bin/psql \
+ --list \
+ -U gitlab_replicator \
+ -d "dbname=gitlabhq_production sslmode=verify-ca" \
+ -W \
+ -h <primary_node_ip>
+ ```
+
+ When prompted enter the password you set in the first step for the
+ `gitlab_replicator` user. If all worked correctly, you should see
+ the list of **primary** node's databases.
+
+ A failure to connect here indicates that the TLS configuration is incorrect.
+ Ensure that the contents of `~gitlab-psql/data/server.crt` on the **primary** node
+ match the contents of `~gitlab-psql/.postgresql/root.crt` on the **secondary** node.
+
+1. Configure PostgreSQL:
+
+ This step is similar to how we configured the **primary** instance.
+ We need to enable this, even if using a single node.
+
+ Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP
+ addresses with addresses appropriate to your network configuration:
+
+ ```ruby
+ ##
+ ## Geo Secondary role
+ ## - configure dependent flags automatically to enable Geo
+ ##
+ roles ['geo_secondary_role']
+
+ ##
+ ## Secondary address
+ ## - replace '<secondary_node_ip>' with the public or VPC address of your Geo secondary node
+ ##
+ postgresql['listen_address'] = '<secondary_node_ip>'
+ postgresql['md5_auth_cidr_addresses'] = ['<secondary_node_ip>/32']
+
+ ##
+ ## Database credentials password (defined previously in primary node)
+ ## - replicate same values here as defined in primary node
+ ##
+ postgresql['sql_user_password'] = '<md5_hash_of_your_password>'
+ gitlab_rails['db_password'] = '<your_password_here>'
+ ```
+
+ For external PostgreSQL instances, see [additional instructions](external_database.md).
+ If you bring a former **primary** node back online to serve as a **secondary** node, then you also need to remove `roles ['geo_primary_role']` or `geo_primary_role['enable'] = true`.
+
+1. Reconfigure GitLab for the changes to take effect:
+
+ ```shell
+ gitlab-ctl reconfigure
+ ```
+
+1. Restart PostgreSQL for the IP change to take effect:
+
+ ```shell
+ gitlab-ctl restart postgresql
+ ```
+
+### Step 3. Initiate the replication process
+
+Below we provide a script that connects the database on the **secondary** node to
+the database on the **primary** node, replicates the database, and creates the
+needed files for streaming replication.
+
+The directories used are the defaults that are set up in Omnibus. If you have
+changed any defaults, configure it as you see fit replacing the directories and paths.
+
+CAUTION: **Warning:**
+Make sure to run this on the **secondary** server as it removes all PostgreSQL's
+data before running `pg_basebackup`.
+
+1. SSH into your GitLab **secondary** server and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Choose a database-friendly name to use for your **secondary** node to
+ use as the replication slot name. For example, if your domain is
+ `secondary.geo.example.com`, you may use `secondary_example` as the slot
+ name as shown in the commands below.
+
+1. Execute the command below to start a backup/restore and begin the replication
+
+ CAUTION: **Warning:**
+ Each Geo **secondary** node must have its own unique replication slot name.
+ Using the same slot name between two secondaries will break PostgreSQL replication.
+
+ ```shell
+ gitlab-ctl replicate-geo-database \
+ --slot-name=<secondary_node_name> \
+ --host=<primary_node_ip>
+ ```
+
+ NOTE: **Note:**
+ Replication slot names must only contain lowercase letters, numbers, and the underscore character.
+
+ When prompted, enter the _plaintext_ password you set up for the `gitlab_replicator`
+ user in the first step.
+
+ This command also takes a number of additional options. You can use `--help`
+ to list them all, but here are a couple of tips:
+
+ - If PostgreSQL is listening on a non-standard port, add `--port=` as well.
+ - If your database is too large to be transferred in 30 minutes, you will need
+ to increase the timeout, e.g., `--backup-timeout=3600` if you expect the
+ initial replication to take under an hour.
+ - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether
+ (e.g., you know the network path is secure, or you are using a site-to-site
+ VPN). This is **not** safe over the public Internet!
+ - You can read more details about each `sslmode` in the
+ [PostgreSQL documentation](https://www.postgresql.org/docs/11/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
+ the instructions above are carefully written to ensure protection against
+ both passive eavesdroppers and active "man-in-the-middle" attackers.
+ - Change the `--slot-name` to the name of the replication slot
+ to be used on the **primary** database. The script will attempt to create the
+ replication slot automatically if it does not exist.
+ - If you're repurposing an old server into a Geo **secondary** node, you'll need to
+ add `--force` to the command line.
+ - When not in a production machine you can disable backup step if you
+ really sure this is what you want by adding `--skip-backup`
+
+The replication process is now complete.
+
+## PgBouncer support (optional)
+
+[PgBouncer](https://www.pgbouncer.org/) may be used with GitLab Geo to pool
+PostgreSQL connections. We recommend using PgBouncer if you use GitLab in a
+high-availability configuration with a cluster of nodes supporting a Geo
+**primary** node and another cluster of nodes supporting a Geo **secondary** node. For more
+information, see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md).
+
+## Troubleshooting
+
+Read the [troubleshooting document](../replication/troubleshooting.md).
diff --git a/doc/administration/geo/setup/external_database.md b/doc/administration/geo/setup/external_database.md
new file mode 100644
index 00000000000..33a7bc38b6e
--- /dev/null
+++ b/doc/administration/geo/setup/external_database.md
@@ -0,0 +1,234 @@
+---
+stage: Enablement
+group: Geo
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+type: howto
+---
+
+# Geo with external PostgreSQL instances **(PREMIUM ONLY)**
+
+This document is relevant if you are using a PostgreSQL instance that is *not
+managed by Omnibus*. This includes cloud-managed instances like AWS RDS, or
+manually installed and configured PostgreSQL instances.
+
+NOTE: **Note:**
+We strongly recommend running Omnibus-managed instances as they are actively
+developed and tested. We aim to be compatible with most external
+(not managed by Omnibus) databases but we do not guarantee compatibility.
+
+## **Primary** node
+
+1. SSH into a GitLab **primary** application server and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** ID for your node (arbitrary value):
+
+ ```ruby
+ # The unique identifier for the Geo node.
+ gitlab_rails['geo_node_name'] = '<node_name_here>'
+ ```
+
+1. Reconfigure the **primary** node for the change to take effect:
+
+ ```shell
+ gitlab-ctl reconfigure
+ ```
+
+1. Execute the command below to define the node as **primary** node:
+
+ ```shell
+ gitlab-ctl set-geo-primary-node
+ ```
+
+ This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`.
+
+### Configure the external database to be replicated
+
+To set up an external database, you can either:
+
+- Set up streaming replication yourself (for example, in AWS RDS).
+- Perform the Omnibus configuration manually as follows.
+
+#### Leverage your cloud provider's tools to replicate the primary database
+
+Given you have a primary node set up on AWS EC2 that uses RDS.
+You can now just create a read-only replica in a different region and the
+replication process will be managed by AWS. Make sure you've set Network ACL, Subnet, and
+Security Group according to your needs, so the secondary application node can access the database.
+
+The following instructions detail how to create a read-only replica for common
+cloud providers:
+
+- Amazon RDS - [Creating a Read Replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html#USER_ReadRepl.Create)
+- Azure Database for PostgreSQL - [Create and manage read replicas in Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/howto-read-replicas-portal)
+
+Once your read-only replica is set up, you can skip to [configure you secondary application node](#configure-secondary-application-nodes-to-use-the-external-read-replica).
+
+#### Manually configure the primary database for replication
+
+The [`geo_primary_role`](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
+configures the **primary** node's database to be replicated by making changes to
+`pg_hba.conf` and `postgresql.conf`. Make the following configuration changes
+manually to your external database configuration and ensure that you restart PostgreSQL
+afterwards for the changes to take effect:
+
+```plaintext
+##
+## Geo Primary Role
+## - pg_hba.conf
+##
+host all all <trusted primary IP>/32 md5
+host replication gitlab_replicator <trusted primary IP>/32 md5
+host all all <trusted secondary IP>/32 md5
+host replication gitlab_replicator <trusted secondary IP>/32 md5
+```
+
+```plaintext
+##
+## Geo Primary Role
+## - postgresql.conf
+##
+wal_level = hot_standby
+max_wal_senders = 10
+wal_keep_segments = 50
+max_replication_slots = 1 # number of secondary instances
+hot_standby = on
+```
+
+## **Secondary** nodes
+
+### Manually configure the replica database
+
+Make the following configuration changes manually to your `pg_hba.conf` and `postgresql.conf`
+of your external replica database and ensure that you restart PostgreSQL afterwards
+for the changes to take effect:
+
+```plaintext
+##
+## Geo Secondary Role
+## - pg_hba.conf
+##
+host all all <trusted secondary IP>/32 md5
+host replication gitlab_replicator <trusted secondary IP>/32 md5
+host all all <trusted primary IP>/24 md5
+```
+
+```plaintext
+##
+## Geo Secondary Role
+## - postgresql.conf
+##
+wal_level = hot_standby
+max_wal_senders = 10
+wal_keep_segments = 10
+hot_standby = on
+```
+
+### Configure **secondary** application nodes to use the external read-replica
+
+With Omnibus, the
+[`geo_secondary_role`](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles)
+has three main functions:
+
+1. Configure the replica database.
+1. Configure the tracking database.
+1. Enable the [Geo Log Cursor](../index.md#geo-log-cursor) (not covered in this section).
+
+To configure the connection to the external read-replica database and enable Log Cursor:
+
+1. SSH into a GitLab **secondary** application server and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following
+
+ ```ruby
+ ##
+ ## Geo Secondary role
+ ## - configure dependent flags automatically to enable Geo
+ ##
+ roles ['geo_secondary_role']
+
+ # note this is shared between both databases,
+ # make sure you define the same password in both
+ gitlab_rails['db_password'] = '<your_password_here>'
+
+ gitlab_rails['db_username'] = 'gitlab'
+ gitlab_rails['db_host'] = '<database_read_replica_host>'
+
+ # Disable the bundled Omnibus PostgreSQL, since we are
+ # using an external PostgreSQL
+ postgresql['enable'] = false
+ ```
+
+1. Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure)
+
+### Configure the tracking database
+
+**Secondary** nodes use a separate PostgreSQL installation as a tracking
+database to keep track of replication status and automatically recover from
+potential replication issues. Omnibus automatically configures a tracking database
+when `roles ['geo_secondary_role']` is set.
+If you want to run this database external to Omnibus, please follow the instructions below.
+
+If you are using a cloud-managed service for the tracking database, you may need
+to grant additional roles to your tracking database user (by default, this is
+`gitlab_geo`):
+
+- Amazon RDS requires the [`rds_superuser`](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html#Appendix.PostgreSQL.CommonDBATasks.Roles) role.
+- Azure Database for PostgreSQL requires the [`azure_pg_admin`](https://docs.microsoft.com/en-us/azure/postgresql/howto-create-users#how-to-create-additional-admin-users-in-azure-database-for-postgresql) role.
+
+If you have an external database ready to be used as the tracking database,
+follow the instructions below to use it:
+
+NOTE: **Note:**
+If you want to use AWS RDS as a tracking database, make sure it has access to
+the secondary database. Unfortunately, just assigning the same security group is not enough as
+outbound rules do not apply to RDS PostgreSQL databases. Therefore, you need to explicitly add an inbound
+rule to the read-replica's security group allowing any TCP traffic from
+the tracking database on port 5432.
+
+1. Ensure that your secondary node can communicate with your tracking database by
+ manually changing the `pg_hba.conf` that is associated with your tracking database.
+ Remember to restart PostgreSQL afterwards for the changes to take effect:
+
+ ```plaintext
+ ##
+ ## Geo Tracking Database Role
+ ## - pg_hba.conf
+ ##
+ host all all <trusted tracking IP>/32 md5
+ host all all <trusted secondary IP>/32 md5
+ ```
+
+1. SSH into a GitLab **secondary** server and login as root:
+
+ ```shell
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` with the connection parameters and credentials for
+ the machine with the PostgreSQL instance:
+
+ ```ruby
+ geo_secondary['db_username'] = 'gitlab_geo'
+ geo_secondary['db_password'] = '<your_password_here>'
+
+ geo_secondary['db_host'] = '<tracking_database_host>'
+ geo_secondary['db_port'] = <tracking_database_port> # change to the correct port
+ geo_postgresql['enable'] = false # don't use internal managed instance
+ ```
+
+1. Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure)
+
+1. Run the tracking database migrations:
+
+ ```shell
+ gitlab-rake geo:db:create
+ gitlab-rake geo:db:migrate
+ ```
diff --git a/doc/administration/geo/setup/index.md b/doc/administration/geo/setup/index.md
new file mode 100644
index 00000000000..a4d9fcba14d
--- /dev/null
+++ b/doc/administration/geo/setup/index.md
@@ -0,0 +1,32 @@
+---
+stage: Enablement
+group: Geo
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+type: howto
+---
+
+# Setting up Geo
+
+These instructions assume you have a working instance of GitLab. They guide you through:
+
+1. Making your existing instance the **primary** node.
+1. Adding **secondary** nodes.
+
+CAUTION: **Caution:**
+The steps below should be followed in the order they appear. **Make sure the GitLab version is the same on all nodes.**
+
+## Using Omnibus GitLab
+
+If you installed GitLab using the Omnibus packages (highly recommended):
+
+1. [Install GitLab Enterprise Edition](https://about.gitlab.com/install/) on the server that will serve as the **secondary** node. Do not create an account or log in to the new **secondary** node.
+1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** node to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher.
+1. [Set up the database replication](database.md) (`primary (read-write) <-> secondary (read-only)` topology).
+1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** nodes.
+1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** nodes.
+1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** node. See [notes on LDAP](../index.md#ldap).
+1. [Follow the "Using a Geo Server" guide](../replication/using_a_geo_server.md).
+
+## Post-installation documentation
+
+After installing GitLab on the **secondary** nodes and performing the initial configuration, see the [following documentation for post-installation information](../index.md#post-installation-documentation).