Diffstat (limited to 'doc/administration/geo')
-rw-r--r-- doc/administration/geo/disaster_recovery/index.md | 3
-rw-r--r-- doc/administration/geo/disaster_recovery/planned_failover.md | 18
-rw-r--r-- doc/administration/geo/replication/configuration.md | 29
-rw-r--r-- doc/administration/geo/replication/database.md | 48
-rw-r--r-- doc/administration/geo/replication/external_database.md | 4
-rw-r--r-- doc/administration/geo/replication/faq.md | 4
-rw-r--r-- doc/administration/geo/replication/high_availability.md | 46
-rw-r--r-- doc/administration/geo/replication/img/adding_a_secondary_node.png | bin 0 -> 87593 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_add_geolocation_rule.png | bin 0 -> 76035 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png | bin 0 -> 88896 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_clone_panel.png | bin 0 -> 20007 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png | bin 0 -> 102350 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_created_policy_record.png | bin 0 -> 141505 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_name_policy.png | bin 0 -> 37964 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_policy_diagram.png | bin 0 -> 56194 bytes
-rw-r--r-- doc/administration/geo/replication/img/single_git_traffic_policies.png | bin 0 -> 214666 bytes
-rw-r--r-- doc/administration/geo/replication/index.md | 92
-rw-r--r-- doc/administration/geo/replication/location_aware_git_url.md | 119
-rw-r--r-- doc/administration/geo/replication/object_storage.md | 52
-rw-r--r-- doc/administration/geo/replication/security_review.md | 16
-rw-r--r-- doc/administration/geo/replication/troubleshooting.md | 6
-rw-r--r-- doc/administration/geo/replication/using_a_geo_server.md | 2
22 files changed, 321 insertions, 118 deletions
diff --git a/doc/administration/geo/disaster_recovery/index.md b/doc/administration/geo/disaster_recovery/index.md
index 5eb23422374..ad5284938fa 100644
--- a/doc/administration/geo/disaster_recovery/index.md
+++ b/doc/administration/geo/disaster_recovery/index.md
@@ -51,7 +51,7 @@ must disable the **primary** node.
NOTE: **Note:**
(**CentOS only**) In CentOS 6 or older, there is no easy way to prevent GitLab from being
- started if the machine reboots isn't available (see [gitlab-org/omnibus-gitlab#3058]).
+ started if the machine reboots (see [Omnibus GitLab issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058)).
It may be safest to uninstall the GitLab package completely:
```sh
@@ -317,6 +317,5 @@ section to resolve the error. Otherwise, the secret is lost and you'll need to
[setup-geo]: ../replication/index.md#setup-instructions
[updating-geo]: ../replication/version_specific_updates.md#updating-to-gitlab-105
[sec-tfa]: ../../../security/two_factor_authentication.md#disabling-2fa-for-everyone
-[gitlab-org/omnibus-gitlab#3058]: https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058
[initiate-the-replication-process]: ../replication/database.html#step-3-initiate-the-replication-process
[configure-the-primary-server]: ../replication/database.html#step-1-configure-the-primary-server
diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md
index 75e07bcf863..8fee172ec64 100644
--- a/doc/administration/geo/disaster_recovery/planned_failover.md
+++ b/doc/administration/geo/disaster_recovery/planned_failover.md
@@ -43,23 +43,14 @@ will go smoothly.
### Object storage
-Some classes of non-repository data can use object storage in preference to
-file storage. Geo [does not replicate data in object storage](../replication/object_storage.md),
-leaving that task up to the object store itself. For a planned failover, this
-means you can decouple the replication of this data from the failover of the
-GitLab service.
-
-If you're already using object storage, simply verify that your **secondary**
-node has access to the same data as the **primary** node - they must either they share the
-same object storage configuration, or the **secondary** node should be configured to
-access a [geographically-replicated][os-repl] copy provided by the object store
-itself.
-
If you have a large GitLab installation or cannot tolerate downtime, consider
[migrating to Object Storage][os-conf] **before** scheduling a planned failover.
Doing so reduces both the length of the maintenance window, and the risk of data
loss as a result of a poorly executed planned failover.
+In GitLab 12.4, you can optionally allow GitLab to manage replication of Object Storage for
+**secondary** nodes. For more information, see [Object Storage replication][os-conf].
+
### Review the configuration of each **secondary** node
Database settings are automatically replicated to the **secondary** node, but the
@@ -224,5 +215,4 @@ Don't forget to remove the broadcast message after failover is complete.
[background-verification]: background_verification.md
[limitations]: ../replication/index.md#current-limitations
[moving-repositories]: ../../operations/moving_repositories.md
-[os-conf]: ../replication/object_storage.md#configuration
-[os-repl]: ../replication/object_storage.md#replication
+[os-conf]: ../replication/object_storage.md
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index ddb5f22fd05..f09d9f20dab 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -25,7 +25,7 @@ Any change that requires access to the **Admin Area** needs to be done in the
GitLab stores a number of secret values in the `/etc/gitlab/gitlab-secrets.json`
file which *must* be the same on all nodes. Until there is
-a means of automatically replicating these between nodes (see issue [gitlab-org/gitlab-ee#3789]),
+a means of automatically replicating these between nodes (see [issue #3789](https://gitlab.com/gitlab-org/gitlab/issues/3789)),
they must be manually replicated to the **secondary** node.
1. SSH into the **primary** node, and execute the command below:
@@ -75,7 +75,7 @@ they must be manually replicated to the **secondary** node.
### Step 2. Manually replicate the **primary** node's SSH host keys
GitLab integrates with the system-installed SSH daemon, designating a user
-(typically named git) through which all access requests are handled.
+(typically named `git`) through which all access requests are handled.
In a [Disaster Recovery] situation, GitLab system
administrators will promote a **secondary** node to the **primary** node. DNS records for the
@@ -165,10 +165,32 @@ keys must be manually replicated to the **secondary** node.
### Step 3. Add the **secondary** node
+1. SSH into your GitLab **secondary** server and log in as root:
+
+ ```sh
+ sudo -i
+ ```
+
+1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your node. You will need this in the next steps:
+
+ ```ruby
+ # The unique identifier for the Geo node.
+ gitlab_rails['geo_node_name'] = '<node_name_here>'
+ ```
+
+1. Reconfigure the **secondary** node for the change to take effect:
+
+ ```sh
+ gitlab-ctl reconfigure
+ ```
+
1. Visit the **primary** node's **Admin Area > Geo**
(`/admin/geo/nodes`) in your browser.
-1. Add the **secondary** node by providing its full URL. **Do NOT** check the
+1. Click the **New node** button.
+1. Add the **secondary** node. Use the **exact** name you entered for `gitlab_rails['geo_node_name']` as the Name and the full URL as the URL. **Do NOT** check the
**This is a primary node** checkbox.
+
+ ![Add secondary node](img/adding_a_secondary_node.png)
1. Optionally, choose which groups or storage shards should be replicated by the
**secondary** node. Leave blank to replicate all. Read more in
[selective synchronization](#selective-synchronization).
@@ -299,7 +321,6 @@ See the [troubleshooting document](troubleshooting.md).
[setup-geo-omnibus]: index.md#using-omnibus-gitlab
[Hashed Storage]: ../../repository_storage_types.md
[Disaster Recovery]: ../disaster_recovery/index.md
-[gitlab-org/gitlab-ee#3789]: https://gitlab.com/gitlab-org/gitlab/issues/3789
[gitlab-com/infrastructure#2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821
[omnibus-ssl]: https://docs.gitlab.com/omnibus/settings/ssl.html
[using-geo]: using_a_geo_server.md
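
The `geo_node_name` steps added to `configuration.md` above only work if the Name entered in the Admin Area is exactly the string set in `/etc/gitlab/gitlab.rb`. The following is a minimal sketch of that check, not part of GitLab itself: the helper names `geo_node_name` and `node_name_matches?` and the sample node name are hypothetical, and it assumes the comparison is plain string equality and that the setting is written as a simple quoted literal.

```ruby
# Sketch: read the Geo node name out of gitlab.rb text and compare it with the
# Name typed into Admin Area > Geo. Helper names are hypothetical.

# Returns the value of the first uncommented gitlab_rails['geo_node_name']
# assignment, or nil if the setting is absent or commented out.
def geo_node_name(config_text)
  config_text.each_line do |line|
    next if line.strip.start_with?('#')

    match = line.match(/gitlab_rails\['geo_node_name'\]\s*=\s*(['"])(.*?)\1/)
    return match[2] if match
  end
  nil
end

# Compares the two names as-is: no case folding, no whitespace trimming.
def node_name_matches?(config_text, admin_area_name)
  geo_node_name(config_text) == admin_area_name
end

# Illustrative gitlab.rb fragment; the node name is made up.
sample = <<~CONFIG
  # The unique identifier for the Geo node.
  gitlab_rails['geo_node_name'] = 'geo-secondary-eu'
CONFIG

puts node_name_matches?(sample, 'geo-secondary-eu') # same string
puts node_name_matches?(sample, 'Geo-Secondary-EU') # differs in case
```

On a real node, the text would come from `File.read('/etc/gitlab/gitlab.rb')` rather than an inline sample.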
diff --git a/doc/administration/geo/replication/database.md b/doc/administration/geo/replication/database.md
index 33f240ed11f..fa1b0f0e1d7 100644
--- a/doc/administration/geo/replication/database.md
+++ b/doc/administration/geo/replication/database.md
@@ -1,9 +1,6 @@
# Geo database replication **(PREMIUM ONLY)**
NOTE: **Note:**
-The following steps are for Omnibus installs only. Using Geo with source-based installs was **deprecated** in GitLab 11.5.
-
-NOTE: **Note:**
If your GitLab installation uses external (not managed by Omnibus) PostgreSQL
instances, the Omnibus roles will not be able to perform all necessary
configuration steps. In this case,
@@ -37,8 +34,8 @@ recover. See below for more details.
The following guide assumes that:
- You are using Omnibus and therefore you are using PostgreSQL 9.6 or later
- which includes the [`pg_basebackup` tool][pgback] and improved
- [Foreign Data Wrapper][FDW] support.
+ which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/9.6/app-pgbasebackup.html) and improved
+ [Foreign Data Wrapper](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support.
- You have a **primary** node already set up (the GitLab server you are
replicating from), running Omnibus' PostgreSQL (or equivalent version), and
you have a new **secondary** server set up with the same versions of the OS,
@@ -56,6 +53,19 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o
sudo -i
```
+1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your node:
+
+ ```ruby
+ # The unique identifier for the Geo node.
+ gitlab_rails['geo_node_name'] = '<node_name_here>'
+ ```
+
+1. Reconfigure the **primary** node for the change to take effect:
+
+ ```sh
+ gitlab-ctl reconfigure
+ ```
+
1. Execute the command below to define the node as **primary** node:
```sh
@@ -149,9 +159,9 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o
address (corresponds to "internal address" for Google Cloud Platform) for
`postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`.
- The `listen_address` option opens PostgreSQL up to network connections
- with the interface corresponding to the given address. See [the PostgreSQL
- documentation][pg-docs-runtime-conn] for more details.
+ The `listen_address` option opens PostgreSQL up to network connections with the interface
+ corresponding to the given address. See [the PostgreSQL documentation](https://www.postgresql.org/docs/9.6/runtime-config-connection.html)
+ for more details.
Depending on your network configuration, the suggested addresses may not
be correct. If your **primary** node and **secondary** nodes connect over a local
@@ -202,9 +212,8 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o
postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32', '<another_secondary_node_ip>/32']
```
- You may also want to edit the `wal_keep_segments` and `max_wal_senders` to
- match your database replication requirements. Consult the [PostgreSQL -
- Replication documentation][pg-docs-runtime-replication]
+ You may also want to edit the `wal_keep_segments` and `max_wal_senders` to match your
+ database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/9.6/runtime-config-replication.html)
for more information.
1. Save the file and reconfigure GitLab for the database listen changes and
@@ -430,7 +439,7 @@ data before running `pg_basebackup`.
(e.g., you know the network path is secure, or you are using a site-to-site
VPN). This is **not** safe over the public Internet!
- You can read more details about each `sslmode` in the
- [PostgreSQL documentation][pg-docs-ssl];
+ [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-ssl.html#LIBPQ-SSL-PROTECTION);
the instructions above are carefully written to ensure protection against
both passive eavesdroppers and active "man-in-the-middle" attackers.
- Change the `--slot-name` to the name of the replication slot
@@ -443,16 +452,16 @@ data before running `pg_basebackup`.
The replication process is now complete.
-## PGBouncer support (optional)
+## PgBouncer support (optional)
-[PGBouncer](http://pgbouncer.github.io/) may be used with GitLab Geo to pool
-PostgreSQL connections. We recommend using PGBouncer if you use GitLab in a
+[PgBouncer](http://pgbouncer.github.io/) may be used with GitLab Geo to pool
+PostgreSQL connections. We recommend using PgBouncer if you use GitLab in a
high-availability configuration with a cluster of nodes supporting a Geo
**primary** node and another cluster of nodes supporting a Geo **secondary** node. For more
information, see [High Availability with GitLab Omnibus](../../high_availability/database.md#high-availability-with-gitlab-omnibus-premium-only).
-For a Geo **secondary** node to work properly with PGBouncer in front of the database,
-it will need a separate read-only user to make [PostgreSQL FDW queries][FDW]
+For a Geo **secondary** node to work properly with PgBouncer in front of the database,
+it will need a separate read-only user to make [PostgreSQL FDW queries](https://www.postgresql.org/docs/9.6/postgres-fdw.html)
work:
1. On the **primary** Geo database, enter the PostgreSQL on the console as an
@@ -498,11 +507,6 @@ work:
Read the [troubleshooting document](troubleshooting.md).
[replication-slots-article]: https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75
-[pgback]: http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html
[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication
-[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html
[toc]: index.md#using-omnibus-gitlab
[rake-maintenance]: ../../raketasks/maintenance.md
-[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION
-[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html
-[pg-docs-runtime-replication]: https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html
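
The `database.md` hunks above keep the `postgresql['md5_auth_cidr_addresses']` example, which lists each node's IP address with a `/32` suffix. As a rough illustration, the list can be built and validated with Ruby's standard `ipaddr` library. The helper `host_cidrs` and the sample addresses (taken from the documentation range `203.0.113.0/24`) are assumptions made for this sketch. The `/128` branch for IPv6 is an extension the source does not mention.

```ruby
require 'ipaddr'

# Hypothetical helper: turn single node IPs into host-only CIDR entries of the
# kind shown for postgresql['md5_auth_cidr_addresses'].
def host_cidrs(ips)
  ips.map do |ip|
    addr = IPAddr.new(ip) # raises IPAddr::InvalidAddressError on malformed input
    prefix = addr.ipv4? ? 32 : 128
    "#{addr}/#{prefix}"
  end
end

# Placeholder addresses standing in for the primary and two secondary nodes.
node_ips = ['203.0.113.10', '203.0.113.20', '203.0.113.30']

puts "postgresql['md5_auth_cidr_addresses'] = #{host_cidrs(node_ips).inspect}"
```

Validating the addresses before writing them to `gitlab.rb` catches typos early, since a malformed entry would otherwise only surface when PostgreSQL is reconfigured.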
diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md
index 256195998a7..4451d3c6c08 100644
--- a/doc/administration/geo/replication/external_database.md
+++ b/doc/administration/geo/replication/external_database.md
@@ -132,7 +132,7 @@ when `roles ['geo_secondary_role']` is set. For high availability,
refer to [Geo High Availability](../../high_availability/README.md).
If you want to run this database external to Omnibus, please follow the instructions below.
-The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html)
+The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html)
connection with the **secondary** replica database for improved performance.
If you have an external database ready to be used as the tracking database,
@@ -173,7 +173,7 @@ the tracking database on port 5432.
gitlab-rake geo:db:migrate
```
-1. Configure the [PostgreSQL FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html)
+1. Configure the [PostgreSQL FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html)
connection and credentials:
Save the script below in a file, ex. `/tmp/geo_fdw.sh` and modify the connection
diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md
index b3580a706c3..b07b518d3b1 100644
--- a/doc/administration/geo/replication/faq.md
+++ b/doc/administration/geo/replication/faq.md
@@ -43,9 +43,9 @@ attachments / avatars and the whole database. This means user accounts,
issues, merge requests, groups, project data, etc., will be available for
query.
-## Can I git push to a **secondary** node?
+## Can I `git push` to a **secondary** node?
-Yes! Pushing directly to a **secondary** node (for both HTTP and SSH, including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+Yes! Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/blog/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
## How long does it take to have a commit replicated to a **secondary** node?
diff --git a/doc/administration/geo/replication/high_availability.md b/doc/administration/geo/replication/high_availability.md
index 9d84e10d496..faa9d051107 100644
--- a/doc/administration/geo/replication/high_availability.md
+++ b/doc/administration/geo/replication/high_availability.md
@@ -8,7 +8,7 @@ described, it is possible to adapt these instructions to your needs.
![Geo HA Diagram](../../high_availability/img/geo-ha-diagram.png)
-_[diagram source - gitlab employees only][diagram-source]_
+_[diagram source - GitLab employees only][diagram-source]_
The topology above assumes that the **primary** and **secondary** Geo clusters
are located in two separate locations, on their own virtual network
@@ -57,6 +57,11 @@ The following steps enable a GitLab cluster to serve as the **primary** node.
roles ['geo_primary_role']
##
+ ## The unique identifier for the Geo node.
+ ##
+ gitlab_rails['geo_node_name'] = '<node_name_here>'
+
+ ##
## Disable automatic migrations
##
gitlab_rails['auto_migrate'] = false
@@ -71,8 +76,16 @@ high availability configuration documentation for
[PostgreSQL](../../high_availability/database.md#configuring-the-application-nodes)
and [Redis](../../high_availability/redis.md#example-configuration-for-the-gitlab-application).
-The **primary** database will require modification later, as part of
-[step 2](#step-2-configure-the-main-read-only-replica-postgresql-database-on-the-secondary-node).
+### Step 2: Configure the **primary** database
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following:
+
+ ```ruby
+ ##
+ ## Configure the Geo primary role and the PostgreSQL role
+ ##
+ roles ['geo_primary_role', 'postgres_role']
+ ```
## Configure a **secondary** node
@@ -115,9 +128,9 @@ the **primary** database. Use the following as a guide.
```ruby
##
- ## Configure the PostgreSQL role
+ ## Configure the Geo secondary role and the PostgreSQL role
##
- roles ['postgres_role']
+ roles ['geo_secondary_role', 'postgres_role']
##
## Secondary address
@@ -222,6 +235,11 @@ following modifications:
roles ['geo_secondary_role', 'application_role']
##
+ ## The unique identifier for the Geo node.
+ ##
+ gitlab_rails['geo_node_name'] = '<node_name_here>'
+
+ ##
## Disable automatic migrations
##
gitlab_rails['auto_migrate'] = false
@@ -274,15 +292,15 @@ After making these changes [Reconfigure GitLab][gitlab-reconfigure] so the chang
On the secondary the following GitLab frontend services will be enabled:
-- geo-logcursor
-- gitlab-pages
-- gitlab-workhorse
-- logrotate
-- nginx
-- registry
-- remote-syslog
-- sidekiq
-- unicorn
+- `geo-logcursor`
+- `gitlab-pages`
+- `gitlab-workhorse`
+- `logrotate`
+- `nginx`
+- `registry`
+- `remote-syslog`
+- `sidekiq`
+- `unicorn`
Verify these services by running `sudo gitlab-ctl status` on the frontend
application servers.
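
The service list in the `high_availability.md` hunk above is meant to be checked against `sudo gitlab-ctl status`. A small sketch of that comparison follows. It assumes runit-style status lines of the form `run: <service>: (pid N) Ns; ...`; the helper `missing_services` and the sample output are illustrative, not captured from a real node.

```ruby
# Services the documentation lists for a secondary frontend node.
EXPECTED_SERVICES = %w[
  geo-logcursor gitlab-pages gitlab-workhorse logrotate nginx
  registry remote-syslog sidekiq unicorn
].freeze

# Returns the expected services that are not reported as running.
# Only lines beginning with "run: <name>:" count as running.
def missing_services(status_output, expected = EXPECTED_SERVICES)
  running = status_output.each_line.map { |line| line[/\Arun: ([\w-]+):/, 1] }.compact
  expected - running
end

# Made-up status output for illustration only.
sample_status = <<~STATUS
  run: geo-logcursor: (pid 1201) 42s; run: log: (pid 1200) 42s
  run: nginx: (pid 1210) 42s; run: log: (pid 1209) 42s
  down: sidekiq: 3s, normally up; run: log: (pid 1215) 42s
STATUS

puts missing_services(sample_status).inspect
```

In practice the input would be the captured output of `sudo gitlab-ctl status` on each frontend application server, and an empty result means every listed service is running.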
diff --git a/doc/administration/geo/replication/img/adding_a_secondary_node.png b/doc/administration/geo/replication/img/adding_a_secondary_node.png
new file mode 100644
index 00000000000..5421b578672
--- /dev/null
+++ b/doc/administration/geo/replication/img/adding_a_secondary_node.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_add_geolocation_rule.png b/doc/administration/geo/replication/img/single_git_add_geolocation_rule.png
new file mode 100644
index 00000000000..4b04ba8d1f1
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_add_geolocation_rule.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png b/doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png
new file mode 100644
index 00000000000..c19ad57c953
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_clone_panel.png b/doc/administration/geo/replication/img/single_git_clone_panel.png
new file mode 100644
index 00000000000..8aa0bd2f7d8
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_clone_panel.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png b/doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png
new file mode 100644
index 00000000000..a554532f3b8
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_created_policy_record.png b/doc/administration/geo/replication/img/single_git_created_policy_record.png
new file mode 100644
index 00000000000..74c42395e15
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_created_policy_record.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_name_policy.png b/doc/administration/geo/replication/img/single_git_name_policy.png
new file mode 100644
index 00000000000..1a976539e94
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_name_policy.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_policy_diagram.png b/doc/administration/geo/replication/img/single_git_policy_diagram.png
new file mode 100644
index 00000000000..d62952dbbb3
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_policy_diagram.png
Binary files differ
diff --git a/doc/administration/geo/replication/img/single_git_traffic_policies.png b/doc/administration/geo/replication/img/single_git_traffic_policies.png
new file mode 100644
index 00000000000..b3193c23d99
--- /dev/null
+++ b/doc/administration/geo/replication/img/single_git_traffic_policies.png
Binary files differ
diff --git a/doc/administration/geo/replication/index.md b/doc/administration/geo/replication/index.md
index f9f56b96e22..1fef2e85ce6 100644
--- a/doc/administration/geo/replication/index.md
+++ b/doc/administration/geo/replication/index.md
@@ -63,7 +63,7 @@ Keep in mind that:
- Get user data for logins (API).
- Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT).
- Since GitLab Premium 10.0, the **primary** node no longer talks to **secondary** nodes to notify for changes (API).
-- Pushing directly to a **secondary** node (for both HTTP and SSH, including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+- Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/blog/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
- There are [limitations](#current-limitations) in the current implementation.
### Architecture
@@ -108,7 +108,7 @@ The following are required to run Geo:
[fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md))
The following operating systems are known to ship with a current version of OpenSSH:
- [CentOS](https://www.centos.org) 7.4+
- - [Ubuntu](https://www.ubuntu.com) 16.04+
+ - [Ubuntu](https://ubuntu.com) 16.04+
- PostgreSQL 9.6+ with [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support and [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
- Git 2.9+
- All nodes must run the same GitLab version.
@@ -229,6 +229,10 @@ For more information on Geo security, see [Geo security review](security_review.
For more information on tuning Geo, see [Tuning Geo](tuning.md).
+### Set up a location-aware Git URL
+
+For an example of how to set up a location-aware Git remote URL with AWS Route53, see [Location-aware Git remote URL with AWS Route53](location_aware_git_url.md).
+
## Remove Geo node
For more information on removing a Geo node, see [Removing **secondary** Geo nodes](remove_geo_node.md).
@@ -240,7 +244,7 @@ This list of limitations only reflects the latest version of GitLab. If you are
- Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected.
-- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [gitlab-org/omnibus-gitlab#2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
+- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [Omnibus GitLab issue #2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
- Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** node.
- [Selective synchronization](configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism.
- Object pools for forked project deduplication work only on the **primary** node, and are duplicated on the **secondary** node.
@@ -251,36 +255,58 @@ This list of limitations only reflects the latest version of GitLab. If you are
The following table lists the GitLab features along with their replication
and verification status on a **secondary** node.
-You can keep track of the progress to include the missing items in:
-
-- [ee-893](https://gitlab.com/groups/gitlab-org/-/epics/893).
-- [ee-1430](https://gitlab.com/groups/gitlab-org/-/epics/1430).
-
-| Feature | Replicated | Verified |
-|-----------|------------|----------|
-| All database content (e.g. snippets, epics, issues, merge requests, groups, and project metadata) | Yes | Yes |
-| Project repository | Yes | Yes |
-| Project wiki repository | Yes | Yes |
-| Project designs repository | No | No |
-| Uploads (e.g. attachments to issues, merge requests, epics, and avatars) | Yes | Yes, only on transfer, or manually (1) |
-| LFS Objects | Yes | Yes, only on transfer, or manually (1) |
-| CI job artifacts (other than traces) | Yes | No, only manually (1) |
-| Archived traces | Yes | Yes, only on transfer, or manually (1) |
-| Personal snippets | Yes | Yes |
-| Version-controlled personal snippets ([unsupported](https://gitlab.com/gitlab-org/gitlab-foss/issues/13426)) | No | No |
-| Project snippets | Yes | Yes |
-| Version-controlled project snippets ([unsupported](https://gitlab.com/gitlab-org/gitlab-foss/issues/13426)) | No | No |
-| Object pools for forked project deduplication | No | No |
-| [Server-side Git Hooks](../../custom_hooks.md) | No | No |
-| [Elasticsearch integration](../../../integration/elasticsearch.md) | No | No |
-| [GitLab Pages](../../pages/index.md) | No | No |
-| [Container Registry](../../packages/container_registry.md) | Yes | No |
-| [NPM Registry](../../../user/packages/npm_registry/index.md) | No | No |
-| [Maven Packages](../../../user/packages/maven_repository/index.md) | No | No |
-| [External merge request diffs](../../merge_request_diffs.md) | No, if they are on-disk | No |
-| Content in object storage ([track progress](https://gitlab.com/groups/gitlab-org/-/epics/1526)) | No | No |
-
-1. The integrity can be verified manually using [Integrity Check Rake Task](../../raketasks/check.md) on both nodes and comparing the output between them.
+You can keep track of the progress to implement the missing items in
+these epics/issues:
+
+- [Unreplicated Data Types](https://gitlab.com/groups/gitlab-org/-/epics/893)
+- [Verify all replicated data](https://gitlab.com/groups/gitlab-org/-/epics/1430)
+
+| Feature | Replicated | Verified | Notes |
+|-----------------------------------------------------|--------------------------|-----------------------------|--------------------------------------------|
+| All database content | **Yes** | **Yes** | |
+| Project repository | **Yes** | **Yes** | |
+| Project wiki repository | **Yes** | **Yes** | |
+| Project designs repository | [No][design-replication] | [No][design-verification] | |
+| Uploads | **Yes** | [No][upload-verification] | Verified only on transfer, or manually (1) |
+| LFS Objects | **Yes** | [No][lfs-verification] | Verified only on transfer, or manually (1) |
+| CI job artifacts (other than traces) | **Yes** | [No][artifact-verification] | Verified only manually (1) |
+| Archived traces | **Yes** | [No][artifact-verification] | Verified only on transfer, or manually (1) |
+| Personal snippets | **Yes** | **Yes** | |
+| Version-controlled personal snippets | No | No | [Not yet supported][unsupported-snippets] |
+| Project snippets | **Yes** | **Yes** | |
+| Version-controlled project snippets | No | No | [Not yet supported][unsupported-snippets] |
+| Object pools for forked project deduplication | **Yes** | No | |
+| [Server-side Git Hooks][custom-hooks] | No | No | |
+| [Elasticsearch integration][elasticsearch] | No | No | |
+| [GitLab Pages][gitlab-pages] | [No][pages-replication] | No | |
+| [Container Registry][container-registry] | **Yes** | No | |
+| [NPM Registry][npm-registry] | No | No | |
+| [Maven Repository][maven-repository] | No | No | |
+| [Conan Repository][conan-repository] | No | No | |
+| [External merge request diffs][merge-request-diffs] | [No][diffs-replication] | No | |
+| Content in object storage | **Yes** | No | |
+
+[design-replication]: https://gitlab.com/groups/gitlab-org/-/epics/1633
+[design-verification]: https://gitlab.com/gitlab-org/gitlab/issues/32467
+[upload-verification]: https://gitlab.com/groups/gitlab-org/-/epics/1817
+[lfs-verification]: https://gitlab.com/gitlab-org/gitlab/issues/8922
+[artifact-verification]: https://gitlab.com/gitlab-org/gitlab/issues/8923
+[diffs-replication]: https://gitlab.com/gitlab-org/gitlab/issues/33817
+[pages-replication]: https://gitlab.com/groups/gitlab-org/-/epics/589
+
+[unsupported-snippets]: https://gitlab.com/gitlab-org/gitlab/issues/14228
+[custom-hooks]: ../../custom_hooks.md
+[elasticsearch]: ../../../integration/elasticsearch.md
+[gitlab-pages]: ../../pages/index.md
+[container-registry]: ../../packages/container_registry.md
+[npm-registry]: ../../../user/packages/npm_registry/index.md
+[maven-repository]: ../../../user/packages/maven_repository/index.md
+[conan-repository]: ../../../user/packages/conan_repository/index.md
+[merge-request-diffs]: ../../merge_request_diffs.md
+
+1. The integrity can be verified manually using
+[Integrity Check Rake Task](../../raketasks/check.md)
+on both nodes and comparing the output between them.
DANGER: **DANGER**
Features not on this list, or with **No** in the **Replicated** column,
diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md
new file mode 100644
index 00000000000..6183a0ad119
--- /dev/null
+++ b/doc/administration/geo/replication/location_aware_git_url.md
@@ -0,0 +1,119 @@
+# Location-aware Git remote URL with AWS Route53 **(PREMIUM ONLY)**
+
+You can provide GitLab users with a single remote URL that automatically uses
+the Geo node closest to them. This means users don't need to update their Git
+configuration to take advantage of closer Geo nodes as they move.
+
+This is possible because Git push requests can be automatically redirected
+(HTTP) or proxied (SSH) from **secondary** nodes to the **primary** node.
+
+Though these instructions use [AWS Route53](https://aws.amazon.com/route53/),
+other services such as [Cloudflare](https://www.cloudflare.com/) could be used
+as well.
+
+NOTE: **Note**
+You can also use a load balancer to distribute web UI or API traffic to
+[multiple Geo **secondary** nodes](../../../user/admin_area/geo_nodes.md#multiple-secondary-nodes-behind-a-load-balancer).
+Importantly, the **primary** node cannot yet be included. See the feature request
+[Support putting the **primary** behind a Geo node load balancer](https://gitlab.com/gitlab-org/gitlab/issues/10888).
+
+## Prerequisites
+
+In this example, we have already set up:
+
+- `primary.example.com` as a Geo **primary** node.
+- `secondary.example.com` as a Geo **secondary** node.
+
+We will create a `git.example.com` subdomain that will automatically direct
+requests:
+
+- From Europe to the **secondary** node.
+- From all other locations to the **primary** node.
+
+In any case, you require:
+
+- A working GitLab **primary** node that is accessible at its own address.
+- A working GitLab **secondary** node.
+- A Route53 Hosted Zone managing your domain.
+
+If you have not yet set up a Geo **primary** node and **secondary** node, please consult
+[the Geo setup instructions](https://docs.gitlab.com/ee/administration/geo/replication/#setup-instructions).
+
+## Create a traffic policy
+
+In a Route53 Hosted Zone, traffic policies can be used to set up a variety of
+routing configurations.
+
+1. Navigate to the
+   [Route53 dashboard](https://console.aws.amazon.com/route53/home) and click
+   **Traffic policies**.
+
+   ![Traffic policies](img/single_git_traffic_policies.png)
+
+1. Click the **Create traffic policy** button.
+
+   ![Name policy](img/single_git_name_policy.png)
+
+1. Fill in the **Policy Name** field with `Single Git Host` and click **Next**.
+
+   ![Policy diagram](img/single_git_policy_diagram.png)
+
+1. Leave **DNS type** as `A: IP Address in IPv4 format`.
+1. Click **Connect to...** and select **Geolocation rule**.
+
+   ![Add geolocation rule](img/single_git_add_geolocation_rule.png)
+
+1. For the first **Location**, leave it as `Default`.
+1. Click **Connect to...** and select **New endpoint**.
+1. Choose **Type** `value` and fill it in with `<your **primary** IP address>`.
+1. For the second **Location**, choose `Europe`.
+1. Click **Connect to...** and select **New endpoint**.
+1. Choose **Type** `value` and fill it in with `<your **secondary** IP address>`.
+
+   ![Add traffic policy endpoints](img/single_git_add_traffic_policy_endpoints.png)
+
+1. Click **Create traffic policy**.
+
+   ![Create policy records with traffic policy](img/single_git_create_policy_records_with_traffic_policy.png)
+
+1. Fill in **Policy record DNS name** with `git`.
+1. Click **Create policy records**.
+
+   ![Created policy record](img/single_git_created_policy_record.png)
+
+You have successfully set up a single host, for example `git.example.com`, which
+distributes traffic to your Geo nodes by geolocation!
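
The console steps above produce what Route53 stores as a JSON traffic policy document. The following is a minimal Python sketch of an equivalent document, assuming the `2015-10-01` traffic policy document format. The endpoint names `primary` and `secondary`, the rule name `geo_rule`, and both IP addresses (taken from the reserved documentation ranges) are illustrative placeholders and do not come from the guide. Verify the field names against the AWS documentation before relying on them.

```python
import json


def build_geo_traffic_policy(primary_ip: str, secondary_ip: str) -> dict:
    """Return a traffic policy document: Europe -> secondary, default -> primary."""
    return {
        "AWSPolicyFormatVersion": "2015-10-01",
        "RecordType": "A",  # corresponds to "A: IP Address in IPv4 format"
        "StartRule": "geo_rule",  # hypothetical rule name
        "Endpoints": {
            # Hypothetical endpoint names; "value" matches the **Type** chosen above.
            "primary": {"Type": "value", "Value": primary_ip},
            "secondary": {"Type": "value", "Value": secondary_ip},
        },
        "Rules": {
            "geo_rule": {
                "RuleType": "geo",
                "Locations": [
                    # First location: the `Default` entry, sent to the primary node.
                    {"EndpointReference": "primary", "IsDefault": True},
                    # Second location: Europe, sent to the secondary node.
                    {"EndpointReference": "secondary", "Continent": "EU"},
                ],
            }
        },
    }


# Placeholder addresses from the RFC 5737 documentation ranges.
policy = build_geo_traffic_policy("203.0.113.10", "198.51.100.20")
print(json.dumps(policy, indent=2))
```

If the printed JSON is saved as `policy.json`, it could in principle be submitted with `aws route53 create-traffic-policy --name "Single Git Host" --document file://policy.json` instead of clicking through the console.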
+
+## Configure Git clone URLs to use the special Git URL
+
+When a user clones a repository for the first time, they typically copy the Git
+remote URL from the project page. By default, these SSH and HTTP URLs are based
+on the external URL of the current host. For example:
+
+- `git@secondary.example.com:group1/project1.git`
+- `https://secondary.example.com/group1/project1.git`
+
+![Clone panel](img/single_git_clone_panel.png)
+
+You can customize the:
+
+- SSH remote URL to use the location-aware `git.example.com`. To do so, change the SSH remote URL's
+ host by setting `gitlab_rails['gitlab_ssh_host']` in `gitlab.rb` of web nodes.
+- HTTP remote URL as shown in
+ [Custom Git clone URL for HTTP(S)](../../../user/admin_area/settings/visibility_and_access_controls.md#custom-git-clone-url-for-https).
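
To make the effect of these two settings concrete, here is a small Python sketch of how the remote URLs in the clone panel change once the host is replaced. The helper `clone_urls` is hypothetical and only mirrors the URL shapes in the examples above; it assumes the default `git` SSH user and does not handle custom SSH ports.

```python
def clone_urls(host: str, project_path: str) -> dict:
    """Build SSH and HTTPS remote URLs in the shape shown in the clone panel."""
    return {
        "ssh": f"git@{host}:{project_path}.git",
        "https": f"https://{host}/{project_path}.git",
    }


# Default: URLs are based on the external URL of the current host.
before = clone_urls("secondary.example.com", "group1/project1")
# After customizing both remote URLs to use the location-aware host.
after = clone_urls("git.example.com", "group1/project1")

for scheme in ("ssh", "https"):
    print(f"{before[scheme]} -> {after[scheme]}")
```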
+
+## Example Git request handling behavior
+
+After following the configuration steps above, handling for Git requests is now location aware.
+For requests:
+
+- Outside Europe, all requests are directed to the **primary** node.
+- Within Europe, over:
+  - HTTP:
+    - `git clone http://git.example.com/foo/bar.git` is directed to the **secondary** node.
+    - `git push` is initially directed to the **secondary**, which automatically
+      redirects to `primary.example.com`.
+  - SSH:
+    - `git clone git@git.example.com:foo/bar.git` is directed to the **secondary**.
+    - `git push` is initially directed to the **secondary**, which automatically
+      proxies the request to `primary.example.com`.
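
The behavior listed above can be summarized as a small decision function. This is a Python sketch for illustration only: `route_git_request`, the `Route` tuple, and the `"EU"` continent code are assumptions introduced here, and the function models only the `clone` and `push` cases that the list describes. It is not how GitLab or Route53 implement routing.

```python
from typing import NamedTuple

PRIMARY = "primary.example.com"
SECONDARY = "secondary.example.com"


class Route(NamedTuple):
    first_hop: str   # node that git.example.com resolves to for this client
    handled_by: str  # node that finally serves the request
    handoff: str     # "none", "redirect" (HTTP) or "proxy" (SSH)


def route_git_request(continent: str, protocol: str, operation: str) -> Route:
    """Model where a Git request to git.example.com ends up."""
    if protocol not in ("http", "ssh"):
        raise ValueError(f"unsupported protocol: {protocol}")
    if operation not in ("clone", "push"):
        raise ValueError(f"unsupported operation: {operation}")

    # Outside Europe, DNS answers with the primary node for every request.
    if continent != "EU":
        return Route(PRIMARY, PRIMARY, "none")

    # Within Europe, clones are served directly by the secondary node.
    if operation == "clone":
        return Route(SECONDARY, SECONDARY, "none")

    # Pushes reach the secondary first, which hands them to the primary:
    # an HTTP redirect, or a proxied connection over SSH.
    handoff = "redirect" if protocol == "http" else "proxy"
    return Route(SECONDARY, PRIMARY, handoff)


print(route_git_request("EU", "http", "push"))
print(route_git_request("NA", "ssh", "clone"))
```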
diff --git a/doc/administration/geo/replication/object_storage.md b/doc/administration/geo/replication/object_storage.md
index 878b67a8f8e..a9087abcbd9 100644
--- a/doc/administration/geo/replication/object_storage.md
+++ b/doc/administration/geo/replication/object_storage.md
@@ -1,16 +1,33 @@
# Geo with Object storage **(PREMIUM ONLY)**
-Geo can be used in combination with Object Storage (AWS S3, or
-other compatible object storage).
+Geo can be used in combination with Object Storage (AWS S3, or other compatible object storage).
-## Configuration
+Currently, **secondary** nodes can use either:
-At this time it is required that if object storage is enabled on the
-**primary** node, it must also be enabled on each **secondary** node.
+- The same storage bucket as the **primary** node.
+- A replicated storage bucket.
-**Secondary** nodes can use the same storage bucket as the **primary** node, or
-they can use a replicated storage bucket. At this time GitLab does not
-take care of content replication in object storage.
+To have:
+
+- GitLab manage replication, follow [Enabling GitLab replication](#enabling-gitlab-managed-object-storage-replication).
+- Third-party services manage replication, follow [Third-party replication services](#third-party-replication-services).
+
+## Enabling GitLab managed object storage replication
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/issues/10586) in GitLab 12.4.
+
+CAUTION: **Caution:**
+This is a [**beta** feature](https://about.gitlab.com/handbook/product/#beta) and is not yet ready for production use at any scale.
+
+**Secondary** nodes can replicate files stored on the **primary** node regardless of
+whether they are stored on the local filesystem or in object storage.
+
+To enable GitLab replication, you must:
+
+1. Go to **Admin Area > Geo**.
+1. Press **Edit** on the **secondary** node.
+1. Enable the **Allow this secondary node to replicate content on Object Storage**
+ checkbox.
For LFS, follow the documentation to
[set up LFS object storage](../../../workflow/lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage).
@@ -20,12 +37,21 @@ For CI job artifacts, there is similar documentation to configure
For user uploads, there is similar documentation to configure [upload object storage](../../uploads.md#using-object-storage-core-only)
-You should enable and configure object storage on both **primary** and **secondary**
-nodes. Migrating existing data to object storage should be performed on the
-**primary** node only. **Secondary** nodes will automatically notice that the migrated
-files are now in object storage.
+If you want to migrate the **primary** node's files to object storage, you can
+configure the **secondary** in a few ways:
+
+- Use the exact same object storage.
+- Use a separate object store but leverage your object storage solution's built-in
+ replication.
+- Use a separate object store and enable the **Allow this secondary node to replicate
+ content on Object Storage** setting.
+
+GitLab does not currently support the case where both:
+
+- The **primary** node uses local storage.
+- A **secondary** node uses object storage.
-## Replication
+## Third-party replication services
When using Amazon S3, you can use
[CRR](https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html) to
diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md
index 832d02be9a5..68bf5b5d23a 100644
--- a/doc/administration/geo/replication/security_review.md
+++ b/doc/administration/geo/replication/security_review.md
@@ -1,9 +1,9 @@
# Geo security review (Q&A) **(PREMIUM ONLY)**
-The following security review of the Geo feature set focuses on security
-aspects of the feature as they apply to customers running their own GitLab
-instances. The review questions are based in part on the [application security architecture](https://www.owasp.org/index.php/Application_Security_Architecture_Cheat_Sheet)
-questions from [owasp.org](https://www.owasp.org).
+The following security review of the Geo feature set focuses on security aspects of
+the feature as they apply to customers running their own GitLab instances. The review
+questions are based in part on the [OWASP Application Security Verification Standard Project](https://www.owasp.org/index.php/Category:OWASP_Application_Security_Verification_Standard_Project)
+from [owasp.org](https://www.owasp.org/index.php/Main_Page).
## Business Model
@@ -30,7 +30,7 @@ questions from [owasp.org](https://www.owasp.org).
private projects. Geo replicates them all indiscriminately. “Selective sync”
exists for files and repositories (but not database content), which would permit
only less-sensitive projects to be replicated to a **secondary** node if desired.
-- See also: [developing a data classification policy](https://gitlab.com/gitlab-com/security/issues/4).
+- See also: [GitLab data classification policy](https://about.gitlab.com/handbook/engineering/security/data-classification-policy.html).
### What data backup and retention requirements have been defined for the application?
@@ -49,9 +49,9 @@ questions from [owasp.org](https://www.owasp.org).
### How do the end‐users interact with the application?
- **Secondary** nodes provide all the interfaces a **primary** node does
- (notably a HTTP/HTTPS web application, and HTTP/HTTPS or SSH git repository
+ (notably an HTTP/HTTPS web application, and HTTP/HTTPS or SSH Git repository
access), but is constrained to read-only activities. The principal use case is
- envisioned to be cloning git repositories from the **secondary** node in favor of the
+ envisioned to be cloning Git repositories from the **secondary** node in favor of the
**primary** node, but end-users may use the GitLab web interface to view projects,
issues, merge requests, snippets, etc.
@@ -229,7 +229,7 @@ questions from [owasp.org](https://www.owasp.org).
- A static secret shared across all hosts in a GitLab deployment.
- In transit, data should be encrypted, although the application does permit
communication to proceed unencrypted. The two main transits are the **secondary** node’s
- replication process for PostgreSQL, and for git repositories/files. Both should
+ replication process for PostgreSQL, and for Git repositories/files. Both should
be protected using TLS, with the keys for that managed via Omnibus per existing
configuration for end-user access to GitLab.
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index 263fc05dce9..4d64941411a 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -252,7 +252,7 @@ to start again from scratch, there are a few steps that can help you:
gitlab-ctl stop geo-logcursor
```
- You can watch sidekiq logs to know when sidekiq jobs processing have finished:
+ You can watch Sidekiq logs to know when Sidekiq job processing has finished:
```sh
gitlab-ctl tail sidekiq
@@ -280,8 +280,8 @@ to start again from scratch, there are a few steps that can help you:
Any uploaded content like file attachments, avatars or LFS objects are stored in a
subfolder in one of the two paths below:
- - /var/opt/gitlab/gitlab-rails/shared
- - /var/opt/gitlab/gitlab-rails/uploads
+ - `/var/opt/gitlab/gitlab-rails/shared`
+ - `/var/opt/gitlab/gitlab-rails/uploads`
To rename all of them:
diff --git a/doc/administration/geo/replication/using_a_geo_server.md b/doc/administration/geo/replication/using_a_geo_server.md
index 55b5d486676..55c7e78da92 100644
--- a/doc/administration/geo/replication/using_a_geo_server.md
+++ b/doc/administration/geo/replication/using_a_geo_server.md
@@ -4,7 +4,7 @@
After you set up the [database replication and configure the Geo nodes][req], use your closest GitLab node as you would a normal standalone GitLab instance.
-Pushing directly to a **secondary** node (for both HTTP, SSH including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+Pushing directly to a **secondary** node (for both HTTP, SSH including Git LFS) was [introduced](https://about.gitlab.com/blog/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
Example of the output you will see when pushing to a **secondary** node: