Diffstat (limited to 'doc/administration/geo/replication')
13 files changed, 663 insertions, 500 deletions
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 0b076e7ff3c..86a8e5b28d1 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -262,7 +262,7 @@ You can login to the **secondary** node with the same credentials you used for t
 **secondary** Geo node and if Geo is enabled.
 
 The initial replication, or 'backfill', will probably still be in progress. You
-can monitor the synchronization process on each geo node from the **primary**
+can monitor the synchronization process on each Geo node from the **primary**
 node's **Geo Nodes** dashboard in your browser.
 
 ![Geo dashboard](img/geo_node_dashboard.png)
@@ -314,19 +314,17 @@ It is important to note that selective synchronization:
 Selective synchronization restrictions are implemented on the **secondary** nodes,
 not the **primary** node.
 
-### Git operations on unreplicated respositories
+### Git operations on unreplicated repositories
 
-> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/2562) in GitLab 12.10.
+> [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/2562) in GitLab 12.10 for HTTP(S) and in GitLab 13.0 for SSH.
 
-Git clone, pull, and push operations over HTTP(S) are supported for repositories that
+Git clone, pull, and push operations over HTTP(S) and SSH are supported for repositories that
 exist on the **primary** node but not on **secondary** nodes. This situation can occur
 when:
 
 - Selective synchronization does not include the project attached to the repository.
 - The repository is actively being replicated but has not completed yet.
 
-SSH [support is planned](https://gitlab.com/groups/gitlab-org/-/epics/2562).
-
 ## Upgrading Geo
 
 See the [updating the Geo nodes document](updating_the_geo_nodes.md).
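The `configuration.md` change above states that clone, pull, and push now work over both HTTP(S) and SSH for repositories that exist only on the **primary** node. As an illustration only (this is not part of the commit), the sketch below builds the two kinds of remote URL a user might point at a **secondary** node. The function names, the hostname `gitlab.eu.example.com`, the project path `group/project`, and the `git` SSH user are all assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: build HTTPS and SSH remote URLs for a project on a Geo
# secondary node. The function names, hostname, and project path are
# hypothetical, and the SSH user is assumed to be "git".

# Print the HTTP(S) remote URL for a project ($2) on the given host ($1).
secondary_https_url() {
  printf 'https://%s/%s.git\n' "$1" "$2"
}

# Print the SSH remote URL for a project ($2) on the given host ($1).
secondary_ssh_url() {
  printf 'git@%s:%s.git\n' "$1" "$2"
}

# Example usage (hypothetical names); uncomment to run against a real node:
# git clone "$(secondary_https_url gitlab.eu.example.com group/project)"
# git clone "$(secondary_ssh_url gitlab.eu.example.com group/project)"
```

The clone commands are left commented out so the sketch has no network side effects. Whether an operation succeeds against a repository that has not been replicated yet depends on the GitLab version, as described in the hunk above.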
diff --git a/doc/administration/geo/replication/database.md b/doc/administration/geo/replication/database.md index ffdec5a83c7..62bd0e6ac19 100644 --- a/doc/administration/geo/replication/database.md +++ b/doc/administration/geo/replication/database.md @@ -33,9 +33,9 @@ recover. See below for more details. The following guide assumes that: -- You are using Omnibus and therefore you are using PostgreSQL 9.6 or later - which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/9.6/app-pgbasebackup.html) and improved - [Foreign Data Wrapper](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support. +- You are using Omnibus and therefore you are using PostgreSQL 11 or later + which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/11/app-pgbasebackup.html) and improved + [Foreign Data Wrapper](https://www.postgresql.org/docs/11/postgres-fdw.html) support. - You have a **primary** node already set up (the GitLab server you are replicating from), running Omnibus' PostgreSQL (or equivalent version), and you have a new **secondary** server set up with the same versions of the OS, @@ -91,7 +91,7 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab` postgresql['sql_user_password'] = '<md5_hash_of_your_password>' - # Every node that runs Unicorn or Sidekiq needs to have the database + # Every node that runs Puma or Sidekiq needs to have the database # password specified as below. If you have a high-availability setup, this # must be present in all application nodes. gitlab_rails['db_password'] = '<your_password_here>' @@ -160,7 +160,7 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`. The `listen_address` option opens PostgreSQL up to network connections with the interface - corresponding to the given address. 
See [the PostgreSQL documentation](https://www.postgresql.org/docs/9.6/runtime-config-connection.html) + corresponding to the given address. See [the PostgreSQL documentation](https://www.postgresql.org/docs/11/runtime-config-connection.html) for more details. Depending on your network configuration, the suggested addresses may not @@ -213,7 +213,7 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o ``` You may also want to edit the `wal_keep_segments` and `max_wal_senders` to match your - database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/9.6/runtime-config-replication.html) + database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/11/runtime-config-replication.html) for more information. 1. Save the file and reconfigure GitLab for the database listen changes and @@ -273,7 +273,7 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o 1. Stop application server and Sidekiq ```shell - gitlab-ctl stop unicorn + gitlab-ctl stop puma gitlab-ctl stop sidekiq ``` @@ -442,7 +442,7 @@ data before running `pg_basebackup`. (e.g., you know the network path is secure, or you are using a site-to-site VPN). This is **not** safe over the public Internet! - You can read more details about each `sslmode` in the - [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-ssl.html#LIBPQ-SSL-PROTECTION); + [PostgreSQL documentation](https://www.postgresql.org/docs/11/libpq-ssl.html#LIBPQ-SSL-PROTECTION); the instructions above are carefully written to ensure protection against both passive eavesdroppers and active "man-in-the-middle" attackers. - Change the `--slot-name` to the name of the replication slot @@ -461,10 +461,10 @@ The replication process is now complete. PostgreSQL connections. 
We recommend using PgBouncer if you use GitLab in a high-availability configuration with a cluster of nodes supporting a Geo **primary** node and another cluster of nodes supporting a Geo **secondary** node. For more -information, see [High Availability with GitLab Omnibus](../../high_availability/database.md#high-availability-with-gitlab-omnibus-premium-only). +information, see [High Availability with Omnibus GitLab](../../high_availability/database.md#high-availability-with-omnibus-gitlab-premium-only). For a Geo **secondary** node to work properly with PgBouncer in front of the database, -it will need a separate read-only user to make [PostgreSQL FDW queries](https://www.postgresql.org/docs/9.6/postgres-fdw.html) +it will need a separate read-only user to make [PostgreSQL FDW queries](https://www.postgresql.org/docs/11/postgres-fdw.html) work: 1. On the **primary** Geo database, enter the PostgreSQL on the console as an diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md index 3431df3ed1f..17031b11f51 100644 --- a/doc/administration/geo/replication/datatypes.md +++ b/doc/administration/geo/replication/datatypes.md @@ -79,7 +79,7 @@ GitLab stores files and blobs such as Issue attachments or LFS objects into eith - A Storage Appliance that exposes an Object Storage-compatible API. When using the filesystem store instead of Object Storage, you need to use network mounted filesystems -to run GitLab when using more than one server (for example with a High Availability setup). +to run GitLab when using more than one server. With respect to replication and verification: @@ -135,6 +135,7 @@ successfully, you must replicate their data using some other means. 
| CI job artifacts (other than traces) | **Yes** | [No](https://gitlab.com/gitlab-org/gitlab/issues/8923) | Verified only manually (*1*) | | Archived traces | **Yes** | [No](https://gitlab.com/gitlab-org/gitlab/issues/8923) | Verified only on transfer, or manually (*1*) | | Personal snippets | **Yes** | **Yes** | | +| [Versioned snippets](../../../user/snippets.md#versioned-snippets) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2810) | | | Project snippets | **Yes** | **Yes** | | | Object pools for forked project deduplication | **Yes** | No | | | [Server-side Git Hooks](../../custom_hooks.md) | No | No | | @@ -145,7 +146,7 @@ successfully, you must replicate their data using some other means. | [Maven Repository](../../../user/packages/maven_repository/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2346) | No | | | [Conan Repository](../../../user/packages/conan_repository/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2346) | No | | | [NuGet Repository](../../../user/packages/nuget_repository/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2346) | No | | -| [PyPi Repository](../../../user/packages/pypi_repository/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2554) | No | | +| [PyPi Repository](../../../user/packages/pypi_repository/index.md) | [No](https://gitlab.com/groups/gitlab-org/-/epics/2554) | No | | | [External merge request diffs](../../merge_request_diffs.md) | [No](https://gitlab.com/gitlab-org/gitlab/issues/33817) | No | | | Content in object storage | **Yes** | No | | diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md index e305718580e..ae3069a0e40 100644 --- a/doc/administration/geo/replication/external_database.md +++ b/doc/administration/geo/replication/external_database.md @@ -17,6 +17,19 @@ developed and tested. 
We aim to be compatible with most external sudo -i ``` +1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** ID for your node (arbitrary value): + + ```ruby + # The unique identifier for the Geo node. + gitlab_rails['geo_node_name'] = '<node_name_here>' + ``` + +1. Reconfigure the **primary** node for the change to take effect: + + ```shell + gitlab-ctl reconfigure + ``` + 1. Execute the command below to define the node as **primary** node: ```shell @@ -38,7 +51,14 @@ Given you have a primary node set up on AWS EC2 that uses RDS. You can now just create a read-only replica in a different region and the replication process will be managed by AWS. Make sure you've set Network ACL, Subnet, and Security Group according to your needs, so the secondary application node can access the database. -Skip to the [Configure secondary application node](#configure-secondary-application-nodes-to-use-the-external-read-replica) section below. + +The following instructions detail how to create a read-only replica for common +cloud providers: + +- Amazon RDS - [Creating a Read Replica](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/USER_ReadRepl.html#USER_ReadRepl.Create) +- Azure Database for PostgreSQL - [Create and manage read replicas in Azure Database for PostgreSQL](https://docs.microsoft.com/en-us/azure/postgresql/howto-read-replicas-portal) + +Once your read-only replica is set up, you can skip to [configure you secondary application node](#configure-secondary-application-nodes-to-use-the-external-read-replica). #### Manually configure the primary database for replication @@ -133,6 +153,10 @@ To configure the connection to the external read-replica database and enable Log gitlab_rails['db_username'] = 'gitlab' gitlab_rails['db_host'] = '<database_read_replica_host>' + + # Disable the bundled Omnibus PostgreSQL, since we are + # using an external PostgreSQL + postgresql['enable'] = false ``` 1. 
Save the file and [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) @@ -142,11 +166,17 @@ To configure the connection to the external read-replica database and enable Log **Secondary** nodes use a separate PostgreSQL installation as a tracking database to keep track of replication status and automatically recover from potential replication issues. Omnibus automatically configures a tracking database -when `roles ['geo_secondary_role']` is set. For high availability, -refer to [Geo High Availability](../../availability/index.md). +when `roles ['geo_secondary_role']` is set. If you want to run this database external to Omnibus, please follow the instructions below. -The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) +If you are using a cloud-managed service for the tracking database, you may need +to grant additional roles to your tracking database user (by default, this is +`gitlab_geo`): + +- Amazon RDS requires the [`rds_superuser`](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Appendix.PostgreSQL.CommonDBATasks.html#Appendix.PostgreSQL.CommonDBATasks.Roles) role. +- Azure Database for PostgreSQL requires the [`azure_pg_admin`](https://docs.microsoft.com/en-us/azure/postgresql/howto-create-users#how-to-create-additional-admin-users-in-azure-database-for-postgresql) role. + +The tracking database requires an [FDW](https://www.postgresql.org/docs/11/postgres-fdw.html) connection with the **secondary** replica database for improved performance. If you have an external database ready to be used as the tracking database, @@ -200,7 +230,7 @@ the tracking database on port 5432. gitlab-rake geo:db:migrate ``` -1. Configure the [PostgreSQL FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) +1. Configure the [PostgreSQL FDW](https://www.postgresql.org/docs/11/postgres-fdw.html) connection and credentials: Save the script below in a file, ex. 
`/tmp/geo_fdw.sh` and modify the connection diff --git a/doc/administration/geo/replication/geo_validation_tests.md b/doc/administration/geo/replication/geo_validation_tests.md new file mode 100644 index 00000000000..a8b0bdeb7da --- /dev/null +++ b/doc/administration/geo/replication/geo_validation_tests.md @@ -0,0 +1,100 @@ +# Geo validation tests + +The Geo team performs manual testing and validation on common deployment configurations to ensure +that Geo works when upgrading between minor GitLab versions and major PostgreSQL database versions. + +This section contains a journal of recent validation tests and links to the relevant issues. + +## GitLab upgrades + +The following are GitLab upgrade validation tests we performed. + +### February 2020 + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/201837): + +- Description: Tested upgrading from GitLab 12.7.5 to the latest GitLab 12.8 package in a multi-server + configuration. +- Outcome: Partial success because we did not run the looping pipeline during the demo to monitor + downtime. + +### January 2020 + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/200085): + +- Description: Tested upgrading from GitLab 12.6.x to the latest GitLab 12.7 package in a multi-server + configuration. +- Outcome: Upgrade test was successful. +- Follow up issues: + - [Investigate Geo end-to-end test failures](https://gitlab.com/gitlab-org/gitlab/issues/201823). + - [Add more logging to Geo end-to-end tests](https://gitlab.com/gitlab-org/gitlab/issues/201830). + - [Excess service restarts during zero-downtime upgrade](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5047). + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/199836): + +- Description: Tested upgrading from GitLab 12.5.7 to GitLab 12.6.6 in a multi-server configuration. +- Outcome: Upgrade test was successful. 
+- Follow up issue: + [Update documentation for zero-downtime upgrades to ensure deploy node it not in use](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5046). + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/37044): + +- Description: Tested upgrading from GitLab 12.4.x to the latest GitLab 12.5 package in a multi-server + configuration. +- Outcome: Upgrade test was successful. +- Follow up issues: + - [Investigate why HTTP push spec failed on primary node](https://gitlab.com/gitlab-org/gitlab/issues/199825). + - [Investigate if documentation should be modified to include refresh foreign tables task](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5041). + +### October 2019 + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/35262): + +- Description: Tested upgrading from GitLab 12.3.5 to GitLab 12.4.1 in a multi-server configuration. +- Outcome: Upgrade test was successful. + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/32437): + +- Description: Tested upgrading from GitLab 12.2.8 to GitLab 12.3.5. +- Outcome: Upgrade test was successful. + +[Upgrade Geo multi-server installation](https://gitlab.com/gitlab-org/gitlab/-/issues/32435): + +- Description: Tested upgrading from GitLab 12.1.9 to GitLab 12.2.8. +- Outcome: Partial success due to possible misconfiguration issues. + +## PostgreSQL upgrades + +The following are PostgreSQL upgrade validation tests we performed. + +### April 2020 + +[PostgreSQL 11 upgrade procedure for Geo installations](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4975): + +- Description: Prior to making PostgreSQL 11 the default version of PostgreSQL in GitLab 12.10, we + tested upgrading to PostgreSQL 11 in Geo deployments on GitLab 12.9. +- Outcome: Partially successful. 
Issues were discovered in multi-server configurations with a separate + tracking database and concerns were raised about allowing automatic upgrades when Geo enabled. +- Follow up issues: + - [`replicate-geo-database` incorrectly tries to back up repositories](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5241). + - [`pg-upgrade` fails to upgrade a standalone Geo tracking database](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5242). + - [`revert-pg-upgrade` fails to downgrade the PostgreSQL data of a Geo secondary’s standalone tracking database](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5243). + - [Timeout error on Geo secondary read-replica near the end of `gitlab-ctl pg-upgrade`](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5235). + +[Verify Geo installation with PostgreSQL 11](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4971): + +- Description: Prior to making PostgreSQL 11 the default version of PostgreSQL in GitLab 12.10, we + tested fresh installations of GitLab 12.9 with Geo installed with PostgreSQL 11. +- Outcome: Installation test was successful. + +### September 2019 + +[Test and validate PostgreSQL 10.0 upgrade for Geo](https://gitlab.com/gitlab-org/gitlab/issues/12092): + +- Description: With the 12.0 release, GitLab required an upgrade to PostgreSQL 10.0. We tested + various upgrade scenarios from GitLab 11.11.5 through to GitLab 12.1.8. +- Outcome: Multiple issues were found when upgrading and addressed in follow-up issues. +- Follow up issues: + - [`gitlab-ctl` reconfigure fails on Redis node in multi-server Geo setup](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4706). + - [Geo multi-server upgrade from 12.0.9 to 12.1.9 does not upgrade PostgreSQL](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4705). + - [Refresh foreign tables fails on app server in multi-server setup after upgrade to 12.1.9](https://gitlab.com/gitlab-org/gitlab/-/issues/32119). 
diff --git a/doc/administration/geo/replication/high_availability.md b/doc/administration/geo/replication/high_availability.md index 5099e73d5e8..214f15b7565 100644 --- a/doc/administration/geo/replication/high_availability.md +++ b/doc/administration/geo/replication/high_availability.md @@ -1,460 +1,5 @@ -# Geo High Availability **(PREMIUM ONLY)** +--- +redirect_to: 'multiple_servers.md' +--- -This document describes a minimal reference architecture for running Geo -in a high availability configuration. If your HA setup differs from the one -described, it is possible to adapt these instructions to your needs. - -## Architecture overview - -![Geo HA Diagram](../../high_availability/img/geo-ha-diagram.png) - -_[diagram source - GitLab employees only](https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit)_ - -The topology above assumes that the **primary** and **secondary** Geo clusters -are located in two separate locations, on their own virtual network -with private IP addresses. The network is configured such that all machines within -one geographic location can communicate with each other using their private IP addresses. -The IP addresses given are examples and may be different depending on the -network topology of your deployment. - -The only external way to access the two Geo deployments is by HTTPS at -`gitlab.us.example.com` and `gitlab.eu.example.com` in the example above. - -NOTE: **Note:** -The **primary** and **secondary** Geo deployments must be able to communicate to each other over HTTPS. - -## Redis and PostgreSQL High Availability - -Geo supports: - -- Redis and PostgreSQL on the **primary** node configured for high availability -- Redis on **secondary** nodes configured for high availability. - -NOTE: **Note:** -Support for PostgreSQL on **secondary** nodes in high availability configuration -[is planned](https://gitlab.com/groups/gitlab-org/-/epics/2536). 
- -Because of the additional complexity involved in setting up this configuration -for PostgreSQL and Redis, it is not covered by this Geo HA documentation. - -For more information about setting up a highly available PostgreSQL cluster and Redis cluster using the omnibus package see the high availability documentation for -[PostgreSQL](../../high_availability/database.md) and -[Redis](../../high_availability/redis.md), respectively. - -NOTE: **Note:** -It is possible to use cloud hosted services for PostgreSQL and Redis, but this is beyond the scope of this document. - -## Prerequisites: Two working GitLab HA clusters - -One cluster will serve as the **primary** node. Use the -[GitLab HA documentation](../../availability/index.md) to set this up. If -you already have a working GitLab instance that is in-use, it can be used as a -**primary**. - -The second cluster will serve as the **secondary** node. Again, use the -[GitLab HA documentation](../../availability/index.md) to set this up. -It's a good idea to log in and test it, however, note that its data will be -wiped out as part of the process of replicating from the **primary**. - -## Configure the GitLab cluster to be the **primary** node - -The following steps enable a GitLab cluster to serve as the **primary** node. - -### Step 1: Configure the **primary** frontend servers - -1. Edit `/etc/gitlab/gitlab.rb` and add the following: - - ```ruby - ## - ## Enable the Geo primary role - ## - roles ['geo_primary_role'] - - ## - ## The unique identifier for the Geo node. - ## - gitlab_rails['geo_node_name'] = '<node_name_here>' - - ## - ## Disable automatic migrations - ## - gitlab_rails['auto_migrate'] = false - ``` - -After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. 
- -NOTE: **Note:** PostgreSQL and Redis should have already been disabled on the -application servers, and connections from the application servers to those -services on the backend servers configured, during normal GitLab HA set up. See -high availability configuration documentation for -[PostgreSQL](../../high_availability/database.md#configuring-the-application-nodes) -and [Redis](../../high_availability/redis.md#example-configuration-for-the-gitlab-application). - -### Step 2: Configure the **primary** database - -1. Edit `/etc/gitlab/gitlab.rb` and add the following: - - ```ruby - ## - ## Configure the Geo primary role and the PostgreSQL role - ## - roles ['geo_primary_role', 'postgres_role'] - ``` - -## Configure a **secondary** node - -A **secondary** cluster is similar to any other GitLab HA cluster, with two -major differences: - -- The main PostgreSQL database is a read-only replica of the **primary** node's - PostgreSQL database. -- There is also a single PostgreSQL database for the **secondary** cluster, - called the "tracking database", which tracks the synchronization state of - various resources. - -Therefore, we will set up the HA components one-by-one, and include deviations -from the normal HA setup. However, we highly recommend first configuring a -brand-new cluster as if it were not part of a Geo setup so that it can be -tested and verified as a working cluster. And only then should it be modified -for use as a Geo **secondary**. This helps to separate problems that are related -and are not related to Geo setup. - -### Step 1: Configure the Redis and Gitaly services on the **secondary** node - -Configure the following services, again using the non-Geo high availability -documentation: - -- [Configuring Redis for GitLab HA](../../high_availability/redis.md) for high - availability. -- [Gitaly](../../high_availability/gitaly.md), which will store data that is - synchronized from the **primary** node. 
- -NOTE: **Note:** -[NFS](../../high_availability/nfs.md) can be used in place of Gitaly but is not -recommended. - -### Step 2: Configure the main read-only replica PostgreSQL database on the **secondary** node - -NOTE: **Note:** The following documentation assumes the database will be run on -a single node only. PostgreSQL HA on **secondary** nodes is -[not currently supported](https://gitlab.com/groups/gitlab-org/-/epics/2536). - -Configure the [**secondary** database](database.md) as a read-only replica of -the **primary** database. Use the following as a guide. - -1. Generate an MD5 hash of the desired password for the database user that the - GitLab application will use to access the read-replica database: - - Note that the username (`gitlab` by default) is incorporated into the hash. - - ```shell - gitlab-ctl pg-password-md5 gitlab - # Enter password: <your_password_here> - # Confirm password: <your_password_here> - # fca0b89a972d69f00eb3ec98a5838484 - ``` - - Use this hash to fill in `<md5_hash_of_your_password>` in the next step. - -1. 
Edit `/etc/gitlab/gitlab.rb` in the replica database machine, and add the - following: - - ```ruby - ## - ## Configure the Geo secondary role and the PostgreSQL role - ## - roles ['geo_secondary_role', 'postgres_role'] - - ## - ## Secondary address - ## - replace '<secondary_node_ip>' with the public or VPC address of your Geo secondary node - ## - replace '<tracking_database_ip>' with the public or VPC address of your Geo tracking database node - ## - postgresql['listen_address'] = '<secondary_node_ip>' - postgresql['md5_auth_cidr_addresses'] = ['<secondary_node_ip>/32', '<tracking_database_ip>/32'] - - ## - ## Database credentials password (defined previously in primary node) - ## - replicate same values here as defined in primary node - ## - postgresql['sql_user_password'] = '<md5_hash_of_your_password>' - gitlab_rails['db_password'] = '<your_password_here>' - - ## - ## When running the Geo tracking database on a separate machine, disable it - ## here and allow connections from the tracking database host. And ensure - ## the tracking database IP is in postgresql['md5_auth_cidr_addresses'] above. - ## - geo_postgresql['enable'] = false - - ## - ## Disable `geo_logcursor` service so Rails doesn't get configured here - ## - geo_logcursor['enable'] = false - ``` - -After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. - -If using an external PostgreSQL instance, refer also to -[Geo with external PostgreSQL instances](external_database.md). - -### Step 3: Configure the tracking database on the **secondary** node - -NOTE: **Note:** This documentation assumes the tracking database will be run on -only a single machine, rather than as a PostgreSQL cluster. - -Configure the tracking database. - -1. 
Generate an MD5 hash of the desired password for the database user that the - GitLab application will use to access the tracking database: - - Note that the username (`gitlab_geo` by default) is incorporated into the - hash. - - ```shell - gitlab-ctl pg-password-md5 gitlab_geo - # Enter password: <your_password_here> - # Confirm password: <your_password_here> - # fca0b89a972d69f00eb3ec98a5838484 - ``` - - Use this hash to fill in `<tracking_database_password_md5_hash>` in the next - step. - -1. Edit `/etc/gitlab/gitlab.rb` in the tracking database machine, and add the - following: - - ```ruby - ## - ## Enable the Geo secondary tracking database - ## - geo_postgresql['enable'] = true - geo_postgresql['listen_address'] = '<ip_address_of_this_host>' - geo_postgresql['sql_user_password'] = '<tracking_database_password_md5_hash>' - - ## - ## Configure FDW connection to the replica database - ## - geo_secondary['db_fdw'] = true - geo_postgresql['fdw_external_password'] = '<replica_database_password_plaintext>' - geo_postgresql['md5_auth_cidr_addresses'] = ['<replica_database_ip>/32'] - gitlab_rails['db_host'] = '<replica_database_ip>' - - # Prevent reconfigure from attempting to run migrations on the replica DB - gitlab_rails['auto_migrate'] = false - - ## - ## Disable all other services that aren't needed, since we don't have a role - ## that does this. - ## - alertmanager['enable'] = false - consul['enable'] = false - gitaly['enable'] = false - gitlab_exporter['enable'] = false - gitlab_workhorse['enable'] = false - nginx['enable'] = false - node_exporter['enable'] = false - pgbouncer_exporter['enable'] = false - postgresql['enable'] = false - prometheus['enable'] = false - redis['enable'] = false - redis_exporter['enable'] = false - repmgr['enable'] = false - sidekiq['enable'] = false - unicorn['enable'] = false - ``` - -After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. 
- -If using an external PostgreSQL instance, refer also to -[Geo with external PostgreSQL instances](external_database.md). - -### Step 4: Configure the frontend application servers on the **secondary** node - -In the architecture overview, there are two machines running the GitLab -application services. These services are enabled selectively in the -configuration. - -Configure the application servers following -[Configuring GitLab for HA](../../high_availability/gitlab.md), then make the -following modifications: - -1. Edit `/etc/gitlab/gitlab.rb` on each application server in the **secondary** - cluster, and add the following: - - ```ruby - ## - ## Enable the Geo secondary role - ## - roles ['geo_secondary_role', 'application_role'] - - ## - ## The unique identifier for the Geo node. - ## - gitlab_rails['geo_node_name'] = '<node_name_here>' - - ## - ## Disable automatic migrations - ## - gitlab_rails['auto_migrate'] = false - - ## - ## Configure the connection to the tracking DB. And disable application - ## servers from running tracking databases. 
- ## - geo_secondary['db_host'] = '<geo_tracking_db_host>' - geo_secondary['db_password'] = '<geo_tracking_db_password>' - geo_postgresql['enable'] = false - - ## - ## Configure connection to the streaming replica database, if you haven't - ## already - ## - gitlab_rails['db_host'] = '<replica_database_host>' - gitlab_rails['db_password'] = '<replica_database_password>' - - ## - ## Configure connection to Redis, if you haven't already - ## - gitlab_rails['redis_host'] = '<redis_host>' - gitlab_rails['redis_password'] = '<redis_password>' - - ## - ## If you are using custom users not managed by Omnibus, you need to specify - ## UIDs and GIDs like below, and ensure they match between servers in a - ## cluster to avoid permissions issues - ## - user['uid'] = 9000 - user['gid'] = 9000 - web_server['uid'] = 9001 - web_server['gid'] = 9001 - registry['uid'] = 9002 - registry['gid'] = 9002 - ``` - -NOTE: **Note:** -If you had set up PostgreSQL cluster using the omnibus package and you had set -up `postgresql['sql_user_password'] = 'md5 digest of secret'` setting, keep in -mind that `gitlab_rails['db_password']` and `geo_secondary['db_password']` -mentioned above contains the plaintext passwords. This is used to let the Rails -servers connect to the databases. - -NOTE: **Note:** -Make sure that current node IP is listed in `postgresql['md5_auth_cidr_addresses']` setting of your remote database. - -After making these changes [Reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. - -On the secondary the following GitLab frontend services will be enabled: - -- `geo-logcursor` -- `gitlab-pages` -- `gitlab-workhorse` -- `logrotate` -- `nginx` -- `registry` -- `remote-syslog` -- `sidekiq` -- `unicorn` - -Verify these services by running `sudo gitlab-ctl status` on the frontend -application servers. 
- -### Step 5: Set up the LoadBalancer for the **secondary** node - -In this topology, a load balancer is required at each geographic location to -route traffic to the application servers. - -See [Load Balancer for GitLab HA](../../high_availability/load_balancer.md) for -more information. - -### Step 6: Configure the backend application servers on the **secondary** node - -The minimal reference architecture diagram above shows all application services -running together on the same machines. However, for high availability we -[strongly recommend running all services separately](../../availability/index.md). - -For example, a Sidekiq server could be configured similarly to the frontend -application servers above, with some changes to run only the `sidekiq` service: - -1. Edit `/etc/gitlab/gitlab.rb` on each Sidekiq server in the **secondary** - cluster, and add the following: - - ```ruby - ## - ## Enable the Geo secondary role - ## - roles ['geo_secondary_role'] - - ## - ## Enable the Sidekiq service - ## - sidekiq['enable'] = true - - ## - ## Ensure unnecessary services are disabled - ## - alertmanager['enable'] = false - consul['enable'] = false - geo_logcursor['enable'] = false - gitaly['enable'] = false - gitlab_exporter['enable'] = false - gitlab_workhorse['enable'] = false - nginx['enable'] = false - node_exporter['enable'] = false - pgbouncer_exporter['enable'] = false - postgresql['enable'] = false - prometheus['enable'] = false - redis['enable'] = false - redis_exporter['enable'] = false - repmgr['enable'] = false - unicorn['enable'] = false - - ## - ## The unique identifier for the Geo node. - ## - gitlab_rails['geo_node_name'] = '<node_name_here>' - - ## - ## Disable automatic migrations - ## - gitlab_rails['auto_migrate'] = false - - ## - ## Configure the connection to the tracking DB. And disable application - ## servers from running tracking databases. 
- ## - geo_secondary['db_host'] = '<geo_tracking_db_host>' - geo_secondary['db_password'] = '<geo_tracking_db_password>' - geo_postgresql['enable'] = false - - ## - ## Configure connection to the streaming replica database, if you haven't - ## already - ## - gitlab_rails['db_host'] = '<replica_database_host>' - gitlab_rails['db_password'] = '<replica_database_password>' - - ## - ## Configure connection to Redis, if you haven't already - ## - gitlab_rails['redis_host'] = '<redis_host>' - gitlab_rails['redis_password'] = '<redis_password>' - - ## - ## If you are using custom users not managed by Omnibus, you need to specify - ## UIDs and GIDs like below, and ensure they match between servers in a - ## cluster to avoid permissions issues - ## - user['uid'] = 9000 - user['gid'] = 9000 - web_server['uid'] = 9001 - web_server['gid'] = 9001 - registry['uid'] = 9002 - registry['gid'] = 9002 - ``` - - You can similarly configure a server to run only the `geo-logcursor` service - with `geo_logcursor['enable'] = true` and disabling Sidekiq with - `sidekiq['enable'] = false`. - - These servers do not need to be attached to the load balancer. +This document was moved to [another location](multiple_servers.md). diff --git a/doc/administration/geo/replication/index.md b/doc/administration/geo/replication/index.md index 7c661abef9a..87bd7b69515 100644 --- a/doc/administration/geo/replication/index.md +++ b/doc/administration/geo/replication/index.md @@ -2,7 +2,7 @@ > - Introduced in GitLab Enterprise Edition 8.9. > - Using Geo in combination with -> [High Availability](../../availability/index.md) +> [multi-server architectures](../../reference_architectures/index.md) > is considered **Generally Available** (GA) in > [GitLab Premium](https://about.gitlab.com/pricing/) 10.4. 
@@ -110,7 +110,7 @@ The following are required to run Geo:
 The following operating systems are known to ship with a current version of OpenSSH:
 - [CentOS](https://www.centos.org) 7.4+
 - [Ubuntu](https://ubuntu.com) 16.04+
-- PostgreSQL 9.6+ with [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support and [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
+- PostgreSQL 11+ with [FDW](https://www.postgresql.org/docs/11/postgres-fdw.html) support and [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
 - Git 2.9+
 - All nodes must run the same GitLab version.

@@ -134,7 +134,7 @@ The following table lists basic ports that must be open between the **primary**
 See the full list of ports used by GitLab in [Package defaults](https://docs.gitlab.com/omnibus/package-information/defaults.html)

 NOTE: **Note:**
-[Web terminal](../../../ci/environments.md#web-terminals) support requires your load balancer to correctly handle WebSocket connections.
+[Web terminal](../../../ci/environments/index.md#web-terminals) support requires your load balancer to correctly handle WebSocket connections.
 When using HTTP or HTTPS proxying, your load balancer must be configured to pass through the `Connection` and `Upgrade` hop-by-hop headers. See the [web terminal](../../integration/terminal.md) integration guide for more details.

 NOTE: **Note:**
@@ -206,9 +206,9 @@ For information on configuring Geo, see [Geo configuration](configuration.md).

 For information on how to update your Geo nodes to the latest GitLab version, see [Updating the Geo nodes](updating_the_geo_nodes.md).

-### Configuring Geo high availability
+### Configuring Geo for multiple servers

-For information on configuring Geo for high availability, see [Geo High Availability](high_availability.md).
+For information on configuring Geo for multiple servers, see [Geo for multiple servers](multiple_servers.md).

 ### Configuring Geo with Object Storage

@@ -245,7 +245,7 @@ This list of limitations only reflects the latest version of GitLab. If you are

 - Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
 - Cloning, pulling, or pushing repositories that exist on the **primary** node but not on the **secondary** nodes where [selective synchronization](configuration.md#selective-synchronization) does not include the project is not supported over SSH [but support is planned](https://gitlab.com/groups/gitlab-org/-/epics/2562). HTTP(S) is supported.
-- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected.
+- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected. Support for the **secondary** node to use an OAuth provider independent from the primary is [being planned](https://gitlab.com/gitlab-org/gitlab/issues/208465).
 - The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [Omnibus GitLab issue #2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
 - Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** node.
 - [Selective synchronization](configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism.
diff --git a/doc/administration/geo/replication/multiple_servers.md b/doc/administration/geo/replication/multiple_servers.md new file mode 100644 index 00000000000..9322c4cc417 --- /dev/null +++ b/doc/administration/geo/replication/multiple_servers.md @@ -0,0 +1,459 @@ +# Geo for multiple servers **(PREMIUM ONLY)** + +This document describes a minimal reference architecture for running Geo +in a multi-server configuration. If your multi-server setup differs from the one +described, it is possible to adapt these instructions to your needs. + +## Architecture overview + +![Geo multi-server diagram](../../high_availability/img/geo-ha-diagram.png) + +_[diagram source - GitLab employees only](https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit)_ + +The topology above assumes that the **primary** and **secondary** Geo clusters +are located in two separate locations, on their own virtual network +with private IP addresses. The network is configured such that all machines within +one geographic location can communicate with each other using their private IP addresses. +The IP addresses given are examples and may be different depending on the +network topology of your deployment. + +The only external way to access the two Geo deployments is by HTTPS at +`gitlab.us.example.com` and `gitlab.eu.example.com` in the example above. + +NOTE: **Note:** +The **primary** and **secondary** Geo deployments must be able to communicate to each other over HTTPS. + +## Redis and PostgreSQL for multiple servers + +Geo supports: + +- Redis and PostgreSQL on the **primary** node configured for multiple servers. +- Redis on **secondary** nodes configured for multiple servers. + +NOTE: **Note:** +Support for PostgreSQL on **secondary** nodes in multi-server configuration +[is planned](https://gitlab.com/groups/gitlab-org/-/epics/2536). 
+ +Because of the additional complexity involved in setting up this configuration +for PostgreSQL and Redis, it is not covered by this Geo multi-server documentation. + +For more information about setting up a multi-server PostgreSQL cluster and Redis cluster using the omnibus package see the multi-server documentation for +[PostgreSQL](../../high_availability/database.md) and +[Redis](../../high_availability/redis.md), respectively. + +NOTE: **Note:** +It is possible to use cloud hosted services for PostgreSQL and Redis, but this is beyond the scope of this document. + +## Prerequisites: Two working GitLab multi-server clusters + +One cluster will serve as the **primary** node. Use the +[GitLab multi-server documentation](../../reference_architectures/index.md) to set this up. If +you already have a working GitLab instance that is in-use, it can be used as a +**primary**. + +The second cluster will serve as the **secondary** node. Again, use the +[GitLab multi-server documentation](../../reference_architectures/index.md) to set this up. +It's a good idea to log in and test it, however, note that its data will be +wiped out as part of the process of replicating from the **primary**. + +## Configure the GitLab cluster to be the **primary** node + +The following steps enable a GitLab cluster to serve as the **primary** node. + +### Step 1: Configure the **primary** frontend servers + +1. Edit `/etc/gitlab/gitlab.rb` and add the following: + + ```ruby + ## + ## Enable the Geo primary role + ## + roles ['geo_primary_role'] + + ## + ## The unique identifier for the Geo node. + ## + gitlab_rails['geo_node_name'] = '<node_name_here>' + + ## + ## Disable automatic migrations + ## + gitlab_rails['auto_migrate'] = false + ``` + +After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. 
+ +NOTE: **Note:** PostgreSQL and Redis should have already been disabled on the +application servers, and connections from the application servers to those +services on the backend servers configured, during normal GitLab multi-server set up. See +multi-server configuration documentation for +[PostgreSQL](../../high_availability/database.md#configuring-the-application-nodes) +and [Redis](../../high_availability/redis.md#example-configuration-for-the-gitlab-application). + +### Step 2: Configure the **primary** database + +1. Edit `/etc/gitlab/gitlab.rb` and add the following: + + ```ruby + ## + ## Configure the Geo primary role and the PostgreSQL role + ## + roles ['geo_primary_role', 'postgres_role'] + ``` + +## Configure a **secondary** node + +A **secondary** cluster is similar to any other GitLab multi-server cluster, with two +major differences: + +- The main PostgreSQL database is a read-only replica of the **primary** node's + PostgreSQL database. +- There is also a single PostgreSQL database for the **secondary** cluster, + called the "tracking database", which tracks the synchronization state of + various resources. + +Therefore, we will set up the multi-server components one-by-one, and include deviations +from the normal multi-server setup. However, we highly recommend first configuring a +brand-new cluster as if it were not part of a Geo setup so that it can be +tested and verified as a working cluster. And only then should it be modified +for use as a Geo **secondary**. This helps to separate problems that are related +and are not related to Geo setup. + +### Step 1: Configure the Redis and Gitaly services on the **secondary** node + +Configure the following services, again using the non-Geo multi-server +documentation: + +- [Configuring Redis for GitLab](../../high_availability/redis.md) for multiple servers. +- [Gitaly](../../high_availability/gitaly.md), which will store data that is + synchronized from the **primary** node. 
+ +NOTE: **Note:** +[NFS](../../high_availability/nfs.md) can be used in place of Gitaly but is not +recommended. + +### Step 2: Configure the main read-only replica PostgreSQL database on the **secondary** node + +NOTE: **Note:** The following documentation assumes the database will be run on +a single node only. Multi-server PostgreSQL on **secondary** nodes is +[not currently supported](https://gitlab.com/groups/gitlab-org/-/epics/2536). + +Configure the [**secondary** database](database.md) as a read-only replica of +the **primary** database. Use the following as a guide. + +1. Generate an MD5 hash of the desired password for the database user that the + GitLab application will use to access the read-replica database: + + Note that the username (`gitlab` by default) is incorporated into the hash. + + ```shell + gitlab-ctl pg-password-md5 gitlab + # Enter password: <your_password_here> + # Confirm password: <your_password_here> + # fca0b89a972d69f00eb3ec98a5838484 + ``` + + Use this hash to fill in `<md5_hash_of_your_password>` in the next step. + +1. 
Edit `/etc/gitlab/gitlab.rb` in the replica database machine, and add the + following: + + ```ruby + ## + ## Configure the Geo secondary role and the PostgreSQL role + ## + roles ['geo_secondary_role', 'postgres_role'] + + ## + ## Secondary address + ## - replace '<secondary_node_ip>' with the public or VPC address of your Geo secondary node + ## - replace '<tracking_database_ip>' with the public or VPC address of your Geo tracking database node + ## + postgresql['listen_address'] = '<secondary_node_ip>' + postgresql['md5_auth_cidr_addresses'] = ['<secondary_node_ip>/32', '<tracking_database_ip>/32'] + + ## + ## Database credentials password (defined previously in primary node) + ## - replicate same values here as defined in primary node + ## + postgresql['sql_user_password'] = '<md5_hash_of_your_password>' + gitlab_rails['db_password'] = '<your_password_here>' + + ## + ## When running the Geo tracking database on a separate machine, disable it + ## here and allow connections from the tracking database host. And ensure + ## the tracking database IP is in postgresql['md5_auth_cidr_addresses'] above. + ## + geo_postgresql['enable'] = false + + ## + ## Disable `geo_logcursor` service so Rails doesn't get configured here + ## + geo_logcursor['enable'] = false + ``` + +After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. + +If using an external PostgreSQL instance, refer also to +[Geo with external PostgreSQL instances](external_database.md). + +### Step 3: Configure the tracking database on the **secondary** node + +NOTE: **Note:** This documentation assumes the tracking database will be run on +only a single machine, rather than as a PostgreSQL cluster. + +Configure the tracking database. + +1. 
Generate an MD5 hash of the desired password for the database user that the + GitLab application will use to access the tracking database: + + Note that the username (`gitlab_geo` by default) is incorporated into the + hash. + + ```shell + gitlab-ctl pg-password-md5 gitlab_geo + # Enter password: <your_password_here> + # Confirm password: <your_password_here> + # fca0b89a972d69f00eb3ec98a5838484 + ``` + + Use this hash to fill in `<tracking_database_password_md5_hash>` in the next + step. + +1. Edit `/etc/gitlab/gitlab.rb` in the tracking database machine, and add the + following: + + ```ruby + ## + ## Enable the Geo secondary tracking database + ## + geo_postgresql['enable'] = true + geo_postgresql['listen_address'] = '<ip_address_of_this_host>' + geo_postgresql['sql_user_password'] = '<tracking_database_password_md5_hash>' + + ## + ## Configure FDW connection to the replica database + ## + geo_secondary['db_fdw'] = true + geo_postgresql['fdw_external_password'] = '<replica_database_password_plaintext>' + geo_postgresql['md5_auth_cidr_addresses'] = ['<replica_database_ip>/32'] + gitlab_rails['db_host'] = '<replica_database_ip>' + + # Prevent reconfigure from attempting to run migrations on the replica DB + gitlab_rails['auto_migrate'] = false + + ## + ## Disable all other services that aren't needed, since we don't have a role + ## that does this. + ## + alertmanager['enable'] = false + consul['enable'] = false + gitaly['enable'] = false + gitlab_exporter['enable'] = false + gitlab_workhorse['enable'] = false + nginx['enable'] = false + node_exporter['enable'] = false + pgbouncer_exporter['enable'] = false + postgresql['enable'] = false + prometheus['enable'] = false + redis['enable'] = false + redis_exporter['enable'] = false + repmgr['enable'] = false + sidekiq['enable'] = false + puma['enable'] = false + ``` + +After making these changes, [reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. 
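
The notes in Steps 2 and 3 say that the username is incorporated into the MD5 hash. The following is a minimal sketch of why the same password yields different hashes for the `gitlab` and `gitlab_geo` users. It assumes that `gitlab-ctl pg-password-md5` follows PostgreSQL's usual MD5 scheme (the hex digest of the password concatenated with the username); `pg_md5_hash` and the example password are hypothetical names used only for illustration, not part of Omnibus GitLab.

```ruby
require 'digest'

# Hypothetical helper, for illustration only. Assumes the hash is the
# hex MD5 digest of the password followed by the username, which is how
# PostgreSQL's md5 authentication derives its stored value.
def pg_md5_hash(password, username)
  Digest::MD5.hexdigest("#{password}#{username}")
end

# The same (example) password produces a different hash per database user,
# so a hash generated for `gitlab` cannot be reused for `gitlab_geo`.
replica_hash  = pg_md5_hash('example-password', 'gitlab')
tracking_hash = pg_md5_hash('example-password', 'gitlab_geo')

puts "replica:  #{replica_hash}"
puts "tracking: #{tracking_hash}"
```

In practice, generate the values with `gitlab-ctl pg-password-md5 <username>` as shown in the steps above. The sketch is only meant to explain why each database user needs its own hash.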
+ +If using an external PostgreSQL instance, refer also to +[Geo with external PostgreSQL instances](external_database.md). + +### Step 4: Configure the frontend application servers on the **secondary** node + +In the architecture overview, there are two machines running the GitLab +application services. These services are enabled selectively in the +configuration. + +Configure the application servers following +[Configuring GitLab for multiple servers](../../high_availability/gitlab.md), then make the +following modifications: + +1. Edit `/etc/gitlab/gitlab.rb` on each application server in the **secondary** + cluster, and add the following: + + ```ruby + ## + ## Enable the Geo secondary role + ## + roles ['geo_secondary_role', 'application_role'] + + ## + ## The unique identifier for the Geo node. + ## + gitlab_rails['geo_node_name'] = '<node_name_here>' + + ## + ## Disable automatic migrations + ## + gitlab_rails['auto_migrate'] = false + + ## + ## Configure the connection to the tracking DB. And disable application + ## servers from running tracking databases. 
+ ## + geo_secondary['db_host'] = '<geo_tracking_db_host>' + geo_secondary['db_password'] = '<geo_tracking_db_password>' + geo_postgresql['enable'] = false + + ## + ## Configure connection to the streaming replica database, if you haven't + ## already + ## + gitlab_rails['db_host'] = '<replica_database_host>' + gitlab_rails['db_password'] = '<replica_database_password>' + + ## + ## Configure connection to Redis, if you haven't already + ## + gitlab_rails['redis_host'] = '<redis_host>' + gitlab_rails['redis_password'] = '<redis_password>' + + ## + ## If you are using custom users not managed by Omnibus, you need to specify + ## UIDs and GIDs like below, and ensure they match between servers in a + ## cluster to avoid permissions issues + ## + user['uid'] = 9000 + user['gid'] = 9000 + web_server['uid'] = 9001 + web_server['gid'] = 9001 + registry['uid'] = 9002 + registry['gid'] = 9002 + ``` + +NOTE: **Note:** +If you had set up PostgreSQL cluster using the omnibus package and you had set +up `postgresql['sql_user_password'] = 'md5 digest of secret'` setting, keep in +mind that `gitlab_rails['db_password']` and `geo_secondary['db_password']` +mentioned above contains the plaintext passwords. This is used to let the Rails +servers connect to the databases. + +NOTE: **Note:** +Make sure that current node IP is listed in `postgresql['md5_auth_cidr_addresses']` setting of your remote database. + +After making these changes [Reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) so the changes take effect. + +On the secondary the following GitLab frontend services will be enabled: + +- `geo-logcursor` +- `gitlab-pages` +- `gitlab-workhorse` +- `logrotate` +- `nginx` +- `registry` +- `remote-syslog` +- `sidekiq` +- `puma` + +Verify these services by running `sudo gitlab-ctl status` on the frontend +application servers. 
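
As a rough sketch of the verification step above, the following Ruby snippet compares the expected frontend services against status text. It assumes the output of `sudo gitlab-ctl status` uses runit's usual `run: <service>: (pid ...)` line format. `missing_services`, `EXPECTED_SERVICES`, and the sample text are hypothetical and for illustration only.

```ruby
# Services the documentation lists as enabled on the secondary frontend servers.
EXPECTED_SERVICES = %w[
  geo-logcursor gitlab-pages gitlab-workhorse logrotate nginx
  registry remote-syslog sidekiq puma
].freeze

# Hypothetical helper: returns the expected services that are not reported
# as running. Assumes each status line starts with "run: <name>:" or
# "down: <name>:" (runit-style output).
def missing_services(status_output, expected = EXPECTED_SERVICES)
  running = status_output.each_line.map do |line|
    match = line.match(/\Arun: ([\w-]+):/)
    match && match[1]
  end.compact
  expected - running
end

# Illustrative sample text, not captured from a real server.
sample = <<~STATUS
  run: nginx: (pid 1001) 42s; run: log: (pid 1000) 42s
  down: sidekiq: 3s, normally up; run: log: (pid 1002) 42s
STATUS

puts missing_services(sample, %w[nginx sidekiq]).inspect
```

On a real server the status text would come from running `sudo gitlab-ctl status`. Adjust the expected list if some of these services are not enabled in your deployment.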
+ +### Step 5: Set up the LoadBalancer for the **secondary** node + +In this topology, a load balancer is required at each geographic location to +route traffic to the application servers. + +See [Load Balancer for GitLab with multiple servers](../../high_availability/load_balancer.md) for +more information. + +### Step 6: Configure the backend application servers on the **secondary** node + +The minimal reference architecture diagram above shows all application services +running together on the same machines. However, for multiple servers we +[strongly recommend running all services separately](../../reference_architectures/index.md). + +For example, a Sidekiq server could be configured similarly to the frontend +application servers above, with some changes to run only the `sidekiq` service: + +1. Edit `/etc/gitlab/gitlab.rb` on each Sidekiq server in the **secondary** + cluster, and add the following: + + ```ruby + ## + ## Enable the Geo secondary role + ## + roles ['geo_secondary_role'] + + ## + ## Enable the Sidekiq service + ## + sidekiq['enable'] = true + + ## + ## Ensure unnecessary services are disabled + ## + alertmanager['enable'] = false + consul['enable'] = false + geo_logcursor['enable'] = false + gitaly['enable'] = false + gitlab_exporter['enable'] = false + gitlab_workhorse['enable'] = false + nginx['enable'] = false + node_exporter['enable'] = false + pgbouncer_exporter['enable'] = false + postgresql['enable'] = false + prometheus['enable'] = false + redis['enable'] = false + redis_exporter['enable'] = false + repmgr['enable'] = false + puma['enable'] = false + + ## + ## The unique identifier for the Geo node. + ## + gitlab_rails['geo_node_name'] = '<node_name_here>' + + ## + ## Disable automatic migrations + ## + gitlab_rails['auto_migrate'] = false + + ## + ## Configure the connection to the tracking DB. And disable application + ## servers from running tracking databases. 
+ ## + geo_secondary['db_host'] = '<geo_tracking_db_host>' + geo_secondary['db_password'] = '<geo_tracking_db_password>' + geo_postgresql['enable'] = false + + ## + ## Configure connection to the streaming replica database, if you haven't + ## already + ## + gitlab_rails['db_host'] = '<replica_database_host>' + gitlab_rails['db_password'] = '<replica_database_password>' + + ## + ## Configure connection to Redis, if you haven't already + ## + gitlab_rails['redis_host'] = '<redis_host>' + gitlab_rails['redis_password'] = '<redis_password>' + + ## + ## If you are using custom users not managed by Omnibus, you need to specify + ## UIDs and GIDs like below, and ensure they match between servers in a + ## cluster to avoid permissions issues + ## + user['uid'] = 9000 + user['gid'] = 9000 + web_server['uid'] = 9001 + web_server['gid'] = 9001 + registry['uid'] = 9002 + registry['gid'] = 9002 + ``` + + You can similarly configure a server to run only the `geo-logcursor` service + with `geo_logcursor['enable'] = true` and disabling Sidekiq with + `sidekiq['enable'] = false`. + + These servers do not need to be attached to the load balancer. diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md index 18fe1ad22cd..0ac8157220a 100644 --- a/doc/administration/geo/replication/security_review.md +++ b/doc/administration/geo/replication/security_review.md @@ -73,7 +73,7 @@ from [owasp.org](https://owasp.org/). - Nothing Geo-specific. Any user where `admin: true` is set in the database is considered an admin with super-user privileges. - See also: [more granular access control](https://gitlab.com/gitlab-org/gitlab-foss/issues/32730) - (not geo-specific) + (not Geo-specific). - Much of Geo’s integration (database replication, for instance) must be configured with the application, typically by system administrators. @@ -177,7 +177,7 @@ from [owasp.org](https://owasp.org/). 

 ### What databases and application servers support the application?

-- PostgreSQL >= 9.6, Redis, Sidekiq, Unicorn.
+- PostgreSQL >= 11, Redis, Sidekiq, Puma.

 ### How will database connection strings, encryption keys, and other sensitive components be stored, accessed, and protected from unauthorized detection?

diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index fae9705e935..293414a6e5e 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -497,6 +497,12 @@ to start again from scratch, there are a few steps that can help you:
 gitlab-ctl start
 ```

+1. Refresh Foreign Data Wrapper tables
+
+ ```shell
+ gitlab-rake geo:db:refresh_foreign_tables
+ ```
+
 ## Fixing errors during a failover or when promoting a secondary to a primary node

 The following are possible errors that might be encountered during failover or
@@ -538,6 +544,27 @@ or `gitlab-ctl promote-to-primary-node`, either:
 bug](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/22021) was fixed.

+### Message: ``NoMethodError: undefined method `secondary?' for nil:NilClass``
+
+When [promoting a **secondary** node](../disaster_recovery/index.md#step-3-promoting-a-secondary-node),
+you might encounter the following error:
+
+```plaintext
+sudo gitlab-rake geo:set_secondary_as_primary
+
+rake aborted!
+NoMethodError: undefined method `secondary?' for nil:NilClass
+/opt/gitlab/embedded/service/gitlab-rails/ee/lib/tasks/geo.rake:232:in `block (3 levels) in <top (required)>'
+/opt/gitlab/embedded/service/gitlab-rails/ee/lib/tasks/geo.rake:221:in `block (2 levels) in <top (required)>'
+/opt/gitlab/embedded/bin/bundle:23:in `load'
+/opt/gitlab/embedded/bin/bundle:23:in `<main>'
+Tasks: TOP => geo:set_secondary_as_primary
+(See full trace by running task with --trace)
+```
+
+This command is intended to be executed on a secondary node only, and this error
+is displayed if you attempt to run this command on a primary node.
+
 ### Message: `sudo: gitlab-pg-ctl: command not found`

 When
@@ -624,9 +651,9 @@ To check the configuration:
 ```

 This password is normally set on the tracking database during
- [Step 3: Configure the tracking database on the secondary node](high_availability.md#step-3-configure-the-tracking-database-on-the-secondary-node),
+ [Step 3: Configure the tracking database on the secondary node](multiple_servers.md#step-3-configure-the-tracking-database-on-the-secondary-node),
 and it is set on the app nodes during
- [Step 4: Configure the frontend application servers on the secondary node](high_availability.md#step-4-configure-the-frontend-application-servers-on-the-secondary-node).
+ [Step 4: Configure the frontend application servers on the secondary node](multiple_servers.md#step-4-configure-the-frontend-application-servers-on-the-secondary-node).

 1. Check whether any tables are present with the following statement:

@@ -833,6 +860,8 @@ which Geo expects to have access to. It usually means, either:

 - An unsupported replication method was used (for example, logical replication).
 - The instructions to setup a [Geo database replication](database.md) were not followed correctly.
+- Your database connection details are incorrect, that is you have specified the wrong
+ user in your `/etc/gitlab/gitlab.rb` file.

 A common source of confusion with **secondary** nodes is that it requires two separate PostgreSQL instances:
@@ -854,7 +883,7 @@ Make sure you follow the [Geo database replication](database.md) instructions fo

 ### Geo database version (...) does not match latest migration (...)

-If you are using GitLab Omnibus installation, something might have failed during upgrade. You can:
+If you are using Omnibus GitLab installation, something might have failed during upgrade. You can:

 - Run `sudo gitlab-ctl reconfigure`.
 - Manually trigger the database migration by running: `sudo gitlab-rake geo:db:migrate` as root on the **secondary** node.
diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md
index df66b1b36ec..fa1576e19eb 100644
--- a/doc/administration/geo/replication/updating_the_geo_nodes.md
+++ b/doc/administration/geo/replication/updating_the_geo_nodes.md
@@ -11,6 +11,7 @@ Updating Geo nodes involves performing:
 Depending on which version of Geo you are updating to/from, there may be different steps.

+- [Updating to GitLab 12.9](version_specific_updates.md#updating-to-gitlab-129)
 - [Updating to GitLab 12.7](version_specific_updates.md#updating-to-gitlab-127)
 - [Updating to GitLab 12.2](version_specific_updates.md#updating-to-gitlab-122)
 - [Updating to GitLab 12.1](version_specific_updates.md#updating-to-gitlab-121)
@@ -44,7 +45,7 @@ and all **secondary** nodes:
 Now that the update process is complete, you may want to check whether everything is working correctly:

-1. Run the Geo raketask on all nodes, everything should be green:
+1. Run the Geo Rake task on all nodes, everything should be green:

 ```shell
 sudo gitlab-rake gitlab:geo:check
diff --git a/doc/administration/geo/replication/using_a_geo_server.md b/doc/administration/geo/replication/using_a_geo_server.md
index 0f55272f667..2fec2b2b59c 100644
--- a/doc/administration/geo/replication/using_a_geo_server.md
+++ b/doc/administration/geo/replication/using_a_geo_server.md
@@ -1,4 +1,4 @@
-[//]: # (Please update EE::GitLab::GeoGitAccess::GEO_SERVER_DOCS_URL if this file is moved)
+<!-- Please update EE::GitLab::GeoGitAccess::GEO_SERVER_DOCS_URL if this file is moved) -->

 # Using a Geo Server **(PREMIUM ONLY)**

diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md
index 81868d19c7f..db8bbddec3b 100644
--- a/doc/administration/geo/replication/version_specific_updates.md
+++ b/doc/administration/geo/replication/version_specific_updates.md
@@ -30,7 +30,7 @@ GitLab 12.2 includes the following minor PostgreSQL updates:

 This update will occur even if major PostgreSQL updates are disabled.

-Before [refreshing Foreign Data Wrapper during a Geo HA upgrade](https://docs.gitlab.com/omnibus/update/README.html#run-post-deployment-migrations-and-checks),
+Before [refreshing Foreign Data Wrapper during a Geo upgrade](https://docs.gitlab.com/omnibus/update/README.html#run-post-deployment-migrations-and-checks),
 restart the Geo tracking database:

 ```shell
@@ -100,8 +100,8 @@ authentication method.
 postgresql['sql_user_password'] = '<md5_hash_of_your_password>'

 # Every node that runs Unicorn or Sidekiq needs to have the database
- # password specified as below. If you have a high-availability setup, this
- # must be present in all application nodes.
+ # password specified as below.
+ # This must be present in all application nodes.
 gitlab_rails['db_password'] = '<your_password_here>'
 ```

@@ -125,8 +125,8 @@ authentication method.
 postgresql['sql_user_password'] = '<md5_hash_of_your_password>'

 # Every node that runs Unicorn or Sidekiq needs to have the database
- # password specified as below. If you have a high-availability setup, this
- # must be present in all application nodes.
+ # password specified as below.
+ # This must be present in all application nodes.
 gitlab_rails['db_password'] = '<your_password_here>'

 # Enable Foreign Data Wrapper