Diffstat (limited to 'doc/administration/geo')
22 files changed, 321 insertions, 118 deletions
diff --git a/doc/administration/geo/disaster_recovery/index.md b/doc/administration/geo/disaster_recovery/index.md index 5eb23422374..ad5284938fa 100644 --- a/doc/administration/geo/disaster_recovery/index.md +++ b/doc/administration/geo/disaster_recovery/index.md @@ -51,7 +51,7 @@ must disable the **primary** node. NOTE: **Note:** (**CentOS only**) In CentOS 6 or older, there is no easy way to prevent GitLab from being - started if the machine reboots isn't available (see [gitlab-org/omnibus-gitlab#3058]). + started if the machine reboots isn't available (see [Omnibus GitLab issue #3058](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058)). It may be safest to uninstall the GitLab package completely: ```sh @@ -317,6 +317,5 @@ section to resolve the error. Otherwise, the secret is lost and you'll need to [setup-geo]: ../replication/index.md#setup-instructions [updating-geo]: ../replication/version_specific_updates.md#updating-to-gitlab-105 [sec-tfa]: ../../../security/two_factor_authentication.md#disabling-2fa-for-everyone -[gitlab-org/omnibus-gitlab#3058]: https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058 [initiate-the-replication-process]: ../replication/database.html#step-3-initiate-the-replication-process [configure-the-primary-server]: ../replication/database.html#step-1-configure-the-primary-server diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md index 75e07bcf863..8fee172ec64 100644 --- a/doc/administration/geo/disaster_recovery/planned_failover.md +++ b/doc/administration/geo/disaster_recovery/planned_failover.md @@ -43,23 +43,14 @@ will go smoothly. ### Object storage -Some classes of non-repository data can use object storage in preference to -file storage. Geo [does not replicate data in object storage](../replication/object_storage.md), -leaving that task up to the object store itself. 
For a planned failover, this -means you can decouple the replication of this data from the failover of the -GitLab service. - -If you're already using object storage, simply verify that your **secondary** -node has access to the same data as the **primary** node - they must either they share the -same object storage configuration, or the **secondary** node should be configured to -access a [geographically-replicated][os-repl] copy provided by the object store -itself. - If you have a large GitLab installation or cannot tolerate downtime, consider [migrating to Object Storage][os-conf] **before** scheduling a planned failover. Doing so reduces both the length of the maintenance window, and the risk of data loss as a result of a poorly executed planned failover. +In GitLab 12.4, you can optionally allow GitLab to manage replication of Object Storage for +**secondary** nodes. For more information, see [Object Storage replication][os-conf]. + ### Review the configuration of each **secondary** node Database settings are automatically replicated to the **secondary** node, but the @@ -224,5 +215,4 @@ Don't forget to remove the broadcast message after failover is complete. 
[background-verification]: background_verification.md [limitations]: ../replication/index.md#current-limitations [moving-repositories]: ../../operations/moving_repositories.md -[os-conf]: ../replication/object_storage.md#configuration -[os-repl]: ../replication/object_storage.md#replication +[os-conf]: ../replication/object_storage.md diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md index ddb5f22fd05..f09d9f20dab 100644 --- a/doc/administration/geo/replication/configuration.md +++ b/doc/administration/geo/replication/configuration.md @@ -25,7 +25,7 @@ Any change that requires access to the **Admin Area** needs to be done in the GitLab stores a number of secret values in the `/etc/gitlab/gitlab-secrets.json` file which *must* be the same on all nodes. Until there is -a means of automatically replicating these between nodes (see issue [gitlab-org/gitlab-ee#3789]), +a means of automatically replicating these between nodes (see [issue #3789](https://gitlab.com/gitlab-org/gitlab/issues/3789)), they must be manually replicated to the **secondary** node. 1. SSH into the **primary** node, and execute the command below: @@ -75,7 +75,7 @@ they must be manually replicated to the **secondary** node. ### Step 2. Manually replicate the **primary** node's SSH host keys GitLab integrates with the system-installed SSH daemon, designating a user -(typically named git) through which all access requests are handled. +(typically named `git`) through which all access requests are handled. In a [Disaster Recovery] situation, GitLab system administrators will promote a **secondary** node to the **primary** node. DNS records for the @@ -165,10 +165,32 @@ keys must be manually replicated to the **secondary** node. ### Step 3. Add the **secondary** node +1. SSH into your GitLab **secondary** server and login as root: + + ```sh + sudo -i + ``` + +1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your node. 
You will need this in the next steps: + + ```ruby + # The unique identifier for the Geo node. + gitlab_rails['geo_node_name'] = '<node_name_here>' + ``` + +1. Reconfigure the **secondary** node for the change to take effect: + + ```sh + gitlab-ctl reconfigure + ``` + 1. Visit the **primary** node's **Admin Area > Geo** (`/admin/geo/nodes`) in your browser. -1. Add the **secondary** node by providing its full URL. **Do NOT** check the +1. Click the **New node** button. +1. Add the **secondary** node. Use the **exact** name you inputed for `gitlab_rails['geo_node_name']` as the Name and the full URL as the URL. **Do NOT** check the **This is a primary node** checkbox. + + ![Add secondary node](img/adding_a_secondary_node.png) 1. Optionally, choose which groups or storage shards should be replicated by the **secondary** node. Leave blank to replicate all. Read more in [selective synchronization](#selective-synchronization). @@ -299,7 +321,6 @@ See the [troubleshooting document](troubleshooting.md). [setup-geo-omnibus]: index.md#using-omnibus-gitlab [Hashed Storage]: ../../repository_storage_types.md [Disaster Recovery]: ../disaster_recovery/index.md -[gitlab-org/gitlab-ee#3789]: https://gitlab.com/gitlab-org/gitlab/issues/3789 [gitlab-com/infrastructure#2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821 [omnibus-ssl]: https://docs.gitlab.com/omnibus/settings/ssl.html [using-geo]: using_a_geo_server.md diff --git a/doc/administration/geo/replication/database.md b/doc/administration/geo/replication/database.md index 33f240ed11f..fa1b0f0e1d7 100644 --- a/doc/administration/geo/replication/database.md +++ b/doc/administration/geo/replication/database.md @@ -1,9 +1,6 @@ # Geo database replication **(PREMIUM ONLY)** NOTE: **Note:** -The following steps are for Omnibus installs only. Using Geo with source-based installs was **deprecated** in GitLab 11.5. 
- -NOTE: **Note:** If your GitLab installation uses external (not managed by Omnibus) PostgreSQL instances, the Omnibus roles will not be able to perform all necessary configuration steps. In this case, @@ -37,8 +34,8 @@ recover. See below for more details. The following guide assumes that: - You are using Omnibus and therefore you are using PostgreSQL 9.6 or later - which includes the [`pg_basebackup` tool][pgback] and improved - [Foreign Data Wrapper][FDW] support. + which includes the [`pg_basebackup` tool](https://www.postgresql.org/docs/9.6/app-pgbasebackup.html) and improved + [Foreign Data Wrapper][FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support. - You have a **primary** node already set up (the GitLab server you are replicating from), running Omnibus' PostgreSQL (or equivalent version), and you have a new **secondary** server set up with the same versions of the OS, @@ -56,6 +53,19 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o sudo -i ``` +1. Edit `/etc/gitlab/gitlab.rb` and add a **unique** name for your node: + + ```ruby + # The unique identifier for the Geo node. + gitlab_rails['geo_node_name'] = '<node_name_here>' + ``` + +1. Reconfigure the **primary** node for the change to take effect: + + ```sh + gitlab-ctl reconfigure + ``` + 1. Execute the command below to define the node as **primary** node: ```sh @@ -149,9 +159,9 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o address (corresponds to "internal address" for Google Cloud Platform) for `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`. - The `listen_address` option opens PostgreSQL up to network connections - with the interface corresponding to the given address. See [the PostgreSQL - documentation][pg-docs-runtime-conn] for more details. + The `listen_address` option opens PostgreSQL up to network connections with the interface + corresponding to the given address. 
See [the PostgreSQL documentation](https://www.postgresql.org/docs/9.6/runtime-config-connection.html) + for more details. Depending on your network configuration, the suggested addresses may not be correct. If your **primary** node and **secondary** nodes connect over a local @@ -202,9 +212,8 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o postgresql['md5_auth_cidr_addresses'] = ['<primary_node_ip>/32', '<secondary_node_ip>/32', '<another_secondary_node_ip>/32'] ``` - You may also want to edit the `wal_keep_segments` and `max_wal_senders` to - match your database replication requirements. Consult the [PostgreSQL - - Replication documentation][pg-docs-runtime-replication] + You may also want to edit the `wal_keep_segments` and `max_wal_senders` to match your + database replication requirements. Consult the [PostgreSQL - Replication documentation](https://www.postgresql.org/docs/9.6/runtime-config-replication.html) for more information. 1. Save the file and reconfigure GitLab for the database listen changes and @@ -430,7 +439,7 @@ data before running `pg_basebackup`. (e.g., you know the network path is secure, or you are using a site-to-site VPN). This is **not** safe over the public Internet! - You can read more details about each `sslmode` in the - [PostgreSQL documentation][pg-docs-ssl]; + [PostgreSQL documentation](https://www.postgresql.org/docs/9.6/libpq-ssl.html#LIBPQ-SSL-PROTECTION); the instructions above are carefully written to ensure protection against both passive eavesdroppers and active "man-in-the-middle" attackers. - Change the `--slot-name` to the name of the replication slot @@ -443,16 +452,16 @@ data before running `pg_basebackup`. The replication process is now complete. -## PGBouncer support (optional) +## PgBouncer support (optional) -[PGBouncer](http://pgbouncer.github.io/) may be used with GitLab Geo to pool -PostgreSQL connections. 
We recommend using PGBouncer if you use GitLab in a +[PgBouncer](http://pgbouncer.github.io/) may be used with GitLab Geo to pool +PostgreSQL connections. We recommend using PgBouncer if you use GitLab in a high-availability configuration with a cluster of nodes supporting a Geo **primary** node and another cluster of nodes supporting a Geo **secondary** node. For more information, see [High Availability with GitLab Omnibus](../../high_availability/database.md#high-availability-with-gitlab-omnibus-premium-only). -For a Geo **secondary** node to work properly with PGBouncer in front of the database, -it will need a separate read-only user to make [PostgreSQL FDW queries][FDW] +For a Geo **secondary** node to work properly with PgBouncer in front of the database, +it will need a separate read-only user to make [PostgreSQL FDW queries](https://www.postgresql.org/docs/9.6/postgres-fdw.html) work: 1. On the **primary** Geo database, enter the PostgreSQL on the console as an @@ -498,11 +507,6 @@ work: Read the [troubleshooting document](troubleshooting.md). 
[replication-slots-article]: https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75 -[pgback]: http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html [replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication -[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html [toc]: index.md#using-omnibus-gitlab [rake-maintenance]: ../../raketasks/maintenance.md -[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION -[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html -[pg-docs-runtime-replication]: https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md index 256195998a7..4451d3c6c08 100644 --- a/doc/administration/geo/replication/external_database.md +++ b/doc/administration/geo/replication/external_database.md @@ -132,7 +132,7 @@ when `roles ['geo_secondary_role']` is set. For high availability, refer to [Geo High Availability](../../high_availability/README.md). If you want to run this database external to Omnibus, please follow the instructions below. -The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html) +The tracking database requires an [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) connection with the **secondary** replica database for improved performance. If you have an external database ready to be used as the tracking database, @@ -173,7 +173,7 @@ the tracking database on port 5432. gitlab-rake geo:db:migrate ``` -1. Configure the [PostgreSQL FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html) +1. Configure the [PostgreSQL FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) connection and credentials: Save the script below in a file, ex. 
`/tmp/geo_fdw.sh` and modify the connection diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md index b3580a706c3..b07b518d3b1 100644 --- a/doc/administration/geo/replication/faq.md +++ b/doc/administration/geo/replication/faq.md @@ -43,9 +43,9 @@ attachments / avatars and the whole database. This means user accounts, issues, merge requests, groups, project data, etc., will be available for query. -## Can I git push to a **secondary** node? +## Can I `git push` to a **secondary** node? -Yes! Pushing directly to a **secondary** node (for both HTTP and SSH, including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. +Yes! Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/blog/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. ## How long does it take to have a commit replicated to a **secondary** node? diff --git a/doc/administration/geo/replication/high_availability.md b/doc/administration/geo/replication/high_availability.md index 9d84e10d496..faa9d051107 100644 --- a/doc/administration/geo/replication/high_availability.md +++ b/doc/administration/geo/replication/high_availability.md @@ -8,7 +8,7 @@ described, it is possible to adapt these instructions to your needs. ![Geo HA Diagram](../../high_availability/img/geo-ha-diagram.png) -_[diagram source - gitlab employees only][diagram-source]_ +_[diagram source - GitLab employees only][diagram-source]_ The topology above assumes that the **primary** and **secondary** Geo clusters are located in two separate locations, on their own virtual network @@ -57,6 +57,11 @@ The following steps enable a GitLab cluster to serve as the **primary** node. roles ['geo_primary_role'] ## + ## The unique identifier for the Geo node. 
+ ## + gitlab_rails['geo_node_name'] = '<node_name_here>' + + ## ## Disable automatic migrations ## gitlab_rails['auto_migrate'] = false @@ -71,8 +76,16 @@ high availability configuration documentation for [PostgreSQL](../../high_availability/database.md#configuring-the-application-nodes) and [Redis](../../high_availability/redis.md#example-configuration-for-the-gitlab-application). -The **primary** database will require modification later, as part of -[step 2](#step-2-configure-the-main-read-only-replica-postgresql-database-on-the-secondary-node). +### Step 2: Configure the **primary** database + +1. Edit `/etc/gitlab/gitlab.rb` and add the following: + + ```ruby + ## + ## Configure the Geo primary role and the PostgreSQL role + ## + roles ['geo_primary_role', 'postgres_role'] + ``` ## Configure a **secondary** node @@ -115,9 +128,9 @@ the **primary** database. Use the following as a guide. ```ruby ## - ## Configure the PostgreSQL role + ## Configure the Geo secondary role and the PostgreSQL role ## - roles ['postgres_role'] + roles ['geo_secondary_role', 'postgres_role'] ## ## Secondary address @@ -222,6 +235,11 @@ following modifications: roles ['geo_secondary_role', 'application_role'] ## + ## The unique identifier for the Geo node. + ## + gitlab_rails['geo_node_name'] = '<node_name_here>' + + ## ## Disable automatic migrations ## gitlab_rails['auto_migrate'] = false @@ -274,15 +292,15 @@ After making these changes [Reconfigure GitLab][gitlab-reconfigure] so the chang On the secondary the following GitLab frontend services will be enabled: -- geo-logcursor -- gitlab-pages -- gitlab-workhorse -- logrotate -- nginx -- registry -- remote-syslog -- sidekiq -- unicorn +- `geo-logcursor` +- `gitlab-pages` +- `gitlab-workhorse` +- `logrotate` +- `nginx` +- `registry` +- `remote-syslog` +- `sidekiq` +- `unicorn` Verify these services by running `sudo gitlab-ctl status` on the frontend application servers. 
diff --git a/doc/administration/geo/replication/img/adding_a_secondary_node.png b/doc/administration/geo/replication/img/adding_a_secondary_node.png Binary files differ new file mode 100644 index 00000000000..5421b578672 --- /dev/null +++ b/doc/administration/geo/replication/img/adding_a_secondary_node.png diff --git a/doc/administration/geo/replication/img/single_git_add_geolocation_rule.png b/doc/administration/geo/replication/img/single_git_add_geolocation_rule.png Binary files differ new file mode 100644 index 00000000000..4b04ba8d1f1 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_add_geolocation_rule.png diff --git a/doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png b/doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png Binary files differ new file mode 100644 index 00000000000..c19ad57c953 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_add_traffic_policy_endpoints.png diff --git a/doc/administration/geo/replication/img/single_git_clone_panel.png b/doc/administration/geo/replication/img/single_git_clone_panel.png Binary files differ new file mode 100644 index 00000000000..8aa0bd2f7d8 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_clone_panel.png diff --git a/doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png b/doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png Binary files differ new file mode 100644 index 00000000000..a554532f3b8 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_create_policy_records_with_traffic_policy.png diff --git a/doc/administration/geo/replication/img/single_git_created_policy_record.png b/doc/administration/geo/replication/img/single_git_created_policy_record.png Binary files differ new file mode 100644 index 00000000000..74c42395e15 --- /dev/null +++ 
b/doc/administration/geo/replication/img/single_git_created_policy_record.png diff --git a/doc/administration/geo/replication/img/single_git_name_policy.png b/doc/administration/geo/replication/img/single_git_name_policy.png Binary files differ new file mode 100644 index 00000000000..1a976539e94 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_name_policy.png diff --git a/doc/administration/geo/replication/img/single_git_policy_diagram.png b/doc/administration/geo/replication/img/single_git_policy_diagram.png Binary files differ new file mode 100644 index 00000000000..d62952dbbb3 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_policy_diagram.png diff --git a/doc/administration/geo/replication/img/single_git_traffic_policies.png b/doc/administration/geo/replication/img/single_git_traffic_policies.png Binary files differ new file mode 100644 index 00000000000..b3193c23d99 --- /dev/null +++ b/doc/administration/geo/replication/img/single_git_traffic_policies.png diff --git a/doc/administration/geo/replication/index.md b/doc/administration/geo/replication/index.md index f9f56b96e22..1fef2e85ce6 100644 --- a/doc/administration/geo/replication/index.md +++ b/doc/administration/geo/replication/index.md @@ -63,7 +63,7 @@ Keep in mind that: - Get user data for logins (API). - Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT). - Since GitLab Premium 10.0, the **primary** node no longer talks to **secondary** nodes to notify for changes (API). -- Pushing directly to a **secondary** node (for both HTTP and SSH, including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. +- Pushing directly to a **secondary** node (for both HTTP and SSH, including Git LFS) was [introduced](https://about.gitlab.com/blog/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. 
- There are [limitations](#current-limitations) in the current implementation. ### Architecture @@ -108,7 +108,7 @@ The following are required to run Geo: [fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md)) The following operating systems are known to ship with a current version of OpenSSH: - [CentOS](https://www.centos.org) 7.4+ - - [Ubuntu](https://www.ubuntu.com) 16.04+ + - [Ubuntu](https://ubuntu.com) 16.04+ - PostgreSQL 9.6+ with [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support and [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication) - Git 2.9+ - All nodes must run the same GitLab version. @@ -229,6 +229,10 @@ For more information on Geo security, see [Geo security review](security_review. For more information on tuning Geo, see [Tuning Geo](tuning.md). +### Set up a location-aware Git URL + +For an example of how to set up a location-aware Git remote URL with AWS Route53, see [Location-aware Git remote URL with AWS Route53](location_aware_git_url.md). + ## Remove Geo node For more information on removing a Geo node, see [Removing **secondary** Geo nodes](remove_geo_node.md). @@ -240,7 +244,7 @@ This list of limitations only reflects the latest version of GitLab. If you are - Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`. - The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected. -- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [gitlab-org/omnibus-gitlab#2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details. 
+- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [Omnibus GitLab issue #2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details. - Real-time updates of issues/merge requests (for example, via long polling) doesn't work on the **secondary** node. - [Selective synchronization](configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism. - Object pools for forked project deduplication work only on the **primary** node, and are duplicated on the **secondary** node. @@ -251,36 +255,58 @@ This list of limitations only reflects the latest version of GitLab. If you are The following table lists the GitLab features along with their replication and verification status on a **secondary** node. -You can keep track of the progress to include the missing items in: - -- [ee-893](https://gitlab.com/groups/gitlab-org/-/epics/893). -- [ee-1430](https://gitlab.com/groups/gitlab-org/-/epics/1430). - -| Feature | Replicated | Verified | -|-----------|------------|----------| -| All database content (e.g. snippets, epics, issues, merge requests, groups, and project metadata) | Yes | Yes | -| Project repository | Yes | Yes | -| Project wiki repository | Yes | Yes | -| Project designs repository | No | No | -| Uploads (e.g. 
attachments to issues, merge requests, epics, and avatars) | Yes | Yes, only on transfer, or manually (1) | -| LFS Objects | Yes | Yes, only on transfer, or manually (1) | -| CI job artifacts (other than traces) | Yes | No, only manually (1) | -| Archived traces | Yes | Yes, only on transfer, or manually (1) | -| Personal snippets | Yes | Yes | -| Version-controlled personal snippets ([unsupported](https://gitlab.com/gitlab-org/gitlab-foss/issues/13426)) | No | No | -| Project snippets | Yes | Yes | -| Version-controlled project snippets ([unsupported](https://gitlab.com/gitlab-org/gitlab-foss/issues/13426)) | No | No | -| Object pools for forked project deduplication | No | No | -| [Server-side Git Hooks](../../custom_hooks.md) | No | No | -| [Elasticsearch integration](../../../integration/elasticsearch.md) | No | No | -| [GitLab Pages](../../pages/index.md) | No | No | -| [Container Registry](../../packages/container_registry.md) | Yes | No | -| [NPM Registry](../../../user/packages/npm_registry/index.md) | No | No | -| [Maven Packages](../../../user/packages/maven_repository/index.md) | No | No | -| [External merge request diffs](../../merge_request_diffs.md) | No, if they are on-disk | No | -| Content in object storage ([track progress](https://gitlab.com/groups/gitlab-org/-/epics/1526)) | No | No | - -1. The integrity can be verified manually using [Integrity Check Rake Task](../../raketasks/check.md) on both nodes and comparing the output between them. 
+You can keep track of the progress to implement the missing items in +these epics/issues: + +- [Unreplicated Data Types](https://gitlab.com/groups/gitlab-org/-/epics/893) +- [Verify all replicated data](https://gitlab.com/groups/gitlab-org/-/epics/1430) + +| Feature | Replicated | Verified | Notes | +|-----------------------------------------------------|--------------------------|-----------------------------|--------------------------------------------| +| All database content | **Yes** | **Yes** | | +| Project repository | **Yes** | **Yes** | | +| Project wiki repository | **Yes** | **Yes** | | +| Project designs repository | [No][design-replication] | [No][design-verification] | | +| Uploads | **Yes** | [No][upload-verification] | Verified only on transfer, or manually (1) | +| LFS Objects | **Yes** | [No][lfs-verification] | Verified only on transfer, or manually (1) | +| CI job artifacts (other than traces) | **Yes** | [No][artifact-verification] | Verified only manually (1) | +| Archived traces | **Yes** | [No][artifact-verification] | Verified only on transfer, or manually (1) | +| Personal snippets | **Yes** | **Yes** | | +| Version-controlled personal snippets | No | No | [Not yet supported][unsupported-snippets] | +| Project snippets | **Yes** | **Yes** | | +| Version-controlled project snippets | No | No | [Not yet supported][unsupported-snippets] | +| Object pools for forked project deduplication | **Yes** | No | | +| [Server-side Git Hooks][custom-hooks] | No | No | | +| [Elasticsearch integration][elasticsearch] | No | No | | +| [GitLab Pages][gitlab-pages] | [No][pages-replication] | No | | +| [Container Registry][container-registry] | **Yes** | No | | +| [NPM Registry][npm-registry] | No | No | | +| [Maven Repository][maven-repository] | No | No | | +| [Conan Repository][conan-repository] | No | No | | +| [External merge request diffs][merge-request-diffs] | [No][diffs-replication] | No | | +| Content in object storage | **Yes** | No | | + 
+[design-replication]: https://gitlab.com/groups/gitlab-org/-/epics/1633 +[design-verification]: https://gitlab.com/gitlab-org/gitlab/issues/32467 +[upload-verification]: https://gitlab.com/groups/gitlab-org/-/epics/1817 +[lfs-verification]: https://gitlab.com/gitlab-org/gitlab/issues/8922 +[artifact-verification]: https://gitlab.com/gitlab-org/gitlab/issues/8923 +[diffs-replication]: https://gitlab.com/gitlab-org/gitlab/issues/33817 +[pages-replication]: https://gitlab.com/groups/gitlab-org/-/epics/589 + +[unsupported-snippets]: https://gitlab.com/gitlab-org/gitlab/issues/14228 +[custom-hooks]: ../../custom_hooks.md +[elasticsearch]: ../../../integration/elasticsearch.md +[gitlab-pages]: ../../pages/index.md +[container-registry]: ../../packages/container_registry.md +[npm-registry]: ../../../user/packages/npm_registry/index.md +[maven-repository]: ../../../user/packages/maven_repository/index.md +[conan-repository]: ../../../user/packages/conan_repository/index.md +[merge-request-diffs]: ../../merge_request_diffs.md + +1. The integrity can be verified manually using +[Integrity Check Rake Task](../../raketasks/check.md) +on both nodes and comparing the output between them. DANGER: **DANGER** Features not on this list, or with **No** in the **Replicated** column, diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md new file mode 100644 index 00000000000..6183a0ad119 --- /dev/null +++ b/doc/administration/geo/replication/location_aware_git_url.md @@ -0,0 +1,119 @@ +# Location-aware Git remote URL with AWS Route53 **(PREMIUM ONLY)** + +You can provide GitLab users with a single remote URL that automatically uses +the Geo node closest to them. This means users don't need to update their Git +configuration to take advantage of closer Geo nodes as they move. 
+ +This is possible because, Git push requests can be automatically redirected +(HTTP) or proxied (SSH) from **secondary** nodes to the **primary** node. + +Though these instructions use [AWS Route53](https://aws.amazon.com/route53/), +other services such as [Cloudflare](https://www.cloudflare.com/) could be used +as well. + +NOTE: **Note** +You can also use a load balancer to distribute web UI or API traffic to +[multiple Geo **secondary** nodes](../../../user/admin_area/geo_nodes.md#multiple-secondary-nodes-behind-a-load-balancer). +Importantly, the **primary** node cannot yet be included. See the feature request +[Support putting the **primary** behind a Geo node load balancer](https://gitlab.com/gitlab-org/gitlab/issues/10888). + +## Prerequisites + +In this example, we have already set up: + +- `primary.example.com` as a Geo **primary** node. +- `secondary.example.com` as a Geo **secondary** node. + +We will create a `git.example.com` subdomain that will automatically direct +requests: + +- From Europe to the **secondary** node. +- From all other locations to the **primary** node. + +In any case, you require: + +- A working GitLab **primary** node that is accessible at its own address. +- A working GitLab **secondary** node. +- A Route53 Hosted Zone managing your domain. + +If you have not yet setup a Geo **primary** node and **secondary** node, please consult +[the Geo setup instructions](https://docs.gitlab.com/ee/administration/geo/replication/#setup-instructions). + +## Create a traffic policy + +In a Route53 Hosted Zone, traffic policies can be used to set up a variety of +routing configurations. + +1. Navigate to the +[Route53 dashboard](https://console.aws.amazon.com/route53/home) and click +**Traffic policies**. + + ![Traffic policies](img/single_git_traffic_policies.png) + +1. Click the **Create traffic policy** button. + + ![Name policy](img/single_git_name_policy.png) + +1. Fill in the **Policy Name** field with `Single Git Host` and click **Next**. 
+ + ![Policy diagram](img/single_git_policy_diagram.png) + +1. Leave **DNS type** as `A: IP Address in IPv4 format`. +1. Click **Connect to...** and select **Geolocation rule**. + + ![Add geolocation rule](img/single_git_add_geolocation_rule.png) + +1. For the first **Location**, leave it as `Default`. +1. Click **Connect to...** and select **New endpoint**. +1. Choose **Type** `value` and fill it in with `<your **primary** IP address>`. +1. For the second **Location**, choose `Europe`. +1. Click **Connect to...** and select **New endpoint**. +1. Choose **Type** `value` and fill it in with `<your **secondary** IP address>`. + + ![Add traffic policy endpoints](img/single_git_add_traffic_policy_endpoints.png) + +1. Click **Create traffic policy**. + + ![Create policy records with traffic policy](img/single_git_create_policy_records_with_traffic_policy.png) + +1. Fill in **Policy record DNS name** with `git`. +1. Click **Create policy records**. + + ![Created policy record](img/single_git_created_policy_record.png) + +You have successfully set up a single host, e.g. `git.example.com`, which +distributes traffic to your Geo nodes by geolocation! + +## Configure Git clone URLs to use the special Git URL + +When a user clones a repository for the first time, they typically copy the Git +remote URL from the project page. By default, these SSH and HTTP URLs are based +on the external URL of the current host. For example: + +- `git@secondary.example.com:group1/project1.git` +- `https://secondary.example.com/group1/project1.git` + +![Clone panel](img/single_git_clone_panel.png) + +You can customize the: + +- SSH remote URL to use the location-aware `git.example.com`. To do so, change the SSH remote URL's + host by setting `gitlab_rails['gitlab_ssh_host']` in `gitlab.rb` of web nodes. +- HTTP remote URL as shown in + [Custom Git clone URL for HTTP(S)](../../../user/admin_area/settings/visibility_and_access_controls.md#custom-git-clone-url-for-https).
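The SSH host change described above could look like the following. This is a minimal sketch, not part of the commit: it assumes an Omnibus installation whose configuration file is `/etc/gitlab/gitlab.rb`, and it reuses the `git.example.com` name from this example.

```ruby
# /etc/gitlab/gitlab.rb on each web node (path assumes an Omnibus install).
# Show the location-aware host in SSH clone URLs instead of the node's own host.
gitlab_rails['gitlab_ssh_host'] = 'git.example.com'
```

On an Omnibus installation, the setting would normally take effect after running `sudo gitlab-ctl reconfigure` on each node where the file was edited.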
+ +## Example Git request handling behavior + +After following the configuration steps above, handling for Git requests is now location aware. +For requests: + +- Outside Europe, all requests are directed to the **primary** node. +- Within Europe, over: + - HTTP: + - `git clone http://git.example.com/foo/bar.git` is directed to the **secondary** node. + - `git push` is initially directed to the **secondary**, which automatically + redirects to `primary.example.com`. + - SSH: + - `git clone git@git.example.com:foo/bar.git` is directed to the **secondary**. + - `git push` is initially directed to the **secondary**, which automatically + proxies the request to `primary.example.com`. diff --git a/doc/administration/geo/replication/object_storage.md b/doc/administration/geo/replication/object_storage.md index 878b67a8f8e..a9087abcbd9 100644 --- a/doc/administration/geo/replication/object_storage.md +++ b/doc/administration/geo/replication/object_storage.md @@ -1,16 +1,33 @@ # Geo with Object storage **(PREMIUM ONLY)** -Geo can be used in combination with Object Storage (AWS S3, or -other compatible object storage). +Geo can be used in combination with Object Storage (AWS S3, or other compatible object storage). -## Configuration +Currently, **secondary** nodes can use either: -At this time it is required that if object storage is enabled on the -**primary** node, it must also be enabled on each **secondary** node. +- The same storage bucket as the **primary** node. +- A replicated storage bucket. -**Secondary** nodes can use the same storage bucket as the **primary** node, or -they can use a replicated storage bucket. At this time GitLab does not -take care of content replication in object storage. +To have: + +- GitLab manage replication, follow [Enabling GitLab replication](#enabling-gitlab-managed-object-storage-replication). +- Third-party services manage replication, follow [Third-party replication services](#third-party-replication-services). 
+ +## Enabling GitLab managed object storage replication + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/issues/10586) in GitLab 12.4. + +CAUTION: **Caution:** +This is a [**beta** feature](https://about.gitlab.com/handbook/product/#beta) and is not ready yet for production use at any scale. + +**Secondary** nodes can replicate files stored on the **primary** node regardless of +whether they are stored on the local filesystem or in object storage. + +To enable GitLab replication, you must: + +1. Go to **Admin Area > Geo**. +1. Press **Edit** on the **secondary** node. +1. Enable the **Allow this secondary node to replicate content on Object Storage** + checkbox. For LFS, follow the documentation to [set up LFS object storage](../../../workflow/lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage). @@ -20,12 +37,21 @@ For CI job artifacts, there is similar documentation to configure For user uploads, there is similar documentation to configure [upload object storage](../../uploads.md#using-object-storage-core-only) -You should enable and configure object storage on both **primary** and **secondary** -nodes. Migrating existing data to object storage should be performed on the -**primary** node only. **Secondary** nodes will automatically notice that the migrated -files are now in object storage. +If you want to migrate the **primary** node's files to object storage, you can +configure the **secondary** in a few ways: + +- Use the exact same object storage. +- Use a separate object store but leverage your object storage solution's built-in + replication. +- Use a separate object store and enable the **Allow this secondary node to replicate + content on Object Storage** setting. + +GitLab does not currently support the case where both: + +- The **primary** node uses local storage. +- A **secondary** node uses object storage. 
-## Replication +## Third-party replication services When using Amazon S3, you can use [CRR](https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html) to diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md index 832d02be9a5..68bf5b5d23a 100644 --- a/doc/administration/geo/replication/security_review.md +++ b/doc/administration/geo/replication/security_review.md @@ -1,9 +1,9 @@ # Geo security review (Q&A) **(PREMIUM ONLY)** -The following security review of the Geo feature set focuses on security -aspects of the feature as they apply to customers running their own GitLab -instances. The review questions are based in part on the [application security architecture](https://www.owasp.org/index.php/Application_Security_Architecture_Cheat_Sheet) -questions from [owasp.org](https://www.owasp.org). +The following security review of the Geo feature set focuses on security aspects of +the feature as they apply to customers running their own GitLab instances. The review +questions are based in part on the [OWASP Application Security Verification Standard Project](https://www.owasp.org/index.php/Category:OWASP_Application_Security_Verification_Standard_Project) +from [owasp.org](https://www.owasp.org/index.php/Main_Page). ## Business Model @@ -30,7 +30,7 @@ questions from [owasp.org](https://www.owasp.org). private projects. Geo replicates them all indiscriminately. “Selective sync” exists for files and repositories (but not database content), which would permit only less-sensitive projects to be replicated to a **secondary** node if desired. -- See also: [developing a data classification policy](https://gitlab.com/gitlab-com/security/issues/4). +- See also: [GitLab data classification policy](https://about.gitlab.com/handbook/engineering/security/data-classification-policy.html). ### What data backup and retention requirements have been defined for the application? 
@@ -49,9 +49,9 @@ questions from [owasp.org](https://www.owasp.org). ### How do the end‐users interact with the application? - **Secondary** nodes provide all the interfaces a **primary** node does - (notably a HTTP/HTTPS web application, and HTTP/HTTPS or SSH git repository + (notably a HTTP/HTTPS web application, and HTTP/HTTPS or SSH Git repository access), but is constrained to read-only activities. The principal use case is - envisioned to be cloning git repositories from the **secondary** node in favor of the + envisioned to be cloning Git repositories from the **secondary** node in favor of the **primary** node, but end-users may use the GitLab web interface to view projects, issues, merge requests, snippets, etc. @@ -229,7 +229,7 @@ questions from [owasp.org](https://www.owasp.org). - A static secret shared across all hosts in a GitLab deployment. - In transit, data should be encrypted, although the application does permit communication to proceed unencrypted. The two main transits are the **secondary** node’s - replication process for PostgreSQL, and for git repositories/files. Both should + replication process for PostgreSQL, and for Git repositories/files. Both should be protected using TLS, with the keys for that managed via Omnibus per existing configuration for end-user access to GitLab. 
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md index 263fc05dce9..4d64941411a 100644 --- a/doc/administration/geo/replication/troubleshooting.md +++ b/doc/administration/geo/replication/troubleshooting.md @@ -252,7 +252,7 @@ to start again from scratch, there are a few steps that can help you: gitlab-ctl stop geo-logcursor ``` - You can watch sidekiq logs to know when sidekiq jobs processing have finished: + You can watch Sidekiq logs to know when Sidekiq jobs processing have finished: ```sh gitlab-ctl tail sidekiq @@ -280,8 +280,8 @@ to start again from scratch, there are a few steps that can help you: Any uploaded content like file attachments, avatars or LFS objects are stored in a subfolder in one of the two paths below: - - /var/opt/gitlab/gitlab-rails/shared - - /var/opt/gitlab/gitlab-rails/uploads + - `/var/opt/gitlab/gitlab-rails/shared` + - `/var/opt/gitlab/gitlab-rails/uploads` To rename all of them: diff --git a/doc/administration/geo/replication/using_a_geo_server.md b/doc/administration/geo/replication/using_a_geo_server.md index 55b5d486676..55c7e78da92 100644 --- a/doc/administration/geo/replication/using_a_geo_server.md +++ b/doc/administration/geo/replication/using_a_geo_server.md @@ -4,7 +4,7 @@ After you set up the [database replication and configure the Geo nodes][req], use your closest GitLab node as you would a normal standalone GitLab instance. -Pushing directly to a **secondary** node (for both HTTP, SSH including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. +Pushing directly to a **secondary** node (for both HTTP, SSH including Git LFS) was [introduced](https://about.gitlab.com/blog/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. 
Example of the output you will see when pushing to a **secondary** node: |