summaryrefslogtreecommitdiff
path: root/doc/administration/geo/replication
diff options
context:
space:
mode:
Diffstat (limited to 'doc/administration/geo/replication')
-rw-r--r--doc/administration/geo/replication/configuration.md4
-rw-r--r--doc/administration/geo/replication/datatypes.md2
-rw-r--r--doc/administration/geo/replication/docker_registry.md12
-rw-r--r--doc/administration/geo/replication/faq.md8
-rw-r--r--doc/administration/geo/replication/location_aware_git_url.md2
-rw-r--r--doc/administration/geo/replication/remove_geo_node.md9
-rw-r--r--doc/administration/geo/replication/security_review.md6
-rw-r--r--doc/administration/geo/replication/troubleshooting.md78
-rw-r--r--doc/administration/geo/replication/updating_the_geo_nodes.md4
-rw-r--r--doc/administration/geo/replication/usage.md2
-rw-r--r--doc/administration/geo/replication/version_specific_updates.md40
11 files changed, 130 insertions, 37 deletions
diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md
index 926c4c565aa..e8ffa1ae91a 100644
--- a/doc/administration/geo/replication/configuration.md
+++ b/doc/administration/geo/replication/configuration.md
@@ -196,9 +196,9 @@ keys must be manually replicated to the **secondary** node.
gitlab-ctl reconfigure
```
-1. On the top bar, select **Menu >** **{admin}** **Admin**.
+1. On the top bar of the primary node, select **Menu >** **{admin}** **Admin**.
1. On the left sidebar, select **Geo > Nodes**.
-1. Select **New node**.
+1. Select **Add site**.
![Add secondary node](img/adding_a_secondary_node_v13_3.png)
1. Fill in **Name** with the `gitlab_rails['geo_node_name']` in
`/etc/gitlab/gitlab.rb`. These values must always match *exactly*, character
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md
index 6989765dbad..a56d9dc813c 100644
--- a/doc/administration/geo/replication/datatypes.md
+++ b/doc/administration/geo/replication/datatypes.md
@@ -209,6 +209,6 @@ successfully, you must replicate their data using some other means.
#### Limitation of verification for files in Object Storage
-GitLab managed Object Storage replication support [is in beta](object_storage.md#enabling-gitlab-managed-object-storage-replication).
+GitLab managed Object Storage replication support [is in beta](object_storage.md#enabling-gitlab-managed-object-storage-replication).
Locally stored files are verified but remote stored files are not.
diff --git a/doc/administration/geo/replication/docker_registry.md b/doc/administration/geo/replication/docker_registry.md
index cc0719442a1..5cc4f66017b 100644
--- a/doc/administration/geo/replication/docker_registry.md
+++ b/doc/administration/geo/replication/docker_registry.md
@@ -53,7 +53,7 @@ We need to make Docker Registry send notification events to the
registry['notifications'] = [
{
'name' => 'geo_event',
- 'url' => 'https://example.com/api/v4/container_registry_event/events',
+ 'url' => 'https://<example.com>/api/v4/container_registry_event/events',
'timeout' => '500ms',
'threshold' => 5,
'backoff' => '1s',
@@ -65,7 +65,8 @@ We need to make Docker Registry send notification events to the
```
NOTE:
- Replace `<replace_with_a_secret_token>` with a case sensitive alphanumeric string
+ Replace `<example.com>` with the `external_url` defined in your primary site's `/etc/gitlab/gitlab.rb` file, and
+ replace `<replace_with_a_secret_token>` with a case sensitive alphanumeric string
that starts with a letter. You can generate one with `< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c 32 | sed "s/^[0-9]*//"; echo`
NOTE:
@@ -109,11 +110,14 @@ For each application and Sidekiq node on the **secondary** site:
1. Copy `/var/opt/gitlab/gitlab-rails/etc/gitlab-registry.key` from the **primary** to the node.
-1. Edit `/etc/gitlab/gitlab.rb`:
+1. Edit `/etc/gitlab/gitlab.rb` and add:
```ruby
gitlab_rails['geo_registry_replication_enabled'] = true
- gitlab_rails['geo_registry_replication_primary_api_url'] = 'https://primary.example.com:5050/' # Primary registry address, it will be used by the secondary node to directly communicate to primary registry
+
+ # Primary registry's hostname and port, it will be used by
+ # the secondary node to directly communicate to primary registry
+ gitlab_rails['geo_registry_replication_primary_api_url'] = 'https://primary.example.com:5050/'
```
1. Reconfigure the node for the change to take effect:
diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md
index ef41b2ff172..28030dccb3b 100644
--- a/doc/administration/geo/replication/faq.md
+++ b/doc/administration/geo/replication/faq.md
@@ -23,7 +23,7 @@ For each project to sync:
1. Geo issues a `git fetch geo --mirror` to get the latest information from the **primary** site.
If there are no changes, the sync is fast. Otherwise, it has to pull the latest commits.
-1. The **secondary** site updates the tracking database to store the fact that it has synced projects A, B, C, etc.
+1. The **secondary** site updates the tracking database to store the fact that it has synced projects A, B, C, and so on.
1. Repeat until all projects are synced.
When someone pushes a commit to the **primary** site, it generates an event in the GitLab database that the repository has changed.
@@ -46,8 +46,8 @@ Read the documentation for [Disaster Recovery](../disaster_recovery/index.md).
## What data is replicated to a **secondary** site?
We currently replicate project repositories, LFS objects, generated
-attachments / avatars and the whole database. This means user accounts,
-issues, merge requests, groups, project data, etc., will be available for
+attachments and avatars, and the whole database. This means user accounts,
+issues, merge requests, groups, project data, and so on, will be available for
query.
## Can I `git push` to a **secondary** site?
@@ -58,7 +58,7 @@ Yes! Pushing directly to a **secondary** site (for both HTTP and SSH, including
All replication operations are asynchronous and are queued to be dispatched. Therefore, it depends on a lot of
factors including the amount of traffic, how big your commit is, the
-connectivity between your sites, your hardware, etc.
+connectivity between your sites, your hardware, and so on.
## What if the SSH server runs at a different port?
diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md
index 014ca59e571..a80c293149e 100644
--- a/doc/administration/geo/replication/location_aware_git_url.md
+++ b/doc/administration/geo/replication/location_aware_git_url.md
@@ -88,7 +88,7 @@ routing configurations.
![Created policy record](img/single_git_created_policy_record.png)
-You have successfully set up a single host, e.g. `git.example.com` which
+You have successfully set up a single host, for example, `git.example.com` which
distributes traffic to your Geo sites by geolocation!
## Configure Git clone URLs to use the special Git URL
diff --git a/doc/administration/geo/replication/remove_geo_node.md b/doc/administration/geo/replication/remove_geo_node.md
deleted file mode 100644
index b72cd3cbb95..00000000000
--- a/doc/administration/geo/replication/remove_geo_node.md
+++ /dev/null
@@ -1,9 +0,0 @@
----
-redirect_to: '../../geo/replication/remove_geo_site.md'
-remove_date: '2021-06-01'
----
-
-This document was moved to [another location](../../geo/replication/remove_geo_site.md).
-
-<!-- This redirect file can be deleted after 2021-06-01 -->
-<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md
index ae41599311b..966902a3d74 100644
--- a/doc/administration/geo/replication/security_review.md
+++ b/doc/administration/geo/replication/security_review.md
@@ -60,7 +60,7 @@ from [owasp.org](https://owasp.org/).
access), but is constrained to read-only activities. The principal use case is
envisioned to be cloning Git repositories from the **secondary** site in favor of the
**primary** site, but end-users may use the GitLab web interface to view projects,
- issues, merge requests, snippets, etc.
+ issues, merge requests, snippets, and so on.
### What security expectations do the endā€users have?
@@ -203,7 +203,7 @@ from [owasp.org](https://owasp.org/).
### What data entry paths does the application support?
- Data is entered via the web application exposed by GitLab itself. Some data is
- also entered using system administration commands on the GitLab servers (e.g.,
+ also entered using system administration commands on the GitLab servers (for example
`gitlab-ctl set-primary-node`).
- **Secondary** sites also receive inputs via PostgreSQL streaming replication from the **primary** site.
@@ -247,7 +247,7 @@ from [owasp.org](https://owasp.org/).
### What encryption requirements have been defined for data in transit - including transmission over WAN, LAN, SecureFTP, or publicly accessible protocols such as http: and https:?
- Data must have the option to be encrypted in transit, and be secure against
- both passive and active attack (e.g., MITM attacks should not be possible).
+ both passive and active attack (for example, MITM attacks should not be possible).
## Access
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index c00f523957c..d63e927627a 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -327,7 +327,7 @@ Slots where `active` is `f` are not active.
- When this slot should be active, because you have a **secondary** node configured using that slot,
log in to that **secondary** node and check the PostgreSQL logs why the replication is not running.
-- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it with in the
+- If you are no longer using the slot (for example, you no longer have Geo enabled), you can remove it with in the
PostgreSQL console session:
```sql
@@ -378,7 +378,7 @@ This happens on wrongly-formatted addresses in `postgresql['md5_auth_cidr_addres
```
To fix this, update the IP addresses in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']`
-to respect the CIDR format (i.e. `1.2.3.4/32`).
+to respect the CIDR format (that is, `1.2.3.4/32`).
### Message: `LOG: invalid IP mask "md5": Name or service not known`
@@ -390,7 +390,7 @@ This happens when you have added IP addresses without a subnet mask in `postgres
```
To fix this, add the subnet mask in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']`
-to respect the CIDR format (i.e. `1.2.3.4/32`).
+to respect the CIDR format (that is, `1.2.3.4/32`).
### Message: `Found data in the gitlabhq_production database!` when running `gitlab-ctl replicate-geo-database`
@@ -588,6 +588,75 @@ to start again from scratch, there are a few steps that can help you:
gitlab-ctl start
```
+### Design repository failures on mirrored projects and project imports
+
+On the top bar, under **Menu >** **{admin}** **Admin > Geo > Nodes**,
+if the Design repositories progress bar shows
+`Synced` and `Failed` greater than 100%, and negative `Queued`, then the instance
+is likely affected by
+[a bug in GitLab 13.2 and 13.3](https://gitlab.com/gitlab-org/gitlab/-/issues/241668).
+It was [fixed in 13.4+](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/40643).
+
+To determine the actual replication status of design repositories in
+a [Rails console](../../operations/rails_console.md):
+
+```ruby
+secondary = Gitlab::Geo.current_node
+counts = {}
+secondary.designs.select("projects.id").find_each do |p|
+ registry = Geo::DesignRegistry.find_by(project_id: p.id)
+ state = registry ? "#{registry.state}" : "registry does not exist yet"
+ # puts "Design ID##{p.id}: #{state}" # uncomment this for granular information
+ counts[state] ||= 0
+ counts[state] += 1
+end
+puts "\nCounts:", counts
+```
+
+Example output:
+
+```plaintext
+Design ID#5: started
+Design ID#6: synced
+Design ID#7: failed
+Design ID#8: pending
+Design ID#9: synced
+
+Counts:
+{"started"=>1, "synced"=>2, "failed"=>1, "pending"=>1}
+```
+
+Example output if there are actually zero design repository replication failures:
+
+```plaintext
+Design ID#5: synced
+Design ID#6: synced
+Design ID#7: synced
+
+Counts:
+{"synced"=>3}
+```
+
+#### If you are promoting a Geo secondary site running on a single server
+
+`gitlab-ctl promotion-preflight-checks` will fail due to the existence of
+`failed` rows in the `geo_design_registry` table. Use the
+[previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) to
+determine the actual replication status of Design repositories.
+
+`gitlab-ctl promote-to-primary-node` will fail since it runs preflight checks.
+If the [previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports)
+shows that all designs are synced, then you can use the
+`--skip-preflight-checks` option or the `--force` option to move forward with
+promotion.
+
+#### If you are promoting a Geo secondary site running on multiple servers
+
+`gitlab-ctl promotion-preflight-checks` will fail due to the existence of
+`failed` rows in the `geo_design_registry` table. Use the
+[previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) to
+determine the actual replication status of Design repositories.
+
## Fixing errors during a failover or when promoting a secondary to a primary node
The following are possible errors that might be encountered during failover or
@@ -726,6 +795,7 @@ sudo gitlab-ctl promotion-preflight-checks
sudo /opt/gitlab/embedded/bin/gitlab-pg-ctl promote
sudo gitlab-ctl reconfigure
sudo gitlab-rake geo:set_secondary_as_primary
+```
## Expired artifacts
@@ -794,7 +864,7 @@ PostgreSQL instances:
The most common problems that prevent the database from replicating correctly are:
-- **Secondary** nodes cannot reach the **primary** node. Check credentials, firewall rules, etc.
+- **Secondary** nodes cannot reach the **primary** node. Check credentials, firewall rules, and so on.
- SSL certificate problems. Make sure you copied `/etc/gitlab/gitlab-secrets.json` from the **primary** node.
- Database storage disk is full.
- Database replication slot is misconfigured.
diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md
index 0c68adf162d..03570048071 100644
--- a/doc/administration/geo/replication/updating_the_geo_nodes.md
+++ b/doc/administration/geo/replication/updating_the_geo_nodes.md
@@ -28,9 +28,9 @@ and all **secondary** nodes:
1. **Optional:** [Pause replication on each **secondary** node.](../index.md#pausing-and-resuming-replication)
1. Log into the **primary** node.
-1. [Update GitLab on the **primary** node using Omnibus's Geo-specific steps](https://docs.gitlab.com/omnibus/update/README.html#geo-deployment).
+1. [Update GitLab on the **primary** node using Omnibus](https://docs.gitlab.com/omnibus/update/#update-using-the-official-repositories).
1. Log into each **secondary** node.
-1. [Update GitLab on each **secondary** node using Omnibus's Geo-specific steps](https://docs.gitlab.com/omnibus/update/README.html#geo-deployment).
+1. [Update GitLab on each **secondary** node using Omnibus](https://docs.gitlab.com/omnibus/update/#update-using-the-official-repositories).
1. If you paused replication in step 1, [resume replication on each **secondary**](../index.md#pausing-and-resuming-replication)
1. [Test](#check-status-after-updating) **primary** and **secondary** nodes, and check version in each.
diff --git a/doc/administration/geo/replication/usage.md b/doc/administration/geo/replication/usage.md
index 1491aa3427e..7fe8eec467e 100644
--- a/doc/administration/geo/replication/usage.md
+++ b/doc/administration/geo/replication/usage.md
@@ -27,7 +27,7 @@ Everything up-to-date
```
NOTE:
-If you're using HTTPS instead of [SSH](../../../ssh/README.md) to push to the secondary,
+If you're using HTTPS instead of [SSH](../../../ssh/index.md) to push to the secondary,
you can't store credentials in the URL like `user:password@URL`. Instead, you can use a
[`.netrc` file](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html)
for Unix-like operating systems or `_netrc` for Windows. In that case, the credentials
diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md
index 301be931b29..e193fc630b9 100644
--- a/doc/administration/geo/replication/version_specific_updates.md
+++ b/doc/administration/geo/replication/version_specific_updates.md
@@ -11,16 +11,35 @@ Review this page for update instructions for your version. These steps
accompany the [general steps](updating_the_geo_nodes.md#general-update-steps)
for updating Geo nodes.
+## Updating to GitLab 13.12
+
+We found an issue where [secondary nodes re-download all LFS files](https://gitlab.com/gitlab-org/gitlab/-/issues/334550) upon update. This bug:
+
+- Only applies to Geo secondary sites that have replicated LFS objects.
+- Is _not_ a data loss risk.
+- Causes churn and wasted bandwidth re-downloading all LFS objects.
+- May impact performance for GitLab installations with a large number of LFS files.
+
+If you don't have many LFS objects or can stand a bit of churn, then it is safe to let the secondary sites re-download LFS objects.
+If you do have many LFS objects, or many Geo secondary sites, or limited bandwidth, or a combination of them all, then we recommend you skip GitLab 13.12.0 through 13.12.6 and update to GitLab 13.12.7 or newer.
+
+### If you have already updated to an affected version, and the re-sync is ongoing
+
+You can manually migrate the legacy sync state to the new state column by running the following command in a [Rails console](../../operations/rails_console.md). It should take under a minute:
+
+```ruby
+Geo::LfsObjectRegistry.where(state: 0, success: true).update_all(state: 2)
+```
+
## Updating to GitLab 13.11
-We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitlab-org/gitlab/-/issues/330787) on Geo secondaries and on any GitLab instance if maintenance mode is enabled. This was caused by a regression in GitLab Workhorse. This is fixed in the [GitLab 13.11.4 patch release](https://about.gitlab.com/releases/2021/05/14/gitlab-13-11-4-released/). To avoid this issue, upgrade to GitLab 13.11.4 or later.
+We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitlab-org/gitlab/-/issues/330787) on Geo secondaries and on any GitLab instance if maintenance mode is enabled. This was caused by a regression in GitLab Workhorse. This is fixed in the [GitLab 13.11.4 patch release](https://about.gitlab.com/releases/2021/05/14/gitlab-13-11-4-released/). To avoid this issue, upgrade to GitLab 13.11.4 or later.
## Updating to GitLab 13.9
We've detected an issue [with a column rename](https://gitlab.com/gitlab-org/gitlab/-/issues/324160)
-that may prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3.
-We are working on a patch, but until a fixed version is released, you can manually complete
-the zero-downtime upgrade:
+that will prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3 when following the zero-downtime steps. It is necessary
+to perform the following additional steps for the zero-downtime upgrade:
1. Before running the final `sudo gitlab-rake db:migrate` command on the deploy node,
execute the following queries using the PostgreSQL console (or `sudo gitlab-psql`)
@@ -40,9 +59,18 @@ the zero-downtime upgrade:
```
If you have already run the final `sudo gitlab-rake db:migrate` command on the deploy node and have
-encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you can still
-follow the previous steps to complete the update.
+encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you will
+see the following error:
+
+```shell
+-- remove_column(:application_settings, :asset_proxy_whitelist)
+rake aborted!
+StandardError: An error has occurred, all later migrations canceled:
+PG::DependentObjectsStillExist: ERROR: cannot drop column asset_proxy_whitelist of table application_settings because other objects depend on it
+DETAIL: trigger trigger_0d588df444c8 on table application_settings depends on column asset_proxy_whitelist of table application_settings
+```
+To work around this bug, follow the previous steps to complete the update.
More details are available [in this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160).
## Updating to GitLab 13.7