Diffstat (limited to 'doc/administration/geo/replication/troubleshooting.md')
-rw-r--r-- | doc/administration/geo/replication/troubleshooting.md | 87 |
1 files changed, 29 insertions, 58 deletions
diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md
index 6d990fd12ba..1fd923dbaf1 100644
--- a/doc/administration/geo/replication/troubleshooting.md
+++ b/doc/administration/geo/replication/troubleshooting.md
@@ -583,64 +583,6 @@ to start again from scratch, there are a few steps that can help you:
    gitlab-ctl start
    ```
 
-## Fixing errors during a PostgreSQL upgrade or downgrade
-
-### Message: `ERROR: psql: FATAL: role "gitlab-consul" does not exist`
-
-When
-[upgrading PostgreSQL on a Geo instance](https://docs.gitlab.com/omnibus/settings/database.html#upgrading-a-geo-instance), you might encounter the
-following error:
-
-```plaintext
-$ sudo gitlab-ctl pg-upgrade --target-version=11
-Checking for an omnibus managed postgresql: OK
-Checking if postgresql['version'] is set: OK
-Checking if we already upgraded: NOT OK
-Checking for a newer version of PostgreSQL to install
-Upgrading PostgreSQL to 11.7
-Checking if PostgreSQL bin files are symlinked to the expected location: OK
-Waiting 30 seconds to ensure tasks complete before PostgreSQL upgrade.
-See https://docs.gitlab.com/omnibus/settings/database.html#upgrade-packaged-postgresql-server for details
-If you do not want to upgrade the PostgreSQL server at this time, enter Ctrl-C and see the documentation for details
-
-Please hit Ctrl-C now if you want to cancel the operation.
-..............................Detected an HA cluster.
-Error running command: /opt/gitlab/embedded/bin/psql -qt -d gitlab_repmgr -h /var/opt/gitlab/postgresql -p 5432 -c "SELECT name FROM repmgr_gitlab_cluster.repl_nodes WHERE type='master' AND active != 'f'" -U gitlab-consul
-ERROR: psql: FATAL: role "gitlab-consul" does not exist
-Traceback (most recent call last):
- 10: from /opt/gitlab/embedded/bin/omnibus-ctl:23:in `<main>'
- 9: from /opt/gitlab/embedded/bin/omnibus-ctl:23:in `load'
- 8: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/bin/omnibus-ctl:31:in `<top (required)>'
- 7: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:746:in `run'
- 6: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/omnibus-ctl-0.6.0/lib/omnibus-ctl.rb:204:in `block in add_command_under_category'
- 5: from /opt/gitlab/embedded/service/omnibus-ctl/pg-upgrade.rb:171:in `block in load_file'
- 4: from /opt/gitlab/embedded/service/omnibus-ctl-ee/lib/repmgr.rb:248:in `is_master?'
- 3: from /opt/gitlab/embedded/service/omnibus-ctl-ee/lib/repmgr.rb:100:in `execute_psql'
- 2: from /opt/gitlab/embedded/service/omnibus-ctl-ee/lib/repmgr.rb:113:in `cmd'
- 1: from /opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/mixlib-shellout-3.0.9/lib/mixlib/shellout.rb:287:in `error!'
-/opt/gitlab/embedded/lib/ruby/gems/2.6.0/gems/mixlib-shellout-3.0.9/lib/mixlib/shellout.rb:300:in `invalid!': Expected process to exit with [0], but received '2' (Mixlib::ShellOut::ShellCommandFailed)
----- Begin output of /opt/gitlab/embedded/bin/psql -qt -d gitlab_repmgr -h /var/opt/gitlab/postgresql -p 5432 -c "SELECT name FROM repmgr_gitlab_cluster.repl_nodes WHERE type='master' AND active != 'f'" -U gitlab-consul ----
-STDOUT:
-STDERR: psql: FATAL: role "gitlab-consul" does not exist
----- End output of /opt/gitlab/embedded/bin/psql -qt -d gitlab_repmgr -h /var/opt/gitlab/postgresql -p 5432 -c "SELECT name FROM repmgr_gitlab_cluster.repl_nodes WHERE type='master' AND active != 'f'" -U gitlab-consul ----
-Ran /opt/gitlab/embedded/bin/psql -qt -d gitlab_repmgr -h /var/opt/gitlab/postgresql -p 5432 -c "SELECT name FROM repmgr_gitlab_cluster.repl_nodes WHERE type='master' AND active != 'f'" -U gitlab-consul returned 2
-```
-
-If you are upgrading the PostgreSQL read-replica of a Geo secondary node, and
-you are not using `consul` or `repmgr`, you may need to disable `consul` and/or
-`repmgr` services in `gitlab.rb`:
-
-```ruby
-consul['enable'] = false
-repmgr['enable'] = false
-```
-
-Then reconfigure GitLab:
-
-```shell
-sudo gitlab-ctl reconfigure
-```
-
 ## Fixing errors during a failover or when promoting a secondary to a primary node
 
 The following are possible errors that might be encountered during failover or
@@ -756,6 +698,30 @@ this command reports `ERROR - Replication is not up-to-date` even if replication is actually up-to-date. If replication and verification output shows that it is complete, you can add `--skip-preflight-checks` to make the command complete promotion. This bug was fixed in GitLab 13.8 and later.
+### Errors when using `--skip-preflight-checks` or `--force`
+
+Before GitLab 13.5, you could bump into one of the following errors when using
+`--skip-preflight-checks` or `--force`:
+
+```plaintext
+get_ctl_options': invalid option: --skip-preflight-checks (OptionParser::InvalidOption)
+
+get_ctl_options': invalid option: --force (OptionParser::InvalidOption)
+```
+
+This can happen with XFS or filesystems that list files in lexical order, because the
+load order of the Omnibus command files can be different than expected, and a global function would get redefined.
+More details can be found in [the related issue](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6076).
+
+The workaround is to manually run the preflight checks and promote the database, by running
+the following commands on the Geo secondary site:
+
+```shell
+sudo gitlab-ctl promotion-preflight-checks
+sudo /opt/gitlab/embedded/bin/gitlab-pg-ctl promote
+sudo gitlab-ctl reconfigure
+sudo gitlab-rake geo:set_secondary_as_primary
+
 ## Expired artifacts
 
 If you notice for some reason there are more artifacts on the Geo
@@ -854,6 +820,11 @@ To resolve this issue:
 the **primary** node using IPv4 in the `/etc/hosts` file. Alternatively, you should [enable IPv6 on the **primary** node](https://docs.gitlab.com/omnibus/settings/nginx.html#setting-the-nginx-listen-address-or-addresses).
 
+### Geo Admin Area shows 'Unknown' for health status and 'Request failed with status code 401'
+
+If using a load balancer, ensure that the load balancer's URL is set as the `external_url` in the
+`/etc/gitlab/gitlab.rb` of the nodes behind the load balancer.
+
 ### GitLab Pages return 404 errors after promoting
 
 This is due to [Pages data not being managed by Geo](datatypes.md#limitations-on-replicationverification).
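
Note on the workaround added in the second hunk: the four commands are meant to run in sequence on the Geo secondary site, and it is probably unwise to continue if an earlier one fails. The following is a minimal sketch of one way to wrap them, not part of the commit. The `run_step` and `promote_secondary` function names and the `DRY_RUN` variable are hypothetical helpers introduced here for illustration; only the four `sudo` commands come from the diff. The sketch defaults to a dry run that prints each command instead of executing it.

```shell
#!/bin/sh
# Sketch only: DRY_RUN, run_step and promote_secondary are illustrative
# names, not GitLab options or commands.
# DRY_RUN=1 (the default here) prints each command instead of running it.
DRY_RUN="${DRY_RUN:-1}"

run_step() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "would run: $*"
  else
    "$@"
  fi
}

promote_secondary() {
  # The && chain stops at the first command that exits non-zero.
  run_step sudo gitlab-ctl promotion-preflight-checks &&
  run_step sudo /opt/gitlab/embedded/bin/gitlab-pg-ctl promote &&
  run_step sudo gitlab-ctl reconfigure &&
  run_step sudo gitlab-rake geo:set_secondary_as_primary
}

promote_secondary
```

Running the script with `DRY_RUN=0` would execute the commands for real; that should only be done on the secondary site being promoted, after reviewing the preflight check output.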