author | Achilleas Pipinellis <axil@gitlab.com> | 2019-03-12 16:22:37 +0100 |
---|---|---|
committer | Achilleas Pipinellis <axil@gitlab.com> | 2019-03-12 16:22:37 +0100 |
commit | c19ed72155378c7e684d771964ad027f57c4ca34 (patch) | |
tree | aee24cf4cd24ca3962a2adf68c7b89cfb37ef242 /doc/administration | |
parent | a817f7905c084f725b6fa01955be4fd8ad28c747 (diff) | |
Merge EE docs into CE (docs/ee-to-ce)
Diffstat (limited to 'doc/administration')
81 files changed, 8052 insertions, 152 deletions
diff --git a/doc/administration/audit_events.md b/doc/administration/audit_events.md new file mode 100644 index 00000000000..7a2fedd141b --- /dev/null +++ b/doc/administration/audit_events.md @@ -0,0 +1,116 @@ +--- +last_updated: 2019-02-04 +--- + +# Audit Events + +GitLab offers a way to view the changes made within the GitLab server for owners and administrators on a [paid plan][ee]. + +GitLab system administrators can also take advantage of the logs located on the +filesystem, see [the logs system documentation](logs.md) for more details. + +## Overview + +**Audit Events** is a tool for GitLab owners and administrators to be +able to track important events such as who performed certain actions and the +time they happened. These actions could be, for example, change a user +permission level, who added a new user, or who removed a user. + +## Use-cases + +- Check who was the person who changed the permission level of a particular + user for a project in GitLab. +- Use it to track which users have access to a certain group of projects + in GitLab, and who gave them that permission level. + +## List of events + +There are two kinds of events logged: + +- Events scoped to the group or project, used by group / project managers + to look up who made what change. +- Instance events scoped to the whole GitLab instance, used by your Compliance team to + perform formal audits. + +### Group events **[STARTER]** + +NOTE: **Note:** +You need Owner [permissions] to view the group Audit Events page. + +To view a group's audit events, navigate to **Group > Settings > Audit Events**. 
+From there, you can see the following actions: + +- Group name/path changed +- Group repository size limit changed +- Group created/deleted +- Group changed visibility +- User was added to group and with which [permissions] +- Permissions changes of a user assigned to a group +- Removed user from group +- Project added to group and with which visibility level +- Project removed from group +- [Project shared with group](../user/project/members/share_project_with_groups.md) + and with which [permissions] +- Removal of a previously shared group with a project +- LFS enabled/disabled +- Shared runners minutes limit changed +- Membership lock enabled/disabled +- Request access enabled/disabled +- 2FA enforcement/grace period changed +- Roles allowed to create project changed + +### Project events **[STARTER]** + +NOTE: **Note:** +You need Maintainer [permissions] or higher to view the project Audit Events page. + +To view a project's audit events, navigate to **Project > Settings > Audit Events**. +From there, you can see the following actions: + +- Added/removed deploy keys +- Project created/deleted/renamed/moved(transferred)/changed path +- Project changed visibility level +- User was added to project and with which [permissions] +- Permission changes of a user assigned to a project +- User was removed from project + +### Instance events **[PREMIUM ONLY]** + +> [Introduced][ee-2336] in [GitLab Premium][ee] 9.3. + +Server-wide audit logging introduces the ability to observe user actions across +the entire instance of your GitLab server, making it easy to understand who +changed what and when for audit purposes. + +To view the server-wide admin log, visit **Admin Area > Monitoring > Audit Log**. + +In addition to the group and project events, the following user actions are also +recorded: + +- Failed Logins +- Sign-in events and the authentication type (standard, LDAP, OmniAuth, etc.) 
+- Added SSH key +- Added/removed email +- Changed password +- Ask for password reset +- Grant OAuth access + +It is possible to filter particular actions by choosing an audit data type from +the filter drop-down. You can further filter by specific group, project or user +(for authentication events). + +![audit log](audit_log.png) + +### Missing events + +Some events are not being tracked in Audit Events. Please see the following +epics for more detail on which events are not being tracked and our progress +on adding these events into GitLab: + +- [Project settings and activity](https://gitlab.com/groups/gitlab-org/-/epics/474) +- [Group settings and activity](https://gitlab.com/groups/gitlab-org/-/epics/475) +- [Instance-level settings and activity](https://gitlab.com/groups/gitlab-org/-/epics/476) + +[ee-2336]: https://gitlab.com/gitlab-org/gitlab-ee/issues/2336 +[ee]: https://about.gitlab.com/pricing/ +[permissions]: ../user/permissions.md diff --git a/doc/administration/audit_log.png b/doc/administration/audit_log.png Binary files differnew file mode 100644 index 00000000000..d4f4c2abf38 --- /dev/null +++ b/doc/administration/audit_log.png diff --git a/doc/administration/auditor_access_form.png b/doc/administration/auditor_access_form.png Binary files differnew file mode 100644 index 00000000000..c179a7d3b0a --- /dev/null +++ b/doc/administration/auditor_access_form.png diff --git a/doc/administration/auditor_users.md b/doc/administration/auditor_users.md new file mode 100644 index 00000000000..ef8c8197d6d --- /dev/null +++ b/doc/administration/auditor_users.md @@ -0,0 +1,88 @@ +# Auditor users **[PREMIUM ONLY]** + +>[Introduced][ee-998] in [GitLab Premium][eep] 8.17. + +Auditor users are given read-only access to all projects, groups, and other +resources on the GitLab instance. 
+ +## Overview + +Auditor users can have full access to their own resources (projects, groups, +snippets, etc.), and read-only access to **all** other resources, except the +Admin area. To put it another way, they are just regular users (who can be added +to projects, create personal snippets, create milestones on their groups, etc.) +who also happen to have read-only access to all projects on the system that +they haven't been explicitly [given access][permissions] to. + +The Auditor role is _not_ a read-only version of the Admin role. Auditor users +will not be able to access the project/group settings pages, or the Admin Area. + +To sum up, assuming you have logged in as an Auditor user: + +- For a project the Auditor is not a member of, the Auditor should have + read-only access. If the project is public or internal, they would have the + same access as the users that are not members of that project/group. +- For a project the Auditor owns, the Auditor should have full access to + everything. +- For a project the Auditor has been added to as a member, the Auditor should + have the same access as the [permissions] they were given. For example, if + they were added as a Developer, they could then push commits or comment on + issues. +- The Auditor cannot view the Admin area, or perform any admin actions. + +For more information about what an Auditor can or can't do, see the +[Permissions and restrictions of an Auditor user](#permissions-and-restrictions-of-an-auditor-user) +section. + +## Use cases + +1. Your compliance department wants to run tests against the entire GitLab base + to ensure users are complying with password, credit card, and other sensitive + data policies. With Auditor users, this can be achieved very easily without + resorting to tactics like giving a user admin rights or having to use the API + to add them to all projects. +1.
If particular users need visibility or access to most of all projects in + your GitLab instance, instead of manually adding the user to all projects, + you can simply create an Auditor user and share the credentials with those + that you want to grant access to. + +## Adding an Auditor user + +1. Create a new user or edit an existing one by navigating to + **Admin Area > Users**. You will find the option of the access level under + the 'Access' section. + + ![Admin Area Form](auditor_access_form.png) + +1. Click **Save changes** or **Create user** for the changes to take effect. + +To revoke the Auditor permissions from a user, simply make them a Regular user +following the same steps as above. + +## Permissions and restrictions of an Auditor user + +An Auditor user should be able to access all projects and groups of a GitLab +instance, with the following permissions/restrictions: + +- Has read-only access to the API +- Can access projects that are: + - Private + - Public + - Internal +- Can read all files in a repository +- Can read issues / MRs +- Can read project snippets +- Cannot be Admin and Auditor at the same time +- Cannot access the Admin area +- In a group / project they're not a member of: + - Cannot access project settings + - Cannot access group settings + - Cannot commit to repository + - Cannot create / comment on issues / MRs + - Cannot create/modify files from the Web UI + - Cannot merge a merge request + - Cannot create project snippets + +[ee-998]: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/998 +[eep]: https://about.gitlab.com/pricing/ +[permissions]: ../user/permissions.md diff --git a/doc/administration/auth/README.md b/doc/administration/auth/README.md index 54be7b616cc..c5c1806376f 100644 --- a/doc/administration/auth/README.md +++ b/doc/administration/auth/README.md @@ -9,9 +9,11 @@ providers. 
- [LDAP](ldap.md) Includes Active Directory, Apple Open Directory, Open LDAP, and 389 Server + - [LDAP for GitLab EE](ldap-ee.md): LDAP additions to GitLab Enterprise Editions **[STARTER ONLY]** - [OmniAuth](../../integration/omniauth.md) Sign in via Twitter, GitHub, GitLab.com, Google, Bitbucket, Facebook, Shibboleth, Crowd, Azure, Authentiq ID, and JWT - [CAS](../../integration/cas.md) Configure GitLab to sign in using CAS - [SAML](../../integration/saml.md) Configure GitLab as a SAML 2.0 Service Provider - [Okta](okta.md) Configure GitLab to sign in using Okta - [Authentiq](authentiq.md): Enable the Authentiq OmniAuth provider for passwordless authentication +- [Smartcard](smartcard.md) Smartcard authentication diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ce/index.md b/doc/administration/auth/how_to_configure_ldap_gitlab_ce/index.md index 15276d364a0..5f4ae278acf 100644 --- a/doc/administration/auth/how_to_configure_ldap_gitlab_ce/index.md +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ce/index.md @@ -14,7 +14,7 @@ Managing a large number of users in GitLab can become a burden for system admini In this guide we will focus on configuring GitLab with Active Directory. [Active Directory](https://en.wikipedia.org/wiki/Active_Directory) is a popular LDAP compatible directory service provided by Microsoft, included in all modern Windows Server operating systems. -GitLab has supported LDAP integration since [version 2.2](https://about.gitlab.com/2012/02/22/gitlab-version-2-2/). With GitLab LDAP [group syncing](https://docs.gitlab.com/ee/administration/auth/how_to_configure_ldap_gitlab_ee/index.html#group-sync) being added to GitLab Enterprise Edition in [version 6.0](https://about.gitlab.com/2013/08/20/gitlab-6-dot-0-released/). LDAP integration has become one of the most popular features in GitLab. +GitLab has supported LDAP integration since [version 2.2](https://about.gitlab.com/2012/02/22/gitlab-version-2-2/). 
With GitLab LDAP [group syncing](../how_to_configure_ldap_gitlab_ee/index.html#group-sync) being added to GitLab Enterprise Edition in [version 6.0](https://about.gitlab.com/2013/08/20/gitlab-6-dot-0-released/). LDAP integration has become one of the most popular features in GitLab. ## Getting started @@ -111,7 +111,7 @@ The initial configuration of LDAP in GitLab requires changes to the `gitlab.rb` The two Active Directory specific values are `active_directory: true` and `uid: 'sAMAccountName'`. `sAMAccountName` is an attribute returned by Active Directory used for GitLab usernames. See the example output from `ldapsearch` for a full list of attributes a "person" object (user) has in **AD** - [`ldapsearch` example](#using-ldapsearch-unix) -> Both group_base and admin_group configuration options are only available in GitLab Enterprise Edition. See [GitLab EE - LDAP Features](https://docs.gitlab.com/ee/administration/auth/how_to_configure_ldap_gitlab_ee/index.html#gitlab-enterprise-edition---ldap-features) +> Both group_base and admin_group configuration options are only available in GitLab Enterprise Edition. See [GitLab EE - LDAP Features](../how_to_configure_ldap_gitlab_ee/index.html#gitlab-enterprise-edition---ldap-features) ### Example `gitlab.rb` LDAP @@ -267,4 +267,4 @@ have extended functionalities with LDAP, such as: - Updating user permissions - Multiple LDAP servers -Read through the article on [LDAP for GitLab EE](https://docs.gitlab.com/ee/administration/auth/how_to_configure_ldap_gitlab_ee/) for an overview. +Read through the article on [LDAP for GitLab EE](../how_to_configure_ldap_gitlab_ee/index.md) for an overview. 
diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/admin_group.png b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/admin_group.png Binary files differnew file mode 100644 index 00000000000..9896379d669 --- /dev/null +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/admin_group.png diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/group_link_final.png b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/group_link_final.png Binary files differnew file mode 100644 index 00000000000..21fb5a7d0ce --- /dev/null +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/group_link_final.png diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/group_linking.gif b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/group_linking.gif Binary files differnew file mode 100644 index 00000000000..d35cf55804f --- /dev/null +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/group_linking.gif diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/manual_permissions.gif b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/manual_permissions.gif Binary files differnew file mode 100644 index 00000000000..29b28df1cbd --- /dev/null +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/manual_permissions.gif diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/multi_login.gif b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/multi_login.gif Binary files differnew file mode 100644 index 00000000000..d317add9837 --- /dev/null +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/img/multi_login.gif diff --git a/doc/administration/auth/how_to_configure_ldap_gitlab_ee/index.md b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/index.md new file mode 100644 index 00000000000..4e0a0035790 --- /dev/null +++ b/doc/administration/auth/how_to_configure_ldap_gitlab_ee/index.md @@ 
-0,0 +1,119 @@ +--- +author: Chris Wilson +author_gitlab: MrChrisW +level: intermediary +article_type: admin guide +date: 2017-05-03 +--- + +# How to configure LDAP with GitLab EE + +## Introduction + +The present article follows [How to Configure LDAP with GitLab CE](../how_to_configure_ldap_gitlab_ce/index.md). Make sure to read through it before moving forward. + +## GitLab Enterprise Edition - LDAP features + +[GitLab Enterprise Edition (EE)](https://about.gitlab.com/pricing/) has a number of advantages when it comes to integrating with Active Directory (LDAP): + +- [Administrator Sync](../ldap-ee.md#administrator-sync): As an extension of group sync, you can automatically manage your global GitLab administrators. Specify a group CN for `admin_group` and all members of the LDAP group will be given administrator privileges. +- [Group Sync](#group-sync): This allows GitLab group membership to be automatically updated based on LDAP group members. +- [Multiple LDAP servers](#multiple-ldap-servers): The ability to configure multiple LDAP servers. This is useful if an organization has different LDAP servers within departments. This is not designed for failover. We're working on [supporting LDAP failover](https://gitlab.com/gitlab-org/gitlab-ee/issues/139) in GitLab. + +- Daily user synchronization: Once a day, GitLab will run a synchronization to check and update GitLab users against LDAP. This process updates all user details automatically. + +On the following section, you'll find a description for each of these features. Read through [LDAP GitLab EE docs](../ldap-ee.md) for complementary information. + +![GitLab OU Structure](img/admin_group.png) + +All members of the group `Global Admins` will be given **administrator** access to GitLab, allowing them to view the `/admin` dashboard. + +### Group Sync + +Group syncing allows AD (LDAP) groups to be mapped to GitLab groups. This provides more control over per-group user management. 
To configure group syncing edit the `group_base` **DN** (`'OU=Global Groups,OU=GitLab INT,DC=GitLab,DC=org'`). This **OU** contains all groups that will be associated with [GitLab groups](../../../user/group/index.md). + +#### Creating group links - example + +As an example, let's suppose we have a "UKGov" GitLab group, which deals with confidential government information. Therefore, it is important that users of this group are given the correct permissions to projects contained within the group. Granular group permissions can be applied based on the AD group. + +**UK Developers** of our "UKGov" group are given **"developer"** permissions. + +_The developer permission allows the development staff to effectively manage all project code, issues, and merge requests._ + +**UK Support** staff of our "UKGov" group are given **"reporter"** permissions. + +_The reporter permission allows support staff to manage issues, labels, and review project code._ + +**US People Ops** of our "UKGov" group are given **"guest"** permissions. + +![Creating group links](img/group_linking.gif) + +> Guest permissions allows people ops staff to review and lodge new issues while allowing no read or write access to project code or [confidential issues](../../../user/project/issues/confidential_issues.md#permissions-and-access-to-confidential-issues) created by other users. + +See the [permission list](../../../user/permissions.md) for complementary info. + +#### Group permissions - example + +Considering the previous example, our staff will have +access to our GitLab instance with the following structure: + +![GitLab OU Structure](img/group_link_final.png) + +Using this permission structure in our example allows only UK staff access to sensitive information stored in the projects code, while still allowing other teams to work effectively. As all permissions are controlled via AD groups new users can be quickly added to existing groups. 
New group members will then automatically inherit the required permissions. + +> [More information](../ldap-ee.md#group-sync) on group syncing. + +### Updating user permissions - new feature + +Since GitLab [v8.15](https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/822) LDAP user permissions can now be manually overridden by an admin user. To override a user's permissions visit the groups **Members** page and select **Edit permissions**. + +![Setting manual permissions](img/manual_permissions.gif) + +### Multiple LDAP servers + +GitLab EE can support multiple LDAP servers. Simply configure another server in the `gitlab.rb` file within the `ldap_servers` block. In the example below we configure a new secondary server with the label **GitLab Secondary AD**. This is shown on the GitLab login screen. Large enterprises often utilize multiple LDAP servers for segregating organizational departments. + +![Multiple LDAP Servers Login](img/multi_login.gif) + +Considering the example illustrated on the image above, +our `gitlab.rb` configuration would look like: + +```ruby +gitlab_rails['ldap_enabled'] = true +gitlab_rails['ldap_servers'] = { +'main' => { + 'label' => 'GitLab AD', + 'host' => 'ad.example.org', + 'port' => 636, + 'uid' => 'sAMAccountName', + 'method' => 'ssl', + 'bind_dn' => 'CN=GitLabSRV,CN=Users,DC=GitLab,DC=org', + 'password' => 'Password1', + 'active_directory' => true, + 'base' => 'OU=GitLab INT,DC=GitLab,DC=org', + 'group_base' => 'OU=Global Groups,OU=GitLab INT,DC=GitLab,DC=org', + 'admin_group' => 'Global Admins' + }, + +'secondary' => { + 'label' => 'GitLab Secondary AD', + 'host' => 'ad-secondary.example.net', + 'port' => 636, + 'uid' => 'sAMAccountName', + 'method' => 'ssl', + 'bind_dn' => 'CN=GitLabSRV,CN=Users,DC=GitLab,DC=com', + 'password' => 'Password1', + 'active_directory' => true, + 'base' => 'OU=GitLab Secondary,DC=GitLab,DC=com', + 'group_base' => 'OU=Global Groups,OU=GitLab INT,DC=GitLab,DC=com', + 'admin_group' => 'Global Admins' + } 
+} +``` + +## Conclusion + +Integration of GitLab with Active Directory (LDAP) reduces the complexity of user management. +It has the advantage of improving user permission controls, whilst easing the deployment of GitLab into an existing [IT environment](https://www.techopedia.com/definition/29199/it-infrastructure). GitLab EE offers advanced group management and multiple LDAP servers. + +With the assistance of the [GitLab Support](https://about.gitlab.com/support) team, setting up GitLab with an existing AD/LDAP solution will be a smooth and painless process. diff --git a/doc/administration/auth/ldap-ee.md b/doc/administration/auth/ldap-ee.md new file mode 100644 index 00000000000..aab3367086d --- /dev/null +++ b/doc/administration/auth/ldap-ee.md @@ -0,0 +1,557 @@ +# LDAP Additions in GitLab EE **[STARTER ONLY]** + +This is a continuation of the main [LDAP documentation](ldap.md), detailing LDAP +features specific to GitLab Enterprise Edition Starter, Premium and Ultimate. + +## Overview + +[LDAP](https://en.wikipedia.org/wiki/Lightweight_Directory_Access_Protocol) +stands for **Lightweight Directory Access Protocol**, which +is a standard application protocol for +accessing and maintaining distributed directory information services +over an Internet Protocol (IP) network. + +GitLab integrates with LDAP to support **user authentication**. This integration +works with most LDAP-compliant directory servers, including Microsoft Active +Directory, Apple Open Directory, Open LDAP, and 389 Server. +**GitLab Enterprise Edition** includes enhanced integration, including group +membership syncing. + +## Use cases + +- User sync: Once a day, GitLab will update users against LDAP +- Group sync: Once an hour, GitLab will update group membership + based on LDAP group members + +## Multiple LDAP servers + +With GitLab Enterprise Edition Starter, you can configure multiple LDAP servers +that your GitLab instance will connect to. 
+ +To add another LDAP server, you can start by duplicating the settings under +[the main configuration](ldap.md#configuration) and edit them to match the +additional LDAP server. + +Be sure to choose a different provider ID made of letters a-z and numbers 0-9. +This ID will be stored in the database so that GitLab can remember which LDAP +server a user belongs to. + +## User sync + +Once per day, GitLab will run a worker to check and update GitLab +users against LDAP. + +The process will execute the following access checks: + +1. Ensure the user is still present in LDAP +1. If the LDAP server is Active Directory, ensure the user is active (not + blocked/disabled state). This will only be checked if + `active_directory: true` is set in the LDAP configuration [^1] + +The user will be set to `ldap_blocked` state in GitLab if the above conditions +fail. This means the user will not be able to login or push/pull code. + +The process will also update the following user information: + +1. Email address +1. If `sync_ssh_keys` is set, SSH public keys +1. If Kerberos is enabled, Kerberos identity + +NOTE: **Note:** +The LDAP sync process updates existing users while new users will +be created on first sign in. + +## Group Sync + +If your LDAP supports the `memberof` property, GitLab will add the user to any +new groups they might be added to when the user logs in. That way they don't need +to wait for the hourly sync to be granted access to the groups that they are in +in LDAP. + +In GitLab Premium, we can also add a GitLab group to sync with one or multiple LDAP groups or we can +also add a filter. The filter must comply with the syntax defined in [RFC 2254](https://tools.ietf.org/search/rfc2254). + +A group sync process will run every hour on the hour, and `group_base` must be set +in LDAP configuration for LDAP synchronizations based on group CN to work. This allows +GitLab group membership to be automatically updated based on LDAP group members. 
+ +The `group_base` configuration should be a base LDAP 'container', such as an +'organization' or 'organizational unit', that contains LDAP groups that should +be available to GitLab. For example, `group_base` could be +`ou=groups,dc=example,dc=com`. In the config file it will look like the +following. + +**Omnibus configuration** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['ldap_servers'] = YAML.load <<-EOS + main: + ## snip... + ## + ## Base where we can search for groups + ## + ## Ex. ou=groups,dc=gitlab,dc=example + ## + ## + group_base: ou=groups,dc=example,dc=com + EOS + ``` + +1. [Reconfigure GitLab][reconfigure] for the changes to take effect. + +**Source configuration** + +1. Edit `/home/git/gitlab/config/gitlab.yml`: + + ```yaml + production: + ldap: + servers: + main: + # snip... + group_base: ou=groups,dc=example,dc=com + ``` + +1. [Restart GitLab][restart] for the changes to take effect. + +--- + +To take advantage of group sync, group owners or maintainers will need to create an +LDAP group link in their group **Settings > LDAP Groups** page. Multiple LDAP +groups and/or filters can be linked with a single GitLab group. When the link is +created, an access level/role is specified (Guest, Reporter, Developer, Maintainer, +or Owner). + +## Administrator sync + +As an extension of group sync, you can automatically manage your global GitLab +administrators. Specify a group CN for `admin_group` and all members of the +LDAP group will be given administrator privileges. The configuration will look +like the following. + +NOTE: **Note:** +Administrators will not be synced unless `group_base` is also +specified alongside `admin_group`. Also, only specify the CN of the admin +group, as opposed to the full DN. + +**Omnibus configuration** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['ldap_servers'] = YAML.load <<-EOS + main: + ## snip... + ## + ## Base where we can search for groups + ## + ## Ex. 
ou=groups,dc=gitlab,dc=example + ## + ## + group_base: ou=groups,dc=example,dc=com + + ## + ## The CN of a group containing GitLab administrators + ## + ## Ex. administrators + ## + ## Note: Not `cn=administrators` or the full DN + ## + ## + admin_group: my_admin_group + + EOS + ``` + +1. [Reconfigure GitLab][reconfigure] for the changes to take effect. + +**Source configuration** + +1. Edit `/home/git/gitlab/config/gitlab.yml`: + + ```yaml + production: + ldap: + servers: + main: + # snip... + group_base: ou=groups,dc=example,dc=com + admin_group: my_admin_group + ``` + +1. [Restart GitLab][restart] for the changes to take effect. + +## Adjusting LDAP user sync schedule + +> Introduced in GitLab Enterprise Edition Starter. + +NOTE: **Note:** +These are cron formatted values. You can use a crontab generator to create +these values, for example http://www.crontabgenerator.com/. + +By default, GitLab will run a worker once per day at 01:30 a.m. server time to +check and update GitLab users against LDAP. + +You can manually configure LDAP user sync times by setting the +following configuration values. The example below shows how to set LDAP user +sync to run once every 12 hours at the top of the hour. + +**Omnibus installations** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['ldap_sync_worker_cron'] = "0 */12 * * *" + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. + +**Source installations** + +1. Edit `config/gitlab.yml`: + + ```yaml + cron_jobs: + ldap_sync_worker_cron: + "0 */12 * * *" + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. + +## Adjusting LDAP group sync schedule + +NOTE: **Note:** +These are cron formatted values. You can use a crontab generator to create +these values, for example http://www.crontabgenerator.com/. + +By default, GitLab will run a group sync process every hour, on the hour.
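As a rough illustration of the cron values used by the schedule settings above, the following Ruby sketch splits an expression such as `"0 */12 * * *"` into its fields, assuming the standard five-field cron layout (minute, hour, day of month, month, day of week). The names `CRON_FIELD_NAMES` and `cron_fields` are hypothetical helpers written for this example; they are not part of GitLab or of any scheduling library.

```ruby
# Hypothetical helper, not a GitLab API: labels the fields of a cron
# expression, assuming the standard five-field layout.
CRON_FIELD_NAMES = %w[minute hour day_of_month month day_of_week].freeze

def cron_fields(expression)
  # String#split with no arguments splits on runs of whitespace.
  parts = expression.split
  unless parts.length == CRON_FIELD_NAMES.length
    raise ArgumentError,
          "expected #{CRON_FIELD_NAMES.length} fields, got #{parts.length}"
  end
  # Pair each field name with its value, e.g. "hour" => "*/12".
  CRON_FIELD_NAMES.zip(parts).to_h
end

# The user sync example from the documentation text: minute 0 of every 12th hour.
user_sync = cron_fields("0 */12 * * *")
puts "minute=#{user_sync['minute']} hour=#{user_sync['hour']}"
```

A check like this only counts fields; it does not validate ranges or step values, so it is a quick sanity test before editing the configuration rather than a full cron parser.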
+ +CAUTION: **Important:** +It's recommended that you do not use intervals that are too short, as this +could lead to multiple syncs running concurrently. This is primarily a concern +for installations with a large number of LDAP users. Please review the +[LDAP group sync benchmark metrics](#benchmarks) to see how +your installation compares before proceeding. + +You can manually configure LDAP group sync times by setting the +following configuration values. The example below shows how to set group +sync to run once every 2 hours at the top of the hour. + +**Omnibus installations** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['ldap_group_sync_worker_cron'] = "0 */2 * * *" + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. + +**Source installations** + +1. Edit `config/gitlab.yml`: + + ```yaml + cron_jobs: + ldap_group_sync_worker_cron: + "0 */2 * * *" + ``` + +1. [Restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect. + +## External groups + +> Introduced in GitLab Enterprise Edition Starter 8.9. + +Using the `external_groups` setting will allow you to mark all users belonging +to these groups as [external users](../../user/permissions.md#external-users-permissions). +Group membership is checked periodically through the `LdapGroupSync` background +task. + +**Omnibus configuration** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['ldap_servers'] = YAML.load <<-EOS + main: + ## snip... + ## + ## An array of CNs of groups containing users that should be considered external + ## + ## Ex. ['interns', 'contractors'] + ## + ## Note: Not `cn=interns` or the full DN + ## + external_groups: ['interns', 'contractors'] + EOS + ``` + +1. [Reconfigure GitLab][reconfigure] for the changes to take effect. + +**Source configuration** + +1. Edit `config/gitlab.yml`: + + ```yaml + production: + ldap: + servers: + main: + # snip...
+ external_groups: ['interns', 'contractors'] + ``` + +1. [Restart GitLab][restart] for the changes to take effect. + +## Group sync technical details + +There is a lot going on with group sync 'under the hood'. This section +outlines what LDAP queries are executed and what behavior you can expect +from group sync. + +Group member access will be downgraded from a higher level if their LDAP group +membership changes. For example, if a user has 'Owner' rights in a group and the +next group sync reveals they should only have 'Developer' privileges, their +access will be adjusted accordingly. The only exception is if the user is the +*last* owner in a group. Groups need at least one owner to fulfill +administrative duties. + +### Supported LDAP group types/attributes + +GitLab supports LDAP groups that use member attributes `member`, `submember`, +`uniquemember`, `memberof` and `memberuid`. This means group sync supports, at +least, LDAP groups with object class `groupOfNames`, `posixGroup`, and +`groupOfUniqueName`. Other object classes should work fine as long as members +are defined as one of the mentioned attributes. This also means GitLab supports +Microsoft Active Directory, Apple Open Directory, Open LDAP, and 389 Server. +Other LDAP servers should work, too. + +Active Directory also supports nested groups. Group sync will recursively +resolve membership if `active_directory: true` is set in the configuration file. + +> **Note:** Nested group membership will only be resolved if the nested group + also falls within the configured `group_base`. For example, if GitLab sees a + nested group with DN `cn=nested_group,ou=special_groups,dc=example,dc=com` but + the configured `group_base` is `ou=groups,dc=example,dc=com`, `cn=nested_group` + will be ignored. + +### Queries + +- Each LDAP group is queried a maximum of one time with base `group_base` and + filter `(cn=<cn_from_group_link>)`. 
+- If the LDAP group has the `memberuid` attribute, GitLab will execute another + LDAP query per member to obtain each user's full DN. These queries are + executed with base `base`, scope 'base object', and a filter depending on + whether `user_filter` is set. Filter may be `(uid=<uid_from_group>)` or a + joining of `user_filter`. + +### Benchmarks + +Group sync was written to be as performant as possible. Data is cached, database +queries are optimized, and LDAP queries are minimized. The last benchmark run +revealed the following metrics: + +For 20,000 LDAP users, 11,000 LDAP groups and 1,000 GitLab groups with 10 +LDAP group links each: + +- Initial sync (no existing members assigned in GitLab) took 1.8 hours +- Subsequent syncs (checking membership, no writes) took 15 minutes + +These metrics are meant to provide a baseline and performance may vary based on +any number of factors. This was a pretty extreme benchmark and most instances will +not have near this many users or groups. Disk speed, database performance, +network and LDAP server response time will affect these metrics. + +## Troubleshooting + +### Referral error + +If you see `LDAP search error: Referral` in the logs, or when troubleshooting +LDAP Group Sync, this error may indicate a configuration problem. The LDAP +configuration `/etc/gitlab/gitlab.rb` (Omnibus) or `config/gitlab.yml` (source) +is in YAML format and is sensitive to indentation. Check that `group_base` and +`admin_group` configuration keys are indented 2 spaces past the server +identifier. The default identifier is `main` and an example snippet looks like +the following: + +```yaml +main: # 'main' is the GitLab 'provider ID' of this LDAP server + label: 'LDAP' + host: 'ldap.example.com' + ... 
+ group_base: 'cn=my_group,ou=groups,dc=example,dc=com' + admin_group: 'my_admin_group' +``` + +[reconfigure]: ../restart_gitlab.md#omnibus-gitlab-reconfigure +[restart]: ../restart_gitlab.md#installations-from-source + +[^1]: In Active Directory, a user is marked as disabled/blocked if the user + account control attribute (`userAccountControl:1.2.840.113556.1.4.803`) + has bit 2 set. See https://ctogonewild.com/2009/09/03/bitmask-searches-in-ldap/ + for more information. + +### User DN has changed + +When an LDAP user is created in GitLab, their LDAP DN is stored for later reference. + +If GitLab cannot find a user by their DN, it will attempt to fallback +to finding the user by their email. If the lookup is successful, GitLab will +update the stored DN to the new value. + +### User is not being added to a group + +Sometimes you may think a particular user should be added to a GitLab group via +LDAP group sync, but for some reason it's not happening. There are several +things to check to debug the situation. + +- Ensure LDAP configuration has a `group_base` specified. This configuration is + required for group sync to work properly. +- Ensure the correct LDAP group link is added to the GitLab group. Check group + links by visiting the GitLab group, then **Settings dropdown -> LDAP groups**. +- Check that the user has an LDAP identity + 1. Sign in to GitLab as an administrator user. + 1. Navigate to **Admin area -> Users**. + 1. Search for the user + 1. Open the user, by clicking on their name. Do not click 'Edit'. + 1. Navigate to the **Identities** tab. There should be an LDAP identity with + an LDAP DN as the 'Identifier'. + +If all of the above looks good, jump in to a little more advanced debugging. +Often, the best way to learn more about why group sync is behaving a certain +way is to enable debug logging. There is verbose output that details every +step of the sync. + +1. 
Start a Rails console + + ```bash + # For Omnibus installations + sudo gitlab-rails console + + # For installations from source + sudo -u git -H bundle exec rails console production + ``` +1. Set the log level to debug (only for this session): + + ```ruby + Rails.logger.level = Logger::DEBUG + ``` +1. Choose a GitLab group to test with. This group should have an LDAP group link + already configured. If the output is `nil`, the group could not be found. + If a bunch of group attributes are output, your group was found successfully. + + ```ruby + group = Group.find_by(name: 'my_group') + + # Output + => #<Group:0x007fe825196558 id: 1234, name: "my_group"...> + ``` +1. Run a group sync for this particular group. + + ```ruby + EE::Gitlab::Auth::LDAP::Sync::Group.execute_all_providers(group) + ``` +1. Look through the output of the sync. See [example log output](#example-log-output) + below for more information about the output. +1. If you still aren't able to see why the user isn't being added, query the + LDAP group directly to see what members are listed. Still in the Rails console, + run the following query: + + ```ruby + adapter = Gitlab::Auth::LDAP::Adapter.new('ldapmain') # If `main` is the LDAP provider + ldap_group = EE::Gitlab::Auth::LDAP::Group.find_by_cn('group_cn_here', adapter) + + # Output + => #<EE::Gitlab::Auth::LDAP::Group:0x007fcbdd0bb6d8 + ``` +1. Query the LDAP group's member DNs and see if the user's DN is in the list. + One of the DNs here should match the 'Identifier' from the LDAP identity + checked earlier. If it doesn't, the user does not appear to be in the LDAP + group. + + ```ruby + ldap_group.member_dns + + # Output + => ["uid=john,ou=people,dc=example,dc=com", "uid=mary,ou=people,dc=example,dc=com"] + ``` +1. Some LDAP servers don't store members by DN. Rather, they use UIDs instead. + If you didn't see results from the last query, try querying by UIDs instead. 
+ + ```ruby + ldap_group.member_uids + + # Output + => ['john','mary'] + ``` + +#### Example log output + +The output of the sync command will be very verbose, but contains lots of +helpful information. For the most part you can ignore log entries that are SQL +statements. + +The following entry indicates the point where syncing actually begins: + +```bash +Started syncing all providers for 'my_group' group +``` + +The following entry shows an array of all user DNs GitLab sees in the LDAP server. +Note that these are the users for a single LDAP group, not a GitLab group. If +you have multiple LDAP groups linked to this GitLab group, you will see multiple +log entries like this - one for each LDAP group. If you don't see an LDAP user +DN in this log entry, LDAP is not returning the user when GitLab does the lookup. +Verify the user is actually in the LDAP group. + +```bash +Members in 'ldap_group_1' LDAP group: ["uid=john0,ou=people,dc=example,dc=com", +"uid=mary0,ou=people,dc=example,dc=com", "uid=john1,ou=people,dc=example,dc=com", +"uid=mary1,ou=people,dc=example,dc=com", "uid=john2,ou=people,dc=example,dc=com", +"uid=mary2,ou=people,dc=example,dc=com", "uid=john3,ou=people,dc=example,dc=com", +"uid=mary3,ou=people,dc=example,dc=com", "uid=john4,ou=people,dc=example,dc=com", +"uid=mary4,ou=people,dc=example,dc=com"] +``` + +Shortly after each of the above entries, you will see a hash of resolved member +access levels. This hash represents all user DNs GitLab thinks should have +access to this group, and at which access level (role). This hash is additive, +and more DNs may be added, or existing entries modified, based on additional +LDAP group lookups. The very last occurrence of this entry should indicate +exactly which users GitLab believes should be added to the group. 
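The integer values in the resolved-access hash are GitLab access levels (10 is 'Guest', 20 is 'Reporter', 30 is 'Developer', 40 is 'Maintainer', and 50 is 'Owner'). When reading a long log entry, it can help to translate them into role names. The following is a minimal sketch in plain Ruby. The constant `ACCESS_LEVEL_NAMES` and the helper `describe_access` are hypothetical names made up for this illustration and are not part of GitLab.

```ruby
# Access-level integers and the role names they stand for, as listed in this
# document. ACCESS_LEVEL_NAMES is a hypothetical constant for this example only.
ACCESS_LEVEL_NAMES = {
  10 => 'Guest',
  20 => 'Reporter',
  30 => 'Developer',
  40 => 'Maintainer',
  50 => 'Owner'
}.freeze

# Hypothetical helper: convert a resolved-access hash (DN => integer level)
# into DN => role name. Unknown levels are kept visible rather than dropped.
def describe_access(resolved)
  resolved.transform_values do |level|
    ACCESS_LEVEL_NAMES.fetch(level, "Unknown (#{level})")
  end
end

# Sample data shaped like the 'Resolved ... group member access' log entry.
resolved = {
  'uid=john0,ou=people,dc=example,dc=com' => 30,
  'uid=mary0,ou=people,dc=example,dc=com' => 50
}

describe_access(resolved).each { |dn, role| puts "#{dn}: #{role}" }
```

Pasting the hash from the log into `resolved` makes it easier to spot a user who resolved to an unexpected role.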
+ +> **Note:** 10 is 'Guest', 20 is 'Reporter', 30 is 'Developer', 40 is 'Maintainer' + and 50 is 'Owner' + +```bash +Resolved 'my_group' group member access: {"uid=john0,ou=people,dc=example,dc=com"=>30, +"uid=mary0,ou=people,dc=example,dc=com"=>30, "uid=john1,ou=people,dc=example,dc=com"=>30, +"uid=mary1,ou=people,dc=example,dc=com"=>30, "uid=john2,ou=people,dc=example,dc=com"=>30, +"uid=mary2,ou=people,dc=example,dc=com"=>30, "uid=john3,ou=people,dc=example,dc=com"=>30, +"uid=mary3,ou=people,dc=example,dc=com"=>30, "uid=john4,ou=people,dc=example,dc=com"=>30, +"uid=mary4,ou=people,dc=example,dc=com"=>30} +``` + +It's not uncommon to see warnings like the following. These indicate that GitLab +would have added the user to a group, but the user could not be found in GitLab. +Usually this is not a cause for concern. + +If you think a particular user should already exist in GitLab, but you're seeing +this entry, it could be due to a mismatched DN stored in GitLab. See +[User DN has changed](#User-DN-has-changed) to update the user's LDAP identity. + +```bash +User with DN `uid=john0,ou=people,dc=example,dc=com` should have access +to 'my_group' group but there is no user in GitLab with that +identity. Membership will be updated once the user signs in for +the first time. +``` + +Finally, the following entry says syncing has finished for this group: + +```bash +Finished syncing all providers for 'my_group' group +``` diff --git a/doc/administration/auth/ldap.md b/doc/administration/auth/ldap.md index 440c2b1285a..90406bf0f5b 100644 --- a/doc/administration/auth/ldap.md +++ b/doc/administration/auth/ldap.md @@ -14,7 +14,7 @@ including group membership syncing as well as multiple LDAP servers support. The information on this page is relevant for both GitLab CE and EE. For more details about EE-specific LDAP features, see the -[LDAP Enterprise Edition documentation](https://docs.gitlab.com/ee/administration/auth/ldap-ee.html). 
+[LDAP Enterprise Edition documentation](ldap-ee.md). ## Security @@ -39,7 +39,7 @@ immediately block all access. NOTE: **Note**: GitLab Enterprise Edition Starter supports a -[configurable sync time](https://docs.gitlab.com/ee/administration/auth/ldap-ee.html#adjusting-ldap-user-and-group-sync-schedules), +[configurable sync time](ldap-ee.md#adjusting-ldap-user-sync-schedule), with a default of one hour. ## Git password authentication @@ -56,6 +56,7 @@ to connect to one GitLab server. For a complete guide on configuring LDAP with GitLab Community Edition, please check the admin guide [How to configure LDAP with GitLab CE](how_to_configure_ldap_gitlab_ce/index.md). +For GitLab Enterprise Editions, see also [How to configure LDAP with GitLab EE](how_to_configure_ldap_gitlab_ee/index.md). To enable LDAP integration you need to add your LDAP server settings in `/etc/gitlab/gitlab.rb` or `/home/git/gitlab/config/gitlab.yml` for Omnibus @@ -380,7 +381,7 @@ group, you can use the following syntax: Find more information about this "LDAP_MATCHING_RULE_IN_CHAIN" filter at <https://docs.microsoft.com/en-us/windows/desktop/ADSI/search-filter-syntax>. Support for nested members in the user filter should not be confused with -[group sync nested groups support (EE only)](https://docs.gitlab.com/ee/administration/auth/ldap-ee.html#supported-ldap-group-types-attributes). +[group sync nested groups support (EE only)](ldap-ee.md#supported-ldap-group-typesattributes). Please note that GitLab does not support the custom filter syntax used by omniauth-ldap. diff --git a/doc/administration/auth/smartcard.md b/doc/administration/auth/smartcard.md new file mode 100644 index 00000000000..1107b955c4a --- /dev/null +++ b/doc/administration/auth/smartcard.md @@ -0,0 +1,186 @@ +# Smartcard authentication + +GitLab supports authentication using smartcards. + +## Authentication methods + +GitLab supports two authentication methods: + +- X.509 certificates with local databases. +- LDAP servers. 
+ +### Authentication against a local database with X.509 certificates + +> [Introduced](https://gitlab.com/gitlab-org/gitlab-ee/issues/726) in +[GitLab Premium](https://about.gitlab.com/pricing/) 11.6 as an experimental +feature. Smartcard authentication against local databases may change or be +removed completely in future releases. + +Smartcards with X.509 certificates can be used to authenticate with GitLab. + +To use a smartcard with an X.509 certificate to authenticate against a local +database with GitLab, `CN` and `emailAddress` must be defined in the +certificate. For example: + +``` +Certificate: + Data: + Version: 1 (0x0) + Serial Number: 12856475246677808609 (0xb26b601ecdd555e1) + Signature Algorithm: sha256WithRSAEncryption + Issuer: O=Random Corp Ltd, CN=Random Corp + Validity + Not Before: Oct 30 12:00:00 2018 GMT + Not After : Oct 30 12:00:00 2019 GMT + Subject: CN=Gitlab User, emailAddress=gitlab-user@example.com +``` + +### Authentication against an LDAP server + +> [Introduced](https://gitlab.com/gitlab-org/gitlab-ee/issues/7693) in +[GitLab Premium](https://about.gitlab.com/pricing/) 11.8 as an experimental +feature. Smartcard authentication against an LDAP server may change or be +removed completely in future releases. + +GitLab implements a standard way of certificate matching following +[RFC4523](https://tools.ietf.org/html/rfc4523). It uses the +`certificateExactMatch` certificate matching rule against the `userCertificate` +attribute. As a prerequisite, you must use an LDAP server that: + +- Supports the `certificateExactMatch` matching rule. +- Has the certificate stored in the `userCertificate` attribute. + +## Configure GitLab for smartcard authentication + +**For Omnibus installations** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['smartcard_enabled'] = true + gitlab_rails['smartcard_ca_file'] = "/etc/ssl/certs/CA.pem" + gitlab_rails['smartcard_client_certificate_required_port'] = 3444 + ``` + +1. 
Save the file and [reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) + GitLab for the changes to take effect. + +--- + +**For installations from source** + +1. Configure NGINX to request a client side certificate + + In NGINX configuration, an **additional** server context must be defined with + the same configuration except: + + - The additional NGINX server context must be configured to run on a different + port: + + ``` + listen *:3444 ssl; + ``` + + - The additional NGINX server context must be configured to require the client + side certificate: + + ``` + ssl_verify_depth 2; + ssl_client_certificate /etc/ssl/certs/CA.pem; + ssl_verify_client on; + ``` + + - The additional NGINX server context must be configured to forward the client + side certificate: + + ``` + proxy_set_header X-SSL-Client-Certificate $ssl_client_escaped_cert; + ``` + + For example, the following is an example server context in an NGINX + configuration file (eg. in `/etc/nginx/sites-available/gitlab-ssl`): + + ``` + server { + listen *:3444 ssl; + + # certificate for configuring SSL + ssl_certificate /path/to/example.com.crt; + ssl_certificate_key /path/to/example.com.key; + + ssl_verify_depth 2; + # CA certificate for client side certificate verification + ssl_client_certificate /etc/ssl/certs/CA.pem; + ssl_verify_client on; + + location / { + proxy_set_header Host $http_host; + proxy_set_header X-Real-IP $remote_addr; + proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; + proxy_set_header X-Forwarded-Proto $scheme; + proxy_set_header Upgrade $http_upgrade; + proxy_set_header Connection $connection_upgrade; + + proxy_set_header X-SSL-Client-Certificate $ssl_client_escaped_cert; + + proxy_read_timeout 300; + + proxy_pass http://gitlab-workhorse; + } + } + ``` + +1. 
Edit `config/gitlab.yml`: + + ```yaml + ## Smartcard authentication settings + smartcard: + # Allow smartcard authentication + enabled: true + + # Path to a file containing a CA certificate + ca_file: '/etc/ssl/certs/CA.pem' + + # Port where the client side certificate is requested by NGINX + client_certificate_required_port: 3444 + ``` + +1. Save the file and [restart](../restart_gitlab.md#installations-from-source) + GitLab for the changes to take effect. + +### Additional steps when authenticating against an LDAP server + +**For Omnibus installations** + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_rails['ldap_servers'] = YAML.load <<-EOS + main: + # snip... + # Enable smartcard authentication against the LDAP server. Valid values + # are "false", "optional", and "required". + smartcard_auth: optional + EOS + ``` + +1. Save the file and [reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) + GitLab for the changes to take effect. + +**For installations from source** + +1. Edit `config/gitlab.yml`: + + ```yaml + production: + ldap: + servers: + main: + # snip... + # Enable smartcard authentication against the LDAP server. Valid values + # are "false", "optional", and "required". + smartcard_auth: optional + ``` + +1. Save the file and [restart](../restart_gitlab.md#installations-from-source) + GitLab for the changes to take effect. 
diff --git a/doc/administration/compliance.md b/doc/administration/compliance.md index 72cb57fb36c..246addb6dc9 100644 --- a/doc/administration/compliance.md +++ b/doc/administration/compliance.md @@ -11,8 +11,8 @@ GitLab’s [security features](../security/README.md) may also help you meet rel |**[Enforce TOS acceptance](../user/admin_area/settings/terms.md)**<br>Enforce your users accepting new terms of service by blocking GitLab traffic.|Core+|| |**[Email all users of a project, group, or entire server](../user/admin_area/settings/terms.md)**<br>An admin can email groups of users based on project or group membership, or email everyone using the GitLab instance. This is great for scheduled maintenance or upgrades.|Starter+|| |**[Omnibus package supports log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-forwarding)**<br>Forward your logs to a central system.|Starter+|| -|**[Lock project membership to group](https://docs.gitlab.com/ee/user/group/index.html#member-lock-starter)**<br>Group owners can prevent new members from being added to projects within a group.|Starter+|✓| -|**[LDAP group sync](https://docs.gitlab.com/ee/administration/auth/ldap-ee.html#group-sync)**<br>GitLab Enterprise Edition gives admins the ability to automatically sync groups and manage SSH keys, permissions, and authentication, so you can focus on building your product, not configuring your tools.|Starter+|| -|**[LDAP group sync filters](https://docs.gitlab.com/ee/administration/auth/ldap-ee.html#group-sync)**<br>GitLab Enterprise Edition Premium gives more flexibility to synchronize with LDAP based on filters, meaning you can leverage LDAP attributes to map GitLab permissions.|Premium+|| -|**[Audit logs](https://docs.gitlab.com/ee/administration/audit_events.html)**<br>To maintain the integrity of your code, GitLab Enterprise Edition Premium gives admins the ability to view any modifications made within the GitLab server in an advanced audit log system, so you can 
control, analyze and track every change.|Premium+|| -|**[Auditor users](https://docs.gitlab.com/ee/administration/auditor_users.html)**<br>Auditor users are users who are given read-only access to all projects, groups, and other resources on the GitLab instance.|Premium+||
\ No newline at end of file +|**[Lock project membership to group](../user/group/index.md#member-lock-starter)**<br>Group owners can prevent new members from being added to projects within a group.|Starter+|✓| +|**[LDAP group sync](auth/ldap-ee.md#group-sync)**<br>GitLab Enterprise Edition gives admins the ability to automatically sync groups and manage SSH keys, permissions, and authentication, so you can focus on building your product, not configuring your tools.|Starter+|| +|**[LDAP group sync filters](auth/ldap-ee.md#group-sync)**<br>GitLab Enterprise Edition Premium gives more flexibility to synchronize with LDAP based on filters, meaning you can leverage LDAP attributes to map GitLab permissions.|Premium+|| +|**[Audit logs](audit_events.md)**<br>To maintain the integrity of your code, GitLab Enterprise Edition Premium gives admins the ability to view any modifications made within the GitLab server in an advanced audit log system, so you can control, analyze and track every change.|Premium+|| +|**[Auditor users](auditor_users.md)**<br>Auditor users are users who are given read-only access to all projects, groups, and other resources on the GitLab instance.|Premium+|| diff --git a/doc/administration/custom_hooks.md b/doc/administration/custom_hooks.md index 60ad4bf4e8f..da661b7f121 100644 --- a/doc/administration/custom_hooks.md +++ b/doc/administration/custom_hooks.md @@ -1,11 +1,13 @@ # Custom Git Hooks > **Note:** Custom Git hooks must be configured on the filesystem of the GitLab -server. Only GitLab server administrators will be able to complete these tasks. -Please explore [webhooks] and [CI] as an option if you do not -have filesystem access. For a user configurable Git hook interface, see -[Push Rules](https://docs.gitlab.com/ee/push_rules/push_rules.html), -available in GitLab Enterprise Edition. +> server. Only GitLab server administrators will be able to complete these tasks. 
+> Please explore [webhooks] and [CI] as an option if you do not +> have filesystem access. For a user configurable Git hook interface, see +> [Push Rules](https://docs.gitlab.com/ee/push_rules/push_rules.html), +> available in GitLab Enterprise Edition. +> +> **Note:** Custom Git hooks won't be replicated to secondary nodes if you use [GitLab Geo][gitlab-geo] Git natively supports hooks that are executed on different actions. Examples of server-side git hooks include pre-receive, post-receive, and update. @@ -85,5 +87,6 @@ STDERR takes precedence over STDOUT. [CI]: ../ci/README.md [hooks]: https://git-scm.com/book/en/v2/Customizing-Git-Git-Hooks#Server-Side-Hooks [webhooks]: ../user/project/integrations/webhooks.md +[gitlab-geo]: ../administration/geo/replication/index.md [5073]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/5073 [93]: https://gitlab.com/gitlab-org/gitlab-shell/merge_requests/93 diff --git a/doc/administration/database_load_balancing.md b/doc/administration/database_load_balancing.md new file mode 100644 index 00000000000..7f3be402b84 --- /dev/null +++ b/doc/administration/database_load_balancing.md @@ -0,0 +1,277 @@ +# Database Load Balancing **[PREMIUM ONLY]** + +> [Introduced][ee-1283] in [GitLab Premium][eep] 9.0. + +Distribute read-only queries among multiple database servers. + +## Overview + +Database load balancing improves the distribution of database workloads across +multiple computing resources. Load balancing aims to optimize resource use, +maximize throughput, minimize response time, and avoid overload of any single +resource. Using multiple components with load balancing instead of a single +component may increase reliability and availability through redundancy. +[_Wikipedia article_][wikipedia] + +When database load balancing is enabled in GitLab, the load is balanced using +a simple round-robin algorithm, without any external dependencies such as Redis. 
+Load balancing is not enabled for Sidekiq as this would lead to consistency +problems, and Sidekiq mostly performs writes anyway. + +In the following image, you can see the load is balanced rather evenly among +all the secondaries (`db4`, `db5`, `db6`). Because `SELECT` queries are not +sent to the primary (unless necessary), the primary (`db3`) hardly has any load. + +![DB load balancing graph](img/db_load_balancing_postgres_stats.png) + +## Requirements + +For load balancing to work you will need PostgreSQL 9.2 or newer; +[**MySQL is not supported**][db-req]. You also need to make sure that you have +at least 1 secondary in [hot standby][hot-standby] mode. + +Load balancing also requires that the configured hosts **always** point to the +primary, even after a database failover. Furthermore, the additional hosts to +balance load among must **always** point to secondary databases. This means that +you should put a load balancer in front of every database, and have GitLab connect +to those load balancers. + +For example, say you have a primary (`db1.gitlab.com`) and two secondaries, +`db2.gitlab.com` and `db3.gitlab.com`. For this setup you will need to have 3 +load balancers, one for every host. For example: + +* `primary.gitlab.com` forwards to `db1.gitlab.com` +* `secondary1.gitlab.com` forwards to `db2.gitlab.com` +* `secondary2.gitlab.com` forwards to `db3.gitlab.com` + +Now let's say that a failover happens and `db2.gitlab.com` becomes the new primary. This +means forwarding should now happen as follows: + +* `primary.gitlab.com` forwards to `db2.gitlab.com` +* `secondary1.gitlab.com` forwards to `db1.gitlab.com` +* `secondary2.gitlab.com` forwards to `db3.gitlab.com` + +GitLab does not take care of this for you, so you will need to do so yourself. + +Finally, load balancing requires that GitLab can connect to all hosts using the +same credentials and port as configured in the +[Enabling load balancing](#enabling-load-balancing) section. 
Using +different ports or credentials for different hosts is not supported. + +## Use cases + +- For GitLab instances with thousands of users and high traffic, you can use + database load balancing to reduce the load on the primary database and + increase responsiveness, thus resulting in faster page load inside GitLab. + +## Enabling load balancing + +For the environment in which you want to use load balancing, you'll need to add +the following. This will balance the load between `host1.example.com` and +`host2.example.com`. + +**In Omnibus installations:** + +1. Edit `/etc/gitlab/gitlab.rb` and add the following line: + + ```ruby + gitlab_rails['db_load_balancing'] = { 'hosts' => ['host1.example.com', 'host2.example.com'] } + ``` + +1. Save the file and [reconfigure GitLab][] for the changes to take effect. + +--- + +**In installations from source:** + +1. Edit `/home/git/gitlab/config/database.yml` and add or amend the following lines: + + ```yaml + production: + username: gitlab + database: gitlab + encoding: unicode + load_balancing: + hosts: + - host1.example.com + - host2.example.com + ``` + +1. Save the file and [restart GitLab][] for the changes to take effect. + +## Service Discovery + +> [Introduced][ee-5883] in [GitLab Premium][eep] 11.0. + +Service discovery allows GitLab to automatically retrieve a list of secondary +databases to use, instead of having to manually specify these in the +`database.yml` configuration file. Service discovery works by periodically +checking a DNS A record, using the IPs returned by this record as the addresses +for the secondaries. For service discovery to work, all you need is a DNS server +and an A record containing the IP addresses of your secondaries. 
+ +To use service discovery you need to change your `database.yml` configuration +file so it looks like the following: + +```yaml +production: + username: gitlab + database: gitlab + encoding: unicode + load_balancing: + discover: + nameserver: localhost + record: secondary.postgresql.service.consul + port: 8600 + interval: 60 + disconnect_timeout: 120 +``` + +Here the `discover:` section specifies the configuration details to use for +service discovery. + +### Configuration + +The following options can be set: + +| Option | Description | Default | +|----------------------|---------------------------------------------------------------------------------------------------|-----------| +| `nameserver` | The nameserver to use for looking up the DNS record. | localhost | +| `record` | The A record to look up. This option is required for service discovery to work. | | +| `port` | The port of the nameserver. | 8600 | +| `interval` | The minimum time in seconds between checking the DNS record. | 60 | +| `disconnect_timeout` | The time in seconds after which an old connection is closed, after the list of hosts was updated. | 120 | +| `use_tcp` | Lookup DNS resources using TCP instead of UDP | false | + +The `interval` value specifies the _minimum_ time between checks. If the A +record has a TTL greater than this value, then service discovery will honor said +TTL. For example, if the TTL of the A record is 90 seconds, then service +discovery will wait at least 90 seconds before checking the A record again. + +When the list of hosts is updated, it might take a while for the old connections +to be terminated. The `disconnect_timeout` setting can be used to enforce an +upper limit on the time it will take to terminate all old database connections. + +Some nameservers (like [Consul][consul-udp]) can return a truncated list of hosts when +queried over UDP. To overcome this issue, you can use TCP for querying by setting +`use_tcp` to `true`. 
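The relationship between `interval` and the record's TTL described above amounts to taking the larger of the two values. The following is a minimal sketch of that rule in plain Ruby. The method name `next_check_delay` is hypothetical and this is not GitLab's actual implementation.

```ruby
# Hypothetical helper, not GitLab's code: compute how long service discovery
# would wait before checking the A record again, given the configured
# `interval` and the TTL returned with the record (both in seconds).
# `interval` is a minimum, so a larger TTL takes precedence.
def next_check_delay(interval:, ttl:)
  [interval, ttl].max
end

# TTL above the interval: the TTL is honored.
puts next_check_delay(interval: 60, ttl: 90)

# TTL below the interval: the interval still applies as the minimum.
puts next_check_delay(interval: 60, ttl: 30)
```

Under this reading, lowering `interval` below the record's TTL has no effect on how often the record is checked.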
+ +### Forking + +If you use an application server that forks, such as Unicorn, you _have to_ +update your Unicorn configuration to start service discovery _after_ a fork. +Failure to do so will lead to service discovery only running in the parent +process. If you are using Unicorn, then you can add the following to your +Unicorn configuration file: + +```ruby +after_fork do |server, worker| + defined?(Gitlab::Database::LoadBalancing) && + Gitlab::Database::LoadBalancing.start_service_discovery +end +``` + +This will ensure that service discovery is started in both the parent and all +child processes. + +## Balancing queries + +Read-only `SELECT` queries will be balanced among all the secondary hosts. +Everything else (including transactions) will be executed on the primary. +Queries such as `SELECT ... FOR UPDATE` are also executed on the primary. + +## Prepared statements + +Prepared statements don't work well with load balancing and are disabled +automatically when load balancing is enabled. This should have no impact on +response timings. + +## Primary sticking + +After a write has been performed, GitLab will stick to using the primary for a +certain period of time, scoped to the user that performed the write. GitLab will +revert back to using secondaries when they have either caught up, or after 30 +seconds. + +## Failover handling + +In the event of a failover or an unresponsive database, the load balancer will +try to use the next available host. If no secondaries are available the +operation is performed on the primary instead. + +In the event of a connection error being produced when writing data, the +operation will be retried up to 3 times using an exponential back-off. + +When using load balancing, you should be able to safely restart a database server +without it immediately leading to errors being presented to the users. 
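To illustrate the retry behaviour described under failover handling (up to 3 retries with an exponential back-off), here is a minimal sketch in plain Ruby. It is not GitLab's implementation. The method name `with_retries`, the `base_delay` value, and the injectable `sleeper` are assumptions made for this example.

```ruby
# Illustrative sketch only, not GitLab's code. Runs the given block and, if it
# raises, retries it up to `max_retries` times. The wait doubles after every
# failed attempt (base_delay, 2 * base_delay, 4 * base_delay, ...).
# `sleeper` is injectable so the back-off can be observed without waiting.
def with_retries(max_retries: 3, base_delay: 0.1, sleeper: Kernel.method(:sleep))
  attempt = 0
  begin
    yield
  rescue StandardError
    # Give up and re-raise once all retries have been used.
    raise if attempt >= max_retries

    sleeper.call(base_delay * (2**attempt))
    attempt += 1
    retry
  end
end

# Usage: a write that fails twice with a simulated connection error and then
# succeeds. The recorded delays show the exponential back-off.
delays = []
calls = 0
result = with_retries(sleeper: ->(seconds) { delays << seconds }) do
  calls += 1
  raise 'simulated connection error' if calls < 3

  :written
end

puts "calls=#{calls} delays=#{delays.inspect} result=#{result}"
```

With `max_retries: 3`, an operation that keeps failing is attempted four times in total (the initial attempt plus three retries) before the error is raised to the caller.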
+ +## Logging + +The load balancer logs various messages, such as: + +* When a host is marked as offline +* When a host comes back online +* When all secondaries are offline + +Each log message contains the tag `[DB-LB]` to make searching/filtering of such +log entries easier. For example: + +``` +[DB-LB] Host 10.123.2.5 came back online +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Host 10.123.2.6 came back online +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Marking host 10.123.2.7 as offline +[DB-LB] Host 10.123.2.7 came back online +[DB-LB] Host 10.123.2.7 came back online +``` + +## Handling Stale Reads + +> [Introduced][ee-3526] in [GitLab Premium][eep] 10.3. + +To prevent reading from an outdated secondary the load balancer will check if it +is in sync with the primary. If the data is determined to be recent enough the +secondary can be used, otherwise it will be ignored. To reduce the overhead of +these checks we only perform these checks at certain intervals. + +There are three configuration options that influence this behaviour: + +| Option | Description | Default | +|------------------------------|----------------------------------------------------------------------------------------------------------------|------------| +| `max_replication_difference` | The amount of data (in bytes) a secondary is allowed to lag behind when it hasn't replicated data for a while. | 8 MB | +| `max_replication_lag_time` | The maximum number of seconds a secondary is allowed to lag behind before we stop using it. | 60 seconds | +| `replica_check_interval` | The minimum number of seconds we have to wait before checking the status of a secondary. | 60 seconds | + +The defaults should be sufficient for most users. 
Should you want to change them +you can specify them in `config/database.yml` like so: + +```yaml +production: + username: gitlab + database: gitlab + encoding: unicode + load_balancing: + hosts: + - host1.example.com + - host2.example.com + max_replication_difference: 16777216 # 16 MB + max_replication_lag_time: 30 + replica_check_interval: 30 +``` + +[hot-standby]: https://www.postgresql.org/docs/9.6/static/hot-standby.html +[ee-1283]: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/1283 +[eep]: https://about.gitlab.com/pricing/ +[reconfigure gitlab]: restart_gitlab.md#omnibus-gitlab-reconfigure "How to reconfigure Omnibus GitLab" +[restart gitlab]: restart_gitlab.md#installations-from-source "How to restart GitLab" +[wikipedia]: https://en.wikipedia.org/wiki/Load_balancing_(computing) +[db-req]: ../install/requirements.md#database +[ee-3526]: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/3526 +[ee-5883]: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/5883 +[consul-udp]: https://www.consul.io/docs/agent/dns.html#udp-based-dns-queries diff --git a/doc/administration/geo/disaster_recovery/background_verification.md b/doc/administration/geo/disaster_recovery/background_verification.md new file mode 100644 index 00000000000..7d2fd51f834 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/background_verification.md @@ -0,0 +1,172 @@ +# Automatic background verification **[PREMIUM ONLY]** + +NOTE: **Note:** +Automatic background verification of repositories and wikis was added in +GitLab EE 10.6 but is enabled by default only on GitLab EE 11.1. You can +disable or enable this feature manually by following +[these instructions](#disabling-or-enabling-the-automatic-background-verification). + +Automatic background verification ensures that the transferred data matches a +calculated checksum. If the checksum of the data on the **primary** node matches checksum of the +data on the **secondary** node, the data transferred successfully. 
Following a planned failover, +any corrupted data may be **lost**, depending on the extent of the corruption. + +If verification fails on the **primary** node, this indicates that Geo is +successfully replicating a corrupted object; restore it from backup or remove +it from the **primary** node to resolve the issue. + +If verification succeeds on the **primary** node but fails on the **secondary** node, +this indicates that the object was corrupted during the replication process. +Geo actively tries to correct verification failures by marking the repository to +be resynced with a backoff period. If you want to reset the verification for +these failures, follow [these instructions][reset-verification]. + +If verification is lagging significantly behind replication, consider giving +the node more time before scheduling a planned failover. + +## Disabling or enabling the automatic background verification + +Run the following commands in a Rails console on the **primary** node: + +```sh +# Omnibus GitLab +gitlab-rails console + +# Installation from source +cd /home/git/gitlab +sudo -u git -H bin/rails console RAILS_ENV=production +``` + +To check if automatic background verification is enabled: + +```ruby +Gitlab::Geo.repository_verification_enabled? +``` + +To disable automatic background verification: + +```ruby +Feature.disable('geo_repository_verification') +``` + +To enable automatic background verification: + +```ruby +Feature.enable('geo_repository_verification') +``` + +## Repository verification + +Navigate to the **Admin Area > Geo** dashboard on the **primary** node and expand +the **Verification information** tab for that node to view automatic checksumming +status for repositories and wikis. Successes are shown in green, pending work +in grey, and failures in red.
+ +![Verification status](img/verification-status-primary.png) + +Navigate to the **Admin Area > Geo** dashboard on the **secondary** node and expand +the **Verification information** tab for that node to view automatic verification +status for repositories and wikis. As with checksumming, successes are shown in +green, pending work in grey, and failures in red. + +![Verification status](img/verification-status-secondary.png) + +## Using checksums to compare Geo nodes + +To check the health of Geo **secondary** nodes, we use a checksum over the list of +Git references and their values. The checksum includes `HEAD`, `heads`, `tags`, +`notes`, and GitLab-specific references to ensure true consistency. If two nodes +have the same checksum, then they definitely hold the same references. We compute +the checksum for every node after every update to make sure that they are all +in sync. + +## Repository re-verification + +> [Introduced](https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/8550) in GitLab Enterprise Edition 11.6. Available in [GitLab Premium](https://about.gitlab.com/pricing/). + +Due to bugs or transient infrastructure failures, it is possible for Git +repositories to change unexpectedly without being marked for verification. +Geo constantly reverifies the repositories to ensure the integrity of the +data. The default and recommended re-verification interval is 7 days, though +an interval as short as 1 day can be set. Shorter intervals reduce risk but +increase load and vice versa. + +Navigate to the **Admin Area > Geo** dashboard on the **primary** node, and +click the **Edit** button for the **primary** node to customize the minimum +re-verification interval: + +![Re-verification interval](img/reverification-interval.png) + +The automatic background re-verification is enabled by default, but you can +disable it if you need to.
Run the following commands in a Rails console on the +**primary** node: + +```sh +# Omnibus GitLab +gitlab-rails console + +# Installation from source +cd /home/git/gitlab +sudo -u git -H bin/rails console RAILS_ENV=production +``` + +To disable automatic background re-verification: + +```ruby +Feature.disable('geo_repository_reverification') +``` + +To enable automatic background re-verification: + +```ruby +Feature.enable('geo_repository_reverification') +``` + +## Reset verification for projects where verification has failed + +Geo actively tries to correct verification failures by marking the repository to +be resynced with a backoff period. If you want to reset them manually, this +rake task marks projects where verification has failed or the checksums mismatch +to be resynced without the backoff period: + +For repositories: + +- Omnibus Installation + + ```sh + sudo gitlab-rake geo:verification:repository:reset + ``` + +- Source Installation + + ```sh + sudo -u git -H bundle exec rake geo:verification:repository:reset RAILS_ENV=production + ``` + +For wikis: + +- Omnibus Installation + + ```sh + sudo gitlab-rake geo:verification:wiki:reset + ``` + +- Source Installation + + ```sh + sudo -u git -H bundle exec rake geo:verification:wiki:reset RAILS_ENV=production + ``` + +## Current limitations + +Until [issue #5064][ee-5064] is completed, background verification doesn't cover +CI job artifacts and traces, LFS objects, or user uploads in file storage. +Verify their integrity manually by following [these instructions][foreground-verification] +on both nodes, and comparing the output between them. + +Data in object storage is **not verified**, as the object store is responsible +for ensuring the integrity of the data.
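
Where data lives in file storage, one generic way to compare two nodes is to build a checksum manifest of a directory on each node and compare the results. The sketch below is an assumption-laden illustration, not one of the rake tasks linked above: the function name `checksum_manifest` is hypothetical, and the directory to scan depends on your installation.

```sh
# Hypothetical helper, not part of GitLab: print a sorted SHA-256 manifest
# of every regular file under a directory. Paths are relative to that
# directory, so manifests from two nodes can be compared line by line.
checksum_manifest() {
  ( cd "$1" && find . -type f -exec sha256sum {} + | LC_ALL=C sort -k 2 )
}
```

For example, running `checksum_manifest /path/to/uploads > manifest.txt` on each node (the path is a placeholder) and then comparing the two files with `diff` would list any files whose content differs or that exist on only one node.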
+ +[reset-verification]: background_verification.md#reset-verification-for-projects-where-verification-has-failed +[foreground-verification]: ../../raketasks/check.md +[ee-5064]: https://gitlab.com/gitlab-org/gitlab-ee/issues/5064 diff --git a/doc/administration/geo/disaster_recovery/bring_primary_back.md b/doc/administration/geo/disaster_recovery/bring_primary_back.md new file mode 100644 index 00000000000..209ca2f50d0 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/bring_primary_back.md @@ -0,0 +1,61 @@ +# Bring a demoted primary node back online + +After a failover, it is possible to fail back to the demoted **primary** node to +restore your original configuration. This process consists of two steps: + +1. Making the old **primary** node a **secondary** node. +1. Promoting a **secondary** node to a **primary** node. + +CAUTION: **Caution:** +If you have any doubts about the consistency of the data on this node, we recommend setting it up from scratch. + +## Configure the former **primary** node to be a **secondary** node + +Since the former **primary** node will be out of sync with the current **primary** node, the first step is to bring the former **primary** node up to date. Note, deletion of data stored on disk like +repositories and uploads will not be replayed when bringing the former **primary** node back +into sync, which may result in increased disk usage. +Alternatively, you can [set up a new **secondary** GitLab instance][setup-geo] to avoid this. + +To bring the former **primary** node up to date: + +1. SSH into the former **primary** node that has fallen behind. +1. Make sure all the services are up: + + ```sh + sudo gitlab-ctl start + ``` + + > **Note 1:** If you [disabled the **primary** node permanently][disaster-recovery-disable-primary], + > you need to undo those steps now. For Debian/Ubuntu you just need to run + > `sudo systemctl enable gitlab-runsvdir`. 
For CentOS 6, you need to install + > the GitLab instance from scratch and set it up as a **secondary** node by + > following [Setup instructions][setup-geo]. In this case, you don't need to follow the next step. + > + > **Note 2:** If you [changed the DNS records](index.md#step-4-optional-updating-the-primary-domain-dns-record) + > for this node during disaster recovery procedure you may need to [block + > all the writes to this node](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/doc/gitlab-geo/planned-failover.md#block-primary-traffic) + > during this procedure. + +1. [Setup database replication][database-replication]. Note that in this + case, **primary** node refers to the current **primary** node, and **secondary** node refers to the + former **primary** node. + +If you have lost your original **primary** node, follow the +[setup instructions][setup-geo] to set up a new **secondary** node. + +## Promote the **secondary** node to **primary** node + +When the initial replication is complete and the **primary** node and **secondary** node are +closely in sync, you can do a [planned failover]. + +## Restore the **secondary** node + +If your objective is to have two nodes again, you need to bring your **secondary** +node back online as well by repeating the first step +([configure the former **primary** node to be a **secondary** node](#configure-the-former-primary-node-to-be-a-secondary-node)) +for the **secondary** node. 
+ +[setup-geo]: ../replication/index.md#setup-instructions +[database-replication]: ../replication/database.md +[disaster-recovery-disable-primary]: index.md#step-2-permanently-disable-the-primary-node +[planned failover]: planned_failover.md diff --git a/doc/administration/geo/disaster_recovery/img/replication-status.png b/doc/administration/geo/disaster_recovery/img/replication-status.png Binary files differnew file mode 100644 index 00000000000..d7085927c75 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/img/replication-status.png diff --git a/doc/administration/geo/disaster_recovery/img/reverification-interval.png b/doc/administration/geo/disaster_recovery/img/reverification-interval.png Binary files differnew file mode 100644 index 00000000000..ad4597a4f49 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/img/reverification-interval.png diff --git a/doc/administration/geo/disaster_recovery/img/verification-status-primary.png b/doc/administration/geo/disaster_recovery/img/verification-status-primary.png Binary files differnew file mode 100644 index 00000000000..2503408ec5d --- /dev/null +++ b/doc/administration/geo/disaster_recovery/img/verification-status-primary.png diff --git a/doc/administration/geo/disaster_recovery/img/verification-status-secondary.png b/doc/administration/geo/disaster_recovery/img/verification-status-secondary.png Binary files differnew file mode 100644 index 00000000000..462274d8b14 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/img/verification-status-secondary.png diff --git a/doc/administration/geo/disaster_recovery/index.md b/doc/administration/geo/disaster_recovery/index.md new file mode 100644 index 00000000000..1de25671090 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/index.md @@ -0,0 +1,323 @@ +# Disaster Recovery **[PREMIUM ONLY]** + +Geo replicates your database, your Git repositories, and few other assets. 
+We will support and replicate more data in the future, that will enable you to +failover with minimal effort, in a disaster situation. + +See [Geo current limitations][geo-limitations] for more information. + +CAUTION: **Warning:** +Disaster recovery for multi-secondary configurations is in **Alpha**. +For the latest updates, check the multi-secondary [Disaster Recovery epic][gitlab-org&65]. + +## Promoting a **secondary** Geo node in single-secondary configurations + +We don't currently provide an automated way to promote a Geo replica and do a +failover, but you can do it manually if you have `root` access to the machine. + +This process promotes a **secondary** Geo node to a **primary** node. To regain +geographic redundancy as quickly as possible, you should add a new **secondary** node +immediately after following these instructions. + +### Step 1. Allow replication to finish if possible + +If the **secondary** node is still replicating data from the **primary** node, follow +[the planned failover docs][planned-failover] as closely as possible in +order to avoid unnecessary data loss. + +### Step 2. Permanently disable the **primary** node + +CAUTION: **Warning:** +If the **primary** node goes offline, there may be data saved on the **primary** node +that has not been replicated to the **secondary** node. This data should be treated +as lost if you proceed. + +If an outage on the **primary** node happens, you should do everything possible to +avoid a split-brain situation where writes can occur in two different GitLab +instances, complicating recovery efforts. So to prepare for the failover, we +must disable the **primary** node. + +1. 
SSH into the **primary** node to stop and disable GitLab, if possible: + + ```sh + sudo gitlab-ctl stop + ``` + + Prevent GitLab from starting up again if the server unexpectedly reboots: + + ```sh + sudo systemctl disable gitlab-runsvdir + ``` + + > **CentOS only**: In CentOS 6 or older, there is no easy way to prevent GitLab from being + > started if the machine reboots (see [gitlab-org/omnibus-gitlab#3058]). + > It may be safest to uninstall the GitLab package completely: + + ```sh + yum remove gitlab-ee + ``` + + > **Ubuntu 14.04 LTS**: If you are using an older version of Ubuntu + > or any other distro based on the Upstart init system, you can prevent GitLab + > from starting if the machine reboots by doing the following: + + ```sh + initctl stop gitlab-runsvdir + echo 'manual' > /etc/init/gitlab-runsvdir.override + initctl reload-configuration + ``` + +1. If you do not have SSH access to the **primary** node, take the machine offline and + prevent it from rebooting by any means at your disposal. + Since there are many ways you may prefer to accomplish this, we will avoid a + single recommendation. You may need to: + - Reconfigure the load balancers. + - Change DNS records (e.g., point the primary DNS record to the **secondary** + node in order to stop usage of the **primary** node). + - Stop the virtual servers. + - Block traffic through a firewall. + - Revoke object storage permissions from the **primary** node. + - Physically disconnect a machine. + +1. If you plan to + [update the primary domain DNS record](#step-4-optional-updating-the-primary-domain-dns-record), + you may wish to lower the TTL now to speed up propagation. + +### Step 3. Promoting a **secondary** node + +NOTE: **Note:** +A new **secondary** should not be added at this time. If you want to add a new +**secondary**, do this after you have completed the entire process of promoting +the **secondary** to the **primary**.
+ +#### Promoting a **secondary** node running on a single machine + +1. SSH in to your **secondary** node and login as root: + + ```sh + sudo -i + ``` + +1. Edit `/etc/gitlab/gitlab.rb` to reflect its new status as **primary** by + removing any lines that enabled the `geo_secondary_role`: + + ```ruby + ## In pre-11.5 documentation, the role was enabled as follows. Remove this line. + geo_secondary_role['enable'] = true + + ## In 11.5+ documentation, the role was enabled as follows. Remove this line. + roles ['geo_secondary_role'] + ``` + +1. Promote the **secondary** node to the **primary** node. Execute: + + ```sh + gitlab-ctl promote-to-primary-node + ``` + +1. Verify you can connect to the newly promoted **primary** node using the URL used + previously for the **secondary** node. +1. If successful, the **secondary** node has now been promoted to the **primary** node. + +#### Promoting a **secondary** node with HA + +The `gitlab-ctl promote-to-primary-node` command cannot be used yet in +conjunction with High Availability or with multiple machines, as it can only +perform changes on a **secondary** with only a single machine. Instead, you must +do this manually. + +1. SSH in to the database node in the **secondary** and trigger PostgreSQL to + promote to read-write: + + ```bash + sudo gitlab-pg-ctl promote + ``` + +1. Edit `/etc/gitlab/gitlab.rb` on every machine in the **secondary** to + reflect its new status as **primary** by removing any lines that enabled the + `geo_secondary_role`: + + ```ruby + ## In pre-11.5 documentation, the role was enabled as follows. Remove this line. + geo_secondary_role['enable'] = true + + ## In 11.5+ documentation, the role was enabled as follows. Remove this line. + roles ['geo_secondary_role'] + ``` + + After making these changes [Reconfigure GitLab](../../restart_gitlab.md#omnibus-gitlab-reconfigure) each + machine so the changes take effect. + +1. Promote the **secondary** to **primary**. 
SSH into a single application + server and execute: + + ```bash + sudo gitlab-rake geo:set_secondary_as_primary + ``` + +1. Verify you can connect to the newly promoted **primary** using the URL used + previously for the **secondary**. +1. Success! The **secondary** has now been promoted to **primary**. + +### Step 4. (Optional) Updating the primary domain DNS record + +Updating the DNS records for the primary domain to point to the **secondary** node +will prevent the need to update all references to the primary domain to the +secondary domain, like changing Git remotes and API URLs. + +1. SSH into the **secondary** node and login as root: + + ```sh + sudo -i + ``` + +1. Update the primary domain's DNS record. After updating the primary domain's + DNS records to point to the **secondary** node, edit `/etc/gitlab/gitlab.rb` on the + **secondary** node to reflect the new URL: + + ```ruby + # Change the existing external_url configuration + external_url 'https://gitlab.example.com' + ``` + + NOTE: **Note** + Changing `external_url` won't prevent access via the old secondary URL, as + long as the secondary DNS records are still intact. + +1. Reconfigure the **secondary** node for the change to take effect: + + ```sh + gitlab-ctl reconfigure + ``` + +1. Execute the command below to update the newly promoted **primary** node URL: + + ```sh + gitlab-rake geo:update_primary_node_url + ``` + + This command will use the changed `external_url` configuration defined + in `/etc/gitlab/gitlab.rb`. + +1. Verify you can connect to the newly promoted **primary** using its URL. + If you updated the DNS records for the primary domain, these changes may + not have yet propagated depending on the previous DNS records TTL. + +### Step 5. (Optional) Add **secondary** Geo node to a promoted **primary** node + +Promoting a **secondary** node to **primary** node using the process above does not enable +Geo on the new **primary** node. 
+ +To bring a new **secondary** node online, follow the [Geo setup instructions][setup-geo]. + +### Step 6. (Optional) Removing the secondary's tracking database + +Every **secondary** has a special tracking database that is used to save the status of the synchronization of all the items from the **primary**. +Because the **secondary** is already promoted, that data in the tracking database is no longer required. + +The data can be removed with the following command: + +```sh +sudo rm -rf /var/opt/gitlab/geo-postgresql +``` + +## Promoting secondary Geo replica in multi-secondary configurations + +If you have more than one **secondary** node and you need to promote one of them, we suggest you follow +[Promoting a **secondary** Geo node in single-secondary configurations](#promoting-a-secondary-geo-node-in-single-secondary-configurations) +and after that you also need two extra steps. + +### Step 1. Prepare the new **primary** node to serve one or more **secondary** nodes + +1. SSH into the new **primary** node and login as root: + + ```sh + sudo -i + ``` + +1. Edit `/etc/gitlab/gitlab.rb` + + ```ruby + ## Enable a Geo Primary role (if you haven't yet) + roles ['geo_primary_role'] + + ## + # Primary and Secondary addresses + # - replace '198.51.100.1' with the public or VPC address of your Geo primary node + # - replace '198.51.100.2' with the public or VPC address of your Geo secondary node + ## + postgresql['md5_auth_cidr_addresses'] = ['198.51.100.1/32', '198.51.100.2/32'] + + # Every secondary server needs to have its own slot so specify the number of secondary nodes you're going to have + postgresql['max_replication_slots'] = 1 + + ## + ## Disable automatic database migrations temporarily + ## (until PostgreSQL is restarted and listening on the private address). + ## + gitlab_rails['auto_migrate'] = false + + ``` + + (For more details about these settings you can read [Configure the primary server][configure-the-primary-server]) + +1. 
Save the file and reconfigure GitLab for the database listen changes and + the replication slot changes to be applied. + + ```sh + gitlab-ctl reconfigure + ``` + + Restart PostgreSQL for its changes to take effect: + + ```sh + gitlab-ctl restart postgresql + ``` + +1. Re-enable migrations now that PostgreSQL is restarted and listening on the + private address. + + Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`: + + ```ruby + gitlab_rails['auto_migrate'] = true + ``` + + Save the file and reconfigure GitLab: + + ```sh + gitlab-ctl reconfigure + ``` + +### Step 2. Initiate the replication process + +Now we need to make each **secondary** node listen to changes on the new **primary** node. To do that you need +to [initiate the replication process][initiate-the-replication-process] again but this time +for another **primary** node. All the old replication settings will be overwritten. + +## Troubleshooting + +### I followed the disaster recovery instructions and now two-factor auth is broken! + +The setup instructions for Geo prior to 10.5 failed to replicate the +`otp_key_base` secret, which is used to encrypt the two-factor authentication +secrets stored in the database. If it differs between **primary** and **secondary** +nodes, users with two-factor authentication enabled won't be able to log in +after a failover. + +If you still have access to the old **primary** node, you can follow the +instructions in the +[Upgrading to GitLab 10.5][updating-geo] +section to resolve the error. Otherwise, the secret is lost and you'll need to +[reset two-factor authentication for all users][sec-tfa]. 
+ +[gitlab-org&65]: https://gitlab.com/groups/gitlab-org/-/epics/65 +[geo-limitations]: ../replication/index.md#current-limitations +[planned-failover]: planned_failover.md +[setup-geo]: ../replication/index.md#setup-instructions +[updating-geo]: ../replication/updating_the_geo_nodes.md#upgrading-to-gitlab-105 +[sec-tfa]: ../../../security/two_factor_authentication.md#disabling-2fa-for-everyone +[gitlab-org/omnibus-gitlab#3058]: https://gitlab.com/gitlab-org/omnibus-gitlab/issues/3058 +[gitlab-org/gitlab-ee#4284]: https://gitlab.com/gitlab-org/gitlab-ee/issues/4284 +[initiate-the-replication-process]: ../replication/database.html#step-3-initiate-the-replication-process +[configure-the-primary-server]: ../replication/database.html#step-1-configure-the-primary-server diff --git a/doc/administration/geo/disaster_recovery/planned_failover.md b/doc/administration/geo/disaster_recovery/planned_failover.md new file mode 100644 index 00000000000..9875a29d4c0 --- /dev/null +++ b/doc/administration/geo/disaster_recovery/planned_failover.md @@ -0,0 +1,229 @@ +# Disaster recovery for planned failover + +The primary use-case of Disaster Recovery is to ensure business continuity in +the event of unplanned outage, but it can be used in conjunction with a planned +failover to migrate your GitLab instance between regions without extended +downtime. + +As replication between Geo nodes is asynchronous, a planned failover requires +a maintenance window in which updates to the **primary** node are blocked. The +length of this window is determined by your replication capacity - once the +**secondary** node is completely synchronized with the **primary** node, the failover can occur without +data loss. + +This document assumes you already have a fully configured, working Geo setup. +Please read it and the [Disaster Recovery][disaster-recovery] failover +documentation in full before proceeding. 
Planned failover is a major operation, +and if performed incorrectly, there is a high risk of data loss. Consider +rehearsing the procedure until you are comfortable with the necessary steps and +have a high degree of confidence in being able to perform them accurately. + +## Not all data is automatically replicated + +If you are using any GitLab features that Geo [doesn't support][limitations], +you must make separate provisions to ensure that the **secondary** node has an +up-to-date copy of any data associated with that feature. This may extend the +required scheduled maintenance period significantly. + +A common strategy for keeping this period as short as possible for data stored +in files is to use `rsync` to transfer the data. An initial `rsync` can be +performed ahead of the maintenance window; subsequent `rsync`s (including a +final transfer inside the maintenance window) will then transfer only the +*changes* between the **primary** node and the **secondary** nodes. + +Repository-centric strategies for using `rsync` effectively can be found in the +[moving repositories][moving-repositories] documentation; these strategies can +be adapted for use with any other file-based data, such as GitLab Pages (to +be found in `/var/opt/gitlab/gitlab-rails/shared/pages` if using Omnibus). + +## Pre-flight checks + +Follow these steps before scheduling a planned failover to ensure the process +will go smoothly. + +### Object storage + +Some classes of non-repository data can use object storage in preference to +file storage. Geo [does not replicate data in object storage](../replication/object_storage.md), +leaving that task up to the object store itself. For a planned failover, this +means you can decouple the replication of this data from the failover of the +GitLab service. 
+ +If you're already using object storage, simply verify that your **secondary** +node has access to the same data as the **primary** node - they must either share the +same object storage configuration, or the **secondary** node should be configured to +access a [geographically-replicated][os-repl] copy provided by the object store +itself. + +If you have a large GitLab installation or cannot tolerate downtime, consider +[migrating to Object Storage][os-conf] **before** scheduling a planned failover. +Doing so reduces both the length of the maintenance window, and the risk of data +loss as a result of a poorly executed planned failover. + +### Review the configuration of each **secondary** node + +Database settings are automatically replicated to the **secondary** node, but the +`/etc/gitlab/gitlab.rb` file must be set up manually, and differs between +nodes. If features such as Mattermost, OAuth or LDAP integration are enabled +on the **primary** node but not the **secondary** node, they will be lost during failover. + +Review the `/etc/gitlab/gitlab.rb` file for both nodes and ensure the **secondary** node +supports everything the **primary** node does **before** scheduling a planned failover. + +### Run system checks + +Run the following on both **primary** and **secondary** nodes: + +```sh +gitlab-rake gitlab:check +gitlab-rake gitlab:geo:check +``` + +If any failures are reported on either node, they should be resolved **before** +scheduling a planned failover. + +### Check that secrets match between nodes + +The SSH host keys and `/etc/gitlab/gitlab-secrets.json` files should be +identical on all nodes. Check this by running the following on all nodes and +comparing the output: + +```sh +sudo sha256sum /etc/ssh/ssh_host* /etc/gitlab/gitlab-secrets.json +``` + +If any files differ, replace the content on the **secondary** node with the +content from the **primary** node.
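
Comparing the checksum output by eye is error-prone when there are several host keys. A minimal sketch of automating the comparison follows, assuming you have saved the `sha256sum` output from each node to a file and copied both files to one machine. The function name `compare_secret_checksums` and the file names are hypothetical, and the sketch assumes both files list the entries in the same order.

```sh
# Hypothetical helper, not part of GitLab: compare two saved outputs of
# the sha256sum command, one captured on each node.
compare_secret_checksums() {
  primary_file=$1
  secondary_file=$2
  if diff "$primary_file" "$secondary_file" >/dev/null; then
    echo "MATCH"
  else
    echo "MISMATCH"
    # Show which entries differ between the two nodes.
    diff "$primary_file" "$secondary_file"
    return 1
  fi
}
```

For example, after redirecting the `sha256sum` output to `primary.sha256` on one node and `secondary.sha256` on the other, `compare_secret_checksums primary.sha256 secondary.sha256` prints `MATCH` when every checksum agrees, or `MISMATCH` followed by the differing lines.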
+ +### Ensure Geo replication is up-to-date + +The maintenance window won't end until Geo replication and verification are +completely finished. To keep the window as short as possible, you should +ensure these processes are as close to 100% as possible during active use. + +Navigate to the **Admin Area > Geo** dashboard on the **secondary** node to +review status. Replicated objects (shown in green) should be close to 100%, +and there should be no failures (shown in red). If a large proportion of +objects aren't yet replicated (shown in grey), consider giving the node more +time to complete. + +![Replication status](img/replication-status.png) + +If any objects are failing to replicate, this should be investigated before +scheduling the maintenance window. Following a planned failover, anything that +failed to replicate will be **lost**. + +You can use the [Geo status API](../../../api/geo_nodes.md#retrieve-project-sync-or-verification-failures-that-occurred-on-the-current-node) to review failed objects and +the reasons for failure. + +A common cause of replication failures is the data being missing on the +**primary** node - you can resolve these failures by restoring the data from backup, +or removing references to the missing data. + +### Verify the integrity of replicated data + +This [content was moved to another location][background-verification]. + +### Notify users of scheduled maintenance + +On the **primary** node, navigate to **Admin Area > Messages** and add a broadcast +message. You can check under **Admin Area > Geo** to estimate how long it +will take to finish syncing. An example message would be: + +> A scheduled maintenance will take place at XX:XX UTC. We expect it to take +> less than 1 hour. + +## Prevent updates to the **primary** node + +Until a [read-only mode][ce-19739] is implemented, updates must be manually +prevented from happening.
Note that your **secondary** node still needs read-only +access to the **primary** node during the maintenance window. + +1. At the scheduled time, using your cloud provider or your node's firewall, block + all HTTP, HTTPS and SSH traffic to/from the **primary** node, **except** for your IP and + the **secondary** node's IP. + + For instance, if your **secondary** node originates all its traffic from `5.6.7.8` and + your IP is `100.0.0.1`, you might run the following commands on the server(s) + making up your **primary** node: + + ```sh + sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 22 -j ACCEPT + sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 22 -j ACCEPT + sudo iptables -A INPUT -p tcp --destination-port 22 -j REJECT + + sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 80 -j ACCEPT + sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 80 -j ACCEPT + sudo iptables -A INPUT -p tcp --destination-port 80 -j REJECT + + sudo iptables -A INPUT -p tcp -s 5.6.7.8 --destination-port 443 -j ACCEPT + sudo iptables -A INPUT -p tcp -s 100.0.0.1 --destination-port 443 -j ACCEPT + sudo iptables -A INPUT -p tcp --destination-port 443 -j REJECT + ``` + + From this point, users will be unable to view their data or make changes on the + **primary** node. They will also be unable to log in to the **secondary** node. + However, existing sessions will work for the remainder of the maintenance period, and + public data will be accessible throughout. + +1. Verify the **primary** node is blocked to HTTP traffic by visiting it in a browser via + another IP. The server should refuse connection. + +1. Verify the **primary** node is blocked to Git over SSH traffic by attempting to pull an + existing Git repository with an SSH remote URL. The server should refuse + connection. + +1.
Disable non-Geo periodic background jobs on the **primary** node by navigating + to **Admin Area > Monitoring > Background Jobs > Cron**, pressing `Disable All`, + and then pressing `Enable` for the `geo_sidekiq_cron_config_worker` cron job. + This job will re-enable several other cron jobs that are essential for planned + failover to complete successfully. + +## Finish replicating and verifying all data + +1. If you are manually replicating any data not managed by Geo, trigger the + final replication process now. +1. On the **primary** node, navigate to **Admin Area > Monitoring > Background Jobs > Queues** + and wait for all queues except those with `geo` in the name to drop to 0. + These queues contain work that has been submitted by your users; failing over + before it is completed will cause the work to be lost. +1. On the **primary** node, navigate to **Admin Area > Geo** and wait for the + following conditions to be true of the **secondary** node you are failing over to: + - All replication meters reach 100% replicated, 0% failures. + - All verification meters reach 100% verified, 0% failures. + - Database replication lag is 0ms. + - The Geo log cursor is up to date (0 events behind). + +1. On the **secondary** node, navigate to **Admin Area > Monitoring > Background Jobs > Queues** + and wait for all the `geo` queues to drop to 0 queued and 0 running jobs. +1. On the **secondary** node, use [these instructions][foreground-verification] + to verify the integrity of CI artifacts, LFS objects and uploads in file + storage. + +At this point, your **secondary** node will contain an up-to-date copy of everything the +**primary** node has, meaning nothing will be lost when you fail over. + +## Promote the **secondary** node + +Finally, follow the [Disaster Recovery docs][disaster-recovery] to promote the +**secondary** node to a **primary** node. This process will cause a brief outage on the **secondary** node, and users may need to log in again. 
+ +Once it is completed, the maintenance window is over! Your new **primary** node will now +begin to diverge from the old one. If problems do arise at this point, failing +back to the old **primary** node [is possible][bring-primary-back], but likely to result +in the loss of any data uploaded to the new primary in the meantime. + +Don't forget to remove the broadcast message after failover is complete. + +[bring-primary-back]: bring_primary_back.md +[ce-19739]: https://gitlab.com/gitlab-org/gitlab-ce/issues/19739 +[container-registry]: ../replication/container_registry.md +[disaster-recovery]: index.md +[ee-4930]: https://gitlab.com/gitlab-org/gitlab-ee/issues/4930 +[ee-5064]: https://gitlab.com/gitlab-org/gitlab-ee/issues/5064 +[foreground-verification]: ../../raketasks/check.md +[background-verification]: background_verification.md +[limitations]: ../replication/index.md#current-limitations +[moving-repositories]: ../../operations/moving_repositories.md +[os-conf]: ../replication/object_storage.md#configuration +[os-repl]: ../replication/object_storage.md#replication diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md new file mode 100644 index 00000000000..8c2e91dd0a0 --- /dev/null +++ b/doc/administration/geo/replication/configuration.md @@ -0,0 +1,315 @@ +# Geo configuration (GitLab Omnibus) + +NOTE: **Note:** +This is the documentation for the Omnibus GitLab packages. For installations +from source, follow the [**Geo nodes configuration for installations +from source**][configuration-source] guide. + +## Configuring a new **secondary** node + +NOTE: **Note:** +This is the final step in setting up a **secondary** Geo node. Stages of the +setup process must be completed in the documented order. +Before attempting the steps in this stage, [complete all prior stages][setup-geo-omnibus]. 
+ +The basic steps of configuring a **secondary** node are to: + +- Replicate required configurations between the **primary** node and the **secondary** nodes. +- Configure a tracking database on each **secondary** node. +- Start GitLab on each **secondary** node. + +You are encouraged to first read through all the steps before executing them +in your testing/production environment. + +> **Notes:** +> - **Do not** setup any custom authentication for the **secondary** nodes. This will be + handled by the **primary** node. +> - Any change that requires access to the **Admin Area** needs to be done in the + **primary** node because the **secondary** node is a read-only replica. + +### Step 1. Manually replicate secret GitLab values + +GitLab stores a number of secret values in the `/etc/gitlab/gitlab-secrets.json` +file which *must* be the same on all nodes. Until there is +a means of automatically replicating these between nodes (see issue [gitlab-org/gitlab-ee#3789]), +they must be manually replicated to the **secondary** node. + +1. SSH into the **primary** node, and execute the command below: + + ```sh + sudo cat /etc/gitlab/gitlab-secrets.json + ``` + + This will display the secrets that need to be replicated, in JSON format. + +1. SSH into the **secondary** node and login as the `root` user: + + ```sh + sudo -i + ``` + +1. Make a backup of any existing secrets: + + ```sh + mv /etc/gitlab/gitlab-secrets.json /etc/gitlab/gitlab-secrets.json.`date +%F` + ``` + +1. Copy `/etc/gitlab/gitlab-secrets.json` from the **primary** node to the **secondary** node, or + copy-and-paste the file contents between nodes: + + ```sh + sudo editor /etc/gitlab/gitlab-secrets.json + + # paste the output of the `cat` command you ran on the primary + # save and exit + ``` + +1. Ensure the file permissions are correct: + + ```sh + chown root:root /etc/gitlab/gitlab-secrets.json + chmod 0600 /etc/gitlab/gitlab-secrets.json + ``` + +1. 
Reconfigure the **secondary** node for the change to take effect: + + ```sh + gitlab-ctl reconfigure + gitlab-ctl restart + ``` + +### Step 2. Manually replicate the **primary** node's SSH host keys + +GitLab integrates with the system-installed SSH daemon, designating a user +(typically named `git`) through which all access requests are handled. + +In a [Disaster Recovery] situation, GitLab system +administrators will promote a **secondary** node to the **primary** node. DNS records for the +**primary** domain should also be updated to point to the new **primary** node +(previously a **secondary** node). Doing so will avoid the need to update Git remotes and API URLs. + +If the host keys differ between nodes, all SSH requests to the newly promoted +**primary** node will fail due to an SSH host key mismatch. To prevent this, the +**primary** node's SSH host keys must be manually replicated to the **secondary** node. + +1. SSH into the **secondary** node and log in as the `root` user: + + ```sh + sudo -i + ``` + +1. Make a backup of any existing SSH host keys: + + ```sh + find /etc/ssh -iname 'ssh_host_*' -exec cp {} {}.backup.`date +%F` \; + ``` + +1. Copy OpenSSH host keys from the **primary** node: + + If you can access your **primary** node using the **root** user: + + ```sh + # Run this from the secondary node, change `primary-node-fqdn` for the IP or FQDN of the server + scp root@primary-node-fqdn:/etc/ssh/ssh_host_*_key* /etc/ssh + ``` + + If you only have access through a user with **sudo** privileges: + + ```sh + # Run this from your primary node: + sudo tar --transform 's/.*\///g' -zcvf ~/geo-host-key.tar.gz /etc/ssh/ssh_host_*_key* + + # Run this from your secondary node: + scp user-with-sudo@primary-node-fqdn:geo-host-key.tar.gz ~/ + tar zxvf ~/geo-host-key.tar.gz -C /etc/ssh + ``` + +1. On your **secondary** node, ensure the file permissions are correct: + + ```sh + chown root:root /etc/ssh/ssh_host_*_key* + chmod 0600 /etc/ssh/ssh_host_*_key* + ``` + +1. 
To verify that the key fingerprints match, execute the following command on both nodes: + + ```sh + for file in /etc/ssh/ssh_host_*_key; do ssh-keygen -lf "$file"; done + ``` + + You should get output similar to the following, and it should be identical on both nodes: + + ```sh + 1024 SHA256:FEZX2jQa2bcsd/fn/uxBzxhKdx4Imc4raXrHwsbtP0M root@serverhostname (DSA) + 256 SHA256:uw98R35Uf+fYEQ/UnJD9Br4NXUFPv7JAUln5uHlgSeY root@serverhostname (ECDSA) + 256 SHA256:sqOUWcraZQKd89y/QQv/iynPTOGQxcOTIXU/LsoPmnM root@serverhostname (ED25519) + 2048 SHA256:qwa+rgir2Oy86QI+PZi/QVR+MSmrdrpsuH7YyKknC+s root@serverhostname (RSA) + ``` + +1. Verify that you have the correct public keys for the existing private keys: + + ```sh + # This will print the fingerprint for private keys: + for file in /etc/ssh/ssh_host_*_key; do ssh-keygen -lf "$file"; done + + # This will print the fingerprint for public keys: + for file in /etc/ssh/ssh_host_*_key.pub; do ssh-keygen -lf "$file"; done + ``` + + NOTE: **Note**: + The private key and public key commands should print the same fingerprints. + +1. Reload sshd on your **secondary** node: + + ```sh + # Debian or Ubuntu installations + sudo service ssh reload + + # CentOS installations + sudo service sshd reload + ``` + +### Step 3. Add the **secondary** node + +1. Visit the **primary** node's **Admin Area > Geo** + (`/admin/geo/nodes`) in your browser. +1. Add the **secondary** node by providing its full URL. **Do NOT** check the + **This is a primary node** checkbox. +1. Optionally, choose which namespaces should be replicated by the + **secondary** node. Leave blank to replicate all. Read more in + [selective synchronization](#selective-synchronization). +1. Click the **Add node** button. +1. SSH into your GitLab **secondary** server and restart the services: + + ```sh + gitlab-ctl restart + ``` + + Check for common issues with your Geo setup by running: + + ```sh + gitlab-rake gitlab:geo:check + ``` + +1. 
SSH into your **primary** server and log in as root to verify that the + **secondary** node is reachable and to check for common issues with your Geo setup: + + ```sh + gitlab-rake gitlab:geo:check + ``` + +Once added to the admin panel and restarted, the **secondary** node will automatically start +replicating missing data from the **primary** node in a process known as **backfill**. +Meanwhile, the **primary** node will start to notify each **secondary** node of any changes, so +that the **secondary** node can act on those notifications immediately. + +Make sure the **secondary** node is running and accessible. +You can log in to the **secondary** node with the same credentials as used for the **primary** node. + +### Step 4. Enabling Hashed Storage + +Using Hashed Storage significantly improves Geo replication. Project and group +renames no longer require synchronization between nodes. + +1. Visit the **primary** node's **Admin Area > Settings > Repository** + (`/admin/application_settings/repository`) in your browser. +1. In the **Repository storage** section, check **Use hashed storage paths for newly created and renamed projects**. + +### Step 5. (Optional) Configuring the **secondary** node to trust the **primary** node + +You can safely skip this step if your **primary** node uses a CA-issued HTTPS certificate. + +If your **primary** node is using a self-signed certificate for *HTTPS* support, you will +need to add that certificate to the **secondary** node's trust store. Retrieve the +certificate from the **primary** node and follow +[these instructions][omnibus-ssl] +on the **secondary** node. + +### Step 6. Enable Git access over HTTP/HTTPS + +Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone +method to be enabled. Navigate to **Admin Area > Settings** +(`/admin/application_settings`) on the **primary** node, and set +`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`. + +### Step 7. 
Verify proper functioning of the **secondary** node + +Your **secondary** node is now configured! + +You can log in to the **secondary** node with the same credentials you used for the +**primary** node. Visit the **secondary** node's **Admin Area > Geo** +(`/admin/geo/nodes`) in your browser to check if it's correctly identified as a +**secondary** Geo node and if Geo is enabled. + +The initial replication, or 'backfill', will probably still be in progress. You +can monitor the synchronization process on each Geo node from the **primary** +node's Geo Nodes dashboard in your browser. + +![Geo dashboard](img/geo_node_dashboard.png) + +If your installation isn't working properly, check the +[troubleshooting document]. + +The two most obvious issues that can become apparent in the dashboard are: + +1. Database replication not working well. +1. Instance-to-instance notification not working. In that case, the cause can be + one of the following: + - You are using a custom certificate or custom CA (see the + [troubleshooting document]). + - The instance is firewalled (check your firewall rules). + +Please note that disabling a **secondary** node will stop the synchronization process. + +Please note that if `git_data_dirs` is customized on the **primary** node for multiple +repository shards, you must duplicate the same configuration on each **secondary** node. + +Point your users to the ["Using a Geo Server" guide][using-geo]. + +Currently, this is what is synced: + +- Git repositories. +- Wikis. +- LFS objects. +- Issues, merge requests, snippets, and comment attachments. +- User, group, and project avatars. + +## Selective synchronization + +Geo supports selective synchronization, which allows admins to choose +which projects should be synchronized by **secondary** nodes. + +It is important to note that selective synchronization does not: + +1. Restrict permissions from **secondary** nodes. +1. Hide project metadata from **secondary** nodes. 
+ - Since Geo currently relies on PostgreSQL replication, all project metadata + gets replicated to **secondary** nodes, but repositories that have not been + selected will be empty. +1. Reduce the number of events generated for the Geo event log. + - The **primary** node generates events as long as any **secondary** nodes are present. + Selective synchronization restrictions are implemented on the **secondary** nodes, + not the **primary** node. + +A subset of projects can be chosen, either by group or by storage shard. The +former is ideal for replicating data belonging to a subset of users, while the +latter is more suited to progressively rolling out Geo to a large GitLab +instance. + +## Upgrading Geo + +See the [updating the Geo nodes document](updating_the_geo_nodes.md). + +## Troubleshooting + +See the [troubleshooting document](troubleshooting.md). + +[configuration-source]: configuration_source.md +[setup-geo-omnibus]: index.md#using-omnibus-gitlab +[Hashed Storage]: ../../repository_storage_types.md +[Disaster Recovery]: ../disaster_recovery/index.md +[gitlab-org/gitlab-ee#3789]: https://gitlab.com/gitlab-org/gitlab-ee/issues/3789 +[gitlab-com/infrastructure#2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821 +[omnibus-ssl]: https://docs.gitlab.com/omnibus/settings/ssl.html +[troubleshooting document]: troubleshooting.md +[using-geo]: using_a_geo_server.md diff --git a/doc/administration/geo/replication/configuration_source.md b/doc/administration/geo/replication/configuration_source.md new file mode 100644 index 00000000000..72d5831b1cb --- /dev/null +++ b/doc/administration/geo/replication/configuration_source.md @@ -0,0 +1,173 @@ +# Geo configuration (source) + +NOTE: **Note:** +This documentation applies to GitLab source installations. In GitLab 11.5, this documentation was deprecated and will be removed in a future release. 
+Please consider [migrating to GitLab Omnibus install](https://docs.gitlab.com/omnibus/update/convert_to_omnibus.html). For installations +using the Omnibus GitLab packages, follow the +[**Omnibus Geo nodes configuration**][configuration] guide. + +## Configuring a new **secondary** node + +NOTE: **Note:** +This is the final step in setting up a **secondary** node. Stages of the setup +process must be completed in the documented order. Before attempting the steps +in this stage, [complete all prior stages](index.md#using-gitlab-installed-from-source-deprecated). + +The basic steps of configuring a **secondary** node are to: + +- Replicate required configurations between the **primary** and **secondary** nodes. +- Configure a tracking database on each **secondary** node. +- Start GitLab on the **secondary** node. + +You are encouraged to first read through all the steps before executing them +in your testing/production environment. + +NOTE: **Note:** +**Do not** set up any custom authentication on **secondary** nodes, this will be handled by the **primary** node. + +NOTE: **Note:** +**Do not** add anything in the **secondary** node's admin area (**Admin Area > Geo**). This is handled solely by the **primary** node. + +### Step 1. Manually replicate secret GitLab values + +GitLab stores a number of secret values in the `/home/git/gitlab/config/secrets.yml` +file which *must* match between the **primary** and **secondary** nodes. Until there is +a means of automatically replicating these between nodes (see [gitlab-org/gitlab-ee#3789]), they must +be manually replicated to **secondary** nodes. + +1. SSH into the **primary** node, and execute the command below: + + ```sh + sudo cat /home/git/gitlab/config/secrets.yml + ``` + + This will display the secrets that need to be replicated, in YAML format. + +1. SSH into the **secondary** node and login as the `git` user: + + ```sh + sudo -i -u git + ``` + +1. 
Make a backup of any existing secrets: + + ```sh + mv /home/git/gitlab/config/secrets.yml /home/git/gitlab/config/secrets.yml.`date +%F` + ``` + +1. Copy `/home/git/gitlab/config/secrets.yml` from the **primary** node to the **secondary** node, or + copy-and-paste the file contents between nodes: + + ```sh + sudo editor /home/git/gitlab/config/secrets.yml + + # paste the output of the `cat` command you ran on the primary + # save and exit + ``` + +1. Ensure the file permissions are correct: + + ```sh + chown git:git /home/git/gitlab/config/secrets.yml + chmod 0600 /home/git/gitlab/config/secrets.yml + ``` + +1. Restart GitLab + + ```sh + service gitlab restart + ``` + +Once restarted, the **secondary** node will automatically start replicating missing data +from the **primary** node in a process known as backfill. Meanwhile, the **primary** node +will start to notify the **secondary** node of any changes, so that the **secondary** node can +act on those notifications immediately. + +Make sure the **secondary** node is running and accessible. You can login to +the **secondary** node with the same credentials as used for the **primary** node. + +### Step 2. Manually replicate the **primary** node's SSH host keys + +Read [Manually replicate the **primary** node's SSH host keys](configuration.md#step-2-manually-replicate-the-primary-nodes-ssh-host-keys) + +### Step 3. Add the **secondary** GitLab node + +1. Navigate to the **primary** node's **Admin Area > Geo** + (`/admin/geo/nodes`) in your browser. +1. Add the **secondary** node by providing its full URL. **Do NOT** check the + **This is a primary node** checkbox. +1. Optionally, choose which namespaces should be replicated by the + **secondary** node. Leave blank to replicate all. Read more in + [selective synchronization](#selective-synchronization). +1. Click the **Add node** button. +1. 
SSH into your GitLab **secondary** server and restart the services: + + ```sh + service gitlab restart + ``` + + Check if there are any common issue with your Geo setup by running: + + ```sh + bundle exec rake gitlab:geo:check + ``` + +1. SSH into your GitLab **primary** server and login as root to verify the + **secondary** node is reachable or there are any common issue with your Geo setup: + + ```sh + bundle exec rake gitlab:geo:check + ``` + +Once reconfigured, the **secondary** node will automatically start +replicating missing data from the **primary** node in a process known as backfill. +Meanwhile, the **primary** node will start to notify the **secondary** node of any changes, so +that the **secondary** node can act on those notifications immediately. + +Make sure the **secondary** node is running and accessible. +You can log in to the **secondary** node with the same credentials as used for the **primary** node. + +### Step 4. Enabling Hashed Storage + +Read [Enabling Hashed Storage](configuration.md#step-4-enabling-hashed-storage). + +### Step 5. (Optional) Configuring the secondary to trust the primary + +You can safely skip this step if your **primary** node uses a CA-issued HTTPS certificate. + +If your **primary** node is using a self-signed certificate for *HTTPS* support, you will +need to add that certificate to the **secondary** node's trust store. Retrieve the +certificate from the **primary** node and follow your distribution's instructions for +adding it to the **secondary** node's trust store. In Debian/Ubuntu, for example, with a +certificate file of `primary.geo.example.com.crt`, you would follow these steps: + +```sh +sudo -i +cp primary.geo.example.com.crt /usr/local/share/ca-certificates +update-ca-certificates +``` + +### Step 6. Enable Git access over HTTP/HTTPS + +Geo synchronizes repositories over HTTP/HTTPS, and therefore requires this clone +method to be enabled. 
Navigate to **Admin Area > Settings** +(`/admin/application_settings`) on the **primary** node, and set +`Enabled Git access protocols` to `Both SSH and HTTP(S)` or `Only HTTP(S)`. + +### Step 7. Verify proper functioning of the secondary node + +Read [Verify proper functioning of the secondary node][configuration-verify-node]. + +## Selective synchronization + +Read [Selective synchronization][configuration-selective-replication]. + +## Troubleshooting + +Read the [troubleshooting document][troubleshooting]. + +[gitlab-org/gitlab-ee#3789]: https://gitlab.com/gitlab-org/gitlab-ee/issues/3789 +[configuration]: configuration.md +[configuration-selective-replication]: configuration.md#selective-synchronization +[configuration-verify-node]: configuration.md#step-7-verify-proper-functioning-of-the-secondary-node +[troubleshooting]: troubleshooting.md diff --git a/doc/administration/geo/replication/database.md b/doc/administration/geo/replication/database.md new file mode 100644 index 00000000000..10e5409124c --- /dev/null +++ b/doc/administration/geo/replication/database.md @@ -0,0 +1,496 @@ +# Geo database replication (GitLab Omnibus) + +NOTE: **Note:** +This is the documentation for the Omnibus GitLab packages. For installations +from source, follow the +[Geo database replication (source)](database_source.md) guide. + +NOTE: **Note:** +If your GitLab installation uses external (not managed by Omnibus) PostgreSQL +instances, the Omnibus roles will not be able to perform all necessary +configuration steps. In this case, refer to +[additional instructions](external_database.md). + +NOTE: **Note:** +The stages of the setup process must be completed in the documented order. +Before attempting the steps in this stage, [complete all prior stages][toc]. + +This document describes the minimal steps you have to take in order to +replicate your **primary** GitLab database to a **secondary** node's database. 
You may +have to change some values according to your database setup, how big it is, etc. + +You are encouraged to first read through all the steps before executing them +in your testing/production environment. + +## PostgreSQL replication + +The GitLab **primary** node where the write operations happen will connect to +the **primary** database server, and **secondary** nodes will +connect to their own database servers (which are also read-only). + +NOTE: **Note:** +In database documentation, you may see "**primary**" being referenced as "master" +and "**secondary**" as either "slave" or "standby" server (read-only). + +We recommend using [PostgreSQL replication slots][replication-slots-article] +to ensure that the **primary** node retains all the data necessary for the **secondary** nodes to +recover. See below for more details. + +The following guide assumes that: + +- You are using Omnibus and therefore you are using PostgreSQL 9.6 or later + which includes the [`pg_basebackup` tool][pgback] and improved + [Foreign Data Wrapper][FDW] support. +- You have a **primary** node already set up (the GitLab server you are + replicating from), running Omnibus' PostgreSQL (or equivalent version), and + you have a new **secondary** server set up with the same versions of the OS, + PostgreSQL, and GitLab on all nodes. +- The IP of the **primary** server for our examples is `198.51.100.1`, whereas the + **secondary** node's IP is `198.51.100.2`. Note that the **primary** and **secondary** servers + **must** be able to communicate over these addresses. More on this in the + guide below. + +CAUTION: **Warning:** +Geo works with streaming replication. Logical replication is not supported at this time. +There is an [issue where support is being discussed](https://gitlab.com/gitlab-org/gitlab-ee/issues/7420). + +### Step 1. Configure the **primary** server + +1. SSH into your GitLab **primary** server and login as root: + + ```sh + sudo -i + ``` + +1. 
Execute the command below to define the node as the **primary** node: + + ```sh + gitlab-ctl set-geo-primary-node + ``` + + This command will use your defined `external_url` in `/etc/gitlab/gitlab.rb`. + +1. GitLab 10.4 and up only: Do the following to make sure the `gitlab` database user has a password defined: + + Generate an MD5 hash of the desired password: + + ```sh + gitlab-ctl pg-password-md5 gitlab + # Enter password: mypassword + # Confirm password: mypassword + # fca0b89a972d69f00eb3ec98a5838484 + ``` + + Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab` + postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484' + + # Every node that runs Unicorn or Sidekiq needs to have the database + # password specified as below. If you have a high-availability setup, this + # must be present in all application nodes. + gitlab_rails['db_password'] = 'mypassword' + ``` + +1. Omnibus GitLab already has a [replication user] + called `gitlab_replicator`. You must set the password for this user manually. + You will be prompted to enter a password: + + ```sh + gitlab-ctl set-replication-password + ``` + + This command will also read the `postgresql['sql_replication_user']` Omnibus + setting in case you have changed the `gitlab_replicator` username to something + else. + + If you are using an external database not managed by Omnibus GitLab, you need + to create the replicator user and set its password manually. + For information on how to create a replication user, refer to the + [appropriate step](database_source.md#step-1-configure-the-primary-server) + in [Geo database replication (source)](database_source.md). + +1. Configure PostgreSQL to listen on network interfaces: + + For security reasons, PostgreSQL does not listen on any network interfaces + by default. However, Geo requires the **secondary** node to be able to + connect to the **primary** node's database. 
For this reason, we need the address of + each node. Note: For external PostgreSQL instances, see [additional instructions](external_database.md). + + If you are using a cloud provider, you can lookup the addresses for each + Geo node through your cloud provider's management console. + + To lookup the address of a Geo node, SSH in to the Geo node and execute: + + ```sh + ## + ## Private address + ## + ip route get 255.255.255.255 | awk '{print "Private address:", $NF; exit}' + + ## + ## Public address + ## + echo "External address: $(curl --silent ipinfo.io/ip)" + ``` + + In most cases, the following addresses will be used to configure GitLab + Geo: + + | Configuration | Address | + |:----------------------------------------|:------------------------------------------------------| + | `postgresql['listen_address']` | **Primary** node's public or VPC private address. | + | `postgresql['md5_auth_cidr_addresses']` | **Secondary** node's public or VPC private addresses. | + + If you are using Google Cloud Platform, SoftLayer, or any other vendor that + provides a virtual private cloud (VPC) you can use the **secondary** node's private + address (corresponds to "internal address" for Google Cloud Platform) for + `postgresql['md5_auth_cidr_addresses']` and `postgresql['listen_address']`. + + The `listen_address` option opens PostgreSQL up to network connections + with the interface corresponding to the given address. See [the PostgreSQL + documentation][pg-docs-runtime-conn] for more details. + + Depending on your network configuration, the suggested addresses may not + be correct. If your **primary** node and **secondary** nodes connect over a local + area network, or a virtual network connecting availability zones like + [Amazon's VPC](https://aws.amazon.com/vpc/) or [Google's VPC](https://cloud.google.com/vpc/) + you should use the **secondary** node's private address for `postgresql['md5_auth_cidr_addresses']`. 
+ + Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP + addresses with addresses appropriate to your network configuration: + + ```ruby + ## + ## Geo Primary role + ## - configure dependent flags automatically to enable Geo + ## + roles ['geo_primary_role'] + + ## + ## Primary address + ## - replace '198.51.100.1' with the public or VPC address of your Geo primary node + ## + postgresql['listen_address'] = '198.51.100.1' + + ## + # Primary and Secondary addresses + # - replace '198.51.100.1' with the public or VPC address of your Geo primary node + # - replace '198.51.100.2' with the public or VPC address of your Geo secondary node + ## + postgresql['md5_auth_cidr_addresses'] = ['198.51.100.1/32','198.51.100.2/32'] + + ## + ## Replication settings + ## - set this to be the number of Geo secondary nodes you have + ## + postgresql['max_replication_slots'] = 1 + # postgresql['max_wal_senders'] = 10 + # postgresql['wal_keep_segments'] = 10 + + ## + ## Disable automatic database migrations temporarily + ## (until PostgreSQL is restarted and listening on the private address). + ## + gitlab_rails['auto_migrate'] = false + ``` + +1. Optional: If you want to add another **secondary** node, the relevant setting would look like: + + ```ruby + postgresql['md5_auth_cidr_addresses'] = ['198.51.100.1/32', '198.51.100.2/32','198.51.100.3/32'] + ``` + + You may also want to edit the `wal_keep_segments` and `max_wal_senders` to + match your database replication requirements. Consult the [PostgreSQL - + Replication documentation][pg-docs-runtime-replication] + for more information. + +1. Save the file and reconfigure GitLab for the database listen changes and + the replication slot changes to be applied: + + ```sh + gitlab-ctl reconfigure + ``` + + Restart PostgreSQL for its changes to take effect: + + ```sh + gitlab-ctl restart postgresql + ``` + +1. Re-enable migrations now that PostgreSQL is restarted and listening on the + private address. 
+ + Edit `/etc/gitlab/gitlab.rb` and **change** the configuration to `true`: + + ```ruby + gitlab_rails['auto_migrate'] = true + ``` + + Save the file and reconfigure GitLab: + + ```sh + gitlab-ctl reconfigure + ``` + +1. Now that the PostgreSQL server is set up to accept remote connections, run + `netstat -plnt | grep 5432` to make sure that PostgreSQL is listening on port + `5432` to the **primary** server's private address. + +1. A certificate was automatically generated when GitLab was reconfigured. This + will be used automatically to protect your PostgreSQL traffic from + eavesdroppers, but to protect against active ("man-in-the-middle") attackers, + the **secondary** node needs a copy of the certificate. Make a copy of the PostgreSQL + `server.crt` file on the **primary** node by running this command: + + ```sh + cat ~gitlab-psql/data/server.crt + ``` + + Copy the output into a clipboard or into a local file. You + will need it when setting up the **secondary** node! The certificate is not sensitive + data. + +### Step 2. Configure the **secondary** server + +1. SSH into your GitLab **secondary** server and login as root: + + ``` + sudo -i + ``` + +1. Stop application server and Sidekiq + + ``` + gitlab-ctl stop unicorn + gitlab-ctl stop sidekiq + ``` + + NOTE: **Note**: + This step is important so we don't try to execute anything before the node is fully configured. + +1. [Check TCP connectivity][rake-maintenance] to the **primary** node's PostgreSQL server: + + ```sh + gitlab-rake gitlab:tcp_check[198.51.100.1,5432] + ``` + + NOTE: **Note**: + If this step fails, you may be using the wrong IP address, or a firewall may + be preventing access to the server. Check the IP address, paying close + attention to the difference between public and private addresses and ensure + that, if a firewall is present, the **secondary** node is permitted to connect to the + **primary** node on port 5432. + +1. 
Create a `server.crt` file on the **secondary** server, with the content you got in the last step of the **primary** node's setup: + + ``` + editor server.crt + ``` + +1. Set up PostgreSQL TLS verification on the **secondary** node: + + Install the `server.crt` file: + + ```sh + install -D -o gitlab-psql -g gitlab-psql -m 0400 -T server.crt ~gitlab-psql/.postgresql/root.crt + ``` + + PostgreSQL will now only recognize that exact certificate when verifying TLS + connections. The certificate can only be replicated by someone with access + to the private key, which is **only** present on the **primary** node. + +1. Test that the `gitlab-psql` user can connect to the **primary** node's database: + + ```sh + sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql --list -U gitlab_replicator -d "dbname=gitlabhq_production sslmode=verify-ca" -W -h 198.51.100.1 + ``` + + When prompted, enter the password you set in the first step for the + `gitlab_replicator` user. If all worked correctly, you should see + the list of the **primary** node's databases. + + A failure to connect here indicates that the TLS configuration is incorrect. + Ensure that the contents of `~gitlab-psql/data/server.crt` on the **primary** node + match the contents of `~gitlab-psql/.postgresql/root.crt` on the **secondary** node. + +1. Configure PostgreSQL to enable FDW support: + + This step is similar to how we configured the **primary** instance. + This configuration is required to enable FDW support, even if using a single node. 
+ + Edit `/etc/gitlab/gitlab.rb` and add the following, replacing the IP + addresses with addresses appropriate to your network configuration: + + ```ruby + ## + ## Geo Secondary role + ## - configure dependent flags automatically to enable Geo + ## + roles ['geo_secondary_role'] + + ## + ## Secondary address + ## - replace '198.51.100.2' with the public or VPC address of your Geo secondary node + ## + postgresql['listen_address'] = '198.51.100.2' + postgresql['md5_auth_cidr_addresses'] = ['198.51.100.2/32'] + + ## + ## Database credentials password (defined previously in primary node) + ## - replicate same values here as defined in primary node + ## + postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484' + gitlab_rails['db_password'] = 'mypassword' + + ## + ## Enable FDW support for the Geo Tracking Database (improves performance) + ## + geo_secondary['db_fdw'] = true + ``` + + For external PostgreSQL instances, see [additional instructions](external_database.md). + If you bring a former **primary** node back online to serve as a **secondary** node, then you also need to remove `roles ['geo_primary_role']` or `geo_primary_role['enable'] = true`. + +1. Reconfigure GitLab for the changes to take effect: + + ```sh + gitlab-ctl reconfigure + ``` + +1. Restart PostgreSQL for the IP change to take effect and reconfigure again: + + ```sh + gitlab-ctl restart postgresql + gitlab-ctl reconfigure + ``` + + This last reconfigure will provision the FDW configuration and enable it. + +### Step 3. Initiate the replication process + +Below we provide a script that connects the database on the **secondary** node to +the database on the **primary** node, replicates the database, and creates the +needed files for streaming replication. + +The directories used are the defaults that are set up in Omnibus. If you have +changed any defaults or are using a source installation, configure it as you +see fit replacing the directories and paths. 
+ +CAUTION: **Warning:** +Make sure to run this on the **secondary** server as it removes all PostgreSQL's +data before running `pg_basebackup`. + +1. SSH into your GitLab **secondary** server and login as root: + + ```sh + sudo -i + ``` + +1. Choose a database-friendly name to use for your **secondary** node to + use as the replication slot name. For example, if your domain is + `secondary.geo.example.com`, you may use `secondary_example` as the slot + name as shown in the commands below. + +1. Execute the command below to start a backup/restore and begin the replication + CAUTION: **Warning:** Each Geo **secondary** node must have its own unique replication slot name. + Using the same slot name between two secondaries will break PostgreSQL replication. + + ```sh + gitlab-ctl replicate-geo-database --slot-name=secondary_example --host=198.51.100.1 + ``` + + When prompted, enter the _plaintext_ password you set up for the `gitlab_replicator` + user in the first step. + + This command also takes a number of additional options. You can use `--help` + to list them all, but here are a couple of tips: + - If PostgreSQL is listening on a non-standard port, add `--port=` as well. + - If your database is too large to be transferred in 30 minutes, you will need + to increase the timeout, e.g., `--backup-timeout=3600` if you expect the + initial replication to take under an hour. + - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether + (e.g., you know the network path is secure, or you are using a site-to-site + VPN). This is **not** safe over the public Internet! + - You can read more details about each `sslmode` in the + [PostgreSQL documentation][pg-docs-ssl]; + the instructions above are carefully written to ensure protection against + both passive eavesdroppers and active "man-in-the-middle" attackers. + - Change the `--slot-name` to the name of the replication slot + to be used on the **primary** database. 
The script will attempt to create the + replication slot automatically if it does not exist. + - If you're repurposing an old server into a Geo **secondary** node, you'll need to + add `--force` to the command line. + - When not on a production machine, you can skip the backup step by adding + `--skip-backup`, if you are really sure this is what you want. + +The replication process is now complete. + +## PGBouncer support (optional) + +[PGBouncer](http://pgbouncer.github.io/) may be used with GitLab Geo to pool +PostgreSQL connections. We recommend using PGBouncer if you use GitLab in a +high-availability configuration with a cluster of nodes supporting a Geo +**primary** node and another cluster of nodes supporting a Geo **secondary** node. For more +information, see the [Omnibus HA](https://docs.gitlab.com/ee/administration/high_availability/database.html#configure-using-omnibus-for-high-availability) +documentation. + +For a Geo **secondary** node to work properly with PGBouncer in front of the database, +it will need a separate read-only user to make [PostgreSQL FDW queries][FDW] +work: + +1. On the **primary** Geo database, enter the PostgreSQL console as an + admin user. If you are using an Omnibus-managed database, log onto the **primary** + node that is running the PostgreSQL database: + + ```sh + sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql gitlabhq_production + ``` + +1. Then create the read-only user: + + ```sql + -- NOTE: Use the password defined earlier + CREATE USER gitlab_geo_fdw WITH password 'mypassword'; + GRANT CONNECT ON DATABASE gitlabhq_production to gitlab_geo_fdw; + GRANT USAGE ON SCHEMA public TO gitlab_geo_fdw; + GRANT SELECT ON ALL TABLES IN SCHEMA public TO gitlab_geo_fdw; + GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO gitlab_geo_fdw; + + -- Tables created by "gitlab" should be made read-only for "gitlab_geo_fdw" + -- automatically. 
+ ALTER DEFAULT PRIVILEGES FOR USER gitlab IN SCHEMA public GRANT SELECT ON TABLES TO gitlab_geo_fdw; + ALTER DEFAULT PRIVILEGES FOR USER gitlab IN SCHEMA public GRANT SELECT ON SEQUENCES TO gitlab_geo_fdw; + ``` + +1. On the **secondary** nodes, change `/etc/gitlab/gitlab.rb`: + + ``` + geo_postgresql['fdw_external_user'] = 'gitlab_geo_fdw' + ``` + +1. Save the file and reconfigure GitLab for the changes to be applied: + + ```sh + gitlab-ctl reconfigure + ``` + +## MySQL replication + +MySQL replication is not supported for Geo. + +## Troubleshooting + +Read the [troubleshooting document](troubleshooting.md). + +[replication-slots-article]: https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75 +[pgback]: http://www.postgresql.org/docs/9.2/static/app-pgbasebackup.html +[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication +[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html +[toc]: index.md#using-omnibus-gitlab +[rake-maintenance]: ../../raketasks/maintenance.md +[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION +[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html +[pg-docs-runtime-replication]: https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html diff --git a/doc/administration/geo/replication/database_source.md b/doc/administration/geo/replication/database_source.md new file mode 100644 index 00000000000..a6b990ebc43 --- /dev/null +++ b/doc/administration/geo/replication/database_source.md @@ -0,0 +1,431 @@ +# Geo database replication (source) + +NOTE: **Note:** +This documentation applies to GitLab source installations. In GitLab 11.5, this documentation was deprecated and will be removed in a future release. +Please consider [migrating to GitLab Omnibus install](https://docs.gitlab.com/omnibus/update/convert_to_omnibus.html). 
For installations +using the Omnibus GitLab packages, follow the +[**database replication for Omnibus GitLab**][database] guide. + +NOTE: **Note:** +The stages of the setup process must be completed in the documented order. +Before attempting the steps in this stage, [complete all prior stages](index.md#using-gitlab-installed-from-source-deprecated). + +This document describes the minimal steps you have to take in order to +replicate your **primary** GitLab database to a **secondary** node's database. You may +have to change some values according to your database setup, how big it is, etc. + +You are encouraged to first read through all the steps before executing them +in your testing/production environment. + +## PostgreSQL replication + +The GitLab **primary** node where the write operations happen will connect to +**primary** database server, and the **secondary** ones which are read-only will +connect to **secondary** database servers (which are read-only too). + +NOTE: **Note:** +In many databases' documentation, you will see "**primary**" being referenced as "master" +and "**secondary**" as either "slave" or "standby" server (read-only). + +We recommend using [PostgreSQL replication slots][replication-slots-article] +to ensure the **primary** node retains all the data necessary for the secondaries to +recover. See below for more details. + +The following guide assumes that: + +- You are using PostgreSQL 9.6 or later which includes the + [`pg_basebackup` tool][pgback] and improved [Foreign Data Wrapper][FDW] support. +- You have a **primary** node already set up (the GitLab server you are + replicating from), running PostgreSQL 9.6 or later, and + you have a new **secondary** server set up with the same versions of the OS, + PostgreSQL, and GitLab on all nodes. +- The IP of the **primary** server for our examples is `198.51.100.1`, whereas the + **secondary** node's IP is `198.51.100.2`. 
Note that the **primary** and **secondary** servers + **must** be able to communicate over these addresses. These IP addresses can either + be public or private. + +CAUTION: **Warning:** +Geo works with streaming replication. Logical replication is not supported at this time. +There is an [issue where support is being discussed](https://gitlab.com/gitlab-org/gitlab-ee/issues/7420). + +### Step 1. Configure the **primary** server + +1. SSH into your GitLab **primary** server and log in as root: + + ```sh + sudo -i + ``` + +1. Add this node as the Geo **primary** by running: + + ```sh + bundle exec rake geo:set_primary_node + ``` + +1. Create a [replication user] named `gitlab_replicator`: + + ```sql + -- Create a new user 'gitlab_replicator' + CREATE USER gitlab_replicator; + + -- Set or change the password and grant the replication privilege + ALTER USER gitlab_replicator WITH REPLICATION ENCRYPTED PASSWORD 'replicationpasswordhere'; + ``` + +1. Make sure the `gitlab` database user has a password defined: + + ```sh + sudo -u postgres psql -d template1 -c "ALTER USER gitlab WITH ENCRYPTED PASSWORD 'mydatabasepassword';" + ``` + +1. Edit the content of `database.yml` in `production:` and add the password like the example below: + + ```yaml + # + # PRODUCTION + # + production: + adapter: postgresql + encoding: unicode + database: gitlabhq_production + pool: 10 + username: gitlab + password: mydatabasepassword + host: /var/opt/gitlab/geo-postgresql + ``` + +1. Set up TLS support for the PostgreSQL **primary** server: + + CAUTION: **Warning**: + Only skip this step if you **know** that PostgreSQL traffic + between the **primary** and **secondary** nodes will be secured through some other + means, e.g., a known-safe physical network path or a site-to-site VPN that + you have configured. + + If you are replicating your database across the open Internet, it is + **essential** that the connection is TLS-secured. 
Correctly configured, this + provides protection against both passive eavesdroppers and active + "man-in-the-middle" attackers. + + To generate a self-signed certificate and key, run this command: + + ```sh + openssl req -nodes -batch -x509 -newkey rsa:4096 -keyout server.key -out server.crt -days 3650 + ``` + + This will create two files - `server.key` and `server.crt` - that you can + use for authentication. + + Copy them to the correct location for your PostgreSQL installation: + + ```sh + # Copying a self-signed certificate and key + install -o postgres -g postgres -m 0400 -T server.crt ~postgres/9.x/main/data/server.crt + install -o postgres -g postgres -m 0400 -T server.key ~postgres/9.x/main/data/server.key + ``` + + Add this configuration to `postgresql.conf`, removing any existing + configuration for `ssl_cert_file` or `ssl_key_file`: + + ``` + ssl = on + ssl_cert_file='server.crt' + ssl_key_file='server.key' + ``` + +1. Edit `postgresql.conf` to configure the **primary** server for streaming replication + (for Debian/Ubuntu that would be `/etc/postgresql/9.x/main/postgresql.conf`): + + ``` + listen_addresses = '198.51.100.1' + wal_level = hot_standby + max_wal_senders = 5 + min_wal_size = 80MB + max_wal_size = 1GB + max_replication_slots = 1 # Number of Geo secondary nodes + wal_keep_segments = 10 + hot_standby = on + ``` + + NOTE: **Note**: + Be sure to set `max_replication_slots` to the number of Geo **secondary** + nodes that you may potentially have (at least 1). + + For security reasons, PostgreSQL by default only listens on the local + interface (e.g. 127.0.0.1). However, Geo needs to communicate + between the **primary** and **secondary** nodes over a common network, such as a + corporate LAN or the public Internet. For this reason, we need to + configure PostgreSQL to listen on more interfaces. + + The `listen_addresses` option opens PostgreSQL up to external connections + on the interface corresponding to the given IP. 
See [the PostgreSQL + documentation][pg-docs-runtime-conn] for more details. + + You may also want to edit the `wal_keep_segments` and `max_wal_senders` to + match your database replication requirements. Consult the + [PostgreSQL - Replication documentation][pg-docs-runtime-replication] for more information. + +1. Set the access control on the **primary** node to allow TCP connections using the + server's public IP and set the connection from the **secondary** node to require a + password. Edit `pg_hba.conf` (for Debian/Ubuntu that would be + `/etc/postgresql/9.x/main/pg_hba.conf`): + + ```sh + host all all 198.51.100.1/32 md5 + host replication gitlab_replicator 198.51.100.2/32 md5 + ``` + + Where `198.51.100.1` is the public IP address of the **primary** server, and `198.51.100.2` + the public IP address of the **secondary** one. If you want to add another + secondary, add one more row like the replication one and change the IP + address: + + ```sh + host all all 198.51.100.1/32 md5 + host replication gitlab_replicator 198.51.100.2/32 md5 + host replication gitlab_replicator 198.51.100.3/32 md5 + ``` + +1. Restart PostgreSQL for the changes to take effect. + +1. Choose a database-friendly name to use for your secondary to use as the + replication slot name. For example, if your domain is + `secondary.geo.example.com`, you may use `secondary_example` as the slot + name. + +1. Create the replication slot on the **primary** node: + + ```sh + $ sudo -u postgres psql -c "SELECT * FROM pg_create_physical_replication_slot('secondary_example');" + slot_name | xlog_position + ------------------+--------------- + secondary_example | + (1 row) + ``` + +1. Now that the PostgreSQL server is set up to accept remote connections, run + `netstat -plnt` to make sure that PostgreSQL is listening to the server's + public IP. + +### Step 2. 
Configure the secondary server + +Follow the first steps in ["configure the secondary server"][database-replication] and note that since you are installing from source, the username and +group listed as `gitlab-psql` in those steps should be replaced by `postgres` +instead. After completing the "Test that the `gitlab-psql` user can connect to +the **primary** node's database" step, continue here: + +1. Edit `postgresql.conf` to configure the secondary for streaming replication + (for Debian/Ubuntu that would be `/etc/postgresql/9.*/main/postgresql.conf`): + + ```sh + wal_level = hot_standby + max_wal_senders = 5 + wal_keep_segments = 10 + hot_standby = on + ``` + +1. Restart PostgreSQL for the changes to take effect. + +#### Enable tracking database on the secondary server + +Geo secondary nodes use a tracking database to keep track of replication status +and recover automatically from some replication issues. Follow the steps below to create +the tracking database. + +1. On the secondary node, run the following command to create `database_geo.yml` with the + information of your secondary PostgreSQL instance: + + ```sh + sudo cp /home/git/gitlab/config/database_geo.yml.postgresql /home/git/gitlab/config/database_geo.yml + ``` + +1. Edit the content of `database_geo.yml` in `production:` as in the example below: + + ```yaml + # + # PRODUCTION + # + production: + adapter: postgresql + encoding: unicode + database: gitlabhq_geo_production + pool: 10 + username: gitlab_geo + # password: + host: /var/opt/gitlab/geo-postgresql + ``` + +1. Create the database `gitlabhq_geo_production` on the PostgreSQL instance of the **secondary** node. + +1. Set up the Geo tracking database: + + ```sh + bundle exec rake geo:db:migrate + ``` + +1. Configure the [PostgreSQL FDW][FDW] connection and credentials: + + Save the script below in a file, for example `/tmp/geo_fdw.sh`, and modify the connection + params to match your environment. 
Execute it to set up the FDW connection. + + ```sh + #!/bin/bash + + # Secondary Database connection params: + DB_HOST="/var/opt/gitlab/postgresql" # change to the public IP or VPC private IP if it's an external server + DB_NAME="gitlabhq_production" + DB_USER="gitlab" + DB_PORT="5432" + + # Tracking Database connection params: + GEO_DB_HOST="/var/opt/gitlab/geo-postgresql" # change to the public IP or VPC private IP if it's an external server + GEO_DB_NAME="gitlabhq_geo_production" + GEO_DB_USER="gitlab_geo" + GEO_DB_PORT="5432" + + # Run a single SQL statement against the tracking database + query_exec () { + gitlab-psql -h "$GEO_DB_HOST" -d "$GEO_DB_NAME" -p "$GEO_DB_PORT" -c "${1}" + } + + query_exec "CREATE EXTENSION postgres_fdw;" + query_exec "CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}');" + query_exec "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');" + query_exec "CREATE SCHEMA gitlab_secondary;" + query_exec "GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};" + ``` + + Then edit `database_geo.yml` and add `fdw: true` to + the `production:` block. + +### Step 3. Initiate the replication process + +Below we provide a script that connects the database on the **secondary** node to +the database on the **primary** node, replicates the database, and creates the +needed files for streaming replication. + +The directories used are the defaults for Debian/Ubuntu. If you have changed +any defaults, adjust the directories and paths accordingly. + +CAUTION: **Warning:** +Make sure to run this on the **secondary** server as it removes all of PostgreSQL's +data before running `pg_basebackup`. + +1. SSH into your GitLab **secondary** server and log in as root: + + ```sh + sudo -i + ``` + +1. Save the snippet below in a file, let's say `/tmp/replica.sh`. 
Modify the + embedded paths if necessary: + + ``` + #!/bin/bash + + PORT="5432" + USER="gitlab_replicator" + echo --------------------------------------------------------------- + echo WARNING: Make sure this script is run from the secondary server + echo --------------------------------------------------------------- + echo + echo Enter the IP or FQDN of the primary PostgreSQL server + read HOST + echo Enter the password for $USER@$HOST + read -s PASSWORD + echo Enter the required sslmode + read SSLMODE + + echo Stopping PostgreSQL and all GitLab services + sudo service gitlab stop + sudo service postgresql stop + + echo Backing up postgresql.conf + sudo -u postgres mv /var/opt/gitlab/postgresql/data/postgresql.conf /var/opt/gitlab/postgresql/ + + echo Cleaning up old cluster directory + sudo -u postgres rm -rf /var/opt/gitlab/postgresql/data + + echo Starting base backup as the replicator user + echo Enter the password for $USER@$HOST + sudo -u postgres /opt/gitlab/embedded/bin/pg_basebackup -h $HOST -D /var/opt/gitlab/postgresql/data -U gitlab_replicator -v -x -P + + echo Writing recovery.conf file + sudo -u postgres bash -c "cat > /var/opt/gitlab/postgresql/data/recovery.conf <<- _EOF1_ + standby_mode = 'on' + primary_conninfo = 'host=$HOST port=$PORT user=$USER password=$PASSWORD sslmode=$SSLMODE' + _EOF1_ + " + + echo Restoring postgresql.conf + sudo -u postgres mv /var/opt/gitlab/postgresql/postgresql.conf /var/opt/gitlab/postgresql/data/ + + echo Starting PostgreSQL + sudo service postgresql start + ``` + +1. Run it with: + + ```sh + bash /tmp/replica.sh + ``` + + When prompted, enter the IP/FQDN of the **primary** node, and the password you set up + for the `gitlab_replicator` user in the first step. + + You should use `verify-ca` for the `sslmode`. You can use `disable` if you + are happy to skip PostgreSQL TLS authentication altogether (e.g., you know + the network path is secure, or you are using a site-to-site VPN). 
This is + **not** safe over the public Internet! + + You can read more details about each `sslmode` in the + [PostgreSQL documentation][pg-docs-ssl]; + the instructions above are carefully written to ensure protection against + both passive eavesdroppers and active "man-in-the-middle" attackers. + +The replication process is now over. + +## PGBouncer support (optional) + +1. First, enter the PostgreSQL console as an admin user. + +1. Then create the read-only user: + + ```sql + -- NOTE: Use the password defined earlier + CREATE USER gitlab_geo_fdw WITH password 'mypassword'; + GRANT CONNECT ON DATABASE gitlabhq_production to gitlab_geo_fdw; + GRANT USAGE ON SCHEMA public TO gitlab_geo_fdw; + GRANT SELECT ON ALL TABLES IN SCHEMA public TO gitlab_geo_fdw; + GRANT SELECT ON ALL SEQUENCES IN SCHEMA public TO gitlab_geo_fdw; + + -- Tables created by "gitlab" should be made read-only for "gitlab_geo_fdw" + -- automatically. + ALTER DEFAULT PRIVILEGES FOR USER gitlab IN SCHEMA public GRANT SELECT ON TABLES TO gitlab_geo_fdw; + ALTER DEFAULT PRIVILEGES FOR USER gitlab IN SCHEMA public GRANT SELECT ON SEQUENCES TO gitlab_geo_fdw; + ``` + +1. Enter the PostgreSQL console on the **secondary** tracking database and change the user mapping to this new user: + + ``` + ALTER USER MAPPING FOR gitlab_geo SERVER gitlab_secondary OPTIONS (SET user 'gitlab_geo_fdw') + ``` + +## MySQL replication + +MySQL replication is not supported for Geo. + +## Troubleshooting + +Read the [troubleshooting document](troubleshooting.md). 
+ +[replication-slots-article]: https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75 +[pgback]: http://www.postgresql.org/docs/9.6/static/app-pgbasebackup.html +[replication user]:https://wiki.postgresql.org/wiki/Streaming_Replication +[FDW]: https://www.postgresql.org/docs/9.6/static/postgres-fdw.html +[database]: database.md +[add-geo-node]: configuration.md#step-3-add-the-secondary-gitlab-node +[database-replication]: database.md#step-2-configure-the-secondary-server +[pg-docs-ssl]: https://www.postgresql.org/docs/9.6/static/libpq-ssl.html#LIBPQ-SSL-PROTECTION +[pg-docs-runtime-conn]: https://www.postgresql.org/docs/9.6/static/runtime-config-connection.html +[pg-docs-runtime-replication]: https://www.postgresql.org/docs/9.6/static/runtime-config-replication.html diff --git a/doc/administration/geo/replication/docker_registry.md b/doc/administration/geo/replication/docker_registry.md new file mode 100644 index 00000000000..8a5bca1708f --- /dev/null +++ b/doc/administration/geo/replication/docker_registry.md @@ -0,0 +1,23 @@ +# Docker Registry for a secondary node + +You can set up a [Docker Registry] on your +**secondary** Geo node that mirrors the one on the **primary** Geo node. + +## Storage support + +CAUTION: **Warning:** +If you use [local storage][registry-storage] +for the Container Registry you **cannot** replicate it to a **secondary** node. + +Docker Registry currently supports a few types of storages. If you choose a +distributed storage (`azure`, `gcs`, `s3`, `swift`, or `oss`) for your Docker +Registry on the **primary** node, you can use the same storage for a **secondary** +Docker Registry as well. For more information, read the +[Load balancing considerations][registry-load-balancing] +when deploying the Registry, and how to set up the storage driver for GitLab's +integrated [Container Registry][registry-storage]. 
+ +[ee]: https://about.gitlab.com/pricing/ +[Docker Registry]: https://docs.docker.com/registry/ +[registry-storage]: ../../container_registry.md#container-registry-storage-driver +[registry-load-balancing]: https://docs.docker.com/registry/deploying/#load-balancing-considerations diff --git a/doc/administration/geo/replication/external_database.md b/doc/administration/geo/replication/external_database.md new file mode 100644 index 00000000000..18e0c75f703 --- /dev/null +++ b/doc/administration/geo/replication/external_database.md @@ -0,0 +1,169 @@ +# Geo with external PostgreSQL instances + +This document is relevant if you are using a PostgreSQL instance that is *not +managed by Omnibus*. This includes cloud-managed instances like AWS RDS, or +manually installed and configured PostgreSQL instances. + +NOTE: **Note**: +We strongly recommend running Omnibus-managed instances as they are actively +developed and tested. We aim to be compatible with most external +(not managed by Omnibus) databases but we do not guarantee compatibility. + +## **Primary** node + +### Configure the external database to be replicated + +To set up an external database, you can either: + +- Set up streaming replication yourself (for example, in AWS RDS). +- Perform the Omnibus configuration manually as follows. + +In an Omnibus install, the +[geo_primary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles) +configures the **primary** node's database to be replicated by making changes to +`pg_hba.conf` and `postgresql.conf`. 
Make the following configuration changes +manually to your external database configuration: + +``` +## +## Geo Primary Role +## - pg_hba.conf +## +host replication gitlab_replicator <trusted secondary IP>/32 md5 +``` + +``` +## +## Geo Primary Role +## - postgresql.conf +## +sql_replication_user = gitlab_replicator +wal_level = hot_standby +max_wal_senders = 10 +wal_keep_segments = 50 +max_replication_slots = 1 # number of secondary instances +hot_standby = on +``` + +## **Secondary** nodes + +With Omnibus, the +[geo_secondary_role](https://docs.gitlab.com/omnibus/roles/#gitlab-geo-roles) +has three main functions: + +1. Configure the replica database. +1. Configure the tracking database. +1. Enable the Geo Log Cursor (`geo_logcursor`) (irrelevant to this doc). + +### Configure the external replica database + +To set up an external replica database, you can either: + +- Set up streaming replication yourself (for example, in AWS RDS). +- Perform the Omnibus configuration manually as follows. + +In an Omnibus install, the `geo_secondary_role` makes configuration changes to +`postgresql.conf`. Make the following configuration changes manually to your +external replica database configuration: + +``` +## +## Geo Secondary Role +## - postgresql.conf +## +wal_level = hot_standby +max_wal_senders = 10 +wal_keep_segments = 10 +hot_standby = on +``` + +### Configure the tracking database + +**Secondary** nodes use a separate PostgreSQL installation as a tracking +database to keep track of replication status and automatically recover from +potential replication issues. + +It requires an [FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html) +connection with the **secondary** replica database for improved performance. + +If you have an external database ready to be used as the tracking database, +follow the instructions below to use it: + +1. SSH into a GitLab **secondary** server and login as root: + + ```bash + sudo -i + ``` + +1. 
Edit `/etc/gitlab/gitlab.rb` with the connection params and credentials for + the machine with the PostgreSQL instance: + + ```ruby + # note this is shared between both databases, + # make sure you define the same password in both + gitlab_rails['db_password'] = 'mypassword' + + geo_secondary['db_host'] = '<change to the tracking DB public IP>' + geo_secondary['db_port'] = 5431 # change to the correct port + geo_secondary['db_fdw'] = true # enable FDW + geo_postgresql['enable'] = false # don't use internal managed instance + ``` + +1. Reconfigure GitLab for the changes to take effect: + + ```bash + gitlab-ctl reconfigure + ``` + +1. Run the tracking database migrations: + + ```bash + gitlab-rake geo:db:migrate + ``` + +1. Configure the + [PostgreSQL FDW](https://www.postgresql.org/docs/9.6/static/postgres-fdw.html) + connection and credentials: + + Save the script below in a file, ex. `/tmp/geo_fdw.sh` and modify the connection + params to match your environment. Execute it to set up the FDW connection. 
+ + ```bash + #!/bin/bash + + # Secondary Database connection params: + DB_HOST="<change to the public IP or VPC private IP>" + DB_NAME="gitlabhq_production" + DB_USER="gitlab" + DB_PORT="5432" + + # Tracking Database connection params: + GEO_DB_HOST="<change to the public IP or VPC private IP>" + GEO_DB_NAME="gitlabhq_geo_production" + GEO_DB_USER="gitlab_geo" + GEO_DB_PORT="5432" + + query_exec () { + gitlab-psql -h $GEO_DB_HOST -d $GEO_DB_NAME -p $GEO_DB_PORT -c "${1}" + } + + query_exec "CREATE EXTENSION postgres_fdw;" + query_exec "CREATE SERVER gitlab_secondary FOREIGN DATA WRAPPER postgres_fdw OPTIONS (host '${DB_HOST}', dbname '${DB_NAME}', port '${DB_PORT}');" + query_exec "CREATE USER MAPPING FOR ${GEO_DB_USER} SERVER gitlab_secondary OPTIONS (user '${DB_USER}');" + query_exec "CREATE SCHEMA gitlab_secondary;" + query_exec "GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO ${GEO_DB_USER};" + ``` + + NOTE: **Note:** The script template above uses `gitlab-psql` as it's intended to be executed from the Geo machine, + but you can change it to `psql` and run it from any machine that has access to the database. + +1. Restart GitLab: + + ```bash + gitlab-ctl restart + ``` +1. Populate the FDW tables: + + ```bash + gitlab-rake geo:db:refresh_foreign_tables + ``` diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md new file mode 100644 index 00000000000..996542f9f2e --- /dev/null +++ b/doc/administration/geo/replication/faq.md @@ -0,0 +1,62 @@ +# Geo Frequently Asked Questions + +## What are the minimum requirements to run Geo? + +The requirements are listed [on the index page](index.md#requirements-for-running-geo) + +## How does Geo know which projects to sync? + +On each **secondary** node, there is a read-only replicated copy of the GitLab database. +A **secondary** node also has a tracking database where it stores which projects have been synced. 
+Geo compares the two databases to find projects that are not yet tracked.
+
+At the start, this tracking database is empty, so Geo will start trying to update from every project that it can see in the GitLab database.
+
+For each project to sync:
+
+1. Geo will issue a `git fetch geo --mirror` to get the latest information from the **primary** node.
+   If there are no changes, the sync ends quickly. Otherwise, it will pull the latest commits.
+1. The **secondary** node will update the tracking database to store the fact that it has synced projects A, B, C, etc.
+1. Repeat until all projects are synced.
+
+When someone pushes a commit to the **primary** node, it generates an event in the GitLab database that the repository has changed.
+The **secondary** node sees this event, marks the project in question as dirty, and schedules the project to be resynced.
+
+To ensure that problems with pipelines (for example, syncs failing too many times or jobs being lost) don't permanently stop projects from syncing, Geo also periodically checks the tracking database for projects that are marked as dirty. This check happens when
+the number of concurrent syncs falls below `repos_max_capacity` and there are no new projects waiting to be synced.
+
+Geo also has a checksum feature which runs a SHA256 sum across all the Git references to the SHA values.
+If the refs don't match between the **primary** node and the **secondary** node, then the **secondary** node will mark that project as dirty and try to resync it.
+So even if we have an outdated tracking database, the validation should activate and find discrepancies in the repository state and resync.
+
+## Can I use Geo in a disaster recovery situation?
+
+Yes, but there are limitations to what we replicate (see
+[What data is replicated to a **secondary** node?](#what-data-is-replicated-to-a-secondary-node)).
+
+Read the documentation for [Disaster Recovery](../disaster_recovery/index.md).
+
+## What data is replicated to a **secondary** node?
+
+We currently replicate project repositories, LFS objects, generated
+attachments / avatars and the whole database. This means user accounts,
+issues, merge requests, groups, project data, etc., will be available for
+query.
+
+## Can I git push to a **secondary** node?
+
+Yes! Pushing directly to a **secondary** node (for both HTTP and SSH, including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+
+## How long does it take to have a commit replicated to a **secondary** node?
+
+All replication operations are asynchronous and are queued to be dispatched. Therefore, it depends on a lot of
+factors including the amount of traffic, how big your commit is, the
+connectivity between your nodes, your hardware, etc.
+
+## What if the SSH server runs at a different port?
+
+That's totally fine. We use HTTP(s) to fetch repository changes from the **primary** node to all **secondary** nodes.
+
+## Is it possible to set up a Docker Registry for a **secondary** node that mirrors the one on the **primary** node?
+
+Yes. See [Docker Registry for a **secondary** node](docker_registry.md).
diff --git a/doc/administration/geo/replication/high_availability.md b/doc/administration/geo/replication/high_availability.md
new file mode 100644
index 00000000000..9ff7f5bfefa
--- /dev/null
+++ b/doc/administration/geo/replication/high_availability.md
@@ -0,0 +1,228 @@
+# Geo High Availability
+
+This document describes a minimal reference architecture for running Geo
+in a high availability configuration. If your HA setup differs from the one
+described, it is possible to adapt these instructions to your needs.
+
+## Architecture overview
+
+![Geo HA Diagram](../../img/high_availability/geo-ha-diagram.png)
+
+_[diagram source - GitLab employees only][diagram-source]_
+
+The topology above assumes that the **primary** and **secondary** Geo clusters
+are located in two separate locations, on their own virtual network
+with private IP addresses. The network is configured such that all machines within
+one geographic location can communicate with each other using their private IP addresses.
+The IP addresses given are examples and may be different depending on the
+network topology of your deployment.
+
+The only external way to access the two Geo deployments is by HTTPS at
+`gitlab.us.example.com` and `gitlab.eu.example.com` in the example above.
+
+NOTE: **Note:**
+The **primary** and **secondary** Geo deployments must be able to communicate with each other over HTTPS.
+
+## Redis and PostgreSQL High Availability
+
+The **primary** and **secondary** Redis and PostgreSQL should be configured
+for high availability. Because of the additional complexity involved
+in setting up this configuration for PostgreSQL and Redis,
+it is not covered by this Geo HA documentation.
+
+For more information about setting up a highly available PostgreSQL cluster and Redis cluster using the Omnibus package, see the high availability documentation for
+[PostgreSQL](../../high_availability/database.md) and
+[Redis](../../high_availability/redis.md), respectively.
+
+NOTE: **Note:**
+It is possible to use cloud hosted services for PostgreSQL and Redis, but this is beyond the scope of this document.
+
+## Prerequisites: A working GitLab HA cluster
+
+This cluster will serve as the **primary** node. Use the
+[GitLab HA documentation](../../high_availability/README.md) to set this up.
+
+## Configure the GitLab cluster to be the **primary** node
+
+The following steps enable a GitLab cluster to serve as the **primary** node.
+
+### Step 1: Configure the **primary** frontend servers
+
+1.
Edit `/etc/gitlab/gitlab.rb` and add the following:
+
+    ```ruby
+    ##
+    ## Enable the Geo primary role
+    ##
+    roles ['geo_primary_role']
+
+    ##
+    ## Disable automatic migrations
+    ##
+    gitlab_rails['auto_migrate'] = false
+    ```
+
+After making these changes, [reconfigure GitLab][gitlab-reconfigure] so the changes take effect.
+
+NOTE: **Note:** PostgreSQL and Redis should have already been disabled on the
+application servers, and connections from the application servers to those
+services on the backend servers configured, during normal GitLab HA setup. See
+high availability configuration documentation for
+[PostgreSQL](../../high_availability/database.md#configuring-the-application-nodes)
+and [Redis](../../high_availability/redis.md#example-configuration-for-the-gitlab-application).
+
+The **primary** database will require modification later, as part of
+[step 2](#step-2-configure-the-main-read-only-replica-postgresql-database-on-the-secondary-node).
+
+## Configure a **secondary** node
+
+A **secondary** cluster is similar to any other GitLab HA cluster, with two
+major differences:
+
+* The main PostgreSQL database is a read-only replica of the **primary** node's
+  PostgreSQL database.
+* There is also a single PostgreSQL database for the **secondary** cluster,
+  called the "tracking database", which tracks the synchronization state of
+  various resources.
+
+Therefore, we will set up the HA components one-by-one, and include deviations
+from the normal HA setup.
+
+### Step 1: Configure the Redis and NFS services on the **secondary** node
+
+Configure the following services, again using the non-Geo high availability
+documentation:
+
+* [Configuring Redis for GitLab HA](../../high_availability/redis.md) for high
+  availability.
+* [NFS](../../high_availability/nfs.md) which will store data that is
+  synchronized from the **primary** node.
+
+### Step 2: Configure the main read-only replica PostgreSQL database on the **secondary** node
+
+NOTE: **Note:** The following documentation assumes the database will be run on
+only a single machine, rather than as a PostgreSQL cluster.
+
+Configure the [**secondary** database](database.md) as a read-only replica of
+the **primary** database.
+
+If using an external PostgreSQL instance, refer also to
+[Geo with external PostgreSQL instances](external_database.md).
+
+### Step 3: Configure the tracking database on the **secondary** node
+
+NOTE: **Note:** This documentation assumes the tracking database will be run on
+only a single machine, rather than as a PostgreSQL cluster.
+
+Configure the tracking database.
+
+1. Edit `/etc/gitlab/gitlab.rb` in the tracking database machine, and add the
+   following:
+
+    ```ruby
+    ##
+    ## Enable the Geo secondary tracking database
+    ##
+    geo_postgresql['enable'] = true
+    geo_postgresql['ha'] = true
+    ```
+
+After making these changes, [reconfigure GitLab][gitlab-reconfigure] so the changes take effect.
+
+If using an external PostgreSQL instance, refer also to
+[Geo with external PostgreSQL instances](external_database.md).
+
+### Step 4: Configure the frontend application servers on the **secondary** node
+
+In the architecture overview, there are two machines running the GitLab
+application services. These services are enabled selectively in the
+configuration.
+
+Configure the application servers following
+[Configuring GitLab for HA](../../high_availability/gitlab.md), then make the
+following modifications:
+
+1. Edit `/etc/gitlab/gitlab.rb` on each application server in the **secondary**
+   cluster, and add the following:
+
+    ```ruby
+    ##
+    ## Enable the Geo secondary role
+    ##
+    roles ['geo_secondary_role', 'application_role']
+
+    ##
+    ## Disable automatic migrations
+    ##
+    gitlab_rails['auto_migrate'] = false
+
+    ##
+    ## Configure the connection to the tracking DB.
And disable application
+    ## servers from running tracking databases.
+    ##
+    geo_secondary['db_host'] = '10.1.4.1'
+    geo_secondary['db_password'] = 'plaintext Geo tracking DB password'
+    geo_postgresql['enable'] = false
+
+    ##
+    ## Configure connection to the streaming replica database, if you haven't
+    ## already
+    ##
+    gitlab_rails['db_host'] = '10.1.3.1'
+    gitlab_rails['db_password'] = 'plaintext DB password'
+
+    ##
+    ## Configure connection to Redis, if you haven't already
+    ##
+    gitlab_rails['redis_host'] = '10.1.2.1'
+    gitlab_rails['redis_password'] = 'Redis password'
+
+    ##
+    ## If you are using custom users not managed by Omnibus, you need to specify
+    ## UIDs and GIDs like below, and ensure they match between servers in a
+    ## cluster to avoid permissions issues
+    ##
+    user['uid'] = 9000
+    user['gid'] = 9000
+    web_server['uid'] = 9001
+    web_server['gid'] = 9001
+    registry['uid'] = 9002
+    registry['gid'] = 9002
+    ```
+
+NOTE: **Note:**
+If you set up a PostgreSQL cluster using the Omnibus package and set
+`postgresql['sql_user_password'] = 'md5 digest of secret'`, keep in
+mind that `gitlab_rails['db_password']` and `geo_secondary['db_password']`
+mentioned above contain the plaintext passwords. These are used to let the Rails
+servers connect to the databases.
+
+NOTE: **Note:**
+Make sure that the current node's IP is listed in the `postgresql['md5_auth_cidr_addresses']` setting of your remote database.
+
+After making these changes, [reconfigure GitLab][gitlab-reconfigure] so the changes take effect.
+
+On the **secondary** node, the following GitLab frontend services will be enabled:
+
+* geo-logcursor
+* gitlab-pages
+* gitlab-workhorse
+* logrotate
+* nginx
+* registry
+* remote-syslog
+* sidekiq
+* unicorn
+
+Verify these services by running `sudo gitlab-ctl status` on the frontend
+application servers.
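The verification in Step 4 can be scripted. The following is a minimal sketch, not part of the original documentation: it assumes `gitlab-ctl status` prints one runit-style line per service (for example `run: nginx: (pid 972) 7s; run: log: (pid 971) 7s`), and the function name `missing_services` is hypothetical.

```shell
#!/bin/bash

# Hypothetical helper: reads `gitlab-ctl status` output on stdin and prints
# each expected frontend service that is not reported as running.
# Assumption: running services appear as lines starting "run: <service>:".
missing_services () {
  local status expected service
  status="$(cat)"
  expected="geo-logcursor gitlab-pages gitlab-workhorse logrotate nginx registry remote-syslog sidekiq unicorn"
  for service in $expected; do
    # Anchor at line start so the trailing "run: log:" part never matches.
    if ! printf '%s\n' "$status" | grep -q "^run: ${service}:"; then
      echo "$service"
    fi
  done
}

# Usage on a frontend application server:
#   sudo gitlab-ctl status | missing_services
```

An empty result suggests all nine services from the list are running. Any name printed is a service that is down or absent from the status output.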
+
+### Step 5: Set up the load balancer for the **secondary** node
+
+In this topology, a load balancer is required at each geographic location to
+route traffic to the application servers.
+
+See [Load Balancer for GitLab HA](../../high_availability/load_balancer.md) for
+more information.
+
+[diagram-source]: https://docs.google.com/drawings/d/1z0VlizKiLNXVVVaERFwgsIOuEgjcUqDTWPdQYsE7Z4c/edit
+[gitlab-reconfigure]: ../../restart_gitlab.md#omnibus-gitlab-reconfigure
diff --git a/doc/administration/geo/replication/img/geo_architecture.png b/doc/administration/geo/replication/img/geo_architecture.png
Binary files differ
new file mode 100644
index 00000000000..d318cd5d0f4
--- /dev/null
+++ b/doc/administration/geo/replication/img/geo_architecture.png
diff --git a/doc/administration/geo/replication/img/geo_node_dashboard.png b/doc/administration/geo/replication/img/geo_node_dashboard.png
Binary files differ
new file mode 100644
index 00000000000..99792d0770d
--- /dev/null
+++ b/doc/administration/geo/replication/img/geo_node_dashboard.png
diff --git a/doc/administration/geo/replication/img/geo_node_healthcheck.png b/doc/administration/geo/replication/img/geo_node_healthcheck.png
Binary files differ
new file mode 100644
index 00000000000..33a31f7ab49
--- /dev/null
+++ b/doc/administration/geo/replication/img/geo_node_healthcheck.png
diff --git a/doc/administration/geo/replication/img/geo_overview.png b/doc/administration/geo/replication/img/geo_overview.png
Binary files differ
new file mode 100644
index 00000000000..01c1615212c
--- /dev/null
+++ b/doc/administration/geo/replication/img/geo_overview.png
diff --git a/doc/administration/geo/replication/index.md b/doc/administration/geo/replication/index.md
new file mode 100644
index 00000000000..2e7aa90f80e
--- /dev/null
+++ b/doc/administration/geo/replication/index.md
@@ -0,0 +1,309 @@
+# Geo Replication **[PREMIUM ONLY]**
+
+Geo is the solution for widely distributed development teams.
+
+## Overview
+
+Fetching large repositories can take a long time for teams located far from a single GitLab instance.
+
+Geo provides local, read-only instances of your GitLab instance, reducing the time it takes to clone and fetch large repositories and speeding up development.
+
+> - Geo is part of [GitLab Premium](https://about.gitlab.com/pricing/#self-managed).
+> - Introduced in GitLab Enterprise Edition 8.9.
+> - We recommend you use:
+>   - At least GitLab Enterprise Edition 10.0 for basic Geo features.
+>   - The latest version for a better experience.
+> - Make sure that all nodes run the same GitLab version.
+> - Geo requires PostgreSQL 9.6 and Git 2.9, in addition to GitLab's usual [minimum requirements](../../../install/requirements.md).
+> - Using Geo in combination with [High Availability](../../high_availability/README.md) is considered **Generally Available** (GA) in [GitLab Premium](https://about.gitlab.com/pricing/) 10.4.
+
+For a video introduction to Geo, see [Introduction to GitLab Geo - GitLab Features](https://www.youtube.com/watch?v=-HDLxSjEh6w).
+
+CAUTION: **Caution:**
+Geo undergoes significant changes from release to release. Upgrades **are** supported and [documented](#updating-geo), but you should ensure that you're using the right version of the documentation for your installation.
+
+To make sure you're using the right version of the documentation, navigate to [the source version of this page on GitLab.com](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/doc/administration/geo/replication/index.md) and choose the appropriate release from the **Switch branch/tag** dropdown. For example, [`v11.2.3-ee`](https://gitlab.com/gitlab-org/gitlab-ee/blob/v11.2.3-ee/doc/administration/geo/replication/index.md).
+
+## Use cases
+
+Implementing Geo provides the following benefits:
+
+- Reduce from minutes to seconds the time taken for your distributed developers to clone and fetch large repositories and projects.
+- Enable all of your developers to contribute ideas and work in parallel, no matter where they are.
+- Balance the load between your **primary** and **secondary** nodes, or offload your automated tests to a **secondary** node.
+
+In addition, it:
+
+- Can be used for cloning and fetching projects, in addition to reading any data available in the GitLab web interface (see [current limitations](#current-limitations)).
+- Overcomes slow connections between distant offices, saving time by improving speed for distributed teams.
+- Helps reduce the loading time for automated tasks, custom integrations, and internal workflows.
+- Can quickly fail over to a **secondary** node in a [disaster recovery](../disaster_recovery/index.md) scenario.
+- Allows [planned failover](../disaster_recovery/planned_failover.md) to a **secondary** node.
+
+Geo provides:
+
+- Read-only **secondary** nodes: Maintain one **primary** GitLab node while still enabling read-only **secondary** nodes for each of your distributed teams.
+- Authentication system hooks: **Secondary** nodes receive all authentication data (like user accounts and logins) from the **primary** instance.
+- An intuitive UI: **Secondary** nodes utilize the same web interface your team has grown accustomed to. In addition, there are visual notifications that block write operations and make it clear that a user is on a **secondary** node.
+
+## How it works
+
+Your Geo instance can be used for cloning and fetching projects, in addition to reading any data. This will make working with large repositories over large distances much faster.
+
+![Geo overview](img/geo_overview.png)
+
+When Geo is enabled, the:
+
+- Original instance is known as the **primary** node.
+- Replicated read-only nodes are known as **secondary** nodes.
+
+Keep in mind that:
+
+- **Secondary** nodes talk to the **primary** node to:
+  - Get user data for logins (API).
+  - Replicate repositories, LFS Objects, and Attachments (HTTPS + JWT).
+- Since GitLab Premium 10.0, the **primary** node no longer talks to **secondary** nodes to notify for changes (API).
+- Pushing directly to a **secondary** node (for both HTTP and SSH, including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3.
+- There are [limitations](#current-limitations) in the current implementation.
+
+### Architecture
+
+The following diagram illustrates the underlying architecture of Geo.
+
+![Geo architecture](img/geo_architecture.png)
+
+In this diagram:
+
+- There is the **primary** node and the details of one **secondary** node.
+- Writes to the database can only be performed on the **primary** node. A **secondary** node receives database
+  updates via PostgreSQL streaming replication.
+- If present, the [LDAP server](#ldap) should be configured to replicate for [Disaster Recovery](../disaster_recovery/index.md) scenarios.
+- A **secondary** node performs different types of synchronization against the **primary** node, using a special
+  authorization protected by JWT:
+  - Repositories are cloned/updated via Git over HTTPS.
+  - Attachments, LFS objects, and other files are downloaded via HTTPS using a private API endpoint.
+
+From the perspective of a user performing Git operations:
+
+- The **primary** node behaves as a full read-write GitLab instance.
+- **Secondary** nodes are read-only but proxy Git push operations to the **primary** node. This makes **secondary** nodes appear to support push operations themselves.
+
+To simplify the diagram, some necessary components are omitted. Note that:
+
+- Git over SSH requires [`gitlab-shell`](https://gitlab.com/gitlab-org/gitlab-shell) and OpenSSH.
+- Git over HTTPS requires [`gitlab-workhorse`](https://gitlab.com/gitlab-org/gitlab-workhorse).
+
+Note that a **secondary** node needs two different PostgreSQL databases:
+
+- A read-only database instance that streams data from the main GitLab database.
+- [Another database instance](#geo-tracking-database) used internally by the **secondary** node to record what data has been replicated.
+
+In **secondary** nodes, there is an additional daemon: [Geo Log Cursor](#geo-log-cursor).
+
+## Requirements for running Geo
+
+The following are required to run Geo:
+
+- An operating system that supports OpenSSH 6.9+ (needed for
+  [fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md)).
+  The following operating systems are known to ship with a current version of OpenSSH:
+  - [CentOS](https://www.centos.org) 7.4+
+  - [Ubuntu](https://www.ubuntu.com) 16.04+
+- PostgreSQL 9.6+ with [FDW](https://www.postgresql.org/docs/9.6/postgres-fdw.html) support and [Streaming Replication](https://wiki.postgresql.org/wiki/Streaming_Replication)
+- Git 2.9+
+
+### Firewall rules
+
+The following table lists basic ports that must be open between the **primary** and **secondary** nodes for Geo.
+
+| **Primary** node | **Secondary** node | Protocol     |
+|:-----------------|:-------------------|:-------------|
+| 80               | 80                 | HTTP         |
+| 443              | 443                | TCP or HTTPS |
+| 22               | 22                 | TCP          |
+| 5432             |                    | PostgreSQL   |
+
+See the full list of ports used by GitLab in [Package defaults](https://docs.gitlab.com/omnibus/package-information/defaults.html).
+
+NOTE: **Note:**
+[Web terminal](../../../ci/environments.md#web-terminals) support requires your load balancer to correctly handle WebSocket connections.
+When using HTTP or HTTPS proxying, your load balancer must be configured to pass through the `Connection` and `Upgrade` hop-by-hop headers. See the [web terminal](../../integration/terminal.md) integration guide for more details.
+
+NOTE: **Note:**
+When using HTTPS protocol for port 443, you will need to add an SSL certificate to the load balancers.
+If you wish to terminate SSL at the GitLab application server instead, use TCP protocol.
+
+### LDAP
+
+We recommend that if you use LDAP on your **primary** node, you also set up secondary LDAP servers on each **secondary** node. Otherwise, users will not be able to perform Git operations over HTTP(s) on the **secondary** node using HTTP Basic Authentication. However, Git via SSH and personal access tokens will still work.
+
+NOTE: **Note:**
+It is possible for all **secondary** nodes to share an LDAP server, but additional latency can be an issue. Also, consider what LDAP server will be available in a [disaster recovery](../disaster_recovery/index.md) scenario if a **secondary** node is promoted to be a **primary** node.
+
+Check for instructions on how to set up replication in your LDAP service. Instructions will be different depending on the software or service used. For example, OpenLDAP provides [these instructions](https://www.openldap.org/doc/admin24/replication.html).
+
+### Geo Tracking Database
+
+The tracking database instance is used as metadata to control what needs to be updated on the disk of the local instance. For example:
+
+- Download new assets.
+- Fetch new LFS Objects.
+- Fetch changes from a repository that has recently been updated.
+
+Because the replicated database instance is read-only, we need this additional database instance for each **secondary** node.
+The tracking database requires the `postgres_fdw` extension.
+
+### Geo Log Cursor
+
+This daemon:
+
+- Reads a log of events replicated by the **primary** node to the **secondary** database instance.
+- Updates the Geo Tracking Database instance with changes that need to be executed.
+
+When something is marked to be updated in the tracking database instance, asynchronous jobs running on the **secondary** node will execute the required operations and update the state.
+
+This new architecture allows GitLab to be resilient to connectivity issues between the nodes.
It doesn't matter how long the **secondary** node is disconnected from the **primary** node as it will be able to replay all the events in the correct order and become synchronized with the **primary** node again.
+
+## Setup instructions
+
+These instructions assume you have a working instance of GitLab. They guide you through:
+
+1. Making your existing instance the **primary** node.
+1. Adding **secondary** nodes.
+
+CAUTION: **Caution:**
+The steps below should be followed in the order they appear. **Make sure the GitLab version is the same on all nodes.**
+
+### Using Omnibus GitLab
+
+If you installed GitLab using the Omnibus packages (highly recommended):
+
+1. [Install GitLab Enterprise Edition](https://about.gitlab.com/installation/) on the server that will serve as the **secondary** node. Do not create an account or log in to the new **secondary** node.
+1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** node to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher.
+1. [Set up the database replication](database.md) (`primary (read-write) <-> secondary (read-only)` topology).
+1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** nodes.
+1. [Configure GitLab](configuration.md) to set the **primary** and **secondary** nodes.
+1. Optional: [Configure a secondary LDAP server](../../auth/ldap.md) for the **secondary** node. See [notes on LDAP](#ldap).
+1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
+
+### Using GitLab installed from source (Deprecated)
+
+NOTE: **Note:**
+In GitLab 11.5, support for using Geo in GitLab source installations was deprecated and will be removed in a future release. Please consider [migrating to GitLab Omnibus install](https://docs.gitlab.com/omnibus/update/convert_to_omnibus.html).
+
+If you installed GitLab from source:
+
+1. [Install GitLab Enterprise Edition](../../../install/installation.md) on the server that will serve as the **secondary** node. Do not create an account or log in to the new **secondary** node.
+1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** node to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher.
+1. [Set up the database replication](database_source.md) (`primary (read-write) <-> secondary (read-only)` topology).
+1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). Do this step for **both** **primary** and **secondary** nodes.
+1. [Configure GitLab](configuration_source.md) to set the **primary** and **secondary** nodes.
+1. [Follow the "Using a Geo Server" guide](using_a_geo_server.md).
+
+## Post-installation documentation
+
+After installing GitLab on the **secondary** nodes and performing the initial configuration, see the following documentation for post-installation information.
+
+### Configuring Geo
+
+For information on configuring Geo, see:
+
+- [Geo configuration (GitLab Omnibus)](configuration.md).
+- [Geo configuration (source)](configuration_source.md). Configuring Geo in GitLab source installations was **deprecated** in GitLab 11.5.
+
+### Updating Geo
+
+For information on how to update your Geo nodes to the latest GitLab version, see [Updating the Geo nodes](updating_the_geo_nodes.md).
+
+### Configuring Geo high availability
+
+For information on configuring Geo for high availability, see [Geo High Availability](high_availability.md).
+
+### Configuring Geo with Object Storage
+
+For information on configuring Geo with object storage, see [Geo with Object storage](object_storage.md).
+
+### Disaster Recovery
+
+For information on using Geo in disaster recovery situations to mitigate data-loss and restore services, see [Disaster Recovery](../disaster_recovery/index.md).
+
+### Replicating the Container Registry
+
+For more information on how to replicate the Container Registry, see [Docker Registry for a **secondary** node](docker_registry.md).
+
+### Security Review
+
+For more information on Geo security, see [Geo security review](security_review.md).
+
+### Tuning Geo
+
+For more information on tuning Geo, see [Tuning Geo](tuning.md).
+
+## Remove Geo node
+
+For more information on removing a Geo node, see [Removing **secondary** Geo nodes](remove_geo_node.md).
+
+## Current limitations
+
+CAUTION: **Caution:**
+This list of limitations only reflects the latest version of GitLab. If you are using an older version, extra limitations may be in place.
+
+- Pushing directly to a **secondary** node redirects (for HTTP) or proxies (for SSH) the request to the **primary** node instead of [handling it directly](https://gitlab.com/gitlab-org/gitlab-ee/issues/1381), except when using Git over HTTP with credentials embedded within the URI. For example, `https://user:password@secondary.tld`.
+- The **primary** node has to be online for OAuth login to happen. Existing sessions and Git are not affected.
+- The installation takes multiple manual steps that together can take about an hour depending on circumstances. We are working on improving this experience. See [gitlab-org/omnibus-gitlab#2978](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2978) for details.
+- Real-time updates of issues/merge requests (for example, via long polling) don't work on the **secondary** node.
+- [Selective synchronization](configuration.md#selective-synchronization) applies only to files and repositories. Other datasets are replicated to the **secondary** node in full, making it inappropriate for use as an access control mechanism.
+- Object pools for forked project deduplication work only on the **primary** node, and are duplicated on the **secondary** node.
+- [External merge request diffs](../../merge_request_diffs.md) will not be replicated if they are on-disk, and viewing merge requests will fail. However, external MR diffs in object storage **are** supported. The default configuration (in-database) does work.
+
+### Limitations on replication
+
+Only the following items are replicated to the **secondary** node:
+
+- All database content. For example, snippets, epics, issues, merge requests, groups, and project metadata.
+- Project repositories.
+- Project wiki repositories.
+- User uploads. For example, attachments to issues, merge requests, epics, and avatars.
+- CI job artifacts and traces.
+
+DANGER: **Danger:**
+Data not on this list is unavailable on the **secondary** node. Failing over without manually replicating data not on this list will cause the data to be **lost**.
+
+### Examples of data not replicated
+
+Take special note that these examples of GitLab features are both:
+
+- Commonly used.
+- **Not** replicated by Geo at present.
+
+Examples include:
+
+- [Elasticsearch integration](../../../integration/elasticsearch.md).
+- [Container Registry](../../container_registry.md). [Object Storage](object_storage.md) can mitigate this.
+- [GitLab Pages](../../pages/index.md).
+- [Mattermost integration](https://docs.gitlab.com/omnibus/gitlab-mattermost/).
+
+CAUTION: **Caution:**
+If you wish to use them on a **secondary** node, or to execute a failover successfully, you will need to replicate their data using some other means.
+
+## Frequently Asked Questions
+
+For answers to common questions, see the [Geo FAQ](faq.md).
+
+## Log files
+
+Since GitLab 9.5, Geo stores structured log messages in a `geo.log` file. For Omnibus installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`.
+
+This file contains information about when Geo attempts to sync repositories and files. Each line in the file contains a separate JSON entry that can be ingested into Elasticsearch, Splunk, etc.
+
+For example:
+
+```json
+{"severity":"INFO","time":"2017-08-06T05:40:16.104Z","message":"Repository update","project_id":1,"source":"repository","resync_repository":true,"resync_wiki":true,"class":"Gitlab::Geo::LogCursor::Daemon","cursor_delay_s":0.038}
+```
+
+This message shows that Geo detected that a repository update was needed for project `1`.
+
+## Troubleshooting
+
+For troubleshooting steps, see [Geo Troubleshooting](troubleshooting.md).
diff --git a/doc/administration/geo/replication/object_storage.md b/doc/administration/geo/replication/object_storage.md
new file mode 100644
index 00000000000..adc298e2682
--- /dev/null
+++ b/doc/administration/geo/replication/object_storage.md
@@ -0,0 +1,43 @@
+# Geo with Object storage
+
+Geo can be used in combination with Object Storage (AWS S3, or
+other compatible object storage).
+
+## Configuration
+
+At this time, if object storage is enabled on the
+**primary** node, it must also be enabled on each **secondary** node.
+
+**Secondary** nodes can use the same storage bucket as the **primary** node, or
+they can use a replicated storage bucket. At this time GitLab does not
+take care of content replication in object storage.
+
+For LFS, follow the documentation to
+[set up LFS object storage](../../../workflow/lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage).
+
+For CI job artifacts, there is similar documentation to configure
+[jobs artifact object storage](../../job_artifacts.md#using-object-storage).
+
+For user uploads, there is similar documentation to configure [upload object storage](../../uploads.md#using-object-storage-core-only).
+
+You should enable and configure object storage on both **primary** and **secondary**
+nodes.
Migrating existing data to object storage should be performed on the +**primary** node only. **Secondary** nodes will automatically notice that the migrated +files are now in object storage. + +## Replication + +When using Amazon S3, you can use +[CRR](https://docs.aws.amazon.com/AmazonS3/latest/dev/crr.html) to +have automatic replication between the bucket used by the **primary** node and +the bucket used by **secondary** nodes. + +If you are using Google Cloud Storage, consider using +[Multi-Regional Storage](https://cloud.google.com/storage/docs/storage-classes#multi-regional). +Or you can use the [Storage Transfer Service](https://cloud.google.com/storage/transfer/), +although this only supports daily synchronization. + +For manual synchronization, or scheduled by `cron`, please have a look at: + +- [`s3cmd sync`](http://s3tools.org/s3cmd-sync) +- [`gsutil rsync`](https://cloud.google.com/storage/docs/gsutil/commands/rsync) diff --git a/doc/administration/geo/replication/remove_geo_node.md b/doc/administration/geo/replication/remove_geo_node.md new file mode 100644 index 00000000000..ba5664246b2 --- /dev/null +++ b/doc/administration/geo/replication/remove_geo_node.md @@ -0,0 +1,50 @@ +# Removing secondary Geo nodes + +**Secondary** nodes can be removed from the Geo cluster using the Geo admin page of the **primary** node. To remove a **secondary** node: + +1. Navigate to **Admin Area > Geo** (`/admin/geo/nodes`). +1. Click the **Remove** button for the **secondary** node you want to remove. +1. Confirm by clicking **Remove** when the prompt appears. + +Once removed from the Geo admin page, you must stop and uninstall the **secondary** node: + +1. On the **secondary** node, stop GitLab: + + ```bash + sudo gitlab-ctl stop + ``` +1. 
On the **secondary** node, uninstall GitLab: + + ```bash + # Stop gitlab and remove its supervision process + sudo gitlab-ctl uninstall + + # Debian/Ubuntu + sudo dpkg --remove gitlab-ee + + # Redhat/Centos + sudo rpm --erase gitlab-ee + ``` + +Once GitLab has been uninstalled from the **secondary** node, the replication slot must be dropped from the **primary** node's database as follows: + +1. On the **primary** node, start a PostgreSQL console session: + + ```bash + sudo gitlab-psql + ``` + + NOTE: **Note:** + Using `gitlab-rails dbconsole` will not work, because managing replication slots requires superuser permissions. + +1. Find the name of the relevant replication slot. This is the slot that is specified with `--slot-name` when running the replicate command: `gitlab-ctl replicate-geo-database`. + + ```sql + SELECT * FROM pg_replication_slots; + ``` + +1. Remove the replication slot for the **secondary** node: + + ```sql + SELECT pg_drop_replication_slot('<name_of_slot>'); + ``` diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md new file mode 100644 index 00000000000..f77527ae8a7 --- /dev/null +++ b/doc/administration/geo/replication/security_review.md @@ -0,0 +1,287 @@ +# Geo security review (Q&A) + +The following security review of the Geo feature set focuses on security +aspects of the feature as they apply to customers running their own GitLab +instances. The review questions are based in part on the [application security architecture](https://www.owasp.org/index.php/Application_Security_Architecture_Cheat_Sheet) +questions from [owasp.org](https://www.owasp.org). + +## Business Model + +### What geographic areas does the application service? + +- This varies by customer. Geo allows customers to deploy to multiple areas, + and they get to choose where they are. +- Region and node selection is entirely manual. 
+ +## Data Essentials + +### What data does the application receive, produce, and process? + +- Geo streams almost all data held by a GitLab instance between sites. This + includes full database replication, most files (user-uploaded attachments, + etc) and repository + wiki data. In a typical configuration, this will + happen across the public Internet, and be TLS-encrypted. +- PostgreSQL replication is TLS-encrypted. +- See also: [only TLSv1.2 should be supported](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2948) + +### How can the data be classified into categories according to its sensitivity? + +- GitLab’s model of sensitivity is centered around public vs. internal vs. + private projects. Geo replicates them all indiscriminately. “Selective sync” + exists for files and repositories (but not database content), which would permit + only less-sensitive projects to be replicated to a **secondary** node if desired. +- See also: [developing a data classification policy](https://gitlab.com/gitlab-com/security/issues/4). + +### What data backup and retention requirements have been defined for the application? + +- Geo is designed to provide replication of a certain subset of the application + data. It is part of the solution, rather than part of the problem. + +## End-Users + +### Who are the application's end‐users? + +- **Secondary** nodes are created in regions that are distant (in terms of + Internet latency) from the main GitLab installation (the **primary** node). They are + intended to be used by anyone who would ordinarily use the **primary** node, who finds + that the **secondary** node is closer to them (in terms of Internet latency). + +### How do the end‐users interact with the application? + +- **Secondary** nodes provide all the interfaces a **primary** node does + (notably a HTTP/HTTPS web application, and HTTP/HTTPS or SSH git repository + access), but is constrained to read-only activities. 
The principal use case is + envisioned to be cloning git repositories from the **secondary** node in favor of the + **primary** node, but end-users may use the GitLab web interface to view projects, + issues, merge requests, snippets, etc. + +### What security expectations do the end‐users have? + +- The replication process must be secure. It would typically be unacceptable to + transmit the entire database contents or all files and repositories across the + public Internet in plaintext, for instance. +- **Secondary** nodes must have the same access controls over its content as the + **primary** node - unauthenticated users must not be able to gain access to privileged + information on the **primary** node by querying the **secondary** node. +- Attackers must not be able to impersonate the **secondary** node to the **primary** node, and + thus gain access to privileged information. + +## Administrators + +### Who has administrative capabilities in the application? + +- Nothing Geo-specific. Any user where `admin: true` is set in the database is + considered an admin with super-user privileges. +- See also: [more granular access control](https://gitlab.com/gitlab-org/gitlab-ce/issues/32730) + (not geo-specific) +- Much of Geo’s integration (database replication, for instance) must be + configured with the application, typically by system administrators. + +### What administrative capabilities does the application offer? + +- **Secondary** nodes may be added, modified, or removed by users with + administrative access. +- The replication process may be controlled (start/stop) via the Sidekiq + administrative controls. + +## Network + +### What details regarding routing, switching, firewalling, and load‐balancing have been defined? + +- Geo requires the **primary** node and **secondary** node to be able to communicate with each + other across a TCP/IP network. 
In particular, the **secondary** nodes must be able to + access HTTP/HTTPS and PostgreSQL services on the **primary** node. + +### What core network devices support the application? + +- Varies from customer to customer. + +### What network performance requirements exist? + +- Maximum replication speeds between **primary** node and **secondary** node is limited by the + available bandwidth between sites. No hard requirements exist - time to complete + replication (and ability to keep up with changes on the **primary** node) is a function + of the size of the data set, tolerance for latency, and available network + capacity. + +### What private and public network links support the application? + +- Customers choose their own networks. As sites are intended to be + geographically separated, it is envisioned that replication traffic will pass + over the public Internet in a typical deployment, but this is not a requirement. + +## Systems + +### What operating systems support the application? + +- Geo imposes no additional restrictions on operating system (see the + [GitLab installation](https://about.gitlab.com/installation/) page for more + details), however we recommend using the operating systems listed in the [Geo documentation](index.md#requirements-for-running-geo). + +### What details regarding required OS components and lock‐down needs have been defined? + +- The recommended installation method (Omnibus) packages most components itself. + A from-source installation method exists. Both are documented at + <https://docs.gitlab.com/ee/administration/geo/replication/index.html> +- There are significant dependencies on the system-installed OpenSSH daemon (Geo + requires users to set up custom authentication methods) and the omnibus or + system-provided PostgreSQL daemon (it must be configured to listen on TCP, + additional users and replication slots must be added, etc). 
+- The process for dealing with security updates (for example, if there is a + significant vulnerability in OpenSSH or other services, and the customer + wants to patch those services on the OS) is identical to the non-Geo + situation: security updates to OpenSSH would be provided to the user via the + usual distribution channels. Geo introduces no delay there. + +## Infrastructure Monitoring + +### What network and system performance monitoring requirements have been defined? + +- None specific to Geo. + +### What mechanisms exist to detect malicious code or compromised application components? + +- None specific to Geo. + +### What network and system security monitoring requirements have been defined? + +- None specific to Geo. + +## Virtualization and Externalization + +### What aspects of the application lend themselves to virtualization? + +- All. + +### What virtualization requirements have been defined for the application? + +- Nothing Geo-specific, but everything in GitLab needs to have full + functionality in such an environment. + +### What aspects of the product may or may not be hosted via the cloud computing model? + +- GitLab is “cloud native” and this applies to Geo as much as to the rest of the + product. Deployment in clouds is a common and supported scenario. + +### If applicable, what approach(es) to cloud computing will be taken (Managed Hosting versus "Pure" Cloud, a "full machine" approach such as AWS-EC2 versus a "hosted database" approach such as AWS-RDS and Azure, etc)? + +- To be decided by our customers, according to their operational needs. + +## Environment + +### What frameworks and programming languages have been used to create the application? + +- Ruby on Rails, Ruby. + +### What process, code, or infrastructure dependencies have been defined for the application? + +- Nothing specific to Geo. + +### What databases and application servers support the application? + +- PostgreSQL >= 9.6, Redis, Sidekiq, Unicorn. 
+ +### How will database connection strings, encryption keys, and other sensitive components be stored, accessed, and protected from unauthorized detection? + +- There are some Geo-specific values. Some are shared secrets which must be + securely transmitted from the **primary** node to the **secondary** node at setup time. Our + documentation recommends transmitting them from the **primary** node to the system + administrator via SSH, and then back out to the **secondary** node in the same manner. + In particular, this includes the PostgreSQL replication credentials and a secret + key (`db_key_base`) which is used to decrypt certain columns in the database. + The `db_key_base` secret is stored unencrypted on the filesystem, in + `/etc/gitlab/gitlab-secrets.json`, along with a number of other secrets. There is + no at-rest protection for them. + +## Data Processing + +### What data entry paths does the application support? + +- Data is entered via the web application exposed by GitLab itself. Some data is + also entered using system administration commands on the GitLab servers (e.g., + `gitlab-ctl set-primary-node`). +- **Secondary** nodes also receive inputs via PostgreSQL streaming replication from the **primary** node. + +### What data output paths does the application support? + +- **Primary** nodes output via PostgreSQL streaming replication to the **secondary** node. + Otherwise, principally via the web application exposed by GitLab itself, and via + SSH `git clone` operations initiated by the end-user. + +### How does data flow across the application's internal components? + +- **Secondary** nodes and **primary** nodes interact via HTTP/HTTPS (secured with JSON web + tokens) and via PostgreSQL streaming replication. +- Within a **primary** node or **secondary** node, the SSOT is the filesystem and the database + (including Geo tracking database on **secondary** node). The various internal components + are orchestrated to make alterations to these stores. 
+ +### What data input validation requirements have been defined? + +- **Secondary** nodes must have a faithful replication of the **primary** node’s data. + +### What data does the application store and how? + +- Git repositories and files, tracking information related to them, and the GitLab database contents. + +### What data is or may need to be encrypted and what key management requirements have been defined? + +- Neither **primary** nodes nor **secondary** nodes encrypt Git repository or filesystem data at + rest. A subset of database columns are encrypted at rest using the `db_otp_key`. +- A static secret shared across all hosts in a GitLab deployment. +- In transit, data should be encrypted, although the application does permit + communication to proceed unencrypted. The two main transits are the **secondary** node’s + replication process for PostgreSQL, and for git repositories/files. Both should + be protected using TLS, with the keys for that managed via Omnibus per existing + configuration for end-user access to GitLab. + +### What capabilities exist to detect the leakage of sensitive data? + +- Comprehensive system logs exist, tracking every connection to GitLab and PostgreSQL. + +### What encryption requirements have been defined for data in transit - including transmission over WAN, LAN, SecureFTP, or publicly accessible protocols such as http: and https:? + +- Data must have the option to be encrypted in transit, and be secure against + both passive and active attack (e.g., MITM attacks should not be possible). + +## Access + +### What user privilege levels does the application support? + +- Geo adds one type of privilege: **secondary** nodes can access a special Geo API to + download files over HTTP/HTTPS, and to clone repositories using HTTP/HTTPS. + +### What user identification and authentication requirements have been defined? 
+ +- **Secondary** nodes identify to Geo **primary** nodes via OAuth or JWT authentication + based on the shared database (HTTP access) or a PostgreSQL replication user (for + database replication). The database replication also requires IP-based access + controls to be defined. + +### What user authorization requirements have been defined? + +- **Secondary** nodes must only be able to *read* data. They are not currently able to mutate data on the **primary** node. + +### What session management requirements have been defined? + +- Geo JWTs are defined to last for only two minutes before needing to be regenerated. +- Geo JWTs are generated for one of the following specific scopes: + - Geo API access. + - Git access. + - LFS and File ID. + - Upload and File ID. + - Job Artifact and File ID. +- Geo JWTs scopes are not enforced for Git Access yet, but will be in a future version (currently scheduled for GitLab 11.10). + +### What access requirements have been defined for URI and Service calls? + +- **Secondary** nodes make many calls to the **primary** node's API. This is how file + replication proceeds, for instance. This endpoint is only accessible with a JWT token. +- The **primary** node also makes calls to the **secondary** node to get status information. + +## Application Monitoring + +### What application auditing requirements have been defined? How are audit and debug logs accessed, stored, and secured? + +- Structured JSON log is written to the filesystem, and can also be ingested + into a Kibana installation for further analysis. diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md new file mode 100644 index 00000000000..6fea03cc8ec --- /dev/null +++ b/doc/administration/geo/replication/troubleshooting.md @@ -0,0 +1,384 @@ +# Geo Troubleshooting + +NOTE: **Note:** +This list is an attempt to document all the moving parts that can go wrong. 
+We are working on getting all these steps verified automatically in a +rake task in the future. + +Setting up Geo requires careful attention to detail, and sometimes it's easy to +miss a step. Here is a list of questions you should ask to try to detect +what you need to fix (all commands and path locations are for Omnibus installs): + +## First check the health of the **secondary** node + +Visit the **primary** node's **Admin Area > Geo** (`/admin/geo/nodes`) in +your browser. We perform the following health checks on each **secondary** node +to help identify if something is wrong: + +- Is the node running? +- Is the node's secondary database configured for streaming replication? +- Is the node's secondary tracking database configured? +- Is the node's secondary tracking database connected? +- Is the node's secondary tracking database up-to-date? + +![Geo health check](img/geo_node_healthcheck.png) + +There is also an option to check the status of the **secondary** node by running a special rake task: + +```sh +sudo gitlab-rake geo:status +``` + +## Is Postgres replication working? + +### Are my nodes pointing to the correct database instance? + +You should make sure your **primary** Geo node points to the instance with +writing permissions. + +Any **secondary** nodes should point only to read-only instances. + +### Can Geo detect my current node correctly? + +Geo uses the defined node from the **Admin Area > Geo** screen, and tries to match +it with the value defined in the `/etc/gitlab/gitlab.rb` configuration file. +The relevant line looks like: `external_url "http://gitlab.example.com"`. 
+ +To check if the node on the current machine is correctly detected type: + +```sh +sudo gitlab-rails runner "puts Gitlab::Geo.current_node.inspect" +``` + +and expect something like: + +``` +#<GeoNode id: 2, schema: "https", host: "gitlab.example.com", port: 443, relative_url_root: "", primary: false, ...> +``` + +By running the command above, `primary` should be `true` when executed in +the **primary** node, and `false` on any **secondary** node. + +## How do I fix the message, "ERROR: replication slots can only be used if max_replication_slots > 0"? + +This means that the `max_replication_slots` PostgreSQL variable needs to +be set on the **primary** database. In GitLab 9.4, we have made this setting +default to 1. You may need to increase this value if you have more +**secondary** nodes. Be sure to restart PostgreSQL for this to take +effect. See the [PostgreSQL replication +setup][database-pg-replication] guide for more details. + +## How do I fix the message, "FATAL: could not start WAL streaming: ERROR: replication slot "geo_secondary_my_domain_com" does not exist"? + +This occurs when PostgreSQL does not have a replication slot for the +**secondary** node by that name. You may want to rerun the [replication +process](database.md) on the **secondary** node . + +## How do I fix the message, "Command exceeded allowed execution time" when setting up replication? + +This may happen while [initiating the replication process][database-start-replication] on the **secondary** node, +and indicates that your initial dataset is too large to be replicated in the default timeout (30 minutes). + +Re-run `gitlab-ctl replicate-geo-database`, but include a larger value for +`--backup-timeout`: + +```sh +sudo gitlab-ctl replicate-geo-database --host=primary.geo.example.com --slot-name=secondary_geo_example_com --backup-timeout=21600 +``` + +This will give the initial replication up to six hours to complete, rather than +the default thirty minutes. 
Adjust as required for your installation. + +## How do I fix the message, "PANIC: could not write to file 'pg_xlog/xlogtemp.123': No space left on device" + +Determine if you have any unused replication slots in the **primary** database. This can cause large amounts of +log data to build up in `pg_xlog`. Removing the unused slots can reduce the amount of space used in the `pg_xlog`. + +1. Start a PostgreSQL console session: + + ```sh + sudo gitlab-psql gitlabhq_production + ``` + + > Note that using `gitlab-rails dbconsole` will not work, because managing replication slots requires superuser permissions. + +1. View your replication slots with: + + ```sql + SELECT * FROM pg_replication_slots; + ``` + +Slots where `active` is `f` are not active. + +- If the slot should be active, because you have a **secondary** node configured using that slot, + log in to that **secondary** node and check the PostgreSQL logs to find out why replication is not running. + +- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it with the + following query in the PostgreSQL console session: + + ```sql + SELECT pg_drop_replication_slot('name_of_extra_slot'); + ``` + +## Very large repositories never successfully synchronize on the **secondary** node + +GitLab places a timeout on all repository clones, including project imports +and Geo synchronization operations. If a fresh `git clone` of a repository +on the primary takes more than a few minutes, you may be affected by this. +To increase the timeout, add the following line to `/etc/gitlab/gitlab.rb` +on the **secondary** node: + +```ruby +gitlab_rails['gitlab_shell_git_timeout'] = 10800 +``` + +Then reconfigure GitLab: + +```sh +sudo gitlab-ctl reconfigure +``` + +This will increase the timeout to three hours (10800 seconds). Choose a time +long enough to accommodate a full clone of your largest repositories. 
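One rough way to pick a timeout value is to time a fresh clone of your largest repository from the **primary** node and then add headroom. The sketch below only illustrates that arithmetic. The helper name `suggest_git_timeout` and the 2x headroom factor are assumptions made for this example; they are not part of GitLab and not a GitLab recommendation.

```sh
# Hypothetical helper (not a GitLab command): convert a measured clone
# duration in seconds into a candidate value for
# gitlab_rails['gitlab_shell_git_timeout'].
# The 2x headroom factor is an assumption; adjust it for your environment.
suggest_git_timeout() {
  measured_seconds="$1"
  echo $(( measured_seconds * 2 ))
}

# Example: a clone measured at 5400 seconds (90 minutes)
suggest_git_timeout 5400   # prints 10800
```

With a measured clone time of 90 minutes, this yields 10800 seconds, which matches the three-hour value used in the configuration example for this section.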
+ +## How to reset Geo **secondary** node replication + +If you get a **secondary** node in a broken state and want to reset the replication state, +to start again from scratch, there are a few steps that can help you: + +1. Stop Sidekiq and the Geo LogCursor + + It's possible to make Sidekiq stop gracefully, by making it stop getting new jobs and + wait until the current jobs finish processing. + + You need to send a **SIGTSTP** kill signal for the first phase and then a **SIGTERM** + when all jobs have finished. Otherwise just use the `gitlab-ctl stop` commands. + + ```sh + gitlab-ctl status sidekiq + # run: sidekiq: (pid 10180) <- this is the PID you will use + kill -TSTP 10180 # change to the correct PID + + gitlab-ctl stop sidekiq + gitlab-ctl stop geo-logcursor + ``` + + You can watch the Sidekiq logs to know when Sidekiq job processing has finished: + + ```sh + gitlab-ctl tail sidekiq + ``` + +1. Rename repository storage folders and create new ones + + ```sh + mv /var/opt/gitlab/git-data/repositories /var/opt/gitlab/git-data/repositories.old + mkdir -p /var/opt/gitlab/git-data/repositories + chown git:git /var/opt/gitlab/git-data/repositories + ``` + + TIP: **Tip** + You may want to remove the `/var/opt/gitlab/git-data/repositories.old` in the future + as soon as you have confirmed that you don't need it anymore, to save disk space. + +1. _(Optional)_ Rename other data folders and create new ones + + CAUTION: **Caution**: + You may still have files on the **secondary** node that have been removed from the **primary** node but + whose removal has not been reflected. If you skip this step, they will never be removed + from this Geo node. + + Any uploaded content like file attachments, avatars or LFS objects is stored in a + subfolder in one of the two paths below: + + 1. /var/opt/gitlab/gitlab-rails/shared + 1. 
/var/opt/gitlab/gitlab-rails/uploads + + To rename all of them: + + ```sh + gitlab-ctl stop + + mv /var/opt/gitlab/gitlab-rails/shared /var/opt/gitlab/gitlab-rails/shared.old + mkdir -p /var/opt/gitlab/gitlab-rails/shared + + mv /var/opt/gitlab/gitlab-rails/uploads /var/opt/gitlab/gitlab-rails/uploads.old + mkdir -p /var/opt/gitlab/gitlab-rails/uploads + ``` + + Reconfigure in order to recreate the folders and make sure permissions and ownership + are correct: + + ```sh + gitlab-ctl reconfigure + ``` + +1. Reset the Tracking Database + + ```sh + gitlab-rake geo:db:reset + ``` + +1. Restart previously stopped services + + ```sh + gitlab-ctl start + ``` + +## How do I fix a "Foreign Data Wrapper (FDW) is not configured" error? + +When setting up Geo, you might see this warning in the `gitlab-rake +gitlab:geo:check` output: + +``` +GitLab Geo tracking database Foreign Data Wrapper schema is up-to-date? ... foreign data wrapper is not configured +``` + +There are a few key points to remember: + +1. The FDW settings are configured on the Geo **tracking** database. +1. The configured foreign server enables a login to the Geo +**secondary**, read-only database. + +By default, the Geo secondary and tracking database are running on the +same host on different ports. That is, 5432 and 5431 respectively. + +### Checking configuration + +NOTE: **Note:** +The following steps are for Omnibus installs only. Using Geo with source-based installs [is deprecated](index.md#using-gitlab-installed-from-source-deprecated). + +To check the configuration: + +1. Enter the database console: + + ```sh + gitlab-geo-psql + ``` + +1. Check whether any tables are present. 
If everything is working, you +should see something like this: + + ```sql + gitlabhq_geo_production=# SELECT * from information_schema.foreign_tables; + foreign_table_catalog | foreign_table_schema | foreign_table_name | foreign_server_catalog | foreign_server_n + ame + -------------------------+----------------------+-------------------------------------------------+-------------------------+----------------- + ---- + gitlabhq_geo_production | gitlab_secondary | abuse_reports | gitlabhq_geo_production | gitlab_secondary + gitlabhq_geo_production | gitlab_secondary | appearances | gitlabhq_geo_production | gitlab_secondary + gitlabhq_geo_production | gitlab_secondary | application_setting_terms | gitlabhq_geo_production | gitlab_secondary + gitlabhq_geo_production | gitlab_secondary | application_settings | gitlabhq_geo_production | gitlab_secondary + <snip> + ``` + + However, if the query returns with `0 rows`, then continue onto the next steps. + +1. Check that the foreign server mapping is correct via `\des+`. The + results should look something like this: + + ```sql + gitlabhq_geo_production=# \des+ + List of foreign servers + -[ RECORD 1 ]--------+------------------------------------------------------------ + Name | gitlab_secondary + Owner | gitlab-psql + Foreign-data wrapper | postgres_fdw + Access privileges | "gitlab-psql"=U/"gitlab-psql" + + | gitlab_geo=U/"gitlab-psql" + Type | + Version | + FDW Options | (host '0.0.0.0', port '5432', dbname 'gitlabhq_production') + Description | + ``` + + NOTE: **Note:** Pay particular attention to the host and port under + FDW options. That configuration should point to the Geo secondary + database. 
+ + If you need to experiment with changing the host or password, the + following queries demonstrate how: + + ```sql + ALTER SERVER gitlab_secondary OPTIONS (SET host 'my-new-host'); + ALTER SERVER gitlab_secondary OPTIONS (SET port 5432); + ``` + + If you change the host and/or port, you will also have to adjust the + following settings in `/etc/gitlab/gitlab.rb` and run `gitlab-ctl + reconfigure`: + + - `gitlab_rails['db_host']` + - `gitlab_rails['db_port']` + +1. Check that the user mapping is configured properly via `\deu+`: + + ```sql + gitlabhq_geo_production=# \deu+ + List of user mappings + Server | User name | FDW Options + ------------------+------------+-------------------------------------------------------------------------------- + gitlab_secondary | gitlab_geo | ("user" 'gitlab', password 'YOUR-PASSWORD-HERE') + (1 row) + ``` + + Make sure the password is correct. You can test that logins work by running `psql`: + + ```sh + # Connect to the tracking database as the `gitlab_geo` user + sudo -u git /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/geo-postgresql -p 5431 -U gitlab_geo -W -d gitlabhq_geo_production + ``` + + If you need to correct the password, the following query shows how: + + ```sql + ALTER USER MAPPING FOR gitlab_geo SERVER gitlab_secondary OPTIONS (SET password 'my-new-password'); + ``` + + If you change the user or password, you will also have to adjust the + following settings in `/etc/gitlab/gitlab.rb` and run `gitlab-ctl + reconfigure`: + + - `gitlab_rails['db_username']` + - `gitlab_rails['db_password']` + + If you are using [PgBouncer in front of the secondary + database](database.md#pgbouncer-support-optional), be sure to update + the following settings: + + - `geo_postgresql['fdw_external_user']` + - `geo_postgresql['fdw_external_password']` + +### Manual reload of FDW schema + +If you're still unable to get FDW working, you may want to try a manual +reload of the FDW schema. To manually reload the FDW schema: + +1. 
On the node running the Geo tracking database, enter the PostgreSQL console via + the `gitlab_geo` user: + + ```sh + sudo -u git /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/geo-postgresql -p 5431 -U gitlab_geo -W -d gitlabhq_geo_production + ``` + + Be sure to adjust the port and hostname for your configuration. You + may be asked to enter a password. + +1. Reload the schema via: + + ```sql + DROP SCHEMA IF EXISTS gitlab_secondary CASCADE; + CREATE SCHEMA gitlab_secondary; + GRANT USAGE ON FOREIGN SERVER gitlab_secondary TO gitlab_geo; + IMPORT FOREIGN SCHEMA public FROM SERVER gitlab_secondary INTO gitlab_secondary; + ``` + +1. Test that queries work: + + ```sql + SELECT * from information_schema.foreign_tables; + SELECT * FROM gitlab_secondary.projects limit 1; + ``` + +[database-start-replication]: database.md#step-3-initiate-the-replication-process +[database-pg-replication]: database.md#postgresql-replication diff --git a/doc/administration/geo/replication/tuning.md b/doc/administration/geo/replication/tuning.md new file mode 100644 index 00000000000..246e56ba43b --- /dev/null +++ b/doc/administration/geo/replication/tuning.md @@ -0,0 +1,19 @@ +# Tuning Geo + +## Changing the sync capacity values + +In the Geo admin page (`/admin/geo/nodes`), there are several variables that +can be tuned to improve performance of Geo: + +- Repository sync capacity. +- File sync capacity. + +Increasing these values will increase the number of jobs that are scheduled. +However, this may not lead to more downloads in parallel unless the number of +available Sidekiq threads is also increased. For example, if repository sync +capacity is increased from 25 to 50, you may also want to increase the number +of Sidekiq threads from 25 to 50. See the [Sidekiq concurrency +documentation][sidekiq-concurrency] +for more details. 
+ +[sidekiq-concurrency]: ../../operations/extra_sidekiq_processes.html#concurrency diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md new file mode 100644 index 00000000000..a6af8716228 --- /dev/null +++ b/doc/administration/geo/replication/updating_the_geo_nodes.md @@ -0,0 +1,404 @@ +# Updating the Geo nodes + +Depending on which version of Geo you are updating to/from, there may be +different steps. + +## General update steps + +In order to update the Geo nodes when a new GitLab version is released, +all you need to do is update GitLab itself: + +1. Log into each node (**primary** and **secondary** nodes). +1. [Update GitLab][update]. +1. [Update tracking database on **secondary** node](#update-tracking-database-on-secondary-node) when + the tracking database is enabled. +1. [Test](#check-status-after-updating) **primary** and **secondary** nodes, and check version in each. + +## Upgrading to GitLab 10.8 + +Before 10.8, broadcast messages would not propagate without flushing the cache on the **secondary** nodes. This has been fixed in 10.8, but requires one last cache flush on each **secondary** node: + +```sh +sudo gitlab-rake cache:clear +``` + +## Upgrading to GitLab 10.6 + +In 10.4, we started to recommend that you define a password for database user (`gitlab`). + +We now require this change as we use this password to enable the Foreign Data Wrapper, as a way to optimize +the Geo Tracking Database. We are also improving security by disabling the use of **trust** +authentication method. + +1. 
**[primary]** Log in to your **primary** node and run: + + ```sh + gitlab-ctl pg-password-md5 gitlab + # Enter password: mypassword + # Confirm password: mypassword + # fca0b89a972d69f00eb3ec98a5838484 + ``` + + Copy the generated hash and edit `/etc/gitlab/gitlab.rb`: + + ```ruby + # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab` + postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484' + + # Every node that runs Unicorn or Sidekiq needs to have the database + # password specified as below. If you have a high-availability setup, this + # must be present in all application nodes. + gitlab_rails['db_password'] = 'mypassword' + ``` + + Still in the configuration file, locate and remove the `trust_auth_cidr_addresses` setting: + + ```ruby + postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','1.2.3.4/32'] # <- Remove this + ``` + +1. **[primary]** Reconfigure and restart: + + ```sh + sudo gitlab-ctl reconfigure + sudo gitlab-ctl restart + ``` + +1. **[secondary]** Log in to all **secondary** nodes and edit `/etc/gitlab/gitlab.rb`: + + ```ruby + # Fill with the hash generated by `gitlab-ctl pg-password-md5 gitlab` + postgresql['sql_user_password'] = 'fca0b89a972d69f00eb3ec98a5838484' + + # Every node that runs Unicorn or Sidekiq needs to have the database + # password specified as below. If you have a high-availability setup, this + # must be present in all application nodes. + gitlab_rails['db_password'] = 'mypassword' + + # Enable Foreign Data Wrapper + geo_secondary['db_fdw'] = true + + # Secondary address + # - replace '5.6.7.8' with the secondary public or VPC address + postgresql['md5_auth_cidr_addresses'] = ['5.6.7.8/32'] + ``` + + Still in the configuration file, locate and remove the `trust_auth_cidr_addresses` setting: + + ```ruby + postgresql['trust_auth_cidr_addresses'] = ['127.0.0.1/32','5.6.7.8/32'] # <- Remove this + ``` + +1.
**[secondary]** Reconfigure and restart: + + ```sh + sudo gitlab-ctl reconfigure + sudo gitlab-ctl restart + ``` + +## Upgrading to GitLab 10.5 + +For Geo Disaster Recovery to work with minimum downtime, your **secondary** node +should use the same set of secrets as the **primary** node. However, setup instructions +prior to the 10.5 release only synchronized the `db_key_base` secret. + +To rectify this error on existing installations, you should **overwrite** the +contents of `/etc/gitlab/gitlab-secrets.json` on each **secondary** node with the +contents of `/etc/gitlab/gitlab-secrets.json` on the **primary** node, then run the +following command on each **secondary** node: + +```sh +sudo gitlab-ctl reconfigure +``` + +If you do not perform this step, you may find that two-factor authentication +[is broken following DR](../disaster_recovery/index.html#i-followed-the-disaster-recovery-instructions-and-now-two-factor-auth-is-broken). + +To prevent SSH requests to the newly promoted **primary** node from failing +due to SSH host key mismatch when updating the **primary** node domain's DNS record +you should perform the step to [Manually replicate **primary** SSH host keys](configuration.md#step-2-manually-replicate-the-primary-nodes-ssh-host-keys) in each +**secondary** node. + +## Upgrading to GitLab 10.4 + +There are no Geo-specific steps to take! + +## Upgrading to GitLab 10.3 + +### Support for SSH repository synchronization removed + +In GitLab 10.2, synchronizing secondaries over SSH was deprecated. In 10.3, +support is removed entirely. All installations will switch to the HTTP/HTTPS +cloning method instead. Before upgrading, ensure that all your Geo nodes are +configured to use this method and that it works for your installation. In +particular, ensure that [Git access over HTTP/HTTPS is enabled](configuration.md#step-6-enable-git-access-over-httphttps). 
+ +Synchronizing repositories over the public Internet using HTTP is insecure, so +you should ensure that you have HTTPS configured before upgrading. Note that +file synchronization is **also** insecure in these cases! + +## Upgrading to GitLab 10.2 + +### Secure PostgreSQL replication + +Support for TLS-secured PostgreSQL replication has been added. If you are +currently using PostgreSQL replication across the open internet without an +external means of securing the connection (e.g., a site-to-site VPN), then you +should immediately reconfigure your **primary** and **secondary** PostgreSQL instances +according to the [updated instructions][database]. + +If you *are* securing the connections externally and wish to continue doing so, +ensure you include the new option `--sslmode=prefer` in future invocations of +`gitlab-ctl replicate-geo-database`. + +### HTTPS repository sync + +Support for replicating repositories and wikis over HTTP/HTTPS has been added. +Replicating over SSH has been deprecated, and support for this option will be +removed in a future release. + +To switch to HTTP/HTTPS replication, log into the **primary** node as an admin and visit +**Admin Area > Geo** (`/admin/geo/nodes`). For each **secondary** node listed, +press the "Edit" button, change the "Repository cloning" setting from +"SSH (deprecated)" to "HTTP/HTTPS", and press "Save changes". This should take +effect immediately. + +Any new secondaries should be created using HTTP/HTTPS replication - this is the +default setting. + +After you've verified that HTTP/HTTPS replication is working, you should remove +the now-unused SSH keys from your secondaries, as they may cause problems if the +**secondary** node is ever promoted to a **primary** node: + +1. **[secondary]** Log in to **all** your **secondary** nodes and run: + + ```sh + sudo -u git -H rm ~git/.ssh/id_rsa ~git/.ssh/id_rsa.pub + ``` + +### Hashed Storage + +CAUTION: **Warning:** +Hashed storage is in **Alpha**.
It is considered experimental and not +production-ready. See [Hashed Storage] for more detail. + +If you previously enabled Hashed Storage and migrated all your existing +projects to Hashed Storage, disabling hashed storage will not migrate projects +to their previous project based storage path. As such, once enabled and +migrated we recommend leaving Hashed Storage enabled. + +## Upgrading to GitLab 10.1 + +CAUTION: **Warning:** +Hashed storage is in **Alpha**. It is considered experimental and not +production-ready. See [Hashed Storage] for more detail. + +[Hashed storage] was introduced in GitLab 10.0, and a [migration path][hashed-migration] +for existing repositories was added in GitLab 10.1. + +## Upgrading to GitLab 10.0 + +Since GitLab 10.0, we require all **Geo** systems to [use SSH key lookups via +the database][ssh-fast-lookup] to avoid having to maintain consistency of the +`authorized_keys` file for SSH access. Failing to do this will prevent users +from being able to clone via SSH. + +Note that in older versions of Geo, attachments downloaded on the **secondary** +nodes would be saved to the wrong directory. We recommend that you do the +following to clean this up. + +On the **secondary** Geo nodes, run as root: + +```sh +mv /var/opt/gitlab/gitlab-rails/working /var/opt/gitlab/gitlab-rails/working.old +mkdir /var/opt/gitlab/gitlab-rails/working +chmod 700 /var/opt/gitlab/gitlab-rails/working +chown git:git /var/opt/gitlab/gitlab-rails/working +``` + +You may delete `/var/opt/gitlab/gitlab-rails/working.old` any time. + +Once this is done, we advise restarting GitLab on the **secondary** nodes for the +new working directory to be used: + +```sh +sudo gitlab-ctl restart +``` + +## Upgrading from GitLab 9.3 or older + +If you started running Geo on GitLab 9.3 or older, we recommend that you +resync your **secondary** PostgreSQL databases to use replication slots. 
If you +started using Geo with GitLab 9.4 or 10.x, no further action should be +required because replication slots are used by default. However, if you +started with GitLab 9.3 and upgraded later, you should still follow the +instructions below. + +When in doubt, it does not hurt to do a resync. The easiest way to do this in +Omnibus is the following: + + 1. Make sure you have Omnibus GitLab on the **primary** server. + 1. Run `gitlab-ctl reconfigure` and `gitlab-ctl restart postgresql`. This will enable replication slots on the **primary** database. + 1. Check the steps about defining `postgresql['sql_user_password']` and `gitlab_rails['db_password']`. + 1. Make sure `postgresql['max_replication_slots']` matches the number of **secondary** Geo node locations. + 1. Install GitLab on the **secondary** server. + 1. Re-run the [database replication process][database-replication]. + +## Special update notes for 9.0.x + +> **IMPORTANT**: +With GitLab 9.0, the PostgreSQL version is upgraded to 9.6 and manual steps are +required in order to update the **secondary** nodes and keep the Streaming +Replication working. Downtime is required, so plan ahead. + +The following steps apply only if you upgrade from GitLab 8.17 to +9.0+. For previous versions, update to GitLab 8.17 first before attempting to +upgrade to 9.0+. + +--- + +Make sure to follow the steps in the exact order as they appear below and pay +extra attention to which node (either **primary** or **secondary**) you execute them on! Each step +is prefixed with the relevant node for better clarity: + +1. **[secondary]** Log in to **all** your **secondary** nodes and stop all services: + + ```sh + sudo gitlab-ctl stop + ``` + +1. **[secondary]** Make a backup of the `recovery.conf` file on **all** + **secondary** nodes to preserve PostgreSQL's credentials: + + ```sh + sudo cp /var/opt/gitlab/postgresql/data/recovery.conf /var/opt/gitlab/ + ``` + +1.
**[primary]** Update the **primary** node to GitLab 9.0 following the + [regular update docs][update]. At the end of the update, the **primary** node + will be running with PostgreSQL 9.6. + +1. **[primary]** To prevent a de-synchronization of the repository replication, + stop all services except `postgresql` as we will use it to re-initialize the + **secondary** node's database: + + ```sh + sudo gitlab-ctl stop + sudo gitlab-ctl start postgresql + ``` + +1. **[secondary]** Run the following steps on each of the **secondary** nodes: + + 1. **[secondary]** Stop all services: + + ```sh + sudo gitlab-ctl stop + ``` + + 1. **[secondary]** Prevent running database migrations: + + ```sh + sudo touch /etc/gitlab/skip-auto-migrations + ``` + + 1. **[secondary]** Move the old database to another directory: + + ```sh + sudo mv /var/opt/gitlab/postgresql{,.bak} + ``` + + 1. **[secondary]** Update to GitLab 9.0 following the [regular update docs][update]. + At the end of the update, the node will be running with PostgreSQL 9.6. + + 1. **[secondary]** Make sure all services are up: + + ```sh + sudo gitlab-ctl start + ``` + + 1. **[secondary]** Reconfigure GitLab: + + ```sh + sudo gitlab-ctl reconfigure + ``` + + 1. **[secondary]** Run the PostgreSQL upgrade command: + + ```sh + sudo gitlab-ctl pg-upgrade + ``` + + 1. **[secondary]** See the stored credentials for the database that you will + need to re-initialize the replication: + + ```sh + sudo grep -s primary_conninfo /var/opt/gitlab/recovery.conf + ``` + + 1. **[secondary]** Create the `replica.sh` script as described in the + [database configuration document][database-source-replication]. + + 1. **[secondary]** Run the recovery script using the credentials from the + previous step: + + ```sh + sudo bash /tmp/replica.sh + ``` + + 1. **[secondary]** Reconfigure GitLab: + + ```sh + sudo gitlab-ctl reconfigure + ``` + + 1. **[secondary]** Start all services: + + ```sh + sudo gitlab-ctl start + ``` + + 1. 
**[secondary]** Repeat the steps for the remaining **secondary** nodes. + +1. **[primary]** After all **secondary** nodes are updated, start all services in + **primary** node: + + ```sh + sudo gitlab-ctl start + ``` + +## Check status after updating + +Now that the update process is complete, you may want to check whether +everything is working correctly: + +1. Run the Geo raketask on all nodes, everything should be green: + + ```sh + sudo gitlab-rake gitlab:geo:check + ``` + +1. Check the **primary** node's Geo dashboard for any errors. +1. Test the data replication by pushing code to the **primary** node and see if it + is received by **secondary** nodes. + +## Update tracking database on **secondary** node + +After updating a **secondary** node, you might need to run migrations on +the tracking database. The tracking database was added in GitLab 9.1, +and it is required since 10.0. + +1. Run database migrations on tracking database: + + ```sh + sudo gitlab-rake geo:db:migrate + ``` + +1. Repeat this step for each **secondary** node. + +[update]: ../../../update/README.md +[database]: database.md +[database-replication]: database.md#step-3-initiate-the-replication-process +[database-source-replication]: database_source.md#step-3-initiate-the-replication-process +[Hashed Storage]: ../../repository_storage_types.md +[hashed-migration]: ../../raketasks/storage.md +[ssh-fast-lookup]: ../../operations/fast_ssh_key_lookup.md diff --git a/doc/administration/geo/replication/using_a_geo_server.md b/doc/administration/geo/replication/using_a_geo_server.md new file mode 100644 index 00000000000..c8bfcf01fa4 --- /dev/null +++ b/doc/administration/geo/replication/using_a_geo_server.md @@ -0,0 +1,18 @@ +[//]: # (Please update EE::GitLab::GeoGitAccess::GEO_SERVER_DOCS_URL if this file is moved) + +# Using a Geo Server + +After you set up the [database replication and configure the Geo nodes][req], use your closest GitLab node as you would a normal standalone GitLab instance. 
+ +Pushing directly to a **secondary** node (for both HTTP, SSH including git-lfs) was [introduced](https://about.gitlab.com/2018/09/22/gitlab-11-3-released/) in [GitLab Premium](https://about.gitlab.com/pricing/#self-managed) 11.3. + +Example of the output you will see when pushing to a **secondary** node: + +```bash +$ git push +> GitLab: You're pushing to a Geo secondary. +> GitLab: We'll help you by proxying this request to the primary: ssh://git@primary.geo/user/repo.git +Everything up-to-date +``` + +[req]: index.md#setup-instructions diff --git a/doc/administration/high_availability/README.md b/doc/administration/high_availability/README.md index 49fe80fb2a6..7736c4fe6f3 100644 --- a/doc/administration/high_availability/README.md +++ b/doc/administration/high_availability/README.md @@ -1,4 +1,4 @@ -# High Availability +# Scaling and High Availability GitLab supports several different types of clustering and high-availability. The solution you choose will be based on the level of scalability and @@ -13,51 +13,173 @@ of Git, developers can still commit code locally even when GitLab is not available. However, some GitLab features such as the issue tracker and Continuous Integration are not available when GitLab is down. -**Keep in mind that all Highly Available solutions come with a trade-off between +**Keep in mind that all highly-available solutions come with a trade-off between cost/complexity and uptime**. The more uptime you want, the more complex the solution. And the more complex the solution, the more work is involved in setting up and maintaining it. High availability is not free and every HA solution should balance the costs against the benefits. -## Architecture - -There are two kinds of setups: - -- active/active -- active/passive - -### Active/Active - -This architecture scales easily because all application servers handle -user requests simultaneously. The database, Redis, and GitLab application are -all deployed on separate servers. 
The configuration is **only** highly-available -if the database, Redis and storage are also configured as such. - -Follow the steps below to configure an active/active setup: +There are many options when choosing a highly-available GitLab architecture. We +recommend engaging with GitLab Support to choose the best architecture for your +use-case. This page contains some various options and guidelines based on +experience with GitLab.com and Enterprise Edition on-premises customers. + +For a detailed insight into how GitLab scales and configures GitLab.com, you can +watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac) +with [John Northrup](https://gitlab.com/northrup), one of our infrastructure +engineers, and live questions coming in from some of our customers. + +## GitLab Components + +The following components need to be considered for a scaled or highly-available +environment. In many cases components can be combined on the same nodes to reduce +complexity. + +- Unicorn/Workhorse - Web-requests (UI, API, Git over HTTP) +- Sidekiq - Asynchronous/Background jobs +- PostgreSQL - Database + - Consul - Database service discovery and health checks/failover + - PGBouncer - Database pool manager +- Redis - Key/Value store (User sessions, cache, queue for Sidekiq) + - Sentinel - Redis health check/failover manager +- Gitaly - Provides high-level RPC access to Git repositories + +## Scalable Architecture Examples + +When an organization reaches a certain threshold it will be necessary to scale +the GitLab instance. Still, true high availability may not be necessary. There +are options for scaling GitLab instances relatively easily without incurring the +infrastructure and maintenance costs of full high availability. + +### Basic Scaling + +This is the simplest form of scaling and will work for the majority of +cases. 
Backend components such as PostgreSQL, Redis and storage are offloaded +to their own nodes while the remaining GitLab components all run on 2 or more +application nodes. + +This form of scaling also works well in a cloud environment when it is more +cost-effective to deploy several small nodes rather than a single +larger one. + +- 1 PostgreSQL node +- 1 Redis node +- 2 or more GitLab application nodes (Unicorn, Workhorse, Sidekiq) +- 1 NFS/Gitaly storage server + +#### Installation Instructions + +Complete the following installation steps in order. A link at the end of each +section will bring you back to the Scalable Architecture Examples section so +you can continue with the next step. + +1. [PostgreSQL](./database.md#postgresql-in-a-scaled-environment) +1. [Redis](./redis.md#redis-in-a-scaled-environment) +1. [Gitaly](./gitaly.md) (recommended) or [NFS](./nfs.md) +1. [GitLab application nodes](./gitlab.md) + +### Full Scaling + +For very large installations it may be necessary to further split components +for maximum scalability. In a fully-scaled architecture the application node +is split into separate Sidekiq and Unicorn/Workhorse nodes. One indication that +this architecture is required is if Sidekiq queues begin to periodically increase +in size, indicating that there is contention or not enough resources. + +- 1 PostgreSQL node +- 1 Redis node +- 2 or more GitLab application nodes (Unicorn, Workhorse) +- 2 or more Sidekiq nodes +- 2 or more NFS/Gitaly storage servers + +## High Availability Architecture Examples + +When organizations require scaling *and* high availability the following +architectures can be utilized. As the introduction section at the top of this +page mentions, there is a tradeoff between cost/complexity and uptime. Be sure +this complexity is absolutely required before taking the step into full +high availability. + +For all examples below, we recommend running Consul and Redis Sentinel on +dedicated nodes. 
If Consul is running on PostgreSQL nodes or Sentinel on +Redis nodes there is a potential that high resource usage by PostgreSQL or +Redis could prevent communication between the other Consul and Sentinel nodes. +This may lead to the other nodes believing a failure has occurred and automated +failover is necessary. Isolating them from the services they monitor reduces +the chances of split-brain. + +The examples below do not really address high availability of NFS. Some enterprises +have access to NFS appliances that manage availability. This is the best case +scenario. In the future, GitLab may offer a more user-friendly solution to +[GitLab HA Storage](https://gitlab.com/gitlab-org/omnibus-gitlab/issues/2472). + +There are many options in between each of these examples. Work with GitLab Support +to understand the best starting point for your workload and adapt from there. + +### Horizontal + +This is the simplest form of high availability and scaling. It requires the +fewest number of individual servers (virtual or physical) but does have some +trade-offs and limits. + +This architecture will work well for many GitLab customers. Larger customers +may begin to notice certain events cause contention/high load - for example, +cloning many large repositories with binary files, high API usage, a large +number of enqueued Sidekiq jobs, etc. If this happens you should consider +moving to a hybrid or fully distributed architecture depending on what is causing +the contention. + +- 3 PostgreSQL nodes +- 2 Redis nodes +- 3 Consul/Sentinel nodes +- 2 or more GitLab application nodes (Unicorn, Workhorse, Sidekiq, PGBouncer) +- 1 NFS/Gitaly server + +![Horizontal architecture diagram](../img/high_availability/horizontal.png) + +### Hybrid + +In this architecture, certain components are split on dedicated nodes so high +resource usage of one component does not interfere with others. 
In larger +environments this is a good architecture to consider if you foresee or do have +contention due to certain workloads. + +- 3 PostgreSQL nodes +- 2 Redis nodes +- 3 Consul/Sentinel nodes +- 2 or more Sidekiq nodes +- 2 or more Web nodes (Unicorn, Workhorse, PGBouncer) +- 1 or more NFS/Gitaly servers + +![Hybrid architecture diagram](../img/high_availability/hybrid.png) + +### Fully Distributed + +This architecture scales to hundreds of thousands of users and projects and is +the basis of the GitLab.com architecture. While this scales well it also comes +with the added complexity of many more nodes to configure, manage and monitor. + +- 3 PostgreSQL nodes +- 4 or more Redis nodes (2 separate clusters for persistent and cache data) +- 3 Consul nodes +- 3 Sentinel nodes +- Multiple dedicated Sidekiq nodes (Split into real-time, best effort, ASAP, + CI Pipeline and Pull Mirror sets) +- 2 or more Git nodes (Git over SSH/Git over HTTP) +- 2 or more API nodes (All requests to `/api`) +- 2 or more Web nodes (All other web requests) +- 2 or more NFS/Gitaly servers + +![Fully Distributed architecture diagram](../img/high_availability/fully-distributed.png) + +The following pages outline the steps necessary to configure each component +separately: 1. [Configure the database](database.md) 1. [Configure Redis](redis.md) 1. [Configure Redis for GitLab source installations](redis_source.md) 1. [Configure NFS](nfs.md) + 1. [NFS Client and Host setup](nfs_host_client_setup.md) 1. [Configure the GitLab application servers](gitlab.md) 1. [Configure the load balancers](load_balancer.md) -![Active/Active HA Diagram](../img/high_availability/active-active-diagram.png) - -### Active/Passive - -For pure high-availability/failover with no scaling you can use an -active/passive configuration. This utilizes DRBD (Distributed Replicated -Block Device) to keep all data in sync. DRBD requires a low latency link to -remain in sync. 
It is not advisable to attempt to run DRBD between data centers -or in different cloud availability zones. - -> **Note:** GitLab recommends against choosing this HA method because of the - complexity of managing DRBD and crafting automatic failover. This is - *compatible* with GitLab, but not officially *supported*. If you are - an EE customer, support will help you with GitLab related problems, but if the - root cause is identified as DRBD, we will not troubleshoot further. - -Components/Servers Required: 2 servers/virtual machines (one active/one passive) - -![Active/Passive HA Diagram](../img/high_availability/active-passive-diagram.png) diff --git a/doc/administration/high_availability/alpha_database.md b/doc/administration/high_availability/alpha_database.md new file mode 100644 index 00000000000..7bf20be60e6 --- /dev/null +++ b/doc/administration/high_availability/alpha_database.md @@ -0,0 +1,6 @@ +--- +redirect_to: 'database.md' +--- + +This documentation has been moved to the main +[database documentation](database.md#configure_using_omnibus_for_high_availability). diff --git a/doc/administration/high_availability/consul.md b/doc/administration/high_availability/consul.md new file mode 100644 index 00000000000..056b7fc15d9 --- /dev/null +++ b/doc/administration/high_availability/consul.md @@ -0,0 +1,105 @@ +# Working with the bundled Consul service **[PREMIUM ONLY]** + +## Overview + +As part of its High Availability stack, GitLab Premium includes a bundled version of [Consul](http://consul.io) that can be managed through `/etc/gitlab/gitlab.rb`. + +A Consul cluster consists of multiple server agents, as well as client agents that run on other nodes which need to talk to the consul cluster. 
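To make the server/client distinction concrete, the fragment below sketches how a server agent might be declared in `/etc/gitlab/gitlab.rb`. It is an illustration only: the IP addresses are placeholders, and the role name and configuration keys are assumed to match your GitLab version, so check them against the PostgreSQL high availability documentation before use.

```ruby
# /etc/gitlab/gitlab.rb on a dedicated Consul server node (sketch)
# Assumes the 'consul_role' role is available in this GitLab version.
roles ['consul_role']

consul['configuration'] = {
  # Run this agent in server mode so it takes part in quorum.
  server: true,
  # Placeholder addresses of the Consul server nodes to join.
  retry_join: %w(10.0.0.1 10.0.0.2 10.0.0.3)
}
```

A client agent, for example on a database node, would typically omit `server: true` and keep only `retry_join`, so that it can find the server agents without voting in the cluster.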
+ +## Operations + +### Checking cluster membership + +To see which nodes are part of the cluster, run the following on any member in the cluster: +``` +# /opt/gitlab/embedded/bin/consul members +Node Address Status Type Build Protocol DC +consul-b XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul +consul-c XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul +consul-c XX.XX.X.Y:8301 alive server 0.9.0 2 gitlab_consul +db-a XX.XX.X.Y:8301 alive client 0.9.0 2 gitlab_consul +db-b XX.XX.X.Y:8301 alive client 0.9.0 2 gitlab_consul +``` + +Ideally all nodes will have a `Status` of `alive`. + +### Restarting the server cluster + +**Note**: This section only applies to server agents. It is safe to restart client agents whenever needed. + +If it is necessary to restart the server cluster, it is important to do this in a controlled fashion in order to maintain quorum. If quorum is lost, you will need to follow the Consul [outage recovery](#outage-recovery) process to recover the cluster. + +To be safe, we recommend you only restart one server agent at a time to ensure the cluster remains intact. + +For larger clusters, it is possible to restart multiple agents at a time. See the [Consul consensus document](https://www.consul.io/docs/internals/consensus.html#deployment-table) for how many failures a cluster of a given size can tolerate. This will be the number of simultaneous restarts it can sustain. + +## Troubleshooting + +### Consul server agents unable to communicate + +By default, the server agents will attempt to [bind](https://www.consul.io/docs/agent/options.html#_bind) to '0.0.0.0', but they will advertise the first private IP address on the node for other agents to communicate with them. If the other nodes cannot communicate with a node on this address, then the cluster will have a failed status.
+ +You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue: + +``` +2017-09-25_19:53:39.90821 2017/09/25 19:53:39 [WARN] raft: no known peers, aborting election +2017-09-25_19:53:41.74356 2017/09/25 19:53:41 [ERR] agent: failed to sync remote state: No cluster leader +``` + + +To fix this: + +1. Pick an address on each node through which all of the other nodes can reach it. +1. Update your `/etc/gitlab/gitlab.rb` + + ```ruby + consul['configuration'] = { + ... + bind_addr: 'IP ADDRESS' + } + ``` +1. Run `gitlab-ctl reconfigure` + +If you still see the errors, you may have to [erase the Consul database and reinitialize](#recreate-from-scratch) on the affected node. + +### Consul agents do not start - Multiple private IPs + +If a node has multiple private IPs, the agent cannot tell which of the private addresses to advertise and will exit immediately on start. + +You will see messages like the following in `gitlab-ctl tail consul` output if you are running into this issue: + +``` +2017-11-09_17:41:45.52876 ==> Starting Consul agent... +2017-11-09_17:41:45.53057 ==> Error creating agent: Failed to get advertise address: Multiple private IPs found. Please configure one. +``` + +To fix this: + +1. Pick an address on the node through which all of the other nodes can reach it. +1. Update your `/etc/gitlab/gitlab.rb` + + ```ruby + consul['configuration'] = { + ... + bind_addr: 'IP ADDRESS' + } + ``` +1. Run `gitlab-ctl reconfigure` + +### Outage recovery + +If you have lost enough server agents in the cluster to break quorum, then the cluster is considered failed, and it will not function without manual intervention. + +#### Recreate from scratch +By default, GitLab does not store anything in the Consul cluster that cannot be recreated.
To erase the consul database and reinitialize + +``` +# gitlab-ctl stop consul +# rm -rf /var/opt/gitlab/consul/data +# gitlab-ctl start consul +``` + +After this, the cluster should start back up, and the server agents rejoin. Shortly after that, the client agents should rejoin as well. + +#### Recover a failed cluster +If you have taken advantage of consul to store other data, and want to restore the failed cluster, please follow the [Consul guide](https://www.consul.io/docs/guides/outage.html) to recover a failed cluster. diff --git a/doc/administration/high_availability/database.md b/doc/administration/high_availability/database.md index c1eeb40b98f..5c725f00b79 100644 --- a/doc/administration/high_availability/database.md +++ b/doc/administration/high_availability/database.md @@ -1,11 +1,6 @@ -# Configuring a Database for GitLab HA +# Configuring PostgreSQL for Scaling and High Availability -You can choose to install and manage a database server (PostgreSQL/MySQL) -yourself, or you can use GitLab Omnibus packages to help. GitLab recommends -PostgreSQL. This is the database that will be installed if you use the -Omnibus package to manage your database. - -## Configure your own database server +## Provide your own PostgreSQL instance **[CORE ONLY]** If you're hosting GitLab on a cloud provider, you can optionally use a managed service for PostgreSQL. For example, AWS offers a managed Relational @@ -20,91 +15,1147 @@ If you use a cloud-managed service, or provide your own PostgreSQL: 1. Configure the GitLab application servers with the appropriate details. This step is covered in [Configuring GitLab for HA](gitlab.md). -## Configure using Omnibus +## PostgreSQL in a Scaled Environment -1. Download/install GitLab Omnibus using **steps 1 and 2** from - [GitLab downloads](https://about.gitlab.com/downloads). Do not complete other - steps on the download page. -1. Create/edit `/etc/gitlab/gitlab.rb` and use the following configuration. 
- Be sure to change the `external_url` to match your eventual GitLab front-end - URL. If there is a directive listed below that you do not see in the configuration, be sure to add it. +This section is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples) +environments including [Basic Scaling](./README.md#basic-scaling) and +[Full Scaling](./README.md#full-scaling). - ```ruby - external_url 'https://gitlab.example.com' +### Provide your own PostgreSQL instance **[CORE ONLY]** + +If you want to use your own deployed PostgreSQL instance(s), +see [Provide your own PostgreSQL instance](#provide-your-own-postgresql-instance-core-only) +for more details. However, you can use the GitLab Omnibus package to easily +deploy the bundled PostgreSQL. + +### Standalone PostgreSQL using GitLab Omnibus **[CORE ONLY]** + +1. SSH into the PostgreSQL server. +1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab + package you want using **steps 1 and 2** from the GitLab downloads page. + - Do not complete any other steps on the download page. +1. Generate a password hash for PostgreSQL. This assumes you will use the default + username of `gitlab` (recommended). The command will request a password + and confirmation. Use the value that is output by this command in the next + step as the value of `POSTGRESQL_PASSWORD_HASH`. + ```sh + sudo gitlab-ctl pg-password-md5 gitlab + ``` + +1. Edit `/etc/gitlab/gitlab.rb` and add the contents below, updating placeholder + values appropriately. + + - `POSTGRESQL_PASSWORD_HASH` - The value output from the previous step + - `APPLICATION_SERVER_IP_BLOCKS` - A space delimited list of IP subnets or IP + addresses of the GitLab application servers that will connect to the + database. 
Example: `%w(123.123.123.123/32 123.123.123.234/32)` + + ```ruby # Disable all components except PostgreSQL roles ['postgres_role'] + repmgr['enable'] = false + consul['enable'] = false + prometheus['enable'] = false + alertmanager['enable'] = false + pgbouncer_exporter['enable'] = false + redis_exporter['enable'] = false + gitlab_monitor['enable'] = false + + postgresql['listen_address'] = '0.0.0.0' + postgresql['port'] = 5432 + + # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value + postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH' + + # Replace XXX.XXX.XXX.XXX/YY with Network Address + # ???? + postgresql['trust_auth_cidr_addresses'] = %w(APPLICATION_SERVER_IP_BLOCKS) + + # Disable automatic database migrations + gitlab_rails['auto_migrate'] = false + ``` + + NOTE: **Note:** The role `postgres_role` was introduced with GitLab 10.3 + +1. [Reconfigure GitLab] for the changes to take effect. +1. Note the PostgreSQL node's IP address or hostname, port, and + plain text password. These will be necessary when configuring the GitLab + application servers later. + +Advanced configuration options are supported and can be added if +needed. + +Continue configuration of other components by going +[back to Scaled Architectures](./README.md#scalable-architecture-examples) + +## PostgreSQL with High Availability + +This section is relevant for [High Availability Architecture](./README.md#high-availability-architecture-examples) +environments including [Horizontal](./README.md#horizontal), +[Hybrid](./README.md#hybrid), and +[Fully Distributed](./README.md#fully-distributed). + +### Provide your own PostgreSQL instance **[CORE ONLY]** + +If you want to use your own deployed PostgreSQL instance(s), +see [Provide your own PostgreSQL instance](#provide-your-own-postgresql-instance-core-only) +for more details. However, you can use the GitLab Omnibus package to easily +deploy the bundled PostgreSQL. 
+ +### High Availability with GitLab Omnibus **[PREMIUM ONLY]** + +> Important notes: +> - This document will focus only on configuration supported with [GitLab Premium](https://about.gitlab.com/pricing/), using the Omnibus GitLab package. +> - If you are a Community Edition or Starter user, consider using a cloud hosted solution. +> - This document will not cover installations from source. +> +> - If HA setup is not what you were looking for, see the [database configuration document](http://docs.gitlab.com/omnibus/settings/database.html) +> for the Omnibus GitLab packages. + +> Please read this document fully before attempting to configure PostgreSQL HA +> for GitLab. +> +> This configuration is GA in EE 10.2. + +The recommended configuration for a PostgreSQL HA requires: + +- A minimum of three database nodes + - Each node will run the following services: + - `PostgreSQL` - The database itself + - `repmgrd` - A service to monitor, and handle failover in case of a failure + - `Consul` agent - Used for service discovery, to alert other nodes when failover occurs +- A minimum of three `Consul` server nodes +- A minimum of one `pgbouncer` service node + +You also need to take into consideration the underlying network topology, +making sure you have redundant connectivity between all Database and GitLab instances, +otherwise the networks will become a single point of failure. + +#### Architecture + +![PG HA Architecture](pg_ha_architecture.png) + +Database nodes run two services with PostgreSQL: + +- Repmgrd. Monitors the cluster and handles failover when issues with the master occur. The failover consists of: + - Selecting a new master for the cluster. + - Promoting the new node to master. + - Instructing remaining servers to follow the new master node. + + On failure, the old master node is automatically evicted from the cluster, and should be rejoined manually once recovered. +- Consul. 
Monitors the status of each node in the database cluster and tracks its health in a service definition on the consul cluster. + +Alongside pgbouncer, there is a consul agent that watches the status of the PostgreSQL service. If that status changes, consul runs a script which updates the configuration and reloads pgbouncer. + +##### Connection flow + +Each service in the package comes with a set of [default ports](https://docs.gitlab.com/omnibus/package-information/defaults.html#ports). You may need to make specific firewall rules for the connections listed below: + +- Application servers connect to [PgBouncer default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#pgbouncer) +- PgBouncer connects to the primary database servers [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql) +- Repmgr connects to the database servers [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql) +- Postgres secondaries connect to the primary database servers [PostgreSQL default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#postgresql) +- Consul servers and agents connect to each other's [Consul default ports](https://docs.gitlab.com/omnibus/package-information/defaults.html#consul) + +#### Required information + +Before proceeding with configuration, you will need to collect all the necessary +information. + +##### Network information + +PostgreSQL does not listen on any network interface by default. It needs to know +which IP address to listen on in order to be accessible to other services. +Similarly, PostgreSQL access is controlled based on the network source. + +This is why you will need: + +> IP address of each node's network interface +> - This can be set to `0.0.0.0` to listen on all interfaces. It cannot +> be set to the loopback address `127.0.0.1` +> +> Network Address +> - This can be in subnet (i.e. 
`192.168.0.0/255.255.255.0`) or CIDR (i.e. +> `192.168.0.0/24`) form. + +##### User information + +Various services require different configuration to secure +the communication as well as information required for running the service. +Below you will find details on each service and the minimum required +information you need to provide. + +##### Consul information + +When using default setup, minimum configuration requires: + +- `CONSUL_USERNAME`. Defaults to `gitlab-consul` +- `CONSUL_DATABASE_PASSWORD`. Password for the database user. +- `CONSUL_PASSWORD_HASH`. This is a hash generated out of consul username/password pair. + Can be generated with: + + ```sh + sudo gitlab-ctl pg-password-md5 CONSUL_USERNAME + ``` + +- `CONSUL_SERVER_NODES`. The IP addresses or DNS records of the Consul server nodes. + +Few notes on the service itself: + +- The service runs under a system account, by default `gitlab-consul`. + - If you are using a different username, you will have to specify it. We +will refer to it with `CONSUL_USERNAME`. +- There will be a database user created with read only access to the repmgr +database +- Passwords will be stored in the following locations: + - `/etc/gitlab/gitlab.rb`: hashed + - `/var/opt/gitlab/pgbouncer/pg_auth`: hashed + - `/var/opt/gitlab/gitlab-consul/.pgpass`: plaintext + +##### PostgreSQL information + +When configuring PostgreSQL, we will set `max_wal_senders` to one more than +the number of database nodes in the cluster. +This is used to prevent replication from using up all of the +available database connections. + +> Note: +> - In this document we are assuming 3 database nodes, which makes this configuration: + +``` +postgresql['max_wal_senders'] = 4 +``` + +As previously mentioned, you'll have to prepare the network subnets that will +be allowed to authenticate with the database. +You'll also need to supply the IP addresses or DNS records of Consul +server nodes. 
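
The `max_wal_senders` rule described above (number of database nodes plus one) is simple arithmetic. As a minimal sketch, assuming a hypothetical helper named `max_wal_senders_for` that is not part of Omnibus GitLab, it could be expressed as:

```ruby
# Hypothetical helper: not part of Omnibus GitLab. It only illustrates the
# "number of database nodes + 1" rule described above.
def max_wal_senders_for(database_node_count)
  raise ArgumentError, 'need at least one database node' if database_node_count < 1

  database_node_count + 1
end

# With the three database nodes assumed in this document, this prints the
# same setting shown in the configuration snippet above.
puts "postgresql['max_wal_senders'] = #{max_wal_senders_for(3)}"
```

If you add or remove database nodes later, the value in `/etc/gitlab/gitlab.rb` would need to be recomputed the same way.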
+ +We will need the following password information for the application's database user: + +- `POSTGRESQL_USERNAME`. Defaults to `gitlab` +- `POSTGRESQL_USER_PASSWORD`. The password for the database user +- `POSTGRESQL_PASSWORD_HASH`. This is a hash generated out of the username/password pair. + Can be generated with: + + ```sh + sudo gitlab-ctl pg-password-md5 POSTGRESQL_USERNAME + ``` + +##### Pgbouncer information + +When using default setup, minimum configuration requires: + +- `PGBOUNCER_USERNAME`. Defaults to `pgbouncer` +- `PGBOUNCER_PASSWORD`. This is a password for pgbouncer service. +- `PGBOUNCER_PASSWORD_HASH`. This is a hash generated out of pgbouncer username/password pair. + Can be generated with: + + ```sh + sudo gitlab-ctl pg-password-md5 PGBOUNCER_USERNAME + ``` + +- `PGBOUNCER_NODE`, is the IP address or a FQDN of the node running Pgbouncer. + +Few notes on the service itself: + +- The service runs as the same system account as the database + - In the package, this is by default `gitlab-psql` +- If you use a non-default user account for Pgbouncer service (by default `pgbouncer`), you will have to specify this username. We will refer to this requirement with `PGBOUNCER_USERNAME`. +- The service will have a regular database user account generated for it + - This defaults to `repmgr` +- Passwords will be stored in the following locations: + - `/etc/gitlab/gitlab.rb`: hashed, and in plain text + - `/var/opt/gitlab/pgbouncer/pg_auth`: hashed + +##### Repmgr information + +When using default setup, you will only have to prepare the network subnets that will +be allowed to authenticate with the service. 
+ +Few notes on the service itself: + +- The service runs under the same system account as the database + - In the package, this is by default `gitlab-psql` +- The service will have a superuser database user account generated for it + - This defaults to `gitlab_repmgr` + +#### Installing Omnibus GitLab + +First, make sure to [download/install](https://about.gitlab.com/installation) +GitLab Omnibus **on each node**. + +Make sure you install the necessary dependencies from step 1, +add GitLab package repository from step 2. +When installing the GitLab package, do not supply `EXTERNAL_URL` value. + +#### Configuring the Consul nodes + +On each Consul node perform the following: + +1. Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information) before executing the next step. + +1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: + + ```ruby + # Disable all components except Consul + roles ['consul_role'] + + # START user configuration + # Replace placeholders: + # + # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z + # with the addresses gathered for CONSUL_SERVER_NODES + consul['configuration'] = { + server: true, + retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z) + } + + # Disable auto migrations + gitlab_rails['auto_migrate'] = false + # + # END user configuration + ``` + + > `consul_role` was introduced with GitLab 10.3 + +1. [Reconfigure GitLab] for the changes to take effect. + +##### Consul Checkpoint + +Before moving on, make sure Consul is configured correctly. 
Run the following +command to verify all server nodes are communicating: + +``` +/opt/gitlab/embedded/bin/consul members +``` + +The output should be similar to: + +``` +Node Address Status Type Build Protocol DC +CONSUL_NODE_ONE XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_consul +CONSUL_NODE_TWO XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_consul +CONSUL_NODE_THREE XXX.XXX.XXX.YYY:8301 alive server 0.9.2 2 gitlab_consul +``` + +If any of the nodes isn't `alive` or if any of the three nodes are missing, +check the [Troubleshooting section](#troubleshooting) before proceeding. + +#### Configuring the Database nodes + +1. Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information), [`PGBOUNCER_PASSWORD_HASH`](#pgbouncer-information), [`POSTGRESQL_PASSWORD_HASH`](#postgresql-information), the [number of db nodes](#postgresql-information), and the [network address](#network-information) before executing the next step. + +1. On the master database node, edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: + + ```ruby + # Disable all components except PostgreSQL and Repmgr and Consul + roles ['postgres_role'] # PostgreSQL configuration - gitlab_rails['db_password'] = 'DB password' - postgresql['md5_auth_cidr_addresses'] = ['0.0.0.0/0'] postgresql['listen_address'] = '0.0.0.0' + postgresql['hot_standby'] = 'on' + postgresql['wal_level'] = 'replica' + postgresql['shared_preload_libraries'] = 'repmgr_funcs' # Disable automatic database migrations gitlab_rails['auto_migrate'] = false + + # Configure the consul agent + consul['services'] = %w(postgresql) + + # START user configuration + # Please set the real values as explained in Required Information section + # + # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value + postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH' + # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value + postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH' 
+ # Replace X with value of number of db nodes + 1 + postgresql['max_wal_senders'] = X + + # Replace XXX.XXX.XXX.XXX/YY with Network Address + postgresql['trust_auth_cidr_addresses'] = %w(XXX.XXX.XXX.XXX/YY) + repmgr['trust_auth_cidr_addresses'] = %w(127.0.0.1/32 XXX.XXX.XXX.XXX/YY) + + # Replace placeholders: + # + # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z + # with the addresses gathered for CONSUL_SERVER_NODES + consul['configuration'] = { + retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z) + } + # + # END user configuration ``` -1. Run `sudo gitlab-ctl reconfigure` to install and configure PostgreSQL. + > `postgres_role` was introduced with GitLab 10.3 + +1. On secondary nodes, add all the configuration specified above for primary node + to `/etc/gitlab/gitlab.rb`. In addition, append the following configuration + to inform gitlab-ctl that they are standby nodes initially and it need not + attempt to register them as primary node + ``` + # HA setting to specify if a node should attempt to be master on initialization + repmgr['master_on_initialization'] = false + ``` - > **Note**: This `reconfigure` step will result in some errors. - That's OK - don't be alarmed. +1. [Reconfigure GitLab] for the changes to take effect. + +> Please note: +> - If you want your database to listen on a specific interface, change the config: +> `postgresql['listen_address'] = '0.0.0.0'` +> - If your Pgbouncer service runs under a different user account, +> you also need to specify: `postgresql['pgbouncer_user'] = PGBOUNCER_USERNAME` in +> your configuration + +##### Database nodes post-configuration + +###### Primary node + +Select one node as a primary node. 1. Open a database prompt: + ```sh + gitlab-psql -d gitlabhq_production ``` - su - gitlab-psql - /bin/bash - psql -h /var/opt/gitlab/postgresql -d template1 - # Output: +1. Enable the `pg_trgm` extension: - psql (9.2.15) - Type "help" for help. + ```sh + CREATE EXTENSION pg_trgm; + ``` + +1. 
Exit the database prompt by typing `\q` and Enter. + +1. Verify the cluster is initialized with one node: - template1=# + ```sh + gitlab-ctl repmgr cluster show ``` -1. Run the following command at the database prompt and you will be asked to - enter the new password for the PostgreSQL superuser. + The output should be similar to the following: ``` - \password + Role | Name | Upstream | Connection String + ----------+----------|----------|---------------------------------------- + * master | HOSTNAME | | host=HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr + ``` - # Output: +1. Note down the hostname/ip in the connection string: `host=HOSTNAME`. We will + refer to the hostname in the next section as `MASTER_NODE_NAME`. If the value + is not an IP address, it will need to be a resolvable name (via DNS or + `/etc/hosts`) - Enter new password: - Enter it again: + +###### Secondary nodes + +1. Set up the repmgr standby: + + ```sh + gitlab-ctl repmgr standby setup MASTER_NODE_NAME ``` -1. Similarly, set the password for the `gitlab` database user. Use the same - password that you specified in the `/etc/gitlab/gitlab.rb` file for - `gitlab_rails['db_password']`. + Do note that this will remove the existing data on the node. The command + has a wait time. + The output should be similar to the following: + + ```console + # gitlab-ctl repmgr standby setup MASTER_NODE_NAME + Doing this will delete the entire contents of /var/opt/gitlab/postgresql/data + If this is not what you want, hit Ctrl-C now to exit + To skip waiting, rerun with the -w option + Sleeping for 30 seconds + Stopping the database + Removing the data + Cloning the data + Starting the database + Registering the node with the cluster + ok: run: repmgrd: (pid 19068) 0s ``` - \password gitlab - # Output: +1. Verify the node now appears in the cluster: - Enter new password: - Enter it again: + ```sh + gitlab-ctl repmgr cluster show ``` -1. Exit from editing `template1` prompt by typing `\q` and Enter. -1. 
Enable the `pg_trgm` extension within the `gitlabhq_production` database: - + + The output should be similar to the following: + + ``` + Role | Name | Upstream | Connection String + ----------+---------|-----------|------------------------------------------------ + * master | MASTER | | host=MASTER_NODE_NAME user=gitlab_repmgr dbname=gitlab_repmgr + standby | STANDBY | MASTER | host=STANDBY_HOSTNAME user=gitlab_repmgr dbname=gitlab_repmgr + ``` + +Repeat the above steps on all secondary nodes. + +##### Database checkpoint + +Before moving on, make sure the databases are configured correctly. Run the +following command on the **primary** node to verify that replication is working +properly: + +``` +gitlab-ctl repmgr cluster show +``` + +The output should be similar to: + +``` +Role | Name | Upstream | Connection String +----------+--------------|--------------|-------------------------------------------------------------------- +* master | MASTER | | host=MASTER port=5432 user=gitlab_repmgr dbname=gitlab_repmgr + standby | STANDBY | MASTER | host=STANDBY port=5432 user=gitlab_repmgr dbname=gitlab_repmgr +``` + +If the 'Role' column for any node says "FAILED", check the +[Troubleshooting section](#troubleshooting) before proceeding. + +Also, check that the check master command works successfully on each node: + +``` +su - gitlab-consul +gitlab-ctl repmgr-check-master || echo 'This node is a standby repmgr node' +``` + +This command relies on exit codes to tell Consul whether a particular node is a master +or secondary. The most important thing here is that this command does not produce errors. +If there are errors it's most likely due to incorrect `gitlab-consul` database user permissions. +Check the [Troubleshooting section](#troubleshooting) before proceeding. + +#### Configuring the Pgbouncer node + +1. 
Make sure you collect [`CONSUL_SERVER_NODES`](#consul-information), [`CONSUL_PASSWORD_HASH`](#consul-information), and [`PGBOUNCER_PASSWORD_HASH`](#pgbouncer-information) before executing the next step. + +1. Edit `/etc/gitlab/gitlab.rb` replacing values noted in the `# START user configuration` section: + + ```ruby + # Disable all components except Pgbouncer and Consul agent + roles ['pgbouncer_role'] + + # Configure Pgbouncer + pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul) + + # Configure Consul agent + consul['watchers'] = %w(postgresql) + + # START user configuration + # Please set the real values as explained in Required Information section + # Replace CONSUL_PASSWORD_HASH with a generated md5 value + # Replace PGBOUNCER_PASSWORD_HASH with a generated md5 value + pgbouncer['users'] = { + 'gitlab-consul': { + password: 'CONSUL_PASSWORD_HASH' + }, + 'pgbouncer': { + password: 'PGBOUNCER_PASSWORD_HASH' + } + } + # Replace placeholders: + # + # Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z + # with the addresses gathered for CONSUL_SERVER_NODES + consul['configuration'] = { + retry_join: %w(Y.Y.Y.Y consul1.gitlab.example.com Z.Z.Z.Z) + } + # + # END user configuration ``` + + > `pgbouncer_role` was introduced with GitLab 10.3 + +1. [Reconfigure GitLab] for the changes to take effect. + +1. Create a `.pgpass` file so Consul is able to + reload pgbouncer. Enter the `PGBOUNCER_PASSWORD` twice when asked: + + ```sh + gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer --user pgbouncer --hostuser gitlab-consul + ``` + +##### PGBouncer Checkpoint + +1. Ensure the node is talking to the current master: + + ```sh + gitlab-ctl pgb-console # You will be prompted for PGBOUNCER_PASSWORD + ``` + + If there is an error `psql: ERROR: Auth failed` after typing in the + password, ensure you previously generated the MD5 password hashes with the correct + format. The correct format is to concatenate the password and the username: + `PASSWORDUSERNAME`. 
For example, `Sup3rS3cr3tpgbouncer` would be the text + needed to generate an MD5 password hash for the `pgbouncer` user. + +1. Once the console prompt is available, run the following queries: + + ```sh + show databases ; show clients ; + ``` + + The output should be similar to the following: + + ``` + name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections + ---------------------+-------------+------+---------------------+------------+-----------+--------------+-----------+-----------------+--------------------- + gitlabhq_production | MASTER_HOST | 5432 | gitlabhq_production | | 20 | 0 | | 0 | 0 + pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0 | statement | 0 | 0 + (2 rows) + + type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link | remote_pid | tls + ------+-----------+---------------------+---------+----------------+-------+------------+------------+---------------------+---------------------+-----------+------+------------+----- + C | pgbouncer | pgbouncer | active | 127.0.0.1 | 56846 | 127.0.0.1 | 6432 | 2017-08-21 18:09:59 | 2017-08-21 18:10:48 | 0x22b3880 | | 0 | + (2 rows) + ``` + +#### Configuring the Application nodes + +These will be the nodes running the `gitlab-rails` service. You may have other +attributes set, but the following need to be set. + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + # Disable PostgreSQL on the application node + postgresql['enable'] = false + + gitlab_rails['db_host'] = 'PGBOUNCER_NODE' + gitlab_rails['db_port'] = 6432 + gitlab_rails['db_password'] = 'POSTGRESQL_USER_PASSWORD' + gitlab_rails['auto_migrate'] = false + ``` + +1. [Reconfigure GitLab] for the changes to take effect. 
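
The PGBouncer checkpoint above describes the expected hash input as the password immediately followed by the username (`PASSWORDUSERNAME`). As a rough sketch of that format only, assuming a hypothetical helper `pg_md5_hash` (the exact output of `gitlab-ctl pg-password-md5` is not reproduced here and may differ in presentation), the digest could be computed like this:

```ruby
require 'digest'

# Hypothetical helper, not part of Omnibus GitLab. It illustrates the
# PASSWORDUSERNAME format: the MD5 digest is taken over the password
# immediately followed by the username, with no separator.
def pg_md5_hash(password, username)
  Digest::MD5.hexdigest(password + username)
end

# Using the example from the checkpoint: the hashed text is
# "Sup3rS3cr3tpgbouncer". A hex MD5 digest is always 32 characters long.
hash = pg_md5_hash('Sup3rS3cr3t', 'pgbouncer')
puts hash.length
```

Swapping the order (username first) yields a different digest, which is one way to end up with the `Auth failed` error mentioned in the checkpoint.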
+ +##### Application node post-configuration + +Ensure that all migrations ran: + +```sh +gitlab-rake gitlab:db:configure +``` + +> **Note**: If you encounter a `rake aborted!` error stating that PGBouncer is failing to connect to +PostgreSQL, it may be that your PGBouncer node's IP address is missing from +PostgreSQL's `trust_auth_cidr_addresses` in `gitlab.rb` on your database nodes. See +[PGBouncer error `ERROR: pgbouncer cannot connect to server`](#pgbouncer-error-error-pgbouncer-cannot-connect-to-server) +in the Troubleshooting section before proceeding. + +##### Ensure GitLab is running + +At this point, your GitLab instance should be up and running. Verify you are +able to log in, and create issues and merge requests. If you have trouble, check +the [Troubleshooting section](#troubleshooting). + +#### Example configuration + +Here we'll show you some fully expanded example configurations. + +##### Example recommended setup + +This example uses 3 consul servers, 3 postgresql servers, and 1 application node. + +We start with all servers on the same 10.6.0.0/16 private network range, and they +can connect to each other freely on those addresses. + +Here is a list and description of each machine and the assigned IP: + +* `10.6.0.11`: Consul 1 +* `10.6.0.12`: Consul 2 +* `10.6.0.13`: Consul 3 +* `10.6.0.21`: PostgreSQL master +* `10.6.0.22`: PostgreSQL secondary +* `10.6.0.23`: PostgreSQL secondary +* `10.6.0.31`: GitLab application + +All passwords are set to `toomanysecrets`; please do not use this password or derived hashes. + +The external_url for GitLab is `http://gitlab.example.com` + +Please note that after the initial configuration, if a failover occurs, the PostgreSQL master will change to one of the available secondaries until it is failed back. 
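
Because a node whose address falls outside `trust_auth_cidr_addresses` cannot authenticate, it can help to sanity-check the machine list above against the `10.6.0.0/16` range before editing any `gitlab.rb`. The following is a minimal sketch using Ruby's standard `IPAddr` class; the `nodes` hash and variable names are illustrative and not part of any GitLab tooling:

```ruby
require 'ipaddr'

# The trusted range mirrors the 10.6.0.0/16 network used in this example.
trusted_range = IPAddr.new('10.6.0.0/16')

# Addresses copied from the machine list above (illustrative data only).
nodes = {
  'Consul 1' => '10.6.0.11',
  'Consul 2' => '10.6.0.12',
  'Consul 3' => '10.6.0.13',
  'PostgreSQL master' => '10.6.0.21',
  'PostgreSQL secondary 1' => '10.6.0.22',
  'PostgreSQL secondary 2' => '10.6.0.23',
  'GitLab application' => '10.6.0.31'
}

# Collect every node whose address is NOT covered by the trusted range.
outside = nodes.reject { |_name, ip| trusted_range.include?(IPAddr.new(ip)) }

if outside.empty?
  puts 'All nodes fall inside the trusted range'
else
  outside.each { |name, ip| puts "#{name} (#{ip}) is outside the trusted range" }
end
```

The same check applies to a narrower setup: replace the range with the value you plan to use for `trust_auth_cidr_addresses` and list your own node addresses.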
+ +##### Example recommended setup for Consul servers + +On each server edit `/etc/gitlab/gitlab.rb`: + +```ruby +# Disable all components except Consul +roles ['consul_role'] + +consul['configuration'] = { + server: true, + retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13) +} +``` + +[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect. + +##### Example recommended setup for PostgreSQL servers + +###### Primary node + +On primary node edit `/etc/gitlab/gitlab.rb`: + +```ruby +# Disable all components except PostgreSQL and Repmgr and Consul +roles ['postgres_role'] + +# PostgreSQL configuration +postgresql['listen_address'] = '0.0.0.0' +postgresql['hot_standby'] = 'on' +postgresql['wal_level'] = 'replica' +postgresql['shared_preload_libraries'] = 'repmgr_funcs' + +# Disable automatic database migrations +gitlab_rails['auto_migrate'] = false + +# Configure the consul agent +consul['services'] = %w(postgresql) + +postgresql['pgbouncer_user_password'] = '771a8625958a529132abe6f1a4acb19c' +postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f' +postgresql['max_wal_senders'] = 4 + +postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16) +repmgr['trust_auth_cidr_addresses'] = %w(10.6.0.0/16) + +consul['configuration'] = { + retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13) +} +``` + +[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect. + +###### Secondary nodes + +On secondary nodes, edit `/etc/gitlab/gitlab.rb` and add all the configuration +added to primary node, noted above. In addition, append the following +configuration + +``` +# HA setting to specify if a node should attempt to be master on initialization +repmgr['master_on_initialization'] = false +``` + +[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect. 
+ +##### Example recommended setup for application server + +On the server edit `/etc/gitlab/gitlab.rb`: + +```ruby +external_url 'http://gitlab.example.com' + +gitlab_rails['db_host'] = '127.0.0.1' +gitlab_rails['db_port'] = 6432 +gitlab_rails['db_password'] = 'toomanysecrets' +gitlab_rails['auto_migrate'] = false + +postgresql['enable'] = false +pgbouncer['enable'] = true +consul['enable'] = true + +# Configure Pgbouncer +pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul) + +# Configure Consul agent +consul['watchers'] = %w(postgresql) + +pgbouncer['users'] = { + 'gitlab-consul': { + password: '5e0e3263571e3704ad655076301d6ebe' + }, + 'pgbouncer': { + password: '771a8625958a529132abe6f1a4acb19c' + } +} + +consul['configuration'] = { + retry_join: %w(10.6.0.11 10.6.0.12 10.6.0.13) +} +``` + +[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect. + +##### Example recommended setup manual steps + +After deploying the configuration follow these steps: + +1. On `10.6.0.21`, our primary database + + Enable the `pg_trgm` extension + + ```sh gitlab-psql -d gitlabhq_production - + ``` + + ``` CREATE EXTENSION pg_trgm; + ``` + +1. On `10.6.0.22`, our first standby database - # Output: + Make this node a standby of the primary - CREATE EXTENSION + ```sh + gitlab-ctl repmgr standby setup 10.6.0.21 ``` -1. Exit the database prompt by typing `\q` and Enter. -1. Exit the `gitlab-psql` user by running `exit` twice. -1. Run `sudo gitlab-ctl reconfigure` a final time. -1. Configure the GitLab application servers with the appropriate details. - This step is covered in [Configuring GitLab for HA](gitlab.md). + +1. On `10.6.0.23`, our second standby database + + Make this node a standby of the primary + + ```sh + gitlab-ctl repmgr standby setup 10.6.0.21 + ``` + +1. 
On `10.6.0.31`, our application server + + Set gitlab-consul's pgbouncer password to `toomanysecrets` + + ```sh + gitlab-ctl write-pgpass --host 127.0.0.1 --database pgbouncer --user pgbouncer --hostuser gitlab-consul + ``` + + Run database migrations + + ```sh + gitlab-rake gitlab:db:configure + ``` + +#### Example minimal setup + +This example uses 3 postgresql servers, and 1 application node. + +It differs from the [recommended setup](#example-recommended-setup) by moving the consul servers into the same servers we use for PostgreSQL. +The trade-off is a reduced server count against the increased operational complexity of needing to deal with postgres [failover](#failover-procedure) and [restore](#restore-procedure) procedures in addition to [consul outage recovery](consul.md#outage-recovery) on the same set of machines. + +In this example we start with all servers on the same 10.6.0.0/16 private network range, and they can connect to each other freely on those addresses. + +Here is a list and description of each machine and the assigned IP: + +* `10.6.0.21`: PostgreSQL master +* `10.6.0.22`: PostgreSQL secondary +* `10.6.0.23`: PostgreSQL secondary +* `10.6.0.31`: GitLab application + +All passwords are set to `toomanysecrets`; please do not use this password or derived hashes. + +The external_url for GitLab is `http://gitlab.example.com` + +Please note that after the initial configuration, if a failover occurs, the PostgreSQL master will change to one of the available secondaries until it is failed back. 
+ +##### Example minimal configuration for database servers + +##### Primary node +On primary database node edit `/etc/gitlab/gitlab.rb`: + +```ruby +# Disable all components except PostgreSQL, Repmgr, and Consul +roles ['postgres_role'] + +# PostgreSQL configuration +postgresql['listen_address'] = '0.0.0.0' +postgresql['hot_standby'] = 'on' +postgresql['wal_level'] = 'replica' +postgresql['shared_preload_libraries'] = 'repmgr_funcs' + +# Disable automatic database migrations +gitlab_rails['auto_migrate'] = false + +# Configure the consul agent +consul['services'] = %w(postgresql) + +postgresql['pgbouncer_user_password'] = '771a8625958a529132abe6f1a4acb19c' +postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f' +postgresql['max_wal_senders'] = 4 + +postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16) +repmgr['trust_auth_cidr_addresses'] = %w(10.6.0.0/16) + +consul['configuration'] = { + server: true, + retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23) +} +``` + +[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect. + +###### Secondary nodes + +On secondary nodes, edit `/etc/gitlab/gitlab.rb` and add all the information added +to primary node, noted above. 
In addition, append the following configuration + +``` +# HA setting to specify if a node should attempt to be master on initialization +repmgr['master_on_initialization'] = false +``` + +##### Example minimal configuration for application server + +On the server edit `/etc/gitlab/gitlab.rb`: + +```ruby +external_url 'http://gitlab.example.com' + +gitlab_rails['db_host'] = '127.0.0.1' +gitlab_rails['db_port'] = 6432 +gitlab_rails['db_password'] = 'toomanysecrets' +gitlab_rails['auto_migrate'] = false + +postgresql['enable'] = false +pgbouncer['enable'] = true +consul['enable'] = true + +# Configure Pgbouncer +pgbouncer['admin_users'] = %w(pgbouncer gitlab-consul) + +# Configure Consul agent +consul['watchers'] = %w(postgresql) + +pgbouncer['users'] = { + 'gitlab-consul': { + password: '5e0e3263571e3704ad655076301d6ebe' + }, + 'pgbouncer': { + password: '771a8625958a529132abe6f1a4acb19c' + } +} + +consul['configuration'] = { + retry_join: %w(10.6.0.21 10.6.0.22 10.6.0.23) +} +``` + +[Reconfigure Omnibus GitLab][reconfigure GitLab] for the changes to take effect. + +##### Example minimal setup manual steps + +The manual steps for this configuration are the same as for the [example recommended setup](#example-recommended-setup-manual-steps). + +#### Failover procedure + +By default, if the master database fails, `repmgrd` should promote one of the +standby nodes to master automatically, and consul will update pgbouncer with +the new master. + +If you need to failover manually, you have two options: + +**Shutdown the current master database** + +Run: + +```sh +gitlab-ctl stop postgresql +``` + +The automated failover process will see this and failover to one of the +standby nodes. + +**Or perform a manual failover** + +1. Ensure the old master node is not still active. +1. Login to the server that should become the new master and run: + + ```sh + gitlab-ctl repmgr standby promote + ``` + +1. 
If there are any other standby servers in the cluster, have them follow + the new master server: + + ```sh + gitlab-ctl repmgr standby follow NEW_MASTER + ``` + +#### Restore procedure + +If a node fails, it can be removed from the cluster, or added back as a standby +after it has been restored to service. + +- If you want to remove the node from the cluster, on any other node in the + cluster, run: + + ```sh + gitlab-ctl repmgr standby unregister --node=X + ``` + + where X is the value of node in `repmgr.conf` on the old server. + + To find this, you can use: + + ```sh + awk -F = '$1 == "node" { print $2 }' /var/opt/gitlab/postgresql/repmgr.conf + ``` + + It will output something like: + + ``` + 959789412 + ``` + + Then you will use this id to unregister the node: + + ```sh + gitlab-ctl repmgr standby unregister --node=959789412 + ``` + +- To add the node as a standby server: + + ```sh + gitlab-ctl repmgr standby follow NEW_MASTER + gitlab-ctl restart repmgrd + ``` + + CAUTION: **Warning:** When the server is brought back online, and before + you switch it to a standby node, repmgr will report that there are two masters. + If there are any clients that are still attempting to write to the old master, + this will cause a split, and the old master will need to be resynced from + scratch by performing a `gitlab-ctl repmgr standby setup NEW_MASTER`. + +#### Alternate configurations + +##### Database authorization + +By default, we give any host on the database network the permission to perform +repmgr operations using PostgreSQL's `trust` method. If you do not want this +level of trust, there are alternatives. + +You can trust only the specific nodes that will be database clusters, or you +can require md5 authentication. + +##### Trust specific addresses + +If you know the IP address, or FQDN of all database and pgbouncer nodes in the +cluster, you can trust only those nodes. 
+ +In `/etc/gitlab/gitlab.rb` on all of the database nodes, set +`repmgr['trust_auth_cidr_addresses']` to an array of strings containing all of +the addresses. + +If you use a node's FQDN, it must have a corresponding PTR record in DNS. +If you use a node's IP address, specify it as `XXX.XXX.XXX.XXX/32`. + +For example: + +```ruby +repmgr['trust_auth_cidr_addresses'] = %w(192.168.1.44/32 db2.example.com) +``` + + +##### MD5 Authentication + +If you are running on an untrusted network, repmgr can use md5 authentication +with a [.pgpass file](https://www.postgresql.org/docs/9.6/static/libpq-pgpass.html) +to authenticate. + +You can specify by IP address, FQDN, or by subnet, using the same format as in +the previous section: + +1. On the current master node, create a password for the `gitlab` and + `gitlab_repmgr` users: + + ```sh + gitlab-psql -d template1 + template1=# \password gitlab_repmgr + Enter password: **** + Confirm password: **** + template1=# \password gitlab + ``` + +1. On each database node: + + 1. Edit `/etc/gitlab/gitlab.rb`: + 1. Ensure `repmgr['trust_auth_cidr_addresses']` is **not** set + 1. Set `postgresql['md5_auth_cidr_addresses']` to the desired value + 1. Set `postgresql['sql_replication_user'] = 'gitlab_repmgr'` + 1. Reconfigure with `gitlab-ctl reconfigure` + 1. Restart postgresql with `gitlab-ctl restart postgresql` + + 1. Create a `.pgpass` file. Enter the `gitlab_repmgr` password twice + when asked: + + ```sh + gitlab-ctl write-pgpass --user gitlab_repmgr --hostuser gitlab-psql --database '*' + ``` + +1. On each pgbouncer node, edit `/etc/gitlab/gitlab.rb`: + 1. Ensure `gitlab_rails['db_password']` is set to the plaintext password for + the `gitlab` database user + 1. [Reconfigure GitLab] for the changes to take effect + +## Troubleshooting + +#### Consul and PostgreSQL changes not taking effect. + +Due to the potential impacts, `gitlab-ctl reconfigure` only reloads Consul and PostgreSQL; it does not restart the services. 
However, not all changes can be activated by reloading. + +To restart either service, run `gitlab-ctl restart SERVICE` + +For PostgreSQL, it is usually safe to restart the master node by default. Automatic failover defaults to a 1 minute timeout. Provided the database returns before then, nothing else needs to be done. To be safe, you can stop `repmgrd` on the standby nodes first with `gitlab-ctl stop repmgrd`, then start afterwards with `gitlab-ctl start repmgrd`. + +On the consul server nodes, it is important to restart the consul service in a controlled fashion. Read our [consul documentation](consul.md#restarting-the-server-cluster) for instructions on how to restart the service. + +#### `gitlab-ctl repmgr-check-master` command produces errors + +If this command displays errors about database permissions it is likely that something failed during +install, resulting in the `gitlab-consul` database user getting incorrect permissions. Follow these +steps to fix the problem: + +1. On the master database node, connect to the database prompt - `gitlab-psql -d template1` +1. Delete the `gitlab-consul` user - `DROP USER "gitlab-consul";` +1. Exit the database prompt - `\q` +1. [Reconfigure GitLab] and the user will be re-added with the proper permissions. +1. Change to the `gitlab-consul` user - `su - gitlab-consul` +1. Try the check command again - `gitlab-ctl repmgr-check-master`. + +Now there should not be errors. If errors still occur then there is another problem. + +#### PGBouncer error `ERROR: pgbouncer cannot connect to server` + +You may get this error when running `gitlab-rake gitlab:db:configure` or you +may see the error in the PGBouncer log file. + +``` +PG::ConnectionBad: ERROR: pgbouncer cannot connect to server +``` + +The problem may be that your PGBouncer node's IP address is not included in the +`trust_auth_cidr_addresses` setting in `/etc/gitlab/gitlab.rb` on the database nodes. 
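Before digging through logs, it can help to check whether an address falls inside the ranges you have configured. The sketch below is only an illustration: `covered_by_trust_list?` is a hypothetical helper, not part of GitLab or Omnibus, and it assumes every entry is an IP address or CIDR range (FQDN entries are not handled). It uses Ruby's standard `ipaddr` library.

```ruby
require 'ipaddr'

# Hypothetical helper (not part of GitLab or Omnibus): returns true when
# `address` falls inside at least one of the given CIDR ranges.
# Assumes every entry is an IP or CIDR string; hostnames would raise an error.
def covered_by_trust_list?(address, cidrs)
  ip = IPAddr.new(address)
  cidrs.any? { |cidr| IPAddr.new(cidr).include?(ip) }
end

# Example ranges in the same format used for trust_auth_cidr_addresses.
trusted = %w(10.6.0.0/16 192.168.1.44/32)

puts covered_by_trust_list?('10.6.0.31', trusted)       # prints "true"
puts covered_by_trust_list?('123.123.123.123', trusted) # prints "false"
```

If the helper returns `false` for the PGBouncer node's address, a missing `trust_auth_cidr_addresses` entry is a likely cause.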
+ +You can confirm that this is the issue by checking the PostgreSQL log on the master +database node. If you see the following error then `trust_auth_cidr_addresses` +is the problem. + +``` +2018-03-29_13:59:12.11776 FATAL: no pg_hba.conf entry for host "123.123.123.123", user "pgbouncer", database "gitlabhq_production", SSL off +``` + +To fix the problem, add the IP address to `/etc/gitlab/gitlab.rb`. + +``` +postgresql['trust_auth_cidr_addresses'] = %w(123.123.123.123/32 <other_cidrs>) +``` + +[Reconfigure GitLab] for the changes to take effect. + +#### Issues with other components + +If you're running into an issue with a component not outlined here, be sure to check the troubleshooting section of their specific documentation page. + +- [Consul](consul.md#troubleshooting) +- [PostgreSQL](http://docs.gitlab.com/omnibus/settings/database.html#troubleshooting) +- [GitLab application](gitlab.md#troubleshooting) + +## Configure using Omnibus + +**Note**: We recommend that you follow the instructions here for a full [PostgreSQL cluster](#configure-using-omnibus-for-high-availability). +If you are reading this section due to an old bookmark, you can find that old documentation [in the repository](https://gitlab.com/gitlab-org/gitlab-ce/blob/v10.1.4/doc/administration/high_availability/database.md#configure-using-omnibus). --- @@ -114,3 +1165,6 @@ Read more on high-availability configuration: 1. [Configure NFS](nfs.md) 1. [Configure the GitLab application servers](gitlab.md) 1. [Configure the load balancers](load_balancer.md) +1. 
[Manage the bundled Consul cluster](consul.md) + +[reconfigure GitLab]: ../restart_gitlab.md#omnibus-gitlab-reconfigure diff --git a/doc/administration/high_availability/gitaly.md b/doc/administration/high_availability/gitaly.md new file mode 100644 index 00000000000..8004eea2208 --- /dev/null +++ b/doc/administration/high_availability/gitaly.md @@ -0,0 +1,91 @@ +# Configuring Gitaly for Scaled and High Availability + +Gitaly does not yet support full high availability. However, Gitaly is quite +stable and is in use on GitLab.com. Scaled and highly available GitLab environments +should consider using Gitaly on a separate node. + +See the [Gitaly HA Epic](https://gitlab.com/groups/gitlab-org/-/epics/289) to +track plans and progress toward high availability support. + +This document is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples) +environments and [High Availability Architecture](./README.md#high-availability-architecture-examples). + +## Running Gitaly on its own server + +Starting with GitLab 11.4, Gitaly is a replacement for NFS except +when the [Elastic Search indexer](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer) +is used. + +NOTE: **Note:** While Gitaly can be used as a replacement for NFS, we do not recommend using EFS as it may impact GitLab's performance. Please review the [relevant documentation](nfs.md#avoid-using-awss-elastic-file-system-efs) for more details. + +NOTE: **Note:** Gitaly network traffic is unencrypted so we recommend a firewall to +restrict access to your Gitaly server. + +The steps below are the minimum necessary to configure a Gitaly server with +Omnibus: + +1. SSH into the Gitaly server. +1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab + package you want using **steps 1 and 2** from the GitLab downloads page. + - Do not complete any other steps on the download page. + +1. 
Edit `/etc/gitlab/gitlab.rb` and add the contents: + + Gitaly must trigger some callbacks to GitLab via GitLab Shell. As a result, + the GitLab Shell secret must be the same between the other GitLab servers and + the Gitaly server. The easiest way to accomplish this is to copy `/etc/gitlab/gitlab-secrets.json` + from an existing GitLab server to the Gitaly server. Without this shared secret, + Git operations in GitLab will result in an API error. + + > **NOTE:** In most or all cases the storage paths below end in `repositories` which is + different than `path` in `git_data_dirs` of Omnibus installations. Check the + directory layout on your Gitaly server to be sure. + + ```ruby + # Enable Gitaly + gitaly['enable'] = true + + ## Disable all other services + sidekiq['enable'] = false + gitlab_workhorse['enable'] = false + unicorn['enable'] = false + postgresql['enable'] = false + nginx['enable'] = false + prometheus['enable'] = false + alertmanager['enable'] = false + pgbouncer_exporter['enable'] = false + redis_exporter['enable'] = false + gitlab_monitor['enable'] = false + + # Prevent database connections during 'gitlab-ctl reconfigure' + gitlab_rails['rake_cache_clear'] = false + gitlab_rails['auto_migrate'] = false + + # Configure the gitlab-shell API callback URL. Without this, `git push` will + # fail. This can be your 'front door' GitLab URL or an internal load + # balancer. + gitlab_rails['internal_api_url'] = 'https://gitlab.example.com' + + # Make Gitaly accept connections on all network interfaces. You must use + # firewalls to restrict access to this address/port. 
+ gitaly['listen_addr'] = "0.0.0.0:8075" + gitaly['auth_token'] = 'abc123secret' + + gitaly['storage'] = [ + { 'name' => 'default', 'path' => '/mnt/gitlab/default/repositories' }, + { 'name' => 'storage1', 'path' => '/mnt/gitlab/storage1/repositories' }, + ] + + # To use tls for gitaly you need to add + gitaly['tls_listen_addr'] = "0.0.0.0:9999" + gitaly['certificate_path'] = "path/to/cert.pem" + gitaly['key_path'] = "path/to/key.pem" + ``` + +Again, reconfigure (Omnibus) or restart (source). + +Continue configuration of other components by going back to: + +- [Scaled Architectures](./README.md#scalable-architecture-examples) +- [High Availability Architectures](./README.md#high-availability-architecture-examples) diff --git a/doc/administration/high_availability/gitlab.md b/doc/administration/high_availability/gitlab.md index d95c3acec54..888426ece5c 100644 --- a/doc/administration/high_availability/gitlab.md +++ b/doc/administration/high_availability/gitlab.md @@ -1,8 +1,4 @@ -# Configuring GitLab for HA - -Assuming you have already configured a [database](database.md), [Redis](redis.md), and [NFS](nfs.md), you can -configure the GitLab application server(s) now. Complete the steps below -for each GitLab application server in your environment. +# Configuring GitLab Scaling and High Availability > **Note:** There is some additional configuration near the bottom for additional GitLab application servers. It's important to read and understand diff --git a/doc/administration/high_availability/nfs_host_client_setup.md b/doc/administration/high_availability/nfs_host_client_setup.md new file mode 100644 index 00000000000..fce27332f23 --- /dev/null +++ b/doc/administration/high_availability/nfs_host_client_setup.md @@ -0,0 +1,136 @@ +# Configuring NFS for GitLab HA + +Setting up NFS for a GitLab HA setup allows all applications nodes in a cluster +to share the same files and maintain data consistency. 
Application nodes in an HA +setup act as clients while the NFS server plays host. + +> Note: The instructions provided in this documentation allow for setting up a quick +proof of concept, but they leave NFS as a potential single point of failure and are +therefore not recommended for use in production. Explore options such as [Pacemaker +and Corosync](http://clusterlabs.org/) for highly available NFS in production. + +Below are instructions for setting up an application node (client) in an HA cluster +to read from and write to a central NFS server (host). + +NOTE: **Note:** +Using EFS may negatively impact performance. Please review the [relevant documentation](nfs.md#avoid-using-awss-elastic-file-system-efs) for additional details. + +## NFS Server Setup + +> Follow the instructions below to set up and configure your NFS server. + +### Step 1 - Install NFS Server on Host + +Installing the nfs-kernel-server package allows you to share directories with the clients running the GitLab application. + +```sh +apt-get update +apt-get install nfs-kernel-server +``` + +### Step 2 - Export Host's Home Directory to Client + +In this setup we will share the home directory on the host with the client. Edit the exports file as below to share the host's home directory with the client. If you have multiple clients running GitLab, you must list all of the client IP addresses on the same line in the `/etc/exports` file. + +```text +#/etc/exports for one client +/home <client-ip-address>(rw,sync,no_root_squash,no_subtree_check) + +#/etc/exports for three clients +/home <client-ip-address>(rw,sync,no_root_squash,no_subtree_check) <client-2-ip-address>(rw,sync,no_root_squash,no_subtree_check) <client-3-ip-address>(rw,sync,no_root_squash,no_subtree_check) +``` + +Restart the NFS server after making changes to the `exports` file for the changes +to take effect. + +```sh +systemctl restart nfs-kernel-server +``` + +NOTE: **Note:** +You may need to update your server's firewall. 
See the [firewall section](#nfs-in-a-firewalled-environment) at the end of this guide. + +## Client/ GitLab application node Setup + +> Follow the instructions below to connect any GitLab rails application node running +inside your HA environment to the NFS server configured above. + +### Step 1 - Install NFS Common on Client + +The nfs-common package provides NFS functionality without installing server components which +we don't need running on the application nodes. + +```sh +apt-get update +apt-get install nfs-common +``` + +### Step 2 - Create Mount Points on Client + +Create a directory on the client where the shared directory from the host can be mounted. +Please note that if your mount point directory contains any files they will be hidden +once the remote shares are mounted. An empty/new directory on the client is recommended +for this purpose. + +```sh +mkdir -p /nfs/home +``` + +Confirm that the mount point works by mounting it on the client and checking that +it is mounted with the commands below: + +```sh +mount <host_ip_address>:/home /nfs/home +df -h +``` + +### Step 3 - Set up Automatic Mounts on Boot + +Edit `/etc/fstab` on client as below to mount the remote shares automatically at boot. +Note that GitLab requires advisory file locking, which is only supported natively in +NFS version 4. NFSv3 also supports locking as long as Linux Kernel 2.6.5+ is used. +We recommend using version 4 and do not specifically test NFSv3. + +```text +#/etc/fstab +165.227.159.85:/home /nfs/home nfs4 defaults,soft,rsize=1048576,wsize=1048576,noatime,nofail,lookupcache=positive 0 2 +``` + +Reboot the client and confirm that the mount point is mounted automatically. + +### Step 4 - Set up GitLab to Use NFS mounts + +When using the default Omnibus configuration you will need to share 5 data locations +between all GitLab cluster nodes. No other locations should be shared. 
Changing the +default file locations in `gitlab.rb` on the client allows you to have one main mount +point and have all the required locations as subdirectories to use the NFS mount for +git-data. + +```text +git_data_dirs({"default" => "/nfs/home/var/opt/gitlab-data/git-data"}) +user['home'] = '/nfs/home/var/opt/gitlab-data/home' +gitlab_rails['uploads_directory'] = '/nfs/home/var/opt/gitlab-data/uploads' +gitlab_rails['shared_path'] = '/nfs/home/var/opt/gitlab-data/shared' +gitlab_ci['builds_directory'] = '/nfs/home/var/opt/gitlab-data/builds' +``` + +Save the changes in `gitlab.rb` and run `gitlab-ctl reconfigure`. + +## NFS in a Firewalled Environment + +If the traffic between your NFS server and NFS client(s) is subject to port filtering +by a firewall, then you will need to reconfigure that firewall to allow NFS communication. + +[This guide from TLDP](http://tldp.org/HOWTO/NFS-HOWTO/security.html#FIREWALLS) +covers the basics of using NFS in a firewalled environment. Additionally, we encourage you to +search for and review the specific documentation for your OS/distro and your firewall software. + +Example for Ubuntu: + +Check that NFS traffic from the client is allowed by the firewall on the host by running +the command: `sudo ufw status`. If it's being blocked, then you can allow traffic from a specific +client with the command below. 
+ +```sh +sudo ufw allow from <client-ip-address> to any port nfs +``` diff --git a/doc/administration/high_availability/pg_ha_architecture.png b/doc/administration/high_availability/pg_ha_architecture.png Binary files differ new file mode 100644 index 00000000000..ef870f652ae --- /dev/null +++ b/doc/administration/high_availability/pg_ha_architecture.png diff --git a/doc/administration/high_availability/pgbouncer.md b/doc/administration/high_availability/pgbouncer.md new file mode 100644 index 00000000000..c2c57f8e16d --- /dev/null +++ b/doc/administration/high_availability/pgbouncer.md @@ -0,0 +1,131 @@ +# Working with the bundled Pgbouncer service + +## Overview + +As part of its High Availability stack, GitLab Premium includes a bundled version of [Pgbouncer](https://pgbouncer.github.io/) that can be managed through `/etc/gitlab/gitlab.rb`. + +In a High Availability setup, Pgbouncer is used to seamlessly migrate database connections between servers in a failover scenario. + +Additionally, it can be used in a non-HA setup to pool connections, speeding up response time while reducing resource usage. + +It is recommended to run pgbouncer alongside the `gitlab-rails` service, or on its own dedicated node in a cluster. + +## Operations + +### Running Pgbouncer as part of an HA GitLab installation +See our [HA documentation for PostgreSQL](database.md) for information on running pgbouncer as part of an HA setup. + +### Running Pgbouncer as part of a non-HA GitLab installation + +1. Generate PGBOUNCER_USER_PASSWORD_HASH with the command `gitlab-ctl pg-password-md5 pgbouncer`. + +1. Generate SQL_USER_PASSWORD_HASH with the command `gitlab-ctl pg-password-md5 gitlab`. We'll also need to enter the plaintext SQL_USER_PASSWORD later. + +1. 
On your database node, ensure the following is set in your `/etc/gitlab/gitlab.rb` + + ```ruby + postgresql['pgbouncer_user_password'] = 'PGBOUNCER_USER_PASSWORD_HASH' + postgresql['sql_user_password'] = 'SQL_USER_PASSWORD_HASH' + postgresql['listen_address'] = 'XX.XX.XX.Y' # Where XX.XX.XX.Y is the ip address on the node postgresql should listen on + postgresql['md5_auth_cidr_addresses'] = %w(AA.AA.AA.B/32) # Where AA.AA.AA.B is the IP address of the pgbouncer node + ``` + +1. Run `gitlab-ctl reconfigure` + + **Note:** If the database was already running, it will need to be restarted after reconfigure by running `gitlab-ctl restart postgresql`. + +1. On the node you are running pgbouncer on, make sure the following is set in `/etc/gitlab/gitlab.rb` + + ```ruby + pgbouncer['enable'] = true + pgbouncer['databases'] = { + gitlabhq_production: { + host: 'DATABASE_HOST', + user: 'pgbouncer', + password: 'PGBOUNCER_USER_PASSWORD_HASH' + } + } + ``` + +1. Run `gitlab-ctl reconfigure` + +1. On the node running unicorn, make sure the following is set in `/etc/gitlab/gitlab.rb` + + ```ruby + gitlab_rails['db_host'] = 'PGBOUNCER_HOST' + gitlab_rails['db_port'] = '6432' + gitlab_rails['db_password'] = 'SQL_USER_PASSWORD' + ``` + +1. Run `gitlab-ctl reconfigure` + +1. At this point, your instance should connect to the database through pgbouncer. If you are having issues, see the [Troubleshooting](#troubleshooting) section + +### Interacting with pgbouncer + +#### Administrative console + +As part of omnibus-gitlab, we provide a command `gitlab-ctl pgb-console` to automatically connect to the pgbouncer administrative console. Please see the [pgbouncer documentation](https://pgbouncer.github.io/usage.html#admin-console) for detailed instructions on how to interact with the console. + +To start a session, run + +```shell +# gitlab-ctl pgb-console +Password for user pgbouncer: +psql (9.6.8, server 1.7.2/bouncer) +Type "help" for help. 
+ +pgbouncer=# +``` + +The password you will be prompted for is the PGBOUNCER_USER_PASSWORD + +To get some basic information about the instance, run +```shell +pgbouncer=# show databases; show clients; show servers; + name | host | port | database | force_user | pool_size | reserve_pool | pool_mode | max_connections | current_connections +---------------------+-----------+------+---------------------+------------+-----------+--------------+-----------+-----------------+--------------------- + gitlabhq_production | 127.0.0.1 | 5432 | gitlabhq_production | | 100 | 5 | | 0 | 1 + pgbouncer | | 6432 | pgbouncer | pgbouncer | 2 | 0 | statement | 0 | 0 +(2 rows) + + type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link +| remote_pid | tls +------+-----------+---------------------+--------+-----------+-------+------------+------------+---------------------+---------------------+-----------+------ ++------------+----- + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44590 | 127.0.0.1 | 6432 | 2018-04-24 22:13:10 | 2018-04-24 22:17:10 | 0x12444c0 | +| 0 | + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44592 | 127.0.0.1 | 6432 | 2018-04-24 22:13:10 | 2018-04-24 22:17:10 | 0x12447c0 | +| 0 | + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44594 | 127.0.0.1 | 6432 | 2018-04-24 22:13:10 | 2018-04-24 22:17:10 | 0x1244940 | +| 0 | + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44706 | 127.0.0.1 | 6432 | 2018-04-24 22:14:22 | 2018-04-24 22:16:31 | 0x1244ac0 | +| 0 | + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44708 | 127.0.0.1 | 6432 | 2018-04-24 22:14:22 | 2018-04-24 22:15:15 | 0x1244c40 | +| 0 | + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44794 | 127.0.0.1 | 6432 | 2018-04-24 22:15:15 | 2018-04-24 22:15:15 | 0x1244dc0 | +| 0 | + C | gitlab | gitlabhq_production | active | 127.0.0.1 | 44798 | 127.0.0.1 | 6432 | 2018-04-24 22:15:15 | 2018-04-24 22:16:31 | 
0x1244f40 | +| 0 | + C | pgbouncer | pgbouncer | active | 127.0.0.1 | 44660 | 127.0.0.1 | 6432 | 2018-04-24 22:13:51 | 2018-04-24 22:17:12 | 0x1244640 | +| 0 | +(8 rows) + + type | user | database | state | addr | port | local_addr | local_port | connect_time | request_time | ptr | link | rem +ote_pid | tls +------+--------+---------------------+-------+-----------+------+------------+------------+---------------------+---------------------+-----------+------+---- +--------+----- + S | gitlab | gitlabhq_production | idle | 127.0.0.1 | 5432 | 127.0.0.1 | 35646 | 2018-04-24 22:15:15 | 2018-04-24 22:17:10 | 0x124dca0 | | + 19980 | +(1 row) +``` + +## Troubleshooting + +In case you are experiencing any issues connecting through pgbouncer, the first place to check is always the logs: + +```shell +# gitlab-ctl tail pgbouncer +``` + +Additionally, you can check the output from `show databases` in the [Administrative console](#administrative-console). In the output, you would expect to see values in the `host` field for the `gitlabhq_production` database. Additionally, `current_connections` should be greater than 1. diff --git a/doc/administration/high_availability/redis.md b/doc/administration/high_availability/redis.md index a52bc5c3b02..953edb51bc8 100644 --- a/doc/administration/high_availability/redis.md +++ b/doc/administration/high_availability/redis.md @@ -1,6 +1,103 @@ -# Configuring Redis for GitLab HA +# Configuring Redis for Scaling and High Availability -> Experimental Redis Sentinel support was [Introduced][ce-1877] in GitLab 8.11. +## Provide your own Redis instance **[CORE ONLY]** + +The following are the requirements for providing your own Redis instance: + +- Redis version 2.8 or higher. Version 3.2 or higher is recommended as this is + what ships with the GitLab Omnibus package. +- Standalone Redis or Redis high availability with Sentinel are supported. Redis + Cluster is not supported. 
+- Managed Redis from cloud providers such as AWS Elasticache will work. If these + services support high availability, be sure it is not the Redis Cluster type. + +Note the Redis node's IP address or hostname, port, and password (if required). +These will be necessary when configuring the GitLab application servers later. + +## Redis in a Scaled Environment + +This section is relevant for [Scaled Architecture](./README.md#scalable-architecture-examples) +environments including [Basic Scaling](./README.md#basic-scaling) and +[Full Scaling](./README.md#full-scaling). + +### Provide your own Redis instance **[CORE ONLY]** + +If you want to use your own deployed Redis instance(s), +see [Provide your own Redis instance](#provide-your-own-redis-instance-core-only) +for more details. However, you can use the GitLab Omnibus package to easily +deploy the bundled Redis. + +### Standalone Redis using GitLab Omnibus **[CORE ONLY]** + +The GitLab Omnibus package can be used to configure a standalone Redis server. +In this configuration Redis is not highly available, and represents a single +point of failure. However, in a scaled environment the objective is to allow +the environment to handle more users or to increase throughput. Redis itself +is generally stable and can handle many requests so it is an acceptable +trade off to have only a single instance. See [Scaling and High Availability](./README.md) +for an overview of GitLab scaling and high availability options. + +The steps below are the minimum necessary to configure a Redis server with +Omnibus: + +1. SSH into the Redis server. +1. [Download/install](https://about.gitlab.com/installation) the Omnibus GitLab + package you want using **steps 1 and 2** from the GitLab downloads page. + - Do not complete any other steps on the download page. + +1. 
Edit `/etc/gitlab/gitlab.rb` and add the contents: + + ```ruby + ## Enable Redis + redis['enable'] = true + + ## Disable all other services + sidekiq['enable'] = false + gitlab_workhorse['enable'] = false + unicorn['enable'] = false + postgresql['enable'] = false + nginx['enable'] = false + prometheus['enable'] = false + alertmanager['enable'] = false + pgbouncer_exporter['enable'] = false + gitlab_monitor['enable'] = false + gitaly['enable'] = false + + redis['bind'] = '0.0.0.0' + redis['port'] = '6379' + redis['password'] = 'SECRET_PASSWORD_HERE' + + gitlab_rails['auto_migrate'] = false + ``` + +1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect. +1. Note the Redis node's IP address or hostname, port, and + Redis password. These will be necessary when configuring the GitLab + application servers later. + +Advanced configuration options are supported and can be added if +needed. + +Continue configuration of other components by going +[back to Scaled Architectures](./README.md#scalable-architecture-examples) + +## Redis with High Availability + +This section is relevant for [High Availability Architecture](./README.md#high-availability-architecture-examples) +environments including [Horizontal](./README.md#horizontal), +[Hybrid](./README.md#hybrid), and +[Fully Distributed](./README.md#fully-distributed). + +### Provide your own Redis instance **[CORE ONLY]** + +If you want to use your own deployed Redis instance(s), +see [Provide your own Redis instance](#provide-your-own-redis-instance-core-only) +for more details. However, you can use the GitLab Omnibus package to easily +deploy the bundled Redis. + +### High Availability with GitLab Omnibus **[PREMIUM ONLY]** + +> Experimental Redis Sentinel support was [introduced in GitLab 8.11][ce-1877]. Starting with 8.14, Redis Sentinel is no longer experimental. If you've used it with versions `< 8.14` before, please check the updated documentation here. @@ -52,8 +149,6 @@ failure. 
Make sure that you read this document once as a whole before configuring the components below. -### High Availability with Sentinel - > **Notes:** > - Starting with GitLab `8.11`, you can configure a list of Redis Sentinel > servers that will monitor a group of Redis servers to provide failover support. @@ -267,10 +362,9 @@ The prerequisites for a HA Redis setup are the following: 1. Edit `/etc/gitlab/gitlab.rb` and add the contents: ```ruby - # Enable the master role and disable all other services in the machine - # (you can still enable Sentinel). - redis_master_role['enable'] = true - + # Specify server role as 'redis_master_role' + roles ['redis_master_role'] + # IP address pointing to a local IP that the other machines can reach to. # You can also set bind to '0.0.0.0' which listen in all interfaces. # If you really need to bind to an external accessible IP, make @@ -284,6 +378,7 @@ The prerequisites for a HA Redis setup are the following: # Set up password authentication for Redis (use the same password in all nodes). redis['password'] = 'redis-password-goes-here' ``` + 1. Only the primary GitLab application server should handle migrations. To prevent database migrations from running on upgrade, add the following @@ -295,6 +390,10 @@ The prerequisites for a HA Redis setup are the following: 1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect. +> Note: You can specify multiple roles like sentinel and redis as: +> roles ['redis_sentinel_role', 'redis_master_role']. Read more about high +> availability roles at https://docs.gitlab.com/omnibus/roles/ + ### Step 2. Configuring the slave Redis instances 1. SSH into the **slave** Redis server. @@ -307,11 +406,9 @@ The prerequisites for a HA Redis setup are the following: 1. Edit `/etc/gitlab/gitlab.rb` and add the contents: ```ruby - # Enable the slave role and disable all other services in the machine - # (you can still enable Sentinel). 
This will also set automatically - # `redis['master'] = false`. - redis_slave_role['enable'] = true - + # Specify server role as 'redis_slave_role' + roles ['redis_slave_role'] + # IP address pointing to a local IP that the other machines can reach to. # You can also set bind to '0.0.0.0' which listen in all interfaces. # If you really need to bind to an external accessible IP, make @@ -333,17 +430,19 @@ The prerequisites for a HA Redis setup are the following: #redis['master_port'] = 6379 ``` -1. To prevent database migrations from running on upgrade, run: +1. To prevent reconfigure from running automatically on upgrade, run: ``` sudo touch /etc/gitlab/skip-auto-reconfigure ``` - Only the primary GitLab application server should handle migrations. - 1. [Reconfigure Omnibus GitLab][reconfigure] for the changes to take effect. 1. Go through the steps again for all the other slave nodes. +> Note: You can specify multiple roles like sentinel and redis as: +> roles ['redis_sentinel_role', 'redis_slave_role']. Read more about high +> availability roles at https://docs.gitlab.com/omnibus/roles/ + --- These values don't have to be changed again in `/etc/gitlab/gitlab.rb` after @@ -397,13 +496,13 @@ multiple machines with the Sentinel daemon. be duplicate below): ```ruby - redis_sentinel_role['enable'] = true + roles ['redis_sentinel_role'] # Must be the same in every sentinel node redis['master_name'] = 'gitlab-redis' # The same password for Redis authentication you set up for the master node. - redis['password'] = 'redis-password-goes-here' + redis['master_password'] = 'redis-password-goes-here' # The IP of the master Redis node. redis['master_ip'] = '10.0.0.1' @@ -570,8 +669,7 @@ or a failover promotes a different **Master** node. 
In `/etc/gitlab/gitlab.rb`: ```ruby -redis_master_role['enable'] = true -redis_sentinel_role['enable'] = true +roles ['redis_sentinel_role', 'redis_master_role'] redis['bind'] = '10.0.0.1' redis['port'] = 6379 redis['password'] = 'redis-password-goes-here' @@ -593,8 +691,7 @@ sentinel['quorum'] = 2 In `/etc/gitlab/gitlab.rb`: ```ruby -redis_slave_role['enable'] = true -redis_sentinel_role['enable'] = true +roles ['redis_sentinel_role', 'redis_slave_role'] redis['bind'] = '10.0.0.2' redis['port'] = 6379 redis['password'] = 'redis-password-goes-here' @@ -616,8 +713,7 @@ sentinel['quorum'] = 2 In `/etc/gitlab/gitlab.rb`: ```ruby -redis_slave_role['enable'] = true -redis_sentinel_role['enable'] = true +roles ['redis_sentinel_role', 'redis_slave_role'] redis['bind'] = '10.0.0.3' redis['port'] = 6379 redis['password'] = 'redis-password-goes-here' @@ -640,7 +736,7 @@ In `/etc/gitlab/gitlab.rb`: ```ruby redis['master_name'] = 'gitlab-redis' -redis['password'] = 'redis-password-goes-here' +redis['master_password'] = 'redis-password-goes-here' gitlab_rails['redis_sentinels'] = [ {'host' => '10.0.0.1', 'port' => 26379}, {'host' => '10.0.0.2', 'port' => 26379}, @@ -761,15 +857,11 @@ Before proceeding with the troubleshooting below, check your firewall rules: ### Troubleshooting Redis replication You can check if everything is correct by connecting to each server using -`redis-cli` application, and sending the `INFO` command. +`redis-cli` application, and sending the `info replication` command as below. -If authentication was correctly defined, it should fail with: -`NOAUTH Authentication required` error. Try to authenticate with the -previous defined password with `AUTH redis-password-goes-here` and -try the `INFO` command again. - -Look for the `# Replication` section where you should see some important -information like the `role` of the server. 
+``` +/opt/gitlab/embedded/bin/redis-cli -a <redis-password> info replication +``` When connected to a `master` redis, you will see the number of connected `slaves`, and a list of each with connection details: @@ -839,7 +931,7 @@ To make sure your configuration is correct: 1. Run in the console: ```ruby - redis = Redis.new(Gitlab::Redis.params) + redis = Redis.new(Gitlab::Redis::SharedState.params) redis.info ``` diff --git a/doc/administration/img/db_load_balancing_postgres_stats.png b/doc/administration/img/db_load_balancing_postgres_stats.png Binary files differ new file mode 100644 index 00000000000..8b311616e7b --- /dev/null +++ b/doc/administration/img/db_load_balancing_postgres_stats.png diff --git a/doc/administration/img/high_availability/fully-distributed.png b/doc/administration/img/high_availability/fully-distributed.png Binary files differ new file mode 100644 index 00000000000..ad23207134e --- /dev/null +++ b/doc/administration/img/high_availability/fully-distributed.png diff --git a/doc/administration/img/high_availability/geo-ha-diagram.png b/doc/administration/img/high_availability/geo-ha-diagram.png Binary files differ new file mode 100644 index 00000000000..da5d612827c --- /dev/null +++ b/doc/administration/img/high_availability/geo-ha-diagram.png diff --git a/doc/administration/img/high_availability/horizontal.png b/doc/administration/img/high_availability/horizontal.png Binary files differ new file mode 100644 index 00000000000..c3bd489d96f --- /dev/null +++ b/doc/administration/img/high_availability/horizontal.png diff --git a/doc/administration/img/high_availability/hybrid.png b/doc/administration/img/high_availability/hybrid.png Binary files differ new file mode 100644 index 00000000000..7d4a56bf0ea --- /dev/null +++ b/doc/administration/img/high_availability/hybrid.png diff --git a/doc/administration/img/instance_review_button.png b/doc/administration/img/instance_review_button.png Binary files differ new file mode 100644 index 
00000000000..b7604d7c7e5 --- /dev/null +++ b/doc/administration/img/instance_review_button.png diff --git a/doc/administration/incoming_email.md b/doc/administration/incoming_email.md index 658b2f55d30..4de54e03c8c 100644 --- a/doc/administration/incoming_email.md +++ b/doc/administration/incoming_email.md @@ -10,6 +10,8 @@ GitLab has several features based on receiving incoming emails: - [New merge request by email](../user/project/merge_requests/index.md#create-new-merge-requests-by-email): allow GitLab users to create a new merge request by sending an email to a user-specific email address. +- [Service Desk](../user/project/service_desk.md): provide e-mail support to + your customers through GitLab. ## Requirements diff --git a/doc/administration/index.md b/doc/administration/index.md index b723edfc78f..d208d9a736a 100644 --- a/doc/administration/index.md +++ b/doc/administration/index.md @@ -32,8 +32,14 @@ Learn how to install, configure, update, and maintain your GitLab instance. ### Installing GitLab - [Install](../install/README.md): Requirements, directory structures, and installation methods. + - [Database load balancing](database_load_balancing.md): Distribute database queries among multiple database servers. **[STARTER ONLY]** + - [Omnibus support for external MySQL DB](https://docs.gitlab.com/omnibus/settings/database.html#using-a-mysql-database-management-server-enterprise-edition-only): Omnibus package supports configuring an external MySQL database. **[STARTER ONLY]** + - [Omnibus support for log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-shipping-gitlab-enterprise-edition-only) **[STARTER ONLY]** - [High Availability](high_availability/README.md): Configure multiple servers for scaling or high availability. - [High Availability on AWS](../university/high-availability/aws/README.md): Set up GitLab HA on Amazon AWS. 
+- [Geo](geo/replication/index.md): Replicate your GitLab instance to other geographic locations as a read-only fully operational version. **[PREMIUM ONLY]** +- [Disaster Recovery](geo/disaster_recovery/index.md): Quickly fail-over to a different site with minimal effort in a disaster situation. **[PREMIUM ONLY]** +- [Pivotal Tile](../install/pivotal/index.md): Deploy GitLab as a pre-configured appliance using Ops Manager (BOSH) for Pivotal Cloud Foundry. **[PREMIUM ONLY]** ### Configuring GitLab @@ -43,8 +49,8 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Usage statistics, version check, and usage ping](../user/admin_area/settings/usage_statistics.md): Enable or disable information about your instance to be sent to GitLab, Inc. - [Polling](polling.md): Configure how often the GitLab UI polls for updates. - [GitLab Pages configuration](pages/index.md): Enable and configure GitLab Pages. -- [GitLab Pages configuration for GitLab source installations](pages/source.md): Enable and configure GitLab Pages on - [source installations](../install/installation.md#installation-from-source). +- [GitLab Pages configuration for GitLab source installations](pages/source.md): Enable and configure GitLab Pages on [source installations](../install/installation.md#installation-from-source). +- [Uploads configuration](uploads.md): Configure GitLab uploads storage. - [Environment variables](environment_variables.md): Supported environment variables that can be used to override their defaults values in order to configure GitLab. - [Plugins](plugins.md): With custom plugins, GitLab administrators can introduce custom integrations without modifying GitLab's source code. - [Enforcing Terms of Service](../user/admin_area/settings/terms.md) @@ -53,6 +59,9 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Diff limits](../user/admin_area/diff_limits.md): Configure the diff rendering size limits of branch comparison pages. 
- [Merge request diffs storage](merge_request_diffs.md): Configure merge requests diffs external storage. - [Broadcast Messages](../user/admin_area/broadcast_messages.md): Send messages to GitLab users through the UI. +- [Elasticsearch](../integration/elasticsearch.md): Enable Elasticsearch to empower GitLab's Advanced Global Search. Useful when you deal with a huge amount of data. **[STARTER ONLY]** +- [External Classification Policy Authorization](../user/admin_area/settings/external_authorization.md) **[PREMIUM ONLY]** +- [Upload a license](../user/admin_area/license.md): Upload a license to unlock features that are in paid tiers of GitLab. **[STARTER ONLY]** - [Admin Area](../user/admin_area/index.md): for self-managed instance-wide configuration and maintenance. #### Customizing GitLab's appearance @@ -62,6 +71,7 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Branded login page](../customization/branded_login_page.md): Customize the login page with your own logo, title, and description. - [Welcome message](../customization/welcome_message.md): Add a custom welcome message to the sign-in page. - ["New Project" page](../customization/new_project_page.md): Customize the text to be displayed on the page that opens whenever your users create a new project. +- [Additional custom email text](../user/admin_area/settings/email.md#custom-additional-text): Add additional custom text to emails sent from GitLab. **[PREMIUM ONLY]** ### Maintaining GitLab @@ -95,7 +105,16 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Libravatar](../customization/libravatar.md): Use Libravatar instead of Gravatar for user avatars. - [Sign-up restrictions](../user/admin_area/settings/sign_up_restrictions.md): block email addresses of specific domains, or whitelist only specific domains. 
- [Access restrictions](../user/admin_area/settings/visibility_and_access_controls.md#enabled-git-access-protocols): Define which Git access protocols can be used to talk to GitLab (SSH, HTTP, HTTPS). -- [Authentication and Authorization](auth/README.md): Configure external authentication with LDAP, SAML, CAS and additional providers. See also other [authentication](../topics/authentication/index.md#gitlab-administrators) topics (for example, enforcing 2FA). +- [Authentication and Authorization](auth/README.md): Configure external authentication with LDAP, SAML, CAS and additional providers. + - [Sync LDAP](auth/ldap-ee.md) **[STARTER ONLY]** + - [Kerberos authentication](../integration/kerberos.md) **[STARTER ONLY]** + - See also other [authentication](../topics/authentication/index.md#gitlab-administrators) topics (for example, enforcing 2FA). +- [Email users](../tools/email.md): Email GitLab users from within GitLab. **[STARTER ONLY]** +- [User Cohorts](../user/admin_area/user_cohorts.md): Display the monthly cohorts of new users and their activities over time. +- [Audit logs and events](audit_events.md): View the changes made within the GitLab server for: + - Groups and projects. **[STARTER]** + - Instances. **[PREMIUM ONLY]** +- [Auditor users](auditor_users.md): Users with read-only access to all projects, groups, and other resources on the GitLab instance. **[PREMIUM ONLY]** - [Incoming email](incoming_email.md): Configure incoming emails to allow users to [reply by email], create [issues by email] and [merge requests by email], and to enable [Service Desk]. @@ -107,6 +126,7 @@ Learn how to install, configure, update, and maintain your GitLab instance. 
[reply by email]: reply_by_email.md [issues by email]: ../user/project/issues/create_new_issue.md#new-issue-via-email [merge requests by email]: ../user/project/merge_requests/index.md#create-new-merge-requests-by-email +[Service Desk]: ../user/project/service_desk.md ## Project settings @@ -115,13 +135,15 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Gitaly](gitaly/index.md): Configuring Gitaly, GitLab's Git repository storage service. - [Default labels](../user/admin_area/labels.html): Create labels that will be automatically added to every new project. - [Restrict the use of public or internal projects](../public_access/public_access.md#restricting-the-use-of-public-or-internal-projects): Restrict the use of visibility levels for users when they create a project or a snippet. -- [Custom project templates](https://docs.gitlab.com/ee/user/admin_area/custom_project_templates.html): Configure a set of projects to be used as custom templates when creating a new project. **[PREMIUM ONLY]** +- [Custom project templates](../user/admin_area/custom_project_templates.md): Configure a set of projects to be used as custom templates when creating a new project. **[PREMIUM ONLY]** +- [Packages](packages.md): Enable GitLab to act as a Maven repository or NPM registry. **[PREMIUM ONLY]** ### Repository settings - [Repository checks](repository_checks.md): Periodic Git repository checks. - [Repository storage paths](repository_storage_paths.md): Manage the paths used to store repositories. - [Repository storage rake tasks](raketasks/storage.md): A collection of rake tasks to list and migrate existing projects and attachments associated with it from Legacy storage to Hashed storage. +- [Limit repository size](../user/admin_area/settings/account_and_limit_settings.md): Set a hard limit for your repositories' size. 
**[STARTER ONLY]** ## Continuous Integration settings @@ -158,6 +180,10 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Request Profiling](monitoring/performance/request_profiling.md): Get a detailed profile on slow requests. - [Performance Bar](monitoring/performance/performance_bar.md): Get performance information for the current page. +## Analytics + +- [Pseudonymizer](pseudonymizer.md): Export data from GitLab's database to CSV files in a secure way. + ## Troubleshooting - [Debugging tips](troubleshooting/debug.md): Tips to debug problems when things go wrong diff --git a/doc/administration/instance_review.md b/doc/administration/instance_review.md new file mode 100644 index 00000000000..5781ce4150c --- /dev/null +++ b/doc/administration/instance_review.md @@ -0,0 +1,17 @@ +# Instance Review + +> [Introduced][6995] in [GitLab Core][ee] 11.3. + +If you are running a medium-sized instance of GitLab Core, you qualify for a free Instance Review. You can find the button in the User menu. + +![Instance Review button](img/instance_review_button.png) + +When you click the button, you will be redirected to a form with prefilled data obtained from your instance. + +Once you submit the data to GitLab Inc., you can see the initial report. + +Additionally, you will be contacted by our team for further review, which should help you improve your usage of GitLab. + +[6995]: https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/6995 +[ee]: https://about.gitlab.com/pricing/ + diff --git a/doc/administration/maven_packages.md b/doc/administration/maven_packages.md new file mode 100644 index 00000000000..d8551f64ece --- /dev/null +++ b/doc/administration/maven_packages.md @@ -0,0 +1,5 @@ +--- +redirect_to: 'packages.md' +--- + +This document was moved to [another location](packages.md). 
diff --git a/doc/administration/maven_repository.md b/doc/administration/maven_repository.md new file mode 100644 index 00000000000..d8551f64ece --- /dev/null +++ b/doc/administration/maven_repository.md @@ -0,0 +1,5 @@ +--- +redirect_to: 'packages.md' +--- + +This document was moved to [another location](packages.md). diff --git a/doc/administration/monitoring/performance/img/request_profiling_token.png b/doc/administration/monitoring/performance/img/request_profiling_token.png Binary files differ index 9f3dd7f08ca..ee819fcb437 100644 --- a/doc/administration/monitoring/performance/img/request_profiling_token.png +++ b/doc/administration/monitoring/performance/img/request_profiling_token.png diff --git a/doc/administration/monitoring/prometheus/gitlab_metrics.md b/doc/administration/monitoring/prometheus/gitlab_metrics.md index 3bfcc9a289e..da7094338a0 100644 --- a/doc/administration/monitoring/prometheus/gitlab_metrics.md +++ b/doc/administration/monitoring/prometheus/gitlab_metrics.md @@ -48,6 +48,47 @@ The following metrics are available: | unicorn_active_connections | Gauge | 11.0 | The number of active Unicorn connections (workers) | | unicorn_queued_connections | Gauge | 11.0 | The number of queued Unicorn connections | +## Sidekiq Metrics available + +Sidekiq jobs may also gather metrics, and these metrics can be accessed if the Sidekiq exporter is enabled (e.g. via +the `monitoring.sidekiq_exporter` configuration option in `gitlab.yml`). 
+ +| Metric | Type | Since | Description | Labels | +|:-------------------------------------------- |:------- |:----- |:----------- |:------ | +| geo_db_replication_lag_seconds | Gauge | 10.2 | Database replication lag (seconds) | url +| geo_repositories | Gauge | 10.2 | Total number of repositories available on primary | url +| geo_repositories_synced | Gauge | 10.2 | Number of repositories synced on secondary | url +| geo_repositories_failed | Gauge | 10.2 | Number of repositories failed to sync on secondary | url +| geo_lfs_objects | Gauge | 10.2 | Total number of LFS objects available on primary | url +| geo_lfs_objects_synced | Gauge | 10.2 | Number of LFS objects synced on secondary | url +| geo_lfs_objects_failed | Gauge | 10.2 | Number of LFS objects failed to sync on secondary | url +| geo_attachments | Gauge | 10.2 | Total number of file attachments available on primary | url +| geo_attachments_synced | Gauge | 10.2 | Number of attachments synced on secondary | url +| geo_attachments_failed | Gauge | 10.2 | Number of attachments failed to sync on secondary | url +| geo_last_event_id | Gauge | 10.2 | Database ID of the latest event log entry on the primary | url +| geo_last_event_timestamp | Gauge | 10.2 | UNIX timestamp of the latest event log entry on the primary | url +| geo_cursor_last_event_id | Gauge | 10.2 | Last database ID of the event log processed by the secondary | url +| geo_cursor_last_event_timestamp | Gauge | 10.2 | Last UNIX timestamp of the event log processed by the secondary | url +| geo_status_failed_total | Counter | 10.2 | Number of times retrieving the status from the Geo Node failed | url +| geo_last_successful_status_check_timestamp | Gauge | 10.2 | Last timestamp when the status was successfully updated | url +| geo_lfs_objects_synced_missing_on_primary | Gauge | 10.7 | Number of LFS objects marked as synced due to the file missing on the primary | url +| geo_job_artifacts_synced_missing_on_primary | Gauge | 10.7 | Number of job 
artifacts marked as synced due to the file missing on the primary | url +| geo_attachments_synced_missing_on_primary | Gauge | 10.7 | Number of attachments marked as synced due to the file missing on the primary | url +| geo_repositories_checksummed_count | Gauge | 10.7 | Number of repositories checksummed on primary | url +| geo_repositories_checksum_failed_count | Gauge | 10.7 | Number of repositories failed to calculate the checksum on primary | url +| geo_wikis_checksummed_count | Gauge | 10.7 | Number of wikis checksummed on primary | url +| geo_wikis_checksum_failed_count | Gauge | 10.7 | Number of wikis failed to calculate the checksum on primary | url +| geo_repositories_verified_count | Gauge | 10.7 | Number of repositories verified on secondary | url +| geo_repositories_verification_failed_count | Gauge | 10.7 | Number of repositories failed to verify on secondary | url +| geo_repositories_checksum_mismatch_count | Gauge | 10.7 | Number of repositories that checksum mismatch on secondary | url +| geo_wikis_verified_count | Gauge | 10.7 | Number of wikis verified on secondary | url +| geo_wikis_verification_failed_count | Gauge | 10.7 | Number of wikis failed to verify on secondary | url +| geo_wikis_checksum_mismatch_count | Gauge | 10.7 | Number of wikis that checksum mismatch on secondary | url +| geo_repositories_checked_count | Gauge | 11.1 | Number of repositories that have been checked via `git fsck` | url +| geo_repositories_checked_failed_count | Gauge | 11.1 | Number of repositories that have a failure from `git fsck` | url +| geo_repositories_retrying_verification_count | Gauge | 11.2 | Number of repositories verification failures that Geo is actively trying to correct on secondary | url +| geo_wikis_retrying_verification_count | Gauge | 11.2 | Number of wikis verification failures that Geo is actively trying to correct on secondary | url + ### Ruby metrics Some basic Ruby runtime metrics are available: diff --git 
a/doc/administration/monitoring/prometheus/index.md b/doc/administration/monitoring/prometheus/index.md index 20d7ef9bb74..095f126f4b2 100644 --- a/doc/administration/monitoring/prometheus/index.md +++ b/doc/administration/monitoring/prometheus/index.md @@ -202,6 +202,12 @@ The Postgres exporter allows you to measure various PostgreSQL metrics. [➔ Read more about the Postgres exporter.](postgres_exporter.md) +### PgBouncer exporter + +The PgBouncer exporter allows you to measure various PgBouncer metrics. + +[➔ Read more about the PgBouncer exporter.](pgbouncer_exporter.md) + ### GitLab monitor exporter The GitLab monitor exporter allows you to measure various GitLab metrics, pulled from Redis and the database. diff --git a/doc/administration/monitoring/prometheus/pgbouncer_exporter.md b/doc/administration/monitoring/prometheus/pgbouncer_exporter.md new file mode 100644 index 00000000000..d76834fdbea --- /dev/null +++ b/doc/administration/monitoring/prometheus/pgbouncer_exporter.md @@ -0,0 +1,34 @@ +# PgBouncer exporter + +>**Note:** +Available since [Omnibus GitLab 11.0][2493]. For installations from source +you'll have to install and configure it yourself. + +The [PgBouncer exporter] allows you to measure various PgBouncer metrics. + +To enable the PgBouncer exporter: + +1. [Enable Prometheus](index.md#configuring-prometheus) +1. Edit `/etc/gitlab/gitlab.rb` +1. Add or find and uncomment the following line, making sure it's set to `true`: + + ```ruby + pgbouncer_exporter['enable'] = true + ``` + +1. Save the file and [reconfigure GitLab][reconfigure] for the changes to + take effect. + +Prometheus will now automatically begin collecting performance data from +the PgBouncer exporter exposed under `localhost:9188`. + +The PgBouncer exporter will also be enabled by default if the [pgbouncer_role][postgres roles] +is enabled. 
+ +[← Back to the main Prometheus page](index.md) + +[2493]: https://gitlab.com/gitlab-org/omnibus-gitlab/merge_requests/2493 +[PgBouncer exporter]: https://github.com/stanhu/pgbouncer_exporter +[postgres roles]: https://docs.gitlab.com/omnibus/roles/#postgres-roles +[prometheus]: https://prometheus.io +[reconfigure]: ../../restart_gitlab.md#omnibus-gitlab-reconfigure diff --git a/doc/administration/npm_registry.md b/doc/administration/npm_registry.md new file mode 100644 index 00000000000..d8551f64ece --- /dev/null +++ b/doc/administration/npm_registry.md @@ -0,0 +1,5 @@ +--- +redirect_to: 'packages.md' +--- + +This document was moved to [another location](packages.md). diff --git a/doc/administration/operations/cleaning_up_redis_sessions.md b/doc/administration/operations/cleaning_up_redis_sessions.md index b45ca99fd80..20c19445404 100644 --- a/doc/administration/operations/cleaning_up_redis_sessions.md +++ b/doc/administration/operations/cleaning_up_redis_sessions.md @@ -27,7 +27,7 @@ rcli() { # This example works for Omnibus installations of GitLab 7.3 or newer. For an # installation from source you will have to change the socket path and the # path to redis-cli. - sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket "$@" + sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.shared_state.socket "$@" } # test the new shell function; the response should be PONG diff --git a/doc/administration/operations/extra_sidekiq_processes.md b/doc/administration/operations/extra_sidekiq_processes.md new file mode 100644 index 00000000000..ee7c474bec5 --- /dev/null +++ b/doc/administration/operations/extra_sidekiq_processes.md @@ -0,0 +1,130 @@ +# Extra Sidekiq Processes + +GitLab Enterprise Edition allows one to start an extra set of Sidekiq processes +besides the default one. These processes can be used to consume a dedicated set +of queues. 
This can be used to ensure certain queues always have dedicated +workers, no matter the amount of jobs that need to be processed. + +## Starting Extra Processes + +Starting extra Sidekiq processes can be done using the command +`bin/sidekiq-cluster`. This command takes arguments using the following syntax: + +```bash +sidekiq-cluster [QUEUE,QUEUE,...] [QUEUE, ...] +``` + +Each separate argument denotes a group of queues that have to be processed by a +Sidekiq process. Multiple queues can be processed by the same process by +separating them with a comma instead of a space. + +Instead of a queue, a queue namespace can also be provided, to have the process +automatically listen on all queues in that namespace without needing to +explicitly list all the queue names. For more information about queue namespaces, +see the relevant section in the +[Sidekiq style guide](../../development/sidekiq_style_guide.md#queue-namespaces). + +For example, say you want to start 2 extra processes: one to process the +"process_commit" queue, and one to process the "post_receive" queue. This can be +done as follows: + +```bash +sidekiq-cluster process_commit post_receive +``` + +If you instead want to start one process processing both queues you'd use the +following syntax: + +```bash +sidekiq-cluster process_commit,post_receive +``` + +If you want to have one Sidekiq process process the "process_commit" and +"post_receive" queues, and one process to process the "gitlab_shell" queue, +you'd use the following: + +```bash +sidekiq-cluster process_commit,post_receive gitlab_shell +``` + +## Concurrency + +Each process started using `sidekiq-cluster` starts with a number of threads +that equals the number of queues, plus one spare thread. For example, a process +that processes "process_commit" and "post_receive" will use 3 threads in total. + +## Monitoring + +The `sidekiq-cluster` command will not terminate once it has started the desired +amount of Sidekiq processes. 
Instead the process will continue running and +forward any signals to the child processes. This makes it easy to stop all +Sidekiq processes as you simply send a signal to the `sidekiq-cluster` process, +instead of having to send it to the individual processes. + +If the `sidekiq-cluster` process crashes or is SIGKILL'd, the child processes +will terminate themselves after a few seconds. This ensures you don't end up +with zombie Sidekiq processes. + +All of this makes monitoring the processes fairly easy. Simply hook up +`sidekiq-cluster` to your supervisor of choice (e.g. runit) and you're good to +go. + +If a child process dies, the `sidekiq-cluster` command will signal all remaining +processes to terminate, then terminate itself. This removes the need for +`sidekiq-cluster` to re-implement complex process monitoring/restarting code. +Instead you should make sure your supervisor restarts the `sidekiq-cluster` +process whenever necessary. + +## PID Files + +The `sidekiq-cluster` command can store its PID in a file. By default no PID +file is written, but this can be changed by passing the `--pidfile` option to +`sidekiq-cluster`. For example: + +```bash +sidekiq-cluster --pidfile /var/run/gitlab/sidekiq_cluster.pid process_commit +``` + +Keep in mind that the PID file will contain the PID of the `sidekiq-cluster` +command, and not the PID(s) of the started Sidekiq processes. + +## Environment + +The Rails environment can be set by passing the `--environment` flag to the +`sidekiq-cluster` command, or by setting `RAILS_ENV` to a non-empty value. The +default value is "development". + +## All Queues With Exceptions + +You're able to run all queues in the `sidekiq_queues.yml` file on a single or +multiple processes with exceptions using the `--negate` flag. + +For example, say you want to run a single process for all queues, +except "process_commit" and "post_receive". 
You can do so by executing: + +```bash +sidekiq-cluster process_commit,post_receive --negate +``` + +For multiple processes of all queues (except "process_commit" and "post_receive"): + +```bash +sidekiq-cluster process_commit,post_receive process_commit,post_receive --negate +``` + +## Limiting Concurrency + +By default, `sidekiq-cluster` will spin up extra Sidekiq processes that use +one thread per queue up to a maximum of 50. If you wish to change the cap, use +the `-m N` option. For example, this would cap the maximum number of threads to 1: + +```bash +sidekiq-cluster process_commit,post_receive -m 1 +``` + +For each queue group, the concurrency factor will be set to min(number of +queues, N). Setting the value to 0 will disable the limit. + +Note that each thread requires a Redis connection, so adding threads may +increase Redis latency and potentially cause client timeouts. See the [Sidekiq +documentation about Redis](https://github.com/mperham/sidekiq/wiki/Using-Redis) for more details. diff --git a/doc/administration/operations/fast_ssh_key_lookup.md b/doc/administration/operations/fast_ssh_key_lookup.md index c293df3fc57..6ba5768ebfd 100644 --- a/doc/administration/operations/fast_ssh_key_lookup.md +++ b/doc/administration/operations/fast_ssh_key_lookup.md @@ -30,6 +30,19 @@ instructions will break installations using older versions of OpenSSH, such as those included with CentOS 6 as of September 2017. If you want to use this feature for CentOS 6, follow [the instructions on how to build and install a custom OpenSSH package](#compiling-a-custom-version-of-openssh-for-centos-6) before continuing. +## Fast lookup is required for Geo + +By default, GitLab manages an `authorized_keys` file, which contains all the +public SSH keys for users allowed to access GitLab. However, to maintain a +single source of truth, [Geo](../../gitlab-geo/README.md) needs to be configured to perform SSH fingerprint +lookups via database lookup. 
+ +As part of [setting up Geo](../geo/replication/index.md#setup-instructions), +you will be required to follow the steps outlined below for both the primary and +secondary nodes, but note that the `Write to "authorized keys" file` checkbox +only needs to be unchecked on the primary node since it will be reflected +automatically on the secondary if database replication is working. + ## Setting up fast lookup via GitLab Shell GitLab Shell provides a way to authorize SSH users via a fast, indexed lookup diff --git a/doc/administration/operations/index.md b/doc/administration/operations/index.md index 32f36d68c50..df795a48169 100644 --- a/doc/administration/operations/index.md +++ b/doc/administration/operations/index.md @@ -11,6 +11,7 @@ Keep your GitLab instance up and running smoothly. by GitLab to another file system or another server. - [Sidekiq MemoryKiller](sidekiq_memory_killer.md): Configure Sidekiq MemoryKiller to restart Sidekiq. +- [Extra Sidekiq operations](extra_sidekiq_processes.md): Configure an extra set of Sidekiq processes to ensure certain queues always have dedicated workers, no matter the amount of jobs that need to be processed. **[STARTER ONLY]** - [Unicorn](unicorn.md): Understand Unicorn and unicorn-worker-killer. - Speed up SSH operations by [Authorizing SSH users via a fast, indexed lookup to the GitLab database](fast_ssh_key_lookup.md), and/or diff --git a/doc/administration/packages.md b/doc/administration/packages.md new file mode 100644 index 00000000000..4d60c0c7638 --- /dev/null +++ b/doc/administration/packages.md @@ -0,0 +1,174 @@ +# GitLab Packages administration **[PREMIUM ONLY]** + +GitLab Packages allows organizations to utilize GitLab as a private repository +for a variety of common package managers. Users are able to build and publish +packages, which can be easily consumed as a dependency in downstream projects. 
+ +The Packages feature allows GitLab to act as a repository for the following: + +| Software repository | Description | Available in GitLab version | +| ------------------- | ----------- | --------------------------- | +| [Maven Repository](../user/project/packages/maven_repository.md) | The GitLab Maven Repository enables every project in GitLab to have its own space to store [Maven](https://maven.apache.org/) packages. | 11.3+ | +| [NPM Registry](../user/project/packages/npm_registry.md) | The GitLab NPM Registry enables every project in GitLab to have its own space to store [NPM](https://www.npmjs.com/) packages. | 11.7+ | + +Don't see your package management system supported yet? +Please consider contributing +to GitLab. This [development documentation](../development/packages.md) will guide you through the process. + +## Enabling the Packages feature + +NOTE: **Note:** +After the Packages feature is enabled, the repositories are available +for all new projects by default. To enable it for existing projects, users will +have to explicitly do so in the project's settings. + +To enable the Packages feature: + +**Omnibus GitLab installations** + +1. Edit `/etc/gitlab/gitlab.rb` and add the following line: + + ```ruby + gitlab_rails['packages_enabled'] = true + ``` + +1. Save the file and [reconfigure GitLab][] for the changes to take effect. + +**Installations from source** + +1. After the installation is complete, you will have to configure the `packages` + section in `config/gitlab.yml`. Set `enabled` to `true` to enable it: + + ```yaml + packages: + enabled: true + ``` +1. [Restart GitLab] for the changes to take effect. + +## Changing the storage path + +By default, the packages are stored locally, but you can change the default +local location or even use object storage. 
+
+### Changing the local storage path
+
+The packages for Omnibus GitLab installations are stored under
+`/var/opt/gitlab/gitlab-rails/shared/packages/` and for source
+installations under `shared/packages/` (relative to the git homedir).
+To change the local storage path:
+
+**Omnibus GitLab installations**
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following line:
+
+    ```ruby
+    gitlab_rails['packages_storage_path'] = "/mnt/packages"
+    ```
+
+1. Save the file and [reconfigure GitLab][] for the changes to take effect.
+
+**Installations from source**
+
+1. Edit the `packages` section in `config/gitlab.yml`:
+
+    ```yaml
+    packages:
+      enabled: true
+      storage_path: shared/packages
+    ```
+1. [Restart GitLab] for the changes to take effect.
+
+### Using object storage
+
+Instead of relying on the local storage, you can use an object storage to
+upload packages:
+
+**Omnibus GitLab installations**
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following lines (uncomment where
+    necessary):
+
+    ```ruby
+    gitlab_rails['packages_enabled'] = true
+    gitlab_rails['packages_storage_path'] = "/var/opt/gitlab/gitlab-rails/shared/packages"
+    gitlab_rails['packages_object_store_enabled'] = true
+    gitlab_rails['packages_object_store_remote_directory'] = "packages" # The bucket name.
+    gitlab_rails['packages_object_store_direct_upload'] = false # Use Object Storage directly for uploads instead of background uploads if enabled (Default: false).
+    gitlab_rails['packages_object_store_background_upload'] = true # Temporary option to limit automatic upload (Default: true).
+    gitlab_rails['packages_object_store_proxy_download'] = false # Passthrough all downloads via GitLab instead of using Redirects to Object Storage.
+    gitlab_rails['packages_object_store_connection'] = {
+      ##
+      ## If the provider is AWS S3, uncomment the following
+      ##
+      #'provider' => 'AWS',
+      #'region' => 'eu-west-1',
+      #'aws_access_key_id' => 'AWS_ACCESS_KEY_ID',
+      #'aws_secret_access_key' => 'AWS_SECRET_ACCESS_KEY',
+      ##
+      ## If the provider is other than AWS (an S3-compatible one), uncomment the following
+      ##
+      #'host' => 's3.amazonaws.com',
+      #'aws_signature_version' => 4, # For creation of signed URLs. Set to 2 if provider does not support v4.
+      #'endpoint' => 'https://s3.amazonaws.com', # Useful for S3-compliant services such as DigitalOcean Spaces.
+      #'path_style' => false # If true, use 'host/bucket_name/object' instead of 'bucket_name.host/object'.
+    }
+    ```
+
+1. Save the file and [reconfigure GitLab][] for the changes to take effect.
+
+**Installations from source**
+
+1. Edit the `packages` section in `config/gitlab.yml` (uncomment where necessary):
+
+    ```yaml
+    packages:
+      enabled: true
+      ##
+      ## The location where build packages are stored (default: shared/packages).
+      ##
+      #storage_path: shared/packages
+      object_store:
+        enabled: false
+        remote_directory: packages # The bucket name.
+        #direct_upload: false # Use Object Storage directly for uploads instead of background uploads if enabled (Default: false).
+        #background_upload: true # Temporary option to limit automatic upload (Default: true).
+        #proxy_download: false # Passthrough all downloads via GitLab instead of using Redirects to Object Storage.
+        connection:
+          ##
+          ## If the provider is AWS S3, uncomment the following
+          ##
+          #provider: AWS
+          #region: us-east-1
+          #aws_access_key_id: AWS_ACCESS_KEY_ID
+          #aws_secret_access_key: AWS_SECRET_ACCESS_KEY
+          ##
+          ## If the provider is other than AWS (an S3-compatible one), uncomment the following
+          ##
+          #host: 's3.amazonaws.com' # default: s3.amazonaws.com.
+          #aws_signature_version: 4 # For creation of signed URLs. Set to 2 if provider does not support v4.
+          #endpoint: 'https://s3.amazonaws.com' # Useful for S3-compliant services such as DigitalOcean Spaces.
+          #path_style: false # If true, use 'host/bucket_name/object' instead of 'bucket_name.host/object'.
+    ```
+
+1. [Restart GitLab] for the changes to take effect.
+
+### Migrating local packages to object storage
+
+After [configuring the object storage](#using-object-storage), you may use the
+following task to migrate existing packages from the local storage to the remote one.
+The processing will be done in a background worker and requires **no downtime**.
+
+For Omnibus GitLab:
+
+```sh
+sudo gitlab-rake "gitlab:packages:migrate"
+```
+
+For installations from source:
+
+```bash
+RAILS_ENV=production sudo -u git -H bundle exec rake gitlab:packages:migrate
+```
+
+[reconfigure gitlab]: restart_gitlab.md#omnibus-gitlab-reconfigure "How to reconfigure Omnibus GitLab"
+[restart gitlab]: restart_gitlab.md#installations-from-source "How to restart GitLab"
diff --git a/doc/administration/pseudonymizer.md b/doc/administration/pseudonymizer.md
new file mode 100644
index 00000000000..0ad937052bd
--- /dev/null
+++ b/doc/administration/pseudonymizer.md
@@ -0,0 +1,103 @@
+# Pseudonymizer
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/5532) in [GitLab Ultimate][ee] 11.1.
+
+As GitLab's database hosts sensitive information, using it unfiltered for analytics
+implies high security requirements. To help alleviate this constraint, the Pseudonymizer
+service is used to export GitLab's data in a pseudonymized way.
+
+CAUTION: **Warning:**
+This process is not impervious. If the source data is available, it's possible for
+a user to correlate data to the pseudonymized version.
+
+The Pseudonymizer currently uses `HMAC(SHA256)` to mutate fields that shouldn't
+be textually exported. This ensures that:
+
+- the end-user of the data source cannot infer/revert the pseudonymized fields
+- the referential integrity is maintained
+
+## Configuration
+
+To configure the pseudonymizer, you need to:
+
+- Provide a manifest file that describes which fields should be included or
+  pseudonymized ([example `manifest.yml` file](https://gitlab.com/gitlab-org/gitlab-ee/tree/master/config/pseudonymizer.yml)).
+  A default manifest is provided with the GitLab installation. A relative file path is resolved from the Rails root.
+  Alternatively, you can use an absolute file path.
+- Use an object storage and specify the connection parameters in the `pseudonymizer.upload.connection` configuration option.
+
+**For Omnibus installations:**
+
+1. Edit `/etc/gitlab/gitlab.rb` and add the following lines, replacing the
+    values with the ones you want:
+
+    ```ruby
+    gitlab_rails['pseudonymizer_manifest'] = 'config/pseudonymizer.yml'
+    gitlab_rails['pseudonymizer_upload_remote_directory'] = 'gitlab-elt' # bucket name
+    gitlab_rails['pseudonymizer_upload_connection'] = {
+      'provider' => 'AWS',
+      'region' => 'eu-central-1',
+      'aws_access_key_id' => 'AWS_ACCESS_KEY_ID',
+      'aws_secret_access_key' => 'AWS_SECRET_ACCESS_KEY'
+    }
+    ```
+
+    NOTE: **Note:**
+    If you are using AWS IAM profiles, be sure to omit the AWS access key and secret access key/value pairs.
+
+    ```ruby
+    gitlab_rails['pseudonymizer_upload_connection'] = {
+      'provider' => 'AWS',
+      'region' => 'eu-central-1',
+      'use_iam_profile' => true
+    }
+    ```
+
+1. Save the file and [reconfigure GitLab](restart_gitlab.md#omnibus-gitlab-reconfigure)
+    for the changes to take effect.
+
+---
+
+**For installations from source:**
+
+1. Edit `/home/git/gitlab/config/gitlab.yml` and add or amend the following
+    lines:
+
+    ```yaml
+    pseudonymizer:
+      manifest: config/pseudonymizer.yml
+      upload:
+        remote_directory: 'gitlab-elt' # bucket name
+        connection:
+          provider: AWS
+          aws_access_key_id: AWS_ACCESS_KEY_ID
+          aws_secret_access_key: AWS_SECRET_ACCESS_KEY
+          region: eu-central-1
+    ```
+
+1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source)
+    for the changes to take effect.
+
+## Usage
+
+When running the pseudonymizer, you can optionally set the following environment variables:
+
+- `PSEUDONYMIZER_OUTPUT_DIR` - where to store the output CSV files (defaults to `/tmp`)
+- `PSEUDONYMIZER_BATCH` - the batch size when querying the DB (defaults to `100000`)
+
+```bash
+## Omnibus
+sudo gitlab-rake gitlab:db:pseudonymizer
+
+## Source
+sudo -u git -H bundle exec rake gitlab:db:pseudonymizer RAILS_ENV=production
+```
+
+This will produce some CSV files that might be very large, so make sure the
+`PSEUDONYMIZER_OUTPUT_DIR` has sufficient space. As a rule of thumb, at least
+10% of the database size is recommended.
+
+After the pseudonymizer has run, the output CSV files should be uploaded to the
+configured object storage and deleted from the local disk.
+
+[ee]: https://about.gitlab.com/pricing/
diff --git a/doc/administration/raketasks/geo.md b/doc/administration/raketasks/geo.md
new file mode 100644
index 00000000000..60bec0fd868
--- /dev/null
+++ b/doc/administration/raketasks/geo.md
@@ -0,0 +1,57 @@
+# Geo Rake Tasks
+
+## Git housekeeping
+
+There are a few tasks you can run to schedule Git housekeeping to start at the
+next repository sync in a **secondary** node:
+
+### Incremental Repack
+
+This is equivalent to running `git repack -d` on a _bare_ repository.
+
+**Omnibus Installation**
+
+```
+sudo gitlab-rake geo:git:housekeeping:incremental_repack
+```
+
+**Source Installation**
+
+```bash
+sudo -u git -H bundle exec rake geo:git:housekeeping:incremental_repack RAILS_ENV=production
+```
+
+### Full Repack
+
+This is equivalent to running `git repack -d -A --pack-kept-objects` on a
+_bare_ repository, optionally writing a reachability bitmap index
+when this is enabled in GitLab.
+
+**Omnibus Installation**
+
+```
+sudo gitlab-rake geo:git:housekeeping:full_repack
+```
+
+**Source Installation**
+
+```bash
+sudo -u git -H bundle exec rake geo:git:housekeeping:full_repack RAILS_ENV=production
+```
+
+### GC
+
+This is equivalent to running `git gc` on a _bare_ repository, optionally writing
+a reachability bitmap index when this is enabled in GitLab.
+
+**Omnibus Installation**
+
+```
+sudo gitlab-rake geo:git:housekeeping:gc
+```
+
+**Source Installation**
+
+```bash
+sudo -u git -H bundle exec rake geo:git:housekeeping:gc RAILS_ENV=production
+```
diff --git a/doc/administration/raketasks/storage.md b/doc/administration/raketasks/storage.md
index 7ad38abe4f5..d0e6540d067 100644
--- a/doc/administration/raketasks/storage.md
+++ b/doc/administration/raketasks/storage.md
@@ -42,6 +42,9 @@ If you find it necessary, you can run this migration script again to schedule mi

 Any error or warning will be logged in the sidekiq's log file.

+NOTE: **Note:**
+If Geo is enabled, each project that is successfully migrated generates an event to replicate the changes on any **secondary** nodes.
+
 You only need the `gitlab:storage:migrate_to_hashed` rake task to migrate your repositories, but we have additional
 commands below that helps you inspect projects and attachments in both legacy and hashed storage.
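
The Geo housekeeping tasks added in `raketasks/geo.md` are each described as equivalent to a plain git command on a bare repository. As a rough illustration only, the sketch below runs those three git commands against a throwaway bare repository instead of a real GitLab one. The scratch directory, the `Demo` identity, and the `loose_objects` helper are assumptions made for this example; the rake tasks themselves only schedule housekeeping for the next repository sync and are not reproduced here.

```sh
#!/bin/sh
# Sketch only: plain git equivalents of the Geo housekeeping tasks,
# run against a scratch bare repository (never a real GitLab repository).
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Create a tiny repository with one commit, then a bare clone of it.
git init --quiet "$work/src"
git -C "$work/src" config user.name "Demo"              # hypothetical identity
git -C "$work/src" config user.email "demo@example.com" # hypothetical identity
echo "hello" > "$work/src/README"
git -C "$work/src" add README
git -C "$work/src" commit --quiet -m "Initial commit"
git clone --quiet --bare "$work/src" "$work/demo.git"

# Hypothetical helper: print the number of loose objects in the bare repository.
loose_objects() {
  git -C "$work/demo.git" count-objects -v | awk '/^count:/ { print $2 }'
}

loose_before=$(loose_objects)

# Incremental repack: pack loose objects and drop redundant packs.
git -C "$work/demo.git" repack -d -q
loose_after=$(loose_objects)

# Full repack: repack everything into a single pack.
git -C "$work/demo.git" repack -d -A --pack-kept-objects -q

# GC: general housekeeping.
git -C "$work/demo.git" gc --quiet

echo "loose objects before repack: $loose_before, after: $loose_after"
```

Comparing the loose-object count before and after the incremental repack is a simple way to see that housekeeping actually ran; `git count-objects -v` also reports pack counts if more detail is needed.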
diff --git a/doc/administration/repository_storage_types.md b/doc/administration/repository_storage_types.md
index 4934aaf39f7..7249dbf5897 100644
--- a/doc/administration/repository_storage_types.md
+++ b/doc/administration/repository_storage_types.md
@@ -29,15 +29,15 @@ Any change in the URL will need to be reflected on disk (when groups / users or
 projects are renamed). This can add a lot of load in big installations,
 especially if using any type of network based filesystem.

-For GitLab Geo in particular: Geo does work with legacy storage, but in some
+CAUTION: **Caution:**
+For Geo in particular: Geo does work with legacy storage, but in some
 edge cases due to race conditions it can lead to errors when a project is
 renamed multiple times in short succession, or a project is deleted and
 recreated under the same name very quickly. We expect these race events to be
 rare, and we have not observed a race condition side-effect happening yet.
-
 This pattern also exists in other objects stored in GitLab, like issue
 Attachments, GitLab Pages artifacts, Docker Containers for the integrated
-Registry, etc.
+Registry, etc. Hashed storage is a requirement for Geo.

 ## Hashed Storage

@@ -87,7 +87,7 @@ The rollback has to be performed in the reverse order. To get into "Legacy" stat
 you need to rollback Attachments first, then Project.

 Also note that if Geo is enabled, after the migration was triggered, an event is generated
-to replicate the operation on any Secondary node. That means the on disk changes will also
+to replicate the operation on any **secondary** node. That means the [on disk changes](#project) will
 need to be performed on these nodes as well. Database changes will propagate without issues.

 You must make sure the migration event was already processed or otherwise it may migrate