author | GitLab Bot <gitlab-bot@gitlab.com> | 2021-07-20 09:55:51 +0000 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2021-07-20 09:55:51 +0000 |
commit | e8d2c2579383897a1dd7f9debd359abe8ae8373d (patch) | |
tree | c42be41678c2586d49a75cabce89322082698334 /doc/administration | |
parent | fc845b37ec3a90aaa719975f607740c22ba6a113 (diff) | |
download | gitlab-ce-e8d2c2579383897a1dd7f9debd359abe8ae8373d.tar.gz |
Add latest changes from gitlab-org/gitlab@14-1-stable-ee (tag: v14.1.0-rc42)
Diffstat (limited to 'doc/administration')
92 files changed, 3538 insertions, 1612 deletions
diff --git a/doc/administration/audit_events.md b/doc/administration/audit_events.md index f0c4d947668..7a871caf658 100644 --- a/doc/administration/audit_events.md +++ b/doc/administration/audit_events.md @@ -120,6 +120,9 @@ From there, you can see the following actions: - Project access token was successfully created or revoked ([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/230007) in GitLab 13.9) - Failed attempt to create or revoke a project access token ([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/230007) in GitLab 13.9) - When default branch changes for a project ([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/52339) in GitLab 13.9) +- Created, updated, or deleted DAST profiles, DAST scanner profiles, and DAST site profiles + ([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/217872) in GitLab 14.1) +- Changed a project's compliance framework ([Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/329362) in GitLab 14.1) Project events can also be accessed via the [Project Audit Events API](../api/audit_events.md#project-audit-events). 
@@ -161,6 +164,9 @@ The following user actions are recorded: - Failed second-factor authentication attempt ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/16826) in GitLab 13.5) - A user's personal access token was successfully created or revoked ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/276921) in GitLab 13.6) - A failed attempt to create or revoke a user's personal access token ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/276921) in GitLab 13.6) +- Administrator added or removed ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/323905) in GitLab 14.1) +- Removed SSH key ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/220127) in GitLab 14.1) +- Added or removed GPG key ([introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/220127) in GitLab 14.1) Instance events can also be accessed via the [Instance Audit Events API](../api/audit_events.md#instance-audit-events). @@ -188,7 +194,7 @@ on adding these events into GitLab: Don't see the event you want in any of the epics linked above? You can use the **Audit Event Proposal** issue template to [create an issue](https://gitlab.com/gitlab-org/gitlab/-/issues/new?issuable_template=Audit%20Event%20Proposal) -to request it. +to request it, or you can [add it yourself](../development/audit_event_guide/). ### Disabled events diff --git a/doc/administration/auth/README.md b/doc/administration/auth/README.md index a072cc73c43..5ab8653dc35 100644 --- a/doc/administration/auth/README.md +++ b/doc/administration/auth/README.md @@ -1,52 +1,8 @@ --- -comments: false -type: index -stage: Manage -group: Access -info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +redirect_to: 'index.md' --- -# GitLab authentication and authorization **(FREE SELF)** +This document was moved to [another location](index.md). 
-GitLab integrates with the following external authentication and authorization -providers: - -- [Atlassian](atlassian.md) -- [Auth0](../../integration/auth0.md) -- [Authentiq](authentiq.md) -- [AWS Cognito](cognito.md) -- [Azure](../../integration/azure.md) -- [Bitbucket Cloud](../../integration/bitbucket.md) -- [CAS](../../integration/cas.md) -- [Crowd](crowd.md) -- [Facebook](../../integration/facebook.md) -- [GitHub](../../integration/github.md) -- [GitLab.com](../../integration/gitlab.md) -- [Google OAuth](../../integration/google.md) -- [JWT](jwt.md) -- [Kerberos](../../integration/kerberos.md) -- [LDAP](ldap/index.md): Includes Active Directory, Apple Open Directory, Open LDAP, - and 389 Server. - - [Google Secure LDAP](ldap/google_secure_ldap.md) -- [Salesforce](../../integration/salesforce.md) -- [SAML](../../integration/saml.md) -- [SAML for GitLab.com groups](../../user/group/saml_sso/index.md) **(PREMIUM SAAS)** -- [Shibboleth](../../integration/shibboleth.md) -- [Smartcard](smartcard.md) **(PREMIUM SELF)** -- [Twitter](../../integration/twitter.md) - -NOTE: -UltraAuth has removed their software which supports OmniAuth integration. We have therefore removed all references to UltraAuth integration. - -## SaaS vs Self-Managed Comparison - -The external authentication and authorization providers may support the following capabilities. -For more information, see the links shown on this page for each external provider. 
- -| Capability | SaaS | Self-Managed | -|-------------------------------------------------|-----------------------------------------|------------------------------------| -| **User Provisioning** | SCIM<br>JIT Provisioning | LDAP Sync | -| **User Detail Updating** (not group management) | Not Available | LDAP Sync | -| **Authentication** | SAML at top-level group (1 provider) | LDAP (multiple providers)<br>Generic OAuth2<br>SAML (only 1 permitted per unique provider)<br>Kerberos<br>JWT<br>Smartcard<br>OmniAuth Providers (only 1 permitted per unique provider) | -| **Provider-to-GitLab Role Sync** | SAML Group Sync | LDAP Group Sync | -| **User Removal** | SCIM (remove user from top-level group) | LDAP (Blocking User from Instance) | +<!-- This redirect file can be deleted after 2021-09-28. --> +<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/administration/auth/atlassian.md b/doc/administration/auth/atlassian.md index 365236748b9..b3892f8f5d9 100644 --- a/doc/administration/auth/atlassian.md +++ b/doc/administration/auth/atlassian.md @@ -11,7 +11,7 @@ To enable the Atlassian OmniAuth provider for passwordless authentication you mu ## Atlassian application registration -1. Go to <https://developer.atlassian.com/apps/> and sign-in with the Atlassian +1. Go to <https://developer.atlassian.com/console/myapps/> and sign-in with the Atlassian account that will administer the application. 1. Click **Create a new app**. diff --git a/doc/administration/auth/authentiq.md b/doc/administration/auth/authentiq.md index 2eab4555c85..835293ff500 100644 --- a/doc/administration/auth/authentiq.md +++ b/doc/administration/auth/authentiq.md @@ -9,7 +9,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w To enable the Authentiq OmniAuth provider for passwordless authentication you must register an application with Authentiq. 
-Authentiq will generate a Client ID and the accompanying Client Secret for you to use. +Authentiq generates a Client ID and the accompanying Client Secret for you to use. 1. Get your Client credentials (Client ID and Client Secret) at [Authentiq](https://www.authentiq.com/developers). @@ -67,15 +67,17 @@ Authentiq will generate a Client ID and the accompanying Client Secret for you t 1. [Reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) or [restart GitLab](../restart_gitlab.md#installations-from-source) for the changes to take effect if you installed GitLab via Omnibus or from source respectively. -On the sign in page there should now be an Authentiq icon below the regular sign in form. Click the icon to begin the authentication process. +On the sign in page there should now be an Authentiq icon below the regular sign in form. Click the +icon to begin the authentication process. If the user: -- If the user has the Authentiq ID app installed in their iOS or Android device, they can: +- Has the Authentiq ID app installed in their iOS or Android device, they can: 1. Scan the QR code. 1. Decide what personal details to share. 1. Sign in to your GitLab installation. -- If not they will be prompted to download the app and then follow the procedure above. +- Does not have the app installed, they are prompted to download the app and then follow the + procedure above. -If everything goes right, the user will be returned to GitLab and will be signed in. +If everything works, the user is returned to GitLab and is signed in. 
<!-- ## Troubleshooting diff --git a/doc/administration/auth/index.md b/doc/administration/auth/index.md new file mode 100644 index 00000000000..a072cc73c43 --- /dev/null +++ b/doc/administration/auth/index.md @@ -0,0 +1,52 @@ +--- +comments: false +type: index +stage: Manage +group: Access +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# GitLab authentication and authorization **(FREE SELF)** + +GitLab integrates with the following external authentication and authorization +providers: + +- [Atlassian](atlassian.md) +- [Auth0](../../integration/auth0.md) +- [Authentiq](authentiq.md) +- [AWS Cognito](cognito.md) +- [Azure](../../integration/azure.md) +- [Bitbucket Cloud](../../integration/bitbucket.md) +- [CAS](../../integration/cas.md) +- [Crowd](crowd.md) +- [Facebook](../../integration/facebook.md) +- [GitHub](../../integration/github.md) +- [GitLab.com](../../integration/gitlab.md) +- [Google OAuth](../../integration/google.md) +- [JWT](jwt.md) +- [Kerberos](../../integration/kerberos.md) +- [LDAP](ldap/index.md): Includes Active Directory, Apple Open Directory, Open LDAP, + and 389 Server. + - [Google Secure LDAP](ldap/google_secure_ldap.md) +- [Salesforce](../../integration/salesforce.md) +- [SAML](../../integration/saml.md) +- [SAML for GitLab.com groups](../../user/group/saml_sso/index.md) **(PREMIUM SAAS)** +- [Shibboleth](../../integration/shibboleth.md) +- [Smartcard](smartcard.md) **(PREMIUM SELF)** +- [Twitter](../../integration/twitter.md) + +NOTE: +UltraAuth has removed their software which supports OmniAuth integration. We have therefore removed all references to UltraAuth integration. + +## SaaS vs Self-Managed Comparison + +The external authentication and authorization providers may support the following capabilities. +For more information, see the links shown on this page for each external provider. 
+ +| Capability | SaaS | Self-Managed | +|-------------------------------------------------|-----------------------------------------|------------------------------------| +| **User Provisioning** | SCIM<br>JIT Provisioning | LDAP Sync | +| **User Detail Updating** (not group management) | Not Available | LDAP Sync | +| **Authentication** | SAML at top-level group (1 provider) | LDAP (multiple providers)<br>Generic OAuth2<br>SAML (only 1 permitted per unique provider)<br>Kerberos<br>JWT<br>Smartcard<br>OmniAuth Providers (only 1 permitted per unique provider) | +| **Provider-to-GitLab Role Sync** | SAML Group Sync | LDAP Group Sync | +| **User Removal** | SCIM (remove user from top-level group) | LDAP (Blocking User from Instance) | diff --git a/doc/administration/auth/ldap/index.md b/doc/administration/auth/ldap/index.md index bc6a854c518..a9d59ca0983 100644 --- a/doc/administration/auth/ldap/index.md +++ b/doc/administration/auth/ldap/index.md @@ -174,6 +174,7 @@ production: | `base` | Base where we can search for users. | **{check-circle}** Yes | `'ou=people,dc=gitlab,dc=example'` or `'DC=mydomain,DC=com'` | | `user_filter` | Filter LDAP users. Format: [RFC 4515](https://tools.ietf.org/search/rfc4515) Note: GitLab does not support `omniauth-ldap`'s custom filter syntax. | **{dotted-circle}** No | For examples, read [Examples of user filters](#examples-of-user-filters). | | `lowercase_usernames` | If enabled, GitLab converts the name to lower case. | **{dotted-circle}** No | boolean | +| `retry_empty_result_with_codes` | An array of LDAP query response code that will attempt to retrying the operation if the result/content is empty. 
| **{dotted-circle}** No | `[80]` | #### Examples of user filters diff --git a/doc/administration/auth/ldap/ldap-troubleshooting.md b/doc/administration/auth/ldap/ldap-troubleshooting.md index 1215d90134f..5e6c3443e44 100644 --- a/doc/administration/auth/ldap/ldap-troubleshooting.md +++ b/doc/administration/auth/ldap/ldap-troubleshooting.md @@ -552,7 +552,7 @@ LDAP. If the email has changed and the DN has not, GitLab finds the user with the DN and update its own record of the user's email to match the one in LDAP. -However, if the primary email _and_ the DN change in LDAP, then GitLab +However, if the primary email _and_ the DN change in LDAP, then GitLab has no way of identifying the correct LDAP record of the user and, as a result, the user is blocked. To rectify this, the user's existing profile must be updated with at least one of the new values (primary diff --git a/doc/administration/auth/oidc.md b/doc/administration/auth/oidc.md index 30ca7d94a1e..951c7df26ef 100644 --- a/doc/administration/auth/oidc.md +++ b/doc/administration/auth/oidc.md @@ -159,14 +159,14 @@ gitlab_rails['omniauth_providers'] = [ ### Microsoft Azure The OpenID Connect (OIDC) protocol for Microsoft Azure uses the [Microsoft identity platform (v2) endpoints](https://docs.microsoft.com/en-us/azure/active-directory/azuread-dev/azure-ad-endpoint-comparison). -To get started, sign in to the [Azure Portal](https://portal.azure.com). For your app, you'll need the +To get started, sign in to the [Azure Portal](https://portal.azure.com). For your app, you need the following information: - A tenant ID. You may already have one. For more information, review the [Microsoft Azure Tenant](https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-create-new-tenant) documentation. - A client ID and a client secret. 
Follow the instructions in the - [Microsoft Quickstart Register an Application](https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app) documentation. -to obtain the tenant ID, client ID, and client secret for your app. + [Microsoft Quickstart Register an Application](https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app) documentation + to obtain the tenant ID, client ID, and client secret for your app. Example Omnibus configuration block: @@ -199,7 +199,7 @@ Microsoft has documented how its platform works with [the OIDC protocol](https:/ While GitLab works with [Azure Active Directory B2C](https://docs.microsoft.com/en-us/azure/active-directory-b2c/overview), it requires special configuration to work. To get started, sign in to the [Azure Portal](https://portal.azure.com). -For your app, you'll need the following information from Azure: +For your app, you need the following information from Azure: - A tenant ID. You may already have one. For more information, review the [Microsoft Azure Tenant](https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-create-new-tenant) documentation. @@ -216,8 +216,8 @@ In addition, ensure that [ID tokens are enabled](https://docs.microsoft.com/en-u Add the following API permissions to the app: -1. `openid` -1. `offline_access` +- `openid` +- `offline_access` #### Configure custom policies @@ -231,7 +231,7 @@ standard Azure B2C user flows [do not send the OpenID `email` claim](https://git other words, they do not work with the [`allow_single_sign_on` or `auto_link_user` parameters](../../integration/omniauth.md#initial-omniauth-configuration). With a standard Azure B2C policy, GitLab cannot create a new account or -link to an existing one with an e-mail address. +link to an existing one with an email address. 
Carefully follow the instructions for [creating a custom policy](https://docs.microsoft.com/en-us/azure/active-directory-b2c/tutorial-create-user-flows?pivots=b2c-custom-policy). @@ -240,42 +240,42 @@ but `LocalAccounts` works for authenticating against local, Active Directory acc 1. To export the `email` claim, modify the `SignUpOrSignin.xml`. Replace the following line: - ```xml - <OutputClaim ClaimTypeReferenceId="email" /> - ``` + ```xml + <OutputClaim ClaimTypeReferenceId="email" /> + ``` - with: + with: - ```xml - <OutputClaim ClaimTypeReferenceId="signInNames.emailAddress" PartnerClaimType="email" /> - ``` + ```xml + <OutputClaim ClaimTypeReferenceId="signInNames.emailAddress" PartnerClaimType="email" /> + ``` 1. For OIDC discovery to work with B2C, the policy must be configured with an issuer compatible with the [OIDC -specification](https://openid.net/specs/openid-connect-discovery-1_0.html#rfc.section.4.3). -See the [token compatibility settings](https://docs.microsoft.com/en-us/azure/active-directory-b2c/configure-tokens?pivots=b2c-custom-policy#token-compatibility-settings). -In `TrustFrameworkBase.xml` under `JwtIssuer`, set `IssuanceClaimPattern` to `AuthorityWithTfp`: - - ```xml - <ClaimsProvider> - <DisplayName>Token Issuer</DisplayName> - <TechnicalProfiles> - <TechnicalProfile Id="JwtIssuer"> - <DisplayName>JWT Issuer</DisplayName> - <Protocol Name="None" /> - <OutputTokenFormat>JWT</OutputTokenFormat> - <Metadata> - <Item Key="IssuanceClaimPattern">AuthorityWithTfp</Item> - ... - ``` + specification](https://openid.net/specs/openid-connect-discovery-1_0.html#rfc.section.4.3). + See the [token compatibility settings](https://docs.microsoft.com/en-us/azure/active-directory-b2c/configure-tokens?pivots=b2c-custom-policy#token-compatibility-settings). 
+ In `TrustFrameworkBase.xml` under `JwtIssuer`, set `IssuanceClaimPattern` to `AuthorityWithTfp`: + + ```xml + <ClaimsProvider> + <DisplayName>Token Issuer</DisplayName> + <TechnicalProfiles> + <TechnicalProfile Id="JwtIssuer"> + <DisplayName>JWT Issuer</DisplayName> + <Protocol Name="None" /> + <OutputTokenFormat>JWT</OutputTokenFormat> + <Metadata> + <Item Key="IssuanceClaimPattern">AuthorityWithTfp</Item> + ... + ``` 1. Now [upload the policy](https://docs.microsoft.com/en-us/azure/active-directory-b2c/tutorial-create-user-flows?pivots=b2c-custom-policy#upload-the-policies). Overwrite -the existing files if you are updating an existing policy. + the existing files if you are updating an existing policy. -1. Determine the issuer URL using the sign-in policy. The issuer URL will be in the form: +1. Determine the issuer URL using the sign-in policy. The issuer URL is in the form: - ```markdown - https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/<YOUR-SIGN-IN-POLICY-NAME>/v2.0/ - ``` + ```markdown + https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/<YOUR-SIGN-IN-POLICY-NAME>/v2.0/ + ``` The policy name is lowercased in the URL. For example, `B2C_1A_signup_signin` policy appears as `b2c_1a_signup_sigin`. @@ -283,63 +283,183 @@ the existing files if you are updating an existing policy. Note that the trailing forward slash is required. 1. 
Verify the operation of the OIDC discovery URL and issuer URL, append `.well-known/openid-configuration` -to the issuer URL: + to the issuer URL: + + ```markdown + https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/<YOUR-SIGN-IN-POLICY-NAME>/v2.0/.well-known/openid-configuration + ``` + + For example, if `domain` is `example.b2clogin.com` and tenant ID is + `fc40c736-476c-4da1-b489-ee48cee84386`, you can use `curl` and `jq` to extract the issuer: + + ```shell + $ curl --silent "https://example.b2clogin.com/tfp/fc40c736-476c-4da1-b489-ee48cee84386/b2c_1a_signup_signin/v2.0/.well-known/openid-configuration" | jq .issuer + "https://example.b2clogin.com/tfp/fc40c736-476c-4da1-b489-ee48cee84386/b2c_1a_signup_signin/v2.0/" + ``` + +1. Configure the issuer URL with the custom policy used for `signup_signin`. For example, this is + the Omnibus configuration with a custom policy for `b2c_1a_signup_signin`: + + ```ruby + gitlab_rails['omniauth_providers'] = [ + { + 'name' => 'openid_connect', + 'label' => 'Azure B2C OIDC', + 'args' => { + 'name' => 'openid_connect', + 'scope' => ['openid'], + 'response_mode' => 'query', + 'response_type' => 'id_token', + 'issuer' => 'https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/b2c_1a_signup_signin/v2.0/', + 'client_auth_method' => 'query', + 'discovery' => true, + 'send_scope_to_token_endpoint' => true, + 'client_options' => { + 'identifier' => '<YOUR APP CLIENT ID>', + 'secret' => '<YOUR APP CLIENT SECRET>', + 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback' + } + } + }] + ``` + +#### Troubleshooting Azure B2C - ```markdown - https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/<YOUR-SIGN-IN-POLICY-NAME>/v2.0/.well-known/openid-configuration - ``` +- Ensure all occurrences of `yourtenant.onmicrosoft.com`, `ProxyIdentityExperienceFrameworkAppId`, and `IdentityExperienceFrameworkAppId` match your B2C tenant hostname and + the respective client IDs in the XML policy files. 
+- Add `https://jwt.ms` as a redirect URI to the app, and use the [custom policy tester](https://docs.microsoft.com/en-us/azure/active-directory-b2c/tutorial-create-user-flows?pivots=b2c-custom-policy#test-the-custom-policy). + Make sure the payload includes `email` that matches the user's email access. +- After you enable the custom policy, users might see "Invalid username or password" after they try to sign in. This might be a configuration + issue with the `IdentityExperienceFramework` app. See [this Microsoft comment](https://docs.microsoft.com/en-us/answers/questions/50355/unable-to-sign-on-using-custom-policy.html?childToView=122370#comment-122370) + that suggests checking that the app manifest contains these settings: + + - `"accessTokenAcceptedVersion": null` + - `"signInAudience": "AzureADMyOrg"` + + Note that this configuration corresponds with the `Supported account types` setting used when + creating the `IdentityExperienceFramework` app. + +#### Keycloak - For example, if `domain` is `example.b2clogin.com` and tenant ID is `fc40c736-476c-4da1-b489-ee48cee84386`, you can use `curl` and `jq` to -extract the issuer: +GitLab works with OpenID providers that use HTTPS. Although a Keycloak +server can be set up using HTTP, GitLab can only communicate +with a Keycloak server that uses HTTPS. - ```shell - $ curl --silent "https://example.b2clogin.com/tfp/fc40c736-476c-4da1-b489-ee48cee84386/b2c_1a_signup_signin/v2.0/.well-known/openid-configuration" | jq .issuer - "https://example.b2clogin.com/tfp/fc40c736-476c-4da1-b489-ee48cee84386/b2c_1a_signup_signin/v2.0/" - ``` +We highly recommend configuring Keycloak to use public key encryption algorithms (for example, +RSA256, RSA512, and so on) instead of symmetric key encryption algorithms (for example, HS256 or HS358) to +sign tokens. Public key encryption algorithms are: -1. Configure the issuer URL with the custom policy used for -`signup_signin`. 
For example, this is the Omnibus configuration with a -custom policy for `b2c_1a_signup_signin`: +- Easier to configure. +- More secure because leaking the private key has severe security consequences. + +The signature algorithm can be configured in the Keycloak administration console under +**Realm Settings > Tokens > Default Signature Algorithm**. + +Example Omnibus configuration block: ```ruby gitlab_rails['omniauth_providers'] = [ -{ - 'name' => 'openid_connect', - 'label' => 'Azure B2C OIDC', - 'args' => { + { 'name' => 'openid_connect', - 'scope' => ['openid'], - 'response_mode' => 'query', - 'response_type' => 'id_token', - 'issuer' => 'https://<YOUR-DOMAIN>/tfp/<YOUR-TENANT-ID>/b2c_1a_signup_signin/v2.0/', - 'client_auth_method' => 'query', - 'discovery' => true, - 'send_scope_to_token_endpoint' => true, - 'client_options' => { - 'identifier' => '<YOUR APP CLIENT ID>', - 'secret' => '<YOUR APP CLIENT SECRET>', - 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback' + 'label' => 'Keycloak', + 'args' => { + 'name' => 'openid_connect', + 'scope' => ['openid', 'profile', 'email'], + 'response_type' => 'code', + 'issuer' => 'https://keycloak.example.com/auth/realms/myrealm', + 'client_auth_method' => 'query', + 'discovery' => true, + 'uid_field' => 'preferred_username', + 'client_options' => { + 'identifier' => '<YOUR CLIENT ID>', + 'secret' => '<YOUR CLIENT SECRET>', + 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback' + } } } -}] +] ``` -#### Troubleshooting Azure B2C +##### Configure Keycloak with a symmetric key algorithm -- Ensure all occurrences of `yourtenant.onmicrosoft.com`, `ProxyIdentityExperienceFrameworkAppId`, and `IdentityExperienceFrameworkAppId` match your B2C tenant hostname and -the respective client IDs in the XML policy files. +> Introduced in GitLab 14.2. 
-- Add `https://jwt.ms` as a redirect URI to the app, and use the [custom policy tester](https://docs.microsoft.com/en-us/azure/active-directory-b2c/tutorial-create-user-flows?pivots=b2c-custom-policy#test-the-custom-policy). -Make sure the payload includes `email` that matches the user's e-mail access. +WARNING: +The instructions below are included for completeness, but symmetric key +encryption should only be used when absolutely necessary. -- After you enable the custom policy, users might see "Invalid username or password" after they try to sign in. This might be a configuration -issue with the `IdentityExperienceFramework` app. See [this Microsoft comment](https://docs.microsoft.com/en-us/answers/questions/50355/unable-to-sign-on-using-custom-policy.html?childToView=122370#comment-122370) -that suggests checking that the app manifest contains these settings: +To use symmetric key encryption: - - `"accessTokenAcceptedVersion": null` - - `"signInAudience": "AzureADMyOrg"` +1. Extract the secret key from the Keycloak database. Keycloak doesn't expose this value in the Web + interface. The client secret seen in the Web interface is the OAuth2 client secret, which is + different from the secret used to sign JSON Web Tokens. + + For example, if you're using PostgreSQL as the backend database for Keycloak, log in to the + database console and extract the key via this SQL query: + + ```sql + $ psql -U keycloak + psql (13.3 (Debian 13.3-1.pgdg100+1)) + Type "help" for help. 
+ + keycloak=# SELECT c.name, value FROM component_config CC INNER JOIN component C ON(CC.component_id = C.id) WHERE C.realm_id = 'master' and provider_id = 'hmac-generated' AND CC.name = 'secret'; + -[ RECORD 1 ]--------------------------------------------------------------------------------- + name | hmac-generated + value | lo6cqjD6Ika8pk7qc3fpFx9ysrhf7E62-sqGc8drp3XW-wr93zru8PFsQokHZZuJJbaUXvmiOftCZM3C4KW3-g + -[ RECORD 2 ]--------------------------------------------------------------------------------- + name | fallback-HS384 + value | UfVqmIs--U61UYsRH-NYBH3_mlluLONpg_zN7CXEwkJcO9xdRNlzZfmfDLPtf2xSTMvqu08R2VhLr-8G-oZ47A + ``` + + In this example, there are two private keys: one for HS256 (`hmac-generated`), and another for + HS384 (`fallback-HS384`). We use the first `value` to configure GitLab. + +1. Convert `value` to standard base64. As [discussed in the post](https://keycloak.discourse.group/t/invalid-signature-with-hs256-token/3228/9), + `value` is encoded in ["Base 64 Encoding with URL and Filename Safe Alphabet" in RFC 4648](https://datatracker.ietf.org/doc/html/rfc4648#section-5). + This needs to be converted to [standard base64 as defined in RFC 2045](https://datatracker.ietf.org/doc/html/rfc2045). + The following Ruby script does this: + + ```ruby + require 'base64' + + value = "lo6cqjD6Ika8pk7qc3fpFx9ysrhf7E62-sqGc8drp3XW-wr93zru8PFsQokHZZuJJbaUXvmiOftCZM3C4KW3-g" + Base64.encode64(Base64.urlsafe_decode64(value)) + ``` + + This results in the following value: + + ```markdown + lo6cqjD6Ika8pk7qc3fpFx9ysrhf7E62+sqGc8drp3XW+wr93zru8PFsQokH\nZZuJJbaUXvmiOftCZM3C4KW3+g==\n + ``` + +1. Specify this base64-encoded secret in `jwt_secret_base64`. 
For example: + + ```ruby + gitlab_rails['omniauth_providers'] = [ + { + 'name' => 'openid_connect', + 'label' => 'Keycloak', + 'args' => { + 'name' => 'openid_connect', + 'scope' => ['openid', 'profile', 'email'], + 'response_type' => 'code', + 'issuer' => 'https://keycloak.example.com/auth/realms/myrealm', + 'client_auth_method' => 'query', + 'discovery' => true, + 'uid_field' => 'preferred_username', + 'jwt_secret_base64' => '<YOUR BASE64-ENCODED SECRET>', + 'client_options' => { + 'identifier' => '<YOUR CLIENT ID>', + 'secret' => '<YOUR CLIENT SECRET>', + 'redirect_uri' => 'https://gitlab.example.com/users/auth/openid_connect/callback' + } + } + } + ] + ``` - Note that this configuration corresponds with the `Supported account types` setting used when creating the `IdentityExperienceFramework` app. +If after reconfiguring, you see the error `JSON::JWS::VerificationFailed` error message, this means +the incorrect secret was specified. ## General troubleshooting diff --git a/doc/administration/auth/okta.md b/doc/administration/auth/okta.md deleted file mode 100644 index 64b42339d19..00000000000 --- a/doc/administration/auth/okta.md +++ /dev/null @@ -1,9 +0,0 @@ ---- -redirect_to: '../../integration/saml.md' -remove_date: '2021-06-15' ---- - -This document was moved to [another location](../../integration/saml.md). - -<!-- This redirect file can be deleted after 2021-06-15>. --> -<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/administration/compliance.md b/doc/administration/compliance.md index 6b80ddbcdb5..742d23105a9 100644 --- a/doc/administration/compliance.md +++ b/doc/administration/compliance.md @@ -10,13 +10,14 @@ You can configure the following GitLab features to help ensure that your GitLab instance meets common compliance standards. Click a feature name for additional documentation. 
-The [security features](../security/README.md) in GitLab may also help you meet +The [security features](../security/index.md) in GitLab may also help you meet relevant compliance standards. | Feature | GitLab tier | GitLab SaaS | Product level | |----------|:-----------:|:-----------:|:-------------:| |**[Restrict SSH Keys](../security/ssh_keys_restrictions.md)**<br>Control the technology and key length of SSH keys used to access GitLab. | Free+ | **{dotted-circle}** No | Instance | |**[Granular user roles and flexible permissions](../user/permissions.md)**<br>Manage access and permissions with five different user roles and settings for external users. Set permissions according to people's role, rather than either read or write access to a repository. Don't share the source code with people that only need access to the issue tracker. | Free+ | **{check-circle}** Yes | Instance, Group, Project | +|**[Generate reports on permission levels of users](../user/admin_area/index.md#user-permission-export)**<br>Administrators can generate a report listing all users' access permissions for groups and projects in the instance. | Premium+ | **{dotted-circle}** No | Instance | |**[Enforce TOS acceptance](../user/admin_area/settings/terms.md)**<br>Enforce your users accepting new terms of service by blocking GitLab traffic. | Free+ | **{dotted-circle}** No | Instance | |**[Email all users of a project, group, or entire server](../tools/email.md)**<br>An administrator can email groups of users based on project or group membership, or email everyone using the GitLab instance. This is great for scheduled maintenance or upgrades. | Premium+ | **{dotted-circle}** No | Instance | |**[Omnibus package supports log forwarding](https://docs.gitlab.com/omnibus/settings/logs.html#udp-log-forwarding)**<br>Forward your logs to a central system. | Premium+ | **{dotted-circle}** No | Instance | @@ -26,7 +27,7 @@ relevant compliance standards. 
|**[Audit events](audit_events.md)**<br>To maintain the integrity of your code, GitLab Enterprise Edition Premium gives administrators the ability to view any modifications made within the GitLab server in an advanced audit events system, so you can control, analyze, and track every change. | Premium+ | **{check-circle}** Yes | Instance, Group, Project | |**[Auditor users](auditor_users.md)**<br>Auditor users are users who are given read-only access to all projects, groups, and other resources on the GitLab instance. | Premium+ | **{dotted-circle}** No | Instance | |**[Credentials inventory](../user/admin_area/credentials_inventory.md)**<br>With a credentials inventory, GitLab administrators can keep track of the credentials used by all of the users in their GitLab instance. | Ultimate | **{dotted-circle}** No | Instance | -|**Separation of Duties using [Protected branches](../user/project/protected_branches.md#protected-branches-approval-by-code-owners) and [custom CI Configuration Paths](../ci/pipelines/settings.md#custom-cicd-configuration-file)**<br> GitLab Premium users can leverage the GitLab cross-project YAML configurations to define deployers of code and developers of code. View the [Separation of Duties Deploy Project](https://gitlab.com/guided-explorations/separation-of-duties-deploy/blob/master/README.md) and [Separation of Duties Project](https://gitlab.com/guided-explorations/separation-of-duties/blob/master/README.md) to see how to use this set up to define these roles. | Premium+ | **{check-circle}** Yes | Project | +|**Separation of Duties using [Protected branches](../user/project/protected_branches.md#require-code-owner-approval-on-a-protected-branch) and [custom CI Configuration Paths](../ci/pipelines/settings.md#specify-a-custom-cicd-configuration-file)**<br> GitLab Premium users can leverage the GitLab cross-project YAML configurations to define deployers of code and developers of code. 
View the [Separation of Duties Deploy Project](https://gitlab.com/guided-explorations/separation-of-duties-deploy/blob/master/README.md) and [Separation of Duties Project](https://gitlab.com/guided-explorations/separation-of-duties/blob/master/README.md) to see how to use this set up to define these roles. | Premium+ | **{check-circle}** Yes | Project | |**[Compliance frameworks](../user/project/settings/index.md#compliance-frameworks)**<br>Create a custom compliance framework at the group level to describe the type of compliance requirements any child project needs to follow. | Premium+ | **{check-circle}** Yes | Group | |**[Compliance pipelines](../user/project/settings/index.md#compliance-pipeline-configuration)**<br>Define a pipeline configuration to run for any projects with a given compliance framework. | Ultimate | **{check-circle}** Yes | Group | |**[Compliance dashboard](../user/compliance/compliance_dashboard/index.md)**<br>Quickly get visibility into the compliance posture of your organization. | Ultimate | **{check-circle}** Yes | Group | diff --git a/doc/administration/configure.md b/doc/administration/configure.md index 12a8f721ccf..73fbf527fe1 100644 --- a/doc/administration/configure.md +++ b/doc/administration/configure.md @@ -9,7 +9,7 @@ type: reference Customize and configure your self-managed GitLab installation. -- [Authentication](auth/README.md) +- [Authentication](auth/index.md) - [Configuration](../user/admin_area/index.md) - [Repository storage](repository_storage_paths.md) - [Geo](geo/index.md) diff --git a/doc/administration/database_load_balancing.md b/doc/administration/database_load_balancing.md index e9f989c96ea..7d17b22a4d7 100644 --- a/doc/administration/database_load_balancing.md +++ b/doc/administration/database_load_balancing.md @@ -31,7 +31,7 @@ sent to the primary (unless necessary), the primary (`db3`) hardly has any load. 
## Requirements -For load balancing to work you will need at least PostgreSQL 11 or newer, +For load balancing to work, you need at least PostgreSQL 11 or newer, [**MySQL is not supported**](../install/requirements.md#database). You also need to make sure that you have at least 1 secondary in [hot standby](https://www.postgresql.org/docs/11/hot-standby.html) mode. @@ -42,7 +42,7 @@ you should put a load balancer in front of every database, and have GitLab conne to those load balancers. For example, say you have a primary (`db1.gitlab.com`) and two secondaries, -`db2.gitlab.com` and `db3.gitlab.com`. For this setup you will need to have 3 +`db2.gitlab.com` and `db3.gitlab.com`. For this setup, you need to have 3 load balancers, one for every host. For example: - `primary.gitlab.com` forwards to `db1.gitlab.com` @@ -56,7 +56,7 @@ means forwarding should now happen as follows: - `secondary1.gitlab.com` forwards to `db1.gitlab.com` - `secondary2.gitlab.com` forwards to `db3.gitlab.com` -GitLab does not take care of this for you, so you will need to do so yourself. +GitLab does not take care of this for you, so you need to do so yourself. Finally, load balancing requires that GitLab can connect to all hosts using the same credentials and port as configured in the @@ -72,7 +72,7 @@ different ports or credentials for different hosts is not supported. ## Enabling load balancing For the environment in which you want to use load balancing, you'll need to add -the following. This will balance the load between `host1.example.com` and +the following. This balances the load between `host1.example.com` and `host2.example.com`. **In Omnibus installations:** @@ -104,32 +104,19 @@ the following. This will balance the load between `host1.example.com` and 1. Save the file and [restart GitLab](restart_gitlab.md#installations-from-source) for the changes to take effect. 
-### Enable the load balancer for Sidekiq +### Load balancing for Sidekiq -Sidekiq mostly writes to the database, which means that most of its traffic hits the -primary database. +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/334494) in GitLab 14.1, load balancing for Sidekiq is enabled by default. -Some background jobs can use database replicas to read application state. +Sidekiq jobs mostly write to the primary database, but there are read-only jobs that can benefit +from the use of Sidekiq load balancing. +These jobs can use load balancing and database replicas to read the application state. This allows offloading the primary database. -Load balancing is disabled by default in Sidekiq. When enabled, we can define -[the data consistency](../development/sidekiq_style_guide.md#job-data-consistency-strategies) +For Sidekiq, we can define +[data consistency](../development/sidekiq_style_guide.md#job-data-consistency-strategies) requirements for a specific job. -To enable it, define the `ENABLE_LOAD_BALANCING_FOR_SIDEKIQ` variable to the environment, as shown below. - -For Omnibus installations: - -```ruby -gitlab_rails['env'] = {"ENABLE_LOAD_BALANCING_FOR_SIDEKIQ" => "true"} -``` - -For installations from source: - -```shell -export ENABLE_LOAD_BALANCING_FOR_SIDEKIQ="true" -``` - ## Service Discovery > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/5883) in [GitLab Premium](https://about.gitlab.com/pricing/) 11.0. @@ -176,15 +163,15 @@ The following options can be set: | `disconnect_timeout` | The time in seconds after which an old connection is closed, after the list of hosts was updated. | 120 | | `use_tcp` | Lookup DNS resources using TCP instead of UDP | false | -If `record_type` is set to `SRV`, GitLab will continue to use a round-robin algorithm -and will ignore the `weight` and `priority` in the record.
Since SRV records usually -return hostnames instead of IPs, GitLab will look for the IPs of returned hostnames +If `record_type` is set to `SRV`, then GitLab continues to use a round-robin algorithm +and ignores the `weight` and `priority` in the record. Since SRV records usually +return hostnames instead of IPs, GitLab looks for the IPs of returned hostnames in the additional section of the SRV response. If no IP is found for a hostname, GitLab -will query the configured `nameserver` for ANY record for each such hostname looking for A or AAAA +queries the configured `nameserver` for ANY record for each such hostname looking for A or AAAA records, eventually dropping this hostname from rotation if it can't resolve its IP. The `interval` value specifies the _minimum_ time between checks. If the A -record has a TTL greater than this value, then service discovery will honor said +record has a TTL greater than this value, then service discovery honors said TTL. For example, if the TTL of the A record is 90 seconds, then service discovery waits at least 90 seconds before checking the A record again. diff --git a/doc/administration/external_pipeline_validation.md b/doc/administration/external_pipeline_validation.md index 9fc65fdd0b5..738cf591210 100644 --- a/doc/administration/external_pipeline_validation.md +++ b/doc/administration/external_pipeline_validation.md @@ -5,7 +5,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w type: reference, howto --- -# External pipeline validation +# External pipeline validation **(FREE SELF)** You can use an external service to validate a pipeline before it's created.
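Reviewer note on the `database_load_balancing.md` hunk above: the rule that `interval` is a _minimum_ and that a longer A record TTL takes precedence can be sketched in a few lines of Ruby. This is only an illustration of the documented behavior, not GitLab's implementation. The helper name `next_check_delay` is hypothetical, and the interval of 60 seconds is an assumed value chosen for the example.

```ruby
# Hypothetical helper, not part of GitLab: seconds to wait before the next
# DNS check, given the configured `interval` and the TTL of the A record.
def next_check_delay(interval, record_ttl)
  # `interval` is only a minimum, so a longer TTL takes precedence.
  [interval, record_ttl].max
end

# Assumed interval of 60 seconds for illustration.
puts next_check_delay(60, 90) # TTL longer than the interval: prints 90
puts next_check_delay(60, 30) # TTL shorter than the interval: prints 60
```

The first call mirrors the example in the text: with a TTL of 90 seconds, the next check happens no sooner than 90 seconds later.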
diff --git a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md index 4cfe781c7a4..16ae5bde062 100644 --- a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md +++ b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_multi_node.md @@ -19,7 +19,7 @@ This runbook is in **alpha**. For complete, production-ready documentation, see | Geo site | Multi-node | | Secondaries | One | -This runbook will guide you through a planned failover of a multi-node Geo site +This runbook guides you through a planned failover of a multi-node Geo site with one secondary. The following [2000 user reference architecture](../../../../administration/reference_architectures/2k_users.md) is assumed: ```mermaid @@ -46,7 +46,7 @@ graph TD The load balancer node and optional NFS server are omitted for clarity. -This guide will result in the following: +This guide results in the following: 1. An offline primary. 1. A promoted secondary that is now the new primary. @@ -76,7 +76,7 @@ On the **secondary** node: If any objects are failing to replicate, this should be investigated before scheduling the maintenance window. After a planned failover, anything that -failed to replicate will be **lost**. +failed to replicate is **lost**. You can use the [Geo status API](../../../../api/geo_nodes.md#retrieve-project-sync-or-verification-failures-that-occurred-on-the-current-node) @@ -117,10 +117,10 @@ follow these steps to avoid unnecessary data loss: sudo iptables -A INPUT --tcp-dport 443 -j REJECT ``` - From this point, users will be unable to view their data or make changes on the - **primary** node. They will also be unable to log in to the **secondary** node. - However, existing sessions will work for the remainder of the maintenance period, and - public data will be accessible throughout. 
+ From this point, users are unable to view their data or make changes on the + **primary** node. They are also unable to log in to the **secondary** node. + However, existing sessions work for the remainder of the maintenance period, and + public data is accessible throughout. 1. Verify the **primary** node is blocked to HTTP traffic by visiting it in a browser via another IP. The server should refuse connection. @@ -170,8 +170,8 @@ follow these steps to avoid unnecessary data loss: 1. [Run an integrity check](../../../raketasks/check.md) to verify the integrity of CI artifacts, LFS objects, and uploads in file storage. - At this point, your **secondary** node will contain an up-to-date copy of everything the - **primary** node has, meaning nothing will be lost when you fail over. + At this point, your **secondary** node contains an up-to-date copy of everything the + **primary** node has, meaning nothing is lost when you fail over. 1. In this final step, you need to permanently disable the **primary** node. @@ -213,7 +213,7 @@ follow these steps to avoid unnecessary data loss: - If you do not have SSH access to the **primary** node, take the machine offline and prevent it from rebooting. Since there are many ways you may prefer to accomplish - this, we will avoid a single recommendation. You may need to: + this, we avoid a single recommendation. You may need to: - Reconfigure the load balancers. - Change DNS records (for example, point the **primary** DNS record to the @@ -248,7 +248,7 @@ issue has been fixed in GitLab 13.4 and later. WARNING: If the secondary node [has been paused](../../../geo/index.md#pausing-and-resuming-replication), this performs a point-in-time recovery to the last known state. -Data that was created on the primary while the secondary was paused will be lost. +Data that was created on the primary while the secondary was paused is lost. 1.
SSH in to the PostgreSQL node in the **secondary** and promote PostgreSQL separately: diff --git a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md index 6caeddad51a..36c9d46d650 100644 --- a/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md +++ b/doc/administration/geo/disaster_recovery/runbooks/planned_failover_single_node.md @@ -19,7 +19,7 @@ This runbook is in **alpha**. For complete, production-ready documentation, see | Geo site | Single-node | | Secondaries | One | -This runbook will guide you through a planned failover of a single-node Geo site +This runbook guides you through a planned failover of a single-node Geo site with one secondary. The following general architecture is assumed: ```mermaid @@ -34,7 +34,7 @@ graph TD end ``` -This guide will result in the following: +This guide results in the following: 1. An offline primary. 1. A promoted secondary that is now the new primary. @@ -61,7 +61,7 @@ time to complete. If any objects are failing to replicate, this should be investigated before scheduling the maintenance window. After a planned failover, anything that -failed to replicate will be **lost**. +failed to replicate is **lost**. You can use the [Geo status API](../../../../api/geo_nodes.md#retrieve-project-sync-or-verification-failures-that-occurred-on-the-current-node) @@ -102,10 +102,10 @@ follow these steps to avoid unnecessary data loss: sudo iptables -A INPUT --tcp-dport 443 -j REJECT ``` - From this point, users will be unable to view their data or make changes on the - **primary** node. They will also be unable to log in to the **secondary** node. - However, existing sessions will work for the remainder of the maintenance period, and - public data will be accessible throughout. + From this point, users are unable to view their data or make changes on the + **primary** node. 
They are also unable to log in to the **secondary** node. + However, existing sessions work for the remainder of the maintenance period, and + public data is accessible throughout. 1. Verify the **primary** node is blocked to HTTP traffic by visiting it in a browser via another IP. The server should refuse connection. @@ -155,8 +155,8 @@ follow these steps to avoid unnecessary data loss: 1. [Run an integrity check](../../../raketasks/check.md) to verify the integrity of CI artifacts, LFS objects, and uploads in file storage. - At this point, your **secondary** node will contain an up-to-date copy of everything the - **primary** node has, meaning nothing will be lost when you fail over. + At this point, your **secondary** node contains an up-to-date copy of everything the + **primary** node has, meaning nothing is lost when you fail over. 1. In this final step, you need to permanently disable the **primary** node. @@ -198,7 +198,7 @@ follow these steps to avoid unnecessary data loss: - If you do not have SSH access to the **primary** node, take the machine offline and prevent it from rebooting. Since there are many ways you may prefer to accomplish - this, we will avoid a single recommendation. You may need to: + this, we avoid a single recommendation. You may need to: - Reconfigure the load balancers. - Change DNS records (for example, point the **primary** DNS record to the @@ -240,7 +240,7 @@ To promote the secondary node: 1.
Run the following command to list out all preflight checks and automatically check if replication and verification are complete before scheduling a planned - failover to ensure the process will go smoothly: + failover to ensure the process goes smoothly: NOTE: In GitLab 13.7 and earlier, if you have a data type with zero items to sync, diff --git a/doc/administration/geo/glossary.md b/doc/administration/geo/glossary.md index 1ec552326aa..f8769d31ec7 100644 --- a/doc/administration/geo/glossary.md +++ b/doc/administration/geo/glossary.md @@ -21,7 +21,7 @@ these definitions yet. | Term | Definition | Scope | Discouraged synonyms | |---------------------------|----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------|-------------------------------------------------| -| Node | An individual server that runs GitLab either with a specific role or as a whole (e.g. a Rails application node). In a cloud context this can be a specific machine type. | GitLab | instance, server | +| Node | An individual server that runs GitLab either with a specific role or as a whole (for example a Rails application node). In a cloud context this can be a specific machine type. | GitLab | instance, server | | Site | One or a collection of nodes running a single GitLab application. A site can be single-node or multi-node. | GitLab | deployment, installation instance | | Single-node site | A specific configuration of GitLab that uses exactly one node. | GitLab | single-server, single-instance | Multi-node site | A specific configuration of GitLab that uses more than one node. | GitLab | multi-server, multi-instance, high availability | @@ -31,7 +31,7 @@ these definitions yet. | Reference architecture(s) | A [specified configuration of GitLab for a number of users](../reference_architectures/index.md), possibly including multiple nodes and multiple sites. 
| GitLab | | | Promoting | Changing the role of a site from secondary to primary. | Geo-specific | | | Demoting | Changing the role of a site from primary to secondary. | Geo-specific | | -| Failover | The entire process that shifts users from a primary Site to a secondary site. This includes promoting a secondary, but contains other parts as well e.g. scheduling maintenance. | Geo-specific | | +| Failover | The entire process that shifts users from a primary Site to a secondary site. This includes promoting a secondary, but contains other parts as well. For example, scheduling maintenance. | Geo-specific | | ## Examples diff --git a/doc/administration/geo/replication/configuration.md b/doc/administration/geo/replication/configuration.md index 926c4c565aa..e8ffa1ae91a 100644 --- a/doc/administration/geo/replication/configuration.md +++ b/doc/administration/geo/replication/configuration.md @@ -196,9 +196,9 @@ keys must be manually replicated to the **secondary** node. gitlab-ctl reconfigure ``` -1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the top bar of the primary node, select **Menu >** **{admin}** **Admin**. 1. On the left sidebar, select **Geo > Nodes**. -1. Select **New node**. +1. Select **Add site**. ![Add secondary node](img/adding_a_secondary_node_v13_3.png) 1. Fill in **Name** with the `gitlab_rails['geo_node_name']` in `/etc/gitlab/gitlab.rb`. These values must always match *exactly*, character diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md index 6989765dbad..a56d9dc813c 100644 --- a/doc/administration/geo/replication/datatypes.md +++ b/doc/administration/geo/replication/datatypes.md @@ -209,6 +209,6 @@ successfully, you must replicate their data using some other means. #### Limitation of verification for files in Object Storage -GitLab managed Object Storage replication support [is in beta](object_storage.md#enabling-gitlab-managed-object-storage-replication). 
+GitLab managed Object Storage replication support [is in beta](object_storage.md#enabling-gitlab-managed-object-storage-replication). Locally stored files are verified but remote stored files are not. diff --git a/doc/administration/geo/replication/docker_registry.md b/doc/administration/geo/replication/docker_registry.md index cc0719442a1..5cc4f66017b 100644 --- a/doc/administration/geo/replication/docker_registry.md +++ b/doc/administration/geo/replication/docker_registry.md @@ -53,7 +53,7 @@ We need to make Docker Registry send notification events to the registry['notifications'] = [ { 'name' => 'geo_event', - 'url' => 'https://example.com/api/v4/container_registry_event/events', + 'url' => 'https://<example.com>/api/v4/container_registry_event/events', 'timeout' => '500ms', 'threshold' => 5, 'backoff' => '1s', @@ -65,7 +65,8 @@ We need to make Docker Registry send notification events to the ``` NOTE: - Replace `<replace_with_a_secret_token>` with a case sensitive alphanumeric string + Replace `<example.com>` with the `external_url` defined in your primary site's `/etc/gitlab/gitlab.rb` file, and + replace `<replace_with_a_secret_token>` with a case sensitive alphanumeric string that starts with a letter. You can generate one with `< /dev/urandom tr -dc _A-Z-a-z-0-9 | head -c 32 | sed "s/^[0-9]*//"; echo` NOTE: @@ -109,11 +110,14 @@ For each application and Sidekiq node on the **secondary** site: 1. Copy `/var/opt/gitlab/gitlab-rails/etc/gitlab-registry.key` from the **primary** to the node. -1. Edit `/etc/gitlab/gitlab.rb`: +1. 
Edit `/etc/gitlab/gitlab.rb` and add: ```ruby gitlab_rails['geo_registry_replication_enabled'] = true - gitlab_rails['geo_registry_replication_primary_api_url'] = 'https://primary.example.com:5050/' # Primary registry address, it will be used by the secondary node to directly communicate to primary registry + + # Primary registry's hostname and port, it will be used by + # the secondary node to directly communicate to primary registry + gitlab_rails['geo_registry_replication_primary_api_url'] = 'https://primary.example.com:5050/' ``` 1. Reconfigure the node for the change to take effect: diff --git a/doc/administration/geo/replication/faq.md b/doc/administration/geo/replication/faq.md index ef41b2ff172..28030dccb3b 100644 --- a/doc/administration/geo/replication/faq.md +++ b/doc/administration/geo/replication/faq.md @@ -23,7 +23,7 @@ For each project to sync: 1. Geo issues a `git fetch geo --mirror` to get the latest information from the **primary** site. If there are no changes, the sync is fast. Otherwise, it has to pull the latest commits. -1. The **secondary** site updates the tracking database to store the fact that it has synced projects A, B, C, etc. +1. The **secondary** site updates the tracking database to store the fact that it has synced projects A, B, C, and so on. 1. Repeat until all projects are synced. When someone pushes a commit to the **primary** site, it generates an event in the GitLab database that the repository has changed. @@ -46,8 +46,8 @@ Read the documentation for [Disaster Recovery](../disaster_recovery/index.md). ## What data is replicated to a **secondary** site? We currently replicate project repositories, LFS objects, generated -attachments / avatars and the whole database. This means user accounts, -issues, merge requests, groups, project data, etc., will be available for +attachments and avatars, and the whole database. This means user accounts, +issues, merge requests, groups, project data, and so on, will be available for query. 
## Can I `git push` to a **secondary** site? @@ -58,7 +58,7 @@ Yes! Pushing directly to a **secondary** site (for both HTTP and SSH, including All replication operations are asynchronous and are queued to be dispatched. Therefore, it depends on a lot of factors including the amount of traffic, how big your commit is, the -connectivity between your sites, your hardware, etc. +connectivity between your sites, your hardware, and so on. ## What if the SSH server runs at a different port? diff --git a/doc/administration/geo/replication/location_aware_git_url.md b/doc/administration/geo/replication/location_aware_git_url.md index 014ca59e571..a80c293149e 100644 --- a/doc/administration/geo/replication/location_aware_git_url.md +++ b/doc/administration/geo/replication/location_aware_git_url.md @@ -88,7 +88,7 @@ routing configurations. ![Created policy record](img/single_git_created_policy_record.png) -You have successfully set up a single host, e.g. `git.example.com` which +You have successfully set up a single host, for example, `git.example.com` which distributes traffic to your Geo sites by geolocation! ## Configure Git clone URLs to use the special Git URL diff --git a/doc/administration/geo/replication/remove_geo_node.md b/doc/administration/geo/replication/remove_geo_node.md deleted file mode 100644 index b72cd3cbb95..00000000000 --- a/doc/administration/geo/replication/remove_geo_node.md +++ /dev/null @@ -1,9 +0,0 @@ ---- -redirect_to: '../../geo/replication/remove_geo_site.md' -remove_date: '2021-06-01' ---- - -This document was moved to [another location](../../geo/replication/remove_geo_site.md). 
- -<!-- This redirect file can be deleted after 2021-06-01 --> -<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/administration/geo/replication/security_review.md b/doc/administration/geo/replication/security_review.md index ae41599311b..966902a3d74 100644 --- a/doc/administration/geo/replication/security_review.md +++ b/doc/administration/geo/replication/security_review.md @@ -60,7 +60,7 @@ from [owasp.org](https://owasp.org/). access), but is constrained to read-only activities. The principal use case is envisioned to be cloning Git repositories from the **secondary** site in favor of the **primary** site, but end-users may use the GitLab web interface to view projects, - issues, merge requests, snippets, etc. + issues, merge requests, snippets, and so on. ### What security expectations do the end‐users have? @@ -203,7 +203,7 @@ from [owasp.org](https://owasp.org/). ### What data entry paths does the application support? - Data is entered via the web application exposed by GitLab itself. Some data is - also entered using system administration commands on the GitLab servers (e.g., + also entered using system administration commands on the GitLab servers (for example `gitlab-ctl set-primary-node`). - **Secondary** sites also receive inputs via PostgreSQL streaming replication from the **primary** site. @@ -247,7 +247,7 @@ from [owasp.org](https://owasp.org/). ### What encryption requirements have been defined for data in transit - including transmission over WAN, LAN, SecureFTP, or publicly accessible protocols such as http: and https:? - Data must have the option to be encrypted in transit, and be secure against - both passive and active attack (e.g., MITM attacks should not be possible). + both passive and active attack (for example, MITM attacks should not be possible). 
## Access diff --git a/doc/administration/geo/replication/troubleshooting.md b/doc/administration/geo/replication/troubleshooting.md index c00f523957c..d63e927627a 100644 --- a/doc/administration/geo/replication/troubleshooting.md +++ b/doc/administration/geo/replication/troubleshooting.md @@ -327,7 +327,7 @@ Slots where `active` is `f` are not active. - When this slot should be active, because you have a **secondary** node configured using that slot, log in to that **secondary** node and check the PostgreSQL logs why the replication is not running. -- If you are no longer using the slot (e.g. you no longer have Geo enabled), you can remove it with in the +- If you are no longer using the slot (for example, you no longer have Geo enabled), you can remove it with in the PostgreSQL console session: ```sql @@ -378,7 +378,7 @@ This happens on wrongly-formatted addresses in `postgresql['md5_auth_cidr_addres ``` To fix this, update the IP addresses in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']` -to respect the CIDR format (i.e. `1.2.3.4/32`). +to respect the CIDR format (that is, `1.2.3.4/32`). ### Message: `LOG: invalid IP mask "md5": Name or service not known` @@ -390,7 +390,7 @@ This happens when you have added IP addresses without a subnet mask in `postgres ``` To fix this, add the subnet mask in `/etc/gitlab/gitlab.rb` under `postgresql['md5_auth_cidr_addresses']` -to respect the CIDR format (i.e. `1.2.3.4/32`). +to respect the CIDR format (that is, `1.2.3.4/32`). 
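Reviewer note on the two CIDR messages above: both fixes come down to writing every entry of `postgresql['md5_auth_cidr_addresses']` as an address plus a numeric prefix. A minimal sketch of a pre-flight check in plain Ruby follows, assuming it runs outside GitLab with only the standard library. The helper name `cidr_entry?` and the sample entries are hypothetical, and nothing here is part of Omnibus.

```ruby
require 'ipaddr'

# Hypothetical helper, not part of GitLab or Omnibus: returns true when
# `entry` is an address followed by a numeric prefix, such as "1.2.3.4/32".
def cidr_entry?(entry)
  address, prefix = entry.split('/', 2)
  # Reject entries that have no "/<prefix>" part at all.
  return false if address.nil? || prefix.nil?
  # Reject non-numeric masks.
  return false unless prefix.match?(/\A\d+\z/)

  # Let the standard library validate the address and the prefix length.
  IPAddr.new(entry)
  true
rescue IPAddr::Error
  false
end

# Sample entries for illustration only.
['1.2.3.4/32', '1.2.3.4'].each do |entry|
  status = cidr_entry?(entry) ? 'ok' : 'missing or invalid mask'
  puts "#{entry}: #{status}"
end
```

An entry that fails this check would need a subnet mask added (for a single host, `/32`) before running `gitlab-ctl reconfigure`.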
### Message: `Found data in the gitlabhq_production database!` when running `gitlab-ctl replicate-geo-database` @@ -588,6 +588,75 @@ to start again from scratch, there are a few steps that can help you: gitlab-ctl start ``` +### Design repository failures on mirrored projects and project imports + +On the top bar, under **Menu >** **{admin}** **Admin > Geo > Nodes**, +if the Design repositories progress bar shows +`Synced` and `Failed` greater than 100%, and negative `Queued`, then the instance +is likely affected by +[a bug in GitLab 13.2 and 13.3](https://gitlab.com/gitlab-org/gitlab/-/issues/241668). +It was [fixed in 13.4+](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/40643). + +To determine the actual replication status of design repositories in +a [Rails console](../../operations/rails_console.md): + +```ruby +secondary = Gitlab::Geo.current_node +counts = {} +secondary.designs.select("projects.id").find_each do |p| + registry = Geo::DesignRegistry.find_by(project_id: p.id) + state = registry ? "#{registry.state}" : "registry does not exist yet" + # puts "Design ID##{p.id}: #{state}" # uncomment this for granular information + counts[state] ||= 0 + counts[state] += 1 +end +puts "\nCounts:", counts +``` + +Example output: + +```plaintext +Design ID#5: started +Design ID#6: synced +Design ID#7: failed +Design ID#8: pending +Design ID#9: synced + +Counts: +{"started"=>1, "synced"=>2, "failed"=>1, "pending"=>1} +``` + +Example output if there are actually zero design repository replication failures: + +```plaintext +Design ID#5: synced +Design ID#6: synced +Design ID#7: synced + +Counts: +{"synced"=>3} +``` + +#### If you are promoting a Geo secondary site running on a single server + +`gitlab-ctl promotion-preflight-checks` will fail due to the existence of +`failed` rows in the `geo_design_registry` table. 
Use the +[previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) to +determine the actual replication status of Design repositories. + +`gitlab-ctl promote-to-primary-node` will fail since it runs preflight checks. +If the [previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) +shows that all designs are synced, then you can use the +`--skip-preflight-checks` option or the `--force` option to move forward with +promotion. + +#### If you are promoting a Geo secondary site running on multiple servers + +`gitlab-ctl promotion-preflight-checks` will fail due to the existence of +`failed` rows in the `geo_design_registry` table. Use the +[previous snippet](#design-repository-failures-on-mirrored-projects-and-project-imports) to +determine the actual replication status of Design repositories. + ## Fixing errors during a failover or when promoting a secondary to a primary node The following are possible errors that might be encountered during failover or @@ -726,6 +795,7 @@ sudo gitlab-ctl promotion-preflight-checks sudo /opt/gitlab/embedded/bin/gitlab-pg-ctl promote sudo gitlab-ctl reconfigure sudo gitlab-rake geo:set_secondary_as_primary +``` ## Expired artifacts @@ -794,7 +864,7 @@ PostgreSQL instances: The most common problems that prevent the database from replicating correctly are: -- **Secondary** nodes cannot reach the **primary** node. Check credentials, firewall rules, etc. +- **Secondary** nodes cannot reach the **primary** node. Check credentials, firewall rules, and so on. - SSL certificate problems. Make sure you copied `/etc/gitlab/gitlab-secrets.json` from the **primary** node. - Database storage disk is full. - Database replication slot is misconfigured. 
diff --git a/doc/administration/geo/replication/updating_the_geo_nodes.md b/doc/administration/geo/replication/updating_the_geo_nodes.md index 0c68adf162d..03570048071 100644 --- a/doc/administration/geo/replication/updating_the_geo_nodes.md +++ b/doc/administration/geo/replication/updating_the_geo_nodes.md @@ -28,9 +28,9 @@ and all **secondary** nodes: 1. **Optional:** [Pause replication on each **secondary** node.](../index.md#pausing-and-resuming-replication) 1. Log into the **primary** node. -1. [Update GitLab on the **primary** node using Omnibus's Geo-specific steps](https://docs.gitlab.com/omnibus/update/README.html#geo-deployment). +1. [Update GitLab on the **primary** node using Omnibus](https://docs.gitlab.com/omnibus/update/#update-using-the-official-repositories). 1. Log into each **secondary** node. -1. [Update GitLab on each **secondary** node using Omnibus's Geo-specific steps](https://docs.gitlab.com/omnibus/update/README.html#geo-deployment). +1. [Update GitLab on each **secondary** node using Omnibus](https://docs.gitlab.com/omnibus/update/#update-using-the-official-repositories). 1. If you paused replication in step 1, [resume replication on each **secondary**](../index.md#pausing-and-resuming-replication) 1. [Test](#check-status-after-updating) **primary** and **secondary** nodes, and check version in each. diff --git a/doc/administration/geo/replication/usage.md b/doc/administration/geo/replication/usage.md index 1491aa3427e..7fe8eec467e 100644 --- a/doc/administration/geo/replication/usage.md +++ b/doc/administration/geo/replication/usage.md @@ -27,7 +27,7 @@ Everything up-to-date ``` NOTE: -If you're using HTTPS instead of [SSH](../../../ssh/README.md) to push to the secondary, +If you're using HTTPS instead of [SSH](../../../ssh/index.md) to push to the secondary, you can't store credentials in the URL like `user:password@URL`. 
Instead, you can use a [`.netrc` file](https://www.gnu.org/software/inetutils/manual/html_node/The-_002enetrc-file.html) for Unix-like operating systems or `_netrc` for Windows. In that case, the credentials diff --git a/doc/administration/geo/replication/version_specific_updates.md b/doc/administration/geo/replication/version_specific_updates.md index 301be931b29..e193fc630b9 100644 --- a/doc/administration/geo/replication/version_specific_updates.md +++ b/doc/administration/geo/replication/version_specific_updates.md @@ -11,16 +11,35 @@ Review this page for update instructions for your version. These steps accompany the [general steps](updating_the_geo_nodes.md#general-update-steps) for updating Geo nodes. +## Updating to GitLab 13.12 + +We found an issue where [secondary nodes re-download all LFS files](https://gitlab.com/gitlab-org/gitlab/-/issues/334550) upon update. This bug: + +- Only applies to Geo secondary sites that have replicated LFS objects. +- Is _not_ a data loss risk. +- Causes churn and wasted bandwidth re-downloading all LFS objects. +- May impact performance for GitLab installations with a large number of LFS files. + +If you don't have many LFS objects or can stand a bit of churn, then it is safe to let the secondary sites re-download LFS objects. +If you do have many LFS objects, or many Geo secondary sites, or limited bandwidth, or a combination of them all, then we recommend you skip GitLab 13.12.0 through 13.12.6 and update to GitLab 13.12.7 or newer. + +### If you have already updated to an affected version, and the re-sync is ongoing + +You can manually migrate the legacy sync state to the new state column by running the following command in a [Rails console](../../operations/rails_console.md). 
It should take under a minute: + +```ruby +Geo::LfsObjectRegistry.where(state: 0, success: true).update_all(state: 2) +``` + ## Updating to GitLab 13.11 -We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitlab-org/gitlab/-/issues/330787) on Geo secondaries and on any GitLab instance if maintenance mode is enabled. This was caused by a regression in GitLab Workhorse. This is fixed in the [GitLab 13.11.4 patch release](https://about.gitlab.com/releases/2021/05/14/gitlab-13-11-4-released/). To avoid this issue, upgrade to GitLab 13.11.4 or later. +We found an [issue with Git clone/pull through HTTP(s)](https://gitlab.com/gitlab-org/gitlab/-/issues/330787) on Geo secondaries and on any GitLab instance if maintenance mode is enabled. This was caused by a regression in GitLab Workhorse. This is fixed in the [GitLab 13.11.4 patch release](https://about.gitlab.com/releases/2021/05/14/gitlab-13-11-4-released/). To avoid this issue, upgrade to GitLab 13.11.4 or later. ## Updating to GitLab 13.9 We've detected an issue [with a column rename](https://gitlab.com/gitlab-org/gitlab/-/issues/324160) -that may prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3. -We are working on a patch, but until a fixed version is released, you can manually complete -the zero-downtime upgrade: +that will prevent upgrades to GitLab 13.9.0, 13.9.1, 13.9.2 and 13.9.3 when following the zero-downtime steps. It is necessary +to perform the following additional steps for the zero-downtime upgrade: 1. Before running the final `sudo gitlab-rake db:migrate` command on the deploy node, execute the following queries using the PostgreSQL console (or `sudo gitlab-psql`) @@ -40,9 +59,18 @@ the zero-downtime upgrade: ``` If you have already run the final `sudo gitlab-rake db:migrate` command on the deploy node and have -encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you can still -follow the previous steps to complete the update. 
+encountered the [column rename issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160), you will +see the following error: + +```shell +-- remove_column(:application_settings, :asset_proxy_whitelist) +rake aborted! +StandardError: An error has occurred, all later migrations canceled: +PG::DependentObjectsStillExist: ERROR: cannot drop column asset_proxy_whitelist of table application_settings because other objects depend on it +DETAIL: trigger trigger_0d588df444c8 on table application_settings depends on column asset_proxy_whitelist of table application_settings +``` +To work around this bug, follow the previous steps to complete the update. More details are available [in this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/324160). ## Updating to GitLab 13.7 diff --git a/doc/administration/geo/setup/database.md b/doc/administration/geo/setup/database.md index f6e72092a5f..bc4128deb4a 100644 --- a/doc/administration/geo/setup/database.md +++ b/doc/administration/geo/setup/database.md @@ -31,17 +31,17 @@ A single instance database replication is easier to set up and still provides th as a clusterized alternative. It's useful for setups running on a single machine or trying to evaluate Geo for a future clusterized installation. -A single instance can be expanded to a clusterized version using Patroni, which is recommended for a +A single instance can be expanded to a clusterized version using Patroni, which is recommended for a highly available architecture. -Follow below the instructions on how to set up PostgreSQL replication as a single instance database. -Alternatively, you can look at the [Multi-node database replication](#multi-node-database-replication) +Follow below the instructions on how to set up PostgreSQL replication as a single instance database. +Alternatively, you can look at the [Multi-node database replication](#multi-node-database-replication) instructions on setting up replication with a Patroni cluster. 
### PostgreSQL replication The GitLab **primary** node where the write operations happen connects to -the **primary** database server, and **secondary** nodes +the **primary** database server, and **secondary** nodes connect to their own database servers (which are also read-only). We recommend using [PostgreSQL replication slots](https://medium.com/@tk512/replication-slots-in-postgresql-b4b03d277c75) @@ -112,13 +112,13 @@ There is an [issue where support is being discussed](https://gitlab.com/gitlab-o # must be present in all application nodes. gitlab_rails['db_password'] = '<your_password_here>' ``` - + 1. Define a password for the database [replication user](https://wiki.postgresql.org/wiki/Streaming_Replication). We will use the username defined in `/etc/gitlab/gitlab.rb` under the `postgresql['sql_replication_user']` - setting. The default value is `gitlab_replicator`, but if you changed it to something else, adapt + setting. The default value is `gitlab_replicator`, but if you changed it to something else, adapt the instructions below. - + Generate a MD5 hash of the desired password: ```shell @@ -462,10 +462,10 @@ data before running `pg_basebackup`. - If PostgreSQL is listening on a non-standard port, add `--port=` as well. - If your database is too large to be transferred in 30 minutes, you need - to increase the timeout, e.g., `--backup-timeout=3600` if you expect the + to increase the timeout, for example, `--backup-timeout=3600` if you expect the initial replication to take under an hour. - Pass `--sslmode=disable` to skip PostgreSQL TLS authentication altogether - (e.g., you know the network path is secure, or you are using a site-to-site + (for example, you know the network path is secure, or you are using a site-to-site VPN). This is **not** safe over the public Internet! 
- You can read more details about each `sslmode` in the [PostgreSQL documentation](https://www.postgresql.org/docs/12/libpq-ssl.html#LIBPQ-SSL-PROTECTION); @@ -484,12 +484,12 @@ The replication process is now complete. ### PgBouncer support (optional) [PgBouncer](https://www.pgbouncer.org/) may be used with GitLab Geo to pool -PostgreSQL connections, which can improve performance even when using in a -single instance installation. +PostgreSQL connections, which can improve performance even when using in a +single instance installation. -We recommend using PgBouncer if you use GitLab in a highly available -configuration with a cluster of nodes supporting a Geo **primary** site and -two other clusters of nodes supporting a Geo **secondary** site. One for the +We recommend using PgBouncer if you use GitLab in a highly available +configuration with a cluster of nodes supporting a Geo **primary** site and +two other clusters of nodes supporting a Geo **secondary** site. One for the main database and the other for the tracking database. For more information, see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md). @@ -505,7 +505,7 @@ If you still haven't [migrated from repmgr to Patroni](#migrating-from-repmgr-to Patroni is the official replication management solution for Geo. It can be used to build a highly available cluster on the **primary** and a **secondary** Geo site. -Using Patroni on a **secondary** site is optional and you don't have to use the same amount of +Using Patroni on a **secondary** site is optional and you don't have to use the same amount of nodes on each Geo site. For instructions about how to set up Patroni on the primary site, see the @@ -515,13 +515,21 @@ For instructions about how to set up Patroni on the primary site, see the In a Geo secondary site, the main PostgreSQL database is a read-only replica of the primary site’s PostgreSQL database. 
-If you are currently using `repmgr` on your Geo primary site, see [these instructions](#migrating-from-repmgr-to-patroni) for migrating from `repmgr` to Patroni. +If you are currently using `repmgr` on your Geo primary site, see [these instructions](#migrating-from-repmgr-to-patroni) +for migrating from `repmgr` to Patroni. + +A production-ready and secure setup requires at least: + +- 3 Consul nodes _(primary and secondary sites)_ +- 2 Patroni nodes _(primary and secondary sites)_ +- 1 PgBouncer node _(primary and secondary sites)_ +- 1 internal load-balancer _(primary site only)_ + +The internal load balancer provides a single endpoint for connecting to the Patroni cluster's leader whenever a new leader is +elected, and it is required for enabling cascading replication from the secondary sites. -A production-ready and secure setup requires at least three Consul nodes, three -Patroni nodes, one internal load-balancing node on the primary site, and a similar -configuration for the secondary site. The internal load balancer provides a single -endpoint for connecting to the Patroni cluster's leader whenever a new leader is -elected. Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni) and other database best practices. +Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni) +and other database best practices. ##### Step 1. 
Configure Patroni permanent replication slot on the primary site @@ -542,12 +550,12 @@ Leader instance**: ```ruby roles(['patroni_role']) - + consul['services'] = %w(postgresql) consul['configuration'] = { retry_join: %w[CONSUL_PRIMARY1_IP CONSUL_PRIMARY2_IP CONSUL_PRIMARY3_IP] } - + # You need one entry for each secondary, with a unique name following PostgreSQL slot_name constraints: # # Configuration syntax is: 'unique_slotname' => { 'type' => 'physical' }, @@ -559,6 +567,8 @@ Leader instance**: patroni['use_pg_rewind'] = true patroni['postgresql']['max_wal_senders'] = 8 # Use double of the amount of patroni/reserved slots (3 patronis + 1 reserved slot for a Geo secondary). patroni['postgresql']['max_replication_slots'] = 8 # Use double of the amount of patroni/reserved slots (3 patronis + 1 reserved slot for a Geo secondary). + patroni['username'] = 'PATRONI_API_USERNAME' + patroni['password'] = 'PATRONI_API_PASSWORD' patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD' # We list all secondary instances as they can all become a Standby Leader @@ -719,27 +729,41 @@ For each Patroni instance on the secondary site: patroni['standby_cluster']['host'] = 'INTERNAL_LOAD_BALANCER_PRIMARY_IP' patroni['standby_cluster']['port'] = INTERNAL_LOAD_BALANCER_PRIMARY_PORT patroni['standby_cluster']['primary_slot_name'] = 'geo_secondary' # Or the unique replication slot name you setup before + patroni['username'] = 'PATRONI_API_USERNAME' + patroni['password'] = 'PATRONI_API_PASSWORD' patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD' patroni['use_pg_rewind'] = true patroni['postgresql']['max_wal_senders'] = 5 # A minimum of three for one replica, plus two for each additional replica patroni['postgresql']['max_replication_slots'] = 5 # A minimum of three for one replica, plus two for each additional replica - + postgresql['pgbouncer_user_password'] = 'PGBOUNCER_PASSWORD_HASH' postgresql['sql_replication_password'] = 
'POSTGRESQL_REPLICATION_PASSWORD_HASH' postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH' postgresql['listen_address'] = '0.0.0.0' # You can use a public or VPC address here instead - + gitlab_rails['dbpassword'] = 'POSTGRESQL_PASSWORD' gitlab_rails['enable'] = true gitlab_rails['auto_migrate'] = false ``` 1. Reconfigure GitLab for the changes to take effect. - This is required to bootstrap PostgreSQL users and settings: + This is required to bootstrap PostgreSQL users and settings. - ```shell - gitlab-ctl reconfigure - ``` + - If this is a fresh installation of Patroni: + + ```shell + gitlab-ctl reconfigure + ``` + + - If you are configuring a Patroni standby cluster on a site that previously had a working Patroni cluster: + + ```shell + gitlab-ctl stop patroni + rm -rf /var/opt/gitlab/postgresql/data + /opt/gitlab/embedded/bin/patronictl -c /var/opt/gitlab/patroni/patroni.yaml remove postgresql-ha + gitlab-ctl reconfigure + gitlab-ctl start patroni + ``` ### Migrating from repmgr to Patroni @@ -769,17 +793,17 @@ by following the same instructions above. Secondary sites use a separate PostgreSQL installation as a tracking database to keep track of replication status and automatically recover from potential replication issues. Omnibus automatically configures a tracking database when `roles(['geo_secondary_role'])` is set. -If you want to run this database in a highly available configuration, follow the instructions below. -A production-ready and secure setup requires at least three Consul nodes, three -Patroni nodes on the secondary site secondary site. Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni) and other database best practices. +If you want to run this database in a highly available configuration, don't use the `geo_secondary_role` above. +Instead, follow the instructions below. -#### Step 1. 
Configure a PgBouncer node on the secondary site +A production-ready and secure setup requires at least three Consul nodes, two +Patroni nodes and one PgBouncer node on the secondary site. -A production-ready and highly available configuration requires at least -three Consul nodes, three PgBouncer nodes, and one internal load-balancing node. -The internal load balancer provides a single endpoint for connecting to the -PgBouncer cluster. For more information, see [High Availability with Omnibus GitLab](../../postgresql/replication_and_failover.md). +Be sure to use [password credentials](../../postgresql/replication_and_failover.md#database-authorization-for-patroni) +and other database best practices. + +#### Step 1. Configure a PgBouncer node on the secondary site Follow the minimal configuration for the PgBouncer node for the tracking database: @@ -880,6 +904,8 @@ For each Patroni instance on the secondary site for the tracking database: ] # Patroni configuration + patroni['username'] = 'PATRONI_API_USERNAME' + patroni['password'] = 'PATRONI_API_PASSWORD' patroni['replication_password'] = 'PLAIN_TEXT_POSTGRESQL_REPLICATION_PASSWORD' patroni['postgresql']['max_wal_senders'] = 5 # A minimum of three for one replica, plus two for each additional replica diff --git a/doc/administration/geo/setup/external_database.md b/doc/administration/geo/setup/external_database.md index 9e187424afa..3ec84f1268b 100644 --- a/doc/administration/geo/setup/external_database.md +++ b/doc/administration/geo/setup/external_database.md @@ -57,7 +57,7 @@ developed and tested. We aim to be compatible with most external To set up an external database, you can either: -- Set up [streaming replication](https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION-SLOTS) yourself (for example AWS RDS, bare metal not managed by Omnibus, etc.). 
+- Set up [streaming replication](https://www.postgresql.org/docs/12/warm-standby.html#STREAMING-REPLICATION-SLOTS) yourself (for example AWS RDS, bare metal not managed by Omnibus, and so on). - Perform the Omnibus configuration manually as follows. #### Leverage your cloud provider's tools to replicate the primary database @@ -208,8 +208,8 @@ the tracking database on port 5432. 1. Set up PostgreSQL according to the [database requirements document](../../../install/requirements.md#database). 1. Set up a `gitlab_geo` user with a password of your choice, create the `gitlabhq_geo_production` database, and make the user an owner of the database. You can see an example of this setup in the [installation from source documentation](../../../install/installation.md#6-database). -1. If you are **not** using a cloud-managed PostgreSQL database, ensure that your secondary - node can communicate with your tracking database by manually changing the +1. If you are **not** using a cloud-managed PostgreSQL database, ensure that your secondary + node can communicate with your tracking database by manually changing the `pg_hba.conf` that is associated with your tracking database. Remember to restart PostgreSQL afterwards for the changes to take effect: diff --git a/doc/administration/geo/setup/index.md b/doc/administration/geo/setup/index.md index 1afa4360cbc..84dff69ebe7 100644 --- a/doc/administration/geo/setup/index.md +++ b/doc/administration/geo/setup/index.md @@ -9,24 +9,24 @@ type: howto These instructions assume you have a working instance of GitLab. They guide you through: -1. Making your existing instance the **primary** node. -1. Adding **secondary** nodes. +1. Making your existing instance the **primary** site. +1. Adding **secondary** site(s). WARNING: -The steps below should be followed in the order they appear. **Make sure the GitLab version is the same on all nodes.** +The steps below should be followed in the order they appear. 
**Make sure the GitLab version is the same on all sites.** ## Using Omnibus GitLab If you installed GitLab using the Omnibus packages (highly recommended): -1. [Install GitLab Enterprise Edition](https://about.gitlab.com/install/) on the server that will serve as the **secondary** node. Do not create an account or log in to the new **secondary** node. -1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** node to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher. +1. [Install GitLab Enterprise Edition](https://about.gitlab.com/install/) on the node(s) that will serve as the **secondary** site. Do not create an account or log in to the new **secondary** site. +1. [Upload the GitLab License](../../../user/admin_area/license.md) on the **primary** site to unlock Geo. The license must be for [GitLab Premium](https://about.gitlab.com/pricing/) or higher. 1. [Set up the database replication](database.md) (`primary (read-write) <-> secondary (read-only)` topology). -1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** nodes. -1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** nodes. -1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** node. See [notes on LDAP](../index.md#ldap). +1. [Configure fast lookup of authorized SSH keys in the database](../../operations/fast_ssh_key_lookup.md). This step is required and needs to be done on **both** the **primary** and **secondary** site(s). +1. [Configure GitLab](../replication/configuration.md) to set the **primary** and **secondary** site(s). +1. Optional: [Configure a secondary LDAP server](../../auth/ldap/index.md) for the **secondary** site(s). See [notes on LDAP](../index.md#ldap). 1. 
Follow the [Using a Geo Site](../replication/usage.md) guide. ## Post-installation documentation -After installing GitLab on the **secondary** nodes and performing the initial configuration, see the [following documentation for post-installation information](../index.md#post-installation-documentation). +After installing GitLab on the **secondary** site(s) and performing the initial configuration, see the [following documentation for post-installation information](../index.md#post-installation-documentation). diff --git a/doc/administration/get_started.md b/doc/administration/get_started.md new file mode 100644 index 00000000000..a9ac8b279de --- /dev/null +++ b/doc/administration/get_started.md @@ -0,0 +1,291 @@ +--- +stage: +group: +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Get started administering GitLab **(FREE)** + +Get started with GitLab administration. Configure your organization and its authentication, then secure, monitor, +and back up GitLab. + +## Authentication + +Authentication is the first step in making your installation secure. + +- [Enforce two-factor authentication (2FA) for all users](../security/two_factor_authentication.md). We highly recommend 2FA for self-managed instances. +- Ensure users do the following: + - Choose a strong, secure password. If possible, store it in a password management system. + - If it is not configured for everyone, enable [two-factor authentication (2FA)](../user/profile/account/two_factor_authentication.md) for your account. + This one-time secret code is an additional safeguard that keeps intruders out, even if they have your password. + - Add a backup email. If you lose access to your account, the GitLab Support team can help you more quickly. + - Save or print your recovery codes.
If you can't access your authentication device, you can use these recovery codes to sign in to your GitLab account. + - Add [an SSH key](../ssh/index.md) to your profile. You can generate new recovery codes as needed with SSH. + - Enable [personal access tokens](../user/profile/personal_access_tokens.md). When using 2FA, you can use these tokens to access the GitLab API. + +## Projects and groups + +Organize your environment by configuring your groups and projects. + +- [Projects](../user/project/working_with_projects.md): Designate a home for your files and code or track and organize issues in a business category. +- [Groups](../user/group/index.md): Organize a collection of users or projects. Use these groups to quickly assign people and projects. +- [Roles](../user/permissions.md): Define user access and visibility for your projects and groups. + +<i class="fa fa-youtube-play youtube" aria-hidden="true"></i> +Watch an overview of [groups and projects](https://www.youtube.com/watch?v=cqb2m41At6s). + +Get started: + +- Create a [project](../user/project/working_with_projects.md#create-a-project). +- Create a [group](../user/group/index.md#create-a-group). +- [Add members](../user/group/index.md#add-users-to-a-group) to the group. +- Create a [subgroup](../user/group/subgroups/index.md#creating-a-subgroup). +- [Add members](../user/group/subgroups/index.md#membership) to the subgroup. +- Enable [external authorization control](../user/admin_area/settings/external_authorization.md#configuration). + +**More resources** + +- Learn more about [running multiple Agile teams](https://www.youtube.com/watch?v=VR2r1TJCDew). +- Sync group memberships [by using LDAP](../administration/auth/ldap/index.md#group-sync). +- Manage user access with inherited permissions. Use up to 20 levels of subgroups to organize both teams and projects. + - Learn more about [inherited permissions](../user/project/members/index.md#inherited-membership). 
+ - View [nested category examples](../user/group/subgroups/index.md#overview). + +## Import projects + +You may need to import projects from external sources like GitHub, Bitbucket, or another instance of GitLab. Many external sources can be imported into GitLab. + +- Review the [GitLab projects documentation](../user/project/index.md#project-integrations). +- Consider [repository mirroring](../user/project/repository/repository_mirroring.md)—an [alternative to project migrations](../ci/ci_cd_for_external_repos/index.md). +- Check out our [migration index](../user/project/import/index.md) for documentation on common migration paths. +- Schedule your project exports with our [import/export API](../api/project_import_export.md#schedule-an-export). + +### Popular project imports + +- [GitHub Enterprise to self-managed GitLab](../integration/github.md#enabling-github-oauth): Enabling OAuth makes it easier for developers to find and import their projects. +- [Bitbucket Server](../user/project/import/bitbucket_server.md#limitations): There are certain data limitations. + For assistance with these data types, contact your GitLab account manager or GitLab Support about our professional migration services. + +## GitLab instance security + +Security is an important part of the onboarding process. Securing your instance protects your work and your organization. + +While this isn't an exhaustive list, following these steps gives you a solid start for securing your instance. + +- Use a long root password, stored in a vault. +- Install trusted SSL certificate and establish a process for renewal and revocation. +- [Configure SSH key restrictions](../security/ssh_keys_restrictions.md#restrict-allowed-ssh-key-technologies-and-minimum-length) per your organization's guidelines. +- [Disable new sign-ups](../user/admin_area/settings/sign_up_restrictions.md#disable-new-sign-ups). +- Require email confirmation. +- Set password length limit, configure SSO or SAML user management. 
+- Limit email domains if allowing sign-up. +- Require two-factor authentication (2FA). +- [Disable password authentication](../user/admin_area/settings/sign_in_restrictions.md#password-authentication-enabled) for Git over HTTPS. +- Set up [email notification for unknown sign-ins](../user/admin_area/settings/sign_in_restrictions.md#email-notification-for-unknown-sign-ins). +- Configure [user and IP rate limits](https://about.gitlab.com/blog/2020/05/20/gitlab-instance-security-best-practices/#user-and-ip-rate-limits). +- Limit [webhooks local access](https://about.gitlab.com/blog/2020/05/20/gitlab-instance-security-best-practices/#webhooks). +- Set [rate limits for protected paths](../user/admin_area/settings/protected_paths.md). + +## Monitor GitLab performance + +After you've established your basic setup, you're ready to review the GitLab monitoring services. Prometheus is our core performance monitoring tool. +Unlike other monitoring solutions (for example, Zabbix or New Relic), Prometheus is tightly integrated with GitLab and has extensive community support. + +- [Prometheus](../administration/monitoring/prometheus/index.md) captures + [these GitLab metrics](../administration/monitoring/prometheus/gitlab_metrics.md#metrics-available). +- Learn more about GitLab [bundled software metrics](../administration/monitoring/prometheus/index.md#bundled-software-metrics). +- Prometheus and its exporters are on by default. However, you need to [configure the service](../administration/monitoring/prometheus/index.md#configuring-prometheus). +- Learn more about [GitLab architecture](../development/architecture.md). +- Find out why [application performance metrics](https://about.gitlab.com/blog/2020/05/07/working-with-performance-metrics/) matter. +- Create a [self-monitoring project](../administration/monitoring/gitlab_self_monitoring_project/index.md) to track the health of your instance. 
+- Integrate Grafana to [build visual dashboards](https://youtu.be/f4R7s0An1qE) based on performance metrics. + +### Components of monitoring + +- [Web servers](../administration/monitoring/prometheus/gitlab_metrics.md#puma-metrics): Handles server requests and facilitates other back-end service transactions. + Monitor CPU, memory, and network IO traffic to track the health of this node. +- [Workhorse](../administration/monitoring/prometheus/gitlab_metrics.md#metrics-available): Alleviates web traffic congestion from the main server. + Monitor latency spikes to track the health of this node. +- [Sidekiq](../administration/monitoring/prometheus/gitlab_metrics.md#sidekiq-metrics): Handles background operations that allow GitLab to run smoothly. + Monitor for long, unprocessed task queues to track the health of this node. + +## Back up your GitLab data + +GitLab provides backup methods to keep your data safe and recoverable. Whether you use a self-managed or a GitLab SaaS database, it's crucial to back up your data regularly. + +- Decide on a backup strategy. +- Consider writing a cron job to make daily backups. +- Separately back up the configuration files. +- Decide what should be left out of the backup. +- Decide where to upload the backups. +- Limit backup lifetime. +- Run a test backup and restore. +- Set up a way to periodically verify the backups. + +### Back up a GitLab self-managed instance + +The routine differs, depending on whether you deployed with Omnibus or the Helm chart. + +When backing up an Omnibus (single node) GitLab server, you can use a single Rake task. + +Learn about [backing up Omnibus or Helm variations](../raketasks/backup_restore.md#back-up-gitlab). +This process backs up your entire instance, but does not back up the configuration files. Ensure those are backed up separately. +Keep your configuration files and backup archives in a separate location to ensure the encryption keys are not kept with the encrypted data.
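
Editorial sketch (not part of the commit): the backup checklist above includes limiting backup lifetime. The Ruby below is a minimal illustration of that idea only, picking out archives older than a retention period. `BackupArchive`, `expired_backups`, and the archive names are hypothetical and are not GitLab APIs. GitLab's own backup tooling has its own retention handling, described in the linked backup documentation.

```ruby
# Hypothetical record for a backup archive: a file name plus its creation time.
BackupArchive = Struct.new(:name, :created_at)

# Returns the archives whose creation time is older than `keep_seconds`,
# measured back from `now`. Nothing is deleted here; the caller decides
# what to do with the returned list.
def expired_backups(archives, keep_seconds:, now: Time.now)
  cutoff = now - keep_seconds
  archives.select { |archive| archive.created_at < cutoff }
end

# Example data with fixed timestamps so the result is reproducible.
now = Time.utc(2021, 7, 20, 12, 0, 0)
archives = [
  BackupArchive.new("daily-2021-07-19.tar", Time.utc(2021, 7, 19, 2, 0, 0)),
  BackupArchive.new("daily-2021-07-10.tar", Time.utc(2021, 7, 10, 2, 0, 0)),
]

seven_days = 7 * 24 * 60 * 60
expired = expired_backups(archives, keep_seconds: seven_days, now: now)
puts expired.map(&:name).inspect
```

Keeping selection separate from deletion lets you test the retention rule without touching real archives.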
+ +#### Restore a backup + +You can restore a backup only to **the exact same version and type** (Community Edition/Enterprise Edition) of GitLab on which it was created. + +- Review the [Omnibus backup and restore documentation](https://docs.gitlab.com/omnibus/settings/backups). +- Review the [Helm Chart backup and restore documentation](https://docs.gitlab.com/charts/backup-restore). + +### Back up GitLab SaaS + +Backups of GitLab databases and filesystems are taken every 24 hours, and are kept for two weeks on a rolling schedule. All backups are encrypted. + +- GitLab SaaS creates backups to ensure your data is secure, but you can't use these methods to export or back up your data yourself. +- Issues are stored in the database. They can't be stored in Git itself. +- You can use the project export option in: + - [The UI](../user/project/settings/import_export.md#exporting-a-project-and-its-data). + - [The API](../api/project_import_export.md#schedule-an-export). +- [Group export](../user/group/settings/import_export.md) does *not* export the projects in it, but does export: + - Epics + - Milestones + - Boards + - Labels + - Additional items + +For more information about GitLab SaaS backups, see our [Backup FAQ page](https://about.gitlab.com/handbook/engineering/infrastructure/faq/#gitlabcom-backups). + +### Alternative backup strategies + +In some situations the Rake task for backups may not be the most optimal solution. Here are some +[alternatives](../raketasks/backup_restore.md) to consider if the Rake task does not work for you. + +#### Option 1: File system snapshot + +If your GitLab server contains a lot of Git repository data, you may find the GitLab backup script to be too slow. It can be especially slow when backing up to an offsite location. + +Slowness typically starts at a Git repository data size of around 200 GB. In this case, you might consider using file system snapshots as part of your backup strategy. 
+For example, consider a GitLab server with the following components: + +- Using Omnibus GitLab +- Hosted on AWS with an EBS drive containing an ext4 file system mounted at `/var/opt/gitlab`. + +The EC2 instance meets the requirements for an application data backup by taking an EBS snapshot. The backup includes all repositories, uploads, and PostgreSQL data. + +In general, if you're running GitLab on a virtualized server, you can create VM snapshots of the entire GitLab server. +It is common for a VM snapshot to require you to power down the server. + +#### Option 2: GitLab Geo + +Geo provides local, read-only instances of your GitLab instances. + +While GitLab Geo helps remote teams work more efficiently by using a local GitLab node, it can also be used as a disaster recovery solution. +Learn more about using [Geo as a disaster recovery solution](../administration/geo/disaster_recovery/index.md). + +Geo replicates your database, your Git repositories, and a few other assets. +Learn more about [replication limitations](../administration/geo/replication/datatypes.md#limitations-on-replicationverification). + +## Support for GitLab self-managed + +GitLab provides support for self-managed GitLab through different channels. + +- Priority support: Premium and Ultimate self-managed customers receive priority support with tiered response times. + Learn more about [upgrading to priority support](https://about.gitlab.com/support/#upgrading-to-priority-support). +- Live upgrade assistance: Get one-on-one expert guidance during a production upgrade. With your **priority support plan**, + you're eligible for a live, scheduled screen-sharing session with a member of our support team. + +To get assistance for self-managed GitLab: + +- Use the GitLab documentation for self-service support. +- Join the [GitLab Forum](https://forum.gitlab.com/) for community support. 
+- Gather [your subscription information](https://about.gitlab.com/support/#for-self-managed-users) before submitting a ticket. +- [Submit a support ticket](https://support.gitlab.com/hc/en-us/requests/new). + +## Support for GitLab SaaS + +If you use GitLab SaaS, you have several channels with which to get support and find answers. + +- Priority support: Gold and Silver GitLab SaaS customers receive priority support with tiered response times. + Learn more about [upgrading to priority support](https://about.gitlab.com/support/#upgrading-to-priority-support). +- GitLab SaaS 24/7 monitoring: Our full team of site reliability and production engineers is always on. + Often, by the time you notice an issue, someone's already looking into it. + +To get assistance for GitLab SaaS: + +- Access [GitLab Docs](../README.md) for self-service support. +- Join the [GitLab Forum](https://forum.gitlab.com/) for community support. +- Gather [your subscription information](https://about.gitlab.com/support/#for-self-managed-users) before submitting a ticket. +- Submit a support ticket for: + - [General assistance](https://support.gitlab.com/hc/en-us/requests/new?ticket_form_id=334447) + - [Account or sign-in issues](https://support.gitlab.com/hc/en-us/requests/new?ticket_form_id=360000803379) +- Subscribe to [the status page](https://status.gitlab.com/) for the latest on GitLab performance or service interruptions. + +## API and rate limits for self-managed GitLab + +Rate limits prevent denial-of-service or brute-force attacks. In most cases, you can reduce the load on your application +and infrastructure by limiting the rate of requests from a single IP address. + +Rate limits also improve the security of your application. + +### Configure rate limits for self-managed GitLab + +You can make changes to your default rate limits from the Admin Area. For more information about configuration, see the [Admin Area page](../security/rate_limits.md#admin-area-settings). 
+ +- Define [issues rate limits](../user/admin_area/settings/rate_limit_on_issues_creation.md) to set a maximum number of issue creation requests per minute, per user. +- Enforce [user and IP rate limits](../user/admin_area/settings/user_and_ip_rate_limits.md) for unauthenticated web requests. +- Review the [rate limit on raw endpoints](../user/admin_area/settings/rate_limits_on_raw_endpoints.md). The default setting is 300 requests per minute for raw file access. +- Review the [import/export rate limits](../user/admin_area/settings/import_export_rate_limits.md) of the six active defaults. + +For more information about API and rate limits, see our [API page](../api/index.md). + +## API and rate limits for GitLab SaaS + +Rate limits prevent denial-of-service or brute-force attacks. IP blocks usually happen when GitLab.com receives unusual traffic +from a single IP address. The system views unusual traffic as potentially malicious based on rate limit settings. + +Rate limits also improve the security of your application. + +### Configure rate limits for GitLab SaaS + +You can make changes to your default rate limits from the Admin Area. For more information about configuration, see the [Admin Area page](../security/rate_limits.md#admin-area-settings). + +- Review the rate limit page. +- Read our [API page](../api/index.md) for more information about API and rate limiting. + +### GitLab SaaS-specific block and error responses + +- [403 forbidden error](../user/gitlab_com/index.md#gitlabcom-specific-rate-limits): If the error occurs for all GitLab SaaS requests, look for an automated process that could have triggered a block. For more assistance, contact GitLab support with your error details, including the affected IP address. +- [HAProxy API throttle](../user/gitlab_com/index.md#haproxy): GitLab SaaS responds with HTTP status code 429 to API requests that exceed 10 requests per second, per IP address. 
+- [Protected paths throttle](../user/gitlab_com/index.md#protected-paths-throttle): GitLab SaaS responds with HTTP status code 429 to POST requests at protected paths that exceed 10 requests per minute, per IP address. +- [Git and container registry failed authentication ban](../user/gitlab_com/index.md#git-and-container-registry-failed-authentication-ban): GitLab SaaS responds with HTTP status code 403 for one hour if it receives 30 failed authentication requests in three minutes from a single IP address. + +## GitLab training resources + +You can learn more about how to administer GitLab. + +- Get involved in the [GitLab Forum](https://forum.gitlab.com/) to trade tips with our talented community. +- Check out [our blog](https://about.gitlab.com/blog/) for ongoing updates on: + - Releases + - Applications + - Contributions + - News + - Events + +### Paid GitLab training + +- GitLab education services: Learn more about [GitLab and DevOps best practices](https://about.gitlab.com/services/education/) through our specialized training courses. See our full course catalog. +- GitLab technical certifications: Explore our [certification options](https://about.gitlab.com/handbook/customer-success/professional-services-engineering/gitlab-technical-certifications/) that focus on key GitLab and DevOps skills. + +### Free GitLab training + +- GitLab basics: Discover self-service guides on [Git and GitLab basics](../gitlab-basics/index.md). +- GitLab Learn: Learn new GitLab skills in a structured course at [GitLab Learn](https://about.gitlab.com/learn/). + +### Third-party training + +- Udemy: For a more affordable, guided training option, consider + [GitLab CI: Pipelines, CI/CD, and DevOps for Beginners](https://www.udemy.com/course/gitlab-ci-pipelines-ci-cd-and-devops-for-beginners/) on Udemy. 
+- LinkedIn Learning: Check out [Continuous Delivery with GitLab](https://www.linkedin.com/learning/continuous-delivery-with-gitlab) on LinkedIn Learning + for another low-cost, guided training option. diff --git a/doc/administration/git_protocol.md b/doc/administration/git_protocol.md index acc05a77bee..6e391cb459e 100644 --- a/doc/administration/git_protocol.md +++ b/doc/administration/git_protocol.md @@ -99,7 +99,7 @@ $ GIT_TRACE_PACKET=1 git -c protocol.version=2 ls-remote https://your-gitlab-ins Verify Git v2 is used by the client: ```shell -GIT_SSH_COMMAND="ssh -v" git -c protocol.version=2 ls-remote ssh://your-gitlab-instance.com:group/repo.git 2>&1 |grep GIT_PROTOCOL +GIT_SSH_COMMAND="ssh -v" git -c protocol.version=2 ls-remote ssh://git@your-gitlab-instance.com/group/repo.git 2>&1 |grep GIT_PROTOCOL ``` You should see that the `GIT_PROTOCOL` environment variable is sent: diff --git a/doc/administration/gitaly/faq.md b/doc/administration/gitaly/faq.md index 98a90925d32..a5964b7a2eb 100644 --- a/doc/administration/gitaly/faq.md +++ b/doc/administration/gitaly/faq.md @@ -7,7 +7,8 @@ type: reference # Frequently asked questions **(FREE SELF)** -The following are answers to frequently asked questions about Gitaly and Gitaly Cluster. +The following are answers to frequently asked questions about Gitaly and Gitaly Cluster. For +troubleshooting information, see [Troubleshooting Gitaly and Gitaly Cluster](troubleshooting.md). ## How does Gitaly Cluster compare to Geo? @@ -87,4 +88,4 @@ There are no special requirements. Gitaly Cluster requires PostgreSQL version 11 These tables are created per the [specific configuration section](praefect.md#postgresql). If you find you have an empty Praefect database table, see the -[relevant troubleshooting section](index.md#relation-does-not-exist-errors). +[relevant troubleshooting section](troubleshooting.md#relation-does-not-exist-errors). 
diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md index eaf9e21780d..0af248e0573 100644 --- a/doc/administration/gitaly/index.md +++ b/doc/administration/gitaly/index.md @@ -19,6 +19,67 @@ Gitaly implements a client-server architecture: - [GitLab Shell](https://gitlab.com/gitlab-org/gitlab-shell). - [GitLab Workhorse](https://gitlab.com/gitlab-org/gitlab-workhorse). +Gitaly manages only Git repository access for GitLab. Other types of GitLab data aren't accessed +using Gitaly. + +GitLab accesses [repositories](../../user/project/repository/index.md) through the configured +[repository storages](../repository_storage_paths.md). Each new repository is stored on one of the +repository storages based on their +[configured weights](../repository_storage_paths.md#configure-where-new-repositories-are-stored). Each +repository storage is either: + +- A Gitaly storage with direct access to repositories using [storage paths](../repository_storage_paths.md), + where each repository is stored on a single Gitaly node. All requests are routed to this node. +- A virtual storage provided by [Gitaly Cluster](#gitaly-cluster), where each repository can be + stored on multiple Gitaly nodes for fault tolerance. In a Gitaly Cluster: + - Read requests are distributed between multiple Gitaly nodes, which can improve performance. + - Write requests are broadcast to repository replicas. + +WARNING: +Engineering support for NFS for Git repositories is deprecated. Read the +[deprecation notice](#nfs-deprecation-notice). + +## Virtual storage + +Virtual storage makes it viable to have a single repository storage in GitLab to simplify repository +management. + +Virtual storage with Gitaly Cluster can usually replace direct Gitaly storage configurations. +However, this is at the expense of additional storage space needed to store each repository on multiple +Gitaly nodes. 
The benefit of using Gitaly Cluster virtual storage over direct Gitaly storage is: + +- Improved fault tolerance, because each Gitaly node has a copy of every repository. +- Improved resource utilization, reducing the need for over-provisioning for shard-specific peak + loads, because read loads are distributed across Gitaly nodes. +- Manual rebalancing for performance is not required, because read loads are distributed across + Gitaly nodes. +- Simpler management, because all Gitaly nodes are identical. + +The number of repository replicas can be configured using a +[replication factor](praefect.md#replication-factor). + +It can +be uneconomical to have the same replication factor for all repositories. +[Variable replication factor](https://gitlab.com/groups/gitlab-org/-/epics/3372) is planned to +provide greater flexibility for extremely large GitLab instances. + +As with normal Gitaly storages, virtual storages can be sharded. + +## Gitaly + +The following shows GitLab set up to use direct access to Gitaly: + +![Shard example](img/shard_example_v13_3.png) + +In this example: + +- Each repository is stored on one of three Gitaly storages: `storage-1`, `storage-2`, or + `storage-3`. +- Each storage is serviced by a Gitaly node. +- The three Gitaly nodes store data on their file systems. + +### Gitaly architecture + The following illustrates the Gitaly client-server architecture: ```mermaid @@ -44,19 +105,7 @@ D -- gRPC --> Gitaly E --> F ``` -End users do not have direct access to Gitaly. Gitaly manages only Git repository access for GitLab. -Other types of GitLab data aren't accessed using Gitaly. - -<!-- vale gitlab.FutureTense = NO --> - -WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. -[Read more about Gitaly and NFS](#nfs-deprecation-notice). 
- -<!-- vale gitlab.FutureTense = YES --> - -## Configure Gitaly +### Configure Gitaly Gitaly comes pre-configured with Omnibus GitLab, which is a configuration [suitable for up to 1000 users](../reference_architectures/1k_users.md). For: @@ -72,10 +121,24 @@ default value. The default value depends on the GitLab version. ## Gitaly Cluster -Gitaly, the service that provides storage for Git repositories, can -be run in a clustered configuration to scale the Gitaly service and increase -fault tolerance. In this configuration, every Git repository is stored on every -Gitaly node in the cluster. +Git storage is provided through the Gitaly service in GitLab, and is essential to the operation of +GitLab. When the number of users, repositories, and activity grows, it is important to scale Gitaly +appropriately by: + +- Increasing the available CPU and memory resources available to Git before + resource exhaustion degrades Git, Gitaly, and GitLab application performance. +- Increasing available storage before storage limits are reached causing write + operations to fail. +- Removing single points of failure to improve fault tolerance. Git should be + considered mission critical if a service degradation would prevent you from + deploying changes to production. + +Gitaly can be run in a clustered configuration to: + +- Scale the Gitaly service. +- Increase fault tolerance. + +In this configuration, every Git repository can be stored on multiple Gitaly nodes in the cluster. Using a Gitaly Cluster increases fault tolerance by: @@ -87,6 +150,19 @@ NOTE: Technical support for Gitaly clusters is limited to GitLab Premium and Ultimate customers. +The following shows GitLab set up to access `storage-1`, a virtual storage provided by Gitaly +Cluster: + +![Cluster example](img/cluster_example_v13_3.png) + +In this example: + +- Repositories are stored on a virtual storage called `storage-1`. +- Three Gitaly nodes provide `storage-1` access: `gitaly-1`, `gitaly-2`, and `gitaly-3`. 
+- The three Gitaly nodes share data in three separate hashed storage locations. +- The [replication factor](praefect.md#replication-factor) is `3`. There are three copies maintained + of each repository. + The availability objectives for Gitaly clusters are: - **Recovery Point Objective (RPO):** Less than 1 minute. @@ -110,33 +186,18 @@ Gitaly Cluster supports: - [Strong consistency](praefect.md#strong-consistency) of the secondary replicas. - [Automatic failover](praefect.md#automatic-failover-and-primary-election-strategies) from the primary to the secondary. - Reporting of possible data loss if replication queue is non-empty. -- Marking repositories as [read-only](praefect.md#read-only-mode) if data loss is detected to prevent data inconsistencies. +- From GitLab 13.0 to GitLab 14.0, marking repositories as [read-only](praefect.md#read-only-mode) + if data loss is detected to prevent data inconsistencies. Follow the [Gitaly Cluster epic](https://gitlab.com/groups/gitlab-org/-/epics/1489) for improvements including [horizontally distributing reads](https://gitlab.com/groups/gitlab-org/-/epics/2013). -### Overview - -Git storage is provided through the Gitaly service in GitLab, and is essential -to the operation of the GitLab application. When the number of -users, repositories, and activity grows, it is important to scale Gitaly -appropriately by: - -- Increasing the available CPU and memory resources available to Git before - resource exhaustion degrades Git, Gitaly, and GitLab application performance. -- Increase available storage before storage limits are reached causing write - operations to fail. -- Improve fault tolerance by removing single points of failure. Git should be - considered mission critical if a service degradation would prevent you from - deploying changes to production. - ### Moving beyond NFS WARNING: -From GitLab 13.0, using NFS for Git repositories is deprecated. 
In GitLab 14.0, -support for NFS for Git repositories is scheduled to be removed. Upgrade to -Gitaly Cluster as soon as possible. +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. [Network File System (NFS)](https://en.wikipedia.org/wiki/Network_File_System) is not well suited to Git workloads which are CPU and IOPS sensitive. @@ -159,22 +220,6 @@ Further reading: - Blog post: [The road to Gitaly v1.0 (aka, why GitLab doesn't require NFS for storing Git data anymore)](https://about.gitlab.com/blog/2018/09/12/the-road-to-gitaly-1-0/) - Blog post: [How we spent two weeks hunting an NFS bug in the Linux kernel](https://about.gitlab.com/blog/2018/11/14/how-we-spent-two-weeks-hunting-an-nfs-bug/) -### Where Gitaly Cluster fits - -GitLab accesses [repositories](../../user/project/repository/index.md) through the configured -[repository storages](../repository_storage_paths.md). Each new repository is stored on one of the -repository storages based on their configured weights. Each repository storage is either: - -- A Gitaly storage served directly by Gitaly. These map to a directory on the file system of a - Gitaly node. -- A [virtual storage](#virtual-storage-or-direct-gitaly-storage) served by Praefect. A virtual - storage is a cluster of Gitaly storages that appear as a single repository storage. - -Virtual storages are a feature of Gitaly Cluster. They support replicating the repositories to -multiple storages for fault tolerance. Virtual storages can improve performance by distributing -requests across Gitaly nodes. Their distributed nature makes it viable to have a single repository -storage in GitLab to simplify repository management. 
- ### Components of Gitaly Cluster Gitaly Cluster consists of multiple components: @@ -182,59 +227,10 @@ Gitaly Cluster consists of multiple components: - [Load balancer](praefect.md#load-balancer) for distributing requests and providing fault-tolerant access to Praefect nodes. - [Praefect](praefect.md#praefect) nodes for managing the cluster and routing requests to Gitaly nodes. -- [PostgreSQL database](praefect.md#postgresql) for persisting cluster metadata and [PgBouncer](praefect.md#pgbouncer), +- [PostgreSQL database](praefect.md#postgresql) for persisting cluster metadata and [PgBouncer](praefect.md#use-pgbouncer), recommended for pooling Praefect's database connections. - Gitaly nodes to provide repository storage and Git access. -![Cluster example](img/cluster_example_v13_3.png) - -In this example: - -- Repositories are stored on a virtual storage called `storage-1`. -- Three Gitaly nodes provide `storage-1` access: `gitaly-1`, `gitaly-2`, and `gitaly-3`. -- The three Gitaly nodes store data on their file systems. - -### Virtual storage or direct Gitaly storage - -Gitaly supports multiple models of scaling: - -- Clustering using Gitaly Cluster, where each repository is stored on multiple Gitaly nodes in the - cluster. Read requests are distributed between repository replicas and write requests are - broadcast to repository replicas. GitLab accesses virtual storage. -- Direct access to Gitaly storage using [repository storage paths](../repository_storage_paths.md), - where each repository is stored on the assigned Gitaly node. All requests are routed to this node. - -The following is Gitaly set up to use direct access to Gitaly instead of Gitaly Cluster: - -![Shard example](img/shard_example_v13_3.png) - -In this example: - -- Each repository is stored on one of three Gitaly storages: `storage-1`, `storage-2`, - or `storage-3`. -- Each storage is serviced by a Gitaly node. -- The three Gitaly nodes share data in three separate hashed storage locations. 
-- The [replication factor](praefect.md#replication-factor) is `3`. There are three copies maintained - of each repository. - -Generally, virtual storage with Gitaly Cluster can replace direct Gitaly storage configurations, at -the expense of additional storage needed to store each repository on multiple Gitaly nodes. The -benefit of using Gitaly Cluster over direct Gitaly storage is: - -- Improved fault tolerance, because each Gitaly node has a copy of every repository. -- Improved resource utilization, reducing the need for over-provisioning for shard-specific peak - loads, because read loads are distributed across replicas. -- Manual rebalancing for performance is not required, because read loads are distributed across - replicas. -- Simpler management, because all Gitaly nodes are identical. - -Under some workloads, CPU and memory requirements may require a large fleet of Gitaly nodes. It -can be uneconomical to have one to one replication factor. - -A hybrid approach can be used in these instances, where each shard is configured as a smaller -cluster. [Variable replication factor](https://gitlab.com/groups/gitlab-org/-/epics/3372) is planned -to provide greater flexibility for extremely large GitLab instances. - ### Architecture Praefect is a router and transaction manager for Gitaly, and a required @@ -360,385 +356,21 @@ The second facet presents the only real solution. For this, we developed ## NFS deprecation notice -<!-- vale gitlab.FutureTense = NO --> - -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. 
Additional information: - [Recommended NFS mount options and known issues with Gitaly and NFS](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). - [GitLab statement of support](https://about.gitlab.com/support/statement-of-support.html#gitaly-and-nfs). -<!-- vale gitlab.FutureTense = YES --> - GitLab recommends: - Creating a [Gitaly Cluster](#gitaly-cluster) as soon as possible. - [Moving your repositories](praefect.md#migrate-to-gitaly-cluster) from NFS-based storage to Gitaly Cluster. -We welcome your feedback on this process: raise a support ticket, or [comment on the epic](https://gitlab.com/groups/gitlab-org/-/epics/4916). - -## Troubleshooting - -Refer to the information below when troubleshooting Gitaly and Gitaly Cluster. - -Before troubleshooting, see the Gitaly and Gitaly Cluster -[frequently asked questions](faq.md). - -### Troubleshoot Gitaly - -The following sections provide possible solutions to Gitaly errors. - -See also [Gitaly timeout](../../user/admin_area/settings/gitaly_timeouts.md) settings. - -#### Check versions when using standalone Gitaly servers - -When using standalone Gitaly servers, you must make sure they are the same version -as GitLab to ensure full compatibility: - -1. On the top bar, select **Menu >** **{admin}** **Admin** on your GitLab instance. -1. On the left sidebar, select **Overview > Gitaly Servers**. -1. Confirm all Gitaly servers indicate that they are up to date. - -#### Use `gitaly-debug` - -The `gitaly-debug` command provides "production debugging" tools for Gitaly and Git -performance. It is intended to help production engineers and support -engineers investigate Gitaly performance problems. - -If you're using GitLab 11.6 or newer, this tool should be installed on -your GitLab or Gitaly server already at `/opt/gitlab/embedded/bin/gitaly-debug`. 
-If you're investigating an older GitLab version you can compile this -tool offline and copy the executable to your server: - -```shell -git clone https://gitlab.com/gitlab-org/gitaly.git -cd cmd/gitaly-debug -GOOS=linux GOARCH=amd64 go build -o gitaly-debug -``` - -To see the help page of `gitaly-debug` for a list of supported sub-commands, run: - -```shell -gitaly-debug -h -``` - -#### Commits, pushes, and clones return a 401 - -```plaintext -remote: GitLab: 401 Unauthorized -``` - -You need to sync your `gitlab-secrets.json` file with your GitLab -application nodes. - -#### Client side gRPC logs - -Gitaly uses the [gRPC](https://grpc.io/) RPC framework. The Ruby gRPC -client has its own log file which may contain useful information when -you are seeing Gitaly errors. You can control the log level of the -gRPC client with the `GRPC_LOG_LEVEL` environment variable. The -default level is `WARN`. - -You can run a gRPC trace with: - -```shell -sudo GRPC_TRACE=all GRPC_VERBOSITY=DEBUG gitlab-rake gitlab:gitaly:check -``` - -#### Server side gRPC logs - -gRPC tracing can also be enabled in Gitaly itself with the `GODEBUG=http2debug` -environment variable. To set this in an Omnibus GitLab install: - -1. Add the following to your `gitlab.rb` file: - - ```ruby - gitaly['env'] = { - "GODEBUG=http2debug" => "2" - } - ``` - -1. [Reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) GitLab. - -#### Correlating Git processes with RPCs - -Sometimes you need to find out which Gitaly RPC created a particular Git process. - -One method for doing this is by using `DEBUG` logging. However, this needs to be enabled -ahead of time and the logs produced are quite verbose. 
- -A lightweight method for doing this correlation is by inspecting the environment -of the Git process (using its `PID`) and looking at the `CORRELATION_ID` variable: - -```shell -PID=<Git process ID> -sudo cat /proc/$PID/environ | tr '\0' '\n' | grep ^CORRELATION_ID= -``` - -This method isn't reliable for `git cat-file` processes, because Gitaly -internally pools and re-uses those across RPCs. - -#### Observing `gitaly-ruby` traffic +We welcome your feedback on this process. You can: -[`gitaly-ruby`](configure_gitaly.md#gitaly-ruby) is an internal implementation detail of Gitaly, -so, there's not that much visibility into what goes on inside -`gitaly-ruby` processes. - -If you have Prometheus set up to scrape your Gitaly process, you can see -request rates and error codes for individual RPCs in `gitaly-ruby` by -querying `grpc_client_handled_total`. - -- In theory, this metric does not differentiate between `gitaly-ruby` and other RPCs. -- In practice from GitLab 11.9, all gRPC calls made by Gitaly itself are internal calls from the - main Gitaly process to one of its `gitaly-ruby` sidecars. - -Assuming your `grpc_client_handled_total` counter only observes Gitaly, -the following query shows you RPCs are (most likely) internally -implemented as calls to `gitaly-ruby`: - -```prometheus -sum(rate(grpc_client_handled_total[5m])) by (grpc_method) > 0 -``` - -#### Repository changes fail with a `401 Unauthorized` error - -If you run Gitaly on its own server and notice these conditions: - -- Users can successfully clone and fetch repositories by using both SSH and HTTPS. -- Users can't push to repositories, or receive a `401 Unauthorized` message when attempting to - make changes to them in the web UI. - -Gitaly may be failing to authenticate with the Gitaly client because it has the -[wrong secrets file](configure_gitaly.md#configure-gitaly-servers). 
- -Confirm the following are all true: - -- When any user performs a `git push` to any repository on this Gitaly server, it - fails with a `401 Unauthorized` error: - - ```shell - remote: GitLab: 401 Unauthorized - To <REMOTE_URL> - ! [remote rejected] branch-name -> branch-name (pre-receive hook declined) - error: failed to push some refs to '<REMOTE_URL>' - ``` - -- When any user adds or modifies a file from the repository using the GitLab - UI, it immediately fails with a red `401 Unauthorized` banner. -- Creating a new project and [initializing it with a README](../../user/project/working_with_projects.md#blank-projects) - successfully creates the project but doesn't create the README. -- When [tailing the logs](https://docs.gitlab.com/omnibus/settings/logs.html#tail-logs-in-a-console-on-the-server) - on a Gitaly client and reproducing the error, you get `401` errors - when reaching the [`/api/v4/internal/allowed`](../../development/internal_api.md) endpoint: - - ```shell - # api_json.log - { - "time": "2019-07-18T00:30:14.967Z", - "severity": "INFO", - "duration": 0.57, - "db": 0, - "view": 0.57, - "status": 401, - "method": "POST", - "path": "\/api\/v4\/internal\/allowed", - "params": [ - { - "key": "action", - "value": "git-receive-pack" - }, - { - "key": "changes", - "value": "REDACTED" - }, - { - "key": "gl_repository", - "value": "REDACTED" - }, - { - "key": "project", - "value": "\/path\/to\/project.git" - }, - { - "key": "protocol", - "value": "web" - }, - { - "key": "env", - "value": "{\"GIT_ALTERNATE_OBJECT_DIRECTORIES\":[],\"GIT_ALTERNATE_OBJECT_DIRECTORIES_RELATIVE\":[],\"GIT_OBJECT_DIRECTORY\":null,\"GIT_OBJECT_DIRECTORY_RELATIVE\":null}" - }, - { - "key": "user_id", - "value": "2" - }, - { - "key": "secret_token", - "value": "[FILTERED]" - } - ], - "host": "gitlab.example.com", - "ip": "REDACTED", - "ua": "Ruby", - "route": "\/api\/:version\/internal\/allowed", - "queue_duration": 4.24, - "gitaly_calls": 0, - "gitaly_duration": 0, - 
"correlation_id": "XPUZqTukaP3" - } - - # nginx_access.log - [IP] - - [18/Jul/2019:00:30:14 +0000] "POST /api/v4/internal/allowed HTTP/1.1" 401 30 "" "Ruby" - ``` - -To fix this problem, confirm that your [`gitlab-secrets.json` file](configure_gitaly.md#configure-gitaly-servers) -on the Gitaly server matches the one on Gitaly client. If it doesn't match, -update the secrets file on the Gitaly server to match the Gitaly client, then -[reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure). - -#### Command line tools cannot connect to Gitaly - -gRPC cannot reach your Gitaly server if: - -- You can't connect to a Gitaly server with command-line tools. -- Certain actions result in a `14: Connect Failed` error message. - -Verify you can reach Gitaly by using TCP: - -```shell -sudo gitlab-rake gitlab:tcp_check[GITALY_SERVER_IP,GITALY_LISTEN_PORT] -``` - -If the TCP connection: - -- Fails, check your network settings and your firewall rules. -- Succeeds, your networking and firewall rules are correct. - -If you use proxy servers in your command line environment such as Bash, these can interfere with -your gRPC traffic. - -If you use Bash or a compatible command line environment, run the following commands to determine -whether you have proxy servers configured: - -```shell -echo $http_proxy -echo $https_proxy -``` - -If either of these variables have a value, your Gitaly CLI connections may be getting routed through -a proxy which cannot connect to Gitaly. - -To remove the proxy setting, run the following commands (depending on which variables had values): - -```shell -unset http_proxy -unset https_proxy -``` - -#### Permission denied errors appearing in Gitaly or Praefect logs when accessing repositories - -You might see the following in Gitaly and Praefect logs: - -```shell -{ - ... 
- "error":"rpc error: code = PermissionDenied desc = permission denied", - "grpc.code":"PermissionDenied", - "grpc.meta.client_name":"gitlab-web", - "grpc.request.fullMethod":"/gitaly.ServerService/ServerInfo", - "level":"warning", - "msg":"finished unary call with code PermissionDenied", - ... -} -``` - -This is a GRPC call -[error response code](https://grpc.github.io/grpc/core/md_doc_statuscodes.html). - -If this error occurs, even though -[the Gitaly auth tokens are set up correctly](#praefect-errors-in-logs), -it's likely that the Gitaly servers are experiencing -[clock drift](https://en.wikipedia.org/wiki/Clock_drift). - -Ensure the Gitaly clients and servers are synchronized, and use an NTP time -server to keep them synchronized. - -#### Gitaly not listening on new address after reconfiguring - -When updating the `gitaly['listen_addr']` or `gitaly['prometheus_listen_addr']` values, Gitaly may -continue to listen on the old address after a `sudo gitlab-ctl reconfigure`. - -When this occurs, run `sudo gitlab-ctl restart` to resolve the issue. This should no longer be -necessary because [this issue](https://gitlab.com/gitlab-org/gitaly/-/issues/2521) is resolved. - -#### Permission denied errors appearing in Gitaly logs when accessing repositories from a standalone Gitaly node - -If this error occurs even though file permissions are correct, it's likely that the Gitaly node is -experiencing [clock drift](https://en.wikipedia.org/wiki/Clock_drift). - -Please ensure that the GitLab and Gitaly nodes are synchronized and use an NTP time -server to keep them synchronized if possible. - -### Troubleshoot Praefect (Gitaly Cluster) - -The following sections provide possible solutions to Gitaly Cluster errors. - -#### Praefect errors in logs - -If you receive an error, check `/var/log/gitlab/gitlab-rails/production.log`. 
- -Here are common errors and potential causes: - -- 500 response code - - **ActionView::Template::Error (7:permission denied)** - - `praefect['auth_token']` and `gitlab_rails['gitaly_token']` do not match on the GitLab server. - - **Unable to save project. Error: 7:permission denied** - - Secret token in `praefect['storage_nodes']` on GitLab server does not match the - value in `gitaly['auth_token']` on one or more Gitaly servers. -- 503 response code - - **GRPC::Unavailable (14:failed to connect to all addresses)** - - GitLab was unable to reach Praefect. - - **GRPC::Unavailable (14:all SubCons are in TransientFailure...)** - - Praefect cannot reach one or more of its child Gitaly nodes. Try running - the Praefect connection checker to diagnose. - -#### Determine primary Gitaly node - -To determine the current primary Gitaly node for a specific Praefect node: - -- Use the `Shard Primary Election` [Grafana chart](praefect.md#grafana) on the - [`Gitlab Omnibus - Praefect` dashboard](https://gitlab.com/gitlab-org/grafana-dashboards/-/blob/master/omnibus/praefect.json). - This is recommended. -- If you do not have Grafana set up, use the following command on each host of each - Praefect node: - - ```shell - curl localhost:9652/metrics | grep gitaly_praefect_primaries - ``` - -#### Relation does not exist errors - -By default Praefect database tables are created automatically by the `gitlab-ctl reconfigure` task. -However, if the `gitlab-ctl reconfigure` command isn't executed or there are errors during the -execution, the Praefect database tables are not created on initial reconfigure and can throw -errors that relations do not exist. 
- -For example: - -- `ERROR: relation "node_status" does not exist at character 13` -- `ERROR: relation "replication_queue_lock" does not exist at character 40` -- This error: - - ```json - {"level":"error","msg":"Error updating node: pq: relation \"node_status\" does not exist","pid":210882,"praefectName":"gitlab1x4m:0.0.0.0:2305","time":"2021-04-01T19:26:19.473Z","virtual_storage":"praefect-cluster-1"} - ``` - -To solve this, the database schema migration can be done using `sql-migrate` sub-command of -the `praefect` command: - -```shell -$ sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml sql-migrate -praefect sql-migrate: OK (applied 21 migrations) -``` +- Raise a support ticket. +- [Comment on the epic](https://gitlab.com/groups/gitlab-org/-/epics/4916). diff --git a/doc/administration/gitaly/praefect.md b/doc/administration/gitaly/praefect.md index 21e5360e27b..e483bcc944a 100644 --- a/doc/administration/gitaly/praefect.md +++ b/doc/administration/gitaly/praefect.md @@ -43,8 +43,8 @@ default value. The default value depends on the GitLab version. ## Setup Instructions -If you [installed](https://about.gitlab.com/install/) GitLab using the Omnibus -package (highly recommended), follow the steps below: +If you [installed](https://about.gitlab.com/install/) GitLab using the Omnibus GitLab package +(highly recommended), follow the steps below: 1. [Preparation](#preparation) 1. [Configuring the Praefect database](#postgresql) @@ -59,25 +59,27 @@ package (highly recommended), follow the steps below: Before beginning, you should already have a working GitLab instance. [Learn how to install GitLab](https://about.gitlab.com/install/). -Provision a PostgreSQL server (PostgreSQL 11 or newer). +Provision a PostgreSQL server. We recommend using the PostgreSQL that is shipped +with Omnibus GitLab and use it to configure the PostgreSQL database. 
You can use an +external PostgreSQL server (version 11 or newer) but you must set it up [manually](#manual-database-setup). -Prepare all your new nodes by [installing -GitLab](https://about.gitlab.com/install/). +Prepare all your new nodes by [installing GitLab](https://about.gitlab.com/install/). You need: +- 1 PostgreSQL node +- 1 PgBouncer node (optional) - At least 1 Praefect node (minimal storage required) - 3 Gitaly nodes (high CPU, high memory, fast storage) - 1 GitLab server -You need the IP/host address for each node. +You also need the IP/host address for each node: -1. `LOAD_BALANCER_SERVER_ADDRESS`: the IP/host address of the load balancer -1. `POSTGRESQL_SERVER_ADDRESS`: the IP/host address of the PostgreSQL server +1. `PRAEFECT_LOADBALANCER_HOST`: the IP/host address of the Praefect load balancer +1. `POSTGRESQL_HOST`: the IP/host address of the PostgreSQL server +1. `PGBOUNCER_HOST`: the IP/host address of the PgBouncer server 1. `PRAEFECT_HOST`: the IP/host address of the Praefect server 1. `GITALY_HOST_*`: the IP or host address of each Gitaly server 1. `GITLAB_HOST`: the IP/host address of the GitLab server -If you are using a cloud provider, you can look up the addresses for each server through your cloud provider's management console. - If you are using Google Cloud Platform, SoftLayer, or any other vendor that provides a virtual private cloud (VPC) you can use the private addresses for each cloud instance (corresponds to "internal address" for Google Cloud Platform) for `PRAEFECT_HOST`, `GITALY_HOST_*`, and `GITLAB_HOST`. #### Secrets @@ -98,6 +100,14 @@ with secure tokens as you complete the setup process. Praefect cluster directly; that could lead to data loss. 1. `PRAEFECT_SQL_PASSWORD`: this password is used by Praefect to connect to PostgreSQL. +1. `PRAEFECT_SQL_PASSWORD_HASH`: the hash of the password of the Praefect user. + Use `gitlab-ctl pg-password-md5 praefect` to generate the hash. The command + asks for the password for the `praefect` user. 
Enter `PRAEFECT_SQL_PASSWORD` + plaintext password. By default, Praefect uses `praefect` user, but you can + change it. +1. `PGBOUNCER_SQL_PASSWORD_HASH`: the hash of password of the PgBouncer user. + PgBouncer uses this password to connect to PostgreSQL. For more details + see [bundled PgBouncer](../postgresql/pgbouncer.md) documentation. We note in the instructions below where these secrets are required. @@ -108,127 +118,210 @@ Omnibus GitLab installations can use `gitlab-secrets.json` for `GITLAB_SHELL_SEC NOTE: Do not store the GitLab application database and the Praefect -database on the same PostgreSQL server if using -[Geo](../geo/index.md). The replication state is internal to each instance -of GitLab and should not be replicated. +database on the same PostgreSQL server if using [Geo](../geo/index.md). +The replication state is internal to each instance of GitLab and should +not be replicated. These instructions help set up a single PostgreSQL database, which creates a single point of -failure. The following options are available: +failure. Alternatively, [you can use PostgreSQL replication and failover](../postgresql/replication_and_failover.md). + +The following options are available: - For non-Geo installations, either: - Use one of the documented [PostgreSQL setups](../postgresql/index.md). - - Use your own third-party database setup, if fault tolerance is required. + - Use your own third-party database setup. This will require [manual setup](#manual-database-setup). - For Geo instances, either: - Set up a separate [PostgreSQL instance](https://www.postgresql.org/docs/11/high-availability.html). - Use a cloud-managed PostgreSQL service. AWS [Relational Database Service](https://aws.amazon.com/rds/) is recommended. 
-To complete this section you need: +#### Manual database setup -- 1 Praefect node -- 1 PostgreSQL server (PostgreSQL 11 or newer) - - An SQL user with permissions to create databases +To complete this section you need: -During this section, we configure the PostgreSQL server, from the Praefect -node, using `psql` which is installed by Omnibus GitLab. +- One Praefect node +- One PostgreSQL node (version 11 or newer) + - A PostgreSQL user with permissions to manage the database server -1. SSH into the **Praefect** node and login as root: +In this section, we configure the PostgreSQL database. This can be used for both external +and Omnibus-provided PostgreSQL server. - ```shell - sudo -i - ``` +To run the following instructions, you can use the Praefect node, where `psql` is installed +by Omnibus GitLab (`/opt/gitlab/embedded/bin/psql`). If you are using the Omnibus-provided +PostgreSQL you can use `gitlab-psql` on the PostgreSQL node instead: -1. Connect to the PostgreSQL server with administrative access. This is likely - the `postgres` user. The database `template1` is used because it is created - by default on all PostgreSQL servers. +1. Create a new user `praefect` to be used by Praefect: - ```shell - /opt/gitlab/embedded/bin/psql -U postgres -d template1 -h POSTGRESQL_SERVER_ADDRESS + ```sql + CREATE ROLE praefect WITH LOGIN PASSWORD 'PRAEFECT_SQL_PASSWORD'; ``` - Create a new user `praefect` to be used by Praefect. Replace - `PRAEFECT_SQL_PASSWORD` with the strong password you generated in the - preparation step. + Replace `PRAEFECT_SQL_PASSWORD` with the strong password you generated in the preparation step. + +1. Create a new database `praefect_production` that is owned by `praefect` user. ```sql - CREATE ROLE praefect WITH LOGIN CREATEDB PASSWORD 'PRAEFECT_SQL_PASSWORD'; + CREATE DATABASE praefect_production WITH OWNER praefect ENCODING UTF8; ``` -1. 
Reconnect to the PostgreSQL server, this time as the `praefect` user: +To use the Omnibus-provided PgBouncer, you need to take the following additional steps. We strongly +recommend using the PostgreSQL that is shipped with Omnibus as the backend. The following +instructions only work on Omnibus-provided PostgreSQL: - ```shell - /opt/gitlab/embedded/bin/psql -U praefect -d template1 -h POSTGRESQL_SERVER_ADDRESS +1. For Omnibus-provided PgBouncer, you need to use the hash of the `praefect` user's password instead of the + actual password: + + ```sql + ALTER ROLE praefect WITH PASSWORD 'md5<PRAEFECT_SQL_PASSWORD_HASH>'; ``` - Create a new database `praefect_production`. By creating the database while - connected as the `praefect` user, we are confident they have access. + Replace `<PRAEFECT_SQL_PASSWORD_HASH>` with the hash of the password you generated in the + preparation step. Note that it is prefixed with the `md5` literal. + +1. The PgBouncer that is shipped with Omnibus is configured to use [`auth_query`](https://www.pgbouncer.org/config.html#generic-settings) + and uses the `pg_shadow_lookup` function. You need to create this function in the `praefect_production` + database: ```sql - CREATE DATABASE praefect_production WITH ENCODING=UTF8; + CREATE OR REPLACE FUNCTION public.pg_shadow_lookup(in i_username text, out username text, out password text) RETURNS record AS $$ + BEGIN + SELECT usename, passwd FROM pg_catalog.pg_shadow + WHERE usename = i_username INTO username, password; + RETURN; + END; + $$ LANGUAGE plpgsql SECURITY DEFINER; + + REVOKE ALL ON FUNCTION public.pg_shadow_lookup(text) FROM public, pgbouncer; + GRANT EXECUTE ON FUNCTION public.pg_shadow_lookup(text) TO pgbouncer; ``` The database used by Praefect is now configured. If you see Praefect database errors after configuring PostgreSQL, see -[troubleshooting steps](index.md#relation-does-not-exist-errors). +[troubleshooting steps](troubleshooting.md#relation-does-not-exist-errors). 
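The password hash used in the `ALTER ROLE` step above can be cross-checked locally. The following is a minimal sketch, not part of the patch: it assumes the hash follows PostgreSQL's `md5` scheme (the hex MD5 digest of the password concatenated with the user name) and that `md5sum` is available. The `pg_md5_hash` helper name is hypothetical, and `gitlab-ctl pg-password-md5 praefect` remains the documented way to generate the value.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not provided by Omnibus GitLab): prints the hex MD5
# digest of "<password><username>", which is the assumed PostgreSQL md5 scheme.
pg_md5_hash() {
  local user="$1" password="$2"
  # printf avoids the trailing newline that echo would add to the digest input.
  printf '%s%s' "$password" "$user" | md5sum | cut -d' ' -f1
}

# Replace the placeholder with the real plain-text password before use.
PRAEFECT_SQL_PASSWORD_HASH="$(pg_md5_hash praefect 'PRAEFECT_SQL_PASSWORD')"

# The statement adds the literal `md5` prefix in front of the digest.
echo "ALTER ROLE praefect WITH PASSWORD 'md5${PRAEFECT_SQL_PASSWORD_HASH}';"
```

If the digest printed here differs from the output of `gitlab-ctl pg-password-md5 praefect` for the same password, treat the `gitlab-ctl` output as authoritative.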
-#### PgBouncer +#### Use PgBouncer To reduce PostgreSQL resource consumption, we recommend setting up and configuring [PgBouncer](https://www.pgbouncer.org/) in front of the PostgreSQL instance. To do -this, set the corresponding IP or host address of the PgBouncer instance in -`/etc/gitlab/gitlab.rb` by changing the following settings: +this, you must point Praefect to PgBouncer by setting Praefect database parameters: -- `praefect['database_host']`, for the address. -- `praefect['database_port']`, for the port. +```ruby +praefect['database_host'] = PGBOUNCER_HOST +praefect['database_port'] = 6432 +praefect['database_user'] = 'praefect' +praefect['database_password'] = PRAEFECT_SQL_PASSWORD +praefect['database_dbname'] = 'praefect_production' +#praefect['database_sslmode'] = '...' +#praefect['database_sslcert'] = '...' +#praefect['database_sslkey'] = '...' +#praefect['database_sslrootcert'] = '...' +``` -Because PgBouncer manages resources more efficiently, Praefect still requires a -direct connection to the PostgreSQL database. It uses the -[LISTEN](https://www.postgresql.org/docs/11/sql-listen.html) -feature that is [not supported](https://www.pgbouncer.org/features.html) by -PgBouncer with `pool_mode = transaction`. -Set `praefect['database_host_no_proxy']` and `praefect['database_port_no_proxy']` -to a direct connection, and not a PgBouncer connection. +Praefect requires an additional connection to the PostgreSQL that supports the +[LISTEN](https://www.postgresql.org/docs/11/sql-listen.html) feature. With PgBouncer +this feature is only available with `session` pool mode (`pool_mode = session`). +It is not supported in `transaction` pool mode (`pool_mode = transaction`). -Save the changes to `/etc/gitlab/gitlab.rb` and -[reconfigure Praefect](../restart_gitlab.md#omnibus-gitlab-reconfigure). 
+For the additional connection, you must either: -This documentation doesn't provide PgBouncer installation instructions, -but you can: +- Connect Praefect directly to PostgreSQL and bypass PgBouncer. +- Configure a new PgBouncer database that uses the same PostgreSQL database endpoint, + but with a different pool mode. That is, `pool_mode = session`. -- Find instructions on the [official website](https://www.pgbouncer.org/install.html). -- Use a [Docker image](https://hub.docker.com/r/edoburu/pgbouncer/). +Praefect can be configured to use different connection parameters for direct access +to PostgreSQL. This is the connection that supports the `LISTEN` feature. -In addition to the base PgBouncer configuration options, set the following values in -your `pgbouncer.ini` file: +Here is an example of a Praefect configuration that bypasses PgBouncer and connects directly to PostgreSQL: -- The [Praefect PostgreSQL database](#postgresql) in the `[databases]` section: +```ruby +praefect['database_direct_host'] = POSTGRESQL_HOST +praefect['database_direct_port'] = 5432 + +# Use the following to override parameters of direct database connection. +# Comment out where the parameters are the same for both connections. + +praefect['database_direct_user'] = 'praefect' +praefect['database_direct_password'] = PRAEFECT_SQL_PASSWORD +praefect['database_direct_dbname'] = 'praefect_production' +#praefect['database_direct_sslmode'] = '...' +#praefect['database_direct_sslcert'] = '...' +#praefect['database_direct_sslkey'] = '...' +#praefect['database_direct_sslrootcert'] = '...' +``` - ```ini - [databases] - * = host=POSTGRESQL_SERVER_ADDRESS port=5432 auth_user=praefect - ``` +We recommend using PgBouncer with `session` pool mode instead. You can use the [bundled +PgBouncer](../postgresql/pgbouncer.md) or use an external PgBouncer and [configure it +manually](https://www.pgbouncer.org/config.html). 
-- [`pool_mode`](https://www.pgbouncer.org/config.html#pool_mode) - and [`ignore_startup_parameters`](https://www.pgbouncer.org/config.html#ignore_startup_parameters) - in the `[pgbouncer]` section: +The following example uses the bundled PgBouncer and sets up two separate connection pools, +one in `session` pool mode and the other in `transaction` pool mode. For this example to work, +you need to prepare the PostgreSQL server using the [setup instructions](#manual-database-setup): - ```ini - [pgbouncer] - pool_mode = transaction - ignore_startup_parameters = extra_float_digits - ``` +```ruby +pgbouncer['databases'] = { + # Other database configuration including gitlabhq_production + ... + + praefect_production: { + host: POSTGRESQL_HOST, + # Use `pgbouncer` user to connect to database backend. + user: 'pgbouncer', + password: PGBOUNCER_SQL_PASSWORD_HASH, + pool_mode: 'transaction' + }, + praefect_production_direct: { + host: POSTGRESQL_HOST, + # Use `pgbouncer` user to connect to database backend. + user: 'pgbouncer', + password: PGBOUNCER_SQL_PASSWORD_HASH, + dbname: 'praefect_production', + pool_mode: 'session' + }, + + ... +} +``` + +Both `praefect_production` and `praefect_production_direct` use the same database endpoint +(`praefect_production`), but with different pool modes. This translates to the following +`databases` section of PgBouncer: -The `praefect` user and its password should be included in the file (default is -`userlist.txt`) used by PgBouncer if the [`auth_file`](https://www.pgbouncer.org/config.html#auth_file) -configuration option is set. 
+```ini +[databases] +praefect_production = host=POSTGRESQL_HOST auth_user=pgbouncer pool_mode=transaction +praefect_production_direct = host=POSTGRESQL_HOST auth_user=pgbouncer dbname=praefect_production pool_mode=session +``` + +Now you can configure Praefect to use PgBouncer for both connections: + +```ruby +praefect['database_host'] = PGBOUNCER_HOST +praefect['database_port'] = 6432 +praefect['database_user'] = 'praefect' +# `PRAEFECT_SQL_PASSWORD` is the plain-text password of +# Praefect user. Not to be confused with `PRAEFECT_SQL_PASSWORD_HASH`. +praefect['database_password'] = PRAEFECT_SQL_PASSWORD + +praefect['database_dbname'] = 'praefect_production' +praefect['database_direct_dbname'] = 'praefect_production_direct' + +# There is no need to repeat the following. Parameters of direct +# database connection will fall back to the values above. + +#praefect['database_direct_host'] = PGBOUNCER_HOST +#praefect['database_direct_port'] = 6432 +#praefect['database_direct_user'] = 'praefect' +#praefect['database_direct_password'] = PRAEFECT_SQL_PASSWORD +``` + +With this configuration, Praefect uses PgBouncer for both connection types. NOTE: -By default PgBouncer uses port `6432` to accept incoming -connections. You can change it by setting the [`listen_port`](https://www.pgbouncer.org/config.html#listen_port) -configuration option. We recommend setting it to the default port value (`5432`) used by -PostgreSQL instances. Otherwise you should change the configuration parameter -`praefect['database_port']` for each Praefect instance to the correct value. +Omnibus GitLab handles the authentication requirements (using `auth_query`), but if you are preparing +your databases manually and configuring an external PgBouncer, you must include `praefect` user and +its password in the file used by PgBouncer. For example, `userlist.txt` if the [`auth_file`](https://www.pgbouncer.org/config.html#auth_file) +configuration option is set. 
For more details, consult the PgBouncer documentation. ### Praefect @@ -241,17 +334,10 @@ If there are multiple Praefect nodes: To complete this section you need a [configured PostgreSQL server](#postgresql), including: -- IP/host address (`POSTGRESQL_SERVER_ADDRESS`) -- Password (`PRAEFECT_SQL_PASSWORD`) - Praefect should be run on a dedicated node. Do not run Praefect on the application server, or a Gitaly node. -1. SSH into the **Praefect** node and login as root: - - ```shell - sudo -i - ``` +On the **Praefect** node: 1. Disable all other services by editing `/etc/gitlab/gitlab.rb`: @@ -295,22 +381,8 @@ application server, or a Gitaly node. praefect['auth_token'] = 'PRAEFECT_EXTERNAL_TOKEN' ``` -1. Configure **Praefect** to connect to the PostgreSQL database by editing - `/etc/gitlab/gitlab.rb`. - - You need to replace `POSTGRESQL_SERVER_ADDRESS` with the IP/host address - of the database, and `PRAEFECT_SQL_PASSWORD` with the strong password set - above. - - ```ruby - praefect['database_host'] = 'POSTGRESQL_SERVER_ADDRESS' - praefect['database_port'] = 5432 - praefect['database_user'] = 'praefect' - praefect['database_password'] = 'PRAEFECT_SQL_PASSWORD' - praefect['database_dbname'] = 'praefect_production' - praefect['database_host_no_proxy'] = 'POSTGRESQL_SERVER_ADDRESS' - praefect['database_port_no_proxy'] = 5432 - ``` +1. Configure **Praefect** to [connect to the PostgreSQL database](#postgresql). We + highly recommend using [PgBouncer](#use-pgbouncer) as well. 
If you want to use a TLS client certificate, the options below can be used: @@ -507,7 +579,7 @@ To configure Praefect with TLS: ```ruby git_data_dirs({ "default" => { - "gitaly_address" => 'tls://LOAD_BALANCER_SERVER_ADDRESS:2305', + "gitaly_address" => 'tls://PRAEFECT_LOADBALANCER_HOST:2305', "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN' } }) @@ -544,7 +616,7 @@ To configure Praefect with TLS: repositories: storages: default: - gitaly_address: tls://LOAD_BALANCER_SERVER_ADDRESS:3305 + gitaly_address: tls://PRAEFECT_LOADBALANCER_HOST:3305 path: /some/local/path ``` @@ -817,7 +889,7 @@ Particular attention should be shown to: You need to replace: - - `LOAD_BALANCER_SERVER_ADDRESS` with the IP address or hostname of the load + - `PRAEFECT_LOADBALANCER_HOST` with the IP address or hostname of the load balancer. - `PRAEFECT_EXTERNAL_TOKEN` with the real secret @@ -826,7 +898,7 @@ Particular attention should be shown to: ```ruby git_data_dirs({ "default" => { - "gitaly_address" => "tcp://LOAD_BALANCER_SERVER_ADDRESS:2305", + "gitaly_address" => "tcp://PRAEFECT_LOADBALANCER_HOST:2305", "gitaly_token" => 'PRAEFECT_EXTERNAL_TOKEN' } }) @@ -926,7 +998,7 @@ For example: git_data_dirs({ 'default' => { 'gitaly_address' => 'tcp://old-gitaly.internal:8075' }, 'cluster' => { - 'gitaly_address' => 'tcp://<load_balancer_server_address>:2305', + 'gitaly_address' => 'tcp://<PRAEFECT_LOADBALANCER_HOST>:2305', 'gitaly_token' => '<praefect_external_token>' } }) @@ -981,6 +1053,26 @@ To get started quickly: Congratulations! You've configured an observable fault-tolerant Praefect cluster. +## Network connectivity requirements + +Gitaly Cluster components need to communicate with each other over many routes. 
+Your firewall rules must allow the following for Gitaly Cluster to function properly: + +| From | To | Default port / TLS port | +|:-----------------------|:------------------------|:------------------------| +| GitLab | Praefect load balancer | `2305` / `3305` | +| Praefect load balancer | Praefect | `2305` / `3305` | +| Praefect | Gitaly | `8075` / `9999` | +| Gitaly | GitLab (internal API) | `80` / `443` | +| Gitaly | Praefect load balancer | `2305` / `3305` | +| Gitaly | Praefect | `2305` / `3305` | +| Gitaly | Gitaly | `8075` / `9999` | + +NOTE: +Gitaly does not directly connect to Praefect. However, requests from Gitaly to the Praefect +load balancer may still be blocked unless firewalls on the Praefect nodes allow traffic from +the Gitaly nodes. + ## Distributed reads > - Introduced in GitLab 13.1 in [beta](https://about.gitlab.com/handbook/product/gitlab-the-product/#alpha-beta-ga) with feature flag `gitaly_distributed_reads` set to disabled. @@ -1147,24 +1239,30 @@ The `per_repository` election strategy solves this problem by electing a primary repository. Combined with [configurable replication factors](#configure-replication-factor), you can horizontally scale storage capacity and distribute write load across Gitaly nodes. -Primary elections are run when: +Primary elections are run: -- Praefect starts up. -- The cluster's consensus of a Gitaly node's health changes. +- In GitLab 14.1 and later, lazily. This means that Praefect doesn't immediately elect + a new primary node if the current one is unhealthy. A new primary is elected if it is + necessary to serve a request while the current primary is unavailable. +- In GitLab 13.12 to GitLab 14.0 when: + - Praefect starts up. + - The cluster's consensus of a Gitaly node's health changes. -A Gitaly node is considered: +A valid primary node candidate is a Gitaly node that: -- Healthy if `>=50%` Praefect nodes have successfully health checked the Gitaly node in the - previous ten seconds. 
-- Unhealthy otherwise. +- Is healthy. A Gitaly node is considered healthy if `>=50%` Praefect nodes have + successfully health checked the Gitaly node in the previous ten seconds. +- Has a fully up to date copy of the repository. -During an election run, Praefect elects a new primary Gitaly node for each repository that has -an unhealthy primary Gitaly node. The election is made: +If there are multiple primary node candidates, Praefect: -- Randomly from healthy secondary Gitaly nodes that are the most up to date. -- Only from Gitaly nodes assigned to the host repository. +- Picks one of them randomly. +- Prioritizes promoting a Gitaly node that is assigned to host the repository. If + there are no assigned Gitaly nodes to elect as the primary, Praefect may temporarily + elect an unassigned one. The unassigned primary is demoted in favor of an assigned + one when one becomes available. -If there are no healthy secondary nodes for a repository: +If there are no valid primary candidates for a repository: - The unhealthy primary node is demoted and the repository is left without a primary node. - Operations that require a primary node fail until a primary is successfully elected. @@ -1212,7 +1310,7 @@ To migrate existing clusters: - If downtime is unacceptable: - 1. Determine which Gitaly node is [the current primary](index.md#determine-primary-gitaly-node). + 1. Determine which Gitaly node is [the current primary](troubleshooting.md#determine-primary-gitaly-node). 1. Comment out the secondary Gitaly nodes from the virtual storage's configuration in `/etc/gitlab/gitlab.rb` on all Praefect nodes. This ensures there's only one Gitaly node configured, causing both of the election @@ -1259,23 +1357,37 @@ Migrate to [repository-specific primary nodes](#repository-specific-primary-node Gitaly Cluster recovers from a failing primary Gitaly node by promoting a healthy secondary as the new primary. 
-To minimize data loss, Gitaly Cluster: +In GitLab 14.1 and later, Gitaly Cluster: + +- Elects a healthy secondary with a fully up to date copy of the repository as the new primary. +- Repository becomes unavailable if there are no fully up to date copies of it on healthy secondaries. + +To minimize data loss in GitLab 13.0 to 14.0, Gitaly Cluster: - Switches repositories that are outdated on the new primary to [read-only mode](#read-only-mode). -- Elects the secondary with the least unreplicated writes from the primary to be the new primary. - Because there can still be some unreplicated writes, [data loss can occur](#check-for-data-loss). +- Elects the secondary with the least unreplicated writes from the primary to be the new + primary. Because there can still be some unreplicated writes, + [data loss can occur](#check-for-data-loss). ### Read-only mode > - Introduced in GitLab 13.0 as [generally available](https://about.gitlab.com/handbook/product/gitlab-the-product/#generally-available-ga). > - Between GitLab 13.0 and GitLab 13.2, read-only mode applied to the whole virtual storage and occurred whenever failover occurred. > - [In GitLab 13.3 and later](https://gitlab.com/gitlab-org/gitaly/-/issues/2862), read-only mode applies on a per-repository basis and only occurs if a new primary is out of date. +new primary. If the failed primary contained unreplicated writes, [data loss can occur](#check-for-data-loss). +> - Removed in GitLab 14.1. Instead, repositories [become unavailable](#unavailable-repositories). + +In GitLab 13.0 to 14.0, when Gitaly Cluster switches to a new primary, repositories enter +read-only mode if they are out of date. This can happen after failing over to an outdated +secondary. Read-only mode eases data recovery efforts by preventing writes that may conflict +with the unreplicated writes on other nodes. -When Gitaly Cluster switches to a new primary, repositories enter read-only mode if they are out of -date. 
This can happen after failing over to an outdated secondary. Read-only mode eases data -recovery efforts by preventing writes that may conflict with the unreplicated writes on other nodes. +When Gitaly Cluster switches to a new primary In GitLab 13.0 to 14.0, repositories enter +read-only mode if they are out of date. This can happen after failing over to an outdated +secondary. Read-only mode eases data recovery efforts by preventing writes that may conflict +with the unreplicated writes on other nodes. -To enable writes again, an administrator can: +To enable writes again in GitLab 13.0 to 14.0, an administrator can: 1. [Check](#check-for-data-loss) for data loss. 1. Attempt to [recover](#data-recovery) missing data. @@ -1283,21 +1395,38 @@ To enable writes again, an administrator can: [accept data loss](#enable-writes-or-accept-data-loss) if necessary, depending on the version of GitLab. +## Unavailable repositories + +> - From GitLab 13.0 through 14.0, repositories became read-only if they were outdated on the primary but fully up to date on a healthy secondary. `dataloss` sub-command displays read-only repositories by default through these versions. +> - Since GitLab 14.1, Praefect contains more responsive failover logic which immediately fails over to one of the fully up to date secondaries rather than placing the repository in read-only mode. Since GitLab 14.1, the `dataloss` sub-command displays repositories which are unavailable due to having no fully up to date copies on healthy Gitaly nodes. + +A repository is unavailable if all of its up to date replicas are unavailable. Unavailable repositories are +not accessible through Praefect to prevent serving stale data that may break automated tooling. + ### Check for data loss -The Praefect `dataloss` sub-command identifies replicas that are likely to be outdated. This can help -identify potential data loss after a failover. 
The following parameters are -available: +The Praefect `dataloss` subcommand identifies: + +- Copies of repositories in GitLab 13.0 to GitLab 14.0 that are likely to be outdated. + This can help identify potential data loss after a failover. +- Repositories in GitLab 14.1 and later that are unavailable. This helps identify potential + data loss and repositories which are no longer accessible because all of their up-to-date + copies are unavailable. + +The following parameters are available: -- `-virtual-storage` that specifies which virtual storage to check. The default behavior is to - display outdated replicas of read-only repositories as they might require administrator action. -- In GitLab 13.3 and later, `-partially-replicated` that specifies whether to display a list of - [outdated replicas of writable repositories](#outdated-replicas-of-writable-repositories). +- `-virtual-storage` that specifies which virtual storage to check. Because they might require + an administrator to intervene, the default behavior is to display: + - In GitLab 13.0 to 14.0, copies of read-only repositories. + - In GitLab 14.1 and later, unavailable repositories. +- In GitLab 14.1 and later, [`-partially-unavailable`](#unavailable-replicas-of-available-repositories) + that specifies whether to include in the output repositories that are available but have + some assigned copies that are not available. NOTE: `dataloss` is still in beta and the output format is subject to change. 
-To check for repositories with outdated primaries, run: +To check for repositories with outdated primaries or for unavailable repositories, run: ```shell sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>] @@ -1309,13 +1438,20 @@ Every configured virtual storage is checked if none is specified: sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss ``` -Repositories which have assigned storage nodes that contain an outdated copy of the repository are listed -in the output. This information is printed for each repository: +Repositories are listed in the output that have either: + +- An outdated copy of the repository on the primary, in GitLab 13.0 to GitLab 14.0. +- No healthy and fully up-to-date copies available, in GitLab 14.1 and later. + +The following information is printed for each repository: - A repository's relative path to the storage directory identifies each repository and groups the related information. -- The repository's current status is printed in parentheses next to the disk path. If the repository's primary - is outdated, the repository is in `read-only` mode and can't accept writes. Otherwise, the mode is `writable`. +- The repository's current status is printed in parentheses next to the disk path: + - In GitLab 13.0 to 14.0, either `(read-only)` if the repository's primary node is outdated + and can't accept writes. Otherwise, `(writable)`. + - In GitLab 14.1 and later, `(unavailable)` is printed next to the disk path if the + repository is unavailable. - The primary field lists the repository's current primary. If the repository has no primary, the field shows `No Primary`. - The In-Sync Storages lists replicas which have replicated the latest successful write and all writes @@ -1325,44 +1461,51 @@ in the output. This information is printed for each repository: is listed next to replica. 
It's important to notice that the outdated replicas may be fully up to date or contain later changes but Praefect can't guarantee it. -Whether a replica is assigned to host the repository is listed with each replica's status. `assigned host` is printed -next to replicas which are assigned to store the repository. The text is omitted if the replica contains a copy of -the repository but is not assigned to store the repository. Such replicas aren't kept in-sync by Praefect, but may -act as replication sources to bring assigned replicas up to date. +Additional information includes: + +- Whether a node is assigned to host the repository is listed with each node's status. + `assigned host` is printed next to nodes that are assigned to store the repository. The + text is omitted if the node contains a copy of the repository but is not assigned to store + the repository. Such copies aren't kept in sync by Praefect, but may act as replication + sources to bring assigned copies up to date. +- In GitLab 14.1 and later, `unhealthy` is printed next to the copies that are located + on unhealthy Gitaly nodes. Example output: ```shell Virtual storage: default Outdated repositories: - @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git (read-only): + @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git (unavailable): Primary: gitaly-1 In-Sync Storages: - gitaly-2, assigned host + gitaly-2, assigned host, unhealthy Outdated Storages: gitaly-1 is behind by 3 changes or less, assigned host gitaly-3 is behind by 3 changes or less ``` -A confirmation is printed out when every repository is writable. For example: +A confirmation is printed out when every repository is available. For example: ```shell Virtual storage: default - All repositories are writable! + All repositories are available! 
``` -#### Outdated replicas of writable repositories +#### Unavailable replicas of available repositories -> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/3019) in GitLab 13.3. +NOTE: +In GitLab 14.0 and earlier, the flag is `-partially-replicated` and the output shows any repositories with assigned nodes with outdated +copies. -To also list information of repositories whose primary is up to date but one or more assigned -replicas are outdated, use the `-partially-replicated` flag. +To also list information of repositories which are available but are unavailable from some of the assigned nodes, +use the `-partially-unavailable` flag. -A repository is writable if the primary has the latest changes. Secondaries might be temporarily -outdated while they are waiting to replicate the latest changes. +A repository is available if there is a healthy, up to date replica available. Some of the assigned secondary +replicas may be temporarily unavailable for access while they are waiting to replicate the latest changes. ```shell -sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>] [-partially-replicated] +sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml dataloss [-virtual-storage <virtual-storage>] [-partially-unavailable] ``` Example output: @@ -1370,7 +1513,7 @@ Example output: ```shell Virtual storage: default Outdated repositories: - @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git (writable): + @hashed/3f/db/3fdba35f04dc8c462986c992bcf875546257113072a909c162f7e470e581e278.git: Primary: gitaly-1 In-Sync Storages: gitaly-1, assigned host @@ -1379,14 +1522,14 @@ Virtual storage: default gitaly-3 is behind by 3 changes or less ``` -With the `-partially-replicated` flag set, a confirmation is printed out if every assigned replica is fully up to -date. 
+With the `-partially-unavailable` flag set, a confirmation is printed out if every assigned replica is fully up to +date and healthy. For example: ```shell Virtual storage: default - All repositories are up to date! + All repositories are fully available on all assigned storages! ``` ### Check repository checksums @@ -1394,30 +1537,50 @@ Virtual storage: default To check a project's repository checksums across on all Gitaly nodes, run the [replicas Rake task](../raketasks/praefect.md#replica-checksums) on the main GitLab node. +### Accept data loss + +WARNING: +`accept-dataloss` causes permanent data loss by overwriting other versions of the repository. Data +[recovery efforts](#data-recovery) must be performed before using it. + +If it is not possible to bring one of the up to date replicas back online, you may have to accept data +loss. When accepting data loss, Praefect marks the chosen replica of the repository as the latest version +and replicates it to the other assigned Gitaly nodes. This process overwrites any other version of the +repository so care must be taken. + +```shell +sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml accept-dataloss +-virtual-storage <virtual-storage> -repository <relative-path> -authoritative-storage <storage-name> +``` + ### Enable writes or accept data loss -Praefect provides the following sub-commands to re-enable writes: +WARNING: +`accept-dataloss` causes permanent data loss by overwriting other versions of the repository. +Data [recovery efforts](#data-recovery) must be performed before using it. -- In GitLab 13.2 and earlier, `enable-writes` to re-enable virtual storage for writes after data - recovery attempts. 
+Praefect provides the following subcommands to re-enable writes or accept data loss: - ```shell - sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml enable-writes -virtual-storage <virtual-storage> - ``` +- In GitLab 13.2 and earlier, `enable-writes` to re-enable virtual storage for writes after + data recovery attempts: -- [In GitLab 13.3](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2415) and later, - `accept-dataloss` to accept data loss and re-enable writes for repositories after data recovery - attempts have failed. Accepting data loss causes current version of the repository on the - authoritative storage to be considered latest. Other storages are brought up to date with the - authoritative storage by scheduling replication jobs. + ```shell + sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml enable-writes -virtual-storage <virtual-storage> + ``` + +- In GitLab 13.3 and later, if it is not possible to bring one of the up to date nodes back + online, you may have to accept data loss: ```shell sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml accept-dataloss -virtual-storage <virtual-storage> -repository <relative-path> -authoritative-storage <storage-name> ``` -WARNING: -`accept-dataloss` causes permanent data loss by overwriting other versions of the repository. Data -[recovery efforts](#data-recovery) must be performed before using it. + When accepting data loss, Praefect: + + 1. Marks the chosen copy of the repository as the latest version. + 1. Replicates the copy to the other assigned Gitaly nodes. + + This process overwrites any other copy of the repository so care must be taken. ## Data recovery @@ -1463,10 +1626,7 @@ praefect['reconciliation_scheduling_interval'] = '0' # disable the feature ### Manual reconciliation WARNING: -The `reconcile` sub-command is deprecated and scheduled for removal in GitLab 14.0. 
Use -[automatic reconciliation](#automatic-reconciliation) instead. Manual reconciliation may -produce excess replication jobs and is limited in functionality. Manual reconciliation does -not work when [repository-specific primary nodes](#repository-specific-primary-nodes) are +The `reconcile` sub-command was removed in GitLab 14.1. Use [automatic reconciliation](#automatic-reconciliation) instead. Manual reconciliation may produce excess replication jobs and is limited in functionality. Manual reconciliation does not work when [repository-specific primary nodes](#repository-specific-primary-nodes) are enabled. The Praefect `reconcile` sub-command allows for the manual reconciliation between two Gitaly nodes. The @@ -1509,7 +1669,7 @@ After creating and configuring Gitaly Cluster: 1. Ensure all storages are accessible to the GitLab instance. In this example, these are `<original_storage_name>` and `<cluster_storage_name>`. 1. [Configure repository storage weights](../repository_storage_paths.md#configure-where-new-repositories-are-stored) - so that the Gitaly Cluster receives all new projects. This stops new projects being created + so that the Gitaly Cluster receives all new projects. This stops new projects from being created on existing Gitaly nodes while the migration is in progress. 1. Schedule repository moves for: - [Projects](#bulk-schedule-project-moves). 
diff --git a/doc/administration/gitaly/troubleshooting.md b/doc/administration/gitaly/troubleshooting.md new file mode 100644 index 00000000000..ab6f493cf0f --- /dev/null +++ b/doc/administration/gitaly/troubleshooting.md @@ -0,0 +1,372 @@ +--- +stage: Create +group: Gitaly +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +type: reference +--- + +# Troubleshooting Gitaly and Gitaly Cluster **(FREE SELF)** + +Refer to the information below when troubleshooting Gitaly and Gitaly Cluster. + +Before troubleshooting, see the Gitaly and Gitaly Cluster +[frequently asked questions](faq.md). + +## Troubleshoot Gitaly + +The following sections provide possible solutions to Gitaly errors. + +See also [Gitaly timeout](../../user/admin_area/settings/gitaly_timeouts.md) settings. + +### Check versions when using standalone Gitaly servers + +When using standalone Gitaly servers, you must make sure they are the same version +as GitLab to ensure full compatibility: + +1. On the top bar, select **Menu >** **{admin}** **Admin** on your GitLab instance. +1. On the left sidebar, select **Overview > Gitaly Servers**. +1. Confirm all Gitaly servers indicate that they are up to date. + +### Use `gitaly-debug` + +The `gitaly-debug` command provides "production debugging" tools for Gitaly and Git +performance. It is intended to help production engineers and support +engineers investigate Gitaly performance problems. + +If you're using GitLab 11.6 or newer, this tool should be installed on +your GitLab or Gitaly server already at `/opt/gitlab/embedded/bin/gitaly-debug`. 
+If you're investigating an older GitLab version you can compile this +tool offline and copy the executable to your server: + +```shell +git clone https://gitlab.com/gitlab-org/gitaly.git +cd gitaly/cmd/gitaly-debug +GOOS=linux GOARCH=amd64 go build -o gitaly-debug +``` + +To see the help page of `gitaly-debug` for a list of supported sub-commands, run: + +```shell +gitaly-debug -h +``` + +### Commits, pushes, and clones return a 401 + +```plaintext +remote: GitLab: 401 Unauthorized +``` + +You need to sync your `gitlab-secrets.json` file with your GitLab +application nodes. + +### Client side gRPC logs + +Gitaly uses the [gRPC](https://grpc.io/) RPC framework. The Ruby gRPC +client has its own log file which may contain useful information when +you are seeing Gitaly errors. You can control the log level of the +gRPC client with the `GRPC_LOG_LEVEL` environment variable. The +default level is `WARN`. + +You can run a gRPC trace with: + +```shell +sudo GRPC_TRACE=all GRPC_VERBOSITY=DEBUG gitlab-rake gitlab:gitaly:check +``` + +### Server side gRPC logs + +gRPC tracing can also be enabled in Gitaly itself with the `GODEBUG=http2debug` +environment variable. To set this in an Omnibus GitLab install: + +1. Add the following to your `gitlab.rb` file: + + ```ruby + gitaly['env'] = { + "GODEBUG=http2debug" => "2" + } + ``` + +1. [Reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) GitLab. + +### Correlating Git processes with RPCs + +Sometimes you need to find out which Gitaly RPC created a particular Git process. + +One method for doing this is by using `DEBUG` logging. However, this needs to be enabled +ahead of time and the logs produced are quite verbose.
+ +A lightweight method for doing this correlation is by inspecting the environment +of the Git process (using its `PID`) and looking at the `CORRELATION_ID` variable: + +```shell +PID=<Git process ID> +sudo cat /proc/$PID/environ | tr '\0' '\n' | grep ^CORRELATION_ID= +``` + +This method isn't reliable for `git cat-file` processes, because Gitaly +internally pools and re-uses those across RPCs. + +### Observing `gitaly-ruby` traffic + +[`gitaly-ruby`](configure_gitaly.md#gitaly-ruby) is an internal implementation detail of Gitaly, +so, there's not that much visibility into what goes on inside +`gitaly-ruby` processes. + +If you have Prometheus set up to scrape your Gitaly process, you can see +request rates and error codes for individual RPCs in `gitaly-ruby` by +querying `grpc_client_handled_total`. + +- In theory, this metric does not differentiate between `gitaly-ruby` and other RPCs. +- In practice from GitLab 11.9, all gRPC calls made by Gitaly itself are internal calls from the + main Gitaly process to one of its `gitaly-ruby` sidecars. + +Assuming your `grpc_client_handled_total` counter only observes Gitaly, +the following query shows you RPCs are (most likely) internally +implemented as calls to `gitaly-ruby`: + +```prometheus +sum(rate(grpc_client_handled_total[5m])) by (grpc_method) > 0 +``` + +### Repository changes fail with a `401 Unauthorized` error + +If you run Gitaly on its own server and notice these conditions: + +- Users can successfully clone and fetch repositories by using both SSH and HTTPS. +- Users can't push to repositories, or receive a `401 Unauthorized` message when attempting to + make changes to them in the web UI. + +Gitaly may be failing to authenticate with the Gitaly client because it has the +[wrong secrets file](configure_gitaly.md#configure-gitaly-servers). 
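One quick way to test the wrong-secrets hypothesis is to compare a digest of the secrets file on each host. The following is a minimal sketch, not an official procedure: it assumes an Omnibus installation where the file is at `/etc/gitlab/gitlab-secrets.json`, and `file_digest` is a hypothetical helper name. Run it on the Gitaly server and on the Gitaly client, then compare the printed digests.

```shell
# Sketch only: file_digest is a hypothetical helper, and the path below
# assumes an Omnibus installation.

# Print only the SHA-256 digest of the given file.
file_digest() {
  sha256sum "$1" | cut -d ' ' -f 1
}

secrets_file=/etc/gitlab/gitlab-secrets.json

if [ -r "$secrets_file" ]; then
  # Print the digest next to the host name so outputs are easy to tell apart.
  printf '%s  %s\n' "$(file_digest "$secrets_file")" "$(hostname)"
else
  echo "cannot read $secrets_file (try running as root)" >&2
fi
```

If the digests differ between hosts, the files are not identical, which is consistent with the authentication failure described here.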
+ +Confirm the following are all true: + +- When any user performs a `git push` to any repository on this Gitaly server, it + fails with a `401 Unauthorized` error: + + ```shell + remote: GitLab: 401 Unauthorized + To <REMOTE_URL> + ! [remote rejected] branch-name -> branch-name (pre-receive hook declined) + error: failed to push some refs to '<REMOTE_URL>' + ``` + +- When any user adds or modifies a file from the repository using the GitLab + UI, it immediately fails with a red `401 Unauthorized` banner. +- Creating a new project and [initializing it with a README](../../user/project/working_with_projects.md#blank-projects) + successfully creates the project but doesn't create the README. +- When [tailing the logs](https://docs.gitlab.com/omnibus/settings/logs.html#tail-logs-in-a-console-on-the-server) + on a Gitaly client and reproducing the error, you get `401` errors + when reaching the [`/api/v4/internal/allowed`](../../development/internal_api.md) endpoint: + + ```shell + # api_json.log + { + "time": "2019-07-18T00:30:14.967Z", + "severity": "INFO", + "duration": 0.57, + "db": 0, + "view": 0.57, + "status": 401, + "method": "POST", + "path": "\/api\/v4\/internal\/allowed", + "params": [ + { + "key": "action", + "value": "git-receive-pack" + }, + { + "key": "changes", + "value": "REDACTED" + }, + { + "key": "gl_repository", + "value": "REDACTED" + }, + { + "key": "project", + "value": "\/path\/to\/project.git" + }, + { + "key": "protocol", + "value": "web" + }, + { + "key": "env", + "value": "{\"GIT_ALTERNATE_OBJECT_DIRECTORIES\":[],\"GIT_ALTERNATE_OBJECT_DIRECTORIES_RELATIVE\":[],\"GIT_OBJECT_DIRECTORY\":null,\"GIT_OBJECT_DIRECTORY_RELATIVE\":null}" + }, + { + "key": "user_id", + "value": "2" + }, + { + "key": "secret_token", + "value": "[FILTERED]" + } + ], + "host": "gitlab.example.com", + "ip": "REDACTED", + "ua": "Ruby", + "route": "\/api\/:version\/internal\/allowed", + "queue_duration": 4.24, + "gitaly_calls": 0, + "gitaly_duration": 0, + 
"correlation_id": "XPUZqTukaP3" + } + + # nginx_access.log + [IP] - - [18/Jul/2019:00:30:14 +0000] "POST /api/v4/internal/allowed HTTP/1.1" 401 30 "" "Ruby" + ``` + +To fix this problem, confirm that your [`gitlab-secrets.json` file](configure_gitaly.md#configure-gitaly-servers) +on the Gitaly server matches the one on Gitaly client. If it doesn't match, +update the secrets file on the Gitaly server to match the Gitaly client, then +[reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure). + +### Command line tools cannot connect to Gitaly + +gRPC cannot reach your Gitaly server if: + +- You can't connect to a Gitaly server with command-line tools. +- Certain actions result in a `14: Connect Failed` error message. + +Verify you can reach Gitaly by using TCP: + +```shell +sudo gitlab-rake gitlab:tcp_check[GITALY_SERVER_IP,GITALY_LISTEN_PORT] +``` + +If the TCP connection: + +- Fails, check your network settings and your firewall rules. +- Succeeds, your networking and firewall rules are correct. + +If you use proxy servers in your command line environment such as Bash, these can interfere with +your gRPC traffic. + +If you use Bash or a compatible command line environment, run the following commands to determine +whether you have proxy servers configured: + +```shell +echo $http_proxy +echo $https_proxy +``` + +If either of these variables have a value, your Gitaly CLI connections may be getting routed through +a proxy which cannot connect to Gitaly. + +To remove the proxy setting, run the following commands (depending on which variables had values): + +```shell +unset http_proxy +unset https_proxy +``` + +### Permission denied errors appearing in Gitaly or Praefect logs when accessing repositories + +You might see the following in Gitaly and Praefect logs: + +```shell +{ + ... 
+ "error":"rpc error: code = PermissionDenied desc = permission denied", + "grpc.code":"PermissionDenied", + "grpc.meta.client_name":"gitlab-web", + "grpc.request.fullMethod":"/gitaly.ServerService/ServerInfo", + "level":"warning", + "msg":"finished unary call with code PermissionDenied", + ... +} +``` + +This is a GRPC call +[error response code](https://grpc.github.io/grpc/core/md_doc_statuscodes.html). + +If this error occurs, even though +[the Gitaly auth tokens are set up correctly](#praefect-errors-in-logs), +it's likely that the Gitaly servers are experiencing +[clock drift](https://en.wikipedia.org/wiki/Clock_drift). + +Ensure the Gitaly clients and servers are synchronized, and use an NTP time +server to keep them synchronized. + +### Gitaly not listening on new address after reconfiguring + +When updating the `gitaly['listen_addr']` or `gitaly['prometheus_listen_addr']` values, Gitaly may +continue to listen on the old address after a `sudo gitlab-ctl reconfigure`. + +When this occurs, run `sudo gitlab-ctl restart` to resolve the issue. This should no longer be +necessary because [this issue](https://gitlab.com/gitlab-org/gitaly/-/issues/2521) is resolved. + +### Permission denied errors appearing in Gitaly logs when accessing repositories from a standalone Gitaly node + +If this error occurs even though file permissions are correct, it's likely that the Gitaly node is +experiencing [clock drift](https://en.wikipedia.org/wiki/Clock_drift). + +Please ensure that the GitLab and Gitaly nodes are synchronized and use an NTP time +server to keep them synchronized if possible. + +## Troubleshoot Praefect (Gitaly Cluster) + +The following sections provide possible solutions to Gitaly Cluster errors. + +### Praefect errors in logs + +If you receive an error, check `/var/log/gitlab/gitlab-rails/production.log`. 
+ +Here are common errors and potential causes: + +- 500 response code + - **ActionView::Template::Error (7:permission denied)** + - `praefect['auth_token']` and `gitlab_rails['gitaly_token']` do not match on the GitLab server. + - **Unable to save project. Error: 7:permission denied** + - Secret token in `praefect['storage_nodes']` on GitLab server does not match the + value in `gitaly['auth_token']` on one or more Gitaly servers. +- 503 response code + - **GRPC::Unavailable (14:failed to connect to all addresses)** + - GitLab was unable to reach Praefect. + - **GRPC::Unavailable (14:all SubCons are in TransientFailure...)** + - Praefect cannot reach one or more of its child Gitaly nodes. Try running + the Praefect connection checker to diagnose. + +### Determine primary Gitaly node + +To determine the current primary Gitaly node for a specific Praefect node: + +- Use the `Shard Primary Election` [Grafana chart](praefect.md#grafana) on the + [`Gitlab Omnibus - Praefect` dashboard](https://gitlab.com/gitlab-org/grafana-dashboards/-/blob/master/omnibus/praefect.json). + This is recommended. +- If you do not have Grafana set up, use the following command on each host of each + Praefect node: + + ```shell + curl localhost:9652/metrics | grep gitaly_praefect_primaries + ``` + +### Relation does not exist errors + +By default, Praefect database tables are created automatically by the `gitlab-ctl reconfigure` task. + +However, the Praefect database tables are not created on initial reconfigure and can throw +errors that relations do not exist if either: + +- The `gitlab-ctl reconfigure` command isn't executed. +- There are errors during the execution.
+ +For example: + +- `ERROR: relation "node_status" does not exist at character 13` +- `ERROR: relation "replication_queue_lock" does not exist at character 40` +- This error: + + ```json + {"level":"error","msg":"Error updating node: pq: relation \"node_status\" does not exist","pid":210882,"praefectName":"gitlab1x4m:0.0.0.0:2305","time":"2021-04-01T19:26:19.473Z","virtual_storage":"praefect-cluster-1"} + ``` + +To solve this, the database schema migration can be done using `sql-migrate` sub-command of +the `praefect` command: + +```shell +$ sudo /opt/gitlab/embedded/bin/praefect -config /var/opt/gitlab/praefect/config.toml sql-migrate +praefect sql-migrate: OK (applied 21 migrations) +``` diff --git a/doc/administration/housekeeping.md b/doc/administration/housekeeping.md index a89e8a2bad5..8f5bf2ee013 100644 --- a/doc/administration/housekeeping.md +++ b/doc/administration/housekeeping.md @@ -4,46 +4,68 @@ group: Distribution info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- -# Housekeeping **(FREE)** +# Housekeeping **(FREE SELF)** -GitLab supports and automates housekeeping tasks within your current repository, -such as compressing file revisions and removing unreachable objects. +GitLab supports and automates housekeeping tasks within your current repository such as: + +- Compressing Git objects. +- Removing unreachable objects. ## Configure housekeeping -GitLab automatically runs `git gc` and `git repack` on repositories -after Git pushes. +GitLab automatically runs `git gc` and `git repack` on repositories after Git pushes: + +- [`git gc`](https://git-scm.com/docs/git-gc) runs a number of housekeeping tasks such as: + - Compressing Git objects to reduce disk space and increase performance. + - Removing unreachable objects that may have been created from changes to the repository, like force-overwriting branches. 
+- [`git repack`](https://git-scm.com/docs/git-repack) either: + - Runs an incremental repack, according to a [configured period](#housekeeping-options). This + packs all loose objects into a new packfile and prunes the now-redundant loose objects. + - Runs a full repack, according to a [configured period](#housekeeping-options). This repacks all + packfiles and loose objects into a single new packfile, and deletes the old now-redundant loose + objects and packfiles. It also optionally creates bitmaps for the new packfile. You can change how often this happens or turn it off: 1. On the top bar, select **Menu >** **{admin}** **Admin**. 1. On the left sidebar, select **Settings > Repository**. 1. Expand **Repository maintenance**. -1. Configure the Housekeeping options. +1. In the **Housekeeping** section, configure the [housekeeping options](#housekeeping-options). 1. Select **Save changes**. -For example, in the following scenario a `git repack -d` will be executed: +### Housekeeping options + +The following housekeeping options are available: + +- **Enable automatic repository housekeeping**: Regularly run `git repack` and `git gc`. If you + keep this setting disabled for a long time, Git repository access on your GitLab server becomes + slower and your repositories use more disk space. +- **Enable Git pack file bitmap creation**: Create pack file bitmaps which accelerates `git clone` + performance. Makes housekeeping take a little longer. +- **Incremental repack period**: Number of Git pushes after which an incremental `git repack` is + run. +- **Full repack period**: Number of Git pushes after which a full `git repack` is run. +- **Git GC period**: Number of Git pushes after which `git gc` is run. + +As an example, see the following scenario: -- Project: pushes since GC counter (`pushes_since_gc`) = `10` -- Git GC period = `200` -- Full repack period = `50` +- Incremental repack period: 10. +- Full repack period: 50. +- Git GC period: 200. 
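The scenario above can be sketched as a small shell function that maps a push count to the housekeeping task that would run. This is an illustration only: the modulo checks, their precedence (`git gc` first, then full repack, then incremental repack), and the function name `housekeeping_task` are assumptions made for the sketch, not a statement of how GitLab schedules the work internally.

```shell
# Sketch only: the modulo checks and their precedence are assumptions for
# illustration. The periods are the values from the scenario.
INCREMENTAL_REPACK_PERIOD=10
FULL_REPACK_PERIOD=50
GIT_GC_PERIOD=200

# Print the task that would run after the given number of pushes.
housekeeping_task() {
  count=$1
  if [ $((count % GIT_GC_PERIOD)) -eq 0 ]; then
    echo "gc"
  elif [ $((count % FULL_REPACK_PERIOD)) -eq 0 ]; then
    echo "full_repack"
  elif [ $((count % INCREMENTAL_REPACK_PERIOD)) -eq 0 ]; then
    echo "incremental_repack"
  else
    echo "none"
  fi
}

# Walk through a few push counts from the scenario.
for n in 7 10 50 200; do
  printf '%s pushes: %s\n' "$n" "$(housekeeping_task "$n")"
done
```

Under these assumptions, a push count that is a multiple of several periods triggers only the most thorough task, so 200 pushes results in a `git gc` rather than an additional repack.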
-When the `pushes_since_gc` value is 50 a `repack -A -d --pack-kept-objects` runs, similarly when -the `pushes_since_gc` value is 200 a `git gc` runs: +When the: -- `git gc` ([man page](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-gc.html)) runs a number of housekeeping tasks, - such as compressing file revisions (to reduce disk space and increase performance) - and removing unreachable objects which may have been created from prior invocations of - `git add`. -- `git repack` ([man page](https://mirrors.edge.kernel.org/pub/software/scm/git/docs/git-repack.html)) re-organize existing packs into a single, more efficient pack. +- `pushes_since_gc` value is 50, a `repack -A -l -d --pack-kept-objects` runs. +- `pushes_since_gc` value is 200, a `git gc` runs. Housekeeping also [removes unreferenced LFS files](../raketasks/cleanup.md#remove-unreferenced-lfs-files) -from your project on the same schedule as the `git gc` operation, freeing up storage space for your project. +from your project on the same schedule as the `git gc` operation, freeing up storage space for your +project. ## How housekeeping handles pool repositories -Housekeeping for pool repositories is handled differently from standard repositories. -It is ultimately performed by the Gitaly RPC `FetchIntoObjectPool`. +Housekeeping for pool repositories is handled differently from standard repositories. It is +ultimately performed by the Gitaly RPC `FetchIntoObjectPool`. This is the current call stack by which it is invoked: @@ -54,10 +76,10 @@ This is the current call stack by which it is invoked: 1. `ObjectPoolService#fetch` 1. `Gitaly::FetchIntoObjectPoolRequest` -To manually invoke it from a Rails console, if needed, you can call `project.pool_repository.object_pool.fetch`. -This is a potentially long-running task, though Gitaly times out in about 8 hours. +To manually invoke it from a Rails console if needed, you can call +`project.pool_repository.object_pool.fetch`. 
This is a potentially long-running task, though Gitaly +times out in about 8 hours. WARNING: -Do not run `git prune` or `git gc` in pool repositories! This can -cause data loss in "real" repositories that depend on the pool in -question. +Do not run `git prune` or `git gc` in pool repositories! This can cause data loss in "real" +repositories that depend on the pool in question. diff --git a/doc/administration/img/repository_storages_admin_ui_v13_1.png b/doc/administration/img/repository_storages_admin_ui_v13_1.png Binary files differdeleted file mode 100644 index a2b88d14a36..00000000000 --- a/doc/administration/img/repository_storages_admin_ui_v13_1.png +++ /dev/null diff --git a/doc/administration/incoming_email.md b/doc/administration/incoming_email.md index 56af5f56cfa..c5cabc5794a 100644 --- a/doc/administration/incoming_email.md +++ b/doc/administration/incoming_email.md @@ -6,25 +6,25 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Incoming email **(FREE SELF)** -GitLab has several features based on receiving incoming emails: +GitLab has several features based on receiving incoming email messages: - [Reply by Email](reply_by_email.md): allow GitLab users to comment on issues - and merge requests by replying to notification emails. + and merge requests by replying to notification email. - [New issue by email](../user/project/issues/managing_issues.md#new-issue-via-email): allow GitLab users to create a new issue by sending an email to a user-specific email address. -- [New merge request by email](../user/project/merge_requests/creating_merge_requests.md#new-merge-request-by-email): +- [New merge request by email](../user/project/merge_requests/creating_merge_requests.md#by-sending-an-email): allow GitLab users to create a new merge request by sending an email to a user-specific email address. 
-- [Service Desk](../user/project/service_desk.md): provide e-mail support to +- [Service Desk](../user/project/service_desk.md): provide email support to your customers through GitLab. ## Requirements We recommend using an email address that receives **only** messages that are intended for -the GitLab instance. Any incoming emails not intended for GitLab receive a reject notice. +the GitLab instance. Any incoming email messages not intended for GitLab receive a reject notice. -Handling incoming emails requires an [IMAP](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol)-enabled +Handling incoming email messages requires an [IMAP](https://en.wikipedia.org/wiki/Internet_Message_Access_Protocol)-enabled email account. GitLab requires one of the following three strategies: - Email sub-addressing (recommended) @@ -53,7 +53,7 @@ leaving a catch-all available for other purposes beyond GitLab. ### Catch-all mailbox A [catch-all mailbox](https://en.wikipedia.org/wiki/Catch-all) for a domain -receives all emails addressed to the domain that do not match any addresses that +receives all email messages addressed to the domain that do not match any addresses that exist on the mail server. As of GitLab 11.7, catch-all mailboxes support the same features as @@ -68,7 +68,7 @@ this method only supports replies, and not the other features of [incoming email ## Set it up -If you want to use Gmail / Google Apps for incoming emails, make sure you have +If you want to use Gmail / Google Apps for incoming email, make sure you have [IMAP access enabled](https://support.google.com/mail/answer/7126229) and [allowed less secure apps to access the account](https://support.google.com/accounts/answer/6010255) or [turn-on 2-step validation](https://support.google.com/accounts/answer/185839) @@ -95,7 +95,7 @@ email address to sign up. 
If you also host a public-facing GitLab instance at `hooli.com` and set your incoming email domain to `hooli.com`, an attacker could abuse the "Create new issue by email" or -"[Create new merge request by email](../user/project/merge_requests/creating_merge_requests.md#new-merge-request-by-email)" +"[Create new merge request by email](../user/project/merge_requests/creating_merge_requests.md#by-sending-an-email)" features by using a project's unique address as the email when signing up for Slack. This would send a confirmation email, which would create a new issue or merge request on the project owned by the attacker, allowing them to click the diff --git a/doc/administration/index.md b/doc/administration/index.md index 69e8689c589..74c89b4d5c0 100644 --- a/doc/administration/index.md +++ b/doc/administration/index.md @@ -43,8 +43,8 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Adjust your instance's timezone](timezone.md): Customize the default time zone of GitLab. - [System hooks](../system_hooks/system_hooks.md): Notifications when users, projects and keys are changed. -- [Security](../security/README.md): Learn what you can do to further secure your GitLab instance. -- [Usage statistics, version check, and usage ping](../user/admin_area/settings/usage_statistics.md): Enable or disable information about your instance to be sent to GitLab, Inc. +- [Security](../security/index.md): Learn what you can do to further secure your GitLab instance. +- [Usage statistics, version check, and Service Ping](../user/admin_area/settings/usage_statistics.md): Enable or disable information about your instance to be sent to GitLab, Inc. - [Global user settings](user_settings.md): Configure instance-wide user permissions. - [Polling](polling.md): Configure how often the GitLab UI polls for updates. - [GitLab Pages configuration](pages/index.md): Enable and configure GitLab Pages. 
@@ -122,7 +122,7 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Libravatar](libravatar.md): Use Libravatar instead of Gravatar for user avatars. - [Sign-up restrictions](../user/admin_area/settings/sign_up_restrictions.md): block email addresses of specific domains, or whitelist only specific domains. - [Access restrictions](../user/admin_area/settings/visibility_and_access_controls.md#enabled-git-access-protocols): Define which Git access protocols can be used to talk to GitLab (SSH, HTTP, HTTPS). -- [Authentication and Authorization](auth/README.md): Configure external authentication with LDAP, SAML, CAS, and additional providers. +- [Authentication and Authorization](auth/index.md): Configure external authentication with LDAP, SAML, CAS, and additional providers. - [Sync LDAP](auth/ldap/index.md) - [Kerberos authentication](../integration/kerberos.md) - See also other [authentication](../topics/authentication/index.md#gitlab-administrators) topics (for example, enforcing 2FA). @@ -134,7 +134,7 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Auditor users](auditor_users.md): Users with read-only access to all projects, groups, and other resources on the GitLab instance. - [Incoming email](incoming_email.md): Configure incoming emails to allow users to [reply by email](reply_by_email.md), create [issues by email](../user/project/issues/managing_issues.md#new-issue-via-email) and - [merge requests by email](../user/project/merge_requests/creating_merge_requests.md#new-merge-request-by-email), and to enable [Service Desk](../user/project/service_desk.md). + [merge requests by email](../user/project/merge_requests/creating_merge_requests.md#by-sending-an-email), and to enable [Service Desk](../user/project/service_desk.md). - [Postfix for incoming email](reply_by_email_postfix_setup.md): Set up a basic Postfix mail server with IMAP authentication on Ubuntu for incoming emails. 
@@ -146,7 +146,7 @@ Learn how to install, configure, update, and maintain your GitLab instance. - [Issue closing pattern](issue_closing_pattern.md): Customize how to close an issue from commit messages. - [Gitaly](gitaly/index.md): Configuring Gitaly, the Git repository storage service for GitLab. - [Default labels](../user/admin_area/labels.md): Create labels that are automatically added to every new project. -- [Restrict the use of public or internal projects](../public_access/public_access.md#restricting-the-use-of-public-or-internal-projects): Restrict the use of visibility levels for users when they create a project or a snippet. +- [Restrict the use of public or internal projects](../public_access/public_access.md#restrict-use-of-public-or-internal-projects): Restrict the use of visibility levels for users when they create a project or a snippet. - [Custom project templates](../user/admin_area/custom_project_templates.md): Configure a set of projects to be used as custom templates when creating a new project. ## Package Registry administration @@ -241,7 +241,7 @@ who are aware of the risks. 
- [GitLab Rails console commands](troubleshooting/gitlab_rails_cheat_sheet.md) (for Support Engineers) - [Troubleshooting SSL](troubleshooting/ssl.md) - Related links: - - [GitLab Developer Documentation](../development/README.md) + - [GitLab Developer Documentation](../development/index.md) - [Repairing and recovering broken Git repositories](https://git.seveas.net/repairing-and-recovering-broken-git-repositories.html) - [Testing with OpenSSL](https://www.feistyduck.com/library/openssl-cookbook/online/ch-testing-with-openssl.html) - [`strace` zine](https://wizardzines.com/zines/strace/) diff --git a/doc/administration/instance_limits.md b/doc/administration/instance_limits.md index 9423045e3b5..5e0d87cd7b6 100644 --- a/doc/administration/instance_limits.md +++ b/doc/administration/instance_limits.md @@ -115,10 +115,7 @@ Limit the maximum daily member invitations allowed per group hierarchy. ### Webhook rate limit > - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/61151) in GitLab 13.12. -> - [Deployed behind a feature flag](../user/feature_flags.md), disabled by default. -> - Disabled on GitLab.com. -> - Not recommended for production use. -> - To use in GitLab self-managed instances, ask a GitLab administrator to [enable it](#enable-or-disable-rate-limiting-for-webhooks). **(FREE SELF)** +> - [Feature flag removed](https://gitlab.com/gitlab-org/gitlab/-/issues/330133) in GitLab 14.1. Limit the number of times any given webhook can be called per minute. This only applies to project and group webhooks. @@ -136,25 +133,6 @@ Set the limit to `0` to disable it. - **Default rate limit**: Disabled. -#### Enable or disable rate limiting for webhooks **(FREE SELF)** - -Rate limiting for webhooks is under development and not ready for production use. It is -deployed behind a feature flag that is **disabled by default**. -[GitLab administrators with access to the GitLab Rails console](../administration/feature_flags.md) -can enable it. 
- -To enable it: - -```ruby -Feature.enable(:web_hooks_rate_limit) -``` - -To disable it: - -```ruby -Feature.disable(:web_hooks_rate_limit) -``` - ## Gitaly concurrency limit Clone traffic can put a large strain on your Gitaly service. To prevent such workloads from overwhelming your Gitaly server, you can set concurrency limits in Gitaly's configuration file. @@ -169,7 +147,7 @@ Read more about [Gitaly concurrency limits](gitaly/configure_gitaly.md#limit-rpc There's a limit to the number of comments that can be submitted on an issue, merge request, or commit. When the limit is reached, system notes can still be -added so that the history of events is not lost, but the user-submitted +added so that the history of events is not lost, but the user-submitted comment fails. - **Max limit**: 5,000 comments. @@ -214,7 +192,7 @@ The number of pipelines that can be created in a single push is 4. This is to prevent the accidental creation of pipelines when `git push --all` or `git push --mirror` is used. -Read more in the [CI documentation](../ci/yaml/README.md#processing-git-pushes). +Read more in the [CI documentation](../ci/yaml/index.md#processing-git-pushes). ## Retention of activity history @@ -286,7 +264,7 @@ and to limit memory consumption. When using offset-based pagination in the REST API, there is a limit to the maximum requested offset into the set of results. This limit is only applied to endpoints that support keyset-based pagination. More information about pagination options can be -found in the [API docs section on pagination](../api/README.md#pagination). +found in the [API docs section on pagination](../api/index.md#pagination). 
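As a rough illustration of the offset limit described above, the following sketch computes the offset implied by `page` and `per_page` and compares it with a configurable maximum. The method names are hypothetical, and the value of 50,000 in the usage lines is an assumption chosen for the example rather than a figure taken from this diff. GitLab's own check may differ in detail.

```ruby
# Hypothetical helper, not part of GitLab: the offset that an
# offset-paginated request asks for, given a 1-based page number.
def requested_offset(page:, per_page:)
  raise ArgumentError, 'page must be >= 1' if page < 1
  raise ArgumentError, 'per_page must be >= 1' if per_page < 1

  (page - 1) * per_page
end

# True when the requested offset is beyond the configured maximum.
# Treating a limit of 0 as "no limit" is an assumption of this sketch.
def exceeds_offset_limit?(page:, per_page:, max_offset:)
  return false if max_offset.zero?

  requested_offset(page: page, per_page: per_page) > max_offset
end

# Illustrative values only.
puts exceeds_offset_limit?(page: 10, per_page: 100, max_offset: 50_000)
puts exceeds_offset_limit?(page: 502, per_page: 100, max_offset: 50_000)
```

A client that hits the limit on such an endpoint would typically switch to keyset-based pagination instead of requesting deeper pages.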
To set this limit for a self-managed installation, run the following in the [GitLab Rails console](operations/rails_console.md#starting-a-rails-console-session): @@ -429,7 +407,7 @@ Plan.default.actual_limits.update!(ci_instance_level_variables: 30) > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37226) in GitLab 13.3. -Job artifacts defined with [`artifacts:reports`](../ci/yaml/README.md#artifactsreports) +Job artifacts defined with [`artifacts:reports`](../ci/yaml/index.md#artifactsreports) that are uploaded by the runner are rejected if the file size exceeds the maximum file size limit. The limit is determined by comparing the project's [maximum artifact size setting](../user/admin_area/settings/continuous_integration.md#maximum-artifacts-size) @@ -443,33 +421,34 @@ setting is used: | Artifact limit name | Default value | |---------------------------------------------|---------------| -| `ci_max_artifact_size_accessibility` | 0 | -| `ci_max_artifact_size_api_fuzzing` | 0 | -| `ci_max_artifact_size_archive` | 0 | -| `ci_max_artifact_size_browser_performance` | 0 | -| `ci_max_artifact_size_cluster_applications` | 0 | -| `ci_max_artifact_size_cobertura` | 0 | -| `ci_max_artifact_size_codequality` | 0 | -| `ci_max_artifact_size_container_scanning` | 0 | -| `ci_max_artifact_size_coverage_fuzzing` | 0 | -| `ci_max_artifact_size_dast` | 0 | -| `ci_max_artifact_size_dependency_scanning` | 0 | -| `ci_max_artifact_size_dotenv` | 0 | -| `ci_max_artifact_size_junit` | 0 | -| `ci_max_artifact_size_license_management` | 0 | -| `ci_max_artifact_size_license_scanning` | 0 | -| `ci_max_artifact_size_load_performance` | 0 | -| `ci_max_artifact_size_lsif` | 100 MB ([Introduced at 20 MB](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37226) in GitLab 13.3 and [raised to 100 MB](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/46980) in GitLab 13.6.) 
| -| `ci_max_artifact_size_metadata` | 0 | -| `ci_max_artifact_size_metrics_referee` | 0 | -| `ci_max_artifact_size_metrics` | 0 | -| `ci_max_artifact_size_network_referee` | 0 | -| `ci_max_artifact_size_performance` | 0 | -| `ci_max_artifact_size_requirements` | 0 | -| `ci_max_artifact_size_sast` | 0 | -| `ci_max_artifact_size_secret_detection` | 0 | -| `ci_max_artifact_size_terraform` | 5 MB ([introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37018) in GitLab 13.3) | -| `ci_max_artifact_size_trace` | 0 | +| `ci_max_artifact_size_accessibility` | 0 | +| `ci_max_artifact_size_api_fuzzing` | 0 | +| `ci_max_artifact_size_archive` | 0 | +| `ci_max_artifact_size_browser_performance` | 0 | +| `ci_max_artifact_size_cluster_applications` | 0 | +| `ci_max_artifact_size_cluster_image_scanning` | 0 | +| `ci_max_artifact_size_cobertura` | 0 | +| `ci_max_artifact_size_codequality` | 0 | +| `ci_max_artifact_size_container_scanning` | 0 | +| `ci_max_artifact_size_coverage_fuzzing` | 0 | +| `ci_max_artifact_size_dast` | 0 | +| `ci_max_artifact_size_dependency_scanning` | 0 | +| `ci_max_artifact_size_dotenv` | 0 | +| `ci_max_artifact_size_junit` | 0 | +| `ci_max_artifact_size_license_management` | 0 | +| `ci_max_artifact_size_license_scanning` | 0 | +| `ci_max_artifact_size_load_performance` | 0 | +| `ci_max_artifact_size_lsif` | 100 MB ([Introduced at 20 MB](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37226) in GitLab 13.3 and [raised to 100 MB](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/46980) in GitLab 13.6.) 
| +| `ci_max_artifact_size_metadata` | 0 | +| `ci_max_artifact_size_metrics_referee` | 0 | +| `ci_max_artifact_size_metrics` | 0 | +| `ci_max_artifact_size_network_referee` | 0 | +| `ci_max_artifact_size_performance` | 0 | +| `ci_max_artifact_size_requirements` | 0 | +| `ci_max_artifact_size_sast` | 0 | +| `ci_max_artifact_size_secret_detection` | 0 | +| `ci_max_artifact_size_terraform` | 5 MB ([introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37018) in GitLab 13.3) | +| `ci_max_artifact_size_trace` | 0 | For example, to set the `ci_max_artifact_size_junit` limit to 10 MB on a self-managed installation, run the following in the [GitLab Rails console](operations/rails_console.md#starting-a-rails-console-session): @@ -503,6 +482,46 @@ A runner's registration fails if it exceeds the limit for the scope determined b Plan.default.actual_limits.update!(ci_registered_project_runners: 100) ``` +### Maximum file size for job logs + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/276192) in GitLab 14.1. +> - [Deployed behind a feature flag](../user/feature_flags.md), disabled by default. +> - Disabled on GitLab.com. +> - Not recommended for production use. +> - To use in GitLab self-managed instances, ask a GitLab administrator to [enable it](#enable-or-disable-job-log-limits). **(FREE SELF)** + +This in-development feature might not be available for your use. There can be +[risks when enabling features still in development](../user/feature_flags.md#risks-when-enabling-features-still-in-development). +Refer to this feature's version history for more details. + +The job log file size limit is 100 megabytes by default. Any job that exceeds this value is dropped. + +You can change the limit in the [GitLab Rails console](operations/rails_console.md#starting-a-rails-console-session). 
+Update `ci_jobs_trace_size_limit` with the new value in megabytes: + +```ruby +Plan.default.actual_limits.update!(ci_jobs_trace_size_limit: 125) +``` + +#### Enable or disable job log limits **(FREE SELF)** + +This feature is under development and not ready for production use. It is +deployed behind a feature flag that is **disabled by default**. +[GitLab administrators with access to the GitLab Rails console](feature_flags.md) +can enable it. + +To enable it: + +```ruby +Feature.enable(:ci_jobs_trace_size_limit) +``` + +To disable it: + +```ruby +Feature.disable(:ci_jobs_trace_size_limit) +``` + ## Instance monitoring and metrics ### Incident Management inbound alert limits @@ -597,7 +616,7 @@ prevent any more changes from rendering. For more information about these limits Reports that go over the 20 MB limit won't be loaded. Affected reports: - [Merge request security reports](../user/project/merge_requests/testing_and_reports_in_merge_requests.md#security-reports) -- [CI/CD parameter `artifacts:expose_as`](../ci/yaml/README.md#artifactsexpose_as) +- [CI/CD parameter `artifacts:expose_as`](../ci/yaml/index.md#artifactsexpose_as) - [Unit test reports](../ci/unit_test_reports.md) ## Advanced Search limits @@ -607,7 +626,7 @@ Reports that go over the 20 MB limit won't be loaded. Affected reports: > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/8638) in GitLab 13.3. You can set a limit on the content of repository files that are indexed in -Elasticsearch. Any files larger than this limit is neither indexed +Elasticsearch. Any files larger than this limit is neither indexed nor searchable. Setting a limit helps reduce the memory usage of the indexing processes and diff --git a/doc/administration/job_artifacts.md b/doc/administration/job_artifacts.md index 99eb1395503..3b1d253b4b6 100644 --- a/doc/administration/job_artifacts.md +++ b/doc/administration/job_artifacts.md @@ -42,10 +42,10 @@ To disable artifacts site-wide, follow the steps below. 
GitLab Runner can upload an archive containing the job artifacts to GitLab. By default, this is done when the job succeeds, but can also be done on failure, or always, via the -[`artifacts:when`](../ci/yaml/README.md#artifactswhen) parameter. +[`artifacts:when`](../ci/yaml/index.md#artifactswhen) parameter. Most artifacts are compressed by GitLab Runner before being sent to the coordinator. The exception to this is -[reports artifacts](../ci/yaml/README.md#artifactsreports), which are compressed after uploading. +[reports artifacts](../ci/yaml/index.md#artifactsreports), which are compressed after uploading. ### Using local storage @@ -326,7 +326,7 @@ To migrate back to local storage: ## Expiring artifacts -If [`artifacts:expire_in`](../ci/yaml/README.md#artifactsexpire_in) is used to set +If [`artifacts:expire_in`](../ci/yaml/index.md#artifactsexpire_in) is used to set an expiry for the artifacts, they are marked for deletion right after that date passes. Otherwise, they expire per the [default artifacts expiration setting](../user/admin_area/settings/continuous_integration.md). diff --git a/doc/administration/job_logs.md b/doc/administration/job_logs.md index 510da68442c..87dd365769f 100644 --- a/doc/administration/job_logs.md +++ b/doc/administration/job_logs.md @@ -108,7 +108,7 @@ See "Phase 4: uploading" in [Data flow](#data-flow) to learn about the process. If you want to avoid any local disk usage for job logs, you can do so using one of the following options: -- Enable the [beta incremental logging](#incremental-logging-architecture) feature. +- Enable the [incremental logging](#incremental-logging-architecture) feature. - Set the [job logs location](#changing-the-job-logs-local-location) to an NFS drive. @@ -140,17 +140,17 @@ For more information, see [delete references to missing artifacts](raketasks/che > - [Recommended for production use with AWS S3](https://gitlab.com/gitlab-org/gitlab/-/issues/273498) in GitLab 13.7. 
> - To use in GitLab self-managed instances, ask a GitLab administrator to [enable it](#enable-or-disable-incremental-logging). **(FREE SELF)** -Job logs are sent from the GitLab Runner in chunks and cached temporarily on disk +By default job logs are sent from the GitLab Runner in chunks and cached temporarily on disk in `/var/opt/gitlab/gitlab-ci/builds` by Omnibus GitLab. After the job completes, a background job archives the job log. The log is moved to `/var/opt/gitlab/gitlab-rails/shared/artifacts/` by default, or to object storage if configured. -In a [scaled-out architecture](reference_architectures/index.md) with Rails and Sidekiq running on more than one +In a [scaled-out architecture](reference_architectures/index.md) with Rails and Sidekiq running on more than one server, these two locations on the filesystem have to be shared using NFS. To eliminate both filesystem requirements: -- Enable the incremental logging feature, which uses Redis instead of disk space for temporary caching of job logs. +- [Enable the incremental logging feature](#enable-or-disable-incremental-logging), which uses Redis instead of disk space for temporary caching of job logs. - Configure [object storage](job_artifacts.md#object-storage-settings) for storing archived job logs. ### Technical details @@ -162,7 +162,7 @@ file storage. Redis is used as first-class storage, and it stores up-to 128KB of data. After the full chunk is sent, it is flushed to a persistent store, either object storage (temporary directory) or database. After a while, the data in Redis and a persistent store is archived to [object storage](#uploading-logs-to-object-storage). -The data are stored in the following Redis namespace: `Gitlab::Redis::SharedState`. +The data are stored in the following Redis namespace: `Gitlab::Redis::TraceChunks`. 
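The chunking behavior described above can be pictured with a small simulation. This is a minimal sketch, not GitLab code: the class and method names are hypothetical, and plain Ruby objects stand in for Redis and for the persistent store. It only shows the idea that log output accumulates in a live chunk of up to 128 KB and that each full chunk is flushed.

```ruby
# Illustrative simulation only. Names are hypothetical, not GitLab's.
class ChunkedLogBuffer
  CHUNK_SIZE = 128 * 1024 # 128 KB, the chunk size mentioned in the text

  attr_reader :live_chunk, :persisted_chunks

  def initialize
    @live_chunk = String.new # stands in for the chunk held in Redis
    @persisted_chunks = []   # stands in for object storage or the database
  end

  # Append log output, flushing each time the live chunk becomes full.
  def append(data)
    data = data.b # work in bytes so multi-byte characters count correctly
    until data.empty?
      room = CHUNK_SIZE - @live_chunk.bytesize
      @live_chunk << data.byteslice(0, room)
      data = data.byteslice(room, data.bytesize) || String.new
      flush if @live_chunk.bytesize == CHUNK_SIZE
    end
  end

  private

  # Move the full chunk to the persistent store and start a new live chunk.
  def flush
    @persisted_chunks << @live_chunk
    @live_chunk = String.new
  end
end

buffer = ChunkedLogBuffer.new
buffer.append('x' * (300 * 1024)) # 300 KB of log output
puts buffer.persisted_chunks.size # number of full chunks flushed so far
puts buffer.live_chunk.bytesize   # bytes still held in the live chunk
```

Keeping only the current chunk in fast storage is what lets this design avoid local disk for temporary caching. The final archive step to object storage is left out of the sketch.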
Here is the detailed data flow: @@ -185,7 +185,7 @@ Here is the detailed data flow: ### Enable or disable incremental logging **(FREE SELF)** -Incremental logging is under development, but ready for production use. It is +Incremental logging is under development, but [ready for production use as of GitLab 13.6](https://gitlab.com/groups/gitlab-org/-/epics/4275). It is deployed behind a feature flag that is **disabled by default**. [GitLab administrators with access to the GitLab Rails console](feature_flags.md) can enable it. diff --git a/doc/administration/lfs/index.md b/doc/administration/lfs/index.md index 862c26abac8..edf0e324a5c 100644 --- a/doc/administration/lfs/index.md +++ b/doc/administration/lfs/index.md @@ -243,7 +243,52 @@ You can see the total storage used for LFS objects on groups and projects: - In the administration area. - In the [groups](../../api/groups.md) and [projects APIs](../../api/projects.md). -## Troubleshooting: `Google::Apis::TransmissionError: execution expired` +## Troubleshooting + +### Missing LFS objects + +An error about a missing LFS object may occur in either of these situations: + +- When migrating LFS objects from disk to object storage, with error messages like: + + ```plaintext + ERROR -- : Failed to transfer LFS object + 006622269c61b41bf14a22bbe0e43be3acf86a4a446afb4250c3794ea47541a7 + with error: No such file or directory @ rb_sysopen - + /var/opt/gitlab/gitlab-rails/shared/lfs-objects/00/66/22269c61b41bf14a22bbe0e43be3acf86a4a446afb4250c3794ea47541a7 + ``` + + (Line breaks have been added for legibility.) + +- When running the + [integrity check for LFS objects](../raketasks/check.md#uploaded-files-integrity) + with the `VERBOSE=1` parameter. + +The database can have records for LFS objects which are not on disk. The database entry may +[prevent a new copy of the object from being pushed](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/49241). +To delete these references: + +1. 
[Start a rails console](../operations/rails_console.md). +1. Query the object that's reported as missing in the rails console, to return a file path: + + ```ruby + lfs_object = LfsObject.find_by(oid: '006622269c61b41bf14a22bbe0e43be3acf86a4a446afb4250c3794ea47541a7') + lfs_object.file.path + ``` + +1. Check on disk or object storage if it exists: + + ```shell + ls -al /var/opt/gitlab/gitlab-rails/shared/lfs-objects/00/66/22269c61b41bf14a22bbe0e43be3acf86a4a446afb4250c3794ea47541a7 + ``` + +1. If the file is not present, remove the database record via the rails console: + + ```ruby + lfs_object.destroy + ``` + +### `Google::Apis::TransmissionError: execution expired` If LFS integration is configured with Google Cloud Storage and background uploads (`background_upload: true` and `direct_upload: false`), Sidekiq workers may encounter this error. This is because the uploading timed out with very large files. @@ -276,10 +321,33 @@ end See more information in [!19581](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/19581) +### LFS commands fail on TLS v1.3 server + +If you configure GitLab to [disable TLS v1.2](https://docs.gitlab.com/omnibus/settings/nginx.md) +and only enable TLS v1.3 connections, LFS operations require a +[Git LFS client](https://git-lfs.github.com) version 2.11.0 or later. 
If you use +a Git LFS client earlier than version 2.11.0, GitLab displays an error: + +```plaintext +batch response: Post https://username:***@gitlab.example.com/tool/releases.git/info/lfs/objects/batch: remote error: tls: protocol version not supported +error: failed to fetch some objects from 'https://username:[MASKED]@gitlab.example.com/tool/releases.git/info/lfs' +``` + +When using GitLab CI over a TLS v1.3 configured GitLab server, you must +[upgrade to GitLab Runner](https://docs.gitlab.com/runner/install/index.md) 13.2.0 +or later to receive an updated Git LFS client version via +the included [GitLab Runner Helper image](https://docs.gitlab.com/runner/configuration/advanced-configuration.html#helper-image). + +To check an installed Git LFS client's version, run this command: + +```shell +git lfs version +``` + ## Known limitations - Support for removing unreferenced LFS objects was added in 8.14 onward. - LFS authentications via SSH was added with GitLab 8.12. - Only compatible with the Git LFS client versions 1.1.0 and later, or 1.0.2. -- The storage statistics count each LFS object multiple times for +- The storage statistics count each LFS object for every project linking to it. diff --git a/doc/administration/logs.md b/doc/administration/logs.md index cf9c2143d8c..b1605604df5 100644 --- a/doc/administration/logs.md +++ b/doc/administration/logs.md @@ -8,8 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w GitLab has an advanced log system where everything is logged, so you can analyze your instance using various system log files. In addition to -system log files, GitLab Enterprise Edition provides Audit Events. -Find more about them [in Audit Events documentation](audit_events.md). +system log files, GitLab Enterprise Edition provides [Audit Events](audit_events.md). System log files are typically plain text in a standard log file format. This guide talks about how to read and use these system log files. 
@@ -30,45 +29,48 @@ The logs for a given service may be managed and rotated by: - `logrotate` and `svlogd` - Or not at all -The table below includes information about what is responsible for managing and rotating logs for +The following table includes information about what's responsible for managing and rotating logs for the included services. Logs [managed by `svlogd`](https://docs.gitlab.com/omnibus/settings/logs.html#runit-logs) are written to a file called `current`. The `logrotate` service built into GitLab [manages all logs](https://docs.gitlab.com/omnibus/settings/logs.html#logrotate) except those captured by `runit`. -| Log Type | Managed by logrotate | Managed by svlogd/runit | -| ----------------------------------------------- | -------------------- | ----------------------- | -| [Alertmanager Logs](#alertmanager-logs) | N | Y | -| [Crond Logs](#crond-logs) | N | Y | -| [Gitaly](#gitaly-logs) | Y | Y | -| [GitLab Exporter For Omnibus](#gitlab-exporter) | N | Y | -| [GitLab Pages Logs](#pages-logs) | Y | Y | -| GitLab Rails | Y | N | -| [GitLab Shell Logs](#gitlab-shelllog) | Y | N | -| [Grafana Logs](#grafana-logs) | N | Y | -| [LogRotate Logs](#logrotate-logs) | N | Y | -| [Mailroom](#mail_room_jsonlog-default) | Y | Y | -| [NGINX](#nginx-logs) | Y | Y | -| [PostgreSQL Logs](#postgresql-logs) | N | Y | -| [Prometheus Logs](#prometheus-logs) | N | Y | -| [Puma](#puma-logs) | Y | Y | -| [Redis Logs](#redis-logs) | N | Y | -| [Registry Logs](#registry-logs) | N | Y | -| [Workhorse Logs](#workhorse-logs) | Y | Y | +| Log type | Managed by logrotate | Managed by svlogd/runit | +|-------------------------------------------------|------------------------|-------------------------| +| [Alertmanager Logs](#alertmanager-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Crond Logs](#crond-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Gitaly](#gitaly-logs) | **{check-circle}** Yes | **{check-circle}** Yes | +| [GitLab Exporter for 
Omnibus](#gitlab-exporter) | **{dotted-circle}** No | **{check-circle}** Yes | +| [GitLab Pages Logs](#pages-logs) | **{check-circle}** Yes | **{check-circle}** Yes | +| GitLab Rails | **{check-circle}** Yes | **{dotted-circle}** No | +| [GitLab Shell Logs](#gitlab-shelllog) | **{check-circle}** Yes | **{dotted-circle}** No | +| [Grafana Logs](#grafana-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [LogRotate Logs](#logrotate-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Mailroom](#mail_room_jsonlog-default) | **{check-circle}** Yes | **{check-circle}** Yes | +| [NGINX](#nginx-logs) | **{check-circle}** Yes | **{check-circle}** Yes | +| [PostgreSQL Logs](#postgresql-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Prometheus Logs](#prometheus-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Puma](#puma-logs) | **{check-circle}** Yes | **{check-circle}** Yes | +| [Redis Logs](#redis-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Registry Logs](#registry-logs) | **{dotted-circle}** No | **{check-circle}** Yes | +| [Workhorse Logs](#workhorse-logs) | **{check-circle}** Yes | **{check-circle}** Yes | ## `production_json.log` -This file lives in `/var/log/gitlab/gitlab-rails/production_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/production_json.log` for -installations from source. When GitLab is running in an environment -other than production, the corresponding log file is shown here. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/production_json.log` +- Installations from source: `/home/git/gitlab/log/production_json.log` + +When GitLab is running in an environment other than production, +the corresponding log file is shown here. It contains a structured log for Rails controller requests received from -GitLab, thanks to [Lograge](https://github.com/roidrage/lograge/). 
Note that -requests from the API are logged to a separate file in `api_json.log`. +GitLab, thanks to [Lograge](https://github.com/roidrage/lograge/). +Requests from the API are logged to a separate file in `api_json.log`. -Each line contains a JSON line that can be ingested by services like Elasticsearch and Splunk. +Each line contains JSON that can be ingested by services like Elasticsearch and Splunk. Line breaks were added to examples for legibility: ```json @@ -103,39 +105,39 @@ This example was a GET request for a specific issue. Each line also contains performance data, with times in seconds: -1. `duration_s`: total time taken to retrieve the request -1. `queue_duration_s`: total time that the request was queued inside GitLab Workhorse -1. `view_duration_s`: total time taken inside the Rails views -1. `db_duration_s`: total time to retrieve data from PostgreSQL -1. `cpu_s`: total time spent on CPU -1. `gitaly_duration_s`: total time taken by Gitaly calls -1. `gitaly_calls`: total number of calls made to Gitaly -1. `redis_calls`: total number of calls made to Redis -1. `redis_duration_s`: total time to retrieve data from Redis -1. `redis_read_bytes`: total bytes read from Redis -1. `redis_write_bytes`: total bytes written to Redis -1. `redis_<instance>_calls`: total number of calls made to a Redis instance -1. `redis_<instance>_duration_s`: total time to retrieve data from a Redis instance -1. `redis_<instance>_read_bytes`: total bytes read from a Redis instance -1. `redis_<instance>_write_bytes`: total bytes written to a Redis instance - -User clone and fetch activity using HTTP transport appears in this log as `action: git_upload_pack`. 
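Because each line of `production_json.log` is a JSON document, the timing fields listed above can be inspected with a few lines of Ruby. The sketch below is an assumption-laden example rather than a GitLab tool: the helper names are hypothetical, the sample lines are invented for illustration, and only field names that appear in the list above are used.

```ruby
require 'json'

# Hypothetical helpers, not part of GitLab.

# Parse one log line and return a Hash, or nil when the line is not a JSON object.
def parse_entry(line)
  entry = JSON.parse(line)
  entry.is_a?(Hash) ? entry : nil
rescue JSON::ParserError
  nil
end

# Return entries whose duration_s is at least threshold_s, slowest first.
def slow_requests(lines, threshold_s:)
  lines.map { |line| parse_entry(line) }
       .compact
       .select { |entry| entry['duration_s'].to_f >= threshold_s }
       .sort_by { |entry| -entry['duration_s'].to_f }
end

# Invented sample lines. Real entries carry many more fields.
sample_lines = [
  '{"method":"GET","path":"/example/a","duration_s":0.12,"db_duration_s":0.03}',
  '{"method":"GET","path":"/example/b","duration_s":1.50,"db_duration_s":0.90}',
  'not json'
]

slow_requests(sample_lines, threshold_s: 1.0).each do |entry|
  puts "#{entry['path']} took #{entry['duration_s']}s (#{entry['db_duration_s']}s in PostgreSQL)"
end
```

To run this against a real instance, the lines could come from `File.foreach` on the log path given earlier for your installation method.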
+- `duration_s`: Total time to retrieve the request +- `queue_duration_s`: Total time the request was queued inside GitLab Workhorse +- `view_duration_s`: Total time inside the Rails views +- `db_duration_s`: Total time to retrieve data from PostgreSQL +- `cpu_s`: Total time spent on CPU +- `gitaly_duration_s`: Total time by Gitaly calls +- `gitaly_calls`: Total number of calls made to Gitaly +- `redis_calls`: Total number of calls made to Redis +- `redis_duration_s`: Total time to retrieve data from Redis +- `redis_read_bytes`: Total bytes read from Redis +- `redis_write_bytes`: Total bytes written to Redis +- `redis_<instance>_calls`: Total number of calls made to a Redis instance +- `redis_<instance>_duration_s`: Total time to retrieve data from a Redis instance +- `redis_<instance>_read_bytes`: Total bytes read from a Redis instance +- `redis_<instance>_write_bytes`: Total bytes written to a Redis instance + +User clone and fetch activity using HTTP transport appears in the log as `action: git_upload_pack`. In addition, the log contains the originating IP address, (`remote_ip`), the user's ID (`user_id`), and username (`username`). -Some endpoints such as `/search` may make requests to Elasticsearch if using +Some endpoints (such as `/search`) may make requests to Elasticsearch if using [Advanced Search](../user/search/advanced_search.md). These additionally log `elasticsearch_calls` and `elasticsearch_call_duration_s`, which correspond to: -1. `elasticsearch_calls`: total number of calls to Elasticsearch -1. `elasticsearch_duration_s`: total time taken by Elasticsearch calls -1. 
`elasticsearch_timed_out_count`: total number of calls to Elasticsearch that +- `elasticsearch_calls`: Total number of calls to Elasticsearch +- `elasticsearch_duration_s`: Total time taken by Elasticsearch calls +- `elasticsearch_timed_out_count`: Total number of calls to Elasticsearch that timed out and therefore returned partial results -ActionCable connection and subscription events are also logged to this file and they follow the same -format above. The `method`, `path`, and `format` fields are not applicable, and are always empty. +ActionCable connection and subscription events are also logged to this file and they follow the +previous format. The `method`, `path`, and `format` fields are not applicable, and are always empty. The ActionCable connection or channel class is used as the `controller`. ```json @@ -206,10 +208,13 @@ Starting with GitLab 12.5, if an error occurs, an ## `production.log` -This file lives in `/var/log/gitlab/gitlab-rails/production.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/production.log` for -installations from source. (When GitLab is running in an environment -other than production, the corresponding log file is shown here.) +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/production.log` +- Installations from source: `/home/git/gitlab/log/production.log` + +When GitLab is running in an environment other than production, +the corresponding log file is shown here. It contains information about all performed requests. You can see the URL and type of request, IP address, and what parts of code were @@ -244,9 +249,10 @@ The request was processed by `Projects::TreeController`. > Introduced in GitLab 10.0. -This file lives in -`/var/log/gitlab/gitlab-rails/api_json.log` for Omnibus GitLab packages, or in -`/home/git/gitlab/log/api_json.log` for installations from source. 
+Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/api_json.log` +- Installations from source: `/home/git/gitlab/log/api_json.log` It helps you see requests made directly to the API. For example: @@ -274,24 +280,25 @@ It helps you see requests made directly to the API. For example: ``` This entry shows an internal endpoint accessed to check whether an -associated SSH key can download the project in question via a `git fetch` or +associated SSH key can download the project in question by using a `git fetch` or `git clone`. In this example, we see: -1. `duration`: total time in milliseconds taken to retrieve the request -1. `queue_duration`: total time in milliseconds that the request was queued inside GitLab Workhorse -1. `method`: The HTTP method used to make the request -1. `path`: The relative path of the query -1. `params`: Key-value pairs passed in a query string or HTTP body. Sensitive parameters (such as passwords and tokens) are filtered out. -1. `ua`: The User-Agent of the requester +- `duration`: Total time in milliseconds to retrieve the request +- `queue_duration`: Total time in milliseconds the request was queued inside GitLab Workhorse +- `method`: The HTTP method used to make the request +- `path`: The relative path of the query +- `params`: Key-value pairs passed in a query string or HTTP body (sensitive parameters, such as passwords and tokens, are filtered out) +- `ua`: The User-Agent of the requester ## `application.log` -This file lives in `/var/log/gitlab/gitlab-rails/application.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/application.log` for -installations from source. 
+Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/application.log` +- Installations from source: `/home/git/gitlab/log/application.log` -It helps you discover events happening in your instance such as user creation, -project removing and so on. For example: +It helps you discover events happening in your instance such as user creation +and project removal. For example: ```plaintext October 06, 2014 11:56: User "Administrator" (admin@example.com) was created @@ -305,11 +312,12 @@ October 07, 2014 11:25: Project "project133" was removed > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/22812) in GitLab 12.7. -This file lives in `/var/log/gitlab/gitlab-rails/application_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/application_json.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/application_json.log` +- Installations from source: `/home/git/gitlab/log/application_json.log` -It contains the JSON version of the logs in `application.log` like the example below: +It contains the JSON version of the logs in `application.log`, like this example: ```json { @@ -328,11 +336,14 @@ It contains the JSON version of the logs in `application.log` like the example b ## `integrations_json.log` -This file lives in `/var/log/gitlab/gitlab-rails/integrations_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/integrations_json.log` for -installations from source. +Depending on your installation method, this file is located at: -It contains information about [integrations](../user/project/integrations/overview.md) activities such as Jira, Asana, and Irker services. 
It uses JSON format like the example below: +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/integrations_json.log` +- Installations from source: `/home/git/gitlab/log/integrations_json.log` + +It contains information about [integration](../user/project/integrations/overview.md) +activities, such as Jira, Asana, and Irker services. It uses JSON format, +like this example: ```json { @@ -360,16 +371,16 @@ It contains information about [integrations](../user/project/integrations/overvi > Introduced in GitLab 11.6. -This file lives in -`/var/log/gitlab/gitlab-rails/kubernetes.log` for Omnibus GitLab -packages or in `/home/git/gitlab/log/kubernetes.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/kubernetes.log` +- Installations from source: `/home/git/gitlab/log/kubernetes.log` -It logs information related to the Kubernetes Integration including errors +It logs information related to the Kubernetes Integration, including errors during installing cluster applications on your managed Kubernetes clusters. -Each line contains a JSON line that can be ingested by services like Elasticsearch and Splunk. +Each line contains JSON that can be ingested by services like Elasticsearch and Splunk. Line breaks have been added to the following example for clarity: ```json @@ -399,9 +410,10 @@ Line breaks have been added to the following example for clarity: ## `git_json.log` -This file lives in `/var/log/gitlab/gitlab-rails/git_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/git_json.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/git_json.log` +- Installations from source: `/home/git/gitlab/log/git_json.log` After GitLab version 12.2, this file was renamed from `githost.log` to `git_json.log` and stored in JSON format. @@ -425,14 +437,15 @@ only. 
For example: NOTE: GitLab Free tracks a small number of different audit events. -[GitLab Premium](https://about.gitlab.com/pricing/) tracks many more. +GitLab Premium tracks many more. -This file lives in `/var/log/gitlab/gitlab-rails/audit_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/audit_json.log` for -installations from source. +Depending on your installation method, this file is located at: -Changes to group or project settings and memberships (`target_details`) are logged to this file. -For example: +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/audit_json.log` +- Installations from source: `/home/git/gitlab/log/audit_json.log` + +Changes to group or project settings and memberships (`target_details`) +are logged to this file. For example: ```json { @@ -454,15 +467,17 @@ For example: ## Sidekiq Logs NOTE: -In Omnibus GitLab `12.10` or earlier, the Sidekiq log lives in `/var/log/gitlab/gitlab-rails/sidekiq.log`. +In Omnibus GitLab `12.10` or earlier, the Sidekiq log is at `/var/log/gitlab/gitlab-rails/sidekiq.log`. -For Omnibus installations, some Sidekiq logs reside in `/var/log/gitlab/sidekiq/current` and as follows. +For Omnibus GitLab installations, some Sidekiq logs are in `/var/log/gitlab/sidekiq/current` +and as follows. ### `sidekiq.log` -This file lives in `/var/log/gitlab/sidekiq/current` for -Omnibus GitLab packages or in `/home/git/gitlab/log/sidekiq.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/sidekiq/current` +- Installations from source: `/home/git/gitlab/log/sidekiq.log` GitLab uses background jobs for processing tasks which can take a long time. All information about processing these jobs are written down to @@ -473,7 +488,7 @@ this file. 
For example: 2014-06-10T18:18:26Z 14299 TID-55uqo INFO: Booting Sidekiq 3.0.0 with redis options {:url=>"redis://localhost:6379/0", :namespace=>"sidekiq"} ``` -Instead of the format above, you can opt to generate JSON logs for +Instead of the previous format, you can opt to generate JSON logs for Sidekiq. For example: ```json @@ -506,7 +521,7 @@ For Omnibus GitLab installations, add the configuration option: sidekiq['log_format'] = 'json' ``` -For source installations, edit the `gitlab.yml` and set the Sidekiq +For installations from source, edit the `gitlab.yml` and set the Sidekiq `log_format` configuration option: ```yaml @@ -519,9 +534,10 @@ For source installations, edit the `gitlab.yml` and set the Sidekiq > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26586) in GitLab 12.9. -This file lives in `/var/log/gitlab/gitlab-rails/sidekiq_client.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/sidekiq_client.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/sidekiq_client.log` +- Installations from source: `/home/git/gitlab/log/sidekiq_client.log` This file contains logging information about jobs before Sidekiq starts processing them, such as before being enqueued. @@ -532,11 +548,15 @@ you've configured this for Sidekiq as mentioned above. ## `gitlab-shell.log` -GitLab Shell is used by GitLab for executing Git commands and provide SSH access to Git repositories. +GitLab Shell is used by GitLab for executing Git commands and provide SSH +access to Git repositories. ### For GitLab versions 12.10 and up -For GitLab version 12.10 and later, there are 2 `gitlab-shell.log` files. Information containing `git-{upload-pack,receive-pack}` requests lives in `/var/log/gitlab/gitlab-shell/gitlab-shell.log`. Information about hooks to GitLab Shell from Gitaly lives in `/var/log/gitlab/gitaly/gitlab-shell.log`. 
+For GitLab version 12.10 and later, there are two `gitlab-shell.log` files. +Information containing `git-{upload-pack,receive-pack}` requests is at +`/var/log/gitlab/gitlab-shell/gitlab-shell.log`. Information about hooks to +GitLab Shell from Gitaly is at `/var/log/gitlab/gitaly/gitlab-shell.log`. Example log entries for `/var/log/gitlab/gitlab-shell/gitlab-shell.log`: @@ -589,7 +609,11 @@ Example log entries for `/var/log/gitlab/gitaly/gitlab-shell.log`: ### For GitLab versions 12.5 through 12.9 -For GitLab 12.5 to 12.9, this file lives in `/var/log/gitlab/gitaly/gitlab-shell.log` for Omnibus GitLab packages or in `/home/git/gitaly/gitlab-shell.log` for installations from source. +For GitLab 12.5 to 12.9, depending on your installation method, this +file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitaly/gitlab-shell.log` +- Installation from source: `/home/git/gitaly/gitlab-shell.log` Example log entries: @@ -608,7 +632,7 @@ Example log entries: ### For GitLab 12.5 and earlier -For GitLab 12.5 and earlier, the file lives in `/var/log/gitlab/gitlab-shell/gitlab-shell.log`. +For GitLab 12.5 and earlier, the file is at `/var/log/gitlab/gitlab-shell/gitlab-shell.log`. Example log entries: @@ -617,51 +641,64 @@ I, [2015-02-13T06:17:00.671315 #9291] INFO -- : Adding project root/example.git I, [2015-02-13T06:17:00.679433 #9291] INFO -- : Moving existing hooks directory and symlinking global hooks directory for /var/opt/gitlab/git-data/repositories/root/example.git. ``` -User clone/fetch activity using SSH transport appears in this log as `executing git command <gitaly-upload-pack...`. +User clone/fetch activity using SSH transport appears in this log as +`executing git command <gitaly-upload-pack...`. ## Gitaly Logs -This file lives in `/var/log/gitlab/gitaly/current` and is produced by [runit](http://smarden.org/runit/). 
`runit` is packaged with Omnibus GitLab and a brief explanation of its purpose is available [in the Omnibus GitLab documentation](https://docs.gitlab.com/omnibus/architecture/#runit). [Log files are rotated](http://smarden.org/runit/svlogd.8.html), renamed in Unix timestamp format, and `gzip`-compressed (like `@1584057562.s`). +This file is in `/var/log/gitlab/gitaly/current` and is produced by [runit](http://smarden.org/runit/). +`runit` is packaged with Omnibus GitLab and a brief explanation of its purpose +is available [in the Omnibus GitLab documentation](https://docs.gitlab.com/omnibus/architecture/#runit). +[Log files are rotated](http://smarden.org/runit/svlogd.8.html), renamed in +Unix timestamp format, and `gzip`-compressed (like `@1584057562.s`). ### `grpc.log` -This file lives in `/var/log/gitlab/gitlab-rails/grpc.log` for Omnibus GitLab packages. Native [gRPC](https://grpc.io/) logging used by Gitaly. +This file is at `/var/log/gitlab/gitlab-rails/grpc.log` for Omnibus GitLab +packages. Native [gRPC](https://grpc.io/) logging used by Gitaly. ### `gitaly_ruby_json.log` > [Introduced](https://gitlab.com/gitlab-org/gitaly/-/merge_requests/2678) in GitLab 13.6. -This file lives in `/var/log/gitlab/gitaly/gitaly_ruby_json.log` and is produced by [`gitaly-ruby`](gitaly/reference.md#gitaly-ruby). It contains an access log of gRPC calls made by Gitaly to `gitaly-ruby`. +This file is at `/var/log/gitlab/gitaly/gitaly_ruby_json.log` and is +produced by [`gitaly-ruby`](gitaly/reference.md#gitaly-ruby). It contains an +access log of gRPC calls made by Gitaly to `gitaly-ruby`. ## Puma Logs ### `puma_stdout.log` -This file lives in `/var/log/gitlab/puma/puma_stdout.log` for -Omnibus GitLab packages, and `/home/git/gitlab/log/puma_stdout.log` for -installations from source. 
+Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/puma/puma_stdout.log` +- Installations from source: `/home/git/gitlab/log/puma_stdout.log` ### `puma_stderr.log` -This file lives in `/var/log/gitlab/puma/puma_stderr.log` for -Omnibus GitLab packages, or in `/home/git/gitlab/log/puma_stderr.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/puma/puma_stderr.log` +- Installations from source: `/home/git/gitlab/log/puma_stderr.log` ## `repocheck.log` -This file lives in `/var/log/gitlab/gitlab-rails/repocheck.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/repocheck.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/repocheck.log` +- Installations from source: `/home/git/gitlab/log/repocheck.log` -It logs information whenever a [repository check is run](repository_checks.md) on a project. +It logs information whenever a [repository check is run](repository_checks.md) +on a project. ## `importer.log` > Introduced in GitLab 11.3. -This file lives in `/var/log/gitlab/gitlab-rails/importer.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/importer.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/importer.log` +- Installations from source: `/home/git/gitlab/log/importer.log` It logs the progress of the import process. @@ -669,9 +706,10 @@ It logs the progress of the import process. > Introduced in GitLab 13.1. -This file lives in `/var/log/gitlab/gitlab-rails/exporter.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/exporter.log` for -installations from source. 
+Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/exporter.log` +- Installations from source: `/home/git/gitlab/log/exporter.log` It logs the progress of the export process. @@ -679,10 +717,10 @@ It logs the progress of the export process. > [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/59587) in GitLab 13.7. -This file's location depends on how you installed GitLab: +Depending on your installation method, this file is located at: -- For Omnibus GitLab packages: `/var/log/gitlab/gitlab-rails/features_json.log` -- For installations from source: `/home/git/gitlab/log/features_json.log` +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/features_json.log` +- Installations from source: `/home/git/gitlab/log/features_json.log` The modification events from [Feature flags in development of GitLab](../development/feature_flags/index.md) are recorded in this file. For example: @@ -704,27 +742,29 @@ are recorded in this file. For example: > Introduced in GitLab 12.0. -This file lives in `/var/log/gitlab/gitlab-rails/auth.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/auth.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/auth.log` +- Installations from source: `/home/git/gitlab/log/auth.log` This log records: - Information whenever [Rack Attack](../security/rack_attack.md) registers an abusive request. - Requests over the [Rate Limit](../user/admin_area/settings/rate_limits_on_raw_endpoints.md) on raw endpoints. - [Protected paths](../user/admin_area/settings/protected_paths.md) abusive requests. -- In GitLab versions [12.3](https://gitlab.com/gitlab-org/gitlab/-/issues/29239) and greater, +- In GitLab versions [12.3](https://gitlab.com/gitlab-org/gitlab/-/issues/29239) and later, user ID and username, if available. 
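The "Depending on your installation method" pattern that these hunks repeat can be sketched as a small lookup. This is a minimal illustration, not part of GitLab: `log_path` and `OMNIBUS_EXCEPTIONS` are hypothetical names, the exception table lists only a few of the Omnibus GitLab paths named in this diff, and component logs that exist only for Omnibus GitLab (Gitaly or NGINX, for example) are not covered.

```python
from pathlib import PurePosixPath

# Base directories taken from the paths listed in this diff.
OMNIBUS_RAILS_LOG_DIR = PurePosixPath("/var/log/gitlab/gitlab-rails")
SOURCE_LOG_DIR = PurePosixPath("/home/git/gitlab/log")

# Omnibus GitLab logs that the documentation places outside gitlab-rails/.
# Illustrative and intentionally incomplete.
OMNIBUS_EXCEPTIONS = {
    "sidekiq.log": "/var/log/gitlab/sidekiq/current",
    "mail_room_json.log": "/var/log/gitlab/mailroom/current",
    "puma_stdout.log": "/var/log/gitlab/puma/puma_stdout.log",
    "puma_stderr.log": "/var/log/gitlab/puma/puma_stderr.log",
}


def log_path(name: str, omnibus: bool = True) -> str:
    """Return the documented default location of a log file (hypothetical helper)."""
    if not omnibus:
        # Installations from source keep these logs in a single directory.
        return str(SOURCE_LOG_DIR / name)
    if name in OMNIBUS_EXCEPTIONS:
        return OMNIBUS_EXCEPTIONS[name]
    return str(OMNIBUS_RAILS_LOG_DIR / name)


print(log_path("auth.log"))                 # Omnibus GitLab location
print(log_path("auth.log", omnibus=False))  # installation from source
print(log_path("sidekiq.log"))              # one of the listed exceptions
```

Keeping the exceptions in an explicit table makes it obvious which files break the general rule. Paths that an administrator has reconfigured are not reflected here.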
## `graphql_json.log` > [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/59587) in GitLab 12.0. -This file lives in `/var/log/gitlab/gitlab-rails/graphql_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/graphql_json.log` for -installations from source. +Depending on your installation method, this file is located at: -GraphQL queries are recorded in that file. For example: +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/graphql_json.log` +- Installations from source: `/home/git/gitlab/log/graphql_json.log` + +GraphQL queries are recorded in the file. For example: ```json {"query_string":"query IntrospectionQuery{__schema {queryType { name },mutationType { name }}}...(etc)","variables":{"a":1,"b":2},"complexity":181,"depth":1,"duration_s":7} @@ -734,24 +774,26 @@ GraphQL queries are recorded in that file. For example: > Introduced in GitLab 12.3. -This file lives in `/var/log/gitlab/gitlab-rails/migrations.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/migrations.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/migrations.log` +- Installations from source: `/home/git/gitlab/log/migrations.log` ## `mail_room_json.log` (default) > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/19186) in GitLab 12.6. -This file lives in `/var/log/gitlab/mailroom/current` for -Omnibus GitLab packages or in `/home/git/gitlab/log/mail_room_json.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/mailroom/current` +- Installations from source: `/home/git/gitlab/log/mail_room_json.log` This structured log file records internal activity in the `mail_room` gem. Its name and path are configurable, so the name and path may not match the above. 
-## Reconfigure Logs +## Reconfigure logs -Reconfigure log files live in `/var/log/gitlab/reconfigure` for Omnibus GitLab +Reconfigure log files are in `/var/log/gitlab/reconfigure` for Omnibus GitLab packages. Installations from source don't have reconfigure logs. A reconfigure log is populated whenever `gitlab-ctl reconfigure` is run manually or as part of an upgrade. @@ -763,46 +805,47 @@ was initiated, such as `1509705644.log` If Prometheus metrics and the Sidekiq Exporter are both enabled, Sidekiq starts a Web server and listen to the defined port (default: `8082`). By default, Sidekiq Exporter access logs are disabled but can -be enabled: +be enabled based on your installation method: -- For Omnibus GitLab installations, using the `sidekiq['exporter_log_enabled'] = true` - option in `/etc/gitlab/gitlab.rb`. -- For installations from source, using the `sidekiq_exporter.log_enabled` option - in `gitlab.yml`. +- Omnibus GitLab: Use the `sidekiq['exporter_log_enabled'] = true` + option in `/etc/gitlab/gitlab.rb` +- Installations from source: Use the `sidekiq_exporter.log_enabled` option + in `gitlab.yml` -When enabled, access logs are generated in -`/var/log/gitlab/gitlab-rails/sidekiq_exporter.log` for Omnibus GitLab -packages or in `/home/git/gitlab/log/sidekiq_exporter.log` for -installations from source. +When enabled, depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/sidekiq_exporter.log` +- Installations from source: `/home/git/gitlab/log/sidekiq_exporter.log` If Prometheus metrics and the Web Exporter are both enabled, Puma starts a Web server and listen to the defined port (default: `8083`), and access logs -are generated: +are generated in a location based on your installation method: -- For Omnibus GitLab packages, in `/var/log/gitlab/gitlab-rails/web_exporter.log`. -- For installations from source, in `/home/git/gitlab/log/web_exporter.log`. 
+- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/web_exporter.log` +- Installations from source: `/home/git/gitlab/log/web_exporter.log` ## `database_load_balancing.log` **(PREMIUM SELF)** > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/15442) in GitLab 12.3. Contains details of GitLab [Database Load Balancing](database_load_balancing.md). -It's stored at: +Depending on your installation method, this file is located at: -- `/var/log/gitlab/gitlab-rails/database_load_balancing.log` for Omnibus GitLab packages. -- `/home/git/gitlab/log/database_load_balancing.log` for installations from source. +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/database_load_balancing.log` +- Installations from source: `/home/git/gitlab/log/database_load_balancing.log` ## `elasticsearch.log` **(PREMIUM SELF)** > Introduced in GitLab 12.6. This file logs information related to the Elasticsearch Integration, including -errors during indexing or searching Elasticsearch. It's stored at: +errors during indexing or searching Elasticsearch. Depending on your installation +method, this file is located at: -- `/var/log/gitlab/gitlab-rails/elasticsearch.log` for Omnibus GitLab packages. -- `/home/git/gitlab/log/elasticsearch.log` for installations from source. +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/elasticsearch.log` +- Installations from source: `/home/git/gitlab/log/elasticsearch.log` -Each line contains a JSON line that can be ingested by services like Elasticsearch and Splunk. +Each line contains JSON that can be ingested by services like Elasticsearch and Splunk. 
Line breaks have been added to the following example line for clarity: ```json @@ -825,12 +868,13 @@ Line breaks have been added to the following example line for clarity: This file logs the information about exceptions being tracked by `Gitlab::ErrorTracking`, which provides a standard and consistent way of -[processing rescued exceptions](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/development/logging.md#exception-handling). This file is stored in: +[processing rescued exceptions](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/development/logging.md#exception-handling). +Depending on your installation method, this file is located at: -- `/var/log/gitlab/gitlab-rails/exceptions_json.log` for Omnibus GitLab packages. -- `/home/git/gitlab/log/exceptions_json.log` for installations from source. +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/exceptions_json.log` +- Installations from source: `/home/git/gitlab/log/exceptions_json.log` -Each line contains a JSON line that can be ingested by Elasticsearch. For example: +Each line contains JSON that can be ingested by Elasticsearch. For example: ```json { @@ -853,9 +897,10 @@ Each line contains a JSON line that can be ingested by Elasticsearch. For exampl > Introduced in GitLab 13.0. -This file lives in `/var/log/gitlab/gitlab-rails/service_measurement.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/service_measurement.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/service_measurement.log` +- Installations from source: `/home/git/gitlab/log/service_measurement.log` It contains only a single structured log with measurements for each service execution. It contains measurements such as the number of SQL calls, `execution_time`, `gc_stats`, and `memory usage`. @@ -870,9 +915,12 @@ For example: > Introduced in 9.5. -Geo stores structured log messages in a `geo.log` file. 
For Omnibus installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`. +Geo stores structured log messages in a `geo.log` file. For Omnibus GitLab +installations, this file is at `/var/log/gitlab/gitlab-rails/geo.log`. -This file contains information about when Geo attempts to sync repositories and files. Each line in the file contains a separate JSON entry that can be ingested into. For example, Elasticsearch or Splunk. +This file contains information about when Geo attempts to sync repositories +and files. Each line in the file contains a separate JSON entry that can be +ingested into (for example, Elasticsearch or Splunk). For example: @@ -886,10 +934,10 @@ This message shows that Geo detected that a repository update was needed for pro > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/commit/7f637e2af7006dc2b1b2649d9affc0b86cfb33c4) in GitLab 11.12. -This file is stored in: +Depending on your installation method, this file is located at: -- `/var/log/gitlab/gitlab-rails/update_mirror_service_json.log` for Omnibus GitLab installations. -- `/home/git/gitlab/log/update_mirror_service_json.log` for installations from source. +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/update_mirror_service_json.log` +- Installations from source: `/home/git/gitlab/log/update_mirror_service_json.log` This file contains information about LFS errors that occurred during project mirroring. While we work to move other project mirroring errors into this log, the [general log](#productionlog) @@ -909,20 +957,20 @@ can be used. ## Registry Logs -For Omnibus installations, Container Registry logs reside in `/var/log/gitlab/registry/current`. +For Omnibus GitLab installations, Container Registry logs are in `/var/log/gitlab/registry/current`. ## NGINX Logs -For Omnibus installations, NGINX logs reside in: +For Omnibus GitLab installations, NGINX logs are in: -- `/var/log/gitlab/nginx/gitlab_access.log` contains a log of requests made to GitLab. 
-- `/var/log/gitlab/nginx/gitlab_error.log` contains a log of NGINX errors for GitLab. -- `/var/log/gitlab/nginx/gitlab_pages_access.log` contains a log of requests made to Pages static sites. -- `/var/log/gitlab/nginx/gitlab_pages_error.log` contains a log of NGINX errors for Pages static sites. -- `/var/log/gitlab/nginx/gitlab_registry_access.log` contains a log of requests made to the Container Registry. -- `/var/log/gitlab/nginx/gitlab_registry_error.log` contains a log of NGINX errors for the Container Registry. -- `/var/log/gitlab/nginx/gitlab_mattermost_access.log` contains a log of requests made to Mattermost. -- `/var/log/gitlab/nginx/gitlab_mattermost_error.log` contains a log of NGINX errors for Mattermost. +- `/var/log/gitlab/nginx/gitlab_access.log`: A log of requests made to GitLab +- `/var/log/gitlab/nginx/gitlab_error.log`: A log of NGINX errors for GitLab +- `/var/log/gitlab/nginx/gitlab_pages_access.log`: A log of requests made to Pages static sites +- `/var/log/gitlab/nginx/gitlab_pages_error.log`: A log of NGINX errors for Pages static sites +- `/var/log/gitlab/nginx/gitlab_registry_access.log`: A log of requests made to the Container Registry +- `/var/log/gitlab/nginx/gitlab_registry_error.log`: A log of NGINX errors for the Container Registry +- `/var/log/gitlab/nginx/gitlab_mattermost_access.log`: A log of requests made to Mattermost +- `/var/log/gitlab/nginx/gitlab_mattermost_error.log`: A log of NGINX errors for Mattermost Below is the default GitLab NGINX access log format: @@ -932,7 +980,7 @@ $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$ ## Pages Logs -For Omnibus installations, Pages logs reside in `/var/log/gitlab/gitlab-pages/current`. +For Omnibus GitLab installations, Pages logs are in `/var/log/gitlab/gitlab-pages/current`. For example: @@ -961,66 +1009,68 @@ For example: ## Mattermost Logs -For Omnibus GitLab installations, Mattermost logs reside in `/var/log/gitlab/mattermost/mattermost.log`. 
+For Omnibus GitLab installations, Mattermost logs are in `/var/log/gitlab/mattermost/mattermost.log`. ## Workhorse Logs -For Omnibus GitLab installations, Workhorse logs reside in `/var/log/gitlab/gitlab-workhorse/`. +For Omnibus GitLab installations, Workhorse logs are in `/var/log/gitlab/gitlab-workhorse/`. ## PostgreSQL Logs -For Omnibus GitLab installations, PostgreSQL logs reside in `/var/log/gitlab/postgresql/`. +For Omnibus GitLab installations, PostgreSQL logs are in `/var/log/gitlab/postgresql/`. ## Prometheus Logs -For Omnibus GitLab installations, Prometheus logs reside in `/var/log/gitlab/prometheus/`. +For Omnibus GitLab installations, Prometheus logs are in `/var/log/gitlab/prometheus/`. ## Redis Logs -For Omnibus GitLab installations, Redis logs reside in `/var/log/gitlab/redis/`. +For Omnibus GitLab installations, Redis logs are in `/var/log/gitlab/redis/`. ## Alertmanager Logs -For Omnibus GitLab installations, Alertmanager logs reside in `/var/log/gitlab/alertmanager/`. +For Omnibus GitLab installations, Alertmanager logs are in `/var/log/gitlab/alertmanager/`. <!-- vale gitlab.Spelling = NO --> ## Crond Logs -For Omnibus GitLab installations, crond logs reside in `/var/log/gitlab/crond/`. +For Omnibus GitLab installations, crond logs are in `/var/log/gitlab/crond/`. <!-- vale gitlab.Spelling = YES --> ## Grafana Logs -For Omnibus GitLab installations, Grafana logs reside in `/var/log/gitlab/grafana/`. +For Omnibus GitLab installations, Grafana logs are in `/var/log/gitlab/grafana/`. ## LogRotate Logs -For Omnibus GitLab installations, `logrotate` logs reside in `/var/log/gitlab/logrotate/`. +For Omnibus GitLab installations, `logrotate` logs are in `/var/log/gitlab/logrotate/`. ## GitLab Monitor Logs -For Omnibus GitLab installations, GitLab Monitor logs reside in `/var/log/gitlab/gitlab-monitor/`. +For Omnibus GitLab installations, GitLab Monitor logs are in `/var/log/gitlab/gitlab-monitor/`. 
## GitLab Exporter -For Omnibus GitLab installations, GitLab Exporter logs reside in `/var/log/gitlab/gitlab-exporter/`. +For Omnibus GitLab installations, GitLab Exporter logs are in `/var/log/gitlab/gitlab-exporter/`. ## GitLab Kubernetes Agent Server -For Omnibus GitLab installations, GitLab Kubernetes Agent Server logs reside +For Omnibus GitLab installations, GitLab Kubernetes Agent Server logs are in `/var/log/gitlab/gitlab-kas/`. ## Performance bar stats > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/48149) in GitLab 13.7. -This file lives in `/var/log/gitlab/gitlab-rails/performance_bar_json.log` for -Omnibus GitLab packages or in `/home/git/gitlab/log/performance_bar_json.log` for -installations from source. +Depending on your installation method, this file is located at: + +- Omnibus GitLab: `/var/log/gitlab/gitlab-rails/performance_bar_json.log` +- Installations from source: `/home/git/gitlab/log/performance_bar_json.log` -Performance bar statistics (currently only duration of SQL queries) are recorded in that file. For example: +Performance bar statistics (currently only duration of SQL queries) are recorded +in that file. For example: ```json {"severity":"INFO","time":"2020-12-04T09:29:44.592Z","correlation_id":"33680b1490ccd35981b03639c406a697","filename":"app/models/ci/pipeline.rb","method_path":"app/models/ci/pipeline.rb:each_with_object","request_id":"rYHomD0VJS4","duration_ms":26.889,"count":2,"type": "sql"} diff --git a/doc/administration/maintenance_mode/index.md b/doc/administration/maintenance_mode/index.md index 2f5d366f927..37415468517 100644 --- a/doc/administration/maintenance_mode/index.md +++ b/doc/administration/maintenance_mode/index.md @@ -8,7 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w > [Introduced](https://gitlab.com/groups/gitlab-org/-/epics/2149) in GitLab Premium 13.9. 
-Maintenance Mode allows administrators to reduce write operations to a minimum while maintenance tasks are performed. The main goal is to block all external actions that change the internal state, including the PostgreSQL database, but especially files, Git repositories, Container repositories, etc. +Maintenance Mode allows administrators to reduce write operations to a minimum while maintenance tasks are performed. The main goal is to block all external actions that change the internal state, including the PostgreSQL database, but especially files, Git repositories, Container repositories, and so on. Once Maintenance Mode is enabled, in-progress actions finish relatively quickly since no new actions are coming in, and internal state changes are minimal. In that state, various maintenance tasks are easier, and services can be stopped completely or be diff --git a/doc/administration/monitoring/ip_whitelist.md b/doc/administration/monitoring/ip_whitelist.md index 522267ce362..20c97a0df8f 100644 --- a/doc/administration/monitoring/ip_whitelist.md +++ b/doc/administration/monitoring/ip_whitelist.md @@ -29,6 +29,21 @@ hosts or use IP ranges: --- +**For installations using cloud native Helm charts** + +You can set the required IPs under the `gitlab.webservice.monitoring.ipWhitelist` key. For example: + +```yaml +gitlab: + webservice: + monitoring: + # Monitoring IP whitelist + ipWhitelist: + - 0.0.0.0/0 # Default +``` + +--- + **For installations from source** 1. 
Edit `config/gitlab.yml`: diff --git a/doc/administration/monitoring/performance/img/performance_bar.png b/doc/administration/monitoring/performance/img/performance_bar.png Binary files differ deleted file mode 100644 index 380e2060b24..00000000000 --- a/doc/administration/monitoring/performance/img/performance_bar.png +++ /dev/null diff --git a/doc/administration/monitoring/performance/img/performance_bar_v14_0.png b/doc/administration/monitoring/performance/img/performance_bar_v14_0.png Binary files differ new file mode 100644 index 00000000000..42261ddd720 --- /dev/null +++ b/doc/administration/monitoring/performance/img/performance_bar_v14_0.png diff --git a/doc/administration/monitoring/performance/performance_bar.md b/doc/administration/monitoring/performance/performance_bar.md index 1125547f13f..5a7e8e12a38 100644 --- a/doc/administration/monitoring/performance/performance_bar.md +++ b/doc/administration/monitoring/performance/performance_bar.md @@ -12,7 +12,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w You can display the GitLab Performance Bar to see statistics for the performance of a page.
When activated, it looks as follows: -![Performance Bar](img/performance_bar.png) +![Performance Bar](img/performance_bar_v14_0.png) From left to right, it displays: diff --git a/doc/administration/monitoring/prometheus/gitlab_metrics.md b/doc/administration/monitoring/prometheus/gitlab_metrics.md index 7e72f6ed7df..2aa95a2b0f1 100644 --- a/doc/administration/monitoring/prometheus/gitlab_metrics.md +++ b/doc/administration/monitoring/prometheus/gitlab_metrics.md @@ -98,6 +98,8 @@ The following metrics are available: | `gitlab_transaction_db_write_count_total` | Counter | 13.1 | Counter for total number of write SQL calls | `controller`, `action` | | `gitlab_transaction_db_cached_count_total` | Counter | 13.1 | Counter for total number of cached SQL calls | `controller`, `action` | | `gitlab_transaction_db_<role>_cached_count_total` | Counter | 13.1 | Counter for total number of cached SQL calls, grouped by database roles (primary/replica) | `controller`, `action` | +| `gitlab_transaction_db_<role>_wal_count_total` | Counter | 14.0 | Counter for total number of WAL (write ahead log location) queries, grouped by database roles (primary/replica) | `controller`, `action` | +| `gitlab_transaction_db_<role>_wal_cached_count_total` | Counter | 14.1 | Counter for total number of cached WAL (write ahead log location) queries, grouped by database roles (primary/replica)| `controller`, `action` | | `http_elasticsearch_requests_duration_seconds` **(PREMIUM)** | Histogram | 13.1 | Elasticsearch requests duration during web transactions | `controller`, `action` | | `http_elasticsearch_requests_total` **(PREMIUM)** | Counter | 13.1 | Elasticsearch requests count during web transactions | `controller`, `action` | | `pipelines_created_total` | Counter | 9.4 | Counter of pipelines created | | @@ -119,6 +121,7 @@ The following metrics are available: | `action_cable_single_client_transmissions_total` | Counter | 13.10 | The number of ActionCable messages transmitted to any client in 
any channel | `server_mode` | | `action_cable_subscription_confirmations_total` | Counter | 13.10 | The number of ActionCable subscriptions from clients confirmed | `server_mode` | | `action_cable_subscription_rejections_total` | Counter | 13.10 | The number of ActionCable subscriptions from clients rejected | `server_mode` | +| `action_cable_transmitted_bytes` | Histogram | 14.1 | Message size, in bytes, transmitted over action cable | `operation`, `channel` | | `gitlab_issuable_fast_count_by_state_total` | Counter | 13.5 | Total number of row count operations on issue/merge request list pages | | | `gitlab_issuable_fast_count_by_state_failures_total` | Counter | 13.5 | Number of soft-failed row count operations on issue/merge request list pages | | | `gitlab_external_http_total` | Counter | 13.8 | Total number of HTTP calls to external systems | `controller`, `action` | @@ -133,6 +136,10 @@ The following metrics are available: | `gitlab_spamcheck_request_duration_seconds` | Histogram | 13.12 | The duration for requests between Rails and the anti-spam engine | | | `service_desk_thank_you_email` | Counter | 14.0 | Total number of email responses to new service desk emails | | | `service_desk_new_note_email` | Counter | 14.0 | Total number of email notifications on new service desk comment | | +| `email_receiver_error` | Counter | 14.1 | Total number of errors when processing incoming emails | | +| `gitlab_snowplow_events_total` | Counter | 14.1 | Total number of GitLab Snowplow product intelligence events emitted | | +| `gitlab_snowplow_failed_events_total` | Counter | 14.1 | Total number of GitLab Snowplow product intelligence events emission failures | | +| `gitlab_snowplow_successful_events_total` | Counter | 14.1 | Total number of GitLab Snowplow product intelligence events emission successes | | ## Metrics controlled by a feature flag @@ -262,10 +269,11 @@ configuration option in `gitlab.yml`. 
These metrics are served from the The following metrics are available: -| Metric | Type | Since | Description | Labels | -|:--------------------------------- |:--------- |:------------------------------------------------------------- |:-------------------------------------- |:--------------------------------------------------------- | -| `db_load_balancing_hosts` | Gauge | [12.3](https://gitlab.com/gitlab-org/gitlab/-/issues/13630) | Current number of load balancing hosts | | -| `sidekiq_load_balancing_count` | Counter | 13.11 | Sidekiq jobs using load balancing with data consistency set to :sticky or :delayed | `queue`, `boundary`, `external_dependencies`, `feature_category`, `job_status`, `urgency`, `data_consistency`, `database_chosen` | +| Metric | Type | Since | Description | Labels | +|:-------------------------------------------------------- |:--------- |:------------------------------------------------------------- |:---------------------------------------------------------------------------------- |:---------------------------------------------------------------------------------------------------------------------------------------- | +| `db_load_balancing_hosts` | Gauge | [12.3](https://gitlab.com/gitlab-org/gitlab/-/issues/13630) | Current number of load balancing hosts | | +| `sidekiq_load_balancing_count` | Counter | 13.11 | Sidekiq jobs using load balancing with data consistency set to :sticky or :delayed | `queue`, `boundary`, `external_dependencies`, `feature_category`, `job_status`, `urgency`, `data_consistency`, `load_balancing_strategy` | +| `gitlab_transaction_caught_up_replica_pick_count_total` | Counter | 14.1 | Number of search attempts for caught up replica | `result` | ## Database partitioning metrics **(PREMIUM SELF)** @@ -336,7 +344,7 @@ These client metrics are meant to complement Redis server metrics. 
These metrics are broken down per [Redis instance](https://docs.gitlab.com/omnibus/settings/redis.html#running-with-multiple-redis-instances). These metrics all have a `storage` label which indicates the Redis -instance (`cache`, `shared_state` etc.). +instance (`cache`, `shared_state`, and so on). | Metric | Type | Since | Description | |:--------------------------------- |:------- |:----- |:----------- | diff --git a/doc/administration/nfs.md b/doc/administration/nfs.md index e53f2af3440..c4ff19ec3ea 100644 --- a/doc/administration/nfs.md +++ b/doc/administration/nfs.md @@ -10,7 +10,7 @@ type: reference NFS can be used as an alternative for object storage but this isn't typically recommended for performance reasons. -For data objects such as LFS, Uploads, Artifacts, etc., an [Object Storage service](object_storage.md) +For data objects such as LFS, Uploads, Artifacts, and so on, an [Object Storage service](object_storage.md) is recommended over NFS where possible, due to better performance. File system performance can impact overall GitLab performance, especially for @@ -20,11 +20,13 @@ file system performance, see ## Gitaly and NFS deprecation -WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories are no longer -considered and customer technical support is considered out of scope. -[Read more about Gitaly and NFS](gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. + +Read: + +- The [Gitaly and NFS deprecation notice](gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). 
## Known kernel version incompatibilities @@ -100,7 +102,7 @@ and GIDs (which is off by default in some cases) for simplified permission management between systems: - [NetApp instructions](https://library.netapp.com/ecmdocs/ECMP1401220/html/GUID-24367A9F-E17B-4725-ADC1-02D86F56F78E.html) -- For non-NetApp devices, disable NFSv4 `idmapping` by performing opposite of [enable NFSv4 idmapper](https://wiki.archlinux.org/index.php/NFS#Enabling_NFSv4_idmapping) +- For non-NetApp devices, disable NFSv4 `idmapping` by performing opposite of [enable NFSv4 idmapper](https://wiki.archlinux.org/title/NFS#Enabling_NFSv4_idmapping) ### Disable NFS server delegation @@ -368,9 +370,8 @@ sudo ufw allow from <client_ip_address> to any port nfs ### Upgrade to Gitaly Cluster or disable caching if experiencing data loss WARNING: -From GitLab 13.0, using NFS for Git repositories is deprecated. -As of GitLab 14.0, NFS-related issues with Gitaly are no longer addressed. Read -more about [Gitaly and NFS deprecation](gitaly/index.md#nfs-deprecation-notice). +Engineering support for NFS for Git repositories is deprecated. Read the +[Gitaly and NFS deprecation notice](gitaly/index.md#nfs-deprecation-notice). Customers and users have reported data loss on high-traffic repositories when using NFS for Git repositories. For example, we have seen: diff --git a/doc/administration/object_storage.md b/doc/administration/object_storage.md index f1025bd1846..525b41359cf 100644 --- a/doc/administration/object_storage.md +++ b/doc/administration/object_storage.md @@ -537,7 +537,7 @@ the original form is omitted. 
To move to the consolidated form, remove the original configuration (for example, `artifacts_object_store_enabled`, or `uploads_object_store_connection`) -## Storage-specific configuration +### Storage-specific configuration For configuring object storage in GitLab 13.1 and earlier, or for storage types not supported by consolidated configuration form, refer to the following guides: @@ -580,7 +580,7 @@ There are plans to [enable the use of a single bucket](https://gitlab.com/gitlab in the future. Helm-based installs require separate buckets to -[handle backup restorations](https://docs.gitlab.com/charts/advanced/external-object-storage/#lfs-artifacts-uploads-packages-external-diffs-pseudonymizer) +[handle backup restorations](https://docs.gitlab.com/charts/advanced/external-object-storage/#lfs-artifacts-uploads-packages-external-diffs-pseudonymizer). ### S3 API compatibility issues @@ -591,12 +591,6 @@ with the Fog library that GitLab uses. Symptoms include an error in `production. 411 Length Required ``` -### Incremental logging is required for CI to use object storage - -If you configure GitLab to use object storage for CI logs and artifacts, -you can avoid [local disk usage for job logs](job_logs.md#data-flow) by enabling -[beta incremental logging](job_logs.md#incremental-logging-architecture). - ### Proxy Download Clients can download files in object storage by receiving a pre-signed, time-limited URL, @@ -724,21 +718,6 @@ must be fulfilled: [ETag mismatch errors](#etag-mismatch) occur if server side encryption headers are used without enabling the Workhorse S3 client. -##### Disabling the feature - -The Workhorse S3 client is enabled by default when the -[`use_iam_profile` configuration option](#iam-permissions) is set to `true` or consolidated -object storage settings are configured. - -The feature can be disabled using the `:use_workhorse_s3_client` feature flag. 
To disable the -feature, ask a GitLab administrator with -[Rails console access](feature_flags.md#how-to-enable-and-disable-features-behind-flags) to run the -following command: - -```ruby -Feature.disable(:use_workhorse_s3_client) -``` - #### IAM Permissions To set up an instance profile: diff --git a/doc/administration/operations/extra_sidekiq_processes.md b/doc/administration/operations/extra_sidekiq_processes.md index b910a789d29..1f195bcc378 100644 --- a/doc/administration/operations/extra_sidekiq_processes.md +++ b/doc/administration/operations/extra_sidekiq_processes.md @@ -74,8 +74,9 @@ To start multiple processes: just handles the `mailers` queue. When `sidekiq-cluster` is only running on a single node, make sure that at least - one process is running on all queues using `*`. This means a process is - This includes queues that have dedicated processes. + one process is running on all queues using `*`. This ensures a process + automatically picks up jobs in queues created in the future, + including queues that have dedicated processes. If `sidekiq-cluster` is running on more than one node, you can also use [`--negate`](#negate-settings) and list all the queues that are already being @@ -95,13 +96,16 @@ To view the Sidekiq processes in GitLab: ## Negate settings To have the additional Sidekiq processes work on every queue **except** the ones -you list: +you list. In this example, we exclude all import-related jobs from a Sidekiq node: 1. After you follow the steps for [starting extra processes](#start-multiple-processes), edit `/etc/gitlab/gitlab.rb` and add: ```ruby sidekiq['negate'] = true + sidekiq['queue_groups'] = [ + "feature_category=importers" + ] ``` 1. Save the file and reconfigure GitLab for the changes to take effect: @@ -171,7 +175,7 @@ When disabling `sidekiq_cluster`, you must copy your configuration for `sidekiq_cluster` is overridden by the options for `sidekiq` when setting `sidekiq['cluster'] = true`. 
-When using this feature, the service called `sidekiq` is now +When using this feature, the service called `sidekiq` is now running `sidekiq-cluster`. The [concurrency](#manage-concurrency) and other options configured @@ -180,32 +184,21 @@ for Sidekiq are respected. By default, logs for `sidekiq-cluster` go to `/var/log/gitlab/sidekiq` like regular Sidekiq logs. -## Ignore all GitHub import queues +## Ignore all import queues -When [importing from GitHub](../../user/project/import/github.md), Sidekiq might -use all of its resources to perform those operations. To set up a separate -`sidekiq-cluster` process to ignore all GitHub import-related queues: +When [importing from GitHub](../../user/project/import/github.md) or +other sources, Sidekiq might use all of its resources to perform those +operations. To set up two separate `sidekiq-cluster` processes, where +one only processes imports and the other processes all other queues: 1. Edit `/etc/gitlab/gitlab.rb` and add: ```ruby sidekiq['enable'] = true - sidekiq['negate'] = true + sidekiq['queue_selector'] = true sidekiq['queue_groups'] = [ - "github_import_advance_stage", - "github_importer:github_import_import_diff_note", - "github_importer:github_import_import_issue", - "github_importer:github_import_import_note", - "github_importer:github_import_import_lfs_object", - "github_importer:github_import_import_pull_request", - "github_importer:github_import_refresh_import_jid", - "github_importer:github_import_stage_finish_import", - "github_importer:github_import_stage_import_base_data", - "github_importer:github_import_stage_import_issues_and_diff_notes", - "github_importer:github_import_stage_import_notes", - "github_importer:github_import_stage_import_lfs_objects", - "github_importer:github_import_stage_import_pull_requests", - "github_importer:github_import_stage_import_repository" + "feature_category=importers", + "feature_category!=importers" ] ``` diff --git 
a/doc/administration/operations/extra_sidekiq_routing.md b/doc/administration/operations/extra_sidekiq_routing.md index 93cf8bd4f43..80540b7ba46 100644 --- a/doc/administration/operations/extra_sidekiq_routing.md +++ b/doc/administration/operations/extra_sidekiq_routing.md @@ -41,7 +41,7 @@ In `/etc/gitlab/gitlab.rb`: ```ruby sidekiq['routing_rules'] = [ # Route all non-CPU-bound workers that are high urgency to `high-urgency` queue - ['resource_boundary!=cpu&urgency=high', 'high-urgency'], + ['resource_boundary!=cpu&urgency=high', 'high-urgency'], # Route all database, gitaly and global search workers that are throttled to `throttled` queue ['feature_category=database,gitaly,global_search&urgency=throttled', 'throttled'], # Route all workers having contact with outside work to a `network-intenstive` queue @@ -99,7 +99,7 @@ based on a subset of worker attributes: - `urgency` - how important it is that this queue's jobs run quickly. Can be `high`, `low`, or `throttled`. For example, the `authorized_projects` queue is used to refresh user permissions, and - is high urgency. + is `high` urgency. - `worker_name` - the worker name. The other attributes are typically more useful as they are more general, but this is available in case a particular worker needs to be selected. diff --git a/doc/administration/operations/puma.md b/doc/administration/operations/puma.md index fffff78b9d6..e8477eaf686 100644 --- a/doc/administration/operations/puma.md +++ b/doc/administration/operations/puma.md @@ -4,35 +4,102 @@ group: Distribution info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- -# Switching to Puma **(FREE SELF)** +# Puma **(FREE SELF)** -As of GitLab 12.9, [Puma](https://github.com/puma/puma) has replaced [Unicorn](https://yhbt.net/unicorn/) -as the default web server. 
From GitLab 14.0, the following run Puma: +Puma is a simple, fast, multi-threaded, and highly concurrent HTTP 1.1 server for +Ruby applications. It's the default GitLab web server since GitLab 13.0 +and has replaced Unicorn. From GitLab 14.0, Unicorn is no longer supported. -- All-in-one package-based installations. -- Helm chart-based installations. +NOTE: +Starting with GitLab 13.0, Puma is the default web server and Unicorn has been disabled. +In GitLab 14.0, Unicorn was removed from the Linux package and only Puma is available. -## Why switch to Puma? +## Configure Puma -Puma has a multi-thread architecture which uses less memory than a multi-process -application server like Unicorn. On GitLab.com, we saw a 40% reduction in memory -consumption. +To configure Puma: -Most Rails applications requests normally include a proportion of I/O wait time. -During I/O wait time MRI Ruby will release the GVL (Global VM Lock) to other threads. -Multi-threaded Puma can therefore still serve more requests than a single process. +1. Determine suitable Puma worker and thread [settings](../../install/requirements.md#puma-settings). +1. If you're switching from Unicorn, [convert any custom settings to Puma](#convert-unicorn-settings-to-puma). +1. For multi-node deployments, configure the load balancer to use the + [readiness check](../load_balancer.md#readiness-check). +1. Reconfigure GitLab so the above changes take effect: + + ```shell + sudo gitlab-ctl reconfigure + ``` + +For Helm-based deployments, see the +[`webservice` chart documentation](https://docs.gitlab.com/charts/charts/gitlab/webservice/index.html). + +For more details about the Puma configuration, see the +[Puma documentation](https://github.com/puma/puma#configuration). 
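As an illustration of the first step, worker and thread settings for a Linux package installation might be sketched in `/etc/gitlab/gitlab.rb` as follows. The values shown are placeholder assumptions, not recommendations; suitable numbers depend on the CPU and memory available on the node, as described in the Puma requirements.

```ruby
# /etc/gitlab/gitlab.rb (Linux package installations)
# The values below are illustrative assumptions only; size them for your node.
puma['worker_processes'] = 2  # number of Puma worker processes
puma['min_threads'] = 4       # minimum number of threads per worker
puma['max_threads'] = 4       # maximum number of threads per worker
```

As with any change to `gitlab.rb`, these settings only take effect after running `sudo gitlab-ctl reconfigure`.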
+ +## Puma Worker Killer + +By default: + +- The [Puma Worker Killer](https://github.com/schneems/puma_worker_killer) restarts a worker if it + exceeds a [memory limit](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/cluster/puma_worker_killer_initializer.rb). +- Rolling restarts of Puma workers are performed every 12 hours. + +To change the memory limit setting: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + puma['per_worker_max_memory_mb'] = 1024 + ``` + +1. Reconfigure GitLab for the changes to take effect: + + ```shell + sudo gitlab-ctl reconfigure + ``` + +## Worker timeout + +A [timeout of 60 seconds](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/initializers/rack_timeout.rb) +is used when Puma is enabled. + +NOTE: +Unlike Unicorn, the `puma['worker_timeout']` setting does not set the maximum request duration. -## Configuring Puma to replace Unicorn +To change the worker timeout: -Beginning with GitLab 13.0, Puma is the default application server. We removed support for -Unicorn in GitLab 14.0. +1. Edit `/etc/gitlab/gitlab.rb`: -When switching to Puma, Unicorn server configuration -will _not_ carry over automatically, due to differences between the two application servers. For Omnibus-based -deployments, see [Configuring Puma Settings](https://docs.gitlab.com/omnibus/settings/puma.html#configuring-puma-settings). -For Helm based deployments, see the [`webservice` chart documentation](https://docs.gitlab.com/charts/charts/gitlab/webservice/index.html). + ```ruby + gitlab_rails['env'] = { + 'GITLAB_RAILS_RACK_TIMEOUT' => 600 + } + ``` -Additionally we strongly recommend that multi-node deployments [configure their load balancers to use the readiness check](../load_balancer.md#readiness-check) due to a difference between Unicorn and Puma in how they handle connections during a restart of the service. +1. 
Reconfigure GitLab for the changes to take effect: + + ```shell + sudo gitlab-ctl reconfigure + ``` + +## Memory-constrained environments + +In a memory-constrained environment with less than 4GB of RAM available, consider disabling Puma +[Clustered mode](https://github.com/puma/puma#clustered-mode). + +Configuring Puma by setting the amount of `workers` to `0` could reduce memory usage by hundreds of MB. +For details on Puma worker and thread settings, see the [Puma requirements](../../install/requirements.md#puma-settings). + +Unlike in a Clustered mode, which is set up by default, only a single Puma process would serve the application. + +The downside of running Puma with such configuration is the reduced throughput, which could be +considered as a fair tradeoff in a memory-constrained environment. + +When running Puma in Single mode, some features are not supported: + +- Phased restarts do not work: [issue](https://gitlab.com/gitlab-org/gitlab/-/issues/300665) +- [Phased restart](https://gitlab.com/gitlab-org/gitlab/-/issues/300665) +- [Puma Worker Killer](https://gitlab.com/gitlab-org/gitlab/-/issues/300664) + +To learn more, visit [epic 5303](https://gitlab.com/groups/gitlab-org/-/epics/5303). ## Performance caveat when using Puma with Rugged @@ -66,3 +133,46 @@ optimal configuration: Rugged, single-threaded Puma works the same as Unicorn. - To force Rugged to be used with multi-threaded Puma, you can use [feature flags](../../development/gitaly.md#legacy-rugged-code). + +## Convert Unicorn settings to Puma + +NOTE: +Starting with GitLab 13.0, Puma is the default web server and Unicorn has been +disabled by default. In GitLab 14.0, Unicorn was removed from the Linux package +and only Puma is available. + +Puma has a multi-thread architecture which uses less memory than a multi-process +application server like Unicorn. On GitLab.com, we saw a 40% reduction in memory +consumption. Most Rails application requests normally include a proportion of I/O wait time.
+ +During I/O wait time MRI Ruby releases the GVL (Global VM Lock) to other threads. +Multi-threaded Puma can therefore still serve more requests than a single process. + +When switching to Puma, any Unicorn server configuration will _not_ carry over +automatically, due to differences between the two application servers. + +The table below summarizes which Unicorn configuration keys correspond to those +in Puma when using the Linux package, and which ones have no corresponding counterpart. + +| Unicorn | Puma | +| ------------------------------------ | ---------------------------------- | +| `unicorn['enable']` | `puma['enable']` | +| `unicorn['worker_timeout']` | `puma['worker_timeout']` | +| `unicorn['worker_processes']` | `puma['worker_processes']` | +| n/a | `puma['ha']` | +| n/a | `puma['min_threads']` | +| n/a | `puma['max_threads']` | +| `unicorn['listen']` | `puma['listen']` | +| `unicorn['port']` | `puma['port']` | +| `unicorn['socket']` | `puma['socket']` | +| `unicorn['pidfile']` | `puma['pidfile']` | +| `unicorn['tcp_nopush']` | n/a | +| `unicorn['backlog_socket']` | n/a | +| `unicorn['somaxconn']` | `puma['somaxconn']` | +| n/a | `puma['state_path']` | +| `unicorn['log_directory']` | `puma['log_directory']` | +| `unicorn['worker_memory_limit_min']` | n/a | +| `unicorn['worker_memory_limit_max']` | `puma['per_worker_max_memory_mb']` | +| `unicorn['exporter_enabled']` | `puma['exporter_enabled']` | +| `unicorn['exporter_address']` | `puma['exporter_address']` | +| `unicorn['exporter_port']` | `puma['exporter_port']` | diff --git a/doc/administration/operations/sidekiq_memory_killer.md b/doc/administration/operations/sidekiq_memory_killer.md index d3019e2c580..598baa4fcc7 100644 --- a/doc/administration/operations/sidekiq_memory_killer.md +++ b/doc/administration/operations/sidekiq_memory_killer.md @@ -25,7 +25,7 @@ minute of delay for incoming background jobs. Some background jobs rely on long-running external processes. 
To ensure these are cleanly terminated when Sidekiq is restarted, each Sidekiq process should be -run as a process group leader (e.g., using `chpst -P`). If using Omnibus or the +run as a process group leader (for example, using `chpst -P`). If using Omnibus or the `bin/background_jobs` script with `runit` installed, this is handled for you. ## Configuring the MemoryKiller @@ -80,4 +80,4 @@ The MemoryKiller is controlled using environment variables. If the process hard shutdown/restart is not performed by Sidekiq, the Sidekiq process is forcefully terminated after `Sidekiq.options[:timeout] + 2` seconds. An external supervision mechanism - (e.g. runit) must restart Sidekiq afterwards. + (for example, runit) must restart Sidekiq afterwards. diff --git a/doc/administration/operations/ssh_certificates.md b/doc/administration/operations/ssh_certificates.md index 508d284b0bd..374eebeb773 100644 --- a/doc/administration/operations/ssh_certificates.md +++ b/doc/administration/operations/ssh_certificates.md @@ -41,11 +41,11 @@ uploading user SSH keys to GitLab entirely. How to fully set up SSH certificates is outside the scope of this document. See [OpenSSH's `PROTOCOL.certkeys`](https://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/ssh/PROTOCOL.certkeys?annotate=HEAD) -for how it works, and e.g. [RedHat's documentation about +for how it works, for example [RedHat's documentation about it](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/sec-using_openssh_certificate_authentication). We assume that you already have SSH certificates set up, and have -added the `TrustedUserCAKeys` of your CA to your `sshd_config`, e.g.: +added the `TrustedUserCAKeys` of your CA to your `sshd_config`, for example: ```plaintext TrustedUserCAKeys /etc/security/mycompany_user_ca.pub @@ -58,7 +58,7 @@ used for GitLab consider putting this in the `Match User git` section (described below). 
The SSH certificates being issued by that CA **MUST** have a "key ID" -corresponding to that user's username on GitLab, e.g. (some output +corresponding to that user's username on GitLab, for example (some output omitted for brevity): ```shell @@ -77,7 +77,7 @@ $ ssh-add -L | grep cert | ssh-keygen -L -f - [...] ``` -Technically that's not strictly true, e.g. it could be +Technically that's not strictly true, for example, it could be `prod-aearnfjord` if it's a SSH certificate you'd normally log in to servers as the `prod-aearnfjord` user, but then you must specify your own `AuthorizedPrincipalsCommand` to do that mapping instead of using @@ -107,13 +107,13 @@ command="/opt/gitlab/embedded/service/gitlab-shell/bin/gitlab-shell username-{KE ``` Where `{KEY_ID}` is the `%i` argument passed to the script -(e.g. `aeanfjord`), and `{PRINCIPAL}` is the principal passed to it -(e.g. `sshUsers`). +(for example, `aeanfjord`), and `{PRINCIPAL}` is the principal passed to it +(for example, `sshUsers`). You need to customize the `sshUsers` part of that. It should be some principal that's guaranteed to be part of the key for all users who can log in to GitLab, or you must provide a list of principals, -one of which is present for the user, e.g.: +one of which is present for the user, for example: ```plaintext [...] @@ -131,7 +131,7 @@ principal is some "group" that's allowed to log into that server. However with GitLab it's only used to appease OpenSSH's requirement for it, we effectively only care about the "key ID" being correct. Once that's extracted GitLab enforces its own ACLs for -that user (e.g. what projects the user can access). +that user (for example, what projects the user can access). So it's OK to e.g. be overly generous in what you accept, since if the user e.g. 
has no access to GitLab at all it just errors out with a diff --git a/doc/administration/packages/container_registry.md b/doc/administration/packages/container_registry.md index a6829b90f18..74483b65c4d 100644 --- a/doc/administration/packages/container_registry.md +++ b/doc/administration/packages/container_registry.md @@ -1025,15 +1025,15 @@ You may want to add the `-m` flag to [remove untagged manifests and unreferenced Before diving in to the following sections, here's some basic troubleshooting: 1. Check to make sure that the system clock on your Docker client and GitLab server have - been synchronized (e.g. via NTP). + been synchronized (for example, via NTP). 1. If you are using an S3-backed Registry, double check that the IAM permissions and the S3 credentials (including region) are correct. See [the sample IAM policy](https://docs.docker.com/registry/storage-drivers/s3/) for more details. -1. Check the Registry logs (e.g. `/var/log/gitlab/registry/current`) and the GitLab production logs - for errors (e.g. `/var/log/gitlab/gitlab-rails/production.log`). You may be able to find clues +1. Check the Registry logs (for example `/var/log/gitlab/registry/current`) and the GitLab production logs + for errors (for example `/var/log/gitlab/gitlab-rails/production.log`). You may be able to find clues there. ### Using self-signed certificates with Container Registry @@ -1359,7 +1359,7 @@ For Omnibus installations: [image upgrade](#images-upgrade)) steps. You should [stop](https://docs.gitlab.com/omnibus/maintenance/#starting-and-stopping) the registry service before replacing its binary and start it right after. No registry configuration changes are required. - + #### Source installations For source installations, locate your `registry` binary and temporarily replace it with the one @@ -1461,7 +1461,7 @@ no errors are generated by the curl commands. 
#### Running the Docker daemon with a proxy For Docker to connect through a proxy, you must start the Docker daemon with the -proper environment variables. The easiest way is to shutdown Docker (e.g. `sudo initctl stop docker`) +proper environment variables. The easiest way is to shutdown Docker (for example `sudo initctl stop docker`) and then run Docker by hand. As root, run: ```shell diff --git a/doc/administration/packages/index.md b/doc/administration/packages/index.md index 6440fb16fc6..2c2e3fc0442 100644 --- a/doc/administration/packages/index.md +++ b/doc/administration/packages/index.md @@ -26,6 +26,7 @@ The Package Registry supports the following formats: <tr><td><a href="https://docs.gitlab.com/ee/user/packages/nuget_repository/index.html">NuGet</a></td><td>12.8+</td></tr> <tr><td><a href="https://docs.gitlab.com/ee/user/packages/pypi_repository/index.html">PyPI</a></td><td>12.10+</td></tr> <tr><td><a href="https://docs.gitlab.com/ee/user/packages/generic_packages/index.html">Generic packages</a></td><td>13.5+</td></tr> +<tr><td><a href="https://docs.gitlab.com/ee/user/packages/helm_repository/index.html">Helm Charts</a></td><td>14.1+</td></tr> </table> </div> </div> @@ -237,3 +238,17 @@ For installations from source: ```shell RAILS_ENV=production sudo -u git -H bundle exec rake gitlab:packages:migrate ``` + +You can optionally track progress and verify that all packages migrated successfully. 
+ +From the [PostgreSQL console](https://docs.gitlab.com/omnibus/settings/database.html#connecting-to-the-bundled-postgresql-database) +(`sudo gitlab-psql -d gitlabhq_production` for Omnibus GitLab), verify that `objectstg` below (where +`file_store=2`) has the count of all packages: + +```shell +gitlabhq_production=# SELECT count(*) AS total, sum(case when file_store = '1' then 1 else 0 end) AS filesystem, sum(case when file_store = '2' then 1 else 0 end) AS objectstg FROM packages_package_files; + +total | filesystem | objectstg +------+------------+----------- + 34 | 0 | 34 +``` diff --git a/doc/administration/pages/index.md b/doc/administration/pages/index.md index b9637f1b6f5..ea1e99524b8 100644 --- a/doc/administration/pages/index.md +++ b/doc/administration/pages/index.md @@ -64,7 +64,7 @@ Before proceeding with the Pages configuration, you must: 1. Configure a **wildcard DNS record**. 1. (Optional) Have a **wildcard certificate** for that domain if you decide to serve Pages under HTTPS. -1. (Optional but recommended) Enable [Shared runners](../../ci/runners/README.md) +1. (Optional but recommended) Enable [Shared runners](../../ci/runners/index.md) so that your users don't have to bring their own. 1. (Only for custom domains) Have a **secondary IP**. @@ -215,6 +215,24 @@ NOTE: `inplace_chroot` option might not work with the other features, such as [Pages Access Control](#access-control). The [GitLab Pages README](https://gitlab.com/gitlab-org/gitlab-pages#caveats) has more information about caveats and workarounds. +### Jailing mechanism disabled by default for API-based configuration + +Starting from GitLab 14.1 the [jailing/chroot mechanism is disabled by default](https://gitlab.com/gitlab-org/gitlab-pages/-/issues/589). +If you are using API-based configuration and the new [Zip storage architecture](#zip-storage) +there is nothing you need to do. 
+ +If you run into any problems, [open a new issue](https://gitlab.com/gitlab-org/gitlab-pages/-/issues/new) +and enable the jail again by setting the environment variable: + +1. Edit `/etc/gitlab/gitlab.rb`. +1. Set the `DAEMON_ENABLE_JAIL` environment variable to `true` for GitLab Pages: + + ```ruby + gitlab_pages['env']['DAEMON_ENABLE_JAIL'] = "true" + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure). + ### Global settings Below is a table of all configuration settings known to Pages in Omnibus GitLab, @@ -268,8 +286,8 @@ control over how the Pages daemon runs and serves content in your environment. | `sentry_enabled` | Enable reporting and logging with Sentry, true/false. | | `sentry_environment` | The environment for Sentry crash reporting. | | `status_uri` | The URL path for a status page, for example, `/@status`. | -| `tls_max_version` | Specifies the maximum SSL/TLS version ("ssl3", "tls1.0", "tls1.1" or "tls1.2"). | -| `tls_min_version` | Specifies the minimum SSL/TLS version ("ssl3", "tls1.0", "tls1.1" or "tls1.2"). | +| `tls_max_version` | Specifies the maximum TLS version ("tls1.2" or "tls1.3"). | +| `tls_min_version` | Specifies the minimum TLS version ("tls1.2" or "tls1.3"). | | `use_http2` | Enable HTTP2 support. | | **`gitlab_pages['env'][]`** | | | `http_proxy` | Configure GitLab Pages to use an HTTP Proxy to mediate traffic between Pages and GitLab. Sets an environment variable `http_proxy` when starting Pages daemon. | @@ -373,9 +391,13 @@ When adding a custom domain, users are required to prove they own it by adding a GitLab-controlled verification code to the DNS records for that domain. If your user base is private or otherwise trusted, you can disable the -verification requirement. Go to **Admin Area > Settings > Preferences** and -uncheck **Require users to prove ownership of custom domains** in the **Pages** section. -This setting is enabled by default. +verification requirement: + +1. 
On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > Preferences**. +1. Expand **Pages**. +1. Clear the **Require users to prove ownership of custom domains** checkbox. + This setting is enabled by default. ### Let's Encrypt integration @@ -388,9 +410,11 @@ sites served under a custom domain. To enable it, you must: 1. Choose an email address on which you want to receive notifications about expiring domains. -1. Go to your instance's **Admin Area > Settings > Preferences** and expand **Pages** settings. +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > Preferences**. +1. Expand **Pages**. 1. Enter the email address for receiving notifications and accept Let's Encrypt's Terms of Service as shown below. -1. Click **Save changes**. +1. Select **Save changes**. ![Let's Encrypt settings](img/lets_encrypt_integration_v12_1.png) @@ -442,11 +466,12 @@ The scope to use for authentication must match the GitLab Pages OAuth applicatio pre-existing applications must modify the GitLab Pages OAuth application. Follow these steps to do this: -1. Go to your instance's **Admin Area > Settings > Applications** and expand **GitLab Pages** - settings. +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > Applications**. +1. Expand **GitLab Pages**. 1. Clear the `api` scope's checkbox and select the desired scope's checkbox (for example, `read_api`). -1. Click **Save changes**. +1. Select **Save changes**. #### Disabling public access to all Pages websites @@ -460,9 +485,11 @@ This can be useful to preserve information published with Pages websites to the of your instance only. To do that: -1. Go to your instance's **Admin Area > Settings > Preferences** and expand **Pages** settings. -1. Check the **Disable public access to Pages sites** checkbox. -1. Click **Save changes**. +1. 
On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > Preferences**. +1. Expand **Pages**. +1. Select the **Disable public access to Pages sites** checkbox. +1. Select **Save changes**. WARNING: For self-managed installations, all public websites remain private until they are @@ -635,30 +662,27 @@ Follow the steps below to configure the proxy listener of GitLab Pages. 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure). -## Set maximum pages size - -You can configure the maximum size of the unpacked archive per project in -**Admin Area > Settings > Preferences > Pages**, in **Maximum size of pages (MB)**. -The default is 100MB. - -### Override maximum pages size per project or group **(PREMIUM SELF)** +## Override maximum pages size per project or group **(PREMIUM SELF)** > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/16610) in GitLab 12.7. NOTE: -Only GitLab admin users are able to view and override the **Maximum size of Pages** setting. +Only GitLab administrators are able to view and override the **Maximum size of Pages** setting. To override the global maximum pages size for a specific project: -1. Go to your project's **Settings > Pages** page. -1. Edit the **Maximum size of pages**. -1. Click **Save changes**. +1. On the top bar, select **Menu > Projects** and find your project. +1. On the left sidebar, select **Settings > Pages**. +1. Enter a value under **Maximum size of pages** in MB. +1. Select **Save changes**. To override the global maximum pages size for a specific group: -1. Go to your group's **Settings > General** page and expand **Pages**. -1. Edit the **Maximum size of pages**. -1. Click **Save changes**. +1. On the top bar, select **Menu > Groups** and find your group. +1. On the left sidebar, select **Settings > General**. +1. Expand **Pages**. +1. Enter a value under **Maximum size of pages** in MB. +1. Select **Save changes**. 
## Running GitLab Pages on a separate server @@ -690,23 +714,14 @@ database encryption. Proceed with caution. gitlab_pages['access_control'] = true ``` +1. Configure [the object storage and migrate pages data to it](#using-object-storage). + 1. [Reconfigure the **GitLab server**](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. The `gitlab-secrets.json` file is now updated with the new configuration. 1. Set up a new server. This becomes the **Pages server**. -1. Create an [NFS share](../nfs.md) - on the **Pages server** and configure this share to - allow access from your main **GitLab server**. - Note that the example there is more general and - shares several sub-directories from `/home` to several `/nfs/home` mount points. - For our Pages-specific example here, we instead share only the - default GitLab Pages folder `/var/opt/gitlab/gitlab-rails/shared/pages` - from the **Pages server** and we mount it to `/mnt/pages` - on the **GitLab server**. - Therefore, omit "Step 4" there. - 1. On the **Pages server**, install Omnibus GitLab and modify `/etc/gitlab/gitlab.rb` to include: @@ -725,7 +740,7 @@ database encryption. Proceed with caution. ``` 1. Copy the `/etc/gitlab/gitlab-secrets.json` file from the **GitLab server** - to the **Pages server**, for example via the NFS share. + to the **Pages server**. ```shell # On the GitLab server @@ -743,7 +758,6 @@ database encryption. Proceed with caution. pages_external_url "http://<pages_server_URL>" gitlab_pages['enable'] = false pages_nginx['enable'] = false - gitlab_rails['pages_path'] = "/mnt/pages" ``` 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. @@ -797,7 +811,7 @@ To explicitly enable API source: 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. -Or if you want to use legacy confiration source you can: +Or if you want to use legacy configuration source you can: 1. 
Add the following to your `/etc/gitlab/gitlab.rb` file: @@ -929,7 +943,7 @@ In installations from source: In GitLab 14.0 the underlying storage format of GitLab Pages is changing from files stored directly in disk to a single ZIP archive per project. -These ZIP archives can be stored either locally on disk storage or on the [object storage](#using-object-storage) if it is configured. +These ZIP archives can be stored either locally on disk storage or on [object storage](#using-object-storage) if it is configured. [Starting from GitLab 13.5](https://gitlab.com/gitlab-org/gitlab/-/issues/245308) ZIP archives are stored every time pages site is updated. @@ -991,9 +1005,8 @@ to using that. > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/325285) in GitLab 13.11. -Existing Pages deployments objects (which store [ZIP archives](#zip-storage)) can similarly be -migrated to [object storage](#using-object-storage), if -you've been having them stored locally. +Existing Pages deployment objects (which store [ZIP archives](#zip-storage)) can similarly be +migrated to [object storage](#using-object-storage). Migrate your existing Pages deployments from local storage to object storage: @@ -1003,7 +1016,7 @@ sudo gitlab-rake gitlab:pages:deployments:migrate_to_object_storage ### Rolling Pages deployments back to local storage -After the migration to object storage is performed, you can choose to revert your Pages deployments back to local storage: +After the migration to object storage is performed, you can choose to move your Pages deployments back to local storage: ```shell sudo gitlab-rake gitlab:pages:deployments:migrate_to_local @@ -1013,7 +1026,7 @@ sudo gitlab-rake gitlab:pages:deployments:migrate_to_local > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/301159) in GitLab 13.11. -If you use [object storage](#using-object-storage), disable local storage: +If you use [object storage](#using-object-storage), you can disable local storage: 1. 
Edit `/etc/gitlab/gitlab.rb`: @@ -1027,22 +1040,22 @@ Starting from GitLab 13.12, this setting also disables the [legacy storage](#mig ## Migrate GitLab Pages to 14.0 -In GitLab 14.0 a number of breaking changes are introduced which may require some user intervention. +In GitLab 14.0 a number of breaking changes were introduced which may require some user intervention. The steps below describe the best way to migrate without causing any downtime for your GitLab instance. -If you run GitLab on a single server, then most likely you will not notice any problem after -upgrading to GitLab 14.0, but it may be safer to follow the steps anyway. -If you run GitLab on a single server, then most likely the upgrade process to 14.0 will go smoothly for you. Regardless, we recommend everyone follow the migration steps to ensure a successful upgrade. +If you run GitLab on a single server, then most likely the upgrade process to 14.0 will go smoothly for you +and you will not notice any problem after upgrading. +Regardless, we recommend everyone follow the migration steps to ensure a successful upgrade. If at any point you run into issues, consult the [troubleshooting section](#troubleshooting). -To migrate GitLab Pages to GitLab 14.0: +If your current GitLab version is lower than 13.12, then you first need to update to 13.12. +Updating directly to 14.0 is [not supported](../../update/index.md#upgrade-paths) +and may cause downtime for some web-sites hosted on GitLab Pages. Once you update to 13.12, +migrate GitLab Pages to prepare them for GitLab 14.0: -1. If your current GitLab version is lower than 13.12, then you first need to upgrade to 13.12. -Upgrading directly to 14.0 may cause downtime for some web-sites hosted on GitLab Pages -until you finish the following steps. 1. Set [`domain_config_source` to `gitlab`](#domain-source-configuration-before-140), which is the default starting from GitLab 14.0. Skip this step if you're already running GitLab 14.0 or above. -1. 
If you want to store your pages content in the [object storage](#using-object-storage), make sure to configure it. +1. If you want to store your pages content in [object storage](#using-object-storage), make sure to configure it. If you want to store the pages content locally or continue using an NFS server, skip this step. 1. [Migrate legacy storage to ZIP storage.](#migrate-legacy-storage-to-zip-storage) 1. Upgrade GitLab to 14.0. @@ -1126,18 +1139,44 @@ open /opt/gitlab/embedded/ssl/certs/cacert.pem: no such file or directory x509: certificate signed by unknown authority ``` -The reason for those errors is that the files `resolv.conf` and `ca-bundle.pem` are missing inside the `chroot`. -The fix is to copy the host's `/etc/resolv.conf` and the GitLab certificate bundle inside the `chroot`: +The reason for those errors is that the files `resolv.conf`, `/etc/hosts/`, `/etc/nsswitch.conf` and `ca-bundle.pem` are missing inside the `chroot`. +The fix is to copy these files inside the `chroot`: ```shell sudo mkdir -p /var/opt/gitlab/gitlab-rails/shared/pages/etc/ssl sudo mkdir -p /var/opt/gitlab/gitlab-rails/shared/pages/opt/gitlab/embedded/ssl/certs/ -sudo cp /etc/resolv.conf /var/opt/gitlab/gitlab-rails/shared/pages/etc +sudo cp /etc/resolv.conf /var/opt/gitlab/gitlab-rails/shared/pages/etc/ +sudo cp /etc/hosts /var/opt/gitlab/gitlab-rails/shared/pages/etc/ +sudo cp /etc/nsswitch.conf /var/opt/gitlab/gitlab-rails/shared/pages/etc/ sudo cp /opt/gitlab/embedded/ssl/certs/cacert.pem /var/opt/gitlab/gitlab-rails/shared/pages/opt/gitlab/embedded/ssl/certs/ sudo cp /opt/gitlab/embedded/ssl/certs/cacert.pem /var/opt/gitlab/gitlab-rails/shared/pages/etc/ssl/ca-bundle.pem ``` +### `unsupported protocol scheme \"\""` + +If you see the following error: + +```plaintext +{"error":"failed to connect to internal Pages API: Get \"/api/v4/internal/pages/status\": unsupported protocol scheme \"\"","level":"warning","msg":"attempted to connect to the 
API","time":"2021-06-23T20:03:30Z"} +``` + +It means you didn't set the HTTP(S) protocol scheme in the Pages server settings. +To fix it: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + gitlab_pages['gitlab_server'] = "https://<your_pages_domain_name>" + gitlab_pages['internal_gitlab_server'] = "https://<your_pages_domain_name>" + ``` + +1. Reconfigure GitLab: + + ```shell + sudo gitlab-ctl reconfigure + ``` + ### 502 error when connecting to GitLab Pages proxy when server does not listen over IPv6 In some cases, NGINX might default to using IPv6 to connect to the GitLab Pages @@ -1256,9 +1295,14 @@ Upgrading to an [officially supported operating system](https://about.gitlab.com ### The requested scope is invalid, malformed, or unknown -This problem comes from the permissions of the GitLab Pages OAuth application. To fix it, go to -**Admin > Applications > GitLab Pages** and edit the application. Under **Scopes**, ensure that the -`api` scope is selected and save your changes. +This problem comes from the permissions of the GitLab Pages OAuth application. To fix it: + +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Applications > GitLab Pages**. +1. Edit the application. +1. Under **Scopes**, ensure that the `api` scope is selected. +1. Save your changes. + When running a [separate Pages server](#running-gitlab-pages-on-a-separate-server), this setting needs to be configured on the main GitLab server. @@ -1316,6 +1360,24 @@ To enable disk access: 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure). 
+### `httprange: new resource 403` + +If you see an error similar to: + +```plaintext +{"error":"httprange: new resource 403: \"403 Forbidden\"","host":"root.pages.example.com","level":"error","msg":"vfs.Root","path":"/pages1/","time":"2021-06-10T08:45:19Z"} +``` + +And you run pages on the separate server syncing files via NFS, it may mean that +the shared pages directory is mounted on a different path on the main GitLab server and the +GitLab Pages server. + +In that case, it's highly recommended you to configure +[object storage and migrate any existing pages data to it](#using-object-storage). + +Alternatively, you can mount the GitLab Pages shared directory to the same path on +both servers. + ### GitLab Pages doesn't work after upgrading to GitLab 14.0 or above GitLab 14.0 introduces a number of changes to GitLab Pages which may require manual intervention. @@ -1323,6 +1385,12 @@ GitLab 14.0 introduces a number of changes to GitLab Pages which may require man 1. Firstly [follow the migration guide](#migrate-gitlab-pages-to-140). 1. If it doesn't work, see [GitLab Pages logs](#how-to-see-gitlab-pages-logs), and if you see any errors there then search them on this page. +The most common problem is when using [`inplace_chroot`](#dial-tcp-lookup-gitlabexamplecom-and-x509-certificate-signed-by-unknown-authority). + +NOTE: +Starting from 14.1, the chroot/jailing mechanism is +[disabled by default for API-based configuration](#jailing-mechanism-disabled-by-default-for-api-based-configuration). + WARNING: As the last resort you can temporarily enable legacy storage and configuration mechanisms. Support for them [will be removed in GitLab 14.3](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6166), so GitLab Pages will stop working if don't resolve the underlying issue. 
diff --git a/doc/administration/pages/source.md b/doc/administration/pages/source.md index f1c3b515f68..4aaf430db97 100644 --- a/doc/administration/pages/source.md +++ b/doc/administration/pages/source.md @@ -61,7 +61,7 @@ Before proceeding with the Pages configuration, make sure that: Pages artifacts. 1. (Optional) You have a **wildcard certificate** for the Pages domain if you decide to serve Pages (`*.example.io`) under HTTPS. -1. (Optional but recommended) You have configured and enabled the [shared runners](../../ci/runners/README.md) +1. (Optional but recommended) You have configured and enabled the [shared runners](../../ci/runners/index.md) so that your users don't have to bring their own. ### DNS configuration @@ -443,9 +443,14 @@ are stored. ## Set maximum Pages size -The maximum size of the unpacked archive per project can be configured in -**Admin Area > Settings > Preferences > Pages**, in **Maximum size of pages (MB)**. -The default is 100MB. +The default for the maximum size of unpacked archives per project is 100 MB. + +To change this value: + +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > Preferences**. +1. Expand **Pages**. +1. Update the value for **Maximum size of pages (MB)**. ## Backup diff --git a/doc/administration/polling.md b/doc/administration/polling.md index d3f558eeaaa..ec5d6cd45d8 100644 --- a/doc/administration/polling.md +++ b/doc/administration/polling.md @@ -7,7 +7,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Polling configuration **(FREE SELF)** The GitLab UI polls for updates for different resources (issue notes, issue -titles, pipeline statuses, etc.) on a schedule appropriate to the resource. +titles, pipeline statuses, and so on) on a schedule appropriate to the resource. 
To configure the polling interval multiplier: diff --git a/doc/administration/postgresql/pgbouncer.md b/doc/administration/postgresql/pgbouncer.md index e481fcb71f4..4f9056b9b50 100644 --- a/doc/administration/postgresql/pgbouncer.md +++ b/doc/administration/postgresql/pgbouncer.md @@ -52,6 +52,20 @@ This content has been moved to a [new location](replication_and_failover.md#conf } ``` + You can pass additional configuration parameters per database, for example: + + ```ruby + pgbouncer['databases'] = { + gitlabhq_production: { + ... + pool_mode: 'transaction' + } + } + ``` + + Use these parameters with caution. For the complete list of parameters refer to the + [PgBouncer documentation](https://www.pgbouncer.org/config.html#section-databases). + 1. Run `gitlab-ctl reconfigure` 1. On the node running Puma, make sure the following is set in `/etc/gitlab/gitlab.rb` diff --git a/doc/administration/postgresql/replication_and_failover.md b/doc/administration/postgresql/replication_and_failover.md index b6d2e36851d..d1dd233f08b 100644 --- a/doc/administration/postgresql/replication_and_failover.md +++ b/doc/administration/postgresql/replication_and_failover.md @@ -97,8 +97,8 @@ This is why you will need: - IP address of each nodes network interface. This can be set to `0.0.0.0` to listen on all interfaces. It cannot be set to the loopback address `127.0.0.1`. -- Network Address. This can be in subnet (i.e. `192.168.0.0/255.255.255.0`) - or CIDR (i.e. `192.168.0.0/24`) form. +- Network Address. This can be in subnet (that is, `192.168.0.0/255.255.255.0`) + or CIDR (that is, `192.168.0.0/24`) form. #### Consul information @@ -157,6 +157,13 @@ We will need the following password information for the application's database u sudo gitlab-ctl pg-password-md5 POSTGRESQL_USERNAME ``` +#### Patroni information + +We will need the following password information for the Patroni API: + +- `PATRONI_API_USERNAME`. A username for basic auth to the API +- `PATRONI_API_PASSWORD`. 
A password for basic auth to the API + #### PgBouncer information When using default setup, minimum configuration requires: @@ -236,6 +243,11 @@ postgresql['sql_replication_password'] = 'POSTGRESQL_REPLICATION_PASSWORD_HASH' # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = 'POSTGRESQL_PASSWORD_HASH' +# Replace PATRONI_API_USERNAME with a username for Patroni Rest API calls (use the same username in all nodes) +patroni['username'] = 'PATRONI_API_USERNAME' +# Replace PATRONI_API_PASSWORD with a password for Patroni Rest API calls (use the same password in all nodes) +patroni['password'] = 'PATRONI_API_PASSWORD' + # Sets `max_replication_slots` to double the number of database nodes. # Patroni uses one extra slot per node when initiating the replication. patroni['postgresql']['max_replication_slots'] = X @@ -246,7 +258,7 @@ patroni['postgresql']['max_replication_slots'] = X patroni['postgresql']['max_wal_senders'] = X+1 # Replace XXX.XXX.XXX.XXX/YY with Network Address -postgresql['trust_auth_cidr_addresses'] = %w(XXX.XXX.XXX.XXX/YY) +postgresql['trust_auth_cidr_addresses'] = %w(XXX.XXX.XXX.XXX/YY 127.0.0.1/32) # Replace placeholders: # @@ -259,8 +271,8 @@ consul['configuration'] = { # END user configuration ``` -You do not need an additional or different configuration for replica nodes. As a matter of fact, you don't have to have -a predetermined primary node. Therefore all database nodes use the same configuration. +All database nodes use the same configuration. The leader node is not determined in configuration, +and there is no additional or different configuration for either leader or replica nodes. Once the configuration of a node is done, you must [reconfigure Omnibus GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) on each node for the changes to take effect. 
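The slot arithmetic in the configuration comments above (`max_replication_slots` is double the number of database nodes, `max_wal_senders` is one more than that) can be sketched in shell. This is an illustration only: `patroni_slot_settings` is a hypothetical helper, not part of Omnibus GitLab, and it only prints lines that you would then add to `/etc/gitlab/gitlab.rb` by hand.

```shell
# Hypothetical helper: print the two Patroni settings for a given node count.
# $1 = number of database nodes in the cluster
patroni_slot_settings() {
  slot_count=$(( $1 * 2 ))            # double the number of database nodes
  sender_count=$(( slot_count + 1 ))  # one more than the replication slots
  printf "patroni['postgresql']['max_replication_slots'] = %s\n" "$slot_count"
  printf "patroni['postgresql']['max_wal_senders'] = %s\n" "$sender_count"
}

patroni_slot_settings 3
```

For three nodes this prints 6 and 7, which matches the values used in the example configurations.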
@@ -555,10 +567,12 @@ gitlab_rails['auto_migrate'] = false postgresql['pgbouncer_user_password'] = '771a8625958a529132abe6f1a4acb19c' postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f' +patroni['username'] = 'PATRONI_API_USERNAME' +patroni['password'] = 'PATRONI_API_PASSWORD' patroni['postgresql']['max_replication_slots'] = 6 patroni['postgresql']['max_wal_senders'] = 7 -postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16) +postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16 127.0.0.1/32) # Configure the Consul agent consul['services'] = %w(postgresql) @@ -642,12 +656,15 @@ postgresql['sql_user_password'] = '450409b85a0223a214b5fb1484f34d0f' # Patroni uses one extra slot per node when initiating the replication. patroni['postgresql']['max_replication_slots'] = 6 +patroni['username'] = 'PATRONI_API_USERNAME' +patroni['password'] = 'PATRONI_API_PASSWORD' + # Set `max_wal_senders` to one more than the number of replication slots in the cluster. # This is used to prevent replication from using up all of the # available database connections. patroni['postgresql']['max_wal_senders'] = 7 -postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16) +postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/16 127.0.0.1/32) consul['configuration'] = { server: true, @@ -721,6 +738,97 @@ functional or does not have a leader, Patroni and by extension PostgreSQL will n API which can be accessed via its [default port](https://docs.gitlab.com/omnibus/package-information/defaults.html#patroni) on each node. 
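A minimal sketch of querying that API from the shell follows. It assumes Patroni answers over plain HTTP on port 8008 (the upstream Patroni default; check the linked defaults page for your package) and that the read-only `/patroni` status endpoint is reachable from where you run the command. `patroni_url` and `patroni_status` are hypothetical helper names, and `NODE_ADDRESS` is a placeholder.

```shell
# Build the URL of a Patroni REST API endpoint.
# $1 = node address, $2 = port (assumed default 8008),
# $3 = endpoint (assumed default "patroni")
patroni_url() {
  printf 'http://%s:%s/%s\n' "$1" "${2:-8008}" "${3:-patroni}"
}

# Fetch the JSON status of one node; --fail makes curl exit non-zero
# when the server answers with an HTTP error.
patroni_status() {
  curl --silent --fail "$(patroni_url "$@")"
}

# Example (not run here): patroni_status NODE_ADDRESS
```

Keeping URL construction separate from the `curl` call makes it easy to confirm the address being queried before anything is sent over the network.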
+### Check replication status + +Run `gitlab-ctl patroni members` to query Patroni for a summary of the cluster status: + +```plaintext ++ Cluster: postgresql-ha (6970678148837286213) ------+---------+---------+----+-----------+ +| Member | Host | Role | State | TL | Lag in MB | ++-------------------------------------+--------------+---------+---------+----+-----------+ +| gitlab-database-1.example.com | 172.18.0.111 | Replica | running | 5 | 0 | +| gitlab-database-2.example.com | 172.18.0.112 | Replica | running | 5 | 100 | +| gitlab-database-3.example.com | 172.18.0.113 | Leader | running | 5 | | ++-------------------------------------+--------------+---------+---------+----+-----------+ +``` + +To verify the status of replication: + +```shell +echo 'select * from pg_stat_wal_receiver\x\g\x \n select * from pg_stat_replication\x\g\x' | gitlab-psql +``` + +The same command can be run on all three database servers, and will return any information +about replication available depending on the role the server is performing. + +The leader should return one record per replica: + +```sql +-[ RECORD 1 ]----+------------------------------ +pid | 371 +usesysid | 16384 +usename | gitlab_replicator +application_name | gitlab-database-1.example.com +client_addr | 172.18.0.111 +client_hostname | +client_port | 42900 +backend_start | 2021-06-14 08:01:59.580341+00 +backend_xmin | +state | streaming +sent_lsn | 0/EA13220 +write_lsn | 0/EA13220 +flush_lsn | 0/EA13220 +replay_lsn | 0/EA13220 +write_lag | +flush_lag | +replay_lag | +sync_priority | 0 +sync_state | async +reply_time | 2021-06-18 19:17:14.915419+00 +``` + +Investigate further if: + +- There are missing or extra records. +- `reply_time` is not current. + +The `lsn` fields relate to which write-ahead-log segments have been replicated. 
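To compare two of these values by hand, it helps to know that an LSN such as `0/EA13220` is a 64-bit byte position written as two hexadecimal halves: the high 32 bits, a slash, then the low 32 bits. The sketch below converts LSNs to byte positions and subtracts them. `lsn_to_bytes` and `lsn_lag_bytes` are hypothetical helper names; on the database itself, PostgreSQL's `pg_wal_lsn_diff()` function performs the same subtraction.

```shell
# Convert an LSN of the form HIGH/LOW (both hexadecimal) to a byte position.
lsn_to_bytes() {
  lsn_high="${1%%/*}"   # text before the slash
  lsn_low="${1##*/}"    # text after the slash
  echo $(( (0x$lsn_high << 32) + 0x$lsn_low ))
}

# Number of bytes by which the first LSN is ahead of the second.
lsn_lag_bytes() {
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# In the leader record above, sent_lsn and replay_lsn are both 0/EA13220,
# so the difference is 0 and that replica has replayed everything sent to it.
lsn_lag_bytes 0/EA13220 0/EA13220
```

A positive result means the first LSN is further along in the write-ahead log than the second.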
+Run the following on the leader to find out the current LSN: + +```shell +echo 'SELECT pg_current_wal_lsn();' | gitlab-psql +``` + +If a replica is not in sync, `gitlab-ctl patroni members` indicates the volume +of missing data, and the `lag` fields indicate the elapsed time. + +Read more about the data returned by the leader +[in the PostgreSQL documentation](https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-REPLICATION-VIEW), +including other values for the `state` field. + +The replicas should return: + +```sql +-[ RECORD 1 ]---------+------------------------------------------------------------------------------------------------- +pid | 391 +status | streaming +receive_start_lsn | 0/D000000 +receive_start_tli | 5 +received_lsn | 0/EA13220 +received_tli | 5 +last_msg_send_time | 2021-06-18 19:16:54.807375+00 +last_msg_receipt_time | 2021-06-18 19:16:54.807512+00 +latest_end_lsn | 0/EA13220 +latest_end_time | 2021-06-18 19:07:23.844879+00 +slot_name | gitlab-database-1.example.com +sender_host | 172.18.0.113 +sender_port | 5432 +conninfo | user=gitlab_replicator host=172.18.0.113 port=5432 application_name=gitlab-database-1.example.com +``` + +Read more about the data returned by the replica +[in the PostgreSQL documentation](https://www.postgresql.org/docs/12/monitoring-stats.html#PG-STAT-WAL-RECEIVER-VIEW). + ### Selecting the appropriate Patroni replication method [Review the Patroni documentation carefully](https://patroni.readthedocs.io/en/latest/SETTINGS.html#postgresql) @@ -1017,6 +1125,134 @@ postgresql['trust_auth_cidr_addresses'] = %w(123.123.123.123/32 <other_cidrs>) [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. +### Reinitialize a replica + +If replication is not occurring, it may be necessary to reinitialize a replica. + +1. On any server in the cluster, determine the Cluster and Member names, + and check the replication lag by running `gitlab-ctl patroni members`. 
Here is an example: + + ```plaintext + + Cluster: postgresql-ha (6970678148837286213) ------+---------+---------+----+-----------+ + | Member | Host | Role | State | TL | Lag in MB | + +-------------------------------------+--------------+---------+---------+----+-----------+ + | gitlab-database-1.example.com | 172.18.0.111 | Replica | running | 5 | 0 | + | gitlab-database-2.example.com | 172.18.0.112 | Replica | running | 5 | 100 | + | gitlab-database-3.example.com | 172.18.0.113 | Leader | running | 5 | | + +-------------------------------------+--------------+---------+---------+----+-----------+ + ``` + +1. Reinitialize the affected replica server: + + ```plaintext + gitlab-ctl patroni reinitialize-replica postgresql-ha gitlab-database-2.example.com + ``` + +### Reset the Patroni state in Consul + +WARNING: +This is a destructive process and may lead the cluster into a bad state. Make sure that you have a healthy backup before running this process. + +As a last resort, if your Patroni cluster is in an unknown/bad state and no node can start, you can +reset the Patroni state in Consul completely, resulting in a reinitialized Patroni cluster when +the first Patroni node starts. + +To reset the Patroni state in Consul: + +1. Take note of the Patroni node that was the leader, or that the application thinks is the current leader, if the current state shows more than one, or none. One way to do this is to look on the PgBouncer nodes in `/var/opt/gitlab/consul/databases.ini`, which contains the hostname of the current leader. +1. Stop Patroni on all nodes: + + ```shell + sudo gitlab-ctl stop patroni + ``` + +1. Reset the state in Consul: + + ```shell + /opt/gitlab/embedded/bin/consul kv delete -recurse /service/postgresql-ha/ + ``` + +1. Start one Patroni node, which will initialize the Patroni cluster and be elected as a leader. 
+ It's highly recommended to start the previous leader (noted in the first step), + in order to not lose existing writes that may have not been replicated because + of the broken cluster state: + + ```shell + sudo gitlab-ctl start patroni + ``` + +1. Start all other Patroni nodes that will join the Patroni cluster as replicas: + + ```shell + sudo gitlab-ctl start patroni + ``` + +If you are still seeing issues, the next step is restoring the last healthy backup. + +### Errors in the Patroni log about a `pg_hba.conf` entry for `127.0.0.1` + +The following log entry in the Patroni log indicates the replication is not working +and a configuration change is needed: + +```plaintext +FATAL: no pg_hba.conf entry for replication connection from host "127.0.0.1", user "gitlab_replicator" +``` + +To fix the problem, ensure the loopback interface is included in the CIDR addresses list: + +1. Edit `/etc/gitlab/gitlab.rb`: + + ```ruby + postgresql['trust_auth_cidr_addresses'] = %w(<other_cidrs> 127.0.0.1/32) + ``` + +1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect. +1. Check that [all the replicas are synchronized](#check-replication-status) + +### Errors in Patroni logs: the requested start point is ahead of the WAL flush position + +This error indicates that the database is not replicating: + +```plaintext +FATAL: could not receive data from WAL stream: ERROR: requested starting point 0/5000000 is ahead of the WAL flush position of this server 0/4000388 +``` + +This example error is from a replica that was initially misconfigured, and had never replicated. + +Fix it [by reinitializing the replica](#reinitialize-a-replica). + +### Patroni fails to start with `MemoryError` + +Patroni may fail to start, logging an error and stack trace: + +```plaintext +MemoryError +Traceback (most recent call last): + File "/opt/gitlab/embedded/bin/patroni", line 8, in <module> + sys.exit(main()) +[..] 
+ File "/opt/gitlab/embedded/lib/python3.7/ctypes/__init__.py", line 273, in _reset_cache + CFUNCTYPE(c_int)(lambda: None) +``` + +If the stack trace ends with `CFUNCTYPE(c_int)(lambda: None)`, this code triggers `MemoryError` +if the Linux server has been hardened for security. + +The code causes Python to write temporary executable files, and if it cannot find a filesystem +in which to do this, for example if `noexec` is set on the `/tmp` filesystem, it fails with +`MemoryError` ([read more in the issue](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6184)). + +Workarounds: + +- Remove `noexec` from the mount options for filesystems like `/tmp` and `/var/tmp`. +- If set to enforcing, SELinux may also prevent these operations. Verify the issue is fixed by setting + SELinux to permissive. + +Omnibus GitLab has shipped with Patroni since 13.1 along with a build of Python 3.7. +Workarounds should stop being required when GitLab 14.x starts shipping with +[a later version of Python](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/6164) as +the code which causes this was removed from Python 3.8. + ### Issues with other components If you're running into an issue with a component not outlined here, be sure to check the troubleshooting section of their specific documentation page: diff --git a/doc/administration/raketasks/check.md b/doc/administration/raketasks/check.md index f7c91aa6b47..bcc2f838565 100644 --- a/doc/administration/raketasks/check.md +++ b/doc/administration/raketasks/check.md @@ -278,11 +278,11 @@ To delete these references to missing local artifacts (`job.log` files): puts "#{artifact.id} #{artifact.file.path} is missing." ### Allow verification before destroy # artifact.destroy! 
### Uncomment to actually destroy end - puts "Count of identified/destroyed invalid references: #{artifacts_deleted}" + puts "Count of identified/destroyed invalid references: #{artifacts_deleted}" ``` ### Delete references to missing LFS objects If `gitlab-rake gitlab:lfs:check VERBOSE=1` detects LFS objects that exist in the database -but not on disk, [follow the procedure in the LFS documentation](../../topics/git/lfs/index.md#missing-lfs-objects) +but not on disk, [follow the procedure in the LFS documentation](../lfs/index.md#missing-lfs-objects) to remove the database entries. diff --git a/doc/administration/raketasks/maintenance.md b/doc/administration/raketasks/maintenance.md index fa95f38f37c..5ddab999efe 100644 --- a/doc/administration/raketasks/maintenance.md +++ b/doc/administration/raketasks/maintenance.md @@ -67,7 +67,7 @@ GitLab Shell path: /opt/gitlab/embedded/service/gitlab-shell ## Show GitLab license information **(PREMIUM SELF)** > - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/20501) in GitLab 12.6. -> - [Moved](../../subscriptions/bronze_starter.md) to GitLab Premium in 13.9. +> - Moved to GitLab Premium in 13.9. This command shows information about your [GitLab license](../../user/admin_area/license.md) and how many seats are used. It is only available on GitLab Enterprise @@ -330,7 +330,7 @@ migrations are completed (have an `up` status). ## Rebuild database indexes WARNING: -This is an experimental feature that isn't enabled by default. +This is an experimental feature that isn't enabled by default. It requires PostgreSQL 12 or later. Database indexes can be rebuilt regularly to reclaim space and maintain healthy levels of index bloat over time. @@ -348,7 +348,6 @@ sudo gitlab-rake gitlab:db:reindex['public.a_specific_index'] The following index types are not supported: -1. Unique and primary key indexes 1. Indexes used for constraint exclusion 1. Partitioned indexes 1. 
Expression indexes diff --git a/doc/administration/redis/replication_and_failover.md b/doc/administration/redis/replication_and_failover.md index 9fde91903e8..37d586b1d32 100644 --- a/doc/administration/redis/replication_and_failover.md +++ b/doc/administration/redis/replication_and_failover.md @@ -646,6 +646,7 @@ persistence classes. | `queues` | Store Sidekiq background jobs. | | `shared_state` | Store session-related and other persistent data. | | `actioncable` | Pub/Sub queue backend for ActionCable. | +| `trace_chunks` | Store [CI trace chunks](../job_logs.md#enable-or-disable-incremental-logging) data. | To make this work with Sentinel: @@ -657,6 +658,7 @@ To make this work with Sentinel: gitlab_rails['redis_queues_instance'] = REDIS_QUEUES_URL gitlab_rails['redis_shared_state_instance'] = REDIS_SHARED_STATE_URL gitlab_rails['redis_actioncable_instance'] = REDIS_ACTIONCABLE_URL + gitlab_rails['redis_trace_chunks_instance'] = REDIS_TRACE_CHUNKS_URL # Configure the Sentinels gitlab_rails['redis_cache_sentinels'] = [ @@ -675,6 +677,10 @@ To make this work with Sentinel: { host: ACTIONCABLE_SENTINEL_HOST, port: 26379 }, { host: ACTIONCABLE_SENTINEL_HOST2, port: 26379 } ] + gitlab_rails['redis_trace_chunks_sentinels'] = [ + { host: TRACE_CHUNKS_SENTINEL_HOST, port: 26379 }, + { host: TRACE_CHUNKS_SENTINEL_HOST2, port: 26379 } + ] ``` Note that: diff --git a/doc/administration/redis/replication_and_failover_external.md b/doc/administration/redis/replication_and_failover_external.md index 141da2f79ec..65ec8eb50e5 100644 --- a/doc/administration/redis/replication_and_failover_external.md +++ b/doc/administration/redis/replication_and_failover_external.md @@ -73,7 +73,7 @@ requirements: instead of a socket. To configure Redis to use TCP connections you need to define both `bind` and `port` in the Redis configuration file. You can bind to all interfaces (`0.0.0.0`) or specify the IP of the desired interface - (e.g., one from an internal network). 
+ (for example, one from an internal network). - Since Redis 3.2, you must define a password to receive external connections (`requirepass`). - If you are using Redis with Sentinel, you also need to define the same diff --git a/doc/administration/reference_architectures/10k_users.md b/doc/administration/reference_architectures/10k_users.md index 4627b27a45e..1fc3483fbd4 100644 --- a/doc/administration/reference_architectures/10k_users.md +++ b/doc/administration/reference_architectures/10k_users.md @@ -94,7 +94,6 @@ cloud "**Object Storage**" as object_storage #white elb -[#6a9be7]-> gitlab elb -[#6a9be7]--> monitor -gitlab -[#32CD32]> sidekiq gitlab -[#32CD32]--> ilb gitlab -[#32CD32]-> object_storage gitlab -[#32CD32]---> redis @@ -598,8 +597,12 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '<postgresql_password_hash>' + # Set up basic authentication for the Patroni API (use the same username/password in all nodes). + patroni['username'] = '<patroni_api_username>' + patroni['password'] = '<patroni_api_password>' + # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -802,7 +805,7 @@ Managed Redis from cloud providers (such as AWS ElastiCache) will work. If these services support high availability, be sure it _isn't_ of the Redis Cluster type. Redis version 5.0 or higher is required, which is included with Omnibus GitLab packages starting with GitLab 13.0. Older Redis versions don't support an -optional count argument to SPOP, which is required for [Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). 
+optional count argument to SPOP, which is required for [Merge Trains](../../ci/pipelines/merge_trains.md). Note the Redis node's IP address or hostname, port, and password (if required). These will be necessary later when configuring the [GitLab application servers](#configure-gitlab-rails). @@ -1403,7 +1406,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>" # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1605,7 +1608,7 @@ To configure the Praefect nodes, on each one: 1. Praefect requires to run some database migrations, much like the main GitLab application. For this you should select **one Praefect node only to run the migrations**, AKA the _Deploy Node_. This node must be configured first before the others as follows: - + 1. In the `/etc/gitlab/gitlab.rb` file, change the `praefect['auto_migrate']` setting value from `false` to `true` 1. To ensure database migrations are only run during reconfigure and not automatically on upgrade, run: @@ -1613,7 +1616,7 @@ To configure the Praefect nodes, on each one: ```shell sudo touch /etc/gitlab/skip-auto-reconfigure ``` - + 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect and to run the Praefect database migrations. @@ -1681,7 +1684,7 @@ On each node: # balancer. gitlab_rails['internal_api_url'] = 'https://gitlab.example.com' - # Gitaly + # Gitaly gitaly['enable'] = true # Make Gitaly accept connections on all network interfaces. You must use @@ -2344,10 +2347,13 @@ to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pag See how to [configure NFS](../nfs.md). 
WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. -[Read more about Gitaly and NFS](../gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. + +Read: + +- The [Gitaly and NFS deprecation notice](../gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). <div align="right"> <a type="button" class="btn btn-default" href="#setup-components"> @@ -2365,9 +2371,9 @@ the following other supporting services are supported: NGINX, Task Runner, Migra Prometheus and Grafana. Hybrid installations leverage the benefits of both cloud native and traditional -Kubernetes, you can reap certain cloud native workload management benefits while -the others are deployed in compute VMs with Omnibus as described above in this -page. +compute deployments. With this, _stateless_ components can benefit from cloud native +workload management benefits while _stateful_ components are deployed in compute VMs +with Omnibus to benefit from increased permanence. NOTE: This is an **advanced** setup. Running services in Kubernetes is well known @@ -2389,7 +2395,7 @@ future with further specific cloud provider details. 
|-------------------------------------------------------|----------|-------------------------|------------------|-----------------------------| | Webservice | 4 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | 127.5 vCPU, 118 GB memory | | Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | 15.5 vCPU, 50 GB memory | -| Supporting services such as NGINX, Prometheus, etc. | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory | +| Supporting services such as NGINX or Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory | <!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> <!-- markdownlint-disable MD029 --> @@ -2478,7 +2484,6 @@ elb -[#6a9be7]-> gitlab elb -[#6a9be7]-> monitor elb -[hidden]-> support -gitlab -[#32CD32]> sidekiq gitlab -[#32CD32]--> ilb gitlab -[#32CD32]-> object_storage gitlab -[#32CD32]---> redis @@ -2532,7 +2537,7 @@ For further information on resource usage, see the [Webservice resources](https: Sidekiq pods should generally have 1 vCPU and 2 GB of memory. [The provided starting point](#cluster-topology) allows the deployment of up to -16 Sidekiq pods. Expand available resources using the 1 vCPU to 2GB memory +14 Sidekiq pods. Expand available resources using the 1 vCPU to 2GB memory ratio for each additional pod. For further information on resource usage, see the [Sidekiq resources](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/#resources). 
diff --git a/doc/administration/reference_architectures/25k_users.md b/doc/administration/reference_architectures/25k_users.md index 1f72c45c2b7..e45a8f6963c 100644 --- a/doc/administration/reference_architectures/25k_users.md +++ b/doc/administration/reference_architectures/25k_users.md @@ -19,7 +19,7 @@ full list of reference architectures, see |------------------------------------------|-------------|-------------------------|------------------|--------------|-----------| | External load balancing node(3) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | `c5.xlarge` | `F4s v2` | | Consul(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` | -| PostgreSQL(1) | 3 | 16 vCPU, 60 GB memory | `n1-standard-1` | `m5.4xlarge` | `D16s v3` | +| PostgreSQL(1) | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` | `m5.4xlarge` | `D16s v3` | | PgBouncer(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | `c5.large` | `F2s v2` | | Internal load balancing node(3) | 1 | 4 vCPU, 3.6GB memory | `n1-highcpu-4` | `c5.large` | `F2s v2` | | Redis - Cache(2) | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | `D4s v3` | @@ -94,7 +94,6 @@ cloud "**Object Storage**" as object_storage #white elb -[#6a9be7]-> gitlab elb -[#6a9be7]--> monitor -gitlab -[#32CD32]> sidekiq gitlab -[#32CD32]--> ilb gitlab -[#32CD32]-> object_storage gitlab -[#32CD32]---> redis @@ -600,8 +599,12 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '<postgresql_password_hash>' + # Set up basic authentication for the Patroni API (use the same username/password in all nodes). 
+ patroni['username'] = '<patroni_api_username>' + patroni['password'] = '<patroni_api_password>' + # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -804,7 +807,7 @@ Managed Redis from cloud providers (such as AWS ElastiCache) will work. If these services support high availability, be sure it _isn't_ of the Redis Cluster type. Redis version 5.0 or higher is required, which is included with Omnibus GitLab packages starting with GitLab 13.0. Older Redis versions don't support an -optional count argument to SPOP, which is required for [Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). +optional count argument to SPOP, which is required for [Merge Trains](../../ci/pipelines/merge_trains.md). Note the Redis node's IP address or hostname, port, and password (if required). These will be necessary later when configuring the [GitLab application servers](#configure-gitlab-rails). @@ -863,7 +866,7 @@ a node and change its status from primary to replica (and vice versa). redis_exporter['flags'] = { 'redis.addr' => 'redis://10.6.0.51:6379', 'redis.password' => 'redis-password-goes-here', - } + } # Prevent database migrations from running on upgrade automatically gitlab_rails['auto_migrate'] = false @@ -1421,7 +1424,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. 
postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>" # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1623,7 +1626,7 @@ the file of the same name on this server. If this is the first Omnibus node you 1. Praefect requires to run some database migrations, much like the main GitLab application. For this you should select **one Praefect node only to run the migrations**, AKA the _Deploy Node_. This node must be configured first before the others as follows: - + 1. In the `/etc/gitlab/gitlab.rb` file, change the `praefect['auto_migrate']` setting value from `false` to `true` 1. To ensure database migrations are only run during reconfigure and not automatically on upgrade, run: @@ -1631,7 +1634,7 @@ the file of the same name on this server. If this is the first Omnibus node you ```shell sudo touch /etc/gitlab/skip-auto-reconfigure ``` - + 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect and to run the Praefect database migrations. @@ -1699,7 +1702,7 @@ On each node: # balancer. gitlab_rails['internal_api_url'] = 'https://gitlab.example.com' - # Gitaly + # Gitaly gitaly['enable'] = true # Make Gitaly accept connections on all network interfaces. You must use @@ -2362,10 +2365,194 @@ to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pag See how to [configure NFS](../nfs.md). WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. 
-[Read more about Gitaly and NFS](../gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. + +Read: + +- The [Gitaly and NFS deprecation notice](../gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). + +## Cloud Native Hybrid reference architecture with Helm Charts (alternative) + +As an alternative approach, you can also run select components of GitLab as Cloud Native +in Kubernetes via our official [Helm Charts](https://docs.gitlab.com/charts/). +In this setup, we support running the equivalent of GitLab Rails and Sidekiq nodes +in a Kubernetes cluster, named Webservice and Sidekiq respectively. In addition, +the following other supporting services are supported: NGINX, Task Runner, Migrations, +Prometheus and Grafana. + +Hybrid installations leverage the benefits of both cloud native and traditional +compute deployments. With this, _stateless_ components can benefit from cloud native +workload management benefits while _stateful_ components are deployed in compute VMs +with Omnibus to benefit from increased permanence. + +NOTE: +This is an **advanced** setup. Running services in Kubernetes is well known +to be complex. **This setup is only recommended** if you have strong working +knowledge and experience in Kubernetes. The rest of this +section will assume this. + +### Cluster topology + +The following tables and diagram detail the hybrid environment using the same formats +as the normal environment above. + +First starting with the components that run in Kubernetes.
The recommendations at this +time use Google Cloud’s Kubernetes Engine (GKE) and associated machine types, but the memory +and CPU requirements should translate to most other providers. We hope to update this in the +future with further specific cloud provider details. + +| Service | Nodes(1) | Configuration | GCP | Allocatable CPUs and Memory | +|-------------------------------------------------------|----------|-------------------------|------------------|-----------------------------| +| Webservice | 7 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | 223 vCPU, 206.5 GB memory | +| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | 15.5 vCPU, 50 GB memory | +| Supporting services such as NGINX, Prometheus, etc. | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory | + +<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> +<!-- markdownlint-disable MD029 --> +1. Nodes configuration is shown as it is forced to ensure pod vCPU / memory ratios and avoid scaling during **performance testing**. + In production deployments there is no need to assign pods to nodes. A minimum of three nodes in three different availability zones is strongly recommended to align with resilient cloud architecture practices.
+<!-- markdownlint-enable MD029 --> + +Next are the backend components that run on static compute VMs via Omnibus (or External PaaS +services where applicable): + +| Service | Nodes | Configuration | GCP | +|--------------------------------------------|-------|-------------------------|------------------| +| Consul(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| PostgreSQL(1) | 3 | 16 vCPU, 60 GB memory | `n1-standard-16` | +| PgBouncer(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Internal load balancing node(3) | 1 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | +| Redis - Cache(2) | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | +| Redis - Queues / Shared State(2) | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | +| Redis Sentinel - Cache(2) | 3 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | +| Redis Sentinel - Queues / Shared State(2) | 3 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | +| Gitaly | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` | +| Praefect | 3 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | +| Praefect PostgreSQL(1) | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Object storage(4) | n/a | n/a | n/a | + +<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> +<!-- markdownlint-disable MD029 --> +1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery. +2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS ElastiCache are known to work. +3.
Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work. +4. Should be run on reputable third party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work. +<!-- markdownlint-enable MD029 --> + +NOTE: +For all PaaS solutions that involve configuring instances, it is strongly recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices. + +```plantuml +@startuml 25k + +card "Kubernetes via Helm Charts" as kubernetes { + card "**External Load Balancer**" as elb #6a9be7 + + together { + collections "**Webservice** x7" as gitlab #32CD32 + collections "**Sidekiq** x4" as sidekiq #ff8dd1 + } + + card "**Prometheus + Grafana**" as monitor #7FFFD4 + card "**Supporting Services**" as support +} + +card "**Internal Load Balancer**" as ilb #9370DB +collections "**Consul** x3" as consul #e76a9b + +card "Gitaly Cluster" as gitaly_cluster { + collections "**Praefect** x3" as praefect #FF8C00 + collections "**Gitaly** x3" as gitaly #FF8C00 + card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00 + + praefect -[#FF8C00]-> gitaly + praefect -[#FF8C00]> praefect_postgres +} + +card "Database" as database { + collections "**PGBouncer** x3" as pgbouncer #4EA7FF + card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF + collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF + + pgbouncer -[#4EA7FF]-> postgres_primary + postgres_primary .[#4EA7FF]> postgres_secondary +} + +card "redis" as redis { + collections "**Redis Persistent** x3" as redis_persistent #FF6347 + collections "**Redis Cache** x3" as redis_cache #FF6347 + collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347 + collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347 + + redis_persistent <.[#FF6347]- redis_persistent_sentinel + redis_cache 
<.[#FF6347]- redis_cache_sentinel +} + +cloud "**Object Storage**" as object_storage #white + +elb -[#6a9be7]-> gitlab +elb -[#6a9be7]-> monitor +elb -[hidden]-> support + +gitlab -[#32CD32]--> ilb +gitlab -[#32CD32]-> object_storage +gitlab -[#32CD32]---> redis +gitlab -[hidden]--> consul + +sidekiq -[#ff8dd1]--> ilb +sidekiq -[#ff8dd1]-> object_storage +sidekiq -[#ff8dd1]---> redis +sidekiq -[hidden]--> consul + +ilb -[#9370DB]-> gitaly_cluster +ilb -[#9370DB]-> database + +consul .[#e76a9b]-> database +consul .[#e76a9b]-> gitaly_cluster +consul .[#e76a9b,norank]--> redis + +monitor .[#7FFFD4]> consul +monitor .[#7FFFD4]-> database +monitor .[#7FFFD4]-> gitaly_cluster +monitor .[#7FFFD4,norank]--> redis +monitor .[#7FFFD4]> ilb +monitor .[#7FFFD4,norank]u--> elb + +@enduml +``` + +### Resource usage settings + +The following formulas help when calculating how many pods may be deployed within resource constraints. +The [25k reference architecture example values file](https://gitlab.com/gitlab-org/charts/gitlab/-/blob/master/examples/ref/25k.yaml) +documents how to apply the calculated configuration to the Helm Chart. + +#### Webservice + +Webservice pods typically need about 1 vCPU and 1.25 GB of memory _per worker_. +Each Webservice pod will consume roughly 4 vCPUs and 5 GB of memory using +the [recommended topology](#cluster-topology) because four worker processes +are created by default and each pod has other small processes running. + +For 25k users we recommend a total Puma worker count of around 140. +With the [provided recommendations](#cluster-topology) this allows the deployment of up to 35 +Webservice pods with 4 workers per pod and 5 pods per node. Expand available resources using +the ratio of 1 vCPU to 1.25 GB of memory _per each worker process_ for each additional +Webservice pod. + +For further information on resource usage, see the [Webservice resources](https://docs.gitlab.com/charts/charts/gitlab/webservice/#resources). 
+ +#### Sidekiq + +Sidekiq pods should generally have 1 vCPU and 2 GB of memory. + +[The provided starting point](#cluster-topology) allows the deployment of up to +14 Sidekiq pods. Expand available resources using the 1 vCPU to 2GB memory +ratio for each additional pod. + +For further information on resource usage, see the [Sidekiq resources](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/#resources). <div align="right"> <a type="button" class="btn btn-default" href="#setup-components"> diff --git a/doc/administration/reference_architectures/2k_users.md b/doc/administration/reference_architectures/2k_users.md index 7db3a343e0b..ff3db877553 100644 --- a/doc/administration/reference_architectures/2k_users.md +++ b/doc/administration/reference_architectures/2k_users.md @@ -324,7 +324,7 @@ to be used with GitLab. Redis version 5.0 or higher is required, as this is what ships with Omnibus GitLab packages starting with GitLab 13.0. Older Redis versions do not support an optional count argument to SPOP which is now required for -[Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). +[Merge Trains](../../ci/pipelines/merge_trains.md). In addition, GitLab makes use of certain commands like `UNLINK` and `USAGE` which were introduced only in Redis 4. @@ -965,10 +965,13 @@ possible. However, if you intend to use GitLab Pages, See how to [configure NFS](../nfs.md). WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. -[Read more about Gitaly and NFS](../gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. 
+ +Read: + +- The [Gitaly and NFS deprecation notice](../gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). <div align="right"> <a type="button" class="btn btn-default" href="#setup-components"> diff --git a/doc/administration/reference_architectures/3k_users.md b/doc/administration/reference_architectures/3k_users.md index bca5e4c3dab..ef58e69ee27 100644 --- a/doc/administration/reference_architectures/3k_users.md +++ b/doc/administration/reference_architectures/3k_users.md @@ -101,7 +101,6 @@ cloud "**Object Storage**" as object_storage #white elb -[#6a9be7]-> gitlab elb -[#6a9be7]--> monitor -gitlab -[#32CD32]> sidekiq gitlab -[#32CD32]--> ilb gitlab -[#32CD32]-> object_storage gitlab -[#32CD32]---> redis @@ -440,7 +439,7 @@ services support high availability, be sure it is **not** the Redis Cluster type Redis version 5.0 or higher is required, as this is what ships with Omnibus GitLab packages starting with GitLab 13.0. Older Redis versions do not support an optional count argument to SPOP which is now required for -[Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). +[Merge Trains](../../ci/pipelines/merge_trains.md). Note the Redis node's IP address or hostname, port, and password (if required). These will be necessary when configuring the @@ -829,7 +828,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. username of `gitlab_replicator` (recommended). The command will request a password and a confirmation. Use the value that is output by this command in the next step as the value of `<postgresql_replication_password_hash>`: - + ```shell sudo gitlab-ctl pg-password-md5 gitlab_replicator ``` @@ -848,7 +847,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. 
```ruby # Disable all components except Patroni and Consul roles(['patroni_role']) - + # PostgreSQL configuration postgresql['listen_address'] = '0.0.0.0' @@ -866,7 +865,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Prevent database migrations from running on upgrade automatically gitlab_rails['auto_migrate'] = false - + # Configure the Consul agent consul['services'] = %w(postgresql) ## Enable service discovery for Prometheus @@ -882,8 +881,12 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '<postgresql_password_hash>' + # Set up basic authentication for the Patroni API (use the same username/password in all nodes). + patroni['username'] = '<patroni_api_username>' + patroni['password'] = '<patroni_api_password>' + # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1127,7 +1130,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>" # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1328,7 +1331,7 @@ the file of the same name on this server. If this is the first Omnibus node you 1. Praefect requires to run some database migrations, much like the main GitLab application. For this you should select **one Praefect node only to run the migrations**, AKA the _Deploy Node_. This node must be configured first before the others as follows: - + 1. 
In the `/etc/gitlab/gitlab.rb` file, change the `praefect['auto_migrate']` setting value from `false` to `true` 1. To ensure database migrations are only run during reconfigure and not automatically on upgrade, run: @@ -1336,7 +1339,7 @@ the file of the same name on this server. If this is the first Omnibus node you ```shell sudo touch /etc/gitlab/skip-auto-reconfigure ``` - + 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect and to run the Praefect database migrations. @@ -2062,10 +2065,13 @@ to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pag See how to [configure NFS](../nfs.md). WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. -[Read more about Gitaly and NFS](../gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. + +Read: + +- The [Gitaly and NFS deprecation notice](../gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). 
## Supported modifications for lower user counts (HA) diff --git a/doc/administration/reference_architectures/50k_users.md b/doc/administration/reference_architectures/50k_users.md index b3324cb75fb..766f94f6c53 100644 --- a/doc/administration/reference_architectures/50k_users.md +++ b/doc/administration/reference_architectures/50k_users.md @@ -94,7 +94,6 @@ cloud "**Object Storage**" as object_storage #white elb -[#6a9be7]-> gitlab elb -[#6a9be7]--> monitor -gitlab -[#32CD32]> sidekiq gitlab -[#32CD32]--> ilb gitlab -[#32CD32]-> object_storage gitlab -[#32CD32]---> redis @@ -608,8 +607,12 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '<postgresql_password_hash>' + # Set up basic authentication for the Patroni API (use the same username/password in all nodes). + patroni['username'] = '<patroni_api_username>' + patroni['password'] = '<patroni_api_password>' + # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -812,7 +815,7 @@ Managed Redis from cloud providers (such as AWS ElastiCache) will work. If these services support high availability, be sure it _isn't_ of the Redis Cluster type. Redis version 5.0 or higher is required, which is included with Omnibus GitLab packages starting with GitLab 13.0. Older Redis versions don't support an -optional count argument to SPOP, which is required for [Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). +optional count argument to SPOP, which is required for [Merge Trains](../../ci/pipelines/merge_trains.md). Note the Redis node's IP address or hostname, port, and password (if required). 
These will be necessary later when configuring the [GitLab application servers](#configure-gitlab-rails). @@ -872,7 +875,7 @@ a node and change its status from primary to replica (and vice versa). 'redis.addr' => 'redis://10.6.0.51:6379', 'redis.password' => 'redis-password-goes-here', } - + # Prevent database migrations from running on upgrade automatically gitlab_rails['auto_migrate'] = false ``` @@ -1425,7 +1428,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>" # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1627,7 +1630,7 @@ the file of the same name on this server. If this is the first Omnibus node you 1. Praefect requires to run some database migrations, much like the main GitLab application. For this you should select **one Praefect node only to run the migrations**, AKA the _Deploy Node_. This node must be configured first before the others as follows: - + 1. In the `/etc/gitlab/gitlab.rb` file, change the `praefect['auto_migrate']` setting value from `false` to `true` 1. To ensure database migrations are only run during reconfigure and not automatically on upgrade, run: @@ -1635,7 +1638,7 @@ the file of the same name on this server. If this is the first Omnibus node you ```shell sudo touch /etc/gitlab/skip-auto-reconfigure ``` - + 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect and to run the Praefect database migrations. @@ -1703,7 +1706,7 @@ On each node: # balancer. gitlab_rails['internal_api_url'] = 'https://gitlab.example.com' - # Gitaly + # Gitaly gitaly['enable'] = true # Make Gitaly accept connections on all network interfaces. 
You must use @@ -1929,7 +1932,7 @@ To configure the Sidekiq nodes, on each one: ## Set number of Sidekiq threads per queue process to the recommend number of 10 sidekiq['max_concurrency'] = 10 - # Monitoring + # Monitoring consul['enable'] = true consul['monitoring_service_discovery'] = true @@ -2373,10 +2376,194 @@ to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pag See how to [configure NFS](../nfs.md). WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. -[Read more about Gitaly and NFS](../gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. + +Read: + +- The [Gitaly and NFS deprecation notice](../gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). + +## Cloud Native Hybrid reference architecture with Helm Charts (alternative) + +As an alternative approach, you can also run select components of GitLab as Cloud Native +in Kubernetes via our official [Helm Charts](https://docs.gitlab.com/charts/). +In this setup, we support running the equivalent of GitLab Rails and Sidekiq nodes +in a Kubernetes cluster, named Webservice and Sidekiq respectively. In addition, +the following other supporting services are supported: NGINX, Task Runner, Migrations, +Prometheus and Grafana. + +Hybrid installations leverage the benefits of both cloud native and traditional +compute deployments. 
With this, _stateless_ components can benefit from cloud native +workload management benefits while _stateful_ components are deployed in compute VMs +with Omnibus to benefit from increased permanence. + +NOTE: +This is an **advanced** setup. Running services in Kubernetes is well known +to be complex. **This setup is only recommended** if you have strong working +knowledge and experience in Kubernetes. The rest of this +section will assume this. + +### Cluster topology + +The following tables and diagram details the hybrid environment using the same formats +as the normal environment above. + +First starting with the components that run in Kubernetes. The recommendations at this +time use Google Cloud’s Kubernetes Engine (GKE) and associated machine types, but the memory +and CPU requirements should translate to most other providers. We hope to update this in the +future with further specific cloud provider details. + +| Service | Nodes(1) | Configuration | GCP | Allocatable CPUs and Memory | +|-------------------------------------------------------|----------|-------------------------|------------------|-----------------------------| +| Webservice | 16 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | 510 vCPU, 472 GB memory | +| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | 15.5 vCPU, 50 GB memory | +| Supporting services such as NGINX, Prometheus, etc. | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | 7.75 vCPU, 25 GB memory | + +<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> +<!-- markdownlint-disable MD029 --> +1. Nodes configuration is shown as it is forced to ensure pod vcpu / memory ratios and avoid scaling during **performance testing**. + In production deployments there is no need to assign pods to nodes. A minimum of three nodes in three different availability zones is strongly recommended to align with resilient cloud architecture practices. 
+<!-- markdownlint-enable MD029 --> + +Next are the backend components that run on static compute VMs via Omnibus (or External PaaS +services where applicable): + +| Service | Nodes | Configuration | GCP | +|--------------------------------------------|-------|-------------------------|------------------| +| Consul(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| PostgreSQL(1) | 3 | 32 vCPU, 120 GB memory | `n1-standard-32` | +| PgBouncer(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Internal load balancing node(3) | 1 | 8 vCPU, 7.2 GB memory | `n1-highcpu-8` | +| Redis - Cache(2) | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | +| Redis - Queues / Shared State(2) | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | +| Redis Sentinel - Cache(2) | 3 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | +| Redis Sentinel - Queues / Shared State(2) | 3 | 1 vCPU, 3.75 GB memory | `n1-standard-1` | +| Gitaly | 3 | 64 vCPU, 240 GB memory | `n1-standard-64` | +| Praefect | 3 | 4 vCPU, 3.6 GB memory | `n1-highcpu-4` | +| Praefect PostgreSQL(1) | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Object storage(4) | n/a | n/a | n/a | + +<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> +<!-- markdownlint-disable MD029 --> +1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery. +2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work. +3. 
Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work. +4. Should be run on reputable third party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work. +<!-- markdownlint-enable MD029 --> + +NOTE: +For all PaaS solutions that involve configuring instances, it is strongly recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices. + +```plantuml +@startuml 50k + +card "Kubernetes via Helm Charts" as kubernetes { + card "**External Load Balancer**" as elb #6a9be7 + + together { + collections "**Webservice** x16" as gitlab #32CD32 + collections "**Sidekiq** x4" as sidekiq #ff8dd1 + } + + card "**Prometheus + Grafana**" as monitor #7FFFD4 + card "**Supporting Services**" as support +} + +card "**Internal Load Balancer**" as ilb #9370DB +collections "**Consul** x3" as consul #e76a9b + +card "Gitaly Cluster" as gitaly_cluster { + collections "**Praefect** x3" as praefect #FF8C00 + collections "**Gitaly** x3" as gitaly #FF8C00 + card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00 + + praefect -[#FF8C00]-> gitaly + praefect -[#FF8C00]> praefect_postgres +} + +card "Database" as database { + collections "**PGBouncer** x3" as pgbouncer #4EA7FF + card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF + collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF + + pgbouncer -[#4EA7FF]-> postgres_primary + postgres_primary .[#4EA7FF]> postgres_secondary +} + +card "redis" as redis { + collections "**Redis Persistent** x3" as redis_persistent #FF6347 + collections "**Redis Cache** x3" as redis_cache #FF6347 + collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347 + collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347 + + redis_persistent <.[#FF6347]- redis_persistent_sentinel + redis_cache 
<.[#FF6347]- redis_cache_sentinel +} + +cloud "**Object Storage**" as object_storage #white + +elb -[#6a9be7]-> gitlab +elb -[#6a9be7]-> monitor +elb -[hidden]-> support + +gitlab -[#32CD32]--> ilb +gitlab -[#32CD32]-> object_storage +gitlab -[#32CD32]---> redis +gitlab -[hidden]--> consul + +sidekiq -[#ff8dd1]--> ilb +sidekiq -[#ff8dd1]-> object_storage +sidekiq -[#ff8dd1]---> redis +sidekiq -[hidden]--> consul + +ilb -[#9370DB]-> gitaly_cluster +ilb -[#9370DB]-> database + +consul .[#e76a9b]-> database +consul .[#e76a9b]-> gitaly_cluster +consul .[#e76a9b,norank]--> redis + +monitor .[#7FFFD4]> consul +monitor .[#7FFFD4]-> database +monitor .[#7FFFD4]-> gitaly_cluster +monitor .[#7FFFD4,norank]--> redis +monitor .[#7FFFD4]> ilb +monitor .[#7FFFD4,norank]u--> elb + +@enduml +``` + +### Resource usage settings + +The following formulas help when calculating how many pods may be deployed within resource constraints. +The [50k reference architecture example values file](https://gitlab.com/gitlab-org/charts/gitlab/-/blob/master/examples/ref/50k.yaml) +documents how to apply the calculated configuration to the Helm Chart. + +#### Webservice + +Webservice pods typically need about 1 vCPU and 1.25 GB of memory _per worker_. +Each Webservice pod will consume roughly 4 vCPUs and 5 GB of memory using +the [recommended topology](#cluster-topology) because four worker processes +are created by default and each pod has other small processes running. + +For 50k users we recommend a total Puma worker count of around 320. +With the [provided recommendations](#cluster-topology) this allows the deployment of up to 80 +Webservice pods with 4 workers per pod and 5 pods per node. Expand available resources using +the ratio of 1 vCPU to 1.25 GB of memory _per each worker process_ for each additional +Webservice pod. + +For further information on resource usage, see the [Webservice resources](https://docs.gitlab.com/charts/charts/gitlab/webservice/#resources). 
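As a rough check of the Webservice arithmetic above, the 50k figures can be reproduced with a few lines of Ruby. This is a minimal sketch, not part of the commit or of the Helm chart: the method name `webservice_pod_plan` and its parameters are hypothetical, and the per-worker ratios are the approximate values quoted in the text.

```ruby
# Illustrative only: hypothetical helper, figures taken from the text above.
WORKER_VCPU = 1.0        # about 1 vCPU per Puma worker
WORKER_MEMORY_GB = 1.25  # about 1.25 GB of memory per Puma worker

# Derive the pod count and per-pod worker resources from a target worker count.
def webservice_pod_plan(total_workers:, workers_per_pod:, nodes:)
  pods = (total_workers.to_f / workers_per_pod).ceil
  {
    pods: pods,
    pods_per_node: (pods.to_f / nodes).ceil,
    vcpu_per_pod: workers_per_pod * WORKER_VCPU,
    memory_gb_per_pod: workers_per_pod * WORKER_MEMORY_GB
  }
end

# 50k recommendation: around 320 Puma workers, 4 workers per pod, 16 Webservice nodes.
plan = webservice_pod_plan(total_workers: 320, workers_per_pod: 4, nodes: 16)
puts plan
```

With these inputs the sketch arrives at the same 80 pods and 5 pods per node that the section states. It counts only worker processes, so the other small processes in each pod are not modeled.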
+ +#### Sidekiq + +Sidekiq pods should generally have 1 vCPU and 2 GB of memory. + +[The provided starting point](#cluster-topology) allows the deployment of up to +14 Sidekiq pods. Expand available resources using the 1 vCPU to 2GB memory +ratio for each additional pod. + +For further information on resource usage, see the [Sidekiq resources](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/#resources). <div align="right"> <a type="button" class="btn btn-default" href="#setup-components"> diff --git a/doc/administration/reference_architectures/5k_users.md b/doc/administration/reference_architectures/5k_users.md index 9952df196c9..e57c4545b13 100644 --- a/doc/administration/reference_architectures/5k_users.md +++ b/doc/administration/reference_architectures/5k_users.md @@ -60,10 +60,7 @@ together { collections "**Sidekiq** x4" as sidekiq #ff8dd1 } -together { - card "**Prometheus + Grafana**" as monitor #7FFFD4 - collections "**Consul** x3" as consul #e76a9b -} +card "**Prometheus + Grafana**" as monitor #7FFFD4 card "Gitaly Cluster" as gitaly_cluster { collections "**Praefect** x3" as praefect #FF8C00 @@ -83,14 +80,15 @@ card "Database" as database { postgres_primary .[#4EA7FF]> postgres_secondary } -card "redis" as redis { - collections "**Redis Persistent** x3" as redis_persistent #FF6347 - collections "**Redis Cache** x3" as redis_cache #FF6347 - collections "**Redis Persistent Sentinel** x3" as redis_persistent_sentinel #FF6347 - collections "**Redis Cache Sentinel** x3"as redis_cache_sentinel #FF6347 +node "**Consul + Sentinel** x3" as consul_sentinel { + component Consul as consul #e76a9b + component Sentinel as sentinel #e6e727 +} - redis_persistent <.[#FF6347]- redis_persistent_sentinel - redis_cache <.[#FF6347]- redis_cache_sentinel +card "Redis" as redis { + collections "**Redis** x3" as redis_nodes #FF6347 + + redis_nodes <.[#FF6347]- sentinel } cloud "**Object Storage**" as object_storage #white @@ -98,7 +96,6 @@ cloud "**Object Storage**" as 
object_storage #white elb -[#6a9be7]-> gitlab elb -[#6a9be7]--> monitor -gitlab -[#32CD32]> sidekiq gitlab -[#32CD32]--> ilb gitlab -[#32CD32]-> object_storage gitlab -[#32CD32]---> redis @@ -432,7 +429,7 @@ services support high availability, be sure it is **not** the Redis Cluster type Redis version 5.0 or higher is required, as this is what ships with Omnibus GitLab packages starting with GitLab 13.0. Older Redis versions do not support an optional count argument to SPOP which is now required for -[Merge Trains](../../ci/merge_request_pipelines/pipelines_for_merged_results/merge_trains/index.md). +[Merge Trains](../../ci/pipelines/merge_trains.md). Note the Redis node's IP address or hostname, port, and password (if required). These will be necessary when configuring the @@ -846,7 +843,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Sets `max_replication_slots` to double the number of database nodes. # Patroni uses one extra slot per node when initiating the replication. patroni['postgresql']['max_replication_slots'] = 8 - + # Set `max_wal_senders` to one more than the number of replication slots in the cluster. # This is used to prevent replication from using up all of the # available database connections. @@ -873,8 +870,12 @@ in the second step, do not supply the `EXTERNAL_URL` value. # Replace POSTGRESQL_PASSWORD_HASH with a generated md5 value postgresql['sql_user_password'] = '<postgresql_password_hash>' + # Set up basic authentication for the Patroni API (use the same username/password in all nodes). 
+ patroni['username'] = '<patroni_api_username>' + patroni['password'] = '<patroni_api_password>' + # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1118,7 +1119,7 @@ in the second step, do not supply the `EXTERNAL_URL` value. postgresql['sql_user_password'] = "<praefect_postgresql_password_hash>" # Replace XXX.XXX.XXX.XXX/YY with Network Address - postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24) + postgresql['trust_auth_cidr_addresses'] = %w(10.6.0.0/24 127.0.0.1/32) # Set the network addresses that the exporters will listen on for monitoring node_exporter['listen_address'] = '0.0.0.0:9100' @@ -1320,7 +1321,7 @@ the file of the same name on this server. If this is the first Omnibus node you 1. Praefect requires to run some database migrations, much like the main GitLab application. For this you should select **one Praefect node only to run the migrations**, AKA the _Deploy Node_. This node must be configured first before the others as follows: - + 1. In the `/etc/gitlab/gitlab.rb` file, change the `praefect['auto_migrate']` setting value from `false` to `true` 1. To ensure database migrations are only run during reconfigure and not automatically on upgrade, run: @@ -1328,7 +1329,7 @@ the file of the same name on this server. If this is the first Omnibus node you ```shell sudo touch /etc/gitlab/skip-auto-reconfigure ``` - + 1. [Reconfigure GitLab](../restart_gitlab.md#omnibus-gitlab-reconfigure) for the changes to take effect and to run the Praefect database migrations. @@ -2056,10 +2057,191 @@ to use GitLab Pages, this currently [requires NFS](troubleshooting.md#gitlab-pag See how to [configure NFS](../nfs.md). 
WARNING: -From GitLab 14.0, enhancements and bug fixes for NFS for Git repositories will no longer be -considered and customer technical support will be considered out of scope. -[Read more about Gitaly and NFS](../gitaly/index.md#nfs-deprecation-notice) and -[the correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). +Engineering support for NFS for Git repositories is deprecated. Technical support is planned to be +unavailable from GitLab 15.0. No further enhancements are planned for this feature. + +Read: + +- The [Gitaly and NFS deprecation notice](../gitaly/index.md#nfs-deprecation-notice). +- About the [correct mount options to use](../nfs.md#upgrade-to-gitaly-cluster-or-disable-caching-if-experiencing-data-loss). + +## Cloud Native Hybrid reference architecture with Helm Charts (alternative) + +As an alternative approach, you can also run select components of GitLab as Cloud Native +in Kubernetes via our official [Helm Charts](https://docs.gitlab.com/charts/). +In this setup, we support running the equivalent of GitLab Rails and Sidekiq nodes +in a Kubernetes cluster, named Webservice and Sidekiq respectively. In addition, +the following other supporting services are supported: NGINX, Task Runner, Migrations, +Prometheus and Grafana. + +Hybrid installations leverage the benefits of both cloud native and traditional +compute deployments. With this, _stateless_ components can benefit from cloud native +workload management benefits while _stateful_ components are deployed in compute VMs +with Omnibus to benefit from increased permanence. + +NOTE: +This is an **advanced** setup. Running services in Kubernetes is well known +to be complex. **This setup is only recommended** if you have strong working +knowledge and experience in Kubernetes. The rest of this +section will assume this. 
+ +### Cluster topology + +The following tables and diagram details the hybrid environment using the same formats +as the normal environment above. + +First starting with the components that run in Kubernetes. The recommendations at this +time use Google Cloud’s Kubernetes Engine (GKE) and associated machine types, but the memory +and CPU requirements should translate to most other providers. We hope to update this in the +future with further specific cloud provider details. + +| Service | Nodes(1) | Configuration | GCP | Allocatable CPUs and Memory | +|-------------------------------------------------------|----------|-------------------------|------------------|-----------------------------| +| Webservice | 5 | 16 vCPU, 14.4 GB memory | `n1-highcpu-16` | 79.5 vCPU, 62 GB memory | +| Sidekiq | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | 11.8 vCPU, 38.9 GB memory | +| Supporting services such as NGINX, Prometheus, etc. | 2 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | 3.9 vCPU, 11.8 GB memory | + +<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> +<!-- markdownlint-disable MD029 --> +1. Nodes configuration is shown as it is forced to ensure pod vcpu / memory ratios and avoid scaling during **performance testing**. + In production deployments there is no need to assign pods to nodes. A minimum of three nodes in three different availability zones is strongly recommended to align with resilient cloud architecture practices. 
+<!-- markdownlint-enable MD029 --> + +Next are the backend components that run on static compute VMs via Omnibus (or External PaaS +services where applicable): + +| Service | Nodes | Configuration | GCP | +|--------------------------------------------|-------|-------------------------|------------------| +| Redis(2) | 3 | 2 vCPU, 7.5 GB memory | `n1-standard-2` | +| Consul(1) + Sentinel(2) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| PostgreSQL(1) | 3 | 4 vCPU, 15 GB memory | `n1-standard-4` | +| PgBouncer(1) | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Internal load balancing node(3) | 1 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Gitaly | 3 | 8 vCPU, 30 GB memory | `n1-standard-8` | +| Praefect | 3 | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Praefect PostgreSQL(1) | 1+ | 2 vCPU, 1.8 GB memory | `n1-highcpu-2` | +| Object storage(4) | n/a | n/a | n/a | + +<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix --> +<!-- markdownlint-disable MD029 --> +1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. Google Cloud SQL and AWS RDS are known to work, however Azure Database for PostgreSQL is [not recommended](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61) due to performance issues. Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery. +2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work. +3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work. +4. Should be run on reputable third party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work. 
+<!-- markdownlint-enable MD029 --> + +NOTE: +For all PaaS solutions that involve configuring instances, it is strongly recommended to implement a minimum of three nodes in three different availability zones to align with resilient cloud architecture practices. + +```plantuml +@startuml 5k + +card "Kubernetes via Helm Charts" as kubernetes { + card "**External Load Balancer**" as elb #6a9be7 + + together { + collections "**Webservice** x5" as gitlab #32CD32 + collections "**Sidekiq** x3" as sidekiq #ff8dd1 + } + + card "**Prometheus + Grafana**" as monitor #7FFFD4 + card "**Supporting Services**" as support +} + +card "**Internal Load Balancer**" as ilb #9370DB + +node "**Consul + Sentinel** x3" as consul_sentinel { + component Consul as consul #e76a9b + component Sentinel as sentinel #e6e727 +} + +card "Gitaly Cluster" as gitaly_cluster { + collections "**Praefect** x3" as praefect #FF8C00 + collections "**Gitaly** x3" as gitaly #FF8C00 + card "**Praefect PostgreSQL***\n//Non fault-tolerant//" as praefect_postgres #FF8C00 + + praefect -[#FF8C00]-> gitaly + praefect -[#FF8C00]> praefect_postgres +} + +card "Database" as database { + collections "**PGBouncer** x3" as pgbouncer #4EA7FF + card "**PostgreSQL** (Primary)" as postgres_primary #4EA7FF + collections "**PostgreSQL** (Secondary) x2" as postgres_secondary #4EA7FF + + pgbouncer -[#4EA7FF]-> postgres_primary + postgres_primary .[#4EA7FF]> postgres_secondary +} + +card "Redis" as redis { + collections "**Redis** x3" as redis_nodes #FF6347 + + redis_nodes <.[#FF6347]- sentinel +} + +cloud "**Object Storage**" as object_storage #white + +elb -[#6a9be7]-> gitlab +elb -[#6a9be7]-> monitor +elb -[hidden]-> support + +gitlab -[#32CD32]--> ilb +gitlab -[#32CD32]-> object_storage +gitlab -[#32CD32]---> redis +gitlab -[hidden]--> consul + +sidekiq -[#ff8dd1]--> ilb +sidekiq -[#ff8dd1]-> object_storage +sidekiq -[#ff8dd1]---> redis +sidekiq -[hidden]--> consul + +ilb -[#9370DB]-> gitaly_cluster +ilb -[#9370DB]-> database 
+ +consul .[#e76a9b]-> database +consul .[#e76a9b]-> gitaly_cluster +consul .[#e76a9b,norank]--> redis + +monitor .[#7FFFD4]> consul +monitor .[#7FFFD4]-> database +monitor .[#7FFFD4]-> gitaly_cluster +monitor .[#7FFFD4,norank]--> redis +monitor .[#7FFFD4]> ilb +monitor .[#7FFFD4,norank]u--> elb + +@enduml +``` + +### Resource usage settings + +The following formulas help when calculating how many pods may be deployed within resource constraints. +The [5k reference architecture example values file](https://gitlab.com/gitlab-org/charts/gitlab/-/blob/master/examples/ref/5k.yaml) +documents how to apply the calculated configuration to the Helm Chart. + +#### Webservice + +Webservice pods typically need about 1 vCPU and 1.25 GB of memory _per worker_. +Each Webservice pod will consume roughly 4 vCPUs and 5 GB of memory using +the [recommended topology](#cluster-topology) because four worker processes +are created by default and each pod has other small processes running. + +For 5k users we recommend a total Puma worker count of around 40. +With the [provided recommendations](#cluster-topology) this allows the deployment of up to 10 +Webservice pods with 4 workers per pod and 2 pods per node. Expand available resources using +the ratio of 1 vCPU to 1.25 GB of memory _per each worker process_ for each additional +Webservice pod. + +For further information on resource usage, see the [Webservice resources](https://docs.gitlab.com/charts/charts/gitlab/webservice/#resources). + +#### Sidekiq + +Sidekiq pods should generally have 1 vCPU and 2 GB of memory. + +[The provided starting point](#cluster-topology) allows the deployment of up to +8 Sidekiq pods. Expand available resources using the 1 vCPU to 2GB memory +ratio for each additional pod. + +For further information on resource usage, see the [Sidekiq resources](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/#resources). 
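The 5k pod counts above can be sanity-checked against the allocatable node resources from the cluster topology table. The following is a sketch under stated assumptions, not the chart's own logic: `fits_allocatable?` is a hypothetical helper, and the per-pod figures are the approximate ratios quoted in the Webservice and Sidekiq subsections.

```ruby
# Illustrative only: hypothetical helper, figures taken from the 5k tables and text above.
# Returns true when the requested pods fit within both the CPU and memory budgets.
def fits_allocatable?(pods:, vcpu_per_pod:, memory_gb_per_pod:, allocatable_vcpu:, allocatable_memory_gb:)
  pods * vcpu_per_pod <= allocatable_vcpu &&
    pods * memory_gb_per_pod <= allocatable_memory_gb
end

# Around 40 Puma workers at 4 workers per pod.
webservice_pods = 40 / 4

webservice_fits = fits_allocatable?(
  pods: webservice_pods, vcpu_per_pod: 4, memory_gb_per_pod: 5,
  allocatable_vcpu: 79.5, allocatable_memory_gb: 62
)

# Up to 8 Sidekiq pods at 1 vCPU and 2 GB of memory each.
sidekiq_fits = fits_allocatable?(
  pods: 8, vcpu_per_pod: 1, memory_gb_per_pod: 2,
  allocatable_vcpu: 11.8, allocatable_memory_gb: 38.9
)

puts "Webservice fits: #{webservice_fits}, Sidekiq fits: #{sidekiq_fits}"
```

The check treats each node pool as a single pooled budget, which is a simplification: real scheduling is per node, so a configuration that passes here can still fail to schedule if pods do not pack evenly across nodes.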
<div align="right"> <a type="button" class="btn btn-default" href="#setup-components"> diff --git a/doc/administration/reference_architectures/index.md b/doc/administration/reference_architectures/index.md index 49024365e30..22871f6ea8d 100644 --- a/doc/administration/reference_architectures/index.md +++ b/doc/administration/reference_architectures/index.md @@ -69,6 +69,13 @@ The following reference architectures are available: - [Up to 25,000 users](25k_users.md) - [Up to 50,000 users](50k_users.md) +The following Cloud Native Hybrid reference architectures, where select recommended components can be run in Kubernetes, are available: + +- [Up to 5,000 users](5k_users.md#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative) +- [Up to 10,000 users](10k_users.md#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative) +- [Up to 25,000 users](25k_users.md#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative) +- [Up to 50,000 users](50k_users.md#cloud-native-hybrid-reference-architecture-with-helm-charts-alternative) + A GitLab [Premium or Ultimate](https://about.gitlab.com/pricing/#self-managed) license is required to get assistance from Support with troubleshooting the [2,000 users](2k_users.md) and higher reference architectures. @@ -163,7 +170,7 @@ a layer of complexity that will add challenges to finding out where potential issues might lie. The reference architectures use the official GitLab Linux packages (Omnibus -GitLab) to install and configure the various components (with one notable exception being the suggested select Cloud Native installation method described below). The components are +GitLab) or [Helm Charts](https://docs.gitlab.com/charts/) to install and configure the various components. 
The components are installed on separate machines (virtualized or bare metal), with machine hardware requirements listed in the "Configuration" column and equivalent VM standard sizes listed in GCP/AWS/Azure columns of each [available reference architecture](#available-reference-architectures). @@ -175,21 +182,10 @@ Other technologies, like [Docker swarm](https://docs.docker.com/engine/swarm/) are not officially supported, but can be implemented at your own risk. In that case, GitLab Support will not be able to help you. -### Configuring select components with Cloud Native Helm - -We also provide [Helm charts](https://docs.gitlab.com/charts/) as a Cloud Native installation -method for GitLab. For the reference architectures, select components can be set up in this -way as an alternative if so desired. +## Supported modifications for lower user count HA reference architectures -For these kind of setups we support using the charts in an [advanced configuration](https://docs.gitlab.com/charts/#advanced-configuration) -where stateful backend components, such as the database or Gitaly, are run externally - either -via Omnibus or reputable third party services. Note that we don't currently support running the -stateful components via Helm _at large scales_. +The reference architectures for user counts [3,000](3k_users.md) and up support High Availability (HA). -When designing these environments you should refer to the respective [Reference Architecture](#available-reference-architectures) -above for guidance on sizing. Components run via Helm would be similarly scaled to their Omnibus -specs, only translated into Kubernetes resources. +In the specific case you have the requirement to achieve HA but have a lower user count, select modifications to the [3,000 user](3k_users.md) architecture are supported. 
-For example, if you were to set up a 50k installation with the Rails nodes being run in Helm, -then the same amount of resources as given for Omnibus should be given to the Kubernetes -cluster with the Rails nodes broken down into a number of smaller Pods across that cluster. +For more details, [refer to this section in the architecture's documentation](3k_users.md#supported-modifications-for-lower-user-counts-ha). diff --git a/doc/administration/reference_architectures/troubleshooting.md b/doc/administration/reference_architectures/troubleshooting.md index 4b07cff7de2..61d9dfea2a2 100644 --- a/doc/administration/reference_architectures/troubleshooting.md +++ b/doc/administration/reference_architectures/troubleshooting.md @@ -207,7 +207,7 @@ To make sure your configuration is correct: ## Troubleshooting Gitaly For troubleshooting information, see Gitaly and Gitaly Cluster -[troubleshooting information](../gitaly/index.md). +[troubleshooting information](../gitaly/troubleshooting.md). ## Troubleshooting the GitLab Rails application diff --git a/doc/administration/reply_by_email.md b/doc/administration/reply_by_email.md index ebb9e086cb7..c249f48b768 100644 --- a/doc/administration/reply_by_email.md +++ b/doc/administration/reply_by_email.md @@ -4,9 +4,7 @@ group: Certify info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- -# Reply by email - -> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/1173) in GitLab 8.0. +# Reply by email **(FREE SELF)** GitLab can be set up to allow users to comment on issues and merge requests by replying to notification emails. @@ -34,10 +32,10 @@ addition, this "reply key" is also added to the `References` header. 
When you reply to the notification email, your email client: -- sends the email to the `Reply-To` address it got from the notification email -- sets the `In-Reply-To` header to the value of the `Message-ID` header from the +- Sends the email to the `Reply-To` address it got from the notification email +- Sets the `In-Reply-To` header to the value of the `Message-ID` header from the notification email -- sets the `References` header to the value of the `Message-ID` plus the value of +- Sets the `References` header to the value of the `Message-ID` plus the value of the notification email's `References` header. ### GitLab receives your reply to the notification email @@ -45,8 +43,8 @@ When you reply to the notification email, your email client: When GitLab receives your reply, it looks for the "reply key" in the following headers, in this order: -1. the `To` header -1. the `References` header +1. `To` header +1. `References` header If it finds a reply key, it leaves your reply as a comment on the entity the notification was about (issue, merge request, commit...). diff --git a/doc/administration/repository_checks.md b/doc/administration/repository_checks.md index 869b1e7068f..ab203bb7993 100644 --- a/doc/administration/repository_checks.md +++ b/doc/administration/repository_checks.md @@ -5,57 +5,64 @@ info: "To determine the technical writer assigned to the Stage/Group associated type: reference --- -# Repository checks **(FREE)** +# Repository checks **(FREE SELF)** -> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/3232) in GitLab 8.7. +You can use [`git fsck`](https://git-scm.com/docs/git-fsck) to verify the integrity of all data +committed to a repository. GitLab administrators can trigger this check for a project using the +GitLab UI: -Git has a built-in mechanism, [`git fsck`](https://git-scm.com/docs/git-fsck), to verify the -integrity of all data committed to a repository. 
GitLab administrators -can trigger such a check for a project via the project page under the -Admin Area. The checks run asynchronously so it may take a few minutes -before the check result is visible on the project Admin Area. If the -checks failed you can see their output on in the -[`repocheck.log` file.](logs.md#repochecklog) +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Overview > Projects**. +1. Select the project to check. +1. In the **Repository check** section, select **Trigger repository check**. + +The checks run asynchronously so it may take a few minutes before the check result is visible on the +project page in the Admin Area. If the checks fail, see [what to do](#what-to-do-if-a-check-failed). This setting is off by default, because it can cause many false alarms. -## Periodic checks +## Enable periodic checks + +Instead of checking repositories manually, GitLab can be configured to run the checks periodically: + +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > Repository** (`/admin/application_settings/repository`). +1. Expand the **Repository maintenance** section. +1. Enable **Enable repository checks**. -When enabled, GitLab periodically runs a repository check on all project -repositories and wiki repositories in order to detect data corruption. -A project is checked no more than once per month. If any projects -fail their repository checks all GitLab administrators receive an email -notification of the situation. This notification is sent out once a week, -by default, midnight at the start of Sunday. Repositories with known check -failures can be found at `/admin/projects?last_repository_check_failed=1`. +When enabled, GitLab periodically runs a repository check on all project repositories and wiki +repositories to detect possible data corruption. A project is checked no more than once per month. 
-## Disabling periodic checks +If any projects fail their repository checks, all GitLab administrators receive an email +notification of the situation. By default, this notification is sent out once a week at midnight at +the start of Sunday. -You can disable the periodic checks on the **Settings** page of the Admin Area. +Repositories with known check failures can be found at +`/admin/projects?last_repository_check_failed=1`. ## What to do if a check failed -If the repository check fails for some repository you should look up the error -in the [`repocheck.log` file](logs.md#repochecklog) on disk: +If a repository check fails, locate the error in the [`repocheck.log` file](logs.md#repochecklog) on +disk at: -- `/var/log/gitlab/gitlab-rails` for Omnibus GitLab installations -- `/home/git/gitlab/log` for installations from source +- `/var/log/gitlab/gitlab-rails` for Omnibus GitLab installations. +- `/home/git/gitlab/log` for installations from source. -If the periodic repository check causes false alarms, you can clear all repository check states by: +If periodic repository checks cause false alarms, you can clear all repository check states: 1. On the top bar, select **Menu >** **{admin}** **Admin**. 1. On the left sidebar, select **Settings > Repository** (`/admin/application_settings/repository`). 1. Expand the **Repository maintenance** section. 1. Select **Clear all repository checks**. -## Run a check manually +## Run a check using the command line -[`git fsck`](https://git-scm.com/docs/git-fsck) is a read-only check that you can run -manually against the repository on the [Gitaly server](gitaly/index.md). +You can run [`git fsck`](https://git-scm.com/docs/git-fsck) using the command line on repositories +on [Gitaly servers](gitaly/index.md). To locate the repositories: -- For Omnibus GitLab installations, repositories are stored by default in - `/var/opt/gitlab/git-data/repositories`. 
-- [Identify the subdirectory that contains the repository](repository_storage_types.md#from-project-name-to-hashed-path) +1. Go to the storage location for repositories. For Omnibus GitLab installations, repositories are + stored by default in the `/var/opt/gitlab/git-data/repositories` directory. +1. [Identify the subdirectory that contains the repository](repository_storage_types.md#from-project-name-to-hashed-path) that you need to check. To run a check (for example): @@ -65,5 +72,5 @@ sudo /opt/gitlab/embedded/bin/git -C /var/opt/gitlab/git-data/repositories/@hash ``` You can also run [Rake tasks](raketasks/check.md#repository-integrity) for checking Git -repositories, which can be used to run `git fsck` against all repositories and generate -repository checksums, as a way to compare repositories on different servers. +repositories, which can be used to run `git fsck` against all repositories and generate repository +checksums, as a way to compare repositories on different servers. diff --git a/doc/administration/repository_storage_paths.md b/doc/administration/repository_storage_paths.md index a1391f3e0ed..68f351e737a 100644 --- a/doc/administration/repository_storage_paths.md +++ b/doc/administration/repository_storage_paths.md @@ -147,13 +147,13 @@ can choose where new repositories are stored: 1. Select **Save changes**. Each repository storage path can be assigned a weight from 0-100. When a new project is created, -these weights are used to determine the storage location the repository is created on. The higher -the weight of a given repository storage path relative to other repository storages paths, the more -often it is chosen. That is, `(storage weight) / (sum of all weights) * 100 = chance %`. +these weights are used to determine the storage location the repository is created on. 
-![Choose repository storage path in Admin Area](img/repository_storages_admin_ui_v13_1.png) +The higher the weight of a given repository storage path relative to other repository storages +paths, the more often it is chosen. That is, +`(storage weight) / (sum of all weights) * 100 = chance %`. ## Move repositories -To move a repository to a different repository storage (for example, from `default` to `storage2`), use the +To move a repository to a different repository storage (for example, from `default` to `storage2`), use the same process as [migrating to Gitaly Cluster](gitaly/praefect.md#migrate-to-gitaly-cluster). diff --git a/doc/administration/restart_gitlab.md b/doc/administration/restart_gitlab.md index 5f7f08f4ecf..b8f09b00773 100644 --- a/doc/administration/restart_gitlab.md +++ b/doc/administration/restart_gitlab.md @@ -85,7 +85,7 @@ sudo gitlab-ctl reconfigure Reconfiguring GitLab should occur in the event that something in its configuration (`/etc/gitlab/gitlab.rb`) has changed. -When you run this command, [Chef](https://www.chef.io/products/chef-infra/), the underlying configuration management +When you run this command, [Chef](https://www.chef.io/products/chef-infra), the underlying configuration management application that powers Omnibus GitLab, makes sure that all things like directories, permissions, and services are in place and in the same shape that they were initially shipped. diff --git a/doc/administration/server_hooks.md b/doc/administration/server_hooks.md index f67bf676a61..2a431d17774 100644 --- a/doc/administration/server_hooks.md +++ b/doc/administration/server_hooks.md @@ -34,7 +34,7 @@ Note the following about server hooks: administrators are able to complete these tasks. If you don't have file system access, see possible alternatives such as: - [Webhooks](../user/project/integrations/webhooks.md). - - [GitLab CI/CD](../ci/README.md). + - [GitLab CI/CD](../ci/index.md). 
- [Push Rules](../push_rules/push_rules.md), for a user-configurable Git hook interface. - Server hooks aren't replicated to [Geo](geo/index.md) secondary nodes. @@ -142,7 +142,7 @@ The following set of environment variables are available to server hooks. |:---------------------|:----------------------------------------------------------------------------| | `GL_ID` | GitLab identifier of user that initiated the push. For example, `user-2234` | | `GL_PROJECT_PATH` | (GitLab 13.2 and later) GitLab project path | -| `GL_PROTOCOL` | (GitLab 13.2 and later) Protocol used with push | +| `GL_PROTOCOL` | (GitLab 13.2 and later) Protocol used for this change. One of: `http` (Git Push using HTTP), `ssh` (Git Push using SSH), or `web` (all other actions). | | `GL_REPOSITORY` | `project-<id>` where `id` is the ID of the project | | `GL_USERNAME` | GitLab username of the user that initiated the push | diff --git a/doc/administration/troubleshooting/debug.md b/doc/administration/troubleshooting/debug.md index 6861cdcde4e..031f44b1f9f 100644 --- a/doc/administration/troubleshooting/debug.md +++ b/doc/administration/troubleshooting/debug.md @@ -111,7 +111,7 @@ an SMTP server, but you're not seeing mail delivered. Here's how to check the se ``` In the example above, the SMTP server is configured for the local machine. If this is intended, you may need to check your local mail - logs (e.g. `/var/log/mail.log`) for more details. + logs (for example, `/var/log/mail.log`) for more details. 1. Send a test message via the console. @@ -119,7 +119,7 @@ an SMTP server, but you're not seeing mail delivered. Here's how to check the se irb(main):003:0> Notify.test_email('youremail@email.com', 'Hello World', 'This is a test message').deliver_now ``` - If you do not receive an e-mail and/or see an error message, then check + If you do not receive an email and/or see an error message, then check your mail server settings. 
## Advanced Issues @@ -224,7 +224,7 @@ gitlab_rails['env'] = { } ``` -For source installations, set the environment variable. +For source installations, set the environment variable. Refer to [Puma Worker timeout](https://docs.gitlab.com/omnibus/settings/puma.html#worker-timeout). [Reconfigure](../restart_gitlab.md#omnibus-gitlab-reconfigure) GitLab for the changes to take effect. @@ -237,7 +237,7 @@ are concerned about affecting others during a production system, you can run a separate Rails process to debug the issue: 1. Log in to your GitLab account. -1. Copy the URL that is causing problems (e.g. `https://gitlab.com/ABC`). +1. Copy the URL that is causing problems (for example, `https://gitlab.com/ABC`). 1. Create a Personal Access Token for your user (User Settings -> Access Tokens). 1. Bring up the [GitLab Rails console.](../operations/rails_console.md#starting-a-rails-console-session) 1. At the Rails console, run: @@ -258,12 +258,12 @@ separate Rails process to debug the issue: ### GitLab: API is not accessible This often occurs when GitLab Shell attempts to request authorization via the -[internal API](../../development/internal_api.md) (e.g., `http://localhost:8080/api/v4/internal/allowed`), and +[internal API](../../development/internal_api.md) (for example, `http://localhost:8080/api/v4/internal/allowed`), and something in the check fails. There are many reasons why this may happen: -1. Timeout connecting to a database (e.g., PostgreSQL or Redis) +1. Timeout connecting to a database (for example, PostgreSQL or Redis) 1. Error in Git hooks or push rules -1. Error accessing the repository (e.g., stale NFS handles) +1. Error accessing the repository (for example, stale NFS handles) To diagnose this problem, try to reproduce the problem and then see if there is a Unicorn worker that is spinning via `top`. Try to use the `gdb` @@ -285,5 +285,5 @@ The output in `/tmp/puma.txt` may help diagnose the root cause. 
## More information -- [Debugging Stuck Ruby Processes](https://blog.newrelic.com/engineering/debugging-stuck-ruby-processes-what-to-do-before-you-kill-9/) +- [Debugging Stuck Ruby Processes](https://newrelic.com/blog/engineering/debugging-stuck-ruby-processes-what-to-do-before-you-kill-9/) - [Cheat sheet of using GDB and Ruby processes](gdb-stuck-ruby.txt) diff --git a/doc/administration/troubleshooting/defcon.md b/doc/administration/troubleshooting/defcon.md index 7cae6ea1c8f..1b263f70b46 100644 --- a/doc/administration/troubleshooting/defcon.md +++ b/doc/administration/troubleshooting/defcon.md @@ -10,7 +10,7 @@ type: reference This document describes a feature that allows you to disable some important but computationally expensive parts of the application to relieve stress on the database during an ongoing downtime. -## `ci_queueing_disaster_recovery` +## `ci_queueing_disaster_recovery_disable_fair_scheduling` This feature flag, if temporarily enabled, disables fair scheduling on shared runners. This can help to reduce system resource usage on the `jobs/request` endpoint @@ -20,6 +20,16 @@ Side effects: - In case of a large backlog of jobs, the jobs are processed in the order they were put in the system, instead of balancing the jobs across many projects. + +## `ci_queueing_disaster_recovery_disable_quota` + +This feature flag, if temporarily enabled, disables enforcing CI minutes quota +on shared runners. This can help to reduce system resource usage on the +`jobs/request` endpoint by significantly reducing the computations being +performed. + +Side effects: + - Projects which are out of quota will be run. This affects only jobs created during the last hour, as prior jobs are canceled by a periodic background worker (`StuckCiJobsWorker`). 
diff --git a/doc/administration/troubleshooting/elasticsearch.md b/doc/administration/troubleshooting/elasticsearch.md index d04ce23188f..79295856da8 100644 --- a/doc/administration/troubleshooting/elasticsearch.md +++ b/doc/administration/troubleshooting/elasticsearch.md @@ -53,7 +53,7 @@ graph TD; B5 --> |No| B7 B7 --> B8 B{Is GitLab using<br>Elasticsearch for<br>searching?} - B1[Check Admin Area > Integrations<br>to ensure the settings are correct] + B1[From the Admin Area, select<br>Integrations from the left<br>sidebar to ensure the settings<br>are correct.] B2[Perform a search via<br>the rails console] B3[If all settings are correct<br>and it still doesn't show Elasticsearch<br>doing the searches, escalate<br>to GitLab support.] B4[Perform<br>the same search via the<br>Elasticsearch API] @@ -196,7 +196,9 @@ Troubleshooting search result issues is rather straight forward on Elasticsearch The first step is to confirm GitLab is using Elasticsearch for the search function. To do this: -1. Confirm the integration is enabled in **Admin Area > Settings > General**. +1. On the top bar, select **Menu >** **{admin}** **Admin**. +1. On the left sidebar, select **Settings > General**, and then confirm the + integration is enabled. 1. 
Confirm searches use Elasticsearch by accessing the rails console (`sudo gitlab-rails console`) and running the following commands: diff --git a/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md b/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md index 92070a86a0d..08755dd3285 100644 --- a/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md +++ b/doc/administration/troubleshooting/gitlab_rails_cheat_sheet.md @@ -275,7 +275,24 @@ integration active: p = Project.find_by_sql("SELECT p.id FROM projects p LEFT JOIN services s ON p.id = s.project_id WHERE s.type = 'JiraService' AND s.active = true") p.each do |project| - project.jira_service.update_attribute(:password, '<your-new-password>') + project.jira_integration.update_attribute(:password, '<your-new-password>') +end +``` + +### Bulk update push rules for _all_ projects + +For example, enable **Check whether the commit author is a GitLab user** and **Do not allow users to remove Git tags with `git push`** checkboxes, and create a filter for allowing commits from a specific e-mail domain only: + +``` ruby +Project.find_each do |p| + pr = p.push_rule || PushRule.new(project: p) + # Check whether the commit author is a GitLab user + pr.member_check = true + # Do not allow users to remove Git tags with `git push` + pr.deny_delete_tag = true + # Commit author's email + pr.author_email_regex = '@domain\.com$' + pr.save! end ``` @@ -286,9 +303,9 @@ To change all Jira project to use the instance-level integration settings: 1. 
In a Rails console: ```ruby - jira_service_instance_id = JiraService.find_by(instance: true).id - JiraService.where(active: true, instance: false, template: false, inherit_from_id: nil).find_each do |service| - service.update_attribute(:inherit_from_id, jira_service_instance_id) + jira_integration_instance_id = Integrations::Jira.find_by(instance: true).id + Integrations::Jira.where(active: true, instance: false, template: false, inherit_from_id: nil).find_each do |integration| + integration.update_attribute(:inherit_from_id, jira_integration_instance_id) end ``` @@ -331,10 +348,10 @@ end puts "#{artifact_storage} bytes" ``` -### Identify deploy keys associated with blocked and non-member users +### Identify deploy keys associated with blocked and non-member users -When the user who created a deploy key is blocked or removed from the project, the key -can no longer be used to push to protected branches in a private project (see [issue #329742](https://gitlab.com/gitlab-org/gitlab/-/issues/329742)). +When the user who created a deploy key is blocked or removed from the project, the key +can no longer be used to push to protected branches in a private project (see [issue #329742](https://gitlab.com/gitlab-org/gitlab/-/issues/329742)). The following script identifies unusable deploy keys: ```ruby @@ -350,7 +367,7 @@ DeployKeysProject.with_write_access.find_each do |deploy_key_mapping| # can_push_for_ref? tests if deploy_key can push to default branch, which is likely to be protected can_push = access_checker.can_do_action?(:push_code) can_push_to_default = access_checker.can_push_for_ref?(project.repository.root_ref) - + next if access_checker.allowed? && can_push && can_push_to_default if user.nil? 
|| user.id == ghost_user_id @@ -557,7 +574,7 @@ User.billable.count ::HistoricalData.max_historical_user_count ``` -Using cURL and jq (up to a max 100, see the [pagination docs](../../api/README.md#pagination)): +Using cURL and jq (up to a max 100, see the [pagination docs](../../api/index.md#pagination)): ```shell curl --silent --header "Private-Token: ********************" \ @@ -814,12 +831,12 @@ build.dependencies.each do |d| { puts "status: #{d.status}, finished at: #{d.fin completed: #{d.complete?}, artifacts_expired: #{d.artifacts_expired?}, erased: #{d.erased?}" } ``` -### Try CI service +### Try CI integration ```ruby p = Project.find_by_full_path('<project_path>') m = project.merge_requests.find_by(iid: ) -m.project.try(:ci_service) +m.project.try(:ci_integration) ``` ### Validate the `.gitlab-ci.yml` @@ -1125,6 +1142,33 @@ registry = Geo::PackageFileRegistry.find(registry_id) registry.replicator.send(:download) ``` +#### Verify package files on the secondary manually + +This will iterate over all package files on the secondary, looking at the +`verification_checksum` stored in the database (which came from the primary) +and then calculate this value on the secondary to check if they match. 
This +won't change anything in the UI: + +```ruby +# Run on secondary +status = {} + +Packages::PackageFile.find_each do |package_file| + primary_checksum = package_file.verification_checksum + secondary_checksum = Packages::PackageFile.hexdigest(package_file.file.path) + verification_status = (primary_checksum == secondary_checksum) + + status[verification_status.to_s] ||= [] + status[verification_status.to_s] << package_file.id +end + +# Count how many of each value we get +status.keys.each {|key| puts "#{key} count: #{status[key].count}"} + +# See the output in its entirety +status +``` + ### Repository types newer than project/wiki repositories - `SnippetRepository` @@ -1155,31 +1199,31 @@ registry = Geo::SnippetRepositoryRegistry.find(registry_id) registry.replicator.send(:sync_repository) ``` -### Generate usage ping +## Generate Service Ping -#### Generate or get the cached usage ping +### Generate or get the cached Service Ping ```ruby Gitlab::UsageData.to_json ``` -#### Generate a fresh new usage ping +### Generate a fresh new Service Ping -This will also refresh the cached usage ping displayed in the admin area +This will also refresh the cached Service Ping displayed in the admin area ```ruby Gitlab::UsageData.to_json(force_refresh: true) ``` -#### Generate and print +### Generate and print -Generates usage ping data in JSON format. +Generates Service Ping data in JSON format. ```shell rake gitlab:usage_data:generate ``` -#### Generate and send usage ping +### Generate and send Service Ping Prints the metrics saved in `conversational_development_index_metrics`. 
@@ -1219,7 +1263,7 @@ Open the rails console (`gitlab rails c`) and run the following command to see a ApplicationSetting.last.attributes ``` -Among other attributes, in the output you will notice that all the settings available in the [Elasticsearch Integration page](../../integration/elasticsearch.md), like: `elasticsearch_indexing`, `elasticsearch_url`, `elasticsearch_replicas`, `elasticsearch_pause_indexing`, etc. +Among other attributes, in the output you will notice that all the settings available in the [Elasticsearch Integration page](../../integration/elasticsearch.md), like: `elasticsearch_indexing`, `elasticsearch_url`, `elasticsearch_replicas`, `elasticsearch_pause_indexing`, and so on. #### Setting attributes diff --git a/doc/administration/troubleshooting/img/AzureAD-basic_SAML.png b/doc/administration/troubleshooting/img/AzureAD-basic_SAML.png Binary files differindex e86ad7572e8..7a0d83ab2dd 100644 --- a/doc/administration/troubleshooting/img/AzureAD-basic_SAML.png +++ b/doc/administration/troubleshooting/img/AzureAD-basic_SAML.png diff --git a/doc/administration/troubleshooting/img/AzureAD-claims.png b/doc/administration/troubleshooting/img/AzureAD-claims.png Binary files differindex aab92288704..576040be337 100644 --- a/doc/administration/troubleshooting/img/AzureAD-claims.png +++ b/doc/administration/troubleshooting/img/AzureAD-claims.png diff --git a/doc/administration/troubleshooting/img/azure_configure_group_claim.png b/doc/administration/troubleshooting/img/azure_configure_group_claim.png Binary files differindex 31df5fff625..9d8c5348273 100644 --- a/doc/administration/troubleshooting/img/azure_configure_group_claim.png +++ b/doc/administration/troubleshooting/img/azure_configure_group_claim.png diff --git a/doc/administration/troubleshooting/postgresql.md b/doc/administration/troubleshooting/postgresql.md index 341c6bfbc65..994c194c6db 100644 --- a/doc/administration/troubleshooting/postgresql.md +++ 
b/doc/administration/troubleshooting/postgresql.md @@ -55,7 +55,7 @@ This section is for links to information elsewhere in the GitLab documentation. - Including [troubleshooting](../postgresql/replication_and_failover.md#troubleshooting) `gitlab-ctl patroni check-leader` and PgBouncer errors. -- [Developer database documentation](../../development/README.md#database-guides), +- [Developer database documentation](../../development/index.md#database-guides), some of which is absolutely not for production use. Including: - Understanding EXPLAIN plans. diff --git a/doc/administration/troubleshooting/tracing_correlation_id.md b/doc/administration/troubleshooting/tracing_correlation_id.md index 7b9ce5c6d7b..1bb10e72290 100644 --- a/doc/administration/troubleshooting/tracing_correlation_id.md +++ b/doc/administration/troubleshooting/tracing_correlation_id.md @@ -27,7 +27,7 @@ activity with the site that you're visiting. See the links below for network mon documentation for some popular browsers. - [Network Monitor - Firefox Developer Tools](https://developer.mozilla.org/en-US/docs/Tools/Network_Monitor) -- [Inspect Network Activity In Chrome DevTools](https://developers.google.com/web/tools/chrome-devtools/network/) +- [Inspect Network Activity In Chrome DevTools](https://developer.chrome.com/docs/devtools/network) - [Safari Web Development Tools](https://developer.apple.com/safari/tools/) - [Microsoft Edge Network panel](https://docs.microsoft.com/en-us/microsoft-edge/devtools-guide-chromium/network/) diff --git a/doc/administration/whats-new.md b/doc/administration/whats-new.md index ae19e0f0341..d669d05e9f0 100644 --- a/doc/administration/whats-new.md +++ b/doc/administration/whats-new.md @@ -13,7 +13,7 @@ GitLab versions in the **What's new** feature. To access it: 1. Select **What's new** from the menu. The **What's new** describes new features available in multiple -[GitLab tiers](https://about.gitlab.com/pricing). 
While all users can see the +[GitLab tiers](https://about.gitlab.com/pricing/). While all users can see the feature list, the feature list is tailored to your subscription type: - Features only available to self-managed installations are not shown on GitLab.com. |