summaryrefslogtreecommitdiff
path: root/doc/integration
diff options
context:
space:
mode:
Diffstat (limited to 'doc/integration')
-rw-r--r--doc/integration/README.md2
-rw-r--r--doc/integration/elasticsearch.md77
-rw-r--r--doc/integration/github.md2
-rw-r--r--doc/integration/jira_development_panel.md6
-rw-r--r--doc/integration/openid_connect_provider.md6
-rw-r--r--doc/integration/recaptcha.md6
-rw-r--r--doc/integration/saml.md161
7 files changed, 160 insertions, 100 deletions
diff --git a/doc/integration/README.md b/doc/integration/README.md
index f06cb5e4bb6..fdaff74ca58 100644
--- a/doc/integration/README.md
+++ b/doc/integration/README.md
@@ -56,6 +56,8 @@ GitLab can be integrated with the following enhancements:
- Configure [PlantUML](../administration/integration/plantuml.md) to use diagrams in AsciiDoc documents.
- Attach merge requests to [Trello](trello_power_up.md) cards.
- Enable integrated code intelligence powered by [Sourcegraph](sourcegraph.md).
+- Add [Elasticsearch](elasticsearch.md) for [Advanced Global Search](../user/search/advanced_global_search.md),
+ [Advanced System Search](../user/search/advanced_search_syntax.md), and faster searching.
## Integrations
diff --git a/doc/integration/elasticsearch.md b/doc/integration/elasticsearch.md
index 07ef5dbb99d..8ca45547f11 100644
--- a/doc/integration/elasticsearch.md
+++ b/doc/integration/elasticsearch.md
@@ -125,10 +125,7 @@ for each Elasticsearch node, per the [official guidelines](https://www.elastic.c
Keep in mind, this is the **minimum requirements** as per Elasticsearch. For
production instances, they recommend considerably more resources.
-Storage requirements also vary based on the installation side, but as a rule of
-thumb, you should allocate the total size of your production database, **plus**
-two-thirds of the total size of your Git repositories. Efforts to reduce this
-total are being tracked in [epic &153](https://gitlab.com/groups/gitlab-org/-/epics/153).
+The necessary storage space largely depends on the size of the repositories you want to store in GitLab but as a rule of thumb you should have at least 50% of free space as all your repositories combined take up.
## Enabling Elasticsearch
@@ -153,8 +150,8 @@ The following Elasticsearch settings are available:
| `AWS Access Key` | The AWS access key. |
| `AWS Secret Access Key` | The AWS secret access key. |
| `Maximum field length` | See [the explanation in instance limits.](../administration/instance_limits.md#maximum-field-length). |
-| `Maximum bulk request size (MiB)` | The Maximum Bulk Request size is used by GitLab's Golang-based indexer processes and indicates how much data it ought to collect (and store in memory) in a given indexing process before submitting the payload to Elasticsearch’s Bulk API. This setting works in tandem with the Bulk request concurrency setting (see below) and needs to accomodate the resource constraints of both the Elasticsearch host(s) and the host(s) running GitLab's Golang-based indexer either from the `gitlab-rake` command or the Sidekiq tasks. |
-| `Bulk request concurrency` | The Bulk request concurrency indicates how many of GitLab's Golang-based indexer processes (or threads) can run in parallel to collect data to subsequently submit to Elasticsearch’s Bulk API. This increases indexing performance, but fills the Elasticsearch bulk requests queue faster. This setting works in tandem with the Maximum bulk request size setting (see above) and needs to accomodate the resource constraints of both the Elasticsearch host(s) and the host(s) running GitLab's Golang-based indexer either from the `gitlab-rake` command or the Sidekiq tasks. |
+| `Maximum bulk request size (MiB)` | The Maximum Bulk Request size is used by GitLab's Golang-based indexer processes and indicates how much data it ought to collect (and store in memory) in a given indexing process before submitting the payload to Elasticsearch’s Bulk API. This setting should be used with the Bulk request concurrency setting (see below) and needs to accommodate the resource constraints of both the Elasticsearch host(s) and the host(s) running GitLab's Golang-based indexer either from the `gitlab-rake` command or the Sidekiq tasks. |
+| `Bulk request concurrency` | The Bulk request concurrency indicates how many of GitLab's Golang-based indexer processes (or threads) can run in parallel to collect data to subsequently submit to Elasticsearch’s Bulk API. This increases indexing performance, but fills the Elasticsearch bulk requests queue faster. This setting should be used together with the Maximum bulk request size setting (see above) and needs to accommodate the resource constraints of both the Elasticsearch host(s) and the host(s) running GitLab's Golang-based indexer either from the `gitlab-rake` command or the Sidekiq tasks. |
### Limiting namespaces and projects
@@ -170,10 +167,10 @@ You can filter the selection dropdown by writing part of the namespace or projec
![limit namespace filter](img/limit_namespace_filter.png)
-NOTE: **Note**:
+NOTE: **Note:**
If no namespaces or projects are selected, no Elasticsearch indexing will take place.
-CAUTION: **Warning**:
+CAUTION: **Warning:**
If you have already indexed your instance, you will have to regenerate the index in order to delete all existing data
for filtering to work correctly. To do this run the Rake tasks `gitlab:elastic:recreate_index` and
`gitlab:elastic:clear_index_status`. Afterwards, removing a namespace or a project from the list will delete the data
@@ -187,7 +184,7 @@ To disable the Elasticsearch integration:
1. Expand the **Elasticsearch** section and uncheck **Elasticsearch indexing**
and **Search with Elasticsearch enabled**.
1. Click **Save changes** for the changes to take effect.
-1. (Optional) Delete the existing index by running one of these commands:
+1. (Optional) Delete the existing index:
```shell
# Omnibus installations
@@ -209,7 +206,7 @@ To backfill existing data, you can use one of the methods below to index it in b
To index via the Admin Area:
1. [Configure your Elasticsearch host and port](#enabling-elasticsearch).
-1. Create empty indexes using one of the following commands:
+1. Create empty indexes:
```shell
# Omnibus installations
@@ -222,7 +219,7 @@ To index via the Admin Area:
1. [Enable **Elasticsearch indexing**](#enabling-elasticsearch).
1. Click **Index all projects** in **Admin Area > Settings > Integrations > Elasticsearch**.
1. Click **Check progress** in the confirmation message to see the status of the background jobs.
-1. Personal snippets need to be indexed manually by running one of these commands:
+1. Personal snippets need to be indexed manually:
```shell
# Omnibus installations
@@ -240,13 +237,13 @@ Indexing can be performed using Rake tasks.
#### Indexing small instances
-CAUTION: **Warning**:
+CAUTION: **Warning:**
This will delete your existing indexes.
If the database size is less than 500 MiB, and the size of all hosted repos is less than 5 GiB:
-1. [Enable **Elasticsearch indexing** and configure your host and port](#enabling-elasticsearch).
-1. Index your data using one of the following commands:
+1. [Configure your Elasticsearch host and port](#enabling-elasticsearch).
+1. Index your data:
```shell
# Omnibus installations
@@ -260,13 +257,13 @@ If the database size is less than 500 MiB, and the size of all hosted repos is l
#### Indexing large instances
-CAUTION: **Warning**:
+CAUTION: **Warning:**
Performing asynchronous indexing will generate a lot of Sidekiq jobs.
Make sure to prepare for this task by having a [Scalable and Highly Available Setup](README.md)
or creating [extra Sidekiq processes](../administration/operations/extra_sidekiq_processes.md)
1. [Configure your Elasticsearch host and port](#enabling-elasticsearch).
-1. Create empty indexes using one of the following commands:
+1. Create empty indexes:
```shell
# Omnibus installations
@@ -276,6 +273,16 @@ or creating [extra Sidekiq processes](../administration/operations/extra_sidekiq
bundle exec rake gitlab:elastic:create_empty_index RAILS_ENV=production
```
+1. If this is a re-index of your GitLab instance, clear the index status:
+
+ ```shell
+ # Omnibus installations
+ sudo gitlab-rake gitlab:elastic:clear_index_status
+
+ # Installations from source
+ bundle exec rake gitlab:elastic:clear_index_status RAILS_ENV=production
+ ```
+
1. [Enable **Elasticsearch indexing**](#enabling-elasticsearch).
1. Indexing large Git repositories can take a while. To speed up the process, you
can temporarily disable auto-refreshing and replicating. In our experience, you can expect a 20%
@@ -350,8 +357,7 @@ or creating [extra Sidekiq processes](../administration/operations/extra_sidekiq
indexer to "forget" all progress, so it will retry the indexing process from the
start.
-1. Personal snippets are not associated with a project and need to be indexed separately
- by running one of these commands:
+1. Personal snippets are not associated with a project and need to be indexed separately:
```shell
# Omnibus installations
@@ -415,7 +421,7 @@ The following are some available Rake tasks:
| Task | Description |
|:--------------------------------------------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
-| [`sudo gitlab-rake gitlab:elastic:index`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Wrapper task for `gitlab:elastic:create_empty_index`, `gitlab:elastic:clear_index_status`, `gitlab:elastic:index_projects`, and `gitlab:elastic:index_snippets`. |
+| [`sudo gitlab-rake gitlab:elastic:index`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Enables Elasticsearch Indexing and run `gitlab:elastic:create_empty_index`, `gitlab:elastic:clear_index_status`, `gitlab:elastic:index_projects`, and `gitlab:elastic:index_snippets`. |
| [`sudo gitlab-rake gitlab:elastic:index_projects`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Iterates over all projects and queues Sidekiq jobs to index them in the background. |
| [`sudo gitlab-rake gitlab:elastic:index_projects_status`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Determines the overall status of the indexing. It is done by counting the total number of indexed projects, dividing by a count of the total number of projects, then multiplying by 100. |
| [`sudo gitlab-rake gitlab:elastic:clear_index_status`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Deletes all instances of IndexStatus for all projects. |
@@ -424,6 +430,7 @@ The following are some available Rake tasks:
| [`sudo gitlab-rake gitlab:elastic:recreate_index[<TARGET_NAME>]`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Wrapper task for `gitlab:elastic:delete_index[<TARGET_NAME>]` and `gitlab:elastic:create_empty_index[<TARGET_NAME>]`. |
| [`sudo gitlab-rake gitlab:elastic:index_snippets`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Performs an Elasticsearch import that indexes the snippets data. |
| [`sudo gitlab-rake gitlab:elastic:projects_not_indexed`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Displays which projects are not indexed. |
+| [`sudo gitlab-rake gitlab:elastic:reindex_cluster`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/tasks/gitlab/elastic.rake) | Schedules a zero-downtime cluster reindexing task. This feature should be used with an index that was created after GitLab 13.0. |
NOTE: **Note:**
The `TARGET_NAME` parameter is optional and will use the default index/alias name from the current `RAILS_ENV` if not set.
@@ -466,6 +473,25 @@ When performing a search, the GitLab index will use the following scopes:
## Tuning
+### Guidance on choosing optimal cluster configuration
+
+For basic guidance on choosing a cluster configuration you may refer to [Elastic Cloud Calculator](https://cloud.elastic.co/pricing). You can find more information below.
+
+- Generally, you will want to use at least a 2-node cluster configuration with one replica, which will allow you to have resilience. If your storage usage is growing quickly, you may want to plan horizontal scaling (adding more nodes) beforehand.
+- It's not recommended to use HDD storage with the search cluster, because it will take a hit on performance. It's better to use SSD storage (NVMe or SATA SSD drives for example).
+- You can use the [GitLab Performance Tool](https://gitlab.com/gitlab-org/quality/performance) to benchmark search performance with different search cluster sizes and configurations.
+- `Heap size` should be set to no more than 50% of your physical RAM. Additionally, it shouldn't be set to more than the threshold for zero-based compressed oops. The exact threshold varies, but 26 GB is safe on most systems, but can also be as large as 30 GB on some systems. See [Setting the heap size](https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html#heap-size) for more details.
+- Number of CPUs (CPU cores) per node usually corresponds to the `Number of Elasticsearch shards` setting described below.
+- A good guideline is to ensure you keep the number of shards per node below 20 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health.
+- Small shards result in small segments, which increases overhead. Aim to keep the average shard size between at least a few GB and a few tens of GB. Another consideration is the number of documents, you should aim for this simple formula for the number of shards: `number of expected documents / 5M +1`.
+- `refresh_interval` is a per index setting. You may want to adjust that from default `1s` to a bigger value if you don't need data in realtime. This will change how soon you will see fresh results. If that's important for you, you should leave it as close as possible to the default value.
+- You might want to raise [`indices.memory.index_buffer_size`](https://www.elastic.co/guide/en/elasticsearch/reference/current/indexing-buffer.html) to 30% or 40% if you have a lot of heavy indexing operations.
+
+### Elasticsearch integration settings guidance
+
+- The `Number of Elasticsearch shards` setting usually corresponds with the number of CPUs available in your cluster. For example, if you have a 3-node cluster with 4 cores each, this means you will benefit from having at least 3*4=12 shards in the cluster. Please note, it's only possible to change the shards number by using [Split index API](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-split-index.html) or by reindexing to a different index with a changed number of shards.
+- The `Number of Elasticsearch replicas` setting should most of the time be equal to `1` (each shard will have 1 replica). Using `0` is not recommended, because losing one node will corrupt the index.
+
### Deleted documents
Whenever a change or deletion is made to an indexed GitLab object (a merge request description is changed, a file is deleted from the master branch in a repository, a project is deleted, etc), a document in the index is deleted. However, since these are "soft" deletes, the overall number of "deleted documents", and therefore wasted space, increases. Elasticsearch does intelligent merging of segments in order to remove these deleted documents. However, depending on the amount and type of activity in your GitLab installation, it's possible to see as much as 50% wasted space in the index.
@@ -516,7 +542,7 @@ Here are some common pitfalls and how to overcome them:
If you see `"Kaminari::PaginatableArray"` you are using Elasticsearch.
- NOTE: **Note**:
+ NOTE: **Note:**
The above instructions are used to verify that GitLab is using Elasticsearch only when indexing all namespaces. This is not to be used for scenarios that only index a [subset of namespaces](#limiting-namespaces-and-projects).
- **I updated GitLab and now I can't find anything**
@@ -539,7 +565,7 @@ Here are some common pitfalls and how to overcome them:
pp s.search_objects.to_a
```
- NOTE: **Note**:
+ NOTE: **Note:**
The above instructions are not to be used for scenarios that only index a [subset of namespaces](#limiting-namespaces-and-projects).
See [Elasticsearch Index Scopes](#elasticsearch-index-scopes) for more information on searching for specific types of data.
@@ -556,12 +582,6 @@ Here are some common pitfalls and how to overcome them:
You can run `sudo gitlab-rake gitlab:elastic:projects_not_indexed` to display projects that aren't indexed.
-- **No new data is added to the Elasticsearch index when I push code**
-
- When performing the initial indexing of blobs, we lock all projects until the project finishes indexing. It could
- happen that an error during the process causes one or multiple projects to remain locked. In order to unlock them,
- run the `gitlab:elastic:clear_locked_projects` Rake task.
-
- **"Can't specify parent if no parent field has been configured"**
If you enabled Elasticsearch before GitLab 8.12 and have not rebuilt indexes you will get
@@ -609,7 +629,8 @@ Here are some common pitfalls and how to overcome them:
**For a single node Elasticsearch cluster the functional cluster health status will be yellow** (will never be green) because the primary shard is allocated but replicas can not be as there is no other node to which Elasticsearch can assign a replica. This also applies if you are using the
[Amazon Elasticsearch](https://docs.aws.amazon.com/elasticsearch-service/latest/developerguide/aes-handling-errors.html#aes-handling-errors-yellow-cluster-status) service.
- CAUTION: **Warning**: Setting the number of replicas to `0` is not something that we recommend (this is not allowed in the GitLab Elasticsearch Integration menu). If you are planning to add more Elasticsearch nodes (for a total of more than 1 Elasticsearch) the number of replicas will need to be set to an integer value larger than `0`. Failure to do so will result in lack of redundancy (losing one node will corrupt the index).
+ CAUTION: **Warning:**
+ Setting the number of replicas to `0` is not something that we recommend (this is not allowed in the GitLab Elasticsearch Integration menu). If you are planning to add more Elasticsearch nodes (for a total of more than 1 Elasticsearch) the number of replicas will need to be set to an integer value larger than `0`. Failure to do so will result in lack of redundancy (losing one node will corrupt the index).
If you have a **hard requirement to have a green status for your single node Elasticsearch cluster**, please make sure you understand the risks outlined in the previous paragraph and then simply run the following query to set the number of replicas to `0`(the cluster will no longer try to create any shard replicas):
diff --git a/doc/integration/github.md b/doc/integration/github.md
index 68971c510d5..ccce4672cbf 100644
--- a/doc/integration/github.md
+++ b/doc/integration/github.md
@@ -12,7 +12,7 @@ When you create an OAuth 2 app in GitHub, you'll need the following information:
- The authorization callback URL; in this case, `https://gitlab.example.com/users/auth`. Include the port number if your GitLab instance uses a non-default port.
NOTE: **Note:**
-To prevent an [OAuth2 covert redirect](http://tetraph.com/covert_redirect/) vulnerability, append `/users/auth` to the end of the GitHub authorization callback URL.
+To prevent an [OAuth2 covert redirect](https://oauth.net/advisories/2014-1-covert-redirect/) vulnerability, append `/users/auth` to the end of the GitHub authorization callback URL.
See [Initial OmniAuth Configuration](omniauth.md#initial-omniauth-configuration) for initial settings.
diff --git a/doc/integration/jira_development_panel.md b/doc/integration/jira_development_panel.md
index f032345b9e0..85404ff3164 100644
--- a/doc/integration/jira_development_panel.md
+++ b/doc/integration/jira_development_panel.md
@@ -16,7 +16,7 @@ A top-level GitLab group is one that does not have any parent group itself. All
as well as projects of the top-level group's subgroups nesting down, are connected. Alternatively, you can specify
a GitLab personal namespace in the Jira configuration, which will then connect the projects in that personal namespace to Jira.
-NOTE: **Note**:
+NOTE: **Note:**
Note this is different from the [existing Jira](../user/project/integrations/jira.md) project integration, where the mapping
is one GitLab project to the entire Jira instance.
@@ -55,7 +55,7 @@ There are no special requirements if you are using GitLab.com.
replacing `<your-gitlab-instance-domain>` appropriately. So for example, if you are using GitLab.com,
this would be `https://gitlab.com/login/oauth/callback`.
- NOTE: **Note**:
+ NOTE: **Note:**
If using a GitLab version earlier than 11.3, the `Redirect URI` must be
`https://<your-gitlab-instance-domain>/-/jira/login/oauth/callback`. If you want Jira
to have access to all projects, GitLab recommends an administrator creates the
@@ -90,7 +90,7 @@ There are no special requirements if you are using GitLab.com.
replacing `<your-gitlab-instance-domain>` appropriately. So for example, if you are using GitLab.com,
this would be `https://gitlab.com/`.
- NOTE: **Note**:
+ NOTE: **Note:**
If using a GitLab version earlier than 11.3 the `Host URL` value should be `https://<your-gitlab-instance-domain>/-/jira`
For the `Client ID` field, use the `Application ID` value from the previous section.
diff --git a/doc/integration/openid_connect_provider.md b/doc/integration/openid_connect_provider.md
index 3da17347e91..b66262772da 100644
--- a/doc/integration/openid_connect_provider.md
+++ b/doc/integration/openid_connect_provider.md
@@ -37,11 +37,11 @@ Currently the following user information is shared with clients:
| `auth_time` | `integer` | The timestamp for the user's last authentication
| `name` | `string` | The user's full name
| `nickname` | `string` | The user's GitLab username
-| `email` | `string` | The user's public email address
-| `email_verified` | `boolean` | Whether the user's public email address was verified
+| `email` | `string` | The user's email address<br>This is the user's *primary* email address if the application has access to the `email` claim and the user's *public* email address otherwise
+| `email_verified` | `boolean` | Whether the user's email address was verified
| `website` | `string` | URL for the user's website
| `profile` | `string` | URL for the user's GitLab profile
| `picture` | `string` | URL for the user's GitLab avatar
| `groups` | `array` | Names of the groups the user is a member of
-Only the `sub` and `sub_legacy` claims are included in the ID token, all other claims are available from the `/oauth/userinfo` endpoint used by OIDC clients.
+The claims `sub`, `sub_legacy`, `email` and `email_verified` are included in the ID token, all other claims are available from the `/oauth/userinfo` endpoint used by OIDC clients.
diff --git a/doc/integration/recaptcha.md b/doc/integration/recaptcha.md
index bf2e6568ea6..48c83f1393b 100644
--- a/doc/integration/recaptcha.md
+++ b/doc/integration/recaptcha.md
@@ -15,6 +15,12 @@ To use reCAPTCHA, first you must create a site and private key.
1. Fill all reCAPTCHA fields with keys from previous steps.
1. Check the `Enable reCAPTCHA` checkbox.
1. Save the configuration.
+1. Change the first line of the `#execute` method in `app/services/spam/spam_verdict_service.rb`
+ to `return CONDITONAL_ALLOW` so that the spam check short-circuits and triggers the response to
+ return `recaptcha_html`.
+
+NOTE: **Note:**
+Make sure you are viewing an issuable in a project that is public, and if you're working with an issue, the issue is public.
## Enabling reCAPTCHA for user logins via passwords
diff --git a/doc/integration/saml.md b/doc/integration/saml.md
index 97384d8f6cf..9dce7269c86 100644
--- a/doc/integration/saml.md
+++ b/doc/integration/saml.md
@@ -1,11 +1,29 @@
+---
+type: reference
+stage: Manage
+group: Access
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+---
+
# SAML OmniAuth Provider **(CORE ONLY)**
-Note that:
+This page describes instance-wide SAML for self-managed GitLab instances. For SAML on GitLab.com, see [SAML SSO for GitLab.com groups](../user/group/saml_sso/index.md).
+
+You should also reference the [OmniAuth documentation](omniauth.md) for general settings that apply to all OmniAuth providers.
-- SAML OmniAuth Provider is for SAML on self-managed GitLab instances. For SAML on
- GitLab.com, see [SAML SSO for GitLab.com Groups](../user/group/saml_sso/index.md).
-- Starting from GitLab 11.4, OmniAuth is enabled by default. If you're using an
- earlier version, you'll need to explicitly enable it.
+## Common SAML Terms
+
+| Term | Description |
+|------|-------------|
+| Identity Provider (IdP) | The service which manages your user identities such as ADFS, Okta, Onelogin, or Ping Identity. |
+| Service Provider (SP) | SAML considers GitLab to be a service provider. |
+| Assertion | A piece of information about a user's identity, such as their name or role. Also know as claims or attributes. |
+| SSO | Single Sign-On. |
+| Assertion consumer service URL | The callback on GitLab where users will be redirected after successfully authenticating with the identity provider. |
+| Issuer | How GitLab identifies itself to the identity provider. Also known as a "Relying party trust identifier". |
+| Certificate fingerprint | Used to confirm that communications over SAML are secure by checking that the server is signing communications with the correct certificate. Also known as a certificate thumbprint. |
+
+## General Setup
GitLab can be configured to act as a SAML 2.0 Service Provider (SP). This allows
GitLab to consume assertions from a SAML 2.0 Identity Provider (IdP) such as
@@ -143,17 +161,10 @@ On the sign in page there should now be a SAML button below the regular sign in
Click the icon to begin the authentication process. If everything goes well the user
will be returned to GitLab and will be signed in.
-## Marking Users as External based on SAML Groups
-
->**Note:**
-This setting is only available on GitLab 8.7 and above.
+## SAML Groups
-SAML login includes support for automatically identifying whether a user should
-be considered an [external](../user/permissions.md) user based on the user's group
-membership in the SAML identity provider. This feature **does not** allow you to
-automatically add users to GitLab [Groups](../user/group/index.md), it simply
-allows you to mark users as External if they are members of certain groups in the
-Identity Provider.
+You can require users to be members of a certain group, or assign users `external`, `admin` or `auditor` roles based on group membership. This feature **does not** allow you to
+automatically add users to GitLab [Groups](../user/group/index.md).
### Requirements
@@ -164,74 +175,73 @@ with the regular SAML response. Here is an example:
```xml
<saml:AttributeStatement>
<saml:Attribute Name="Groups">
- <saml:AttributeValue xsi:type="xs:string">SecurityGroup</saml:AttributeValue>
<saml:AttributeValue xsi:type="xs:string">Developers</saml:AttributeValue>
- <saml:AttributeValue xsi:type="xs:string">Designers</saml:AttributeValue>
+ <saml:AttributeValue xsi:type="xs:string">Freelancers</saml:AttributeValue>
+ <saml:AttributeValue xsi:type="xs:string">Admins</saml:AttributeValue>
+ <saml:AttributeValue xsi:type="xs:string">Auditors</saml:AttributeValue>
</saml:Attribute>
</saml:AttributeStatement>
```
The name of the attribute can be anything you like, but it must contain the groups
to which a user belongs. In order to tell GitLab where to find these groups, you need
-to add a `groups_attribute:` element to your SAML settings. You will also need to
-tell GitLab which groups are external via the `external_groups:` element:
+to add a `groups_attribute:` element to your SAML settings.
+
+### Required groups **(STARTER ONLY)**
+
+Your IdP passes Group Information to the SP (GitLab) in the SAML Response. You need to configure GitLab to identify:
+
+- Where to look for the groups in the SAML response via the `groups_attribute` setting
+- Which group membership is requisite to sign in via the `required_groups` setting
+
+When `required_groups` is not set or it is empty, anyone with proper authentication
+will be able to use the service.
+
+Example:
```yaml
{ name: 'saml',
label: 'Our SAML Provider',
groups_attribute: 'Groups',
- external_groups: ['Freelancers', 'Interns'],
+ required_groups: ['Developers', 'Freelancers', 'Admins', 'Auditors'],
args: {
assertion_consumer_service_url: 'https://gitlab.example.com/users/auth/saml/callback',
idp_cert_fingerprint: '43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8',
idp_sso_target_url: 'https://login.example.com/idp',
issuer: 'https://gitlab.example.com',
- name_identifier_format: 'urn:oasis:names:tc:SAML:2.0:nameid-format:persistent'
+ name_identifier_format: 'urn:oasis:names:tc:SAML:2.0:nameid-format:transient'
} }
```
-## Required groups **(STARTER ONLY)**
-
->**Note:**
-This setting is only available on GitLab 10.2 EE and above.
-
-This setting works like `External Groups` setting. Just like there, your IdP has to
-pass Group Information to GitLab, you have to tell GitLab where to look or the
-groups SAML response, and which group membership should be requisite for logging in.
-When `required_groups` is not set or it is empty, anyone with proper authentication
-will be able to use the service.
+### External Groups **(STARTER ONLY)**
-Example:
+SAML login supports automatic identification on whether a user should be considered an [external](../user/permissions.md) user. This is based on the user's group membership in the SAML identity provider.
```yaml
{ name: 'saml',
label: 'Our SAML Provider',
groups_attribute: 'Groups',
- required_groups: ['Developers', 'Managers', 'Admins'],
+ external_groups: ['Freelancers'],
args: {
assertion_consumer_service_url: 'https://gitlab.example.com/users/auth/saml/callback',
idp_cert_fingerprint: '43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8',
idp_sso_target_url: 'https://login.example.com/idp',
issuer: 'https://gitlab.example.com',
- name_identifier_format: 'urn:oasis:names:tc:SAML:2.0:nameid-format:transient'
+ name_identifier_format: 'urn:oasis:names:tc:SAML:2.0:nameid-format:persistent'
} }
```
-## Admin Groups **(STARTER ONLY)**
-
->**Note:**
-This setting is only available on GitLab 8.8 EE and above.
+### Admin Groups **(STARTER ONLY)**
-This setting works very similarly to the `External Groups` setting. The requirements
-are the same, your IdP needs to pass Group information to GitLab, you need to tell
-GitLab where to look for the groups in the SAML response, and which group should be
-considered `admin groups`.
+The requirements are the same as the previous settings, your IdP needs to pass Group information to GitLab, you need to tell
+GitLab where to look for the groups in the SAML response, and which group(s) should be
+considered admin users.
```yaml
{ name: 'saml',
label: 'Our SAML Provider',
groups_attribute: 'Groups',
- admin_groups: ['Managers', 'Admins'],
+ admin_groups: ['Admins'],
args: {
assertion_consumer_service_url: 'https://gitlab.example.com/users/auth/saml/callback',
idp_cert_fingerprint: '43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8',
@@ -241,18 +251,20 @@ considered `admin groups`.
} }
```
-## Auditor Groups **(STARTER ONLY)**
+### Auditor Groups **(STARTER ONLY)**
>**Note:**
This setting is only available on GitLab 11.4 EE and above.
-This setting also follows the requirements documented for the `External Groups` setting. GitLab uses the Group information provided by your IdP to determine if a user should be assigned the `auditor` role.
+The requirements are the same as the previous settings, your IdP needs to pass Group information to GitLab, you need to tell
+GitLab where to look for the groups in the SAML response, and which group(s) should be
+considered auditor users.
```yaml
{ name: 'saml',
label: 'Our SAML Provider',
groups_attribute: 'Groups',
- auditor_groups: ['Auditors', 'Security'],
+ auditor_groups: ['Auditors'],
args: {
assertion_consumer_service_url: 'https://gitlab.example.com/users/auth/saml/callback',
idp_cert_fingerprint: '43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8',
@@ -265,7 +277,18 @@ This setting also follows the requirements documented for the `External Groups`
## Bypass two factor authentication
If you want some SAML authentication methods to count as 2FA on a per session basis, you can register them in the
-`upstream_two_factor_authn_contexts` list:
+`upstream_two_factor_authn_contexts` list.
+
+In addition to the changes in GitLab, make sure that your IdP is returning the
+`AuthnContext`. For example:
+
+```xml
+<saml:AuthnStatement>
+ <saml:AuthnContext>
+ <saml:AuthnContextClassRef>urn:oasis:names:tc:SAML:2.0:ac:classes:MediumStrongCertificateProtectedTransport</saml:AuthnContextClassRef>
+ </saml:AuthnContext>
+</saml:AuthnStatement>
+```
**For Omnibus installations:**
@@ -324,18 +347,7 @@ If you want some SAML authentication methods to count as 2FA on a per session ba
}
```
-1. Save the file and [restart GitLab](../administration/restart_gitlab.md#installations-from-source) for the changes ot take effect
-
-In addition to the changes in GitLab, make sure that your Idp is returning the
-`AuthnContext`. For example:
-
-```xml
-<saml:AuthnStatement>
- <saml:AuthnContext>
- <saml:AuthnContextClassRef>urn:oasis:names:tc:SAML:2.0:ac:classes:MediumStrongCertificateProtectedTransport</saml:AuthnContextClassRef>
- </saml:AuthnContext>
-</saml:AuthnStatement>
-```
+1. Save the file and [restart GitLab](../administration/restart_gitlab.md#installations-from-source) for the changes to take effect
## Customization
@@ -368,7 +380,6 @@ You may also bypass the auto signin feature by browsing to
### `attribute_statements`
>**Note:**
-This setting is only available on GitLab 8.6 and above.
This setting should only be used to map attributes that are part of the
OmniAuth `info` hash schema.
@@ -396,6 +407,21 @@ args: {
}
```
+#### Setting a username
+
+By default, the email in the SAML response will be used to automatically generate the user's GitLab username. If you'd like to set another attribute as the username, assign it to the `nickname` OmniAuth `info` hash attribute. For example, if you wanted to set the `username` attribute in your SAML Response to the username in GitLab, use the following setting:
+
+```yaml
+args: {
+ assertion_consumer_service_url: 'https://gitlab.example.com/users/auth/saml/callback',
+ idp_cert_fingerprint: '43:51:43:a1:b5:fc:8b:b7:0a:3a:a9:b1:0f:66:73:a8',
+ idp_sso_target_url: 'https://login.example.com/idp',
+ issuer: 'https://gitlab.example.com',
+ name_identifier_format: 'urn:oasis:names:tc:SAML:2.0:nameid-format:persistent',
+ attribute_statements: { nickname: ['username'] }
+}
+```
+
### `allowed_clock_drift`
The clock of the Identity Provider may drift slightly ahead of your system clocks.
@@ -557,10 +583,12 @@ Avoid user control of the following attributes:
These attributes define the SAML user. If users can change these attributes, they can impersonate others.
-Refer to the documentation for your [SAML Identity Provider](../user/group/saml_sso/index.md#providers) for information on how to fix these attributes.
+Refer to the documentation for your SAML Identity Provider for information on how to fix these attributes.
## Troubleshooting
+You can find the base64-encoded SAML Response in the [`production_json.log`](../administration/logs.md#production_jsonlog).
+
### GitLab+SAML Testing Environments
If you need to troubleshoot, [a complete GitLab+SAML testing environment using Docker compose](https://gitlab.com/gitlab-com/support/toolbox/replication/tree/master/compose_files) is available.
@@ -580,13 +608,14 @@ Make sure the IdP provides a claim containing the user's email address, using cl
If after signing in into your SAML server you are redirected back to the sign in page and
no error is displayed, check your `production.log` file. It will most likely contain the
message `Can't verify CSRF token authenticity`. This means that there is an error during
-the SAML request, but this error never reaches GitLab due to the CSRF check.
+the SAML request, but in GitLab 11.7 and earlier this error never reaches GitLab due to
+the CSRF check.
To bypass this you can add `skip_before_action :verify_authenticity_token` to the
`omniauth_callbacks_controller.rb` file immediately after the `class` line and
-comment out the `protect_from_forgery` line using a `#` then restart Unicorn. This
-will allow the error to hit GitLab, where it can then be seen in the usual logs,
-or as a flash message on the login screen.
+comment out the `protect_from_forgery` line using a `#`. Restart Unicorn for this
+change to take effect. This will allow the error to hit GitLab, where it can then
+be seen in the usual logs, or as a flash message on the login screen.
That file is located in `/opt/gitlab/embedded/service/gitlab-rails/app/controllers`
for Omnibus installations and by default in `/home/git/gitlab/app/controllers` for
@@ -613,6 +642,8 @@ is not providing this information, all SAML requests will fail.
Make sure this information is provided.
+Another issue that can result in this error is when the correct information is being sent by the IdP, but the attributes don't match the names in the OmniAuth `info` hash. In this case, you'll need to set `attribute_statements` in the SAML configuration to [map the attribute names in your SAML Response to the corresponding OmniAuth `info` hash names](#attribute_statements).
+
### Key validation error, Digest mismatch or Fingerprint mismatch
These errors all come from a similar place, the SAML certificate. SAML requests