diff options
author | GitLab Bot <gitlab-bot@gitlab.com> | 2020-04-07 15:09:30 +0000 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2020-04-07 15:09:30 +0000 |
commit | c6b3ec3f56fa32a0e0ed3de0d0878d25f1adaddf (patch) | |
tree | 967afee9a510ff9dd503ebd83706dc760ec2e3ed /doc/administration | |
parent | 903ccf7c93eb9490c76857bffe744249cc07de09 (diff) | |
download | gitlab-ce-c6b3ec3f56fa32a0e0ed3de0d0878d25f1adaddf.tar.gz |
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc/administration')
-rw-r--r-- | doc/administration/high_availability/object_storage.md | 31 | ||||
-rw-r--r-- | doc/administration/job_artifacts.md | 2 | ||||
-rw-r--r-- | doc/administration/object_storage.md | 140 |
3 files changed, 143 insertions, 30 deletions
diff --git a/doc/administration/high_availability/object_storage.md b/doc/administration/high_availability/object_storage.md index 148753657cf..eeb730d3cc7 100644 --- a/doc/administration/high_availability/object_storage.md +++ b/doc/administration/high_availability/object_storage.md @@ -1,32 +1,5 @@ --- -type: reference +redirect_to: '../object_storage.md' --- -# Cloud Object Storage - -GitLab supports utilizing a Cloud Object Storage service rather than [NFS](nfs.md) for holding -numerous types of data. This is recommended in larger setups as object storage is -typically much more performant, reliable, and scalable. - -For configuring GitLab to use Object Storage refer to the following guides: - -1. Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk. -1. Configure [database lookup of SSH keys](../operations/fast_ssh_key_lookup.md) - to eliminate the need for a shared `authorized_keys` file. -1. Configure [object storage for backups](../../raketasks/backup_restore.md#uploading-backups-to-a-remote-cloud-storage). -1. Configure [object storage for job artifacts](../job_artifacts.md#using-object-storage) - including [incremental logging](../job_logs.md#new-incremental-logging-architecture). -1. Configure [object storage for LFS objects](../lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage). -1. Configure [object storage for uploads](../uploads.md#using-object-storage-core-only). -1. Configure [object storage for merge request diffs](../merge_request_diffs.md#using-object-storage). -1. Configure [object storage for container registry](../packages/container_registry.md#container-registry-storage-driver) (optional feature). -1. Configure [object storage for Mattermost](https://docs.mattermost.com/administration/config-settings.html#file-storage) (optional feature). -1. Configure [object storage for packages](../packages/index.md#using-object-storage) (optional feature). **(PREMIUM ONLY)** -1. Configure [object storage for dependency proxy](../packages/dependency_proxy.md#using-object-storage) (optional feature). **(PREMIUM ONLY)** -1. Configure [object storage for Pseudonymizer](../pseudonymizer.md#configuration) (optional feature). **(ULTIMATE ONLY)** - -NOTE: **Note:** -One current feature of GitLab that still requires a shared directory (NFS) is -[GitLab Pages](../../user/project/pages/index.md). -There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196) -to eliminate the need for NFS to support GitLab Pages. +This document was moved to [another location](../object_storage.md). diff --git a/doc/administration/job_artifacts.md b/doc/administration/job_artifacts.md index c003761299e..c45388087ab 100644 --- a/doc/administration/job_artifacts.md +++ b/doc/administration/job_artifacts.md @@ -90,7 +90,7 @@ This configuration relies on valid AWS credentials to be configured already. Use an object storage option like AWS S3 to store job artifacts. DANGER: **Danger:** -If you're enabling S3 in [GitLab HA](high_availability/README.md), you will need to have an [NFS mount set up for CI logs and artifacts](high_availability/nfs.md#a-single-nfs-mount) or enable [incremental logging](job_logs.md#new-incremental-logging-architecture). If these settings are not set, you will risk job logs disappearing or not being saved. +If you configure GitLab to store CI logs and artifacts on object storage, you must also enable [incremental logging](job_logs.md#new-incremental-logging-architecture). Otherwise, job logs will disappear or not be saved. #### Object Storage Settings diff --git a/doc/administration/object_storage.md b/doc/administration/object_storage.md new file mode 100644 index 00000000000..ba537c5019a --- /dev/null +++ b/doc/administration/object_storage.md @@ -0,0 +1,140 @@ +--- +type: reference +--- + +# Object Storage + +GitLab supports using an object storage service for holding numerous types of data. +In a high availability setup, it's recommended over [NFS](high_availability/nfs.md) and +in general it's better in larger setups as object storage is +typically much more performant, reliable, and scalable. + +## Options + +Object storage options that GitLab has tested, or is aware of customers using include: + +- SaaS/Cloud solutions such as [Amazon S3](https://aws.amazon.com/s3/), [Google cloud storage](https://cloud.google.com/storage). +- On-premises hardware and appliances from various storage vendors. +- MinIO. We have [a guide to deploying this](https://docs.gitlab.com/charts/advanced/external-object-storage/minio.html) within our Helm Chart documentation. + +## Configuration guides + +For configuring GitLab to use Object Storage refer to the following guides: + +1. Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk. +1. Configure [database lookup of SSH keys](operations/fast_ssh_key_lookup.md) + to eliminate the need for a shared `authorized_keys` file. +1. Configure [object storage for backups](../raketasks/backup_restore.md#uploading-backups-to-a-remote-cloud-storage). +1. Configure [object storage for job artifacts](job_artifacts.md#using-object-storage) + including [incremental logging](job_logs.md#new-incremental-logging-architecture). +1. Configure [object storage for LFS objects](lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage). +1. Configure [object storage for uploads](uploads.md#using-object-storage-core-only). +1. Configure [object storage for merge request diffs](merge_request_diffs.md#using-object-storage). +1. Configure [object storage for Container Registry](packages/container_registry.md#container-registry-storage-driver) (optional feature). +1. Configure [object storage for Mattermost](https://docs.mattermost.com/administration/config-settings.html#file-storage) (optional feature). +1. Configure [object storage for packages](packages/index.md#using-object-storage) (optional feature). **(PREMIUM ONLY)** +1. Configure [object storage for Dependency Proxy](packages/dependency_proxy.md#using-object-storage) (optional feature). **(PREMIUM ONLY)** +1. Configure [object storage for Pseudonymizer](pseudonymizer.md#configuration) (optional feature). **(ULTIMATE ONLY)** +1. Configure [object storage for autoscale Runner caching](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching) (optional - for improved performance). + +## Warnings, limitations, and known issues + +### Use separate buckets + +Using separate buckets for each data type is the recommended approach for GitLab. + +A limitation of our configuration is that each use of object storage is separately configured. +[We have an issue for improving this](https://gitlab.com/gitlab-org/gitlab/-/issues/23345) +and easily using one bucket with separate folders is one improvement that this might bring. + +There is at least one specific issue with using the same bucket: +when GitLab is deployed with the Helm chart restore from backup +[will not properly function](https://docs.gitlab.com/charts/advanced/external-object-storage/#lfs-artifacts-uploads-packages-external-diffs-pseudonymizer) +unless separate buckets are used. + +One risk of using a single bucket would be that if your organisation decided to +migrate GitLab to the Helm deployment in the future. GitLab would run, but the situation with +backups might not be realised until the organisation had a critical requirement for the backups to work. + +### S3 API compatability issues + +Not all S3 providers [are fully compatible](../raketasks/backup_restore.md#other-s3-providers) +with the Fog library that GitLab uses. Symptoms include: + +```plaintext +411 Length Required +``` + +### GitLab Pages requires NFS + +If you're working to [scale out](high_availability/README.md) your GitLab implementation and +one of your requirements is [GitLab Pages](../user/project/pages/index.md) this currently requires +NFS. There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196) +to remove this dependency. In the future, GitLab Pages may use +[object storage](https://gitlab.com/gitlab-org/gitlab/-/issues/208135). + +The dependency on disk storage also prevents Pages being deployed using the +[GitLab Helm chart](https://gitlab.com/gitlab-org/charts/gitlab/-/issues/37). + +### Incremental logging is required for CI to use object storage + +If you configure GitLab to use object storage for CI logs and artifacts, +[you must also enable incremental logging](job_artifacts.md#using-object-storage). + +### Proxy Download + +A number of the use cases for object storage allow client traffic to be redirected to the +object storage back end, like when Git clients request large files via LFS or when +downloading CI artifacts and logs. + +When the files are stored on local block storage or NFS, GitLab has to act as a proxy. +This is not the default behaviour with object storage. + +The `proxy_download` setting controls this behaviour: the default is generally `false`. +Verify this in the documentation for each use case. Set it to `true` so that GitLab proxies +the files. + +When not proxying files, GitLab returns an +[HTTP 302 redirect with a pre-signed, time-limited object storage URL](https://gitlab.com/gitlab-org/gitlab/-/issues/32117#note_218532298). +This can result in some of the following problems: + +- If GitLab is using non-secure HTTP to access the object storage, clients may generate +`https->http` downgrade errors and refuse to process the redirect. The solution to this +is for GitLab to use HTTPS. LFS, for example, will generate this error: + + ```plaintext + LFS: lfsapi/client: refusing insecure redirect, https->http + ``` + +- Clients will need to trust the certificate authority that issued the object storage +certificate, or may return common TLS errors such as: + + ```plaintext + x509: certificate signed by unknown authority + ``` + +- Clients will need network access to the object storage. Errors that might result +if this access is not in place include: + + ```plaintext + Received status code 403 from server: Forbidden + ``` + +Getting a `403 Forbidden` response is specifically called out on the +[package repository documentation](packages/index.md#using-object-storage) +as a side effect of how some build tools work. + +### ETag mismatch + +Using the default GitLab settings, some object storage back-ends such as +[MinIO](https://gitlab.com/gitlab-org/gitlab/-/issues/23188) +and [Alibaba](https://gitlab.com/gitlab-org/charts/gitlab/-/issues/1564) +might generate `ETag mismatch` errors. + +When using GitLab direct upload, the +[workaround for MinIO](https://gitlab.com/gitlab-org/charts/gitlab/-/issues/1564#note_244497658) +is to use the `--compat` parameter on the server. + +We are working on a fix to GitLab component Workhorse, and also +a workaround, in the mean time, to +[allow ETag verification to be disabled](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/18175). |