diff options
author | GitLab Bot <gitlab-bot@gitlab.com> | 2021-12-20 13:37:47 +0000 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2021-12-20 13:37:47 +0000 |
commit | aee0a117a889461ce8ced6fcf73207fe017f1d99 (patch) | |
tree | 891d9ef189227a8445d83f35c1b0fc99573f4380 /doc/administration/geo/replication/datatypes.md | |
parent | 8d46af3258650d305f53b819eabf7ab18d22f59e (diff) | |
download | gitlab-ce-aee0a117a889461ce8ced6fcf73207fe017f1d99.tar.gz |
Add latest changes from gitlab-org/gitlab@14-6-stable-eev14.6.0-rc42
Diffstat (limited to 'doc/administration/geo/replication/datatypes.md')
-rw-r--r-- | doc/administration/geo/replication/datatypes.md | 100 |
1 files changed, 52 insertions, 48 deletions
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md index c98436157fc..31bc473d74b 100644 --- a/doc/administration/geo/replication/datatypes.md +++ b/doc/administration/geo/replication/datatypes.md @@ -5,7 +5,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w type: howto --- -# Geo data types support **(PREMIUM SELF)** +# Supported Geo data types **(PREMIUM SELF)** A Geo data type is a specific class of data that is required by one or more GitLab features to store relevant information. @@ -14,7 +14,7 @@ To replicate data produced by these features with Geo, we use several strategies ## Data types -We currently distinguish between three different data types: +We distinguish between three different data types: - [Git repositories](#git-repositories) - [Blobs](#blobs) @@ -35,9 +35,9 @@ verification methods: | Git | Project Snippets | Geo with Gitaly | Gitaly Checksum | | Git | Personal Snippets | Geo with Gitaly | Gitaly Checksum | | Git | Group wiki repository | Geo with Gitaly | _Not implemented_ | -| Blobs | User uploads _(file system)_ | Geo with API | _Not implemented_ | +| Blobs | User uploads _(file system)_ | Geo with API | SHA256 checksum | | Blobs | User uploads _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | -| Blobs | LFS objects _(file system)_ | Geo with API | _Not implemented_ | +| Blobs | LFS objects _(file system)_ | Geo with API | SHA256 checksum | | Blobs | LFS objects _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | | Blobs | CI job artifacts _(file system)_ | Geo with API | _Not implemented_ | | Blobs | CI job artifacts _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | @@ -51,11 +51,11 @@ verification methods: | Blobs | Infrastructure registry _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | | Blobs | Versioned Terraform State _(file system)_ | Geo with API | SHA256 checksum | | Blobs | Versioned Terraform State _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | -| Blobs | External Merge Request Diffs _(file system)_ | Geo with API | _Not implemented_ | +| Blobs | External Merge Request Diffs _(file system)_ | Geo with API | SHA256 checksum | | Blobs | External Merge Request Diffs _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | | Blobs | Pipeline artifacts _(file system)_ | Geo with API | SHA256 checksum | -| Blobs | Pipeline artifacts _(object storage)_ | Geo with API/Managed (*2*) | SHA256 checksum | -| Blobs | Pages _(file system)_ | Geo with API | _Not implemented_ | +| Blobs | Pipeline artifacts _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | +| Blobs | Pages _(file system)_ | Geo with API | SHA256 checksum | | Blobs | Pages _(object storage)_ | Geo with API/Managed (*2*) | _Not implemented_ | - (*1*): Redis replication can be used as part of HA with Redis sentinel. It's not used between Geo sites. @@ -66,15 +66,20 @@ verification methods: A GitLab instance can have one or more repository shards. Each shard has a Gitaly instance that is responsible for allowing access and operations on the locally stored Git repositories. It can run -on a machine with a single disk, multiple disks mounted as a single mount-point (like with a RAID array), -or using LVM. +on a machine: -It requires no special file system and can work with NFS or a mounted Storage Appliance (there may be -performance limitations when using a remote file system). +- With a single disk. +- With multiple disks mounted as a single mount-point (like with a RAID array). +- Using LVM. -Geo will trigger garbage collection in Gitaly to [deduplicate forked repositories](../../../development/git_object_deduplication.md#git-object-deduplication-and-gitlab-geo) on Geo secondary sites. +GitLab does not require a special file system and can work with: -Communication is done via Gitaly's own gRPC API. There are three possible ways of synchronization: +- NFS. +- A mounted Storage Appliance (there may be performance limitations when using a remote file system). + +Geo triggers garbage collection in Gitaly to [deduplicate forked repositories](../../../development/git_object_deduplication.md#git-object-deduplication-and-gitlab-geo) on Geo secondary sites. + +Communication is done via Gitaly's own gRPC API, with three possible ways of synchronization: - Using regular Git clone/fetch from one Geo site to another (with special authentication). - Using repository snapshots (for when the first method fails or repository is corrupt). @@ -90,7 +95,7 @@ They all live in the same shard and share the same base name with a `-wiki` and for Wiki and Design Repository cases. Besides that, there are snippet repositories. They can be connected to a project or to some specific user. -Both types will be synced to a secondary site. +Both types are synced to a secondary site. ### Blobs @@ -102,7 +107,7 @@ GitLab stores files and blobs such as Issue attachments or LFS objects into eith - Hosted by you (like MinIO). - A Storage Appliance that exposes an Object Storage-compatible API. -When using the file system store instead of Object Storage, you need to use network mounted file systems +When using the file system store instead of Object Storage, use network mounted file systems to run GitLab when using more than one node. With respect to replication and verification: @@ -118,17 +123,17 @@ GitLab relies on data stored in multiple databases, for different use-cases. PostgreSQL is the single point of truth for user-generated content in the Web interface, like issues content, comments as well as permissions and credentials. -PostgreSQL can also hold some level of cached data like HTML rendered Markdown, cached merge-requests diff (this can -also be configured to be offloaded to object storage). +PostgreSQL can also hold some level of cached data like HTML-rendered Markdown and cached merge-requests diff. +This can also be configured to be offloaded to object storage. We use PostgreSQL's own replication functionality to replicate data from the **primary** to **secondary** sites. We use Redis both as a cache store and to hold persistent data for our background jobs system. Because both use-cases have data that are exclusive to the same Geo site, we don't replicate it between sites. -Elasticsearch is an optional database, that can enable advanced searching capabilities, like improved Advanced Search -in both source-code level and user generated content in Issues / Merge-Requests and discussions. Currently it's not -supported in Geo. +Elasticsearch is an optional database that for advanced searching capabilities. It can improve search +in both source-code level, and user generated content in issues, merge requests, and discussions. +Elasticsearch is not supported in Geo. ## Limitations on replication/verification @@ -142,7 +147,6 @@ these epics/issues: - [Geo: Improve the self-service Geo replication framework](https://gitlab.com/groups/gitlab-org/-/epics/3761) - [Geo: Move existing blobs to framework](https://gitlab.com/groups/gitlab-org/-/epics/3588) - [Geo: Add unreplicated data types](https://gitlab.com/groups/gitlab-org/-/epics/893) -- [Geo: Support GitLab Pages](https://gitlab.com/groups/gitlab-org/-/epics/589) ### Replicated data types behind a feature flag @@ -174,35 +178,35 @@ Feature.enable(:geo_package_file_replication) WARNING: Features not on this list, or with **No** in the **Replicated** column, are not replicated to a **secondary** site. Failing over without manually -replicating data from those features will cause the data to be **lost**. -If you wish to use those features on a **secondary** site, or to execute a failover +replicating data from those features causes the data to be **lost**. +To use those features on a **secondary** site, or to execute a failover successfully, you must replicate their data using some other means. -|Feature | Replicated (added in GitLab version) | Verified (added in GitLab version) | Object Storage replication (see [Geo with Object Storage](object_storage.md)) | Notes | -|:--------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------|:------------------------------------------------------------------------|:------------------------------------------------------------------------------|:------| -|[Application data in PostgreSQL](../../postgresql/index.md) | **Yes** (10.2) | **Yes** (10.2) | No | | -|[Project repository](../../../user/project/repository/) | **Yes** (10.2) | **Yes** (10.7) | No | | -|[Project wiki repository](../../../user/project/wiki/) | **Yes** (10.2) | **Yes** (10.7) | No | | -|[Group wiki repository](../../../user/project/wiki/group.md) | [**Yes** (13.10)](https://gitlab.com/gitlab-org/gitlab/-/issues/208147) | No | No | Behind feature flag `geo_group_wiki_repository_replication`, enabled by default. | -|[Uploads](../../uploads.md) | **Yes** (10.2) | [No](https://gitlab.com/groups/gitlab-org/-/epics/1817) | No | Verified only on transfer or manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. | -|[LFS objects](../../lfs/index.md) | **Yes** (10.2) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8922) | Via Object Storage provider if supported. Native Geo support (Beta). | Verified only on transfer or manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. GitLab versions 11.11.x and 12.0.x are affected by [a bug that prevents any new LFS objects from replicating](https://gitlab.com/gitlab-org/gitlab/-/issues/32696).<br /><br />Behind feature flag `geo_lfs_object_replication`, enabled by default. | -|[Personal snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | | -|[Project snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | | -|[CI job artifacts](../../../ci/pipelines/job_artifacts.md) | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Via Object Storage provider if supported. Native Geo support (Beta). | Verified only manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. Job logs also verified on transfer. | -|[CI Pipeline Artifacts](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/ci/pipeline_artifact.rb) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | Via Object Storage provider if supported. Native Geo support (Beta). | Persists additional artifacts after a pipeline completes | -|[Container Registry](../../packages/container_registry.md) | **Yes** (12.3) | No | No | Disabled by default. See [instructions](docker_registry.md) to enable. | -|[Content in object storage (beta)](object_storage.md) | **Yes** (12.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/13845) | No | | -|[Infrastructure Registry](../../../user/packages/infrastructure_registry/index.md) | **Yes** (14.0) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (14.0) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. | -|[Project designs repository](../../../user/project/issues/design_management.md) | **Yes** (12.7) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/32467) | No | Designs also require replication of LFS objects and Uploads. | -|[Package Registry](../../../user/packages/package_registry/index.md) | **Yes** (13.2) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.10) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. | -|[Versioned Terraform State](../../terraform_state.md) | **Yes** (13.5) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.12) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_terraform_state_version_replication`, enabled by default. Verification was behind the feature flag `geo_terraform_state_version_verification`, which was removed in 14.0| -|[External merge request diffs](../../merge_request_diffs.md) | **Yes** (13.5) | No | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_merge_request_diff_replication`, enabled by default. Verification is under development, behind the feature flag `geo_merge_request_diff_verification`, introduced in 14.0.| -|[Versioned snippets](../../../user/snippets.md#versioned-snippets) | [**Yes** (13.7)](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [**Yes** (14.2)](https://gitlab.com/groups/gitlab-org/-/epics/2810) | No | Verification was implemented behind the feature flag `geo_snippet_repository_verification` in 13.11, and the feature flag was removed in 14.2. | -|[GitLab Pages](../../pages/index.md) | [**Yes** (14.3)](https://gitlab.com/groups/gitlab-org/-/epics/589) | No | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_pages_deployment_replication`, enabled by default. | -|[Server-side Git hooks](../../server_hooks.md) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/1867) | No | No | Not planned because of current implementation complexity, low customer interest, and availability of alternatives to hooks. | -|[Elasticsearch integration](../../../integration/elasticsearch.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/1186) | No | No | Not planned because further product discovery is required and Elasticsearch (ES) clusters can be rebuilt. Secondaries currently use the same ES cluster as the primary. | -|[Dependency proxy images](../../../user/packages/dependency_proxy/index.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/259694) | No | No | Blocked by [Geo: Secondary Mimicry](https://gitlab.com/groups/gitlab-org/-/epics/1528). Replication of this cache is not needed for disaster recovery purposes because it can be recreated from external sources. | -|[Vulnerability Export](../../../user/application_security/vulnerability_report/#export-vulnerability-details) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/3111) | No | No | Not planned because they are ephemeral and sensitive information. They can be regenerated on demand. | +|Feature | Replicated (added in GitLab version) | Verified (added in GitLab version) | Object Storage replication (see [Geo with Object Storage](object_storage.md)) | Notes | +|:--------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------------|:---------------------------------------------------------------------------|:------------------------------------------------------------------------------|:------| +|[Application data in PostgreSQL](../../postgresql/index.md) | **Yes** (10.2) | **Yes** (10.2) | No | | +|[Project repository](../../../user/project/repository/) | **Yes** (10.2) | **Yes** (10.7) | No | | +|[Project wiki repository](../../../user/project/wiki/) | **Yes** (10.2) | **Yes** (10.7) | No | | +|[Group wiki repository](../../../user/project/wiki/group.md) | [**Yes** (13.10)](https://gitlab.com/gitlab-org/gitlab/-/issues/208147) | No | No | Behind feature flag `geo_group_wiki_repository_replication`, enabled by default. | +|[Uploads](../../uploads.md) | **Yes** (10.2) | **Yes** (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_upload_replication`, enabled by default. Verification is behind the feature flag `geo_upload_verification` introduced and enabled by default in 14.6. | +|[LFS objects](../../lfs/index.md) | **Yes** (10.2) | **Yes** (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | GitLab versions 11.11.x and 12.0.x are affected by [a bug that prevents any new LFS objects from replicating](https://gitlab.com/gitlab-org/gitlab/-/issues/32696).<br /><br />Replication is behind the feature flag `geo_lfs_object_replication`, enabled by default. Verification is behind the feature flag `geo_lfs_object_verification` introduced and enabled by default in 14.6. | +|[Personal snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | | +|[Project snippets](../../../user/snippets.md) | **Yes** (10.2) | **Yes** (10.2) | No | | +|[CI job artifacts](../../../ci/pipelines/job_artifacts.md) | **Yes** (10.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/8923) | Via Object Storage provider if supported. Native Geo support (Beta). | Verified only manually using [Integrity Check Rake Task](../../raketasks/check.md) on both sites and comparing the output between them. Job logs also verified on transfer. | +|[CI Pipeline Artifacts](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/ci/pipeline_artifact.rb) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | [**Yes** (13.11)](https://gitlab.com/gitlab-org/gitlab/-/issues/238464) | Via Object Storage provider if supported. Native Geo support (Beta). | Persists additional artifacts after a pipeline completes. | +|[Container Registry](../../packages/container_registry.md) | **Yes** (12.3) | No | No | Disabled by default. See [instructions](docker_registry.md) to enable. | +|[Content in object storage (beta)](object_storage.md) | **Yes** (12.4) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/13845) | No | | +|[Infrastructure Registry](../../../user/packages/infrastructure_registry/index.md) | **Yes** (14.0) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (14.0) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. | +|[Project designs repository](../../../user/project/issues/design_management.md) | **Yes** (12.7) | [No](https://gitlab.com/gitlab-org/gitlab/-/issues/32467) | No | Designs also require replication of LFS objects and Uploads. | +|[Package Registry](../../../user/packages/package_registry/index.md) | **Yes** (13.2) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.10) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_package_file_replication`, enabled by default. | +|[Versioned Terraform State](../../terraform_state.md) | **Yes** (13.5) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (13.12) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_terraform_state_version_replication`, enabled by default. Verification was behind the feature flag `geo_terraform_state_version_verification`, which was removed in 14.0. | +|[External merge request diffs](../../merge_request_diffs.md) | **Yes** (13.5) | **Yes** (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | Replication is behind the feature flag `geo_merge_request_diff_replication`, enabled by default. Verification is behind the feature flag `geo_merge_request_diff_verification`, enabled by default in 14.6.| +|[Versioned snippets](../../../user/snippets.md#versioned-snippets) | [**Yes** (13.7)](https://gitlab.com/groups/gitlab-org/-/epics/2809) | [**Yes** (14.2)](https://gitlab.com/groups/gitlab-org/-/epics/2810) | No | Verification was implemented behind the feature flag `geo_snippet_repository_verification` in 13.11, and the feature flag was removed in 14.2. | +|[GitLab Pages](../../pages/index.md) | [**Yes** (14.3)](https://gitlab.com/groups/gitlab-org/-/epics/589) | [**Yes**](#limitation-of-verification-for-files-in-object-storage) (14.6) | Via Object Storage provider if supported. Native Geo support (Beta). | Behind feature flag `geo_pages_deployment_replication`, enabled by default. Verification is behind the feature flag `geo_pages_deployment_verification`, enabled by default in 14.6. | +|[Server-side Git hooks](../../server_hooks.md) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/1867) | No | No | Not planned because of current implementation complexity, low customer interest, and availability of alternatives to hooks. | +|[Elasticsearch integration](../../../integration/elasticsearch.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/1186) | No | No | Not planned because further product discovery is required and Elasticsearch (ES) clusters can be rebuilt. Secondaries use the same ES cluster as the primary. | +|[Dependency proxy images](../../../user/packages/dependency_proxy/index.md) | [Not planned](https://gitlab.com/gitlab-org/gitlab/-/issues/259694) | No | No | Blocked by [Geo: Secondary Mimicry](https://gitlab.com/groups/gitlab-org/-/epics/1528). Replication of this cache is not needed for disaster recovery purposes because it can be recreated from external sources. | +|[Vulnerability Export](../../../user/application_security/vulnerability_report/#export-vulnerability-details) | [Not planned](https://gitlab.com/groups/gitlab-org/-/epics/3111) | No | No | Not planned because they are ephemeral and sensitive information. They can be regenerated on demand. | #### Limitation of verification for files in Object Storage |