-rw-r--r-- doc/administration/gitaly/index.md | 36
-rw-r--r-- doc/administration/high_availability/README.md | 278
-rw-r--r-- doc/administration/high_availability/database.md | 23
-rw-r--r-- doc/administration/high_availability/gitaly.md | 9
-rw-r--r-- doc/administration/high_availability/object_storage.md | 28
-rw-r--r-- doc/administration/high_availability/redis.md | 24
-rw-r--r-- doc/integration/elasticsearch.md | 2
7 files changed, 104 insertions, 296 deletions
diff --git a/doc/administration/gitaly/index.md b/doc/administration/gitaly/index.md
index 390e0ae05af..92a44271775 100644
--- a/doc/administration/gitaly/index.md
+++ b/doc/administration/gitaly/index.md
@@ -6,9 +6,9 @@ components can read or write Git data. GitLab components that access Git
repositories (GitLab Rails, GitLab Shell, GitLab Workhorse, etc.) act as clients
to Gitaly. End users do not have direct access to Gitaly.
-In the rest of this page, Gitaly server is referred to the standalone node that
-only runs Gitaly, and Gitaly client to the GitLab Rails node that runs all other
-processes except Gitaly.
+On this page, *Gitaly server* refers to a standalone node that only runs Gitaly,
+and *Gitaly client* refers to a GitLab Rails app node that runs all other
+processes except Gitaly.
## Architecture
@@ -20,7 +20,7 @@ Here's a high-level architecture overview of how Gitaly is used.
The Gitaly service itself is configured via a [TOML configuration file](reference.md).
-In case you want to change some of its settings:
+If you want to change any of its settings:
**For Omnibus GitLab**
@@ -54,10 +54,6 @@ scenario, the [new repository indexer](../../integration/elasticsearch.md#elasti
needs to be enabled in your GitLab configuration. [Since GitLab v12.3](https://gitlab.com/gitlab-org/gitlab/issues/6481),
the new indexer becomes the default and no configuration is required.
-NOTE: **Note:** While Gitaly can be used as a replacement for NFS, it's not recommended
-to use EFS as it may impact GitLab's performance. Review the [relevant documentation](../high_availability/nfs.md#avoid-using-awss-elastic-file-system-efs)
-for more details.
-
### Network architecture
The following list depicts what the network architecture of Gitaly is:
@@ -568,30 +564,6 @@ server with the following settings.
1. Save the file and [restart GitLab](../restart_gitlab.md#installations-from-source).
-## Eliminating NFS altogether
-
-If you are planning to use Gitaly without NFS for your storage needs
-and want to eliminate NFS from your environment altogether, there are
-a few things that you need to do:
-
-1. Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk.
-1. Configure [database lookup of SSH keys](../operations/fast_ssh_key_lookup.md)
- to eliminate the need for a shared `authorized_keys` file.
-1. Configure [object storage for job artifacts](../job_artifacts.md#using-object-storage)
- including [incremental logging](../job_logs.md#new-incremental-logging-architecture).
-1. Configure [object storage for LFS objects](../lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage).
-1. Configure [object storage for uploads](../uploads.md#using-object-storage-core-only).
-1. Configure [object storage for merge request diffs](../merge_request_diffs.md#using-object-storage).
-1. Configure [object storage for packages](../packages/index.md#using-object-storage) (optional feature).
-1. Configure [object storage for dependency proxy](../packages/dependency_proxy.md#using-object-storage) (optional feature).
-1. Configure [object storage for Mattermost](https://docs.mattermost.com/administration/config-settings.html#file-storage) (optional feature).
-
-NOTE: **Note:**
-One current feature of GitLab that still requires a shared directory (NFS) is
-[GitLab Pages](../../user/project/pages/index.md).
-There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196)
-to eliminate the need for NFS to support GitLab Pages.
-
## Limiting RPC concurrency
It can happen that CI clone traffic puts a large strain on your Gitaly
diff --git a/doc/administration/high_availability/README.md b/doc/administration/high_availability/README.md
index 2c2fc075dbe..cfc027a675e 100644
--- a/doc/administration/high_availability/README.md
+++ b/doc/administration/high_availability/README.md
@@ -4,210 +4,56 @@ type: reference, concepts
# Scaling and High Availability
-GitLab supports a number of options for scaling your self-managed instance and configuring high availability (HA).
-The solution you choose will be based on the level of scalability and
-availability you require. The easiest solutions are scalable, but not necessarily
-highly available.
-
-GitLab provides a service that is essential to most organizations: it
-enables people to collaborate on code in a timely fashion. Any downtime should
-therefore be short and planned. Due to the distributed nature
-of Git, developers can continue to commit code locally even when GitLab is not
-available. However, some GitLab features such as the issue tracker and
-continuous integration are not available when GitLab is down.
-If you require all GitLab functionality to be highly available,
-consider the options outlined below.
-
-**Keep in mind that all highly-available solutions come with a trade-off between
-cost/complexity and uptime**. The more uptime you want, the more complex the
-solution. And the more complex the solution, the more work is involved in
-setting up and maintaining it. High availability is not free and every HA
-solution should balance the costs against the benefits.
-
-There are many options when choosing a highly-available GitLab architecture. We
-recommend engaging with GitLab Support to choose the best architecture for your
-use case. This page contains recommendations based on
-experience with GitLab.com and internal scale testing.
+GitLab supports a number of options for larger self-managed instances to
+ensure that they are scalable and highly available. While these needs can be tackled
+individually, they typically go hand in hand: a performant scalable environment
+will have availability by default, as its components are separated and pooled.
+
+On this page, we present recommendations for setups based on the number
+of users you expect. For larger setups, we give several recommended
+architectures based on experience with GitLab.com and internal scale
+testing that aim to achieve the right balance between scalability
+and availability.
For detailed insight into how GitLab scales and configures GitLab.com, you can
watch [this 1 hour Q&A](https://www.youtube.com/watch?v=uCU8jdYzpac)
-with [John Northrup](https://gitlab.com/northrup), and live questions coming in from some of our customers.
-
-## GitLab Components
-
-The following components need to be considered for a scaled or highly-available
-environment. In many cases, components can be combined on the same nodes to reduce
-complexity.
-
-- GitLab application nodes (Unicorn / Puma, Workhorse) - Web-requests (UI, API, Git over HTTP)
-- Sidekiq - Asynchronous/Background jobs
-- PostgreSQL - Database
- - Consul - Database service discovery and health checks/failover
- - PgBouncer - Database pool manager
-- Redis - Key/Value store (User sessions, cache, queue for Sidekiq)
- - Sentinel - Redis health check/failover manager
-- Gitaly - Provides high-level storage and RPC access to Git repositories
-- S3 Object Storage service[^4] and / or NFS storage servers[^5] for entities such as Uploads, Artifacts, LFS Objects, etc...
-- Load Balancer[^6] - Main entry point and handles load balancing for the GitLab application nodes.
-- Monitor - Prometheus and Grafana monitoring with auto discovery.
-
-## Scalable Architecture Examples
-
-When an organization reaches a certain threshold it will be necessary to scale
-the GitLab instance. Still, true high availability may not be necessary. There
-are options for scaling GitLab instances relatively easily without incurring the
-infrastructure and maintenance costs of full high availability.
-
-### Basic Scaling
-
-This is the simplest form of scaling and will work for the majority of
-cases. Backend components such as PostgreSQL, Redis, and storage are offloaded
-to their own nodes while the remaining GitLab components all run on 2 or more
-application nodes.
-
-This form of scaling also works well in a cloud environment when it is more
-cost effective to deploy several small nodes rather than a single
-larger one.
-
-- 1 PostgreSQL node
-- 1 Redis node
-- 1 Gitaly node
-- 1 or more Object Storage services[^4] and / or NFS storage server[^5]
-- 2 or more GitLab application nodes (Unicorn / Puma, Workhorse, Sidekiq)
-- 1 or more Load Balancer nodes[^6]
-- 1 Monitoring node (Prometheus, Grafana)
-
-#### Installation Instructions
-
-Complete the following installation steps in order. A link at the end of each
-section will bring you back to the Scalable Architecture Examples section so
-you can continue with the next step.
-
-1. [Load Balancer(s)](load_balancer.md)[^6]
-1. [Consul](consul.md)
-1. [PostgreSQL](database.md#postgresql-in-a-scaled-environment) with [PgBouncer](pgbouncer.md)
-1. [Redis](redis.md#redis-in-a-scaled-environment)
-1. [Gitaly](gitaly.md) (recommended) and / or [NFS](nfs.md)[^5]
-1. [GitLab application nodes](gitlab.md)
- - With [Object Storage service enabled](../gitaly/index.md#eliminating-nfs-altogether)[^4]
-1. [Monitoring node (Prometheus and Grafana)](monitoring_node.md)
-
-### Full Scaling
-
-For very large installations, it might be necessary to further split components
-for maximum scalability. In a fully-scaled architecture, the application node
-is split into separate Sidekiq and Unicorn/Workhorse nodes. One indication that
-this architecture is required is if Sidekiq queues begin to periodically increase
-in size, indicating that there is contention or there are not enough resources.
-
-- 1 or more PostgreSQL nodes
-- 1 or more Redis nodes
-- 1 or more Gitaly storage servers
-- 1 or more Object Storage services[^4] and / or NFS storage server[^5]
-- 2 or more Sidekiq nodes
-- 2 or more GitLab application nodes (Unicorn / Puma, Workhorse, Sidekiq)
-- 1 or more Load Balancer nodes[^6]
-- 1 Monitoring node (Prometheus, Grafana)
-
-## High Availability Architecture Examples
-
-When organizations require scaling *and* high availability, the following
-architectures can be utilized. As the introduction section at the top of this
-page mentions, there is a tradeoff between cost/complexity and uptime. Be sure
-this complexity is absolutely required before taking the step into full
-high availability.
-
-For all examples below, we recommend running Consul and Redis Sentinel separately
-from the services they monitor. If Consul is running on PostgreSQL nodes or Sentinel on
-Redis nodes, there is a potential that high resource usage by PostgreSQL or
-Redis could prevent communication between the other Consul and Sentinel nodes.
-This may lead to the other nodes believing a failure has occurred and initiating
-automated failover. Isolating Consul and Redis Sentinel from the services they monitor
-reduces the chances of a false positive that a failure has occurred.
-
-The examples below do not address high availability of NFS for objects. We recommend a
-S3 Object Storage service[^4] is used where possible over NFS but it's still required in
-certain cases[^5]. Where NFS is to be used some enterprises have access to NFS appliances
-that manage availability and this would be best case scenario.
-
-There are many options in between each of these examples. Work with GitLab Support
-to understand the best starting point for your workload and adapt from there.
-
-### Horizontal
-
-This is the simplest form of high availability and scaling. It requires the
-fewest number of individual servers (virtual or physical) but does have some
-trade-offs and limits.
-
-This architecture will work well for many GitLab customers. Larger customers
-may begin to notice certain events cause contention/high load - for example,
-cloning many large repositories with binary files, high API usage, a large
-number of enqueued Sidekiq jobs, and so on. If this happens, you should consider
-moving to a hybrid or fully distributed architecture depending on what is causing
-the contention.
-
-- 3 PostgreSQL nodes
-- 3 Redis nodes
-- 3 Consul / Sentinel nodes
-- 2 or more GitLab application nodes (Unicorn / Puma, Workhorse, Sidekiq)
-- 1 Gitaly storage servers
-- 1 Object Storage service[^4] and / or NFS storage server[^5]
-- 1 or more Load Balancer nodes[^6]
-- 1 Monitoring node (Prometheus, Grafana)
-
-![Horizontal architecture diagram](img/horizontal.png)
-
-### Hybrid
-
-In this architecture, certain components are split on dedicated nodes so high
-resource usage of one component does not interfere with others. In larger
-environments this is a good architecture to consider if you foresee or do have
-contention due to certain workloads.
-
-- 3 PostgreSQL nodes
-- 1 PgBouncer node
-- 3 Redis nodes
-- 3 Consul / Sentinel nodes
-- 2 or more Sidekiq nodes
-- 2 or more GitLab application nodes (Unicorn / Puma, Workhorse, Sidekiq)
-- 1 Gitaly storage servers
-- 1 Object Storage service[^4] and / or NFS storage server[^5]
-- 1 or more Load Balancer nodes[^6]
-- 1 Monitoring node (Prometheus, Grafana)
-
-![Hybrid architecture diagram](img/hybrid.png)
-
-### Fully Distributed
-
-This architecture scales to hundreds of thousands of users and projects and is
-the basis of the GitLab.com architecture. While this scales well it also comes
-with the added complexity of many more nodes to configure, manage, and monitor.
-
-- 3 PostgreSQL nodes
-- 1 or more PgBouncer nodes (with associated internal load balancers)
-- 4 or more Redis nodes (2 separate clusters for persistent and cache data)
-- 3 Consul nodes
-- 3 Sentinel nodes
-- Multiple dedicated Sidekiq nodes (Split into real-time, best effort, ASAP,
- CI Pipeline and Pull Mirror sets)
-- 2 or more Git nodes (Git over SSH/Git over HTTP)
-- 2 or more API nodes (All requests to `/api`)
-- 2 or more Web nodes (All other web requests)
-- 2 or more Gitaly storage servers
-- 1 or more Object Storage services[^4] and / or NFS storage servers[^5]
-- 1 or more Load Balancer nodes[^6]
-- 1 Monitoring node (Prometheus, Grafana)
-
-![Fully Distributed architecture diagram](img/fully-distributed.png)
-
-## Reference Architecture Recommendations
-
-The Support and Quality teams build, performance test, and validate Reference
-Architectures that support large numbers of users. The specifications below are
-a representation of this work so far and may be adjusted in the future based on
-additional testing and iteration.
-
-The architectures have been tested with specific coded workloads, and the
+with [John Northrup](https://gitlab.com/northrup), and live questions coming
+in from some of our customers.
+
+## Recommended Setups based on number of users
+
+- 1 - 1000 Users: A single-node [Omnibus](https://docs.gitlab.com/omnibus/) setup with frequent backups. Refer to the [requirements page](https://docs.gitlab.com/ee/install/requirements.html) for further details of the specs you will require.
+- 2000 - 50000+ Users: A scaled HA environment based on one of our [Reference Architectures](#reference-architectures) below.
+
+## GitLab Components and Configuration Instructions
+
+The GitLab application depends on the following [components](https://docs.gitlab.com/ee/development/architecture.html#component-diagram)
+and services. They are included in the reference architectures along with our
+recommendations for their use and configuration. They are presented in the order
+in which you would typically configure them.
+
+| Component | Description | Configuration Instructions |
+|-------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------|
+| [Load Balancer(s)](load_balancer.md)[^6] | Handles load balancing for the GitLab nodes where required. | [Load balancer HA configuration](load_balancer.md) |
+| [Consul](https://docs.gitlab.com/ee/development/architecture.html#consul)[^3] | Service discovery and health checks/failover | [Consul HA configuration](consul.md) |
+| [PostgreSQL](https://docs.gitlab.com/ee/development/architecture.html#postgresql) | Database | [Database HA configuration](database.md) |
+| [PgBouncer](https://docs.gitlab.com/ee/development/architecture.html#pgbouncer) | Database Pool Manager | [PgBouncer HA configuration](pgbouncer.md) |
+| [Redis](https://docs.gitlab.com/ee/development/architecture.html#redis)[^3] with Redis Sentinel | Key/Value store for shared data with HA watcher service | [Redis HA configuration](redis.md) |
+| [Gitaly](https://docs.gitlab.com/ee/development/architecture.html#gitaly)[^2] [^5] [^7] | Recommended high-level storage for Git repository data. | [Gitaly HA configuration](gitaly.md) |
+| [Sidekiq](https://docs.gitlab.com/ee/development/architecture.html#sidekiq) | Asynchronous/Background jobs | |
+| [Cloud Object Storage service](object_storage.md)[^4] | Recommended store for shared data objects such as LFS, Uploads, Artifacts, etc... | [Cloud Object Storage configuration](object_storage.md) |
+| [GitLab application nodes](https://docs.gitlab.com/ee/development/architecture.html#unicorn)[^1] | (Unicorn / Puma, Workhorse) - Web-requests (UI, API, Git over HTTP) | [GitLab app HA/scaling configuration](gitlab.md) |
+| [NFS](nfs.md)[^5] [^7] | Shared disk storage service. Can be used as an alternative for Gitaly or Object Storage. Required for GitLab Pages. | [NFS configuration](nfs.md) |
+| [Prometheus](https://docs.gitlab.com/ee/development/architecture.html#prometheus) and [Grafana](https://docs.gitlab.com/ee/development/architecture.html#grafana) | GitLab environment monitoring | [Monitoring node for scaling/HA](monitoring_node.md) |
+
+In some cases, components can be combined on the same nodes to reduce complexity.
+
+## Reference Architectures
+
+In this section we'll detail the Reference Architectures that can support large numbers
+of users. These were built, tested and verified by our Quality and Support teams.
+
+Testing was done with our GitLab Performance Tool using specific coded workloads, and the
throughputs used for testing were calculated based on sample customer data. We
test each endpoint type with the following number of requests per second (RPS)
per 1000 users:
@@ -235,11 +81,11 @@ On different cloud vendors a best effort like for like can be used.
| GitLab Rails[^1] | 3 | 8 vCPU, 7.2GB Memory | n1-highcpu-8 |
| PostgreSQL | 3 | 2 vCPU, 7.5GB Memory | n1-standard-2 |
| PgBouncer | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
-| Gitaly[^2] [^7] | X | 4 vCPU, 15GB Memory | n1-standard-4 |
+| Gitaly[^2] [^5] [^7] | X | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis[^3] | 3 | 2 vCPU, 7.5GB Memory | n1-standard-2 |
| Consul + Sentinel[^3] | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| Sidekiq | 4 | 2 vCPU, 7.5GB Memory | n1-standard-2 |
-| S3 Object Storage[^4] | - | - | - |
+| Cloud Object Storage[^4] | - | - | - |
| NFS Server[^5] [^7] | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| Monitoring node | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
@@ -257,11 +103,11 @@ On different cloud vendors a best effort like for like can be used.
| GitLab Rails[^1] | 3 | 16 vCPU, 14.4GB Memory | n1-highcpu-16 |
| PostgreSQL | 3 | 2 vCPU, 7.5GB Memory | n1-standard-2 |
| PgBouncer | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
-| Gitaly[^2] [^7] | X | 8 vCPU, 30GB Memory | n1-standard-8 |
+| Gitaly[^2] [^5] [^7] | X | 8 vCPU, 30GB Memory | n1-standard-8 |
| Redis[^3] | 3 | 2 vCPU, 7.5GB Memory | n1-standard-2 |
| Consul + Sentinel[^3] | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| Sidekiq | 4 | 2 vCPU, 7.5GB Memory | n1-standard-2 |
-| S3 Object Storage[^4] | - | - | - |
+| Cloud Object Storage[^4] | - | - | - |
| NFS Server[^5] [^7] | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| Monitoring node | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
@@ -279,14 +125,14 @@ On different cloud vendors a best effort like for like can be used.
| GitLab Rails[^1] | 3 | 32 vCPU, 28.8GB Memory | n1-highcpu-32 |
| PostgreSQL | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| PgBouncer | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
-| Gitaly[^2] [^7] | X | 16 vCPU, 60GB Memory | n1-standard-16 |
+| Gitaly[^2] [^5] [^7] | X | 16 vCPU, 60GB Memory | n1-standard-16 |
| Redis[^3] - Cache | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis[^3] - Queues / Shared State | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis Sentinel[^3] - Cache | 3 | 1 vCPU, 1.7GB Memory | g1-small |
| Redis Sentinel[^3] - Queues / Shared State | 3 | 1 vCPU, 1.7GB Memory | g1-small |
| Consul | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| Sidekiq | 4 | 4 vCPU, 15GB Memory | n1-standard-4 |
-| S3 Object Storage[^4] | - | - | - |
+| Cloud Object Storage[^4] | - | - | - |
| NFS Server[^5] [^7] | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| Monitoring node | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
@@ -304,14 +150,14 @@ On different cloud vendors a best effort like for like can be used.
| GitLab Rails[^1] | 7 | 32 vCPU, 28.8GB Memory | n1-highcpu-32 |
| PostgreSQL | 3 | 8 vCPU, 30GB Memory | n1-standard-8 |
| PgBouncer | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
-| Gitaly[^2] [^7] | X | 32 vCPU, 120GB Memory | n1-standard-32 |
+| Gitaly[^2] [^5] [^7] | X | 32 vCPU, 120GB Memory | n1-standard-32 |
| Redis[^3] - Cache | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis[^3] - Queues / Shared State | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis Sentinel[^3] - Cache | 3 | 1 vCPU, 1.7GB Memory | g1-small |
| Redis Sentinel[^3] - Queues / Shared State | 3 | 1 vCPU, 1.7GB Memory | g1-small |
| Consul | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| Sidekiq | 4 | 4 vCPU, 15GB Memory | n1-standard-4 |
-| S3 Object Storage[^4] | - | - | - |
+| Cloud Object Storage[^4] | - | - | - |
| NFS Server[^5] [^7] | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| Monitoring node | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
@@ -329,7 +175,7 @@ On different cloud vendors a best effort like for like can be used.
| GitLab Rails[^1] | 15 | 32 vCPU, 28.8GB Memory | n1-highcpu-32 |
| PostgreSQL | 3 | 16 vCPU, 60GB Memory | n1-standard-16 |
| PgBouncer | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
-| Gitaly[^2] [^7] | X | 64 vCPU, 240GB Memory | n1-standard-64 |
+| Gitaly[^2] [^5] [^7] | X | 64 vCPU, 240GB Memory | n1-standard-64 |
| Redis[^3] - Cache | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis[^3] - Queues / Shared State | 3 | 4 vCPU, 15GB Memory | n1-standard-4 |
| Redis Sentinel[^3] - Cache | 3 | 1 vCPU, 1.7GB Memory | g1-small |
@@ -337,7 +183,7 @@ On different cloud vendors a best effort like for like can be used.
| Consul | 3 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| Sidekiq | 4 | 4 vCPU, 15GB Memory | n1-standard-4 |
| NFS Server[^5] [^7] | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
-| S3 Object Storage[^4] | - | - | - |
+| Cloud Object Storage[^4] | - | - | - |
| Monitoring node | 1 | 4 vCPU, 3.6GB Memory | n1-highcpu-4 |
| External load balancing node[^6] | 1 | 2 vCPU, 1.8GB Memory | n1-highcpu-2 |
| Internal load balancing node[^6] | 1 | 8 vCPU, 7.2GB Memory | n1-highcpu-8 |
@@ -361,7 +207,7 @@ On different cloud vendors a best effort like for like can be used.
and another for the Queues and Shared State classes respectively. We also recommend
that you run the Redis Sentinel clusters separately as well for each Redis Cluster.
-[^4]: For data objects such as LFS, Uploads, Artifacts, etc... We recommend a S3 Object Storage
+[^4]: For data objects such as LFS, Uploads, Artifacts, etc. We recommend a Cloud Object Storage service
where possible over NFS due to better performance and availability. Several types of objects
are supported for S3 storage - [Job artifacts](../job_artifacts.md#using-object-storage),
[LFS](../lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage),
@@ -370,15 +216,15 @@ On different cloud vendors a best effort like for like can be used.
[Packages](../packages/index.md#using-object-storage) (Optional Feature),
[Dependency Proxy](../packages/dependency_proxy.md#using-object-storage) (Optional Feature).
-[^5]: NFS storage server is still required for [GitLab Pages](https://gitlab.com/gitlab-org/gitlab-pages/issues/196)
- and optionally for CI Job Incremental Logging
- ([can be switched to use Redis instead](../job_logs.md#new-incremental-logging-architecture)).
+[^5]: NFS can be used as an alternative for both repository data (replacing Gitaly) and
+ object storage, but this isn't typically recommended for performance reasons. Note, however, that it is required for
+ [GitLab Pages](https://gitlab.com/gitlab-org/gitlab-pages/issues/196).
[^6]: Our architectures have been tested and validated with [HAProxy](https://www.haproxy.org/)
as the load balancer. However other reputable load balancers with similar feature sets
should also work instead but be aware these aren't validated.
-[^7]: We strongly recommend that the Gitaly and / or NFS nodes are set up with SSD disks over
+[^7]: We strongly recommend that any Gitaly and / or NFS nodes are set up with SSD disks over
HDD with a throughput of at least 8,000 IOPS for read operations and 2,000 IOPS for write
as these components have heavy I/O. These IOPS values are recommended only as a starter
as with time they may be adjusted higher or lower depending on the scale of your
diff --git a/doc/administration/high_availability/database.md b/doc/administration/high_availability/database.md
index daeb0f9baf5..596df656e2e 100644
--- a/doc/administration/high_availability/database.md
+++ b/doc/administration/high_availability/database.md
@@ -22,11 +22,9 @@ If you use a cloud-managed service, or provide your own PostgreSQL:
1. Configure the GitLab application servers with the appropriate details.
This step is covered in [Configuring GitLab for HA](gitlab.md).
-## PostgreSQL in a Scaled Environment
+## PostgreSQL in a Scaled and Highly Available Environment
-This section is relevant for [Scaled Architecture](README.md#scalable-architecture-examples)
-environments including [Basic Scaling](README.md#basic-scaling) and
-[Full Scaling](README.md#full-scaling).
+This section is relevant for [Scalable and Highly Available Setups](README.md).
### Provide your own PostgreSQL instance **(CORE ONLY)**
@@ -94,23 +92,6 @@ deploy the bundled PostgreSQL.
Advanced configuration options are supported and can be added if
needed.
-Continue configuration of other components by going
-[back to Scaled Architectures](README.md#scalable-architecture-examples)
-
-## PostgreSQL with High Availability
-
-This section is relevant for [High Availability Architecture](README.md#high-availability-architecture-examples)
-environments including [Horizontal](README.md#horizontal),
-[Hybrid](README.md#hybrid), and
-[Fully Distributed](README.md#fully-distributed).
-
-### Provide your own PostgreSQL instance **(CORE ONLY)**
-
-If you want to use your own deployed PostgreSQL instance(s),
-see [Provide your own PostgreSQL instance](#provide-your-own-postgresql-instance-core-only)
-for more details. However, you can use the GitLab Omnibus package to easily
-deploy the bundled PostgreSQL.
-
### High Availability with GitLab Omnibus **(PREMIUM ONLY)**
> Important notes:
diff --git a/doc/administration/high_availability/gitaly.md b/doc/administration/high_availability/gitaly.md
index 739d1ae35fb..bb40747b24c 100644
--- a/doc/administration/high_availability/gitaly.md
+++ b/doc/administration/high_availability/gitaly.md
@@ -11,18 +11,15 @@ should consider using Gitaly on a separate node.
See the [Gitaly HA Epic](https://gitlab.com/groups/gitlab-org/-/epics/289) to
track plans and progress toward high availability support.
-This document is relevant for [Scaled Architecture](README.md#scalable-architecture-examples)
-environments and [High Availability Architecture](README.md#high-availability-architecture-examples).
+This document is relevant for [Scalable and Highly Available Setups](README.md).
## Running Gitaly on its own server
See [Running Gitaly on its own server](../gitaly/index.md#running-gitaly-on-its-own-server)
in Gitaly documentation.
-Continue configuration of other components by going back to:
-
-- [Scaled Architectures](README.md#scalable-architecture-examples)
-- [High Availability Architectures](README.md#high-availability-architecture-examples)
+Continue configuration of other components by going back to the
+[Scaling and High Availability](README.md#gitlab-components-and-configuration-instructions) page.
## Enable Monitoring
diff --git a/doc/administration/high_availability/object_storage.md b/doc/administration/high_availability/object_storage.md
new file mode 100644
index 00000000000..6ec34ea2f5d
--- /dev/null
+++ b/doc/administration/high_availability/object_storage.md
@@ -0,0 +1,28 @@
+---
+type: reference
+---
+
+# Cloud Object Storage
+
+GitLab supports using a Cloud Object Storage service instead of [NFS](nfs.md) to hold
+numerous types of data. This is recommended in larger setups as object storage is
+typically much more performant and reliable.
+
+To configure GitLab to use Object Storage, refer to the following guides:
+
+1. Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk.
+1. Configure [database lookup of SSH keys](../operations/fast_ssh_key_lookup.md)
+ to eliminate the need for a shared `authorized_keys` file.
+1. Configure [object storage for job artifacts](../job_artifacts.md#using-object-storage)
+ including [incremental logging](../job_logs.md#new-incremental-logging-architecture).
+1. Configure [object storage for LFS objects](../lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage).
+1. Configure [object storage for uploads](../uploads.md#using-object-storage-core-only).
+1. Configure [object storage for merge request diffs](../merge_request_diffs.md#using-object-storage).
+1. Configure [object storage for packages](../packages/index.md#using-object-storage) (optional feature).
+1. Configure [object storage for dependency proxy](../packages/dependency_proxy.md#using-object-storage) (optional feature).
+
+NOTE: **Note:**
+One current feature of GitLab that still requires a shared directory (NFS) is
+[GitLab Pages](../../user/project/pages/index.md).
+There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196)
+to eliminate the need for NFS to support GitLab Pages.
diff --git a/doc/administration/high_availability/redis.md b/doc/administration/high_availability/redis.md
index 539d492632f..79082fefdd9 100644
--- a/doc/administration/high_availability/redis.md
+++ b/doc/administration/high_availability/redis.md
@@ -20,11 +20,9 @@ The following are the requirements for providing your own Redis instance:
Note the Redis node's IP address or hostname, port, and password (if required).
These will be necessary when configuring the GitLab application servers later.
-## Redis in a Scaled Environment
+## Redis in a Scaled and Highly Available Environment
-This section is relevant for [Scaled Architecture](README.md#scalable-architecture-examples)
-environments including [Basic Scaling](README.md#basic-scaling) and
-[Full Scaling](README.md#full-scaling).
+This section is relevant for [Scalable and Highly Available Setups](README.md).
### Provide your own Redis instance **(CORE ONLY)**
@@ -85,22 +83,8 @@ Omnibus:
Advanced configuration options are supported and can be added if
needed.
-Continue configuration of other components by going
-[back to Scaled Architectures](README.md#scalable-architecture-examples)
-
-## Redis with High Availability
-
-This section is relevant for [High Availability Architecture](README.md#high-availability-architecture-examples)
-environments including [Horizontal](README.md#horizontal),
-[Hybrid](README.md#hybrid), and
-[Fully Distributed](README.md#fully-distributed).
-
-### Provide your own Redis instance **(CORE ONLY)**
-
-If you want to use your own deployed Redis instance(s),
-see [Provide your own Redis instance](#provide-your-own-redis-instance-core-only)
-for more details. However, you can use the GitLab Omnibus package to easily
-deploy the bundled Redis.
+Continue configuration of other components by going back to the
+[Scaling and High Availability](README.md#gitlab-components-and-configuration-instructions) page.
### High Availability with GitLab Omnibus **(PREMIUM ONLY)**
diff --git a/doc/integration/elasticsearch.md b/doc/integration/elasticsearch.md
index 9ec56d304e0..c2f4fff0ce3 100644
--- a/doc/integration/elasticsearch.md
+++ b/doc/integration/elasticsearch.md
@@ -260,7 +260,7 @@ If the database size is less than 500 MiB, and the size of all hosted repos is l
CAUTION: **Warning**:
Performing asynchronous indexing will generate a lot of Sidekiq jobs.
-Make sure to prepare for this task by either [Horizontally Scaling](../administration/high_availability/README.md#basic-scaling)
+Make sure to prepare for this task by having a [Scalable and Highly Available Setup](../administration/high_availability/README.md)
or creating [extra Sidekiq processes](../administration/operations/extra_sidekiq_processes.md)
1. [Configure your Elasticsearch host and port](#enabling-elasticsearch).