 doc/administration/reference_architectures/50k_users.md | 46 ++++++++++++++++++++++++++++++++--------------
 1 file changed, 32 insertions(+), 14 deletions(-)
diff --git a/doc/administration/reference_architectures/50k_users.md b/doc/administration/reference_architectures/50k_users.md
index 6c4e2227b18..078e3a289cc 100644
--- a/doc/administration/reference_architectures/50k_users.md
+++ b/doc/administration/reference_architectures/50k_users.md
@@ -39,7 +39,7 @@ full list of reference architectures, see
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
-2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
+2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS ElastiCache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
@@ -86,7 +86,7 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
-
+
redis_cache -[hidden]-> redis_persistent
}
@@ -1364,11 +1364,18 @@ To configure the Praefect nodes, on each one:
on the page.
1. Edit the `/etc/gitlab/gitlab.rb` file to configure Praefect:
+<!--
+Updates to example must be made at:
+- https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/administration/gitaly/praefect.md
+- all reference architecture pages
+-->
+
```ruby
# Avoid running unnecessary services on the Praefect server
gitaly['enable'] = false
postgresql['enable'] = false
redis['enable'] = false
+ nginx['enable'] = false
puma['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
@@ -1377,7 +1384,6 @@ To configure the Praefect nodes, on each one:
grafana['enable'] = false
gitlab_exporter['enable'] = false
gitlab_kas['enable'] = false
- nginx['enable'] = false
# Praefect Configuration
praefect['enable'] = true
@@ -1404,8 +1410,8 @@ To configure the Praefect nodes, on each one:
praefect['database_host'] = '10.6.0.141'
praefect['database_port'] = 5432
# `no_proxy` settings must always be a direct connection for caching
- praefect['database_host_no_proxy'] = '10.6.0.141'
- praefect['database_port_no_proxy'] = 5432
+ praefect['database_direct_host'] = '10.6.0.141'
+ praefect['database_direct_port'] = 5432
praefect['database_dbname'] = 'praefect_production'
praefect['database_user'] = 'praefect'
praefect['database_password'] = '<praefect_postgresql_password>'
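
As a hedged illustration of the renamed settings above, once this change lands the database connection block in a Praefect node's `/etc/gitlab/gitlab.rb` reads roughly as follows (the host, port, and password placeholder are the example values already used on this page):

```ruby
# Main Praefect database connection; this one can be routed through a
# connection pooler or proxy if you use one.
praefect['database_host'] = '10.6.0.141'
praefect['database_port'] = 5432

# The renamed *_direct_* settings replace the old *_no_proxy keys and must
# point straight at PostgreSQL, because the caching connection has to be
# direct and unproxied.
praefect['database_direct_host'] = '10.6.0.141'
praefect['database_direct_port'] = 5432

praefect['database_dbname'] = 'praefect_production'
praefect['database_user'] = 'praefect'
praefect['database_password'] = '<praefect_postgresql_password>'
```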
@@ -1508,10 +1514,18 @@ On each node:
1. Edit the Gitaly server node's `/etc/gitlab/gitlab.rb` file to configure
storage paths, enable the network listener, and to configure the token:
+<!--
+Updates to example must be made at:
+- https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
+- https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/gitaly/index.md#gitaly-server-configuration
+- all reference architecture pages
+-->
+
```ruby
# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
+ nginx['enable'] = false
puma['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false
@@ -1520,7 +1534,6 @@ On each node:
grafana['enable'] = false
gitlab_exporter['enable'] = false
gitlab_kas['enable'] = false
- nginx['enable'] = false
# Prevent database migrations from running on upgrade automatically
gitlab_rails['auto_migrate'] = false
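
The hunks above only regroup the disabled bundled services; the storage paths, network listener, and authentication token that this step refers to are set further down in the same `gitlab.rb`. As a rough sketch under assumed values (the listen address, token placeholder, and storage name are illustrative, not taken from this patch), that part typically looks like:

```ruby
# Illustrative values only -- real storage names, addresses, and tokens come
# from the architecture being deployed, not from this patch.

# Make Gitaly accept connections on the network; access is guarded by the
# authentication token below.
gitaly['listen_addr'] = '0.0.0.0:8075'

# Must match the token the Praefect nodes are configured with.
gitaly['auth_token'] = '<praefect_internal_token>'

# Storage path(s) served by this Gitaly node.
git_data_dirs({
  'gitaly-1' => {
    'path' => '/var/opt/gitlab/git-data',
  },
})
```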
@@ -1706,11 +1719,18 @@ To configure the Sidekiq nodes, on each one:
on the page.
1. Create or edit `/etc/gitlab/gitlab.rb` and use the following configuration:
+<!--
+Updates to example must be made at:
+- https://gitlab.com/gitlab-org/gitlab/blob/master/doc/administration/sidekiq.md
+- all reference architecture pages
+-->
+
```ruby
# Avoid running unnecessary services on the Sidekiq server
gitaly['enable'] = false
postgresql['enable'] = false
redis['enable'] = false
+ nginx['enable'] = false
puma['enable'] = false
gitlab_workhorse['enable'] = false
prometheus['enable'] = false
@@ -1718,7 +1738,6 @@ To configure the Sidekiq nodes, on each one:
grafana['enable'] = false
gitlab_exporter['enable'] = false
gitlab_kas['enable'] = false
- nginx['enable'] = false
# External URL
## This should match the URL of the external load balancer
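
For context, the external URL that the comment above refers to is set in the same file; a minimal hedged sketch (the hostname is a placeholder, not a value from this patch) looks like this, with the rest of the Sidekiq node configuration from this page following it:

```ruby
# Illustrative hostname only -- use the URL served by your external load balancer.
external_url 'https://gitlab.example.com'

# Let other nodes reach the Sidekiq health-check and metrics endpoints.
sidekiq['listen_address'] = '0.0.0.0'
```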
@@ -2189,7 +2208,7 @@ GitLab has been tested on a number of object storage providers:
- [Google Cloud Storage](https://cloud.google.com/storage)
- [Digital Ocean Spaces](http://www.digitalocean.com/products/spaces)
- [Oracle Cloud Infrastructure](https://docs.cloud.oracle.com/en-us/iaas/Content/Object/Tasks/s3compatibleapi.htm)
-- [OpenStack Swift](https://docs.openstack.org/swift/latest/s3_compat.html)
+- [OpenStack Swift (S3 compatibility mode)](https://docs.openstack.org/swift/latest/s3_compat.html)
- [Azure Blob storage](https://docs.microsoft.com/en-us/azure/storage/blobs/storage-blobs-introduction)
- On-premises hardware and appliances from various storage vendors.
- MinIO. We have [a guide to deploying this](https://docs.gitlab.com/charts/advanced/external-object-storage/minio.html) within our Helm Chart documentation.
@@ -2222,7 +2241,6 @@ on what features you intend to use:
| [Mattermost](https://docs.mattermost.com/administration/config-settings.html#file-storage)| No |
| [Packages](../packages/index.md#using-object-storage) (optional feature) | Yes |
| [Dependency Proxy](../packages/dependency_proxy.md#using-object-storage) (optional feature) | Yes |
-| [Pseudonymizer](../pseudonymizer.md) (optional feature) | No |
| [Autoscale runner caching](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching) (optional for improved performance) | No |
| [Terraform state files](../terraform_state.md#using-object-storage) | Yes |
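
Features marked `Yes` in the table above can use GitLab's consolidated object storage configuration. As a hedged sketch (provider, region, credentials, and bucket names are placeholders rather than values from this page), the consolidated form in `/etc/gitlab/gitlab.rb` looks roughly like:

```ruby
# Consolidated object storage -- illustrative placeholders only.
gitlab_rails['object_store']['enabled'] = true
gitlab_rails['object_store']['connection'] = {
  'provider' => 'AWS',
  'region' => '<region>',
  # With instance profiles, 'use_iam_profile' => true can replace static keys.
  'aws_access_key_id' => '<AWS_ACCESS_KEY_ID>',
  'aws_secret_access_key' => '<AWS_SECRET_ACCESS_KEY>'
}
gitlab_rails['object_store']['objects']['artifacts']['bucket'] = '<artifacts-bucket>'
gitlab_rails['object_store']['objects']['lfs']['bucket'] = '<lfs-bucket>'
gitlab_rails['object_store']['objects']['uploads']['bucket'] = '<uploads-bucket>'
gitlab_rails['object_store']['objects']['packages']['bucket'] = '<packages-bucket>'
gitlab_rails['object_store']['objects']['terraform_state']['bucket'] = '<terraform-state-bucket>'
```

Features marked `No` (for example, autoscale runner caching above) are configured through their own storage-specific settings instead.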
@@ -2295,7 +2313,7 @@ The following tables and diagram detail the hybrid environment using the same fo
as the normal environment above.
First are the components that run in Kubernetes. The recommendation at this time is to
-use Google Cloud's Kubernetes Engine (GKE) and associated machine types, but the memory
+use Google Cloud's Kubernetes Engine (GKE) or AWS Elastic Kubernetes Service (EKS) and associated machine types, but the memory
and CPU requirements should translate to most other providers. We hope to update this in the
future with further specific cloud provider details.
@@ -2304,10 +2322,10 @@ future with further specific cloud provider details.
| Webservice | 16 | 32 vCPU, 28.8 GB memory | `n1-highcpu-32` | `m5.8xlarge` | 510 vCPU, 472 GB memory |
| Sidekiq | 4 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 15.5 vCPU, 50 GB memory |
| Supporting services such as NGINX, Prometheus | 2 | 4 vCPU, 15 GB memory | `n1-standard-4` | `m5.xlarge` | 7.75 vCPU, 25 GB memory |
-
+
- For this setup, we **recommend** and regularly [test](index.md#validation-and-test-results)
[Google Kubernetes Engine (GKE)](https://cloud.google.com/kubernetes-engine) and [Amazon Elastic Kubernetes Service (EKS)](https://aws.amazon.com/eks/). Other Kubernetes services may also work, but your mileage may vary.
-- Nodes configuration is shown as it is forced to ensure pod vcpu / memory ratios and avoid scaling during **performance testing**.
+- Nodes configuration is shown as it is forced to ensure pod vCPU / memory ratios and avoid scaling during **performance testing**.
- In production deployments, there is no need to assign pods to nodes. A minimum of three nodes in three different availability zones is strongly recommended to align with resilient cloud architecture practices.
Next are the backend components that run on static compute VMs via Omnibus (or External PaaS
@@ -2329,7 +2347,7 @@ services where applicable):
<!-- Disable ordered list rule https://github.com/DavidAnson/markdownlint/blob/main/doc/Rules.md#md029---ordered-list-item-prefix -->
<!-- markdownlint-disable MD029 -->
1. Can be optionally run on reputable third-party external PaaS PostgreSQL solutions. [Google Cloud SQL](https://cloud.google.com/sql/docs/postgres/high-availability#normal) and [Amazon RDS](https://aws.amazon.com/rds/) are known to work, however Azure Database for PostgreSQL is **not recommended** due to [performance issues](https://gitlab.com/gitlab-org/quality/reference-architectures/-/issues/61). Consul is primarily used for PostgreSQL high availability so can be ignored when using a PostgreSQL PaaS setup. However it is also used optionally by Prometheus for Omnibus auto host discovery.
-2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS Elasticache are known to work.
+2. Can be optionally run on reputable third-party external PaaS Redis solutions. Google Memorystore and AWS ElastiCache are known to work.
3. Can be optionally run on reputable third-party load balancing services (LB PaaS). AWS ELB is known to work.
4. Should be run on reputable third-party object storage (storage PaaS) for cloud implementations. Google Cloud Storage and AWS S3 are known to work.
5. Gitaly Cluster provides the benefits of fault tolerance, but comes with additional complexity of setup and management. Review the existing [technical limitations and considerations before deploying Gitaly Cluster](../gitaly/index.md#before-deploying-gitaly-cluster). If you want sharded Gitaly, use the same specs listed above for `Gitaly`.
@@ -2377,7 +2395,7 @@ card "Database" as database {
card "redis" as redis {
collections "**Redis Persistent** x3" as redis_persistent #FF6347
collections "**Redis Cache** x3" as redis_cache #FF6347
-
+
redis_cache -[hidden]-> redis_persistent
}