summaryrefslogtreecommitdiff
path: root/doc/development/scalability.md
diff options
context:
space:
mode:
authorGitLab Bot <gitlab-bot@gitlab.com>2022-09-19 23:18:09 +0000
committerGitLab Bot <gitlab-bot@gitlab.com>2022-09-19 23:18:09 +0000
commit6ed4ec3e0b1340f96b7c043ef51d1b33bbe85fde (patch)
treedc4d20fe6064752c0bd323187252c77e0a89144b /doc/development/scalability.md
parent9868dae7fc0655bd7ce4a6887d4e6d487690eeed (diff)
downloadgitlab-ce-6ed4ec3e0b1340f96b7c043ef51d1b33bbe85fde.tar.gz
Add latest changes from gitlab-org/gitlab@15-4-stable-eev15.4.0-rc42
Diffstat (limited to 'doc/development/scalability.md')
-rw-r--r--doc/development/scalability.md18
1 files changed, 9 insertions, 9 deletions
diff --git a/doc/development/scalability.md b/doc/development/scalability.md
index b7ee0ca1167..66f436bd391 100644
--- a/doc/development/scalability.md
+++ b/doc/development/scalability.md
@@ -35,7 +35,7 @@ The application has a tight coupling to the database schema. When the
application starts, Rails queries the database schema, caching the tables and
column types for the data requested. Because of this schema cache, dropping a
column or table while the application is running can produce 500 errors to the
-user. This is why we have a
+user. This is why we have a
[process for dropping columns and other no-downtime changes](database/avoiding_downtime_in_migrations.md).
#### Multi-tenancy
@@ -61,10 +61,10 @@ There are two ways to deal with this:
- Sharding. Distribute data across multiple databases.
Partitioning is a built-in PostgreSQL feature and requires minimal changes
-in the application. However, it
+in the application. However, it
[requires PostgreSQL 11](https://www.2ndquadrant.com/en/blog/partitioning-evolution-postgresql-11/).
-For example, a natural way to partition is to
+For example, a natural way to partition is to
[partition tables by dates](https://gitlab.com/groups/gitlab-org/-/epics/2023). For example,
the `events` and `audit_events` table are natural candidates for this
kind of partitioning.
@@ -77,9 +77,9 @@ to abstract data access into API calls that abstract the database from
the application, but this is a significant amount of work.
There are solutions that may help abstract the sharding to some extent
-from the application. For example, we want to look at
+from the application. For example, we want to look at
[Citus Data](https://www.citusdata.com/product/community) closely. Citus Data
-provides a Rails plugin that adds a
+provides a Rails plugin that adds a
[tenant ID to ActiveRecord models](https://www.citusdata.com/blog/2017/01/05/easily-scale-out-multi-tenant-apps/).
Sharding can also be done based on feature verticals. This is the
@@ -97,11 +97,11 @@ systems.
#### Database size
-A recent
+A recent
[database checkup shows a breakdown of the table sizes on GitLab.com](https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/8022#master-1022016101-8).
Since `merge_request_diff_files` contains over 1 TB of data, we want to
-reduce/eliminate this table first. GitLab has support for
-[storing diffs in object storage](../administration/merge_request_diffs.md), which we
+reduce/eliminate this table first. GitLab has support for
+[storing diffs in object storage](../administration/merge_request_diffs.md), which we
[want to do on GitLab.com](https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/7356).
#### High availability
@@ -149,7 +149,7 @@ limitation:
- Use a multi-threaded connection pooler (for example,
[Odyssey](https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/7776).
-On some Linux systems, it's possible to run
+On some Linux systems, it's possible to run
[multiple PgBouncer instances on the same port](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/4796).
On GitLab.com, we run multiple PgBouncer instances on different ports to