Diffstat (limited to 'doc/development/database')
-rw-r--r-- doc/development/database/add_foreign_key_to_existing_column.md | 115
-rw-r--r-- doc/development/database/adding_database_indexes.md | 63
-rw-r--r-- doc/development/database/avoiding_downtime_in_migrations.md | 146
-rw-r--r-- doc/development/database/batched_background_migrations.md | 77
-rw-r--r-- doc/development/database/clickhouse/gitlab_activity_data.md | 482
-rw-r--r-- doc/development/database/clickhouse/index.md | 85
-rw-r--r-- doc/development/database/clickhouse/optimization.md | 60
-rw-r--r-- doc/development/database/database_debugging.md | 18
-rw-r--r-- doc/development/database/index.md | 17
-rw-r--r-- doc/development/database/migrations_for_multiple_databases.md | 3
-rw-r--r-- doc/development/database/new_database_migration_version.md | 64
-rw-r--r-- doc/development/database/pagination_guidelines.md | 2
-rw-r--r-- doc/development/database/query_recorder.md | 1
-rw-r--r-- doc/development/database/table_partitioning.md | 103
14 files changed, 1107 insertions, 129 deletions
diff --git a/doc/development/database/add_foreign_key_to_existing_column.md b/doc/development/database/add_foreign_key_to_existing_column.md
index 07fa8133496..2c2999e69d6 100644
--- a/doc/development/database/add_foreign_key_to_existing_column.md
+++ b/doc/development/database/add_foreign_key_to_existing_column.md
@@ -143,3 +143,118 @@ class ValidateForeignKeyOnEmailUsers < Gitlab::Database::Migration[2.1]
end
end
```
+
+### Validate the foreign key asynchronously
+
+For very large tables, foreign key validation can be a challenge to manage when
+it runs for many hours. Necessary database operations like `autovacuum` cannot
+run, and on GitLab.com, the deployment process is blocked waiting for the
+migrations to finish.
+
+To limit impact on GitLab.com, a process exists to validate them asynchronously
+during weekend hours. Due to generally lower traffic and fewer deployments,
+FK validation can proceed at a lower level of risk.
+
+### Schedule foreign key validation for a low-impact time
+
+1. [Schedule the FK to be validated](#schedule-the-fk-to-be-validated).
+1. [Verify the MR was deployed and the FK is valid in production](#verify-the-mr-was-deployed-and-the-fk-is-valid-in-production).
+1. [Add a migration to validate the FK synchronously](#add-a-migration-to-validate-the-fk-synchronously).
+
+### Schedule the FK to be validated
+
+1. Create a merge request containing a post-deployment migration, which prepares
+ the foreign key for asynchronous validation.
+1. Create a follow-up issue to add a migration that validates the foreign key
+ synchronously.
+1. In the merge request that prepares the asynchronous foreign key, add a
+ comment mentioning the follow-up issue.
+
+An example of validating the foreign key using the asynchronous helpers can be
+seen in the block below. This migration enters the foreign key name into the
+`postgres_async_foreign_key_validations` table. The process that runs on
+weekends pulls foreign keys from this table and attempts to validate them.
+
+```ruby
+# in db/post_migrate/
+
+FK_NAME = :fk_be5624bf37
+
+# TODO: FK to be validated synchronously in issue or merge request
+def up
+ # `some_column` can be an array of columns, and is not mandatory if `name` is supplied.
+ # `name` takes precedence over other arguments.
+ prepare_async_foreign_key_validation :ci_builds, :some_column, name: FK_NAME
+
+ # Or in case of partitioned tables, use:
+ prepare_partitioned_async_foreign_key_validation :p_ci_builds, :some_column, name: FK_NAME
+end
+
+def down
+ unprepare_async_foreign_key_validation :ci_builds, :some_column, name: FK_NAME
+
+ # Or in case of partitioned tables, use:
+ unprepare_partitioned_async_foreign_key_validation :p_ci_builds, :some_column, name: FK_NAME
+end
+```
+
+### Verify the MR was deployed and the FK is valid in production
+
+1. Verify that the post-deploy migration was executed on GitLab.com using ChatOps with
+ `/chatops run auto_deploy status <merge_sha>`. If the output returns `db/gprd`,
+ the post-deploy migration has been executed in the production database. For more information, see
+ [How to determine if a post-deploy migration has been executed on GitLab.com](https://gitlab.com/gitlab-org/release/docs/-/blob/master/general/post_deploy_migration/readme.md#how-to-determine-if-a-post-deploy-migration-has-been-executed-on-gitlabcom).
+1. Wait until the next week so that the FK can be validated over a weekend.
+1. Use [Database Lab](database_lab.md) to check if validation was successful.
+ Ensure the output does not indicate the foreign key is `NOT VALID`.
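+
+For example, a quick way to perform this check from Database Lab is to query the
+`pg_constraint` catalog directly (using the example FK name from this page):
+
+```sql
+-- The foreign key is valid when `convalidated` is `t`.
+SELECT conname, convalidated
+FROM pg_constraint
+WHERE conname = 'fk_be5624bf37';
+```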
+
+### Add a migration to validate the FK synchronously
+
+After the foreign key is valid on the production database, create a second
+merge request that validates the foreign key synchronously. The schema changes
+must be updated and committed to `structure.sql` in this second merge request.
+The synchronous migration results in a no-op on GitLab.com, but you should still
+add the migration as expected for other installations. The block below
+demonstrates how to create the second migration for the previous
+asynchronous example.
+
+WARNING:
+Verify that the foreign key is valid in production before merging a second
+migration with `validate_foreign_key`. If the second migration is deployed
+before the validation has been executed, the foreign key is validated
+synchronously when the second migration executes.
+
+```ruby
+# in db/post_migrate/
+
+FK_NAME = :fk_be5624bf37
+
+def up
+  validate_foreign_key :ci_builds, :some_column, name: FK_NAME
+end
+
+def down
+  # Can be safely a no-op if we don't roll back the inconsistent data.
+end
+```
+
+## Test database FK changes locally
+
+You must test the database foreign key changes locally before creating a merge request.
+
+### Verify the foreign keys validated asynchronously
+
+Use the asynchronous helpers on your local environment to test changes for
+validating a foreign key:
+
+1. Enable the feature flags by running `Feature.enable(:database_async_foreign_key_validation)`
+ and `Feature.enable(:database_reindexing)` in the Rails console.
+1. Run `bundle exec rails db:migrate` so that it creates an entry in the async validation table.
+1. Run `bundle exec rails gitlab:db:reindex` so that the FK is validated asynchronously.
+1. To verify the foreign key, open the PostgreSQL console using the
+ [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/postgresql.md)
+ command `gdk psql` and run the command `\d+ table_name` to check that your
+ foreign key is valid. A successful validation removes `NOT VALID` from
+ the foreign key definition.
diff --git a/doc/development/database/adding_database_indexes.md b/doc/development/database/adding_database_indexes.md
index e1d5a7af6d9..1e3a1de9b69 100644
--- a/doc/development/database/adding_database_indexes.md
+++ b/doc/development/database/adding_database_indexes.md
@@ -214,6 +214,45 @@ def down
end
```
+## Analyzing a new index before a batched background migration
+
+Sometimes it is necessary to add an index to support a [batched background migration](batched_background_migrations.md).
+It is commonly done by creating two [post deployment migrations](post_deployment_migrations.md):
+
+1. Add the new index, often a [temporary index](#temporary-indexes).
+1. [Queue the batched background migration](batched_background_migrations.md#queueing).
+
+In most cases, no additional work is needed. The new index is created and is used
+as expected when queuing and executing the batched background migration.
+
+[Expression indexes](https://www.postgresql.org/docs/current/indexes-expressional.html),
+however, do not generate statistics for the new index on creation. Autovacuum
+eventually runs `ANALYZE`, and updates the statistics so the new index is used.
+Run `ANALYZE` explicitly only if it is needed right after the index
+is created, such as in the background migration scenario described above.
+
+To trigger `ANALYZE` after the index is created, update the index creation migration
+to analyze the table:
+
+```ruby
+# in db/post_migrate/
+
+INDEX_NAME = 'tmp_index_projects_on_owner_and_lower_name_where_emails_disabled'
+TABLE = :projects
+
+disable_ddl_transaction!
+
+def up
+ add_concurrent_index TABLE, '(creator_id, lower(name))', where: 'emails_disabled = false', name: INDEX_NAME
+
+ connection.execute("ANALYZE #{TABLE}")
+end
+```
+
+`ANALYZE` should only be run in post deployment migrations and should not target
+[large tables](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/rubocop-migrations.yml#L3).
+If this behavior is needed on a larger table, ask for assistance in the `#database` Slack channel.
+
## Indexes for partitioned tables
Indexes [cannot be created](https://www.postgresql.org/docs/15/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-MAINTENANCE)
@@ -254,7 +293,7 @@ end
For very large tables, index creation can be a challenge to manage.
While `add_concurrent_index` creates indexes in a way that does not block
-normal traffic, it can still be problematic when index creation runs for
+ordinary traffic, it can still be problematic when index creation runs for
many hours. Necessary database operations like `autovacuum` cannot run, and
on GitLab.com, the deployment process is blocked waiting for index
creation to finish.
@@ -271,8 +310,13 @@ index creation can proceed at a lower level of risk.
### Schedule the index to be created
-Create an MR with a post-deployment migration which prepares the index
-for asynchronous creation. An example of creating an index using
+1. Create a merge request containing a post-deployment migration, which prepares
+ the index for asynchronous creation.
+1. [Create a follow-up issue](https://gitlab.com/gitlab-org/gitlab/-/issues/new?issuable_template=Synchronous%20Database%20Index)
+ to add a migration that creates the index synchronously.
+1. In the merge request that prepares the asynchronous index, add a comment mentioning the follow-up issue.
+
+An example of creating an index using
the asynchronous index helpers can be seen in the block below. This migration
enters the index name and definition into the `postgres_async_indexes`
table. The process that runs on weekends pulls indexes from this
@@ -283,6 +327,7 @@ table and attempt to create them.
INDEX_NAME = 'index_ci_builds_on_some_column'
+# TODO: Index to be created synchronously in https://gitlab.com/gitlab-org/gitlab/-/issues/XXXXX
def up
prepare_async_index :ci_builds, :some_column, name: INDEX_NAME
end
@@ -351,7 +396,7 @@ Use the asynchronous index helpers on your local environment to test changes for
For very large tables, index destruction can be a challenge to manage.
While `remove_concurrent_index` removes indexes in a way that does not block
-normal traffic, it can still be problematic if index destruction runs for
+ordinary traffic, it can still be problematic if index destruction runs
during `autovacuum`. Necessary database operations like `autovacuum` cannot run, and
the deployment process on GitLab.com is blocked while waiting for index
destruction to finish.
@@ -366,8 +411,13 @@ index destruction can proceed at a lower level of risk.
### Schedule the index to be removed
-Create an MR with a post-deployment migration which prepares the index
-for asynchronous destruction. For example. to destroy an index using
+1. Create a merge request containing a post-deployment migration, which prepares
+ the index for asynchronous destruction.
+1. [Create a follow-up issue](https://gitlab.com/gitlab-org/gitlab/-/issues/new?issuable_template=Synchronous%20Database%20Index)
+ to add a migration that destroys the index synchronously.
+1. In the merge request that prepares the asynchronous index removal, add a comment mentioning the follow-up issue.
+
+For example, to destroy an index using
the asynchronous index helpers:
```ruby
@@ -375,6 +425,7 @@ the asynchronous index helpers:
INDEX_NAME = 'index_ci_builds_on_some_column'
+# TODO: Index to be destroyed synchronously in https://gitlab.com/gitlab-org/gitlab/-/issues/XXXXX
def up
prepare_async_index_removal :ci_builds, :some_column, name: INDEX_NAME
end
diff --git a/doc/development/database/avoiding_downtime_in_migrations.md b/doc/development/database/avoiding_downtime_in_migrations.md
index bb6e13eff53..8e1eeee7a42 100644
--- a/doc/development/database/avoiding_downtime_in_migrations.md
+++ b/doc/development/database/avoiding_downtime_in_migrations.md
@@ -6,20 +6,20 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Avoiding downtime in migrations
-When working with a database certain operations may require downtime. Since we
+When working with a database, certain operations may require downtime. As we
cannot have downtime in migrations we need to use a set of steps to get the
same end result without downtime. This guide describes various operations that
may appear to need downtime, their impact, and how to perform them without
requiring downtime.
-## Dropping Columns
+## Dropping columns
Removing columns is tricky because running GitLab processes may still be using
the columns. To work around this safely, you need three steps in three releases:
-1. Ignoring the column (release M)
-1. Dropping the column (release M+1)
-1. Removing the ignore rule (release M+2)
+1. [Ignoring the column](#ignoring-the-column-release-m) (release M)
+1. [Dropping the column](#dropping-the-column-release-m1) (release M+1)
+1. [Removing the ignore rule](#removing-the-ignore-rule-release-m2) (release M+2)
The reason we spread this out across three releases is that dropping a column is
a destructive operation that can't be rolled back easily.
@@ -27,9 +27,9 @@ a destructive operation that can't be rolled back easily.
Following this procedure helps us to make sure there are no deployments to GitLab.com
and upgrade processes for self-managed installations that lump together any of these steps.
-### Step 1: Ignoring the column (release M)
+### Ignoring the column (release M)
-The first step is to ignore the column in the application code. This is
+The first step is to ignore the column in the application code. This step is
necessary because Rails caches the columns and re-uses this cache in various
places. This can be done by defining the columns to ignore. For example, to ignore
`updated_at` in the User model you'd use the following:
@@ -50,7 +50,7 @@ ignore_columns %i[updated_at created_at], remove_with: '12.7', remove_after: '20
If the model exists in CE and EE, the column has to be ignored in the CE model. If the
model only exists in EE, then it has to be added there.
-We require indication of when it is safe to remove the column ignore with:
+We require indication of when it is safe to remove the column ignore rule with:
- `remove_with`: set to a GitLab release typically two releases (M+2) after adding the
column ignore.
@@ -64,7 +64,7 @@ to ignore the column and subsequently remove the column ignore (which would resu
In this example, the change to ignore the column went into release 12.5.
-### Step 2: Dropping the column (release M+1)
+### Dropping the column (release M+1)
Continuing our example, dropping the column goes into a _post-deployment_ migration in release 12.6:
@@ -74,12 +74,14 @@ Start by creating the **post-deployment migration**:
bundle exec rails g post_deployment_migration remove_users_updated_at_column
```
-There are two scenarios that you need to consider
-to write a migration that removes a column:
+You must consider these scenarios when you write a migration that removes a column:
-#### A. The removed column has no indexes or constraints that belong to it
+- [The removed column has no indexes or constraints that belong to it](#the-removed-column-has-no-indexes-or-constraints-that-belong-to-it)
+- [The removed column has an index or constraint that belongs to it](#the-removed-column-has-an-index-or-constraint-that-belongs-to-it)
-In this case, a **transactional migration** can be used. Something as simple as:
+#### The removed column has no indexes or constraints that belong to it
+
+In this case, a **transactional migration** can be used:
```ruby
class RemoveUsersUpdatedAtColumn < Gitlab::Database::Migration[2.1]
@@ -97,10 +99,10 @@ You can consider [enabling lock retries](../migration_style_guide.md#usage-with-
when you run a migration on big tables, because it might take some time to
acquire a lock on this table.
-#### B. The removed column has an index or constraint that belongs to it
+#### The removed column has an index or constraint that belongs to it
If the `down` method requires adding back any dropped indexes or constraints, that cannot
-be done within a transactional migration, then the migration would look like this:
+be done in a transactional migration. The migration would look like this:
```ruby
class RemoveUsersUpdatedAtColumn < Gitlab::Database::Migration[2.1]
@@ -131,7 +133,7 @@ is used to disable the transaction that wraps the whole migration.
You can refer to the page [Migration Style Guide](../migration_style_guide.md)
for more information about database migrations.
-### Step 3: Removing the ignore rule (release M+2)
+### Removing the ignore rule (release M+2)
With the next release, in this example 12.7, we set up another merge request to remove the ignore rule.
This removes the `ignore_column` line and - if not needed anymore - also the inclusion of `IgnoreableColumns`.
@@ -139,18 +141,24 @@ This removes the `ignore_column` line and - if not needed anymore - also the inc
This should only get merged with the release indicated with `remove_with` and once
the `remove_after` date has passed.
-## Renaming Columns
+## Renaming columns
-Renaming columns the normal way requires downtime as an application may continue
+Renaming columns the standard way requires downtime as an application may continue
to use the old column names during or after a database migration. To rename a column
without requiring downtime, we need two migrations: a regular migration and a
post-deployment migration. Both these migrations can go in the same release.
+The steps:
+
+1. [Add the regular migration](#add-the-regular-migration-release-m) (release M)
+1. [Ignore the column](#ignore-the-column-release-m) (release M)
+1. [Add a post-deployment migration](#add-a-post-deployment-migration-release-m) (release M)
+1. [Remove the ignore rule](#remove-the-ignore-rule-release-m1) (release M+1)
NOTE:
It's not possible to rename columns with default values. For more details, see
[this merge request](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/52032#default-values).
-### Step 1: Add The Regular Migration
+### Add the regular migration (release M)
First we need to create the regular migration. This migration should use
`Gitlab::Database::MigrationHelpers#rename_column_concurrently` to perform the
@@ -178,7 +186,20 @@ If a column contains one or more indexes that don't contain the name of the
original column, the previously described procedure fails. In that case,
you need to rename these indexes.
-### Step 2: Add A Post-Deployment Migration
+### Ignore the column (release M)
+
+The next step is to ignore the column in the application code, and make sure it is not used. This step is
+necessary because Rails caches the columns and re-uses this cache in various places.
+This step is similar to [the first step when a column is dropped](#ignoring-the-column-release-m), and the same requirements apply.
+
+```ruby
+class User < ApplicationRecord
+ include IgnorableColumns
+ ignore_column :updated_at, remove_with: '12.7', remove_after: '2020-01-22'
+end
+```
+
+### Add a post-deployment migration (release M)
The renaming procedure requires some cleaning up in a post-deployment migration.
We can perform this cleanup using
@@ -202,7 +223,11 @@ end
If you're renaming a [large table](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/rubocop-migrations.yml#L3), carefully consider the state when the first migration has run but the second cleanup migration hasn't been run yet.
With [Canary](https://gitlab.com/gitlab-com/gl-infra/readiness/-/tree/master/library/canary/) it is possible that the system runs in this state for a significant amount of time.
-## Changing Column Constraints
+### Remove the ignore rule (release M+1)
+
+As when a column is dropped, after the rename is completed, we need to [remove the ignore rule](#removing-the-ignore-rule-release-m2) in a subsequent release.
+
+## Changing column constraints
Adding or removing a `NOT NULL` clause (or another constraint) can typically be
done without requiring downtime. However, this does require that any application
@@ -218,14 +243,18 @@ You can check the following guides for each specific use case:
- [Adding `NOT NULL` constraints](not_null_constraints.md)
- [Adding limits to text columns](strings_and_the_text_data_type.md)
-## Changing Column Types
+## Changing column types
Changing the type of a column can be done using
`Gitlab::Database::MigrationHelpers#change_column_type_concurrently`. This
method works similarly to `rename_column_concurrently`. For example, let's say
-we want to change the type of `users.username` from `string` to `text`.
+we want to change the type of `users.username` from `string` to `text`:
+
+1. [Create a regular migration](#create-a-regular-migration)
+1. [Create a post-deployment migration](#create-a-post-deployment-migration)
+1. [Casting data to a new type](#casting-data-to-a-new-type)
-### Step 1: Create A Regular Migration
+### Create a regular migration
A regular migration is used to create a new column with a temporary name along
with setting up some triggers to keep data in sync. Such a migration would look
@@ -246,7 +275,7 @@ class ChangeUsersUsernameStringToText < Gitlab::Database::Migration[2.1]
end
```
-### Step 2: Create A Post Deployment Migration
+### Create a post-deployment migration
Next we need to clean up our changes using a post-deployment migration:
@@ -293,13 +322,13 @@ specify the old default.
Doing this requires steps in two minor releases:
-1. Add the `SafelyChangeColumnDefault` concern to the model and change the default in a post-migration.
-1. Clean up the `SafelyChangeColumnDefault` concern in the next minor release.
+1. [Add the `SafelyChangeColumnDefault` concern to the model](#add-the-safelychangecolumndefault-concern-to-the-model-and-change-the-default-in-a-post-migration) and change the default in a post-migration.
+1. [Clean up the `SafelyChangeColumnDefault` concern](#clean-up-the-safelychangecolumndefault-concern-in-the-next-minor-release) in the next minor release.
We must wait a minor release before cleaning up the `SafelyChangeColumnDefault` because self-managed
releases bundle an entire minor release into a single zero-downtime deployment.
-### Step 1: Add the `SafelyChangeColumnDefault` concern to the model and change the default in a post-migration
+### Add the `SafelyChangeColumnDefault` concern to the model and change the default in a post-migration
The first step is to mark the column as safe to change in application code.
@@ -333,12 +362,12 @@ You can consider [enabling lock retries](../migration_style_guide.md#usage-with-
when you run a migration on big tables, because it might take some time to
acquire a lock on this table.
-### Step 2: Clean up the `SafelyChangeColumnDefault` concern in the next minor release
+### Clean up the `SafelyChangeColumnDefault` concern in the next minor release
In the next minor release, create a new merge request to remove the `columns_changing_default` call. Also remove the `SafelyChangeColumnDefault` include
if it is not needed for a different column.
-## Changing The Schema For Large Tables
+## Changing the schema for large tables
While `change_column_type_concurrently` and `rename_column_concurrently` can be
used for changing the schema of a table without downtime, it doesn't work very
@@ -354,7 +383,7 @@ down deployments.
For more information, see [the documentation on cleaning up batched background migrations](batched_background_migrations.md#cleaning-up).
-## Adding Indexes
+## Adding indexes
Adding indexes does not require downtime when `add_concurrent_index`
is used.
@@ -362,15 +391,15 @@ is used.
See also [Migration Style Guide](../migration_style_guide.md#adding-indexes)
for more information.
-## Dropping Indexes
+## Dropping indexes
Dropping an index does not require downtime.
-## Adding Tables
+## Adding tables
This operation is safe as there's no code using the table just yet.
-## Dropping Tables
+## Dropping tables
Dropping tables can be done safely using a post-deployment migration, but only
if the application no longer uses the table.
@@ -378,7 +407,7 @@ if the application no longer uses the table.
Add the table to [`db/docs/deleted_tables`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/db/docs/deleted_tables) using the process described in [database dictionary](database_dictionary.md#dropping-tables).
Even though the table is deleted, it is still referenced in database migrations.
-## Renaming Tables
+## Renaming tables
Renaming tables requires downtime as an application may continue
using the old table name during/after a database migration.
@@ -389,7 +418,7 @@ table and creating a new one is the preferred way to "rename" the table.
Renaming a table is possible without downtime by following our multi-release
[rename table process](rename_database_tables.md#rename-table-without-downtime).
-## Adding Foreign Keys
+## Adding foreign keys
Adding foreign keys usually works in 3 steps:
@@ -404,7 +433,7 @@ GitLab allows you to work around this by using
`Gitlab::Database::MigrationHelpers#add_concurrent_foreign_key`. This method
ensures that no downtime is needed.
-## Removing Foreign Keys
+## Removing foreign keys
This operation does not require downtime.
@@ -418,14 +447,16 @@ without downtime and causing too much load on the database is described below.
To start the process, add a regular migration to create the new `bigint` columns. Use the provided
`initialize_conversion_of_integer_to_bigint` helper. The helper also creates a database trigger
-to keep in sync both columns for any new records ([see an example](https://gitlab.com/gitlab-org/gitlab/-/blob/41fbe34a4725a4e357a83fda66afb382828767b2/db/migrate/20210608072312_initialize_conversion_of_ci_stages_to_bigint.rb)):
+to keep in sync both columns for any new records ([code](https://gitlab.com/gitlab-org/gitlab/-/blob/97aee76c4bfc2043dc0a1ef9ffbb71c58e0e2857/db/migrate/20230127093353_initialize_conversion_of_merge_request_metrics_to_bigint.rb)):
```ruby
-class InitializeConversionOfCiStagesToBigint < ActiveRecord::Migration[6.1]
- include Gitlab::Database::MigrationHelpers
+# frozen_string_literal: true
+
+class InitializeConversionOfMergeRequestMetricsToBigint < Gitlab::Database::Migration[2.1]
+ disable_ddl_transaction!
- TABLE = :ci_stages
- COLUMNS = %i(id)
+ TABLE = :merge_request_metrics
+ COLUMNS = %i[id]
def up
initialize_conversion_of_integer_to_bigint(TABLE, COLUMNS)
@@ -440,29 +471,28 @@ end
Ignore the new `bigint` columns:
```ruby
-module Ci
- class Stage < Ci::ApplicationRecord
- include IgnorableColumns
- ignore_column :id_convert_to_bigint, remove_with: '14.2', remove_after: '2021-08-22'
- end
+# frozen_string_literal: true
+
+class MergeRequest::Metrics < ApplicationRecord
+ include IgnorableColumns
+ ignore_column :id_convert_to_bigint, remove_with: '16.0', remove_after: '2023-05-22'
+end
```
-To migrate existing data, we introduced new type of _batched background migrations_.
-Unlike the classic background migrations, built on top of Sidekiq, batched background migrations
-don't have to enqueue and schedule all the background jobs at the beginning.
-They also have other advantages, like automatic tuning of the batch size, better progress visibility,
-and collecting metrics. To start the process, use the provided `backfill_conversion_of_integer_to_bigint`
-helper ([example](https://gitlab.com/gitlab-org/gitlab/-/blob/41fbe34a4725a4e357a83fda66afb382828767b2/db/migrate/20210608072346_backfill_ci_stages_for_bigint_conversion.rb)):
+Enqueue a batched background migration ([code](https://gitlab.com/gitlab-org/gitlab/-/blob/97aee76c4bfc2043dc0a1ef9ffbb71c58e0e2857/db/post_migrate/20230127101834_backfill_merge_request_metrics_for_bigint_conversion.rb))
+to migrate the existing data:
```ruby
-class BackfillCiStagesForBigintConversion < ActiveRecord::Migration[6.1]
- include Gitlab::Database::MigrationHelpers
+# frozen_string_literal: true
+
+class BackfillMergeRequestMetricsForBigintConversion < Gitlab::Database::Migration[2.1]
+ restrict_gitlab_migration gitlab_schema: :gitlab_main
- TABLE = :ci_stages
- COLUMNS = %i(id)
+ TABLE = :merge_request_metrics
+ COLUMNS = %i[id]
def up
- backfill_conversion_of_integer_to_bigint(TABLE, COLUMNS)
+ backfill_conversion_of_integer_to_bigint(TABLE, COLUMNS, sub_batch_size: 200)
end
def down
diff --git a/doc/development/database/batched_background_migrations.md b/doc/development/database/batched_background_migrations.md
index 88fdfab9828..c6fe6d16faf 100644
--- a/doc/development/database/batched_background_migrations.md
+++ b/doc/development/database/batched_background_migrations.md
@@ -309,7 +309,7 @@ In the second (filtered) example, we know exactly 100 will be updated with each
NOTE:
When applying additional filters, it is important to ensure they are properly covered by an index to optimize `EachBatch` performance.
-In the example above we need an index on `(type, id)` to support the filters. See [the `EachBatch` docs for more information](iterating_tables_in_batches.md).
+In the example above we need an index on `(type, id)` to support the filters. See [the `EachBatch` documentation for more information](iterating_tables_in_batches.md).
## Example
@@ -349,20 +349,35 @@ background migration.
`BatchedMigrationJob` is initialized with necessary arguments to
execute the batch, as well as a connection to the tracking database.
-1. Add a new trigger to the database to update newly created and updated routes,
- similar to this example:
+1. Create a database migration that adds a new trigger to the database. Example:
```ruby
- execute(<<~SQL)
- CREATE OR REPLACE FUNCTION example() RETURNS trigger
- LANGUAGE plpgsql
- AS $$
- BEGIN
- NEW."namespace_id" = NEW."source_id"
- RETURN NEW;
- END;
- $$;
- SQL
+ class AddTriggerToRoutesToCopySourceIdToNamespaceId < Gitlab::Database::Migration[2.1]
+ FUNCTION_NAME = 'example_function'
+ TRIGGER_NAME = 'example_trigger'
+
+ def up
+ execute(<<~SQL)
+ CREATE OR REPLACE FUNCTION #{FUNCTION_NAME}() RETURNS trigger
+ LANGUAGE plpgsql
+ AS $$
+ BEGIN
+          NEW."namespace_id" = NEW."source_id";
+ RETURN NEW;
+ END;
+ $$;
+
+        CREATE TRIGGER #{TRIGGER_NAME} AFTER INSERT OR UPDATE
+ ON routes
+ FOR EACH ROW EXECUTE FUNCTION #{FUNCTION_NAME}();
+ SQL
+ end
+
+ def down
+ drop_trigger(TRIGGER_NAME, :routes)
+ drop_function(FUNCTION_NAME)
+ end
+ end
```
1. Create a post-deployment migration that queues the migration for existing data:
@@ -398,10 +413,28 @@ background migration.
`restrict_gitlab_migration gitlab_schema: :gitlab_ci`.
After deployment, our application:
- - Continues using the data as before.
- - Ensures that both existing and new data are migrated.
+ - Continues using the data as before.
+ - Ensures that both existing and new data are migrated.
+
+1. In the next release, add a database migration to remove the trigger.
-1. In the next release, remove the trigger. We must also add a new post-deployment migration
+ ```ruby
+   class RemoveNamespaceIdTriggerFromRoutes < Gitlab::Database::Migration[2.1]
+ FUNCTION_NAME = 'example_function'
+ TRIGGER_NAME = 'example_trigger'
+
+ def up
+ drop_trigger(TRIGGER_NAME, :routes)
+ drop_function(FUNCTION_NAME)
+ end
+
+ def down
+ # Should reverse the trigger and the function in the up method of the migration that added it
+ end
+ end
+ ```
+
+1. Add a new post-deployment migration
that checks that the batched background migration is completed. For example:
```ruby
@@ -502,7 +535,7 @@ end
```
NOTE:
-[Additional filters](#additional-filters) defined with `scope_to` will be ignored by `LooseIndexScanBatchingStrategy` and `distinct_each_batch`.
+[Additional filters](#additional-filters) defined with `scope_to` are ignored by `LooseIndexScanBatchingStrategy` and `distinct_each_batch`.
## Testing
@@ -686,6 +719,16 @@ You can view failures in two ways:
WHERE transition_logs.next_status = '2' AND migration.job_class_name = "CLASS_NAME";
```
+### Adding indexes to support batched background migrations
+
+Sometimes it is necessary to add a new or temporary index to support a batched background migration.
+To do this, create the index in a post-deployment migration that precedes the post-deployment
+migration that queues the background migration.
+
+See the documentation for [adding database indexes](adding_database_indexes.md#analyzing-a-new-index-before-a-batched-background-migration)
+for additional information about some cases that require special attention to allow the index to be used directly after
+creation.
+
## Legacy background migrations
Batched background migrations replaced the [legacy background migrations framework](background_migrations.md).
diff --git a/doc/development/database/clickhouse/gitlab_activity_data.md b/doc/development/database/clickhouse/gitlab_activity_data.md
new file mode 100644
index 00000000000..6ba11b8afaf
--- /dev/null
+++ b/doc/development/database/clickhouse/gitlab_activity_data.md
@@ -0,0 +1,482 @@
+---
+stage: Data Stores
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Store GitLab activity data in ClickHouse
+
+## Overview of the existing implementation
+
+### What is GitLab activity data
+
+GitLab records activity data during its operation as users interact with the application. Most of these interactions revolve around the projects, issues, and merge requests domain objects. Users can perform several different actions and some of these actions are recorded in a separate PostgreSQL database table called `events`.
+
+Example events:
+
+- Issue opened
+- Issue reopened
+- User joined a project
+- Merge Request merged
+- Repository pushed
+- Snippet created
+
+### Where is the activity data used
+
+Several features use activity data:
+
+- The user's [contribution calendar](../../../user/profile/contributions_calendar.md) on the profile page.
+- Paginated list of the user's contributions.
+- Paginated list of user activity for a Project and a Group.
+- [Contribution analytics](../../../user/group/contribution_analytics/index.md).
+
+### How is the activity data created
+
+The activity data is usually generated on the service layer when a specific operation is executed by the user. The persistence characteristics of an `events` record depend on the implementation of the service. Two main approaches exist:
+
+1. In the database transaction where the actual event occurs.
+1. After the database transaction (which could be delayed).
+
+The above-mentioned mechanics provide a "mostly" consistent stream of `events`.
+
+For example, consistently recording an `events` record:
+
+```ruby
+ApplicationRecord.transaction do
+ issue.closed!
+ Event.create!(action: :closed, target: issue)
+end
+```
+
+For example, unsafely recording an `events` record:
+
+```ruby
+ApplicationRecord.transaction do
+ issue.closed!
+end
+
+# If a crash happens here, the event will not be recorded.
+Event.create!(action: :closed, target: issue)
+```
+
+### Database table structure
+
+The `events` table uses [polymorphic association](https://guides.rubyonrails.org/association_basics.html#polymorphic-associations) to allow associating different database tables (issues, merge requests, etc.) with a record. A simplified database structure:
+
+```sql
+ Column | Type | Nullable | Default | Storage |
+-------------+--------------------------+-----------+----------+------------------------------------+
+ project_id | integer | | | plain |
+ author_id | integer | not null | | plain |
+ target_id | integer | | | plain |
+ created_at | timestamp with time zone | not null | | plain |
+ updated_at | timestamp with time zone | not null | | plain |
+ action | smallint | not null | | plain |
+ target_type | character varying | | | extended |
+ group_id | bigint | | | plain |
+ fingerprint | bytea | | | extended |
+ id | bigint | not null | nextval('events_id_seq'::regclass) | plain |
+```
+
+Some unexpected characteristics due to the evolving database design:
+
+- The `project_id` and the `group_id` columns are mutually exclusive; internally, we refer to them as the resource parent.
+ - Example 1: for an issue opened event, the `project_id` field is populated.
+ - Example 2: for an epic-related event, the `group_id` field is populated (epic is always part of a group).
+- The `target_id` and `target_type` column pair identifies the target record.
+ - Example: `target_id=1` and `target_type=Issue`.
+ - When the columns are `null`, we refer to an event that has no representation in the database. For example, a repository `push` action.
+- Fingerprint is used in some cases to later alter the event based on some metadata change. This approach is mostly used for Wiki pages.
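+
+For illustration, resolving the polymorphic target described above for issue-related
+events could be done with a query along these lines (a sketch; the application
+normally goes through the Rails associations instead):
+
+```sql
+-- Sketch: join events to their Issue targets through the polymorphic columns.
+SELECT events.id, events.action, issues.title
+FROM events
+JOIN issues ON issues.id = events.target_id
+WHERE events.target_type = 'Issue'
+ORDER BY events.created_at DESC
+LIMIT 10;
+```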
+
+### Database record modifications
+
+Most of the data is written once; however, we cannot say that the table is append-only. A few use cases where actual row updates and deletions happen:
+
+- Fingerprint-based update for certain Wiki page records.
+- When a user or an associated resource is deleted, the event rows are also deleted.
+ - The deletion of the associated `events` records happens in batches.
+
+### Current performance problems
+
+- The table uses significant disk space.
+- Adding new events may significantly increase the database record count.
+- Implementing data pruning logic is difficult.
+- Time-range-based aggregations are not performant enough; some features may break due to slow database queries.
+
+### Example queries
+
+NOTE:
+These queries have been significantly simplified from the actual queries from production.
+
+Database query for the user's contribution graph:
+
+```sql
+SELECT DATE(events.created_at), COUNT(*)
+FROM events
+WHERE events.author_id = 1
+AND events.created_at BETWEEN '2022-01-17 23:00:00' AND '2023-01-18 22:59:59.999999'
+AND (
+ (
+ events.action = 5
+ ) OR
+ (
+ events.action IN (1, 3) -- Enum values are documented in the Event model, see the ACTIONS constant in app/models/event.rb
+ AND events.target_type IN ('Issue', 'WorkItem')
+ ) OR
+ (
+ events.action IN (7, 1, 3)
+ AND events.target_type = 'MergeRequest'
+ ) OR
+ (
+ events.action = 6
+ )
+)
+GROUP BY DATE(events.created_at)
+```
+
+Query for group contributions for each user:
+
+```sql
+SELECT events.author_id, events.target_type, events.action, COUNT(*)
+FROM events
+WHERE events.created_at BETWEEN '2022-01-17 23:00:00' AND '2023-03-18 22:59:59.999999'
+AND events.project_id IN (1, 2, 3) -- list of project ids in the group
+GROUP BY events.author_id, events.target_type, events.action
+```
+
+## Storing activity data in ClickHouse
+
+### Data persistence
+
+At the moment, there is no consensus about the way we would replicate data from the PostgreSQL database to ClickHouse. A few ideas that might work for the `events` table:
+
+#### Record data immediately
+
+This approach provides a simple way to keep the existing `events` table working while we're also sending data to the ClickHouse database. When an event record is created, ensure that it's created outside of the transaction. After persisting the data in PostgreSQL, persist it in ClickHouse.
+
+```ruby
+ApplicationRecord.transaction do
+ issue.update!(state: :closed)
+end
+
+# could be a method to hide complexity
+Event.create!(action: :closed, target: issue)
+ClickHouse::Event.create(action: :closed, target: issue)
+```
+
+What's behind the implementation of `ClickHouse::Event` is not decided yet; it could be one of the following:
+
+- An ActiveRecord model directly connecting to the ClickHouse database.
+- A REST API call to an intermediate service.
+- An event enqueued to an event-streaming tool (like Kafka).
+
+#### Replication of `events` rows
+
+Assuming that the creation of an `events` record is an integral part of the system, introducing another storage call might cause performance degradation in various code paths, or it could add significant complexity.
+
+Rather than sending data to ClickHouse at event creation time, we would move this processing into the background by iterating over the `events` table and sending the newly created database rows.
+
+By keeping track of which records have been sent over to ClickHouse, we could send data incrementally.
+
+```ruby
+last_updated_at = SyncProcess.last_updated_at
+last_row = nil
+
+# oversimplified loop, we would probably batch this...
+Event.where('updated_at > ?', last_updated_at).each do |row|
+  last_row = ClickHouse::Event.create(row)
+end
+
+SyncProcess.last_updated_at = last_row.updated_at if last_row
+```
+
+### ClickHouse database table structure
+
+When coming up with the initial database structure, we must look at the way the data is queried.
+
+We have two main use cases:
+
+- Query data for a certain user, within a time range.
+ - `WHERE author_id = 1 AND created_at BETWEEN '2021-01-01' AND '2021-12-31'`
+ - Additionally, there might be extra `project_id` condition due to the access control check.
+- Query data for a project or group, within a time range.
+ - `WHERE project_id IN (1, 2) AND created_at BETWEEN '2021-01-01' AND '2021-12-31'`
+
+The `author_id` and `project_id` columns are considered high-selectivity columns. By this we mean that filtering on the `author_id` and `project_id` columns should be well optimized so that database queries stay performant.
+
+The most recent activity data is queried more often. At some point, we might just drop or relocate older data. Most of the features look back only a year.
+
+For these reasons, we could start with a database table storing low-level `events` data:
+
+```plantuml
+hide circle
+
+entity "events" as events {
+ id : UInt64 ("primary key")
+--
+ project_id : UInt64
+  group_id : UInt64
+  author_id : UInt64
+ target_id : UInt64
+ target_type : String
+ action : UInt8
+ fingerprint : UInt64
+ created_at : DateTime
+ updated_at : DateTime
+}
+```
+
+The SQL statement for creating the table:
+
+```sql
+CREATE TABLE events
+(
+ `id` UInt64,
+ `project_id` UInt64 DEFAULT 0 NOT NULL,
+ `group_id` UInt64 DEFAULT 0 NOT NULL,
+ `author_id` UInt64 DEFAULT 0 NOT NULL,
+ `target_id` UInt64 DEFAULT 0 NOT NULL,
+ `target_type` LowCardinality(String) DEFAULT '' NOT NULL,
+ `action` UInt8 DEFAULT 0 NOT NULL,
+ `fingerprint` UInt64 DEFAULT 0 NOT NULL,
+ `created_at` DateTime64(6, 'UTC') DEFAULT now() NOT NULL,
+ `updated_at` DateTime64(6, 'UTC') DEFAULT now() NOT NULL
+)
+ENGINE = ReplacingMergeTree(updated_at)
+ORDER BY id;
+```
+
+A few changes compared to the PostgreSQL version:
+
+- `target_type` uses [an optimization](https://clickhouse.com/docs/en/sql-reference/data-types/lowcardinality/) for low-cardinality column values.
+- `fingerprint` becomes an integer and leverages a performant integer-based hashing function such as xxHash64.
+- All columns get a default value; the `0` default value for the integer columns means no value. See the related [best practices](https://clickhouse.com/docs/en/cloud/bestpractices/avoid-nullable-columns/).
+- `NOT NULL` to ensure that we always use the default values when data is missing (different behavior compared to PostgreSQL).
+- The "primary" key automatically becomes the `id` column due to the `ORDER BY` clause.
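+
+For example, the `fingerprint` value mentioned above could be computed with a
+built-in ClickHouse hash function (a sketch; the exact input is up to the
+application):
+
+```sql
+-- Sketch: hash an application-provided fingerprint string into a UInt64 value.
+SELECT xxHash64('wiki-page-slug-or-other-metadata') AS fingerprint;
+```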
+
+Let's insert the same primary key value twice:
+
+```sql
+INSERT INTO events (id, project_id, target_id, author_id, target_type, action) VALUES (1, 2, 3, 4, 'Issue', null);
+INSERT INTO events (id, project_id, target_id, author_id, target_type, action) VALUES (1, 20, 30, 5, 'Issue', null);
+```
+
+Let's inspect the results:
+
+```sql
+SELECT * FROM events
+```
+
+- We have two rows with the same `id` value (primary key).
+- The `null` `action` becomes `0`.
+- The non-specified fingerprint column becomes `0`.
+- The `DateTime` columns have the insert timestamp.
+
+ClickHouse eventually "replaces" the rows with the same primary key in the background. When running this operation, the higher `updated_at` value takes precedence. The same behavior can be simulated with the `FINAL` keyword:
+
+```sql
+SELECT * FROM events FINAL
+```
+
+Adding `FINAL` to a query can have significant performance consequences; some of the issues are documented in the [ClickHouse documentation](https://clickhouse.com/docs/en/sql-reference/statements/select/from/#final-modifier).
+
+We should always expect duplicated values in the table, so we must take care of the deduplication at query time.
+
+### ClickHouse database queries
+
+ClickHouse uses SQL for querying the data. In some cases, a PostgreSQL query can be used in ClickHouse without major modifications, assuming that the underlying database structure is very similar.
+
+Query for group contributions for each user (PostgreSQL):
+
+```sql
+SELECT events.author_id, events.target_type, events.action, COUNT(*)
+FROM events
+WHERE events.created_at BETWEEN '2022-01-17 23:00:00' AND '2023-03-18 22:59:59.999999'
+AND events.project_id IN (1, 2, 3) -- list of project ids in the group
+GROUP BY events.author_id, events.target_type, events.action
+```
+
+The same query would also work in ClickHouse; however, we might see duplicated values due to the way the table engine works. The deduplication can be achieved by using a nested `FROM` statement.
+
+```sql
+SELECT author_id, target_type, action, count(*)
+FROM (
+ SELECT
+ id,
+ argMax(events.project_id, events.updated_at) AS project_id,
+ argMax(events.group_id, events.updated_at) AS group_id,
+ argMax(events.author_id, events.updated_at) AS author_id,
+ argMax(events.target_type, events.updated_at) AS target_type,
+ argMax(events.target_id, events.updated_at) AS target_id,
+ argMax(events.action, events.updated_at) AS action,
+ argMax(events.fingerprint, events.updated_at) AS fingerprint,
+ FIRST_VALUE(events.created_at) AS created_at,
+ MAX(events.updated_at) AS updated_at
+ FROM events
+ WHERE events.created_at BETWEEN '2022-01-17 23:00:00' AND '2023-03-18 22:59:59.999999'
+ AND events.project_id IN (1, 2, 3) -- list of project ids in the group
+ GROUP BY id
+) AS events
+GROUP BY author_id, target_type, action
+```
+
+- Take the most recent column values based on the `updated_at` column.
+- Take the first value for `created_at`, assuming that the first `INSERT` contains the correct value. This is an issue only when we don't sync `created_at` at all and the default value (`NOW()`) is used.
+- Take the most recent `updated_at` value.
+
+The query looks more complicated now because of the deduplication logic. The complexity can be hidden behind a database view.
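+
+A minimal sketch of such a view (the `events_deduplicated` name is illustrative):
+
+```sql
+-- Sketch: a regular view that hides the argMax-based deduplication logic.
+CREATE VIEW events_deduplicated AS
+SELECT
+    id,
+    argMax(project_id, updated_at) AS project_id,
+    argMax(group_id, updated_at) AS group_id,
+    argMax(author_id, updated_at) AS author_id,
+    argMax(target_type, updated_at) AS target_type,
+    argMax(target_id, updated_at) AS target_id,
+    argMax(action, updated_at) AS action,
+    argMax(fingerprint, updated_at) AS fingerprint,
+    min(created_at) AS created_at,
+    max(updated_at) AS updated_at
+FROM events
+GROUP BY id;
+```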
+
+### Optimizing the performance
+
+The aggregation query in the previous section might not be performant enough for production use due to the large volume of data.
+
+Let's add 1 million extra rows to the `events` table:
+
+```sql
+INSERT INTO events (id, project_id, author_id, target_id, target_type, action) SELECT id, project_id, author_id, target_id, 'Issue' AS target_type, action FROM generateRandom('id UInt64, project_id UInt64, author_id UInt64, target_id UInt64, action UInt64') LIMIT 1000000;
+```
+
+Running the previous aggregation query in the console prints out some performance data:
+
+```plaintext
+1 row in set. Elapsed: 0.122 sec. Processed 1.00 million rows, 42.00 MB (8.21 million rows/s., 344.96 MB/s.)
+```
+
+The query returned 1 row (correctly); however, it had to process 1 million rows (the full table). We can optimize the query with an index on the `project_id` column:
+
+```sql
+ALTER TABLE events ADD INDEX project_id_index project_id TYPE minmax GRANULARITY 10;
+ALTER TABLE events MATERIALIZE INDEX project_id_index;
+```
+
+Executing the query returns much better figures:
+
+```plaintext
+Read 2 rows, 107.00 B in 0.005616811 sec., 356 rows/sec., 18.60 KiB/sec.
+```
+
+To optimize the date range filter on the `created_at` column, we could try adding another index on the `created_at` column.
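+
+A possible sketch, mirroring the `project_id` index above (the index name and
+granularity are illustrative):
+
+```sql
+ALTER TABLE events ADD INDEX created_at_index created_at TYPE minmax GRANULARITY 10;
+ALTER TABLE events MATERIALIZE INDEX created_at_index;
+```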
+
+#### Query for the contribution graph
+
+Just to recap, this is the PostgreSQL query:
+
+```sql
+SELECT DATE(events.created_at), COUNT(*)
+FROM events
+WHERE events.author_id = 1
+AND events.created_at BETWEEN '2022-01-17 23:00:00' AND '2023-01-18 22:59:59.999999'
+AND (
+ (
+ events.action = 5
+ ) OR
+ (
+ events.action IN (1, 3) -- Enum values are documented in the Event model, see the ACTIONS constant in app/models/event.rb
+ AND events.target_type IN ('Issue', 'WorkItem')
+ ) OR
+ (
+ events.action IN (7, 1, 3)
+ AND events.target_type = 'MergeRequest'
+ ) OR
+ (
+ events.action = 6
+ )
+)
+GROUP BY DATE(events.created_at)
+```
+
+The filtering and the count aggregation are mainly done on the `author_id` and the `created_at` columns. Grouping the data by these two columns would probably give adequate performance.
+
+We could attempt adding an index on the `author_id` column; however, we would still need an additional index on the `created_at` column to properly cover this query. Besides, under the contribution graph, GitLab shows the user's ordered list of contributions, which would be great to fetch efficiently via a different query with an `ORDER BY` clause.
+
+For these reasons, it's probably better to use a ClickHouse projection, which stores the `events` rows redundantly but allows us to specify a different sort order.
+
+The ClickHouse query would be the following (with a slightly adjusted date range):
+
+```sql
+SELECT DATE(events.created_at) AS date, COUNT(*) AS count
+FROM (
+ SELECT
+ id,
+ argMax(events.created_at, events.updated_at) AS created_at
+ FROM events
+ WHERE events.author_id = 4
+ AND events.created_at BETWEEN '2023-01-01 23:00:00' AND '2024-01-01 22:59:59.999999'
+ AND (
+ (
+ events.action = 5
+ ) OR
+ (
+ events.action IN (1, 3) -- Enum values are documented in the Event model, see the ACTIONS constant in app/models/event.rb
+ AND events.target_type IN ('Issue', 'WorkItem')
+ ) OR
+ (
+ events.action IN (7, 1, 3)
+ AND events.target_type = 'MergeRequest'
+ ) OR
+ (
+ events.action = 6
+ )
+ )
+ GROUP BY id
+) AS events
+GROUP BY DATE(events.created_at)
+```
+
+The query does a full table scan. Let's optimize it with a projection:
+
+```sql
+ALTER TABLE events ADD PROJECTION events_by_authors (
+ SELECT * ORDER BY author_id, created_at -- different sort order for the table
+);
+
+ALTER TABLE events MATERIALIZE PROJECTION events_by_authors;
+```
+
+#### Pagination of contributions
+
+Listing the contributions of a user can be queried in the following way:
+
+```sql
+SELECT events.*
+FROM (
+ SELECT
+ id,
+ argMax(events.project_id, events.updated_at) AS project_id,
+ argMax(events.group_id, events.updated_at) AS group_id,
+ argMax(events.author_id, events.updated_at) AS author_id,
+ argMax(events.target_type, events.updated_at) AS target_type,
+ argMax(events.target_id, events.updated_at) AS target_id,
+ argMax(events.action, events.updated_at) AS action,
+ argMax(events.fingerprint, events.updated_at) AS fingerprint,
+ FIRST_VALUE(events.created_at) AS created_at,
+ MAX(events.updated_at) AS updated_at
+ FROM events
+ WHERE events.author_id = 4
+ GROUP BY id
+ ORDER BY created_at DESC, id DESC
+) AS events
+LIMIT 20
+```
+
+ClickHouse supports the standard `LIMIT N OFFSET M` clauses, so we can request the next page:
+
+```sql
+SELECT events.*
+FROM (
+ SELECT
+ id,
+ argMax(events.project_id, events.updated_at) AS project_id,
+ argMax(events.group_id, events.updated_at) AS group_id,
+ argMax(events.author_id, events.updated_at) AS author_id,
+ argMax(events.target_type, events.updated_at) AS target_type,
+ argMax(events.target_id, events.updated_at) AS target_id,
+ argMax(events.action, events.updated_at) AS action,
+ argMax(events.fingerprint, events.updated_at) AS fingerprint,
+ FIRST_VALUE(events.created_at) AS created_at,
+ MAX(events.updated_at) AS updated_at
+ FROM events
+ WHERE events.author_id = 4
+ GROUP BY id
+ ORDER BY created_at DESC, id DESC
+) AS events
+LIMIT 20 OFFSET 20
+```
diff --git a/doc/development/database/clickhouse/index.md b/doc/development/database/clickhouse/index.md
new file mode 100644
index 00000000000..a26bac261fd
--- /dev/null
+++ b/doc/development/database/clickhouse/index.md
@@ -0,0 +1,85 @@
+---
+stage: Data Stores
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Introduction to ClickHouse use and table design
+
+## How it differs from PostgreSQL
+
+The [intro](https://clickhouse.com/docs/en/intro/) page gives a good overview of ClickHouse.
+
+ClickHouse has a lot of differences from traditional OLTP (online transaction processing) databases like PostgreSQL. The underlying architecture is a bit different, and the processing is a lot more CPU-bound than in traditional databases.
+ClickHouse is a log-centric database where immutability is a key component. The advantages of such approaches are well documented [[1]](https://www.odbms.org/2015/10/the-rise-of-immutable-data-stores/); however, immutability also makes updates much harder. See the ClickHouse [documentation](https://clickhouse.com/docs/en/guides/developer/mutations/) for operations that provide UPDATE/DELETE support. Note that these operations are expected to be infrequent.
+
+This distinction is important when designing tables. Either:
+
+- Updates are not required (the best case), or
+- If they are needed, they are not run during query execution.
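+
+When changes are unavoidable, they are expressed as infrequent mutations rather
+than regular `UPDATE`/`DELETE` statements. A rough sketch, assuming an
+illustrative `events` table:
+
+```sql
+-- Sketch: mutations run asynchronously and rewrite data parts; keep them rare.
+ALTER TABLE events DELETE WHERE created_at < '2020-01-01';
+ALTER TABLE events UPDATE action = 0 WHERE action = 99;
+```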
+
+## ACID compatibility
+
+ClickHouse takes a slightly different view of transactional support: the guarantees apply only to a block of data inserted into a specific table. See [the Transactional (ACID) support](https://clickhouse.com/docs/en/guides/developer/transactional/) documentation for details.
+
+Multiple insertions in a single write should be avoided as transactional support across multiple tables is only covered in materialized views.
+
+ClickHouse is heavily geared towards providing the best-in-class support for analytical queries. Operations like aggregation are very fast and there are several features to augment these capabilities.
+ClickHouse has some good blog posts covering [details of aggregations](https://altinity.com/blog/clickhouse-aggregation-fun-part-1-internals-and-handy-tools).
+
+## Primary indexes, sorting index and dictionaries
+
+It is highly recommended to read ["A practical introduction to primary indexes in ClickHouse"](https://clickhouse.com/docs/en/guides/improving-query-performance/sparse-primary-indexes/sparse-primary-indexes-intro) to get an understanding of indexes in ClickHouse.
+
+In particular, note how database index design in ClickHouse [differs](https://clickhouse.com/docs/en/guides/improving-query-performance/sparse-primary-indexes/sparse-primary-indexes-design/#an-index-design-for-massive-data-scales) from that in transactional databases like PostgreSQL.
+
+Primary index design plays a very important role in query performance and should be chosen carefully. Almost all queries should rely on the primary index, as full data scans are bound to take longer.
+
+Read the documentation for [primary keys and indexes in queries](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree/#primary-keys-and-indexes-in-queries) to learn how indexes can affect query performance in MergeTree Table engines (default table engine in ClickHouse).
+
+Secondary indexes in ClickHouse are different from what is available in other systems. They are also called data-skipping indexes as they are used to skip over a block of data. See the documentation for [data-skipping indexes](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/mergetree#table_engine-mergetree-data_skipping-indexes).
+
+ClickHouse also offers ["Dictionaries"](https://clickhouse.com/docs/en/sql-reference/dictionaries/external-dictionaries/external-dicts/) which can be used as external indexes. Dictionaries are loaded into memory and can be used to look up values at query runtime.
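+
+For example, a lookup against a (hypothetical) dictionary at query time could
+look like this:
+
+```sql
+-- Sketch: `project_names_dict` is an illustrative dictionary name.
+SELECT dictGet('project_names_dict', 'name', toUInt64(42)) AS project_name;
+```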
+
+## Data types & Partitioning
+
+ClickHouse offers SQL-compatible [data types](https://clickhouse.com/docs/en/sql-reference/data-types/) and a few specialized data types, such as:
+
+- [`LowCardinality`](https://clickhouse.com/docs/en/sql-reference/data-types/lowcardinality)
+- [UUID](https://clickhouse.com/docs/en/sql-reference/data-types/uuid)
+- [Maps](https://clickhouse.com/docs/en/sql-reference/data-types/map)
+- [Nested](https://clickhouse.com/docs/en/sql-reference/data-types/nested-data-structures/nested), which is interesting because it simulates a table inside a column.
+
+One key design aspect that comes up front when designing a table is the partitioning key. Partitions can be defined by any arbitrary expression, but they are usually time durations like months, weeks, or days. ClickHouse takes a best-effort approach to minimize the data read by using the smallest possible set of partitions.
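+
+As a sketch, the hypothetical `events` table from the earlier example could be partitioned by month:
+
+```sql
+-- A query filtering on a created_at range only reads the matching monthly partitions.
+CREATE TABLE events
+(
+    project_id UInt64,
+    author_id  UInt64,
+    created_at DateTime,
+    action     LowCardinality(String)
+)
+ENGINE = MergeTree
+PARTITION BY toYYYYMM(created_at)
+ORDER BY (project_id, created_at);
+```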
+
+Suggested reads:
+
+- [Choose a low cardinality partitioning key](https://clickhouse.com/docs/en/optimize/partitioning-key)
+- [Custom partitioning key](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/custom-partitioning-key).
+
+## Sharding and replication
+
+Sharding is a feature that allows splitting the data across multiple ClickHouse nodes to increase throughput and decrease latency. Sharding uses the Distributed table engine, which is backed by local tables. The Distributed engine is a "virtual" table that does not store any data itself; it is used as an interface to insert and query data.
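+
+A minimal sketch, assuming a hypothetical cluster named `my_cluster` and a local `events` table present on each node:
+
+```sql
+-- The Distributed "virtual" table fans writes out to the shards (here using rand() as the
+-- sharding key) and merges results from all shards on read.
+CREATE TABLE events_all AS events
+ENGINE = Distributed('my_cluster', 'default', 'events', rand());
+```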
+
+See [the ClickHouse documentation](https://clickhouse.com/docs/en/engines/table-engines/special/distributed/) and this section on [replication and sharding](https://clickhouse.com/docs/en/manage/replication-and-sharding/). ClickHouse can use either Zookeeper or its own compatible API via a component called [ClickHouse Keeper](https://clickhouse.com/docs/en/operations/clickhouse-keeper) to maintain consensus.
+
+After nodes are set up, the cluster topology can be transparent to clients, and both write and read queries can be issued to any node.
+
+In most cases, clusters start with a fixed number of nodes (shards). [Rebalancing shards](https://clickhouse.com/docs/en/guides/sre/scaling-clusters) is operationally heavy and requires rigorous testing.
+
+Replication is supported by the MergeTree family of table engines. See the [replication section](https://clickhouse.com/docs/en/engines/table-engines/mergetree-family/replication/) in the documentation for details on how to define replicated tables.
+ClickHouse relies on a distributed coordination component (either Zookeeper or ClickHouse Keeper) to track the participating nodes in the quorum. Replication is asynchronous and multi-leader. Inserts can be issued to any node and they can appear on other nodes with some latency. If desired, stickiness to a specific node can be used to make sure that reads observe the latest written data.
+
+## Materialized views
+
+One of the defining features of ClickHouse is materialized views. Functionally, they resemble insert triggers.
+Materialized views can be used for a variety of use cases that are well [documented](https://www.polyscale.ai/blog/polyscale-metrics-materialized-views/) on the web.
+
+We recommend reading the [views](https://clickhouse.com/docs/en/sql-reference/statements/create/view#materialized-view) section of the official documentation to get a better understanding of how they work.
+
+Quoting the [documentation](https://clickhouse.com/docs/en/sql-reference/statements/create/view#materialized-view):
+
+> Materialized views in ClickHouse are implemented more like insert triggers.
+> If there's some aggregation in the view query, it's applied only to the batch
+> of freshly inserted data. Any changes to existing data of the source table
+> (like update, delete, drop a partition, etc.) do not change the materialized view.
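+
+As an illustration only, a materialized view could keep per-day counts for the hypothetical `events` table up to date on every insert:
+
+```sql
+-- The aggregation runs only over each freshly inserted batch; results are merged into the target table.
+CREATE TABLE events_daily_counts
+(
+    day    Date,
+    action LowCardinality(String),
+    total  UInt64
+)
+ENGINE = SummingMergeTree
+ORDER BY (day, action);
+
+CREATE MATERIALIZED VIEW events_daily_counts_mv TO events_daily_counts AS
+SELECT
+    toDate(created_at) AS day,
+    action,
+    count() AS total
+FROM events
+GROUP BY day, action;
+```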
diff --git a/doc/development/database/clickhouse/optimization.md b/doc/development/database/clickhouse/optimization.md
new file mode 100644
index 00000000000..166bbf7d2b1
--- /dev/null
+++ b/doc/development/database/clickhouse/optimization.md
@@ -0,0 +1,60 @@
+---
+stage: Data Stores
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Optimizing query execution
+
+ClickHouse Inc. has documented a [variety of optimization strategies](https://clickhouse.com/docs/en/optimize/).
+
+ClickHouse relies heavily on the structure of the primary index. However, in some cases, queries rely on a column that's part of the primary index but isn't the first column. See [Using multiple primary indexes](https://clickhouse.com/docs/en/guides/improving-query-performance/sparse-primary-indexes/sparse-primary-indexes-multiple), which offers several options for such cases, for example, using a data-skipping index as a secondary index.
+
+With compound primary indexes, it's helpful to understand the data characteristics of the key columns, because ordering them well can make the index more efficient. [Ordering key columns efficiently](https://clickhouse.com/docs/en/guides/improving-query-performance/sparse-primary-indexes/sparse-primary-indexes-cardinality) goes into detail on these concepts.
+
+The ClickHouse blog also has a very good post, [Super charging your ClickHouse queries](https://clickhouse.com/blog/clickhouse-faster-queries-with-projections-and-primary-indexes), that outlines almost all of the approaches listed above.
+
+It is possible to use [`EXPLAIN`](https://clickhouse.com/docs/en/sql-reference/statements/explain/) statements with queries to see the steps of the query pipeline. Note the different [types](https://clickhouse.com/docs/en/sql-reference/statements/explain/#explain-types) of `EXPLAIN`.
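+
+As an illustrative sketch against a hypothetical `events` table (the exact output depends on the table definition and ClickHouse version):
+
+```sql
+-- With indexes = 1, the plan also shows which partitions, parts, and granules
+-- are selected by the primary key and any data-skipping indexes.
+EXPLAIN indexes = 1
+SELECT count()
+FROM events
+WHERE project_id = 1;
+```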
+
+Also, to get a detailed view of the query execution pipeline, you can set the log level to `trace` in `clickhouse-client` and then execute the query.
+
+For example:
+
+```plaintext
+$ clickhouse-client :) SET send_logs_level = 'trace'
+$ clickhouse-client :) select count(traceID) from jaeger_index WHERE tenant = '12' AND service != 'jaeger-query' FORMAT Vertical ;
+
+SELECT count(traceID)
+FROM jaeger_index
+WHERE (tenant = '12') AND (service != 'jaeger-query')
+FORMAT Vertical
+
+Query id: 6ce40daf-e1b1-4714-ab02-268246f3c5c9
+
+[cluster-0-0-0] 2023.01.30 06:31:32.240819 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> executeQuery: (from 127.0.0.1:53654) select count(traceID) from jaeger_index WHERE tenant = '12' AND service != 'jaeger-query' FORMAT Vertical ; (stage: Complete)
+....
+[cluster-0-0-0] 2023.01.30 06:31:32.244071 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "service != 'jaeger-query'" moved to PREWHERE
+[cluster-0-0-0] 2023.01.30 06:31:32.244420 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> InterpreterSelectQuery: MergeTreeWhereOptimizer: condition "service != 'jaeger-query'" moved to PREWHERE
+....
+[cluster-0-0-0] 2023.01.30 06:31:32.245153 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> InterpreterSelectQuery: FetchColumns -> Complete
+[cluster-0-0-0] 2023.01.30 06:31:32.245255 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> InterpreterSelectQuery: Complete -> Complete
+[cluster-0-0-0] 2023.01.30 06:31:32.245590 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> tracing_gcs.jaeger_index_local (66c6ca81-e20d-44dc-8101-92678fc24d99) (SelectExecutor): Key condition: (column 1 not in ['jaeger-query', 'jaeger-query']), unknown, (column 0 in ['12', '12']), and, and
+[cluster-0-0-0] 2023.01.30 06:31:32.245784 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> tracing_gcs.jaeger_index_local (66c6ca81-e20d-44dc-8101-92678fc24d99) (SelectExecutor): MinMax index condition: unknown, unknown, and, unknown, and
+[cluster-0-0-0] 2023.01.30 06:31:32.246239 [ 1503 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> tracing_gcs.jaeger_index_local (66c6ca81-e20d-44dc-8101-92678fc24d99) (SelectExecutor): Used generic exclusion search over index for part 202301_1512_21497_9164 with 4 steps
+[cluster-0-0-0] 2023.01.30 06:31:32.246293 [ 1503 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> tracing_gcs.jaeger_index_local (66c6ca81-e20d-44dc-8101-92678fc24d99) (SelectExecutor): Used generic exclusion search over index for part 202301_21498_24220_677 with 1 steps
+[cluster-0-0-0] 2023.01.30 06:31:32.246488 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> tracing_gcs.jaeger_index_local (66c6ca81-e20d-44dc-8101-92678fc24d99) (SelectExecutor): Selected 2/2 parts by partition key, 1 parts by primary key, 2/4 marks by primary key, 2 marks to read from 1 ranges
+[cluster-0-0-0] 2023.01.30 06:31:32.246591 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> MergeTreeInOrderSelectProcessor: Reading 1 ranges in order from part 202301_1512_21497_9164, approx. 16384 rows starting from 0
+[cluster-0-0-0] 2023.01.30 06:31:32.642095 [ 348 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> AggregatingTransform: Aggregating
+[cluster-0-0-0] 2023.01.30 06:31:32.642193 [ 348 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> Aggregator: An entry for key=16426982211452591884 found in cache: sum_of_sizes=2, median_size=1
+[cluster-0-0-0] 2023.01.30 06:31:32.642210 [ 348 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> Aggregator: Aggregation method: without_key
+[cluster-0-0-0] 2023.01.30 06:31:32.642330 [ 348 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> AggregatingTransform: Aggregated. 3211 to 1 rows (from 50.18 KiB) in 0.395452983 sec. (8119.802 rows/sec., 126.89 KiB/sec.)
+[cluster-0-0-0] 2023.01.30 06:31:32.642343 [ 348 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Trace> Aggregator: Merging aggregated data
+Row 1:
+──────
+count(traceID): 3211
+[cluster-0-0-0] 2023.01.30 06:31:32.642887 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Information> executeQuery: Read 16384 rows, 620.52 KiB in 0.401978272 sec., 40758 rows/sec., 1.51 MiB/sec.
+[cluster-0-0-0] 2023.01.30 06:31:32.645232 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> MemoryTracker: Peak memory usage (for query): 831.98 KiB.
+[cluster-0-0-0] 2023.01.30 06:31:32.645251 [ 4991 ] {6ce40daf-e1b1-4714-ab02-268246f3c5c9} <Debug> TCPHandler: Processed in 0.404908496 sec.
+
+1 row in set. Elapsed: 0.402 sec. Processed 16.38 thousand rows, 635.41 KB (40.71 thousand rows/s., 1.58 MB/s.)
+```
diff --git a/doc/development/database/database_debugging.md b/doc/development/database/database_debugging.md
index edc35dd95e8..9cc85610e98 100644
--- a/doc/development/database/database_debugging.md
+++ b/doc/development/database/database_debugging.md
@@ -177,3 +177,21 @@ you should set the `SKIP_SCHEMA_VERSION_CHECK` environment variable.
```shell
bundle exec rake db:migrate SKIP_SCHEMA_VERSION_CHECK=true
```
+
+## Performance issues
+
+### Reduce connection overhead with connection pooling
+
+Creating new database connections is not free, and in PostgreSQL specifically, it requires
+forking an entire process to handle each new one. If a connection lives for a long time,
+this is not a problem. However, forking a process to serve only a few small queries can turn out to be costly.
+If left unattended, peaks of new database connections can cause performance degradation,
+or even lead to a complete outage.
+
+A proven solution for instances that deal with surges of small, short-lived database connections
+is to implement [PgBouncer](../../administration/postgresql/pgbouncer.md#pgbouncer-as-part-of-a-fault-tolerant-gitlab-installation) as a connection pooler.
+This pool can hold thousands of connections with almost no overhead. The drawback is a small amount of
+added latency, in exchange for performance improvements of up to 90% or more, depending on the usage patterns.
+
+PgBouncer can be fine-tuned to fit different installations. See our documentation on
+[fine-tuning PgBouncer](../../administration/postgresql/pgbouncer.md#fine-tuning) for more information.
diff --git a/doc/development/database/index.md b/doc/development/database/index.md
index 5abc7cd3ffa..8f22eaac496 100644
--- a/doc/development/database/index.md
+++ b/doc/development/database/index.md
@@ -20,6 +20,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
- [explain.depesz.com](https://explain.depesz.com/) or [explain.dalibo.com](https://explain.dalibo.com/) for visualizing the output of `EXPLAIN`
- [pgFormatter](https://sqlformat.darold.net/) a PostgreSQL SQL syntax beautifier
- [db:check-migrations job](dbcheck-migrations-job.md)
+- [Database migration pipeline](database_migration_pipeline.md)
## Migrations
@@ -93,6 +94,22 @@ info: To determine the technical writer assigned to the Stage/Group associated w
- [Troubleshooting PostgreSQL](../../administration/troubleshooting/postgresql.md)
- [Working with the bundled PgBouncer service](../../administration/postgresql/pgbouncer.md)
+## User information for scaling
+
+For GitLab administrators, information about
+[configuring PostgreSQL for scaling](../../administration/postgresql/index.md) is available,
+including the major methods:
+
+- [Standalone PostgreSQL](../../administration/postgresql/standalone.md)
+- [External PostgreSQL instances](../../administration/postgresql/external.md)
+- [Replication and failover](../../administration/postgresql/replication_and_failover.md)
+
+## ClickHouse
+
+- [Introduction](clickhouse/index.md)
+- [Optimizing query execution](clickhouse/optimization.md)
+- [Rebuild GitLab features using ClickHouse 1: Activity data](clickhouse/gitlab_activity_data.md)
+
## Miscellaneous
- [Maintenance operations](maintenance_operations.md)
diff --git a/doc/development/database/migrations_for_multiple_databases.md b/doc/development/database/migrations_for_multiple_databases.md
index bc0ef654336..b903c56651d 100644
--- a/doc/development/database/migrations_for_multiple_databases.md
+++ b/doc/development/database/migrations_for_multiple_databases.md
@@ -10,8 +10,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
This document describes how to properly write database migrations
for [the decomposed GitLab application using multiple databases](https://gitlab.com/groups/gitlab-org/-/epics/6168).
-
-Learn more about general multiple databases support in a [separate document](multiple_databases.md).
+For more information, see [Multiple databases](multiple_databases.md).
The design for multiple databases (except for the Geo database) assumes
that all decomposed databases have **the same structure** (for example, schema), but **the data is different** in each database. This means that some tables do not contain data on each database.
diff --git a/doc/development/database/new_database_migration_version.md b/doc/development/database/new_database_migration_version.md
new file mode 100644
index 00000000000..b97ecd83f37
--- /dev/null
+++ b/doc/development/database/new_database_migration_version.md
@@ -0,0 +1,64 @@
+---
+stage: Data Stores
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments
+---
+
+# Introducing a new database migration version
+
+At GitLab, we've added many database migration helpers to help developers manipulate
+the schema and data of tables at a large scale, like on GitLab.com. To avoid the repetitive task
+of including these helpers in each database migration, all of our database migrations inherit from a
+subclass of the Rails `ActiveRecord::Migration` class: `Gitlab::Database::Migration`. It already
+includes all the helpers that developers can use. You can see many use cases of helpers built
+in-house in [Avoiding downtime in migrations](avoiding_downtime_in_migrations.md).
+
+Sometimes, we need to add or modify an existing helper's functionality without retroactively affecting
+previous database migrations. That's why we introduced versioning to `Gitlab::Database::Migration`.
+Each database migration inherits the latest version of this class available at the time the migration is written,
+so when we add a new feature later, older database migrations are no longer affected. We usually
+refer to the version using `Gitlab::Database::Migration[2.1]`, where `2.1` is the current version.
+
+Because we are chasing a moving target, adding a new migration version and deprecating older versions
+can be challenging: database migrations are introduced every day, and this can break the pipeline.
+In this document, we explain a two-step method to add a new database migration version.
+
+1. [Introduce a new version, and keep the older version allowed](#introduce-a-new-version-and-keep-the-older-version-allowed)
+1. [Prevent the usage of the older database migration version](#prevent-the-usage-of-the-older-database-migration-version)
+
+## Introduce a new version, and keep the older version allowed
+
+1. The new version can be added to the
+ [`migration.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/migration.rb)
+ class, along with any new helpers that should be included in the new version.
+ Make sure that `current_version` refers to this new version. For example:
+
+ ```ruby
+ class V2_2 < V2_1 # rubocop:disable Naming/ClassAndModuleCamelCase
+ include Gitlab::Database::MigrationHelpers::ANY_NEW_HELPER
+ end
+
+ def self.current_version
+ 2.2
+ end
+ ```
+
+1. Update all the examples in the documentation to refer to the new database
+ migration version `Gitlab::Database::Migration[2.2]`.
+1. Make sure that [`migration_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/db/migration_spec.rb)
+   doesn't fail for the new database migrations by adding an open date range for
+   the **new database migration version**.
+
+## Prevent the usage of the older database migration version
+
+After some time has passed, and after making sure all developers are using the
+new database migration version in their merge requests, prevent the older
+version from being used:
+
+1. Close the date range in [`migration_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/db/migration_spec.rb)
+ for the older database version.
+1. Modify the
+ [`RuboCop::Cop::Migration::VersionedMigrationClass`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/cop/migration/versioned_migration_class.rb)
+   and [its tests](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/rubocop/cop/migration/versioned_migration_class_spec.rb).
+1. Communicate this change in our Slack `#backend` and `#database` channels and in the
+ [Engineering Week-in-Review document](https://about.gitlab.com/handbook/engineering/#communication).
diff --git a/doc/development/database/pagination_guidelines.md b/doc/development/database/pagination_guidelines.md
index aeab45e2158..04a2a8cdf9c 100644
--- a/doc/development/database/pagination_guidelines.md
+++ b/doc/development/database/pagination_guidelines.md
@@ -66,7 +66,7 @@ Offset-based pagination is the easiest way to paginate over records, however, it
- Avoid using page numbers, use next and previous page buttons.
- Keyset pagination doesn't support page numbers.
- For APIs, advise against building URLs for the next page by "hand".
- - Promote the usage of the [`Link` header](../../api/index.md#pagination-link-header) where the URLs for the next and previous page are provided by the backend.
+ - Promote the usage of the [`Link` header](../../api/rest/index.md#pagination-link-header) where the URLs for the next and previous page are provided by the backend.
- This way changing the URL structure is possible without breaking backward compatibility.
NOTE:
diff --git a/doc/development/database/query_recorder.md b/doc/development/database/query_recorder.md
index dfaaf8afcde..bae211c1618 100644
--- a/doc/development/database/query_recorder.md
+++ b/doc/development/database/query_recorder.md
@@ -148,3 +148,4 @@ There are multiple ways to find the source of queries.
- [Performance guidelines](../performance.md)
- [Merge request performance guidelines - Query counts](../merge_request_concepts/performance.md#query-counts)
- [Merge request performance guidelines - Cached queries](../merge_request_concepts/performance.md#cached-queries)
+- [RedisCommands::Recorder](../redis.md#n1-calls-problem) For testing `N+1` calls in Redis
diff --git a/doc/development/database/table_partitioning.md b/doc/development/database/table_partitioning.md
index 30131fc0347..0d5e3c233f6 100644
--- a/doc/development/database/table_partitioning.md
+++ b/doc/development/database/table_partitioning.md
@@ -307,44 +307,12 @@ class PrepareIndexesForPartitioning < Gitlab::Database::Migration[2.1]
end
```
-### Step 3 - Swap primary key
+### Step 3 - Enforce unique constraint
-Swap the primary key including the partitioning key column. For example, in a rails migration:
-
-```ruby
-class PreparePrimaryKeyForPartitioning < Gitlab::Database::Migration[2.1]
- disable_ddl_transaction!
-
- TABLE_NAME = :table_name
- PRIMARY_KEY = :primary_key
- OLD_INDEX_NAME = :old_index_name
- NEW_INDEX_NAME = :new_index_name
-
- def up
- swap_primary_key(TABLE_NAME, PRIMARY_KEY, NEW_INDEX_NAME)
- end
-
- def down
- add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_INDEX_NAME)
- add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_INDEX_NAME)
-
- unswap_primary_key(TABLE_NAME, PRIMARY_KEY, OLD_INDEX_NAME)
- end
-end
-```
-
-NOTE:
-Do not forget to set the primary key explicitly in your model as `ActiveRecord` does not support composite primary keys.
-
-```ruby
-class Model < ApplicationRecord
- self.primary_key = :id
-end
-```
-
-### Step 4 - Enforce unique constraint
-
-Enforce unique indexes including the partitioning key column. For example, in a rails migration:
+Change all unique indexes to include the partitioning key column,
+including the primary key index. You can start by adding a unique
+index on `[primary_key_column, :partition_id]`, which will be
+required for the next two steps. For example, in a rails migration:
```ruby
class PrepareUniqueContraintForPartitioning < Gitlab::Database::Migration[2.1]
@@ -355,20 +323,20 @@ class PrepareUniqueContraintForPartitioning < Gitlab::Database::Migration[2.1]
NEW_UNIQUE_INDEX_NAME = :new_index_name
def up
- add_concurrent_index(TABLE_NAME, [:some_column, :partition_id], unique: true, name: NEW_UNIQUE_INDEX_NAME)
+ add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_UNIQUE_INDEX_NAME)
remove_concurrent_index_by_name(TABLE_NAME, OLD_UNIQUE_INDEX_NAME)
end
def down
- add_concurrent_index(TABLE_NAME, :some_column, unique: true, name: OLD_UNIQUE_INDEX_NAME)
+ add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_UNIQUE_INDEX_NAME)
remove_concurrent_index_by_name(TABLE_NAME, NEW_UNIQUE_INDEX_NAME)
end
end
```
-### Step 5 - Enforce foreign key constraint
+### Step 4 - Enforce foreign key constraint
Enforce foreign keys including the partitioning key column. For example, in a rails migration:
@@ -380,7 +348,7 @@ class PrepareForeignKeyForPartitioning < Gitlab::Database::Migration[2.1]
TARGET_TABLE_NAME = :target_table_name
COLUMN = :foreign_key_id
TARGET_COLUMN = :id
- CONSTRAINT_NAME = :fk_365d1db505_p
+ FK_NAME = :fk_365d1db505_p
PARTITION_COLUMN = :partition_id
def up
@@ -391,14 +359,17 @@ class PrepareForeignKeyForPartitioning < Gitlab::Database::Migration[2.1]
target_column: [PARTITION_COLUMN, TARGET_COLUMN],
validate: false,
on_update: :cascade,
- name: CONSTRAINT_NAME
+ name: FK_NAME
)
- validate_foreign_key(TARGET_TABLE_NAME, CONSTRAINT_NAME)
+ # This should be done in a separate post migration when dealing with a high traffic table
+ validate_foreign_key(TABLE_NAME, [PARTITION_COLUMN, COLUMN], name: FK_NAME)
end
def down
- drop_constraint(TARGET_TABLE_NAME, CONSTRAINT_NAME)
+ with_lock_retries do
+ remove_foreign_key_if_exists(SOURCE_TABLE_NAME, name: FK_NAME)
+ end
end
end
```
@@ -410,6 +381,48 @@ result in a `Key is still referenced from table ...` error and updating the
partition column on the source table would raise a
`Key is not present in table ...` error.
+This migration can be automatically generated using:
+
+```shell
+./scripts/partitioning/generate-fk --source source_table_name --target target_table_name
+```
+
+### Step 5 - Swap primary key
+
+Swap the primary key to include the partitioning key column. This can be done only after
+including the partitioning key in all referencing foreign keys. For example, in a rails migration:
+
+```ruby
+class PreparePrimaryKeyForPartitioning < Gitlab::Database::Migration[2.1]
+ disable_ddl_transaction!
+
+ TABLE_NAME = :table_name
+ PRIMARY_KEY = :primary_key
+ OLD_INDEX_NAME = :old_index_name
+ NEW_INDEX_NAME = :new_index_name
+
+ def up
+ swap_primary_key(TABLE_NAME, PRIMARY_KEY, NEW_INDEX_NAME)
+ end
+
+ def down
+ add_concurrent_index(TABLE_NAME, :id, unique: true, name: OLD_INDEX_NAME)
+ add_concurrent_index(TABLE_NAME, [:id, :partition_id], unique: true, name: NEW_INDEX_NAME)
+
+ unswap_primary_key(TABLE_NAME, PRIMARY_KEY, OLD_INDEX_NAME)
+ end
+end
+```
+
+NOTE:
+Do not forget to set the primary key explicitly in your model as `ActiveRecord` does not support composite primary keys.
+
+```ruby
+class Model < ApplicationRecord
+ self.primary_key = :id
+end
+```
+
### Step 6 - Create parent table and attach existing table as the initial partition
You can now create the parent table attaching the existing table as the initial
@@ -465,7 +478,7 @@ class ConvertTableToListPartitioning < Gitlab::Database::Migration[2.1]
table_name: TABLE_NAME,
partitioning_column: PARTITION_COLUMN,
parent_table_name: PARENT_TABLE_NAME,
- initial_partitioning_value: FIRST_PARTITION
+ initial_partitioning_value: FIRST_PARTITION,
lock_tables: [TABLE_FK, TABLE_NAME]
)
end