summaryrefslogtreecommitdiff
path: root/doc/development/database/adding_database_indexes.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/database/adding_database_indexes.md')
-rw-r--r--doc/development/database/adding_database_indexes.md63
1 files changed, 57 insertions, 6 deletions
diff --git a/doc/development/database/adding_database_indexes.md b/doc/development/database/adding_database_indexes.md
index e1d5a7af6d9..1e3a1de9b69 100644
--- a/doc/development/database/adding_database_indexes.md
+++ b/doc/development/database/adding_database_indexes.md
@@ -214,6 +214,45 @@ def down
end
```
+## Analyzing a new index before a batched background migration
+
+Sometimes it is necessary to add an index to support a [batched background migration](batched_background_migrations.md).
+It is commonly done by creating two [post deployment migrations](post_deployment_migrations.md):
+
+1. Add the new index, often a [temporary index](#temporary-indexes).
+1. [Queue the batched background migration](batched_background_migrations.md#queueing).
+
+In most cases, no additional work is needed. The new index is created and is used
+as expected when queuing and executing the batched background migration.
+
+[Expression indexes](https://www.postgresql.org/docs/current/indexes-expressional.html),
+however, do not generate statistics for the new index on creation. Autovacuum
+eventually runs `ANALYZE`, and updates the statistics so the new index is used.
+Run `ANALYZE` explicitly only if it is needed right after the index
+is created, such as in the background migration scenario described above.
+
+To trigger `ANALYZE` after the index is created, update the index creation migration
+to analyze the table:
+
+```ruby
+# in db/post_migrate/
+
+INDEX_NAME = 'tmp_index_projects_on_owner_and_lower_name_where_emails_disabled'
+TABLE = :projects
+
+disable_ddl_transaction!
+
+def up
+ add_concurrent_index TABLE, '(creator_id, lower(name))', where: 'emails_disabled = false', name: INDEX_NAME
+
+ connection.execute("ANALYZE #{TABLE}")
+end
+```
+
+`ANALYZE` should only be run in post deployment migrations and should not target
+[large tables](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/rubocop-migrations.yml#L3).
+If this behavior is needed on a larger table, ask for assistance in the `#database` Slack channel.
+
## Indexes for partitioned tables
Indexes [cannot be created](https://www.postgresql.org/docs/15/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE-MAINTENANCE)
@@ -254,7 +293,7 @@ end
For very large tables, index creation can be a challenge to manage.
While `add_concurrent_index` creates indexes in a way that does not block
-normal traffic, it can still be problematic when index creation runs for
+ordinary traffic, it can still be problematic when index creation runs for
many hours. Necessary database operations like `autovacuum` cannot run, and
on GitLab.com, the deployment process is blocked waiting for index
creation to finish.
@@ -271,8 +310,13 @@ index creation can proceed at a lower level of risk.
### Schedule the index to be created
-Create an MR with a post-deployment migration which prepares the index
-for asynchronous creation. An example of creating an index using
+1. Create a merge request containing a post-deployment migration, which prepares
+ the index for asynchronous creation.
+1. [Create a follow-up issue](https://gitlab.com/gitlab-org/gitlab/-/issues/new?issuable_template=Synchronous%20Database%20Index)
+ to add a migration that creates the index synchronously.
+1. In the merge request that prepares the asynchronous index, add a comment mentioning the follow-up issue.
+
+An example of creating an index using
the asynchronous index helpers can be seen in the block below. This migration
enters the index name and definition into the `postgres_async_indexes`
table. The process that runs on weekends pulls indexes from this
@@ -283,6 +327,7 @@ table and attempt to create them.
INDEX_NAME = 'index_ci_builds_on_some_column'
+# TODO: Index to be created synchronously in https://gitlab.com/gitlab-org/gitlab/-/issues/XXXXX
def up
prepare_async_index :ci_builds, :some_column, name: INDEX_NAME
end
@@ -351,7 +396,7 @@ Use the asynchronous index helpers on your local environment to test changes for
For very large tables, index destruction can be a challenge to manage.
While `remove_concurrent_index` removes indexes in a way that does not block
-normal traffic, it can still be problematic if index destruction runs for
+ordinary traffic, it can still be problematic if index destruction runs for
during `autovacuum`. Necessary database operations like `autovacuum` cannot run, and
the deployment process on GitLab.com is blocked while waiting for index
destruction to finish.
@@ -366,8 +411,13 @@ index destruction can proceed at a lower level of risk.
### Schedule the index to be removed
-Create an MR with a post-deployment migration which prepares the index
-for asynchronous destruction. For example. to destroy an index using
+1. Create a merge request containing a post-deployment migration, which prepares
+ the index for asynchronous destruction.
+1. [Create a follow-up issue](https://gitlab.com/gitlab-org/gitlab/-/issues/new?issuable_template=Synchronous%20Database%20Index)
+ to add a migration that destroys the index synchronously.
+1. In the merge request that prepares the asynchronous index removal, add a comment mentioning the follow-up issue.
+
+For example, to destroy an index using
the asynchronous index helpers:
```ruby
@@ -375,6 +425,7 @@ the asynchronous index helpers:
INDEX_NAME = 'index_ci_builds_on_some_column'
+# TODO: Index to be destroyed synchronously in https://gitlab.com/gitlab-org/gitlab/-/issues/XXXXX
def up
prepare_async_index_removal :ci_builds, :some_column, name: INDEX_NAME
end