Diffstat (limited to 'doc/development/migration_style_guide.md')
 doc/development/migration_style_guide.md | 82
 1 file changed, 42 insertions(+), 40 deletions(-)
diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md
index 5764c876e4d..8f035d4aa13 100644
--- a/doc/development/migration_style_guide.md
+++ b/doc/development/migration_style_guide.md
@@ -184,13 +184,21 @@ git checkout origin/master db/structure.sql
VERSION=<migration ID> bundle exec rails db:migrate:main
```
-### Adding new tables to GitLab Schema
-
-GitLab connects to two different Postgres databases: `main` and `ci`. New tables should be defined in [`lib/gitlab/database/gitlab_schemas.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/gitlab_schemas.yml) with the databases they need to be added to.
-
- ```yaml
- <TABLE_NAME>: :gitlab_main
- ```
+### Adding new tables to the database dictionary
+
+GitLab connects to two different PostgreSQL databases: `main` and `ci`. New tables need a database dictionary entry in [`db/docs/`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/db/docs):
+
+```yaml
+table_name: table name example
+description: Description example
+introduced_by_url: Merge request link
+milestone: Milestone example
+feature_categories:
+- Feature category example
+classes:
+- Class example
+gitlab_schema: gitlab_main
+```
## Avoiding downtime
@@ -305,7 +313,7 @@ of migration helpers.
-In this example, we use version 2.0 of the migration class:
+In this example, we use version 2.1 of the migration class:
```ruby
-class TestMigration < Gitlab::Database::Migration[2.0]
+class TestMigration < Gitlab::Database::Migration[2.1]
def change
end
end
@@ -338,7 +346,7 @@ is concurrently accessed and modified by other processes, acquiring the lock may
a while. The lock request is waiting in a queue and it may also block other queries
on the `users` table once it has been enqueued.
-More information about PostgresSQL locks: [Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)
+More information about PostgreSQL locks: [Explicit Locking](https://www.postgresql.org/docs/current/explicit-locking.html)
For stability reasons, GitLab.com has a specific [`statement_timeout`](../user/gitlab_com/index.md#postgresql)
set. When the migration is invoked, any database query has
@@ -376,7 +384,7 @@ Occasionally a migration may need to acquire multiple locks on different objects
To prevent catalog bloat, ask for all those locks explicitly before performing any DDL.
-A better strategy is to split the migration, so that we only need to acquire one lock at the time.
+A better strategy is to split the migration, so that we only need to acquire one lock at a time.
-**Removing a column:**
+#### Removing a column
```ruby
enable_lock_retries!
@@ -386,7 +394,7 @@ def change
end
```
-**Multiple changes on the same table:**
+#### Multiple changes on the same table
With the lock-retry methodology enabled, all operations wrap into a single transaction. When you have the lock,
you should do as much as possible inside the transaction rather than trying to get another lock later.
@@ -406,7 +414,7 @@ def down
end
```
-**Removing a foreign key:**
+#### Removing a foreign key
```ruby
enable_lock_retries!
@@ -420,7 +428,7 @@ def down
end
```
-**Changing default value for a column:**
+#### Changing default value for a column
```ruby
enable_lock_retries!
@@ -434,7 +442,7 @@ def down
end
```
-**Creating a new table with a foreign key:**
+#### Creating a new table with a foreign key
We can wrap the `create_table` method with `with_lock_retries`:
@@ -453,7 +461,7 @@ def down
end
```
-**Creating a new table when we have two foreign keys:**
+#### Creating a new table when we have two foreign keys
Only one foreign key should be created per transaction. This is because [the addition of a foreign key constraint requires a `SHARE ROW EXCLUSIVE` lock on the referenced table](https://www.postgresql.org/docs/12/sql-createtable.html#:~:text=The%20addition%20of%20a%20foreign%20key%20constraint%20requires%20a%20SHARE%20ROW%20EXCLUSIVE%20lock%20on%20the%20referenced%20table), and locking multiple tables in the same transaction should be avoided.
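
To illustrate the split, here is a rough sketch under hypothetical table, column, and class names: the first migration creates the table with a single foreign key, and a second migration adds the other foreign key concurrently, outside a transaction.

```ruby
# Sketch only: table, column, and class names are hypothetical.
# First migration: create the table with one foreign key.
class CreateExampleMappings < Gitlab::Database::Migration[2.1]
  enable_lock_retries!

  def change
    create_table :example_mappings do |t|
      t.timestamps_with_timezone null: false
      t.references :example, null: false, index: true, foreign_key: { on_delete: :cascade }
      t.bigint :example_record_id, null: false
      t.index :example_record_id
    end
  end
end

# Second migration: add the remaining foreign key without holding
# two locks in the same transaction.
class AddExampleRecordFkToExampleMappings < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  def up
    add_concurrent_foreign_key :example_mappings, :example_records, column: :example_record_id, on_delete: :cascade
  end

  def down
    remove_foreign_key_if_exists :example_mappings, column: :example_record_id
  end
end
```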
@@ -600,7 +608,7 @@ by calling the method `disable_ddl_transaction!` in the body of your migration
class like so:
```ruby
-class MyMigration < Gitlab::Database::Migration[2.0]
+class MyMigration < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
INDEX_NAME = 'index_name'
@@ -611,10 +619,10 @@ class MyMigration < Gitlab::Database::Migration[2.0]
end
```
-Verify the index is not being used anymore with this Thanos query:
+You can verify that the index is not being used with [Thanos](https://thanos-query.ops.gitlab.net/graph?g0.expr=sum%20by%20(type)(rate(pg_stat_user_indexes_idx_scan%7Benv%3D%22gprd%22%2C%20indexrelname%3D%22INSERT%20INDEX%20NAME%20HERE%22%7D%5B30d%5D))&g0.tab=1&g0.stacked=0&g0.range_input=1h&g0.max_source_resolution=0s&g0.deduplicate=1&g0.partial_response=0&g0.store_matches=%5B%5D):
```sql
-sum by (type)(rate(pg_stat_user_indexes_idx_scan{env="gprd", indexrelname="index_groups_on_parent_id_id"}[5m]))
+sum by (type)(rate(pg_stat_user_indexes_idx_scan{env="gprd", indexrelname="INSERT INDEX NAME HERE"}[30d]))
```
Note that it is not necessary to check if the index exists prior to
@@ -657,7 +665,7 @@ by calling the method `disable_ddl_transaction!` in the body of your migration
class like so:
```ruby
-class MyMigration < Gitlab::Database::Migration[2.0]
+class MyMigration < Gitlab::Database::Migration[2.1]
disable_ddl_transaction!
INDEX_NAME = 'index_name'
@@ -700,7 +708,7 @@ The easiest way to test for existence of an index by name is to use the
be used with a name option. For example:
```ruby
-class MyMigration < Gitlab::Database::Migration[2.0]
+class MyMigration < Gitlab::Database::Migration[2.1]
INDEX_NAME = 'index_name'
def up
@@ -735,7 +743,7 @@ Here's an example where we add a new column with a foreign key
constraint. Note it includes `index: true` to create an index for it.
```ruby
-class Migration < Gitlab::Database::Migration[2.0]
+class Migration < Gitlab::Database::Migration[2.1]
def change
add_reference :model, :other_model, index: true, foreign_key: { on_delete: :cascade }
@@ -766,12 +774,7 @@ With PostgreSQL 11 being the minimum version in GitLab 13.0 and later, adding co
the standard `add_column` helper should be used in all cases.
Before PostgreSQL 11, adding a column with a default was problematic as it would
-have caused a full table rewrite. The corresponding helper `add_column_with_default`
-has been deprecated and is scheduled to be removed in a later release.
-
-If a backport adding a column with a default value is needed for %12.9 or earlier versions,
-it should use `add_column_with_default` helper. If a [large table](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/rubocop-migrations.yml#L3)
-is involved, backporting to %12.9 is contraindicated.
+have caused a full table rewrite.
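
For example, a minimal sketch using the standard helper (the table, column, and default value are hypothetical):

```ruby
class AddFlagEnabledToExamples < Gitlab::Database::Migration[2.1]
  enable_lock_retries!

  def change
    # On PostgreSQL 11 and later, adding a column with a default does not
    # rewrite the whole table.
    add_column :examples, :flag_enabled, :boolean, default: false, null: false
  end
end
```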
## Removing the column default for non-nullable columns
@@ -796,7 +799,7 @@ expensive and disruptive operation for larger tables, but in reality it's not.
Take the following migration as an example:
```ruby
-class DefaultRequestAccessGroups < Gitlab::Database::Migration[2.0]
+class DefaultRequestAccessGroups < Gitlab::Database::Migration[2.1]
def change
change_column_default(:namespaces, :request_access_enabled, from: false, to: true)
end
@@ -818,7 +821,7 @@ in the `namespaces` table. Only when creating a new column with a default, all t
NOTE:
A faster [ALTER TABLE ADD COLUMN with a non-null default](https://www.depesz.com/2018/04/04/waiting-for-postgresql-11-fast-alter-table-add-column-with-a-non-null-default/)
-was introduced on PostgresSQL 11.0, removing the need of rewriting the table when a new column with a default value is added.
+was introduced in PostgreSQL 11.0, removing the need to rewrite the table when a new column with a default value is added.
For the reasons mentioned above, it's safe to use `change_column_default` in a single-transaction migration
without requiring `disable_ddl_transaction!`.
@@ -852,8 +855,7 @@ update_column_in_batches(:projects, :foo, update_value) do |table, query|
end
```
-Like `add_column_with_default`, there is a RuboCop cop to detect usage of this
-on large tables. In the case of `update_column_in_batches`, it may be acceptable
+In the case of `update_column_in_batches`, it may be acceptable
to run on a large table, as long as it is only updating a small subset of the
rows in the table, but do not ignore that without validating on the GitLab.com
staging environment - or asking someone else to do so for you - beforehand.
@@ -991,7 +993,7 @@ Re-add a sequence:
A Rails migration example:
```ruby
-class DropSequenceTest < Gitlab::Database::Migration[2.0]
+class DropSequenceTest < Gitlab::Database::Migration[2.1]
def up
drop_sequence(:ci_pipelines_config, :pipeline_id, :ci_pipelines_config_pipeline_id_seq)
end
@@ -1022,7 +1024,7 @@ Under the hood, it works like this:
- Add the primary key using the index defined beforehand.
```ruby
-class SwapPrimaryKey < Gitlab::Database::Migration[2.0]
+class SwapPrimaryKey < Gitlab::Database::Migration[2.1]
TABLE_NAME = :table_name
PRIMARY_KEY = :table_name_pkey
OLD_INDEX_NAME = :old_index_name
@@ -1113,18 +1115,18 @@ The Rails 5 natively supports `JSONB` (binary JSON) column type.
Example migration adding this column:
```ruby
-class AddOptionsToBuildMetadata < Gitlab::Database::Migration[2.0]
+class AddOptionsToBuildMetadata < Gitlab::Database::Migration[2.1]
def change
add_column :ci_builds_metadata, :config_options, :jsonb
end
end
```
-You have to use a serializer to provide a translation layer:
+By default, hash keys are strings. Optionally, you can add a custom data type to provide different access to keys.
```ruby
class BuildMetadata
- serialize :config_options, Serializers::Json # rubocop:disable Cop/ActiveRecordSerialize
+ attribute :config_options, :ind_jsonb # for indifferent access, or :sym_jsonb if you need only symbol keys.
end
```
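
For example, a hypothetical usage sketch with the `:ind_jsonb` attribute defined above:

```ruby
# With indifferent access, string and symbol keys read the same value.
metadata = BuildMetadata.last
metadata.config_options['image'] == metadata.config_options[:image] # => true
```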
@@ -1145,7 +1147,7 @@ Do not store `attr_encrypted` attributes as `:text` in the database; use
efficient:
```ruby
-class AddSecretToSomething < Gitlab::Database::Migration[2.0]
+class AddSecretToSomething < Gitlab::Database::Migration[2.1]
def change
add_column :something, :encrypted_secret, :binary
add_column :something, :encrypted_secret_iv, :binary
@@ -1203,7 +1205,7 @@ If you need more complex logic, you can define and use models local to a
migration. For example:
```ruby
-class MyMigration < Gitlab::Database::Migration[2.0]
+class MyMigration < Gitlab::Database::Migration[2.1]
class Project < MigrationRecord
self.table_name = 'projects'
end
@@ -1227,7 +1229,7 @@ Be aware of the limitations [when using models in migrations](#using-models-in-m
In most circumstances, prefer migrating data in **batches** when modifying data in the database.
-We introduced a new helper [each_batch_range](https://gitlab.com/gitlab-org/gitlab/-/blob/cd3e0a5cddcb464cb9b8c6e3275839cf57dfa6e2/lib/gitlab/database/dynamic_model_helpers.rb#L28-32) which facilitates the process of iterating over a collection in a performant way. The default size of the batch is defined in the `BATCH_SIZE` constant.
+We introduced a new helper [`each_batch_range`](https://gitlab.com/gitlab-org/gitlab/-/blob/cd3e0a5cddcb464cb9b8c6e3275839cf57dfa6e2/lib/gitlab/database/dynamic_model_helpers.rb#L28-32) which facilitates iterating over a collection in a performant way. The default batch size is defined in the `BATCH_SIZE` constant.
See the following example to get an idea.
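
For illustration, a rough sketch follows; the table, scope, and SQL are hypothetical, and the helper's exact keyword arguments should be checked against the linked source:

```ruby
class BackfillExampleFlag < Gitlab::Database::Migration[2.1]
  disable_ddl_transaction!

  BATCH_SIZE = 1_000

  def up
    # Assumed behavior: yields the lower and upper id bound of each batch.
    each_batch_range('examples', scope: ->(table) { table.where(flag_enabled: nil) }, of: BATCH_SIZE) do |min, max|
      execute(<<~SQL)
        UPDATE examples
        SET flag_enabled = FALSE
        WHERE id BETWEEN #{min} AND #{max}
      SQL
    end
  end

  def down
    # no-op
  end
end
```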
@@ -1302,7 +1304,7 @@ in a previous migration.
It is important not to leave out the `User.reset_column_information` command, to ensure that the old schema is dropped from the cache and ActiveRecord loads the updated schema information.
```ruby
-class AddAndSeedMyColumn < Gitlab::Database::Migration[2.0]
+class AddAndSeedMyColumn < Gitlab::Database::Migration[2.1]
class User < MigrationRecord
self.table_name = 'users'
end