diff options
author | Mayra Cabrera <mcabrera@gitlab.com> | 2019-08-16 11:31:02 -0500 |
---|---|---|
committer | Mayra Cabrera <mcabrera@gitlab.com> | 2019-08-23 15:07:03 -0500 |
commit | aee8a868f6b74cf4350d1ac4f66d06515d0b78c3 (patch) | |
tree | 020c8fee1b25ae73ec8ad8afb417997de2fbecd8 | |
parent | 4594d1c05e4c5181396713a49f6e7046a21a4d0a (diff) | |
download | gitlab-ce-nik-db-migrations-prefer-atomic-docs.tar.gz |
Addreses backend and technical writer comments:nik-db-migrations-prefer-atomic-docs
- Removes text mentioning MySQL
- Removes duplicated text
- Makes section a bit more descriptive
- Reorganices 'Atomicity' and 'Heave operations in a single transaction'
sections.
- Add missing commans and fixes typos
- Makes the text more understandable.
-rw-r--r-- | doc/development/database_review.md | 2 | ||||
-rw-r--r-- | doc/development/migration_style_guide.md | 92 |
2 files changed, 52 insertions, 42 deletions
diff --git a/doc/development/database_review.md b/doc/development/database_review.md index 3f1b359cb0b..367a481ee11 100644 --- a/doc/development/database_review.md +++ b/doc/development/database_review.md @@ -91,7 +91,7 @@ and details for a database reviewer: concurrent index/foreign key helpers (with transactions disabled) - Check consistency with `db/schema.rb` and that migrations are [reversible](migration_style_guide.md#reversibility) - Check queries timing (If any): Queries executed in a migration - need to fit comfortable within `15s` - preferably much less than that - on GitLab.com. + need to fit comfortably within `15s` - preferably much less than that - on GitLab.com. - Check [background migrations](background_migrations.md): - For data migrations, establish a time estimate for execution - They should only be used when migrating data in larger tables. diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md index afee1da1f8a..077dd2f43d0 100644 --- a/doc/development/migration_style_guide.md +++ b/doc/development/migration_style_guide.md @@ -10,10 +10,7 @@ migrations are written carefully, can be applied online, and adhere to the style guide below. Migrations are **not** allowed to require GitLab installations to be taken -offline unless _absolutely necessary_. Downtime assumptions should be based on -the behavior of the migration when performed using PostgreSQL, as various -operations in MySQL may require downtime without there being alternatives. ->>>>>>> Migrations guide: use atomic steps, when possible +offline unless _absolutely necessary_. When downtime is necessary the migration has to be approved by: @@ -90,23 +87,34 @@ migration was tested. ## Atomicity -When it is possible, your changes must be grouped into a single migration, -or to as few migrations as possible. For example, if you create an empty table -and need to build an index for it, it is recommended to have a single migration, -*without* `disable_ddl_transaction!`. Of course, in this case, you have to use -`add_index` (without `concurrently`), a blocking operation – it will not block -anything because the target table is not yet used. +By default, migrations are single transaction. That is, a transaction is opened +at the beginning of the migration, and committed after all steps are processed. -Such an approach allows having fewer migration steps, which are atomic ("all or -nothing"). +Running migrations in a single transaction makes sure that if one of the steps fails, +none of the steps will be executed, leaving the database in valid state. +Therefore, either: -## No schema changes and heavy operations in a single transaction +- Put all migrations in one single-transaction migration. +- If necessary, put most actions in one migration and create a separate migration + for the steps that cannot be done in a single transaction. + +For example, if you create an empty table and need to build an index for it, +it is recommended to use a regular single-transaction migration and the default +rails schema statement: [`add_index`](https://api.rubyonrails.org/v5.2/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_index). +This is a blocking operation, but it won't cause problems because the table is not yet used, +and therefore it does not have any records yet. -However, a single-transaction migration (that one without -`disable_ddl_transaction!`) involving schema changes must not include any -processing of significant (more than a few records rows) amounts of data. -If you need to insert, update, or delete something, you must not do it inside -a transaction with DDL. +## Heavy operations in a single transaction + +When using a single-transaction migration, a transaction will hold on a database connection +for the duration of the migration, so you must make sure the actions in the migration +do not take too much time: In general, queries executed in a migration need to fit comfortably +within `15s` on GitLab.com. + +In case you need to insert, update, or delete a significant amount of data, you: + +- Must disable the single transaction with `disable_ddl_transaction!`. +- Should consider doing it in a [Background Migration](background_migrations.md). ## Multi-Threading @@ -151,11 +159,10 @@ should be more than enough. ## Removing indexes -If the table is not empty when removing an index make sure to use the method -`remove_concurrent_index` instead -of the regular `remove_index` method. The `remove_concurrent_index` method -automatically drops concurrent indexes when using PostgreSQL, removing the -need for downtime. To use this method, you must disable single-transaction mode +If the table is not empty when removing an index, make sure to use the method +`remove_concurrent_index` instead of the regular `remove_index` method. +The `remove_concurrent_index` method drops indexes concurrently, so no locking is required, +and there is no need for downtime. To use this method, you must disable single-transaction mode by calling the method `disable_ddl_transaction!` in the body of your migration class like so: @@ -173,9 +180,9 @@ end Note that it is not necessary to check if the index exists prior to removing it. -For an empty table (such as a fresh one), it is recommended to use -`remove_index` in a single-transaction migration, combining it with other -operations that don't require `disable_ddl_transaction!`. +For a small table (such as an empty one or one with less than `1,000` records), +it is recommended to use `remove_index` in a single-transaction migration, +combining it with other operations that don't require `disable_ddl_transaction!`. ## Adding indexes @@ -185,12 +192,13 @@ always _first_ add a migration that removes any duplicates, before adding the unique index. When adding an index to a non-empty table make sure to use the method -`add_concurrent_index` instead -of the regular `add_index` method. The `add_concurrent_index` method -automatically creates concurrent indexes when using PostgreSQL, removing the -need for downtime. To use this method, you must disable transactions by calling -the method `disable_ddl_transaction!` in the body of your migration class like -so: +`add_concurrent_index` instead of the regular `add_index` method. +The `add_concurrent_index` method automatically creates concurrent indexes +when using PostgreSQL, removing the need for downtime. + +To use this method, you must disable single-transactions mode +by calling the method `disable_ddl_transaction!` in the body of your migration +class like so: ```ruby class MyMigration < ActiveRecord::Migration[4.2] @@ -208,8 +216,8 @@ class MyMigration < ActiveRecord::Migration[4.2] end ``` -For an empty table (such as a fresh one), it is recommended to use -`add_index` in a single-transaction migration, combining it with other +For a small table (such as an empty one or one with less than `1,000` records), +it is recommended to use `add_index` in a single-transaction migration, combining it with other operations that don't require `disable_ddl_transaction!`. ## Adding foreign-key constraints @@ -235,7 +243,7 @@ class Migration < ActiveRecord::Migration[4.2] end ``` -When adding a foreign-key constraint to an existing column in a non-empty table +When adding a foreign-key constraint to an existing column in a non-empty table, we have to employ `add_concurrent_foreign_key` and `add_concurrent_index` instead of `add_reference`. @@ -269,13 +277,13 @@ end ``` Keep in mind that this operation can easily take 10-15 minutes to complete on -larger installations (e.g., GitLab.com). As a result, you should only add +larger installations (e.g. GitLab.com). As a result, you should only add default values if absolutely necessary. There is a RuboCop cop that will fail if this method is used on some tables that are very large on GitLab.com, which would cause other issues. -For an empty table (such as a fresh one), it is recommended to use -`add_column` + `change_column_default` in a single-transaction migration, +For a small table (such as an empty one or one with less than `1,000` records), +use `add_column` and `change_column_default` in a single-transaction migration, combining it with other operations that don't require `disable_ddl_transaction!`. ## Updating an existing column @@ -295,7 +303,9 @@ end ``` If a computed update is needed, the value can be wrapped in `Arel.sql`, so Arel -treats it as an SQL literal. The below example is the same as the one above, but +treats it as an SQL literal. It's also a required deprecation for [Rails 6](https://gitlab.com/gitlab-org/gitlab-ce/issues/61451). + +The below example is the same as the one above, but the value is set to the product of the `bar` and `baz` columns: ```ruby @@ -321,7 +331,7 @@ restricts the maximum file size to just over 2GB. To allow an integer column to hold up to an 8-byte (64-bit) number, explicitly set the limit to 8-bytes. This will allow the column to hold a value up to -9,223,372,036,854,775,807. +`9,223,372,036,854,775,807`. Rails migration example: @@ -445,7 +455,7 @@ derived from the class name or namespace. ### Renaming reserved paths -When a new route for projects is introduced, that could conflict with any +When a new route for projects is introduced, it could conflict with any existing records. The path for these records should be renamed, and the related data should be moved on disk. |