summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorMayra Cabrera <mcabrera@gitlab.com>2019-08-16 11:31:02 -0500
committerMayra Cabrera <mcabrera@gitlab.com>2019-08-23 15:07:03 -0500
commitaee8a868f6b74cf4350d1ac4f66d06515d0b78c3 (patch)
tree020c8fee1b25ae73ec8ad8afb417997de2fbecd8
parent4594d1c05e4c5181396713a49f6e7046a21a4d0a (diff)
downloadgitlab-ce-nik-db-migrations-prefer-atomic-docs.tar.gz
Addreses backend and technical writer comments:nik-db-migrations-prefer-atomic-docs
- Removes text mentioning MySQL - Removes duplicated text - Makes section a bit more descriptive - Reorganices 'Atomicity' and 'Heave operations in a single transaction' sections. - Add missing commans and fixes typos - Makes the text more understandable.
-rw-r--r--doc/development/database_review.md2
-rw-r--r--doc/development/migration_style_guide.md92
2 files changed, 52 insertions, 42 deletions
diff --git a/doc/development/database_review.md b/doc/development/database_review.md
index 3f1b359cb0b..367a481ee11 100644
--- a/doc/development/database_review.md
+++ b/doc/development/database_review.md
@@ -91,7 +91,7 @@ and details for a database reviewer:
concurrent index/foreign key helpers (with transactions disabled)
- Check consistency with `db/schema.rb` and that migrations are [reversible](migration_style_guide.md#reversibility)
- Check queries timing (If any): Queries executed in a migration
- need to fit comfortable within `15s` - preferably much less than that - on GitLab.com.
+ need to fit comfortably within `15s` - preferably much less than that - on GitLab.com.
- Check [background migrations](background_migrations.md):
- For data migrations, establish a time estimate for execution
- They should only be used when migrating data in larger tables.
diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md
index afee1da1f8a..077dd2f43d0 100644
--- a/doc/development/migration_style_guide.md
+++ b/doc/development/migration_style_guide.md
@@ -10,10 +10,7 @@ migrations are written carefully, can be applied online, and adhere to the style
guide below.
Migrations are **not** allowed to require GitLab installations to be taken
-offline unless _absolutely necessary_. Downtime assumptions should be based on
-the behavior of the migration when performed using PostgreSQL, as various
-operations in MySQL may require downtime without there being alternatives.
->>>>>>> Migrations guide: use atomic steps, when possible
+offline unless _absolutely necessary_.
When downtime is necessary the migration has to be approved by:
@@ -90,23 +87,34 @@ migration was tested.
## Atomicity
-When it is possible, your changes must be grouped into a single migration,
-or to as few migrations as possible. For example, if you create an empty table
-and need to build an index for it, it is recommended to have a single migration,
-*without* `disable_ddl_transaction!`. Of course, in this case, you have to use
-`add_index` (without `concurrently`), a blocking operation – it will not block
-anything because the target table is not yet used.
+By default, migrations are single transaction. That is, a transaction is opened
+at the beginning of the migration, and committed after all steps are processed.
-Such an approach allows having fewer migration steps, which are atomic ("all or
-nothing").
+Running migrations in a single transaction makes sure that if one of the steps fails,
+none of the steps will be executed, leaving the database in valid state.
+Therefore, either:
-## No schema changes and heavy operations in a single transaction
+- Put all migrations in one single-transaction migration.
+- If necessary, put most actions in one migration and create a separate migration
+ for the steps that cannot be done in a single transaction.
+
+For example, if you create an empty table and need to build an index for it,
+it is recommended to use a regular single-transaction migration and the default
+rails schema statement: [`add_index`](https://api.rubyonrails.org/v5.2/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_index).
+This is a blocking operation, but it won't cause problems because the table is not yet used,
+and therefore it does not have any records yet.
-However, a single-transaction migration (that one without
-`disable_ddl_transaction!`) involving schema changes must not include any
-processing of significant (more than a few records rows) amounts of data.
-If you need to insert, update, or delete something, you must not do it inside
-a transaction with DDL.
+## Heavy operations in a single transaction
+
+When using a single-transaction migration, a transaction will hold on a database connection
+for the duration of the migration, so you must make sure the actions in the migration
+do not take too much time: In general, queries executed in a migration need to fit comfortably
+within `15s` on GitLab.com.
+
+In case you need to insert, update, or delete a significant amount of data, you:
+
+- Must disable the single transaction with `disable_ddl_transaction!`.
+- Should consider doing it in a [Background Migration](background_migrations.md).
## Multi-Threading
@@ -151,11 +159,10 @@ should be more than enough.
## Removing indexes
-If the table is not empty when removing an index make sure to use the method
-`remove_concurrent_index` instead
-of the regular `remove_index` method. The `remove_concurrent_index` method
-automatically drops concurrent indexes when using PostgreSQL, removing the
-need for downtime. To use this method, you must disable single-transaction mode
+If the table is not empty when removing an index, make sure to use the method
+`remove_concurrent_index` instead of the regular `remove_index` method.
+The `remove_concurrent_index` method drops indexes concurrently, so no locking is required,
+and there is no need for downtime. To use this method, you must disable single-transaction mode
by calling the method `disable_ddl_transaction!` in the body of your migration
class like so:
@@ -173,9 +180,9 @@ end
Note that it is not necessary to check if the index exists prior to
removing it.
-For an empty table (such as a fresh one), it is recommended to use
-`remove_index` in a single-transaction migration, combining it with other
-operations that don't require `disable_ddl_transaction!`.
+For a small table (such as an empty one or one with less than `1,000` records),
+it is recommended to use `remove_index` in a single-transaction migration,
+combining it with other operations that don't require `disable_ddl_transaction!`.
## Adding indexes
@@ -185,12 +192,13 @@ always _first_ add a migration that removes any duplicates, before adding the
unique index.
When adding an index to a non-empty table make sure to use the method
-`add_concurrent_index` instead
-of the regular `add_index` method. The `add_concurrent_index` method
-automatically creates concurrent indexes when using PostgreSQL, removing the
-need for downtime. To use this method, you must disable transactions by calling
-the method `disable_ddl_transaction!` in the body of your migration class like
-so:
+`add_concurrent_index` instead of the regular `add_index` method.
+The `add_concurrent_index` method automatically creates concurrent indexes
+when using PostgreSQL, removing the need for downtime.
+
+To use this method, you must disable single-transactions mode
+by calling the method `disable_ddl_transaction!` in the body of your migration
+class like so:
```ruby
class MyMigration < ActiveRecord::Migration[4.2]
@@ -208,8 +216,8 @@ class MyMigration < ActiveRecord::Migration[4.2]
end
```
-For an empty table (such as a fresh one), it is recommended to use
-`add_index` in a single-transaction migration, combining it with other
+For a small table (such as an empty one or one with less than `1,000` records),
+it is recommended to use `add_index` in a single-transaction migration, combining it with other
operations that don't require `disable_ddl_transaction!`.
## Adding foreign-key constraints
@@ -235,7 +243,7 @@ class Migration < ActiveRecord::Migration[4.2]
end
```
-When adding a foreign-key constraint to an existing column in a non-empty table
+When adding a foreign-key constraint to an existing column in a non-empty table,
we have to employ `add_concurrent_foreign_key` and `add_concurrent_index`
instead of `add_reference`.
@@ -269,13 +277,13 @@ end
```
Keep in mind that this operation can easily take 10-15 minutes to complete on
-larger installations (e.g., GitLab.com). As a result, you should only add
+larger installations (e.g. GitLab.com). As a result, you should only add
default values if absolutely necessary. There is a RuboCop cop that will fail if
this method is used on some tables that are very large on GitLab.com, which
would cause other issues.
-For an empty table (such as a fresh one), it is recommended to use
-`add_column` + `change_column_default` in a single-transaction migration,
+For a small table (such as an empty one or one with less than `1,000` records),
+use `add_column` and `change_column_default` in a single-transaction migration,
combining it with other operations that don't require `disable_ddl_transaction!`.
## Updating an existing column
@@ -295,7 +303,9 @@ end
```
If a computed update is needed, the value can be wrapped in `Arel.sql`, so Arel
-treats it as an SQL literal. The below example is the same as the one above, but
+treats it as an SQL literal. It's also a required deprecation for [Rails 6](https://gitlab.com/gitlab-org/gitlab-ce/issues/61451).
+
+The below example is the same as the one above, but
the value is set to the product of the `bar` and `baz` columns:
```ruby
@@ -321,7 +331,7 @@ restricts the maximum file size to just over 2GB.
To allow an integer column to hold up to an 8-byte (64-bit) number, explicitly
set the limit to 8-bytes. This will allow the column to hold a value up to
-9,223,372,036,854,775,807.
+`9,223,372,036,854,775,807`.
Rails migration example:
@@ -445,7 +455,7 @@ derived from the class name or namespace.
### Renaming reserved paths
-When a new route for projects is introduced, that could conflict with any
+When a new route for projects is introduced, it could conflict with any
existing records. The path for these records should be renamed, and the
related data should be moved on disk.