Addreses backend and technical writer comments:nik-db-migrations-prefer-atomic-docs

- Removes text mentioning MySQL - Removes duplicated text - Makes section a bit more descriptive - Reorganices 'Atomicity' and 'Heave operations in a single transaction' sections. - Add missing commans and fixes typos - Makes the text more understandable.
author: Mayra Cabrera <mcabrera@gitlab.com> 2019-08-16 11:31:02 -0500
committer: Mayra Cabrera <mcabrera@gitlab.com> 2019-08-23 15:07:03 -0500
commit: aee8a868f6b74cf4350d1ac4f66d06515d0b78c3 (patch)
tree: 020c8fee1b25ae73ec8ad8afb417997de2fbecd8
parent: 4594d1c05e4c5181396713a49f6e7046a21a4d0a (diff)
download: gitlab-ce-nik-db-migrations-prefer-atomic-docs.tar.gz
2 files changed, 52 insertions, 42 deletions
diff --git a/doc/development/database_review.md b/doc/development/database_review.md
index 3f1b359cb0b..367a481ee11 100644
--- a/doc/development/database_review.md
+++ b/doc/development/database_review.md
@@ -91,7 +91,7 @@ and details for a database reviewer:
     concurrent index/foreign key helpers (with transactions disabled)
   - Check consistency with `db/schema.rb` and that migrations are [reversible](migration_style_guide.md#reversibility)
   - Check queries timing (If any): Queries executed in a migration
-    need to fit comfortable within `15s` - preferably much less than that - on GitLab.com.
+    need to fit comfortably within `15s` - preferably much less than that - on GitLab.com.
 - Check [background migrations](background_migrations.md):
   - For data migrations, establish a time estimate for execution
   - They should only be used when migrating data in larger tables.
diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md
index afee1da1f8a..077dd2f43d0 100644
--- a/doc/development/migration_style_guide.md
+++ b/doc/development/migration_style_guide.md
@@ -10,10 +10,7 @@ migrations are written carefully, can be applied online, and adhere to the style
 guide below.
 
 Migrations are **not** allowed to require GitLab installations to be taken
-offline unless _absolutely necessary_. Downtime assumptions should be based on
-the behavior of the migration when performed using PostgreSQL, as various
-operations in MySQL may require downtime without there being alternatives.
->>>>>>> Migrations guide: use atomic steps, when possible
+offline unless _absolutely necessary_.
 
 When downtime is necessary the migration has to be approved by:
 
@@ -90,23 +87,34 @@ migration was tested.
 
 ## Atomicity
 
-When it is possible, your changes must be grouped into a single migration,
-or to as few migrations as possible. For example, if you create an empty table
-and need to build an index for it, it is recommended to have a single migration,
-*without* `disable_ddl_transaction!`. Of course, in this case, you have to use
-`add_index` (without `concurrently`), a blocking operation – it will not block
-anything because the target table is not yet used.
+By default, migrations are single transaction. That is, a transaction is opened
+at the beginning of the migration, and committed after all steps are processed.
 
-Such an approach allows having fewer migration steps, which are atomic ("all or
-nothing").
+Running migrations in a single transaction makes sure that if one of the steps fails,
+none of the steps will be executed, leaving the database in valid state.
+Therefore, either:
 
-## No schema changes and heavy operations in a single transaction
+- Put all migrations in one single-transaction migration.
+- If necessary, put most actions in one migration and create a separate migration 
+  for the steps that cannot be done in a single transaction.
+   
+For example, if you create an empty table and need to build an index for it,
+it is recommended to use a regular single-transaction migration and the default
+rails schema statement: [`add_index`](https://api.rubyonrails.org/v5.2/classes/ActiveRecord/ConnectionAdapters/SchemaStatements.html#method-i-add_index).
+This is a blocking operation, but it won't cause problems because the table is not yet used,
+and therefore it does not have any records yet.
 
-However, a single-transaction migration (that one without
-`disable_ddl_transaction!`) involving schema changes must not include any
-processing of significant (more than a few records rows) amounts of data.
-If you need to insert, update, or delete something, you must not do it inside
-a transaction with DDL.
+## Heavy operations in a single transaction
+
+When using a single-transaction migration, a transaction will hold on a database connection
+for the duration of the migration, so you must make sure the actions in the migration
+do not take too much time: In general, queries executed in a migration need to fit comfortably
+within `15s` on GitLab.com.
+
+In case you need to insert, update, or delete a significant amount of data, you:
+
+- Must disable the single transaction with `disable_ddl_transaction!`.
+- Should consider doing it in a [Background Migration](background_migrations.md).
 
 ## Multi-Threading
 
@@ -151,11 +159,10 @@ should be more than enough.
 
 ## Removing indexes
 
-If the table is not empty when removing an index make sure to use the method
-`remove_concurrent_index` instead
-of the regular `remove_index` method. The `remove_concurrent_index` method
-automatically drops concurrent indexes when using PostgreSQL, removing the
-need for downtime. To use this method, you must disable single-transaction mode
+If the table is not empty when removing an index, make sure to use the method
+`remove_concurrent_index` instead of the regular `remove_index` method.
+The `remove_concurrent_index` method drops indexes concurrently, so no locking is required,
+and there is no need for downtime. To use this method, you must disable single-transaction mode
 by calling the method `disable_ddl_transaction!` in the body of your migration
 class like so:
 
@@ -173,9 +180,9 @@ end
 Note that it is not necessary to check if the index exists prior to
 removing it.
 
-For an empty table (such as a fresh one), it is recommended to use
-`remove_index` in a single-transaction migration, combining it with other
-operations that don't require `disable_ddl_transaction!`.
+For a small table (such as an empty one or one with less than `1,000` records),
+it is recommended to use `remove_index` in a single-transaction migration,
+combining it with other operations that don't require `disable_ddl_transaction!`.
 
 ## Adding indexes
 
@@ -185,12 +192,13 @@ always _first_ add a migration that removes any duplicates, before adding the
 unique index.
 
 When adding an index to a non-empty table make sure to use the method
-`add_concurrent_index` instead
-of the regular `add_index` method. The `add_concurrent_index` method
-automatically creates concurrent indexes when using PostgreSQL, removing the
-need for downtime. To use this method, you must disable transactions by calling
-the method `disable_ddl_transaction!` in the body of your migration class like
-so:
+`add_concurrent_index` instead of the regular `add_index` method. 
+The `add_concurrent_index` method automatically creates concurrent indexes
+when using PostgreSQL, removing the need for downtime. 
+
+To use this method, you must disable single-transactions mode
+by calling the method `disable_ddl_transaction!` in the body of your migration
+class like so:
 
 ```ruby
 class MyMigration < ActiveRecord::Migration[4.2]
@@ -208,8 +216,8 @@ class MyMigration < ActiveRecord::Migration[4.2]
 end
 ```
 
-For an empty table (such as a fresh one), it is recommended to use
-`add_index` in a single-transaction migration, combining it with other
+For a small table (such as an empty one or one with less than `1,000` records), 
+it is recommended to use `add_index` in a single-transaction migration, combining it with other
 operations that don't require `disable_ddl_transaction!`.
 
 ## Adding foreign-key constraints
@@ -235,7 +243,7 @@ class Migration < ActiveRecord::Migration[4.2]
 end
 ```
 
-When adding a foreign-key constraint to an existing column in a non-empty table
+When adding a foreign-key constraint to an existing column in a non-empty table,
 we have to employ `add_concurrent_foreign_key` and `add_concurrent_index`
 instead of `add_reference`.
 
@@ -269,13 +277,13 @@ end
 ```
 
 Keep in mind that this operation can easily take 10-15 minutes to complete on
-larger installations (e.g., GitLab.com). As a result, you should only add
+larger installations (e.g. GitLab.com). As a result, you should only add
 default values if absolutely necessary. There is a RuboCop cop that will fail if
 this method is used on some tables that are very large on GitLab.com, which
 would cause other issues.
 
-For an empty table (such as a fresh one), it is recommended to use
-`add_column` + `change_column_default` in a single-transaction migration,
+For a small table (such as an empty one or one with less than `1,000` records),
+use `add_column` and `change_column_default` in a single-transaction migration,
 combining it with other operations that don't require `disable_ddl_transaction!`.
 
 ## Updating an existing column
@@ -295,7 +303,9 @@ end
 ```
 
 If a computed update is needed, the value can be wrapped in `Arel.sql`, so Arel
-treats it as an SQL literal. The below example is the same as the one above, but
+treats it as an SQL literal. It's also a required deprecation for [Rails 6](https://gitlab.com/gitlab-org/gitlab-ce/issues/61451).
+
+The below example is the same as the one above, but
 the value is set to the product of the `bar` and `baz` columns:
 
 ```ruby
@@ -321,7 +331,7 @@ restricts the maximum file size to just over 2GB.
 
 To allow an integer column to hold up to an 8-byte (64-bit) number, explicitly
 set the limit to 8-bytes. This will allow the column to hold a value up to
-9,223,372,036,854,775,807.
+`9,223,372,036,854,775,807`.
 
 Rails migration example:
 
@@ -445,7 +455,7 @@ derived from the class name or namespace.
 
 ### Renaming reserved paths
 
-When a new route for projects is introduced, that could conflict with any
+When a new route for projects is introduced, it could conflict with any
 existing records. The path for these records should be renamed, and the
 related data should be moved on disk.
author	Mayra Cabrera <mcabrera@gitlab.com>	2019-08-16 11:31:02 -0500
committer	Mayra Cabrera <mcabrera@gitlab.com>	2019-08-23 15:07:03 -0500
commit	aee8a868f6b74cf4350d1ac4f66d06515d0b78c3 (patch)
tree	020c8fee1b25ae73ec8ad8afb417997de2fbecd8
parent	4594d1c05e4c5181396713a49f6e7046a21a4d0a (diff)
download	gitlab-ce-nik-db-migrations-prefer-atomic-docs.tar.gz