1 files changed, 81 insertions, 11 deletions
diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md
index d1f956f957e..ebaceac6d17 100644
--- a/doc/development/migration_style_guide.md
+++ b/doc/development/migration_style_guide.md
@@ -30,7 +30,7 @@ versions. If needed copy-paste GitLab code into the migration to make it forward
 compatible.
 
 For GitLab.com, please take into consideration that regular migrations (under `db/migrate`)
-are run before [Canary is deployed](https://about.gitlab.com/handbook/engineering/infrastructure/library/canary/#configuration-and-deployment),
+are run before [Canary is deployed](https://gitlab.com/gitlab-com/gl-infra/readiness/-/tree/master/library/canary/#configuration-and-deployment),
 and post-deployment migrations (`db/post_migrate`) are run after the deployment to production has finished.
 
 ## Schema Changes
@@ -380,7 +380,8 @@ Example changes:
 - `change_column_default`
 - `create_table` / `drop_table`
 
-**Note:** `with_lock_retries` method **cannot** be used within the `change` method, you must manually define the `up` and `down` methods to make the migration reversible.
+NOTE: **Note:**
+`with_lock_retries` method **cannot** be used within the `change` method, you must manually define the `up` and `down` methods to make the migration reversible.
 
 ### How the helper method works
 
@@ -440,7 +441,8 @@ the `with_multiple_threads` block, instead of re-using the global connection
 pool. This ensures each thread has its own connection object, and won't time
 out when trying to obtain one.
 
-**NOTE:** PostgreSQL has a maximum amount of connections that it allows. This
+NOTE: **Note:**
+PostgreSQL has a maximum amount of connections that it allows. This
 limit can vary from installation to installation. As a result, it's recommended
 you do not use more than 32 threads in a single migration. Usually, 4-8 threads
 should be more than enough.
@@ -873,14 +875,7 @@ end
 When doing so be sure to explicitly set the model's table name, so it's not
 derived from the class name or namespace.
 
-Finally, make sure that `reset_column_information` is run in the `up` method of
-the migration for all local Models that update the database.
-
-The reason for that is that all migration classes are loaded at the beginning
-(when `db:migrate` starts), so they can get out of sync with the table schema
-they map to in case another migration updates that schema. That makes the data
-migration fail when trying to insert or make updates to the underlying table,
-as the new columns are reported as `unknown attribute` by `ActiveRecord`.
+Be aware of the limitations [when using models in migrations](#using-models-in-migrations-discouraged).
 
 ### Renaming reserved paths
 
@@ -906,3 +901,78 @@ _namespaces_ that have a `project_id`.
 
 The `path` column for these rows will be renamed to their previous value followed
 by an integer. For example: `users` would turn into `users0`
+
+## Using models in migrations (discouraged)
+
+The use of models in migrations is generally discouraged. As such models are
+[contraindicated for background migrations](background_migrations.md#isolation),
+the model needs to be declared in the migration.
+
+If using a model in the migrations, you should first
+[clear the column cache](https://api.rubyonrails.org/classes/ActiveRecord/ModelSchema/ClassMethods.html#method-i-reset_column_information)
+using `reset_column_information`.
+
+This avoids problems where a column that you are using was altered and cached
+in a previous migration.
+
+### Example: Add a column `my_column` to the users table
+
+NOTE: **Note:**
+It is important not to leave out the `User.reset_column_information` command, in order to ensure that the old schema is dropped from the cache and ActiveRecord loads the updated schema information.
+
+```ruby
+class AddAndSeedMyColumn < ActiveRecord::Migration[6.0]
+  class User < ActiveRecord::Base
+    self.table_name = 'users'
+  end
+
+  def up
+    User.count # Any ActiveRecord calls on the model that caches the column information.
+
+    add_column :users, :my_column, :integer, default: 1
+
+    User.reset_column_information # The old schema is dropped from the cache.
+    User.find_each do |user|
+      user.my_column = 42 if some_condition # ActiveRecord sees the correct schema here.
+      user.save!
+    end
+  end
+end
+```
+
+The underlying table is modified and then accessed via ActiveRecord.
+
+Note that this also needs to be used if the table is modified in a previous, different migration,
+if both migrations are run in the same `db:migrate` process.
+
+This results in the following. Note the inclusion of `my_column`:
+
+```shell
+== 20200705232821 AddAndSeedMyColumn: migrating ==============================
+D, [2020-07-06T00:37:12.483876 #130101] DEBUG -- :    (0.2ms)  BEGIN
+D, [2020-07-06T00:37:12.521660 #130101] DEBUG -- :    (0.4ms)  SELECT COUNT(*) FROM "user"
+-- add_column(:users, :my_column, :integer, {:default=>1})
+D, [2020-07-06T00:37:12.523309 #130101] DEBUG -- :    (0.8ms)  ALTER TABLE "users" ADD "my_column" integer DEFAULT 1
+   -> 0.0016s
+D, [2020-07-06T00:37:12.650641 #130101] DEBUG -- :   AddAndSeedMyColumn::User Load (0.7ms)  SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 1000]]
+D, [2020-07-18T00:41:26.851769 #459802] DEBUG -- :   AddAndSeedMyColumn::User Update (1.1ms)  UPDATE "users" SET "my_column" = $1, "updated_at" = $2 WHERE "users"."id" = $3  [["my_column", 42], ["updated_at", "2020-07-17 23:41:26.849044"], ["id", 1]]
+D, [2020-07-06T00:37:12.653648 #130101] DEBUG -- :   ↳ config/initializers/config_initializers_active_record_locking.rb:13:in `_update_row'
+== 20200705232821 AddAndSeedMyColumn: migrated (0.1706s) =====================
+```
+
+If you skip clearing the schema cache (`User.reset_column_information`), the column is not
+used by ActiveRecord and the intended changes are not made, leading to the result below,
+where `my_column` is missing from the query.
+
+```shell
+== 20200705232821 AddAndSeedMyColumn: migrating ==============================
+D, [2020-07-06T00:37:12.483876 #130101] DEBUG -- :    (0.2ms)  BEGIN
+D, [2020-07-06T00:37:12.521660 #130101] DEBUG -- :    (0.4ms)  SELECT COUNT(*) FROM "user"
+-- add_column(:users, :my_column, :integer, {:default=>1})
+D, [2020-07-06T00:37:12.523309 #130101] DEBUG -- :    (0.8ms)  ALTER TABLE "users" ADD "my_column" integer DEFAULT 1
+   -> 0.0016s
+D, [2020-07-06T00:37:12.650641 #130101] DEBUG -- :   AddAndSeedMyColumn::User Load (0.7ms)  SELECT "users".* FROM "users" ORDER BY "users"."id" ASC LIMIT $1  [["LIMIT", 1000]]
+D, [2020-07-06T00:37:12.653459 #130101] DEBUG -- :   AddAndSeedMyColumn::User Update (0.5ms)  UPDATE "users" SET "updated_at" = $1 WHERE "users"."id" = $2  [["updated_at", "2020-07-05 23:37:12.652297"], ["id", 1]]
+D, [2020-07-06T00:37:12.653648 #130101] DEBUG -- :   ↳ config/initializers/config_initializers_active_record_locking.rb:13:in `_update_row'
+== 20200705232821 AddAndSeedMyColumn: migrated (0.1706s) =====================
+```