diff options
Diffstat (limited to 'doc/development/what_requires_downtime.md')
-rw-r--r-- | doc/development/what_requires_downtime.md | 237 |
1 files changed, 181 insertions, 56 deletions
diff --git a/doc/development/what_requires_downtime.md b/doc/development/what_requires_downtime.md index bbcd26477f3..8da6ad684f5 100644 --- a/doc/development/what_requires_downtime.md +++ b/doc/development/what_requires_downtime.md @@ -2,7 +2,8 @@ When working with a database certain operations can be performed without taking GitLab offline, others do require a downtime period. This guide describes -various operations and their impact. +various operations, their impact, and how to perform them without requiring +downtime. ## Adding Columns @@ -41,50 +42,156 @@ information on how to use this method. ## Dropping Columns -On PostgreSQL you can safely remove an existing column without the need for -downtime. When you drop a column in PostgreSQL it's not immediately removed, -instead it is simply disabled. The data is removed on the next vacuum run. +Removing columns is tricky because running GitLab processes may still be using +the columns. To work around this you will need two separate merge requests and +releases: one to ignore and then remove the column, and one to remove the ignore +rule. -On MySQL this operation requires downtime. +### Step 1: Ignoring The Column -While database wise dropping a column may be fine on PostgreSQL this operation -still requires downtime because the application code may still be using the -column that was removed. For example, consider the following migration: +The first step is to ignore the column in the application code. This is +necessary because Rails caches the columns and re-uses this cache in various +places. This can be done by including the `IgnorableColumn` module into the +model, followed by defining the columns to ignore. For example, to ignore +`updated_at` in the User model you'd use the following: ```ruby -class MyMigration < ActiveRecord::Migration - def change - remove_column :projects, :dummy - end +class User < ActiveRecord::Base + include IgnorableColumn + + ignore_column :updated_at end ``` -Now imagine that the GitLab instance is running and actively uses the `dummy` -column. If we were to run the migration this would result in the GitLab instance -producing errors whenever it tries to use the `dummy` column. +Once added you should create a _post-deployment_ migration that removes the +column. Both these changes should be submitted in the same merge request. -As a result of the above downtime _is_ required when removing a column, even -when using PostgreSQL. +### Step 2: Removing The Ignore Rule + +Once the changes from step 1 have been released & deployed you can set up a +separate merge request that removes the ignore rule. This merge request can +simply remove the `ignore_column` line, and the `include IgnorableColumn` line +if no other `ignore_column` calls remain. ## Renaming Columns -Renaming columns requires downtime as running GitLab instances will continue -using the old column name until a new version is deployed. This can result -in the instance producing errors, which in turn can impact the user experience. +Renaming columns the normal way requires downtime as an application may continue +using the old column name during/after a database migration. To rename a column +without requiring downtime we need two migrations: a regular migration, and a +post-deployment migration. Both these migration can go in the same release. -## Changing Column Constraints +### Step 1: Add The Regular Migration + +First we need to create the regular migration. This migration should use +`Gitlab::Database::MigrationHelpers#rename_column_concurrently` to perform the +renaming. For example + +```ruby +# A regular migration in db/migrate +class RenameUsersUpdatedAtToUpdatedAtTimestamp < ActiveRecord::Migration + include Gitlab::Database::MigrationHelpers + + disable_ddl_transaction! + + def up + rename_column_concurrently :users, :updated_at, :updated_at_timestamp + end + + def down + cleanup_concurrent_column_rename :users, :updated_at_timestamp, :updated_at + end +end +``` + +This will take care of renaming the column, ensuring data stays in sync, copying +over indexes and foreign keys, etc. + +**NOTE:** if a column contains 1 or more indexes that do not contain the name of +the original column, the above procedure will fail. In this case you will first +need to rename these indexes. -Generally changing column constraints requires checking all rows in the table to -see if they meet the new constraint, unless a constraint is _removed_. For -example, changing a column that previously allowed NULL values to not allow NULL -values requires the database to verify all existing rows. +### Step 2: Add A Post-Deployment Migration -The specific behaviour varies a bit between databases but in general the safest -approach is to assume changing constraints requires downtime. +The renaming procedure requires some cleaning up in a post-deployment migration. +We can perform this cleanup using +`Gitlab::Database::MigrationHelpers#cleanup_concurrent_column_rename`: + +```ruby +# A post-deployment migration in db/post_migrate +class CleanupUsersUpdatedAtRename < ActiveRecord::Migration + include Gitlab::Database::MigrationHelpers + + disable_ddl_transaction! + + def up + cleanup_concurrent_column_rename :users, :updated_at, :updated_at_timestamp + end + + def down + rename_column_concurrently :users, :updated_at_timestamp, :updated_at + end +end +``` + +## Changing Column Constraints + +Adding or removing a NOT NULL clause (or another constraint) can typically be +done without requiring downtime. However, this does require that any application +changes are deployed _first_. Thus, changing the constraints of a column should +happen in a post-deployment migration. ## Changing Column Types -This operation requires downtime. +Changing the type of a column can be done using +`Gitlab::Database::MigrationHelpers#change_column_type_concurrently`. This +method works similarly to `rename_column_concurrently`. For example, let's say +we want to change the type of `users.username` from `string` to `text`. + +### Step 1: Create A Regular Migration + +A regular migration is used to create a new column with a temporary name along +with setting up some triggers to keep data in sync. Such a migration would look +as follows: + +```ruby +# A regular migration in db/migrate +class ChangeUsersUsernameStringToText < ActiveRecord::Migration + include Gitlab::Database::MigrationHelpers + + disable_ddl_transaction! + + def up + change_column_type_concurrently :users, :username, :text + end + + def down + cleanup_concurrent_column_type_change :users, :username + end +end +``` + +### Step 2: Create A Post Deployment Migration + +Next we need to clean up our changes using a post-deployment migration: + +```ruby +# A post-deployment migration in db/post_migrate +class ChangeUsersUsernameStringToTextCleanup < ActiveRecord::Migration + include Gitlab::Database::MigrationHelpers + + disable_ddl_transaction! + + def up + cleanup_concurrent_column_type_change :users + end + + def down + change_column_type_concurrently :users, :username, :string + end +end +``` + +And that's it, we're done! ## Adding Indexes @@ -101,12 +208,19 @@ Migrations can take advantage of this by using the method ```ruby class MyMigration < ActiveRecord::Migration - def change + def up add_concurrent_index :projects, :column_name end + + def down + remove_index(:projects, :column_name) if index_exists?(:projects, :column_name) + end end ``` +Note that `add_concurrent_index` can not be reversed automatically, thus you +need to manually define `up` and `down`. + When running this on PostgreSQL the `CONCURRENTLY` option mentioned above is used. On MySQL this method produces a regular `CREATE INDEX` query. @@ -125,43 +239,54 @@ This operation is safe as there's no code using the table just yet. ## Dropping Tables -This operation requires downtime as application code may still be using the -table. +Dropping tables can be done safely using a post-deployment migration, but only +if the application no longer uses the table. ## Adding Foreign Keys -Adding foreign keys acquires an exclusive lock on both the source and target -tables in PostgreSQL. This requires downtime as otherwise the entire application -grinds to a halt for the duration of the operation. +Adding foreign keys usually works in 3 steps: + +1. Start a transaction +1. Run `ALTER TABLE` to add the constraint(s) +1. Check all existing data -On MySQL this operation also requires downtime _unless_ foreign key checks are -disabled. Because this means checks aren't enforced this is not ideal, as such -one should assume MySQL also requires downtime. +Because `ALTER TABLE` typically acquires an exclusive lock until the end of a +transaction this means this approach would require downtime. + +GitLab allows you to work around this by using +`Gitlab::Database::MigrationHelpers#add_concurrent_foreign_key`. This method +ensures that when PostgreSQL is used no downtime is needed. ## Removing Foreign Keys -This operation should not require downtime on both PostgreSQL and MySQL. +This operation does not require downtime. -## Updating Data +## Data Migrations -Updating data should generally be safe. The exception to this is data that's -being migrated from one version to another while the application still produces -data in the old version. +Data migrations can be tricky. The usual approach to migrate data is to take a 3 +step approach: -For example, imagine the application writes the string `'dog'` to a column but -it really is meant to write `'cat'` instead. One might think that the following -migration is all that is needed to solve this problem: +1. Migrate the initial batch of data +1. Deploy the application code +1. Migrate any remaining data -```ruby -class MyMigration < ActiveRecord::Migration - def up - execute("UPDATE some_table SET column = 'cat' WHERE column = 'dog';") - end -end -``` +Usually this works, but not always. For example, if a field's format is to be +changed from JSON to something else we have a bit of a problem. If we were to +change existing data before deploying application code we'll most likely run +into errors. On the other hand, if we were to migrate after deploying the +application code we could run into the same problems. + +If you merely need to correct some invalid data, then a post-deployment +migration is usually enough. If you need to change the format of data (e.g. from +JSON to something else) it's typically best to add a new column for the new data +format, and have the application use that. In such a case the procedure would +be: -Unfortunately this is not enough. Because the application is still running and -using the old value this may result in the table still containing rows where -`column` is set to `dog`, even after the migration finished. +1. Add a new column in the new format +1. Copy over existing data to this new column +1. Deploy the application code +1. In a post-deployment migration, copy over any remaining data -In these cases downtime _is_ required, even for rarely updated tables. +In general there is no one-size-fits-all solution, therefore it's best to +discuss these kind of migrations in a merge request to make sure they are +implemented in the best way possible. |