Diffstat (limited to 'doc/development/database')
-rw-r--r--  doc/development/database/avoiding_downtime_in_migrations.md    66
-rw-r--r--  doc/development/database/background_migrations.md               14
-rw-r--r--  doc/development/database/batched_background_migrations.md     371
-rw-r--r--  doc/development/database/loose_foreign_keys.md                  10
-rw-r--r--  doc/development/database/migrations_for_multiple_databases.md  35
-rw-r--r--  doc/development/database/multiple_databases.md                  63
-rw-r--r--  doc/development/database/pagination_guidelines.md                2
-rw-r--r--  doc/development/database/strings_and_the_text_data_type.md      2
-rw-r--r--  doc/development/database/table_partitioning.md                   2
9 files changed, 493 insertions, 72 deletions
diff --git a/doc/development/database/avoiding_downtime_in_migrations.md b/doc/development/database/avoiding_downtime_in_migrations.md
index ad2768397e6..3cf9ab1ab5c 100644
--- a/doc/development/database/avoiding_downtime_in_migrations.md
+++ b/doc/development/database/avoiding_downtime_in_migrations.md
@@ -68,10 +68,72 @@ In this example, the change to ignore the column went into release 12.5.
Continuing our example, dropping the column goes into a _post-deployment_ migration in release 12.6:
+Start by creating the **post-deployment migration**:
+
+```shell
+bundle exec rails g post_deployment_migration remove_users_updated_at_column
+```
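+
+The generator creates a timestamped migration file in `db/post_migrate/`.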
+
+When writing a migration that removes a column, there are two scenarios
+you need to consider:
+
+#### A. The removed column has no indexes or constraints that belong to it
+
+In this case, a **transactional migration** can be used. Something as simple as:
+
+```ruby
+class RemoveUsersUpdatedAtColumn < Gitlab::Database::Migration[2.0]
+ def up
+ remove_column :users, :updated_at
+ end
+
+ def down
+ add_column :users, :updated_at, :datetime
+ end
+end
+```
+
+Consider [enabling lock retries](
+https://docs.gitlab.com/ee/development/migration_style_guide.html#usage-with-transactional-migrations
+) when you run a migration on big tables, because acquiring a lock on the table might take some time.
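+
+For illustration, a sketch of the same migration with lock retries explicitly
+enabled (assuming the `enable_lock_retries!` helper described in the linked
+Migration Style Guide section):
+
+```ruby
+class RemoveUsersUpdatedAtColumn < Gitlab::Database::Migration[2.0]
+  # Retry acquiring the lock with short timeouts instead of blocking
+  # other queries on a busy table.
+  enable_lock_retries!
+
+  def up
+    remove_column :users, :updated_at
+  end
+
+  def down
+    add_column :users, :updated_at, :datetime
+  end
+end
+```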
+
+#### B. The removed column has an index or constraint that belongs to it
+
+If the `down` method requires adding back any dropped indexes or constraints, it cannot
+be done within a transactional migration. In that case, the migration would look like this:
+
```ruby
- remove_column :user, :updated_at
+class RemoveUsersUpdatedAtColumn < Gitlab::Database::Migration[2.0]
+ disable_ddl_transaction!
+
+ def up
+ remove_column :users, :updated_at
+ end
+
+ def down
+ unless column_exists?(:users, :updated_at)
+ add_column :users, :updated_at, :datetime
+ end
+
+ # Make sure to add back any indexes or constraints,
+ # that were dropped in the `up` method. For example:
+ add_concurrent_index(:users, :updated_at)
+ end
+end
```
+
+In the `down` method, we check whether the column already exists before adding it again.
+We do this because the migration is non-transactional and might have failed while it was running.
+
+[`disable_ddl_transaction!`](
+https://docs.gitlab.com/ee/development/migration_style_guide.html#usage-with-non-transactional-migrations-disable_ddl_transaction
+) disables the transaction that wraps the whole migration.
+
+Refer to the [Migration Style Guide](
+https://docs.gitlab.com/ee/development/migration_style_guide.html
+) for more information about database migrations.
+
### Step 3: Removing the ignore rule (release M+2)
With the next release, in this example 12.7, we set up another merge request to remove the ignore rule.
@@ -272,7 +334,7 @@ Renaming a table is possible without downtime by following our multi-release
Adding foreign keys usually works in 3 steps:
1. Start a transaction
-1. Run `ALTER TABLE` to add the constraint(s)
+1. Run `ALTER TABLE` to add the constraints
1. Check all existing data
Because `ALTER TABLE` typically acquires an exclusive lock until the end of a
diff --git a/doc/development/database/background_migrations.md b/doc/development/database/background_migrations.md
index 1f7e0d76c89..80ba0336bda 100644
--- a/doc/development/database/background_migrations.md
+++ b/doc/development/database/background_migrations.md
@@ -7,7 +7,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
# Background migrations
WARNING:
-Background migrations are strongly discouraged in favor of the new [batched background migrations framework](../batched_background_migrations.md).
+Background migrations are strongly discouraged in favor of the new [batched background migrations framework](batched_background_migrations.md).
Please check that documentation and determine if that framework suits your needs and fall back
to these only if required.
@@ -45,13 +45,17 @@ into this category.
## Isolation
Background migrations must be isolated and can not use application code (for example,
-models defined in `app/models`). Since these migrations can take a long time to
-run it's possible for new versions to be deployed while they are still running.
+models defined in `app/models` except the `ApplicationRecord` classes). Since these migrations
+can take a long time to run, it's possible for new versions to be deployed while they are still running.
It's also possible for different migrations to be executed at the same time.
This means that different background migrations should not migrate data in a
way that would cause conflicts.
+## Accessing data for multiple databases
+
+See [Accessing data for multiple databases of Batched Background Migrations](batched_background_migrations.md#accessing-data-for-multiple-databases) for more details.
+
## Idempotence
Background migrations are executed in a context of a Sidekiq process.
@@ -190,7 +194,7 @@ class:
```ruby
class Gitlab::BackgroundMigration::ExtractIntegrationsUrl
- class Integration < ActiveRecord::Base
+ class Integration < ::ApplicationRecord
self.table_name = 'integrations'
end
@@ -214,7 +218,7 @@ created and updated integrations. We can do this using something along the lines
the following:
```ruby
-class Integration < ActiveRecord::Base
+class Integration < ::ApplicationRecord
after_commit :schedule_integration_migration, on: :update
after_commit :schedule_integration_migration, on: :create
diff --git a/doc/development/database/batched_background_migrations.md b/doc/development/database/batched_background_migrations.md
new file mode 100644
index 00000000000..3a0fa77eff9
--- /dev/null
+++ b/doc/development/database/batched_background_migrations.md
@@ -0,0 +1,371 @@
+---
+type: reference, dev
+stage: Enablement
+group: Database
+info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments-to-development-guidelines"
+---
+
+# Batched background migrations
+
+Batched background migrations should be used to perform data migrations whenever a
+migration exceeds [the time limits](../migration_style_guide.md#how-long-a-migration-should-take)
+in our guidelines. For example, you can use batched background
+migrations to migrate data that's stored in a single JSON column
+to a separate table instead.
+
+## When to use batched background migrations
+
+Use a batched background migration when you migrate _data_ in tables containing
+so many rows that the process would exceed
+[the time limits in our guidelines](../migration_style_guide.md#how-long-a-migration-should-take)
+if performed using a regular Rails migration.
+
+- Batched background migrations should be used when migrating data in
+ [high-traffic tables](../migration_style_guide.md#high-traffic-tables).
+- Batched background migrations may also be used when executing numerous single-row queries
+ for every item on a large dataset. Typically, for single-record patterns, runtime is
+ largely dependent on the size of the dataset. Split the dataset accordingly,
+ and put it into background migrations.
+- Don't use batched background migrations to perform schema migrations.
+
+Background migrations can help when:
+
+- Migrating events from one table to multiple separate tables.
+- Populating one column based on JSON stored in another column.
+- Migrating data that depends on the output of external services. (For example, an API.)
+
+NOTE:
+If the batched background migration is part of an important upgrade, it must be announced
+in the release post. Discuss with your Project Manager if you're unsure whether the migration falls
+into this category.
+
+## Isolation
+
+Batched background migrations must be isolated and cannot use application code (for example,
+models defined in `app/models` except the `ApplicationRecord` classes).
+Because these migrations can take a long time to run, it's possible
+for new versions to deploy while the migrations are still running.
+
+## Accessing data for multiple databases
+
+Unlike regular migrations, batched background migrations have access to multiple databases,
+and can be used to efficiently access and update data across them. To indicate which
+database to use, define an ActiveRecord model inline in the migration code.
+The model should inherit from the correct [`ApplicationRecord`](multiple_databases.md#gitlab-schema),
+depending on which database the table is located in. Usage of `ActiveRecord::Base` is
+disallowed, because it does not explicitly describe which database to use to access the given table.
+
+```ruby
+# good
+class Gitlab::BackgroundMigration::ExtractIntegrationsUrl
+ class Project < ::ApplicationRecord
+ self.table_name = 'projects'
+ end
+
+ class Build < ::Ci::ApplicationRecord
+ self.table_name = 'ci_builds'
+ end
+end
+
+# bad
+class Gitlab::BackgroundMigration::ExtractIntegrationsUrl
+ class Project < ActiveRecord::Base
+ self.table_name = 'projects'
+ end
+
+ class Build < ActiveRecord::Base
+ self.table_name = 'ci_builds'
+ end
+end
+```
+
+Similarly, the usage of `ActiveRecord::Base.connection` is disallowed; replace it,
+preferably with the model's connection.
+
+```ruby
+# good
+Project.connection.execute("SELECT * FROM projects")
+
+# acceptable
+ApplicationRecord.connection.execute("SELECT * FROM projects")
+
+# bad
+ActiveRecord::Base.connection.execute("SELECT * FROM projects")
+```
+
+## Idempotence
+
+Batched background migrations are executed in the context of a Sidekiq process.
+The usual Sidekiq rules apply, especially the rule that jobs should be small
+and idempotent. Make sure that, if your migration job is retried, data
+integrity is guaranteed.
+
+See [Sidekiq best practices guidelines](https://github.com/mperham/sidekiq/wiki/Best-Practices)
+for more details.
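+
+For illustration, a sketch of an idempotent batch: writing an absolute value
+(rather than, say, incrementing one) means a retried job converges to the same
+end state. The `each_sub_batch` helper is shown later in this document:
+
+```ruby
+def perform
+  each_sub_batch(operation_name: :update_all) do |sub_batch|
+    # Setting a value derived from existing data is safe to re-run.
+    sub_batch.update_all('namespace_id = source_id')
+  end
+end
+```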
+
+## Batched background migrations for EE-only features
+
+All the background migration classes for EE-only features should be present in GitLab CE.
+For this purpose, create an empty class for GitLab CE, and extend it for GitLab EE
+as explained in the guidelines for
+[implementing Enterprise Edition features](../ee_features.md#code-in-libgitlabbackground_migration).
+
+Batched background migrations are simple classes that define a `perform` method. A
+Sidekiq worker then executes such a class, passing any arguments to it. All
+migration classes must be defined in the namespace
+`Gitlab::BackgroundMigration`. Place the files in the directory
+`lib/gitlab/background_migration/`.
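+
+A minimal sketch of that structure (class name is illustrative):
+
+```ruby
+# lib/gitlab/background_migration/my_batched_migration.rb
+module Gitlab
+  module BackgroundMigration
+    class MyBatchedMigration < BatchedMigrationJob
+      def perform
+        # Migrate one batch of rows here.
+      end
+    end
+  end
+end
+```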
+
+## Queueing
+
+Queueing a batched background migration should be done in a post-deployment
+migration. Use the `queue_batched_background_migration` helper to queue the
+migration for execution in batches. Replace the class name and arguments with
+the values from your migration:
+
+```ruby
+queue_batched_background_migration(
+  JOB_CLASS_NAME,
+  TABLE_NAME,
+  JOB_ARGUMENTS,
+  JOB_INTERVAL
+)
+```
+
+Make sure the newly created data is either migrated, or
+saved in both the old and new version upon creation. Removals in
+turn can be handled by defining foreign keys with cascading deletes.
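+
+For example, a cascading delete could be defined with the concurrent foreign
+key helper (a sketch; table and column names are illustrative):
+
+```ruby
+# Removing a namespace then cascades to its routes at the database level.
+add_concurrent_foreign_key :routes, :namespaces,
+  column: :namespace_id, on_delete: :cascade
+```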
+
+### Requeuing batched background migrations
+
+If one of the batched background migrations contains a bug that is fixed in a patch
+release, you must requeue the batched background migration so the migration
+repeats on systems that already performed the initial migration.
+
+When you requeue the batched background migration, turn the original
+queuing into a no-op by clearing the `#up` and `#down` methods of the
+migration performing the requeuing. Otherwise, the batched background migration is
+queued multiple times on systems that are upgrading multiple patch releases at
+once.
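+
+A sketch of what no-op'ing the original queueing migration could look like:
+
+```ruby
+class QueueMyBatchedMigration < Gitlab::Database::Migration[2.0]
+  def up
+    # no-op: this migration is superseded by a later requeue,
+    # because the original contained a bug.
+  end
+
+  def down
+    # no-op
+  end
+end
+```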
+
+When you start the second post-deployment migration, delete the
+previously queued batched migration with the following code:
+
+```ruby
+Gitlab::Database::BackgroundMigration::BatchedMigration
+ .for_configuration(MIGRATION_NAME, TABLE_NAME, COLUMN, JOB_ARGUMENTS)
+ .delete_all
+```
+
+## Cleaning up
+
+NOTE:
+Cleaning up any remaining background migrations must be done in either a major
+or minor release. You must not do this in a patch release.
+
+Because background migrations can take a long time, you can't immediately clean
+things up after queueing them. For example, you can't drop a column used in the
+migration process, as jobs would fail. You must add a separate _post-deployment_
+migration in a future release that finishes any remaining
+jobs before cleaning things up. (For example, removing a column.)
+
+To migrate the data from column `foo` (containing a big JSON blob) to column `bar`
+(containing a string), you would:
+
+1. Release A:
+ 1. Create a migration class that performs the migration for a row with a given ID.
+ 1. Update new rows using one of these techniques:
+ - Create a new trigger for simple copy operations that don't need application logic.
+ - Handle this operation in the model/service as the records are created or updated.
+ - Create a new custom background job that updates the records.
+ 1. Queue the batched background migration for all existing rows in a post-deployment migration.
+1. Release B:
+ 1. Add a post-deployment migration that checks if the batched background migration is completed.
+   1. Deploy code so that the application starts using the new column and stops updating new records.
+ 1. Remove the old column.
+
+A bump to the [import/export version](../../user/project/settings/import_export.md) may
+be required if importing a project from a prior version of GitLab requires the
+data to be in the new format.
+
+## Example
+
+The `routes` table has a `source_type` field that's used for a polymorphic relationship.
+As part of a database redesign, we're removing the polymorphic relationship. One step of
+the work will be migrating data from the `source_id` column into a new singular foreign key.
+Because we intend to delete old rows later, there's no need to update them as part of the
+background migration.
+
+1. Start by defining our migration class, which should inherit
+ from `Gitlab::BackgroundMigration::BatchedMigrationJob`:
+
+ ```ruby
+ class Gitlab::BackgroundMigration::BackfillRouteNamespaceId < BatchedMigrationJob
+ # For illustration purposes, if we were to use a local model we could
+ # define it like below, using an `ApplicationRecord` as the base class
+ # class Route < ::ApplicationRecord
+ # self.table_name = 'routes'
+ # end
+
+ def perform
+ each_sub_batch(
+ operation_name: :update_all,
+ batching_scope: -> (relation) { relation.where("source_type <> 'UnusedType'") }
+ ) do |sub_batch|
+ sub_batch.update_all('namespace_id = source_id')
+ end
+ end
+ end
+ ```
+
+ NOTE:
+ Job classes must be subclasses of `BatchedMigrationJob` to be
+ correctly handled by the batched migration framework. Any subclass of
+ `BatchedMigrationJob` will be initialized with necessary arguments to
+ execute the batch, as well as a connection to the tracking database.
+ Additional `job_arguments` set on the migration will be passed to the
+ job's `perform` method.
+
+1. Add a new trigger to the database to update newly created and updated routes,
+ similar to this example:
+
+ ```ruby
+ execute(<<~SQL)
+ CREATE OR REPLACE FUNCTION example() RETURNS trigger
+ LANGUAGE plpgsql
+ AS $$
+ BEGIN
+     NEW."namespace_id" = NEW."source_id";
+ RETURN NEW;
+ END;
+ $$;
+ SQL
+ ```
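+
+   The function alone does nothing until it's attached to the table. A sketch of
+   the accompanying trigger (trigger name is illustrative):
+
+   ```sql
+   CREATE TRIGGER trigger_routes_namespace_id
+   BEFORE INSERT OR UPDATE ON routes
+   FOR EACH ROW EXECUTE FUNCTION example();
+   ```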
+
+1. Create a post-deployment migration that queues the migration for existing data:
+
+ ```ruby
+ class QueueBackfillRoutesNamespaceId < Gitlab::Database::Migration[1.0]
+ disable_ddl_transaction!
+
+ MIGRATION = 'BackfillRouteNamespaceId'
+ DELAY_INTERVAL = 2.minutes
+
+ def up
+ queue_batched_background_migration(
+ MIGRATION,
+ :routes,
+ :id,
+ job_interval: DELAY_INTERVAL
+ )
+ end
+
+ def down
+ Gitlab::Database::BackgroundMigration::BatchedMigration
+ .for_configuration(MIGRATION, :routes, :id, []).delete_all
+ end
+ end
+ ```
+
+ After deployment, our application:
+ - Continues using the data as before.
+ - Ensures that both existing and new data are migrated.
+
+1. In the next release, remove the trigger. We must also add a new post-deployment migration
+ that checks that the batched background migration is completed. For example:
+
+ ```ruby
+ class FinalizeBackfillRouteNamespaceId < Gitlab::Database::Migration[1.0]
+ MIGRATION = 'BackfillRouteNamespaceId'
+ disable_ddl_transaction!
+
+ def up
+ ensure_batched_background_migration_is_finished(
+ job_class_name: MIGRATION,
+ table_name: :routes,
+ column_name: :id,
+ job_arguments: []
+ )
+ end
+
+ def down
+ # no-op
+ end
+ end
+ ```
+
+ If the application does not depend on the data being 100% migrated (for
+ instance, the data is advisory, and not mission-critical), then you can skip this
+ final step. This step confirms that the migration is completed, and all of the rows were migrated.
+
+After the batched migration is completed, you can safely depend on the
+data in `routes.namespace_id` being populated.
+
+## Testing
+
+Writing tests is required for:
+
+- The batched background migrations' queueing migration.
+- The batched background migration itself.
+- A cleanup migration.
+
+The `:migration` and `schema: :latest` RSpec tags are automatically set for
+background migration specs. Refer to the
+[Testing Rails migrations](../testing_guide/testing_migrations_guide.md#testing-a-non-activerecordmigration-class)
+style guide.
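+
+For illustration, a sketch of a spec for the queueing migration above (assuming
+the `reversible_migration` helper and `have_scheduled_batched_migration` matcher
+from the GitLab test suite):
+
+```ruby
+# frozen_string_literal: true
+
+require 'spec_helper'
+require_migration!
+
+RSpec.describe QueueBackfillRoutesNamespaceId do
+  it 'schedules a batched background migration' do
+    reversible_migration do |migration|
+      migration.after -> {
+        expect(described_class::MIGRATION).to have_scheduled_batched_migration(
+          table_name: :routes,
+          column_name: :id
+        )
+      }
+    end
+  end
+end
+```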
+
+Remember that `before` and `after` RSpec hooks
+migrate your database down and up. These hooks can result in other batched background
+migrations being called. Using `spy` test doubles with
+`have_received` is encouraged, instead of using regular test doubles, because
+your expectations defined in an `it` block can conflict with what is
+called in RSpec hooks. Refer to [issue #35351](https://gitlab.com/gitlab-org/gitlab/-/issues/18839)
+for more details.
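+
+A minimal illustration of the spy pattern (names are hypothetical):
+
+```ruby
+worker = spy('worker')
+allow(MyWorker).to receive(:new).and_return(worker)
+
+run_code_under_test
+
+# Verify the call after the fact instead of setting an expectation upfront,
+# so calls made by RSpec hooks don't break the test.
+expect(worker).to have_received(:perform)
+```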
+
+## Best practices
+
+1. Know how much data you're dealing with.
+1. Make sure the batched background migration jobs are idempotent.
+1. Confirm the tests you write are not false positives.
+1. If the data being migrated is critical and cannot be lost, the
+ clean-up migration must also check the final state of the data before completing.
+1. Discuss the numbers with a database specialist. The migration may add
+   more pressure on the database than you expect. Measure on staging,
+ or ask someone to measure on production.
+1. Know how much time is required to run the batched background migration.
+
+## Additional tips and strategies
+
+### Viewing failure error logs
+
+You can view failures in two ways:
+
+- Via GitLab logs:
+ 1. After running a batched background migration, if any jobs fail,
+ view the logs in [Kibana](https://log.gprd.gitlab.net/goto/5f06a57f768c6025e1c65aefb4075694).
+ View the production Sidekiq log and filter for:
+
+ - `json.new_state: failed`
+ - `json.job_class_name: <Batched Background Migration job class name>`
+ - `json.job_arguments: <Batched Background Migration job class arguments>`
+
+ 1. Review the `json.exception_class` and `json.exception_message` values to help
+ understand why the jobs failed.
+
+  1. Remember the retry mechanism: a logged failure does not mean the job ultimately failed.
+     Always check the last status of the job.
+
+- Via database:
+
+ 1. Get the batched background migration `CLASS_NAME`.
+ 1. Execute the following query in the PostgreSQL console:
+
+ ```sql
+ SELECT migration.id, migration.job_class_name, transition_logs.exception_class, transition_logs.exception_message
+ FROM batched_background_migrations as migration
+ INNER JOIN batched_background_migration_jobs as jobs
+ ON jobs.batched_background_migration_id = migration.id
+ INNER JOIN batched_background_migration_job_transition_logs as transition_logs
+ ON transition_logs.batched_background_migration_job_id = jobs.id
+   WHERE transition_logs.next_status = '2' AND migration.job_class_name = 'CLASS_NAME';
+ ```
diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md
index 2bcdc91202a..3db24793f1b 100644
--- a/doc/development/database/loose_foreign_keys.md
+++ b/doc/development/database/loose_foreign_keys.md
@@ -117,8 +117,8 @@ Showing cross-schema foreign keys (20):
18 | N | ci_job_token_project_scope_links | projects | target_project_id | cascade
19 | N | ci_project_monthly_usages | projects | project_id | cascade
-To match FK write one or many filters to match against FROM/TO/COLUMN:
-- scripts/decomposition/generate-loose-foreign-key <filter(s)...>
+To match foreign key (FK), write one or many filters to match against FROM/TO/COLUMN:
+- scripts/decomposition/generate-loose-foreign-key (filters...)
- scripts/decomposition/generate-loose-foreign-key ci_job_artifacts project_id
- scripts/decomposition/generate-loose-foreign-key dast_site_profiles_pipelines
```
@@ -593,7 +593,7 @@ Partitions: gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_84 FOR
The `partition` column controls the insert direction, the `partition` value determines which
partition will get the deleted rows inserted via the trigger. Notice that the default value of
the `partition` table matches with the value of the list partition (84). In `INSERT` query
-within the trigger thevalue of the `partition` is omitted, the trigger always relies on the
+within the trigger the value of the `partition` is omitted, the trigger always relies on the
default value of the column.
Example `INSERT` query for the trigger:
@@ -605,7 +605,7 @@ SELECT TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME, old_table.id FROM old_table;
```
The partition "sliding" process is controlled by two, regularly executed callbacks. These
-callbackes are defined within the `LooseForeignKeys::DeletedRecord` model.
+callbacks are defined within the `LooseForeignKeys::DeletedRecord` model.
The `next_partition_if` callback controls when to create a new partition. A new partition will
be created when the current partition has at least one record older than 24 hours. A new partition
@@ -805,7 +805,7 @@ Possible solutions:
- Long-term: invoke the worker more frequently. Parallelize the worker
For a one-time fix, we can run the cleanup worker several times from the rails console. The worker
-can run parallelly however, this can introduce lock contention and it could increase the worker
+can run in parallel; however, this can introduce lock contention and it could increase the worker
runtime.
```ruby
diff --git a/doc/development/database/migrations_for_multiple_databases.md b/doc/development/database/migrations_for_multiple_databases.md
index 0ec4612e985..ce326a6ce4a 100644
--- a/doc/development/database/migrations_for_multiple_databases.md
+++ b/doc/development/database/migrations_for_multiple_databases.md
@@ -33,7 +33,7 @@ Depending on the used constructs, we can classify migrations to be either:
Migrations cannot mix **DDL** and **DML** changes as the application requires the structure
(as described by `db/structure.sql`) to be exactly the same across all decomposed databases.
-### Data Definition Language (DDL)
+### Data Definition Language (DDL)
The DDL migrations are all migrations that:
@@ -43,7 +43,7 @@ The DDL migrations are all migrations that:
1. Add or remove a column with or without a default value (for example, `add_column`).
1. Create or drop trigger functions (for example, `create_trigger_function`).
1. Attach or detach triggers from tables (for example, `track_record_deletions`, `untrack_record_deletions`).
-1. Prepare or not async indexes (for example, `prepare_async_index`, `unprepare_async_index_by_name`).
+1. Prepare or unprepare asynchronous indexes (for example, `prepare_async_index`, `unprepare_async_index_by_name`).
As such DDL migrations **CANNOT**:
@@ -159,7 +159,7 @@ end
### The special purpose of `gitlab_shared`
-As described in [gitlab_schema](multiple_databases.md#the-special-purpose-of-gitlab_shared),
+As described in [`gitlab_schema`](multiple_databases.md#the-special-purpose-of-gitlab_shared),
the `gitlab_shared` tables are allowed to contain data across all databases. This implies
that such migrations should run across all databases to modify structure (DDL) or modify data (DML).
@@ -388,3 +388,32 @@ A Potential extension is to limit running DML migration only to specific environ
```ruby
restrict_gitlab_migration gitlab_schema: :gitlab_main, gitlab_env: :gitlab_com
```
+
+## Background migrations
+
+When you use:
+
+- Background migrations with `track_jobs` set to `true`, or
+- Batched background migrations
+
+the migration has to write to a jobs table. All of the
+jobs tables used by background migrations are marked as `gitlab_shared`.
+You can use these migrations when migrating tables in any database.
+
+However, when queuing the batches, you must set `restrict_gitlab_migration` based on the
+table you are iterating over. If you are updating all `projects`, for example, then you would set
+`restrict_gitlab_migration gitlab_schema: :gitlab_main`. If, however, you are
+updating all `ci_pipelines`, you would set
+`restrict_gitlab_migration gitlab_schema: :gitlab_ci`.
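+
+For example, a sketch of queueing a batched migration over `ci_pipelines`
+(class and job names are illustrative):
+
+```ruby
+class QueueBackfillCiPipelinesField < Gitlab::Database::Migration[2.0]
+  restrict_gitlab_migration gitlab_schema: :gitlab_ci
+
+  disable_ddl_transaction!
+
+  MIGRATION = 'BackfillCiPipelinesField'
+
+  def up
+    queue_batched_background_migration(MIGRATION, :ci_pipelines, :id, job_interval: 2.minutes)
+  end
+
+  def down
+    Gitlab::Database::BackgroundMigration::BatchedMigration
+      .for_configuration(MIGRATION, :ci_pipelines, :id, []).delete_all
+  end
+end
+```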
+
+As with all DML migrations, you cannot query another database outside of
+`restrict_gitlab_migration` or `gitlab_shared`. If you need to query another database,
+you'll likely need to separate these into two migrations somehow.
+
+Because the actual migration logic (not the queueing step) for background
+migrations runs in a Sidekiq worker, the logic can perform DML queries on
+tables in any database, just like any ordinary Sidekiq worker can.
+
+## How to determine `gitlab_schema` for a given table
+
+See [GitLab Schema](multiple_databases.md#gitlab-schema).
diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md
index 3b1b06b557c..c622d4f50ff 100644
--- a/doc/development/database/multiple_databases.md
+++ b/doc/development/database/multiple_databases.md
@@ -74,7 +74,14 @@ in GitLab 14.1. This feature is still under development, and is not ready for pr
### Configure single database
-By default, GDK is configured to run with multiple databases. To configure GDK to use a single database:
+By default, GDK is configured to run with multiple databases.
+
+WARNING:
+Switching back and forth between single and multiple databases in
+the same development instance is discouraged. Any data in the `ci`
+database is not accessible in single-database mode. Use a separate development instance for a single database.
+
+To configure GDK to use a single database:
1. On the GDK root directory, run:
@@ -519,7 +526,7 @@ ci_build.update!(updated_at: Time.current) # CI DB
ci_build.project.update!(updated_at: Time.current) # Main DB
```
-##### Async processing
+##### Asynchronous processing
If we need more guarantee that an operation finishes the work consistently we can execute it
within a background job. A background job is scheduled asynchronously and retried several times
@@ -579,58 +586,6 @@ ensures that we forbid destroying the parent object if something is not cleaned
If all you need to do is clean up the child records themselves from PostgreSQL,
consider using [loose foreign keys](loose_foreign_keys.md).
-## `config/database.yml`
-
-GitLab is adding support to run multiple databases, for example to
-[separate tables for the continuous integration features](https://gitlab.com/groups/gitlab-org/-/epics/6167)
-from the main database. In order to prepare for this change, we
-[validate the structure of the configuration](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/67877)
-in `database.yml` to ensure that only known databases are used.
-
-Previously, the `config/database.yml` looked like this:
-
-```yaml
-production:
- adapter: postgresql
- encoding: unicode
- database: gitlabhq_production
- ...
-```
-
-With the support for many databases this
-syntax is [deprecated](https://gitlab.com/gitlab-org/gitlab/-/issues/338182)
-and will be removed in [15.0](https://gitlab.com/gitlab-org/gitlab/-/issues/338182).
-
-The new `config/database.yml` needs to include a database name
-to define a database configuration. Only `main:` and `ci:` database
-names are supported. The `main:` database must always be a first
-entry in a hash. This change applies to decomposed and non-decomposed
-change. If an invalid or deprecated syntax is used the error
-or warning is printed during application start.
-
-```yaml
-# Non-decomposed database
-production:
- main:
- adapter: postgresql
- encoding: unicode
- database: gitlabhq_production
- ...
-
-# Decomposed database
-production:
- main:
- adapter: postgresql
- encoding: unicode
- database: gitlabhq_production
- ...
- ci:
- adapter: postgresql
- encoding: unicode
- database: gitlabhq_production_ci
- ...
-```
-
## Foreign keys that cross databases
There are many places where we use foreign keys that reference across the two
diff --git a/doc/development/database/pagination_guidelines.md b/doc/development/database/pagination_guidelines.md
index 3a772b10a6d..08840124535 100644
--- a/doc/development/database/pagination_guidelines.md
+++ b/doc/development/database/pagination_guidelines.md
@@ -172,7 +172,7 @@ From the user point of view, this might not be always noticeable. As the user pa
When requesting a large page number, the database needs to read `PAGE * PAGE_SIZE` rows. This makes offset pagination **unsuitable for large database tables**.
-Example: listing users on the Admin page
+Example: listing users on the Admin Area
Listing users with a very simple SQL query:
diff --git a/doc/development/database/strings_and_the_text_data_type.md b/doc/development/database/strings_and_the_text_data_type.md
index 4ed7cf1b4de..7aa529e1518 100644
--- a/doc/development/database/strings_and_the_text_data_type.md
+++ b/doc/development/database/strings_and_the_text_data_type.md
@@ -206,7 +206,7 @@ class ScheduleCapTitleLengthOnIssues < Gitlab::Database::Migration[1.0]
disable_ddl_transaction!
- class Issue < ActiveRecord::Base
+ class Issue < ::ApplicationRecord
include EachBatch
self.table_name = 'issues'
diff --git a/doc/development/database/table_partitioning.md b/doc/development/database/table_partitioning.md
index ec768136404..34cb73978bc 100644
--- a/doc/development/database/table_partitioning.md
+++ b/doc/development/database/table_partitioning.md
@@ -43,7 +43,7 @@ problem.
First, a table is partitioned on a partition key, which is a column or
set of columns which determine how the data will be split across the
partitions. The partition key is used by the database when reading or
-writing data, to decide which partition(s) need to be accessed. The
+writing data, to decide which partitions need to be accessed. The
partition key should be a column that would be included in a `WHERE`
clause on almost all queries accessing that table.