1 files changed, 259 insertions, 0 deletions
diff --git a/doc/development/database/table_partitioning.md b/doc/development/database/table_partitioning.md
new file mode 100644
index 00000000000..30d0b0a2f5b
--- /dev/null
+++ b/doc/development/database/table_partitioning.md
@@ -0,0 +1,259 @@
+---
+stage: Enablement
+group: Database
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+---
+
+# Database table partitioning
+
+Table partitioning is a powerful database feature that allows a table's
+data to be split into smaller physical tables that act as a single large
+table. If the application is designed to work with partitioning in mind,
+there can be multiple benefits, such as:
+
+- Query performance can be improved greatly, because the database can
+cheaply eliminate much of the data from the search space, while still
+providing full SQL capabilities.
+
+- Bulk deletes can be achieved with minimal impact on the database by
+dropping entire partitions. This is a natural fit for features that need
+to periodically delete data that falls outside the retention window.
+
+- Administrative tasks like `VACUUM` and index rebuilds can operate on
+individual partitions, rather than across a single massive table.
+
+Unfortunately, not all models fit a partitioning scheme, and there are
+significant drawbacks if implemented incorrectly. Additionally, tables
+can only be partitioned at their creation, making it nontrivial to apply
+partitioning to a busy database. A suite of migration tools are available
+to enable backend developers to partition existing tables, but the
+migration process is rather heavy, taking multiple steps split across
+several releases. Due to the limitations of partitioning and the related
+migrations, you should understand how partitioning fits your use case
+before attempting to leverage this feature.
+
+## Determining when to use partitioning
+
+While partitioning can be very useful when properly applied, it's
+imperative to identify if the data and workload of a table naturally fit a
+partitioning scheme. There are a few details you'll have to understand
+in order to decide if partitioning is a good fit for your particular
+problem.
+
+First, a table is partitioned on a partition key, which is a column or
+set of columns which determine how the data will be split across the
+partitions. The partition key is used by the database when reading or
+writing data, to decide which partition(s) need to be accessed. The
+partition key should be a column that would be included in a `WHERE`
+clause on almost all queries accessing that table.
+
+Second, it's necessary to understand the strategy the database will
+use to split the data across the partitions. The scheme supported by the
+GitLab migration helpers is date-range partitioning, where each partition
+in the table contains data for a single month. In this case, the partitioning
+key would need to be a timestamp or date column. In order for this type of
+partitioning to work well, most queries would need to access data within a
+certain date range.
+
+For a more concrete example, the `audit_events` table can be used, which
+was the first table to be partitioned in the application database
+(scheduled for deployment with the GitLab 13.5 release). This
+table tracks audit entries of security events that happen in the
+application. In almost all cases, users want to see audit activity that
+occurs in a certain timeframe. As a result, date-range partitioning
+was a natural fit for how the data would be accessed.
+
+To look at this in more detail, imagine a simplified `audit_events` schema:
+
+```sql
+CREATE TABLE audit_events (
+  id SERIAL NOT NULL PRIMARY KEY,
+  author_id INT NOT NULL,
+  details jsonb NOT NULL,
+  created_at timestamptz NOT NULL);
+```
+
+Now imagine typical queries in the UI would display the data within a
+certain date range, like a single week:
+
+```sql
+SELECT *
+FROM audit_events
+WHERE created_at >= '2020-01-01 00:00:00'
+  AND created_at < '2020-01-08 00:00:00'
+ORDER BY created_at DESC
+LIMIT 100
+```
+
+If the table is partitioned on the `created_at` column the base table would
+look like:
+
+```sql
+CREATE TABLE audit_events (
+  id SERIAL NOT NULL,
+  author_id INT NOT NULL,
+  details jsonb NOT NULL,
+  created_at timestamptz NOT NULL,
+  PRIMARY KEY (id, created_at))
+PARTITION BY RANGE(created_at);
+```
+
+NOTE: **Note:**
+The primary key of a partitioned table must include the partition key as
+part of the primary key definition.
+
+And we might have a list of partitions for the table, such as:
+
+```sql
+audit_events_202001 FOR VALUES FROM ('2020-01-01') TO ('2020-02-01')
+audit_events_202002 FOR VALUES FROM ('2020-02-01') TO ('2020-03-01')
+audit_events_202003 FOR VALUES FROM ('2020-03-01') TO ('2020-04-01')
+```
+
+Each partition is a separate physical table, with the same structure as
+the base `audit_events` table, but contains only data for rows where the
+partition key falls in the specified range. For example, the partition
+`audit_events_202001` contains rows where the `created_at` column is
+greater than or equal to `2020-01-01` and less than `2020-02-01`.
+
+Now, if we look at the previous example query again, the database can
+use the `WHERE` to recognize that all matching rows will be in the
+`audit_events_202001` partition. Rather than searching all of the data
+in all of the partitions, it can search only the single month's worth
+of data in the appropriate partition. In a large table, this can
+dramatically reduce the amount of data the database needs to access.
+However, imagine a query that does not filter based on the partitioning
+key, such as:
+
+```sql
+SELECT *
+FROM audit_events
+WHERE author_id = 123
+ORDER BY created_at DESC
+LIMIT 100
+```
+
+In this example, the database can't prune any partitions from the search,
+because matching data could exist in any of them. As a result, it has to
+query each partition individually, and aggregate the rows into a single result
+set. Since `author_id` would be indexed, the performance impact could
+likely be acceptable, but on more complex queries the overhead can be
+substantial. Partitioning should only be leveraged if the access patterns
+of the data support the partitioning strategy, otherwise performance will
+suffer.
+
+## Partitioning a table
+
+Unfortunately, tables can only be partitioned at their creation, making
+it nontrivial to apply to a busy database. A suite of migration
+tools have been developed to enable backend developers to partition
+existing tables. This migration process takes multiple steps which must
+be split across several releases.
+
+### Caveats
+
+The partitioning migration helpers work by creating a partitioned duplicate
+of the original table and using a combination of a trigger and a background
+migration to copy data into the new table. Changes to the original table
+schema can be made in parallel with the partitioning migration, but they
+must take care to not break the underlying mechanism that makes the migration
+work. For example, if a column is added to the table that is being
+partitioned, both the partitioned table and the trigger definition need to
+be updated to match.
+
+### Step 1: Creating the partitioned copy (Release N)
+
+The first step is to add a migration to create the partitioned copy of
+the original table. This migration will also create the appropriate
+partitions based on the data in the original table, and install a
+trigger that will sync writes from the original table into the
+partitioned copy.
+
+An example migration of partitioning the `audit_events` table by its
+`created_at` column would look like:
+
+```ruby
+class PartitionAuditEvents < ActiveRecord::Migration[6.0]
+  include Gitlab::Database::PartitioningMigrationHelpers
+
+  def up
+    partition_table_by_date :audit_events, :created_at
+  end
+
+  def down
+    drop_partitioned_table_for :audit_events
+  end
+end
+```
+
+Once this has executed, any inserts, updates or deletes in the
+original table will also be duplicated in the new table. For updates and
+deletes, the operation will only have an effect if the corresponding row
+exists in the partitioned table.
+
+### Step 2: Backfill the partitioned copy (Release N)
+
+The second step is to add a post-deployment migration that will schedule
+the background jobs that will backfill existing data from the original table
+into the partitioned copy.
+
+Continuing the above example, the migration would look like:
+
+```ruby
+class BackfillPartitionAuditEvents < ActiveRecord::Migration[6.0]
+  include Gitlab::Database::PartitioningMigrationHelpers
+
+  def up
+    enqueue_partitioning_data_migration :audit_events
+  end
+
+  def down
+    cleanup_partitioning_data_migration :audit_events
+  end
+end
+```
+
+This step uses the same mechanism as any background migration, so you
+may want to read the [Background Migration](../background_migrations.md)
+guide for details on that process. Background jobs are scheduled every
+2 minutes and copy `50_000` records at a time, which can be used to
+estimate the timing of the background migration portion of the
+partitioning migration.
+
+### Step 3: Post-backfill cleanup (Release N+1)
+
+The third step must occur at least one release after the release that
+includes the background migration. This gives time for the background
+migration to execute properly in self-managed installations. In this step,
+add another post-deployment migration that will cleanup after the
+background migration. This includes forcing any remaining jobs to
+execute, and copying data that may have been missed, due to dropped or
+failed jobs.
+
+Once again, continuing the example, this migration would look like:
+
+```ruby
+class CleanupPartitionedAuditEventsBackfill < ActiveRecord::Migration[6.0]
+  include Gitlab::Database::PartitioningMigrationHelpers
+
+  def up
+    finalize_backfilling_partitioned_table :audit_events
+  end
+
+  def down
+    # no op
+  end
+end
+```
+
+After this migration has completed, the original table and partitioned
+table should contain identical data. The trigger installed on the
+original table guarantees that the data will remain in sync going
+forward.
+
+### Step 4: Swap the partitioned and non-partitioned tables (Release N+1)
+
+The final step of the migration will make the partitioned table ready
+for use by the application. This section will be updated when the
+migration helper is ready, for now development can be followed in the
+[Tracking Issue](https://gitlab.com/gitlab-org/gitlab/-/issues/241267).