diff options
Diffstat (limited to 'doc/development/cicd')
-rw-r--r-- | doc/development/cicd/cicd_tables.md | 121 | ||||
-rw-r--r-- | doc/development/cicd/index.md | 3 | ||||
-rw-r--r-- | doc/development/cicd/templates.md | 1 |
3 files changed, 123 insertions, 2 deletions
diff --git a/doc/development/cicd/cicd_tables.md b/doc/development/cicd/cicd_tables.md new file mode 100644 index 00000000000..b246328a817 --- /dev/null +++ b/doc/development/cicd/cicd_tables.md @@ -0,0 +1,121 @@ +--- +stage: Verify +group: Pipeline Execution +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/product/ux/technical-writing/#assignments +--- + +# Add new tables to the CI database + +The [pipeline data partitioning](../../architecture/blueprints/ci_data_decay/pipeline_partitioning.md) +design blueprint describes how to partition existing tables in the CI domain. However, +you still need to add tables for new features. Sometimes these tables hold +references to larger tables that need to be partitioned. To reduce future +work, all tables that use a `belongs_to` association to partitionable tables +should be partitioned from the start. + +## Create a new routing table + +Here is an example on how to use database helpers to create a new table and foreign keys: + +```ruby + include Gitlab::Database::PartitioningMigrationHelpers + disable_ddl_transaction! + + def up + create_table(:p_ci_examples, primary_key: [:id, :partition_id], options: 'PARTITION BY LIST (partition_id)', if_not_exists: true) do |t| + t.bigserial :id, null: false + t.bigint :partition_id, null: false + t.bigint :build_id, null: false + end + + add_concurrent_partitioned_foreign_key( + :p_ci_examples, :ci_builds, + column: [:partition_id, :build_id], + target_column: [:partition_id, :id], + on_update: :cascade, + on_delete: :cascade, + reverse_lock_order: true + ) + end + + def down + drop_table :p_ci_examples + end +``` + +This table is called a routing table and it does not hold any data. The +data is stored in partitions. + +When creating the routing table: + +- The table name must start with the `p_` prefix. There are analyzers in place to ensure that all queries go + through the routing tables and do not access the partitions directly. +- Each new table needs a `partition_id` column and its value must equal + the value from the related association. In this example, that is `ci_builds`. All resources + belonging to a pipeline share the same `partition_id` value. +- The primary key must have the columns ordered this way to allow efficient + search only by `id`. +- The foreign key constraint must include the `ON UPDATE CASCADE` option because + the `partition_id` value should be able to update it for re-balancing the + partitions. + +## Create the first partition + +Usually, you rely on the application to create the initial partition at boot time. +However, due to the high traffic on the CI tables and the large number of nodes, +it can be difficult to acquire a lock on the referenced table. +Consequently, during deployment, a node may fail to start. +To prevent this failure, you must ensure that the partition is already in place before +the application runs: + +```ruby + disable_ddl_transaction! + + def up + with_lock_retries do + connection.execute(<<~SQL) + LOCK TABLE ci_builds IN SHARE UPDATE EXCLUSIVE MODE; + LOCK TABLE ONLY p_ci_examples IN ACCESS EXCLUSIVE MODE; + SQL + + connection.execute(<<~SQL) + CREATE TABLE IF NOT EXISTS gitlab_partitions_dynamic.ci_examples_100 + PARTITION OF p_ci_examples + FOR VALUES IN (100); + SQL + end + end +``` + +Partitions are created in `gitlab_partitions_dynamic` schema. + +When creating a partition, remember: + +- Partition names do not use the `p_` prefix. +- The default value for `partition_id` is `100`. + +## Cascade the partition value + +To cascade the partition value, the module should use the `Ci::Partitionable` module: + +```ruby +class Ci::Example < Ci::ApplicationRecord + include Ci::Partitionable + + self.table_name = :p_ci_examples + self.primary_key = :id + + belongs_to :build, class_name: 'Ci::Build' + partitionable scope: :build, partitioned: true +end +``` + +## Manage partitions + +The model must be included in the [`PARTITIONABLE_MODELS`](https://gitlab.com/gitlab-org/gitlab/-/blob/920147293ae304639915f66b260dc14e4f629850/app/models/concerns/ci/partitionable.rb#L25-44) +list because it is used to test that the `partition_id` is +propagated correctly. + +If it's missing, specifying `partitioned: true` creates the first partition. The model also needs to be registered in the +[`postgres_partitioning.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/920147293ae304639915f66b260dc14e4f629850/config/initializers/postgres_partitioning.rb) +initializer. diff --git a/doc/development/cicd/index.md b/doc/development/cicd/index.md index 641d0ea4f6f..2cc8fca02dd 100644 --- a/doc/development/cicd/index.md +++ b/doc/development/cicd/index.md @@ -5,7 +5,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w type: index, concepts, howto --- -# CI/CD development documentation +# CI/CD development guidelines Development guides that are specific to CI/CD are listed here: @@ -147,6 +147,7 @@ This API endpoint runs [`Ci::RegisterJobService`](https://gitlab.com/gitlab-org/ There are 3 top level queries that this service uses to gather the majority of the jobs and they are selected based on the level where the runner is registered to: - Select jobs for shared runner (instance level) + - Utilizes a fair scheduling algorithm which prioritizes projects with fewer running builds - Select jobs for group runner - Select jobs for project runner diff --git a/doc/development/cicd/templates.md b/doc/development/cicd/templates.md index 4f6799d12d8..1bf4a780e26 100644 --- a/doc/development/cicd/templates.md +++ b/doc/development/cicd/templates.md @@ -436,7 +436,6 @@ To add a metric definition for a new template: ```yaml - name: p_ci_templates_my_template_name category: ci_templates - redis_slot: ci_templates aggregation: weekly ``` |