author     GitLab Bot <gitlab-bot@gitlab.com>  2022-06-20 11:10:13 +0000
committer  GitLab Bot <gitlab-bot@gitlab.com>  2022-06-20 11:10:13 +0000
commit     0ea3fcec397b69815975647f5e2aa5fe944a8486 (patch)
tree       7979381b89d26011bcf9bdc989a40fcc2f1ed4ff /doc/development
parent     72123183a20411a36d607d70b12d57c484394c8e (diff)
download   gitlab-ce-0ea3fcec397b69815975647f5e2aa5fe944a8486.tar.gz
Add latest changes from gitlab-org/gitlab@15-1-stable-ee (tag: v15.1.0-rc42)
Diffstat (limited to 'doc/development')
178 files changed, 3937 insertions, 1352 deletions
diff --git a/doc/development/adding_database_indexes.md b/doc/development/adding_database_indexes.md
index 35dbd80e4d1..f524b04c6eb 100644
--- a/doc/development/adding_database_indexes.md
+++ b/doc/development/adding_database_indexes.md
@@ -1,5 +1,5 @@
 ---
-stage: Enablement
+stage: Data Stores
 group: Database
 info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
 ---
@@ -32,14 +32,14 @@ data and are slower to update compared to B-tree indexes.
 Because of all this one should not blindly add a new index for every column
 used to filter data by. Instead one should ask themselves the following
 questions:
 
-1. Can I write my query in such a way that it re-uses as many existing indexes
+1. Can you write your query in such a way that it re-uses as many existing indexes
    as possible?
-1. Is the data going to be large enough that using an index will actually be
+1. Is the data large enough that using an index is actually
    faster than just iterating over the rows in the table?
 1. Is the overhead of maintaining the index worth the reduction in query
    timings?
 
-We'll explore every question in detail below.
+We explore every question in detail below.
 
 ## Re-using Queries
 
@@ -54,7 +54,7 @@ AND state = 'open';
 ```
 
 Now imagine we already have an index on the `user_id` column but not on the
-`state` column. One may think this query will perform badly due to `state` being
+`state` column. One may think this query performs badly due to `state` being
 unindexed. In reality the query may perform just fine given the index on
 `user_id` can filter out enough rows.
 
@@ -85,8 +85,8 @@ enough rows you may _not_ want to add a new index.
 ## Maintenance Overhead
 
 Indexes have to be updated on every table write. In case of PostgreSQL _all_
-existing indexes will be updated whenever data is written to a table. As a
-result of this having many indexes on the same table will slow down writes.
+existing indexes are updated whenever data is written to a table. As a
+result of this having many indexes on the same table slows down writes.
 
 Because of this one should ask themselves: is the reduction in query
 performance worth the overhead of maintaining an extra index?
@@ -184,8 +184,8 @@ def up
 end
 ```
 
-The call to `index_exists?` will return true if **any** index exists on
-`:my_table` and `:my_column`, and index creation will be bypassed.
+The call to `index_exists?` returns true if **any** index exists on
+`:my_table` and `:my_column`, and index creation is bypassed.
 
 The `add_concurrent_index` helper is a requirement for creating indexes
 on populated tables. Since it cannot be used inside a transactional
@@ -285,7 +285,7 @@ production clone.
 
 After the index is verified to exist on the production database, create a second
 merge request that adds the index synchronously. The schema changes must be
-updated and committed to `structure.sql` in this second merge request.
+updated and committed to `structure.sql` in this second merge request. The
+synchronous migration results in a no-op on GitLab.com, but you should still add
+the migration as expected for other installations.
+
+The below block demonstrates how to create the second migration for the previous
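The added sentence above is cut off at the hunk boundary. For context, a synchronous follow-up migration of the kind this passage describes might look like the sketch below; the class, table, column, and index names are illustrative placeholders, not content from the diff:

```ruby
# Hypothetical second migration: add the index synchronously once the
# asynchronous index has been verified on the production database.
class AddMyColumnIndexSynchronously < Gitlab::Database::Migration[2.0]
  INDEX_NAME = 'index_my_table_on_my_column'

  # add_concurrent_index cannot run inside a transaction.
  disable_ddl_transaction!

  def up
    add_concurrent_index :my_table, :my_column, name: INDEX_NAME
  end

  def down
    remove_concurrent_index_by_name :my_table, INDEX_NAME
  end
end
```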
diff --git a/doc/development/adding_service_component.md b/doc/development/adding_service_component.md
index f5acf0d26eb..51c6e86bb49 100644
--- a/doc/development/adding_service_component.md
+++ b/doc/development/adding_service_component.md
@@ -23,7 +23,7 @@ The following outline re-uses the [maturity metric](https://about.gitlab.com/dir
   - [Release management](#release-management)
   - [Enabled on GitLab.com](feature_flags/controls.md#enabling-a-feature-for-gitlabcom)
 - Complete
-  - [Configurable by the GitLab orchestrator](https://gitlab.com/gitlab-org/gitlab-orchestrator)
+  - [Configurable by the GitLab Environment Toolkit](https://gitlab.com/gitlab-org/gitlab-environment-toolkit)
 - Lovable
   - Enabled by default for the majority of users
 
@@ -47,7 +47,7 @@ Adding a new service follows the same [merge request workflow](contributing/merg
 
 The first iteration should be to add the ability to connect and use the service as an externally installed component. Often this involves providing settings in GitLab to connect to the service, or allow connections from it. And then shipping documentation on how to install and configure the service with GitLab.
 
-[Elasticsearch](../integration/elasticsearch.md#install-elasticsearch) is an example of a service that has been integrated this way. Many of the other services, including internal projects like Gitaly, started off as separately installed alternatives.
+[Elasticsearch](../integration/advanced_search/elasticsearch.md#install-elasticsearch) is an example of a service that has been integrated this way. Many of the other services, including internal projects like Gitaly, started off as separately installed alternatives.
 
 **For services that depend on the existing GitLab codebase:**
diff --git a/doc/development/api_graphql_styleguide.md b/doc/development/api_graphql_styleguide.md
index f807ed0f85e..de6840b2c6c 100644
--- a/doc/development/api_graphql_styleguide.md
+++ b/doc/development/api_graphql_styleguide.md
@@ -94,11 +94,16 @@ discussed in [Nullable fields](#nullable-fields).
 - Lowering the global limits for query complexity and depth.
 - Anything else that can result in queries hitting a limit that previously was allowed.
 
-Fields that use the [`feature_flag` property](#feature_flag-property) and the flag is disabled by default are exempt
-from the deprecation process, and can be removed at any time without notice.
-
 See the [deprecating schema items](#deprecating-schema-items) section for how to deprecate items.
 
+### Breaking change exemptions
+
+Two scenarios exist where schema items are exempt from the deprecation process,
+and can be removed or changed at any time without notice. These are schema items that either:
+
+- Use the [`feature_flag` property](#feature_flag-property) _and_ the flag is disabled by default.
+- Are [marked as alpha](#marking-schema-items-as-alpha).
+
 ## Global IDs
 
 The GitLab GraphQL API uses Global IDs (i.e: `"gid://gitlab/MyObject/123"`)
@@ -718,6 +723,28 @@ aware of the support.
 
 The documentation will mention that the old Global ID style is now deprecated.
 
+## Marking schema items as Alpha
+
+Fields, arguments, enum values, and mutations can be marked as being in
+[alpha](https://about.gitlab.com/handbook/product/gitlab-the-product/#alpha-beta-ga).
+
+An item marked as "alpha" is exempt from the deprecation process and can be removed
+at any time without notice.
+
+This leverages GraphQL deprecations to cause the schema item to appear as deprecated,
+and will be described as being in "alpha" in our generated docs and its GraphQL description.
+
+To mark a schema item as being in "alpha", use the `deprecated:` keyword with `reason: :alpha`.
+You must provide the `milestone:` that introduced the alpha item.
+
+For example:
+
+```ruby
+field :token, GraphQL::Types::String, null: true,
+      deprecated: { reason: :alpha, milestone: '10.0' },
+      description: 'Token for login.'
+```
+
 ## Enums
 
 GitLab GraphQL enums are defined in `app/graphql/types`. When defining new enums, the
@@ -1848,35 +1875,59 @@ field :created_at, Types::TimeType, null: true, description: 'Timestamp of when
 ## Testing
 
-### Writing unit tests
+For testing mutations and resolvers, consider the unit of
+test a full GraphQL request, not a call to a resolver. The reasons for this are
+that we want to avoid lots of coupling to the framework, since this makes
+upgrades to dependencies much more difficult.
 
-Before creating unit tests, review the following examples:
+You should:
 
-- [`spec/graphql/resolvers/users_resolver_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/graphql/resolvers/users_resolver_spec.rb)
-- [`spec/graphql/mutations/issues/create_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/graphql/mutations/issues/create_spec.rb)
+- Prefer request specs (either using the full API endpoint or going through
+  `GitlabSchema.execute`) to unit specs for resolvers and mutations.
+- Prefer `GraphqlHelpers#execute_query` and `GraphqlHelpers#run_with_clean_state` to
+  `GraphqlHelpers#resolve` and `GraphqlHelpers#resolve_field`.
 
-It's faster to test as much of the logic from your GraphQL queries and mutations
-with unit tests, which are stored in `spec/graphql`.
+For example:
 
-Use unit tests to verify that:
+```ruby
+# Good:
+gql_query = %q(some query text...)
+post_graphql(gql_query, current_user: current_user)
+# or:
+GitlabSchema.execute(gql_query, context: { current_user: current_user })
 
-- Types have the expected fields.
-- Resolvers and mutations apply authorizations and return expected data.
-- Edge cases are handled correctly.
+# Deprecated: avoid
+resolve(described_class, obj: project, ctx: { current_user: current_user })
+```
+
+### Writing unit tests (deprecated)
+
+WARNING:
+Avoid writing unit tests if the same thing can be tested with
+a full GraphQL request.
+
+Before creating unit tests, review the following examples:
+
+- [`spec/graphql/resolvers/users_resolver_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/graphql/resolvers/users_resolver_spec.rb)
+- [`spec/graphql/mutations/issues/create_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/graphql/mutations/issues/create_spec.rb)
 
 ### Writing integration tests
 
 Integration tests check the full stack for a GraphQL query or mutation and are stored in
 `spec/requests/api/graphql`.
 
-For speed, you should test most logic in unit tests instead of integration tests.
-However, integration tests that check if data is returned verify the following
+For speed, consider calling `GitlabSchema.execute` directly, or making use
+of smaller test schemas that only contain the types under test.
+
+However, full request integration tests that check if data is returned verify the following
 additional items:
 
 - The mutation is actually queryable in the schema (was mounted in `MutationType`).
 - The data returned by a resolver or mutation correctly matches the
   [return types](https://graphql-ruby.org/fields/introduction.html#field-return-type)
   of the fields and resolves without errors.
+- The arguments coerce correctly on input, and the fields serialize correctly
+  on output.
 
 Integration tests can also verify the following items, because they invoke the
 full stack:
@@ -1929,6 +1980,55 @@ end
 
 ### Testing tips and tricks
 
+- Become familiar with the methods in the `GraphqlHelpers` support module.
+  Many of these methods make writing GraphQL tests easier.
+
+- Use traversal helpers like `GraphqlHelpers#graphql_data_at` and
+  `GraphqlHelpers#graphql_dig_at` to access result fields. For example:
+
+  ```ruby
+  result = GitlabSchema.execute(query)
+
+  mr_iid = graphql_dig_at(result.to_h, :data, :project, :merge_request, :iid)
+  ```
+
+- Use `GraphqlHelpers#a_graphql_entity_for` to match against results.
+  For example:
+
+  ```ruby
+  post_graphql(some_query)
+
+  # checks that it is a hash containing { id => global_id_of(issue) }
+  expect(graphql_data_at(:project, :issues, :nodes))
+    .to contain_exactly(a_graphql_entity_for(issue))
+
+  # Additional fields can be passed, either as names of methods, or with values
+  expect(graphql_data_at(:project, :issues, :nodes))
+    .to contain_exactly(a_graphql_entity_for(issue, :iid, :title, created_at: some_time))
+  ```
+
+- Use `GraphqlHelpers#empty_schema` to create an empty schema, rather than creating
+  one by hand. For example:
+
+  ```ruby
+  # good
+  let(:schema) { empty_schema }
+
+  # bad
+  let(:query_type) { GraphQL::ObjectType.new }
+  let(:schema) { GraphQL::Schema.define(query: query_type, mutation: nil)}
+  ```
+
+- Use `GraphqlHelpers#query_double(schema: nil)` instead of `double('query', schema: nil)`. For example:
+
+  ```ruby
+  # good
+  let(:query) { query_double(schema: GitlabSchema) }
+
+  # bad
+  let(:query) { double('Query', schema: GitlabSchema) }
+  ```
+
 - Avoid false positives:
 
   Authenticating a user with the `current_user:` argument for `post_graphql`
@@ -1983,6 +2083,122 @@ end
   `spec/requests/api/graphql/ci/pipeline_spec.rb` regardless of the query being
   used to fetch the pipeline data.
 
+- There can be possible cyclic dependencies within our GraphQL types.
+  See [Adding field with resolver on a Type causes "Can't determine the return type " error on a different Type](https://github.com/rmosolgo/graphql-ruby/issues/3974#issuecomment-1084444214)
+  and [Fix unresolved name due to cyclic definition](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/84202/diffs#diff-content-32d14251082fd45412e1fdbf5820e62d157e70d2).
+
+  In particular, this can happen with `connection_type`. Normally we might use the following in a resolver:
+
+  ```ruby
+  type Types::IssueType.connection_type, null: true
+  ```
+
+  However this might cause a cyclic definition, which can result in errors like:
+
+  ```ruby
+  NameError: uninitialized constant Resolvers::GroupIssuesResolver
+  ```
+
+  To fix this, we must create a new file that encapsulates the connection type,
+  and then reference it using double quotes. This gives a delayed resolution,
+  and the proper connection type.
+  For example:
+
+  ```ruby
+  module Types
+    # rubocop: disable Graphql/AuthorizeTypes
+    class IssueConnectionType < CountableConnectionType
+    end
+  end
+
+  Types::IssueConnectionType.prepend_mod_with('Types::IssueConnectionType')
+  ```
+
+  in [types/issue_connection_type.rb](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/graphql/types/issue_connection_type.rb)
+  defines a new `Types::IssueConnectionType`, and is then referenced in
+  [app/graphql/resolvers/base_issues_resolver.rb](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/graphql/resolvers/base_issues_resolver.rb)
+
+  ```ruby
+  type "Types::IssueConnection", null: true
+  ```
+
+  Only use this style if you are having spec failures. This is not intended to be a new
+  pattern that we use. This issue may disappear after we've upgraded to `2.x`.
+
+- There can be instances where a spec fails because the class is not loaded correctly.
+  It relates to the
+  [circular dependencies problem](https://github.com/rmosolgo/graphql-ruby/issues/1929) and
+  [Adding field with resolver on a Type causes "Can't determine the return type " error on a different Type](https://github.com/rmosolgo/graphql-ruby/issues/3974).
+
+  Unfortunately, the errors generated don't really indicate what the problem is. For example,
+  remove the quotes from the `RSpec.describe` in
+  [ee/spec/graphql/resolvers/compliance_management/merge_requests/compliance_violation_resolver_spec.rb](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/spec/graphql/resolvers/compliance_management/merge_requests/compliance_violation_resolver_spec.rb).
+  Then run `rspec ee/spec/graphql/resolvers/compliance_management/merge_requests/compliance_violation_resolver_spec.rb`.
+
+  This generates errors with the expectations. For example:
+
+  ```ruby
+  1) Resolvers::ComplianceManagement::MergeRequests::ComplianceViolationResolver#resolve user is authorized filtering the results when given an array of project IDs finds the filtered compliance violations
+     Failure/Error: expect(subject).to contain_exactly(compliance_violation)
+
+       expected collection contained:  [#<MergeRequests::ComplianceViolation id: 4, violating_user_id: 26, merge_request_id: 4, reason: "approved_by_committer", severity_level: "low">]
+       actual collection contained:    [#<MergeRequests::ComplianceViolation id: 4, violating_user_id: 26, merge_request_id: 4, reason: "app...er_id: 27, merge_request_id: 5, reason: "approved_by_merge_request_author", severity_level: "high">]
+       the extra elements were:        [#<MergeRequests::ComplianceViolation id: 5, violating_user_id: 27, merge_request_id: 5, reason: "approved_by_merge_request_author", severity_level: "high">]
+     # ./ee/spec/graphql/resolvers/compliance_management/merge_requests/compliance_violation_resolver_spec.rb:55:in `block (6 levels) in <top (required)>'
+  ```
+
+  However, this is not a case of the wrong result being generated, it's because of the loading order
+  of the `ComplianceViolationResolver` class.
+
+  The only way we've found to fix this is by quoting the class name in the spec. For example, changing
+
+  ```ruby
+  RSpec.describe Resolvers::ComplianceManagement::MergeRequests::ComplianceViolationResolver do
+  ```
+
+  into:
+
+  ```ruby
+  RSpec.describe 'Resolvers::ComplianceManagement::MergeRequests::ComplianceViolationResolver' do
+  ```
+
+  See [this merge request](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/87295#note_946174036) for some discussion.
+
+  Only use this style if you are having spec failures.
+  This is not intended to be a new
+  pattern that we use. This issue may disappear after we've upgraded to `2.x`.
+
+- When testing resolvers using `GraphqlHelpers#resolve`, arguments for the resolver can be handled two ways.
+
+  1. 95% of the resolver specs use arguments that are Ruby objects, as opposed to when using the GraphQL API
+     only strings and integers are used. This works fine in most cases.
+  1. If your resolver takes arguments that use a `prepare` proc, such as a resolver that accepts timeframe
+     arguments (`TimeFrameArguments`), you must pass the `arg_style: :internal_prepared` parameter into
+     the `resolve` method. This tells the code to convert the arguments into strings and integers and pass
+     them through regular argument handling, ensuring that the `prepare` proc is called correctly.
+     For example in [`iterations_resolver_spec.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/spec/graphql/resolvers/iterations_resolver_spec.rb):
+
+     ```ruby
+     def resolve_group_iterations(args = {}, obj = group, context = { current_user: current_user })
+       resolve(described_class, obj: obj, args: args, ctx: context, arg_style: :internal_prepared)
+     end
+     ```
+
+     One additional caveat is that if you are passing enums as a resolver argument, you must use the
+     external representation of the enum, rather than the internal. For example:
+
+     ```ruby
+     # good
+     resolve_group_iterations({ search: search, in: ['CADENCE_TITLE'] })
+
+     # bad
+     resolve_group_iterations({ search: search, in: [:cadence_title] })
+     ```
+
+  The use of `:internal_prepared` was added as a bridge for the
+  [GraphQL gem](https://graphql-ruby.org) upgrade. Testing resolvers directly will be
+  [removed eventually](https://gitlab.com/gitlab-org/gitlab/-/issues/363121),
+  and writing unit tests for resolvers/mutations is
+  [already deprecated](#writing-unit-tests-deprecated).
 
 ## Notes about Query flow and GraphQL infrastructure
 
 The GitLab GraphQL infrastructure can be found in `lib/gitlab/graphql`.
diff --git a/doc/development/application_limits.md b/doc/development/application_limits.md
index c4146b5af3e..6c7213ab235 100644
--- a/doc/development/application_limits.md
+++ b/doc/development/application_limits.md
@@ -1,5 +1,5 @@
 ---
-stage: Enablement
+stage: Systems
 group: Distribution
 info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
 ---
@@ -189,3 +189,23 @@ the middleware level, this can be used at the controller or API level.
 
 See the `CheckRateLimit` concern for use in controllers. In other parts of the code
 the `Gitlab::ApplicationRateLimiter` module can be called directly.
+
+## Next rate limiting architecture
+
+In May 2022, we started working on the next iteration of our application
+limits framework using a forward-looking rate limiting architecture.
+
+We are working on defining new requirements and designing the next
+architecture, so if you need new functionality to add new limits, instead of
+building it right now, consider contributing to the
+[Rate Limiting Architecture Working Group](https://about.gitlab.com/company/team/structure/working-groups/rate-limit-architecture/).
+
+Examples of what features we might want to build into the next iteration of
+rate limiting architecture:
+
+1. Making it possible to define and override limits per namespace / per plan.
+1. Automatically generating documentation about what limits are implemented and
+   what the defaults are.
+1. Defining limits in a single place that is easy to find and explore.
+1. Soft and hard limits, with support for notifying users when a limit is
+   approaching.
diff --git a/doc/development/application_slis/index.md b/doc/development/application_slis/index.md
index 2834723fc01..8d7941865e1 100644
--- a/doc/development/application_slis/index.md
+++ b/doc/development/application_slis/index.md
@@ -39,8 +39,8 @@ for clarity, they define different metric names:
    1. `gitlab_sli:foo_apdex:success_total` for the number of successful
       measurements.
 1. `Gitlab::Metrics::Sli::ErrorRate.new('foo')` defines:
-   1. `gitlab_sli:foo_error_rate:total` for the total number of measurements.
-   1. `gitlab_sli:foo_error_rate:error_total` for the number of error
+   1. `gitlab_sli:foo:total` for the total number of measurements.
+   1. `gitlab_sli:foo:error_total` for the number of error
       measurements - as this is an error rate, it's more natural to talk about
       errors divided by the total.
diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index 486ef6d27fc..a61a891b096 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -442,9 +442,9 @@ Consul is a tool for service discovery and configuration. Consul is distributed,
 - [Project page](https://github.com/elastic/elasticsearch/)
 - Configuration:
-  - [Omnibus](../integration/elasticsearch.md)
-  - [Charts](../integration/elasticsearch.md)
-  - [Source](../integration/elasticsearch.md)
+  - [Omnibus](../integration/advanced_search/elasticsearch.md)
+  - [Charts](../integration/advanced_search/elasticsearch.md)
+  - [Source](../integration/advanced_search/elasticsearch.md)
   - [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit/blob/main/doc/howto/elasticsearch.md)
 - Layer: Core Service (Data)
 - GitLab.com: [Get Advanced Search working on GitLab.com (Closed)](https://gitlab.com/groups/gitlab-org/-/epics/153) epic.
diff --git a/doc/development/audit_event_guide/index.md b/doc/development/audit_event_guide/index.md
index 0d62bcdc3b2..14cd2fd1dc3 100644
--- a/doc/development/audit_event_guide/index.md
+++ b/doc/development/audit_event_guide/index.md
@@ -14,6 +14,17 @@ new audit events.
 Audit Events are a tool for GitLab owners and administrators to view records of important
 actions performed across the application.
 
+## What should not be Audit Events?
+
+While any events could trigger an Audit Event, not all events should. In general, events that are not good candidates for audit events are:
+
+- Not attributable to one specific user.
+- Not of specific interest to an admin or owner persona.
+- Are tracking information for product feature adoption.
+- Are covered in the direction page's discussion on [what is not planned](https://about.gitlab.com/direction/manage/compliance/audit-events/#what-is-not-planned-right-now).
+
+If you have any questions, please reach out to `@gitlab-org/manage/compliance` to see if an Audit Event, or some other approach, may be best for your event.
+
 ## Audit Event Schemas
 
 To instrument an audit event, the following attributes should be provided:
diff --git a/doc/development/backend/ruby_style_guide.md b/doc/development/backend/ruby_style_guide.md
index 6c8125a6157..eff6ae7f217 100644
--- a/doc/development/backend/ruby_style_guide.md
+++ b/doc/development/backend/ruby_style_guide.md
@@ -9,7 +9,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 
 This is a GitLab-specific style guide for Ruby code.
 
-Generally, if a style is not covered by [existing Rubocop rules or style guides](../contributing/style_guides.md#ruby-rails-rspec), it shouldn't be a blocker.
+Generally, if a style is not covered by [existing RuboCop rules or style guides](../contributing/style_guides.md#ruby-rails-rspec), it shouldn't be a blocker.
 
 Before adding a new cop to enforce a given style, make sure to discuss it with your team.
 When the style is approved by a backend EM or by a BE staff eng, add a new section to this page to
 document the new rule. For every new guideline, add it in a new section and link the discussion from the section's
diff --git a/doc/development/build_test_package.md b/doc/development/build_test_package.md
index bd2d7545bfc..89b13efc1aa 100644
--- a/doc/development/build_test_package.md
+++ b/doc/development/build_test_package.md
@@ -1,5 +1,5 @@
 ---
-stage: Enablement
+stage: Systems
 group: Distribution
 info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
 ---
diff --git a/doc/development/cached_queries.md b/doc/development/cached_queries.md
index 8c69981b27a..b0bf7c7b6f5 100644
--- a/doc/development/cached_queries.md
+++ b/doc/development/cached_queries.md
@@ -1,5 +1,5 @@
 ---
-stage: Enablement
+stage: Data Stores
 group: Memory
 info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
 ---
@@ -93,7 +93,7 @@ below the query. You can see multiple duplicate cached queries in this modal win
 
 ![Performance Bar Cached Queries Modal](img/performance_bar_cached_queries.png)
 
-Click **...** to expand the actual stack trace:
+Select **...** to expand the actual stack trace:
 
 ```ruby
 [
diff --git a/doc/development/cascading_settings.md b/doc/development/cascading_settings.md
index 76ab2c6e693..56699ff5ffc 100644
--- a/doc/development/cascading_settings.md
+++ b/doc/development/cascading_settings.md
@@ -210,25 +210,13 @@ This function should be imported and called in the [page-specific JavaScript](fe
       = s_('Settings|Merge method')
     .gl-form-radio.custom-control.custom-radio
-      = f.radio_button :merge_method, :merge, class: "custom-control-input", disabled: merge_method_locked
-      = f.label :merge_method_merge, class: 'custom-control-label' do
-        = s_('Settings|Merge commit')
-        %p.help-text
-          = s_('Settings|Every merge creates a merge commit.')
+      = f.gitlab_ui_radio_component :merge_method, :merge, s_('Settings|Merge commit'), help_text: s_('Settings|Every merge creates a merge commit.'), radio_options: { disabled: merge_method_locked }
     .gl-form-radio.custom-control.custom-radio
-      = f.radio_button :merge_method, :rebase_merge, class: "custom-control-input", disabled: merge_method_locked
-      = f.label :merge_method_rebase_merge, class: 'custom-control-label' do
-        = s_('Settings|Merge commit with semi-linear history')
-        %p.help-text
-          = s_('Settings|Every merge creates a merge commit.')
+      = f.gitlab_ui_radio_component :merge_method, :rebase_merge, s_('Settings|Merge commit with semi-linear history'), help_text: s_('Settings|Every merge creates a merge commit.'), radio_options: { disabled: merge_method_locked }
     .gl-form-radio.custom-control.custom-radio
-      = f.radio_button :merge_method, :ff, class: "custom-control-input", disabled: merge_method_locked
-      = f.label :merge_method_ff, class: 'custom-control-label' do
-        = s_('Settings|Fast-forward merge')
-        %p.help-text
-          = s_('Settings|No merge commits are created.')
+      = f.gitlab_ui_radio_component :merge_method, :ff, s_('Settings|Fast-forward merge'), help_text: s_('Settings|No merge commits are created.'), radio_options: { disabled: merge_method_locked }
 
 = render 'shared/namespaces/cascading_settings/enforcement_checkbox',
   attribute: :merge_method,
diff --git a/doc/development/changelog.md b/doc/development/changelog.md
index c19c5b40382..83919bab671 100644
--- a/doc/development/changelog.md
+++ b/doc/development/changelog.md
@@ -11,7 +11,7 @@ file, as well as information and history about our changelog process.
 
 ## Overview
 
-Each bullet point, or **entry**, in our
+Each list item, or **entry**, in our
 [`CHANGELOG.md`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/CHANGELOG.md) file is
 generated from the subject line of a Git commit. Commits are included
 when they contain the `Changelog` [Git trailer](https://git-scm.com/docs/git-interpret-trailers).
diff --git a/doc/development/cicd/img/pipeline_wizard_sample_step1.png b/doc/development/cicd/img/pipeline_wizard_sample_step1.png
new file mode 100644
index 00000000000..77e5f07aad2
--- /dev/null
+++ b/doc/development/cicd/img/pipeline_wizard_sample_step1.png
Binary files differ
diff --git a/doc/development/cicd/img/pipeline_wizard_sample_step2.png b/doc/development/cicd/img/pipeline_wizard_sample_step2.png
new file mode 100644
index 00000000000..b59414d78f9
--- /dev/null
+++ b/doc/development/cicd/img/pipeline_wizard_sample_step2.png
Binary files differ
diff --git a/doc/development/cicd/img/pipeline_wizard_sample_step3.png b/doc/development/cicd/img/pipeline_wizard_sample_step3.png
new file mode 100644
index 00000000000..42c995b64d6
--- /dev/null
+++ b/doc/development/cicd/img/pipeline_wizard_sample_step3.png
Binary files differ
diff --git a/doc/development/cicd/index.md b/doc/development/cicd/index.md
index 8677d5b08e3..e8e116037de 100644
--- a/doc/development/cicd/index.md
+++ b/doc/development/cicd/index.md
@@ -15,6 +15,12 @@ Development guides that are specific to CI/CD are listed here:
 
 See the [CI/CD YAML reference documentation guide](cicd_reference_documentation_guide.md) to learn how to update the [reference page](../../ci/yaml/index.md).
 
+## Examples of CI/CD usage
+
+We maintain a [`ci-sample-projects`](https://gitlab.com/gitlab-org/ci-sample-projects) group, with projects that showcase
+examples of `.gitlab-ci.yml` for different use cases of GitLab CI/CD. They also cover specific syntax that could
+be used for different scenarios.
+
 ## CI Architecture overview
 
 The following is a simplified diagram of the CI architecture. Some details are left out to focus on
diff --git a/doc/development/cicd/pipeline_wizard.md b/doc/development/cicd/pipeline_wizard.md
new file mode 100644
index 00000000000..608c21778c0
--- /dev/null
+++ b/doc/development/cicd/pipeline_wizard.md
@@ -0,0 +1,229 @@
+---
+stage: none
+group: Incubation Engineering
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Pipeline Wizard
+
+The Pipeline Wizard is a Vue frontend component that helps users create a
+pipeline by using input fields. The type of input fields and the form of the final
+pipeline is configured by a YAML template.
+
+The Pipeline Wizard expects a single template file that configures the user
+flow. The wizard is agnostic with regards to the contents of the file,
+so you can use the wizard to display a range of different flows.
+For example, there could be one template file for static sites,
+one for Docker images, one for mobile apps, and so on. As a first iteration,
+these templates are part of the GitLab source code.
+
+The template file defines multiple steps. The last step shown to the user is always
+the commit, and is not part of the template definition. An ideal user experience
+consists of 2-3 steps, for a total of 3-4 steps visible to the user.
+
+## Usage Example
+
+### Vue Component
+
+```vue
+<!-- ~/my_feature/my_component.vue -->
+
+<script>
+  import PipelineWizard from '~/pipeline_wizard/pipeline_wizard.vue'
+  import template from '~/pipeline_wizard/templates/my_template.yml';
+
+  export default {
+    name: "MyComponent",
+    components: { PipelineWizard },
+    data() {
+      return { template }
+    },
+    methods: {
+      onDone() {
+        // redirect
+      }
+    }
+  }
+</script>
+
+<template>
+  <pipeline-wizard :template="template"
+                   project-path="foo/bar"
+                   default-branch="main"
+                   @done="onDone" />
+</template>
+```
+
+### Template
+
+```yaml
+# ~/pipeline_wizard/templates/my_template.yml
+
+title: Set up my specific tech pipeline
+description: Here's two or three introductory sentences that help the user understand what this wizard is going to set up.
+steps:
+  # Step 1
+  - inputs:
+      # First input widget
+      - label: Select your build image
+        description: A Docker image that we can use to build your image
+        placeholder: node:lts
+        widget: text
+        target: $BUILD_IMAGE
+        required: true
+        pattern: "^(?:(?=[^:\/]{1,253})(?!-)[a-zA-Z0-9-]{1,63}(?<!-)(?:\.(?!-)[a-zA-Z0-9-]{1,63}(?<!-))*(?::[0-9]{1,5})?\/)?((?![._-])(?:[a-z0-9._-]*)(?<![._-])(?:\/(?![._-])[a-z0-9._-]*(?<![._-]))*)(?::(?![.-])[a-zA-Z0-9_.-]{1,128})?$"
+        invalid-feedback: Please enter a valid docker image
+
+      # Second input widget
+      - label: Installation Steps
+        description: "Enter the steps that need to run to set up a local build
+          environment, for example installing dependencies."
+        placeholder: npm ci
+        widget: list
+        target: $INSTALLATION_STEPS
+
+    # This is the template to copy to the final pipeline file and updated with
+    # the values input by the user. Comments are copied as-is.
+    template:
+      my-job:
+        # The Docker image that will be used to build your app
+        image: $BUILD_IMAGE
+
+        before_script: $INSTALLATION_STEPS
+
+        artifacts:
+          paths:
+            - foo
+
+  # Step 2
+  - inputs:
+      # This is the only input widget for this step
+      - label: Installation Steps
+        description: "Enter the steps that need to run to set up a local build
+          environment, for example installing dependencies."
+        placeholder: npm ci
+        widget: list
+        target: $INSTALLATION_STEPS
+
+    template:
+      # Functions that should be executed before the build script runs
+      before_script: $INSTALLATION_STEPS
+```
+
+### The result
+
+1. ![Step 1](img/pipeline_wizard_sample_step1.png)
+1. ![Step 2](img/pipeline_wizard_sample_step2.png)
+1. ![Step 3](img/pipeline_wizard_sample_step3.png)
+
+### The commit step
+
+The last step of the wizard is always the commit step. Users can commit the
+newly created file to the repository defined by the [wizard's props](#props).
+The user has the option to change the branch to commit to. A future iteration
+is planned to add the ability to create a MR from here.
+
+## Component API Reference
+
+### Props
+
+- `template` (required): The template content as an unparsed String. See
+  [Template file location](#template-file-location) for more information.
+
+- `project-path` (required): The full path of the project the final file
+  should be committed to.
+- `default-branch` (required): The branch that will be pre-selected during
+  the commit step. This can be changed by the user.
+- `default-filename` (optional, default: `.gitlab-ci.yml`): The filename
+  to be used for the file. This can be overridden in the template file.
+
+### Events
+
+- `done` - Emitted after the file has been committed. Use this to redirect the
+  user to the pipeline, for example.
+
+### Template file location
+
+Template files are normally stored as YAML files in `~/pipeline_wizard/templates/`.
+
+The `PipelineWizard` component expects the `template` property as an un-parsed `String`,
+and Webpack is configured to load `.yml` files from the above folder as strings.
+If you must load the file from a different place, make sure
+Webpack does not parse it as an Object.
+
+## Template Reference
+
+### Template
+
+In the root element of the template file, you can define the following properties:
+
+| Name | Required | Type | Description |
+|---------------|------------------------|--------|---------------------------------------------------------------------------------------|
+| `title` | **{check-circle}** Yes | string | The page title as displayed to the user. It becomes an `h1` heading above the wizard. |
+| `description` | **{check-circle}** Yes | string | The page description as displayed to the user. |
+| `filename` | **{dotted-circle}** No | string | The name of the file that is being generated. Defaults to `.gitlab-ci.yml`. |
+| `steps` | **{check-circle}** Yes | list | A list of [step definitions](#step-reference). |
+
+### `step` Reference
+
+A step makes up one page in a multi-step (or page) process. It consists of one or more
+related input fields that build a part of the final `.gitlab-ci.yml`.
+
+Steps include two properties:
+
+| Name | Required | Type | Description |
+|------------|------------------------|------|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| `template` | **{check-circle}** Yes | map | The raw YAML to deep-merge into the final `.gitlab-ci.yml`. This template section can contain variables denoted by a `$` sign that is replaced with the values from the input fields. |
+| `inputs` | **{check-circle}** Yes | list | A list of [input definitions](#input-reference). |
+
+### `input` Reference
+
+Each step can contain one or more `inputs`. For an ideal user experience, it should not
+contain more than three.
+
+The look and feel of the input, as well as the YAML type it produces (string, list, and so on)
+depends on the [`widget`](#widgets) used. [`widget: text`](#text) displays a text input
+and inserts the user's input as a string into the template. [`widget: list`](#list)
+displays one or more input fields and inserts a list.
+
+All `inputs` must have a `label`, `widget`, and optionally `target`, but most properties
+are dependent on the widget being used:
+
+| Name | Required | Type | Description |
+|----------|------------------------|--------|-----------------------------------------------------------------------------------------------------------------------------|
+| `label` | **{check-circle}** Yes | string | The label for the input field. |
+| `widget` | **{check-circle}** Yes | string | The [widget](#widgets) type to use for this input. |
+| `target` | **{dotted-circle}** No | string | The variable name inside the step's template that should be replaced with the value of the input field, for example `$FOO`. |
+
+### Widgets
+
+#### Text
+
+Use as `widget: text`. This inserts a `string` in the YAML file.
+
+| Name | Required | Type | Description |
+|-------------------|------------------------|---------|-----------------------|
+| `label` | **{check-circle}** Yes | string | The label for the input field. |
+| `description` | **{dotted-circle}** No | string | Help text related to the input field. |
+| `required` | **{dotted-circle}** No | boolean | Whether or not the user must provide a value before proceeding to the next step. `false` if not defined. |
+| `placeholder` | **{dotted-circle}** No | string | A placeholder for the input field. |
+| `pattern` | **{dotted-circle}** No | string | A regular expression that the user's input must match before they can proceed to the next step. |
+| `invalidFeedback` | **{dotted-circle}** No | string | Help text displayed when the pattern validation fails. |
+| `default` | **{dotted-circle}** No | string | The default value for the field. |
+| `id` | **{dotted-circle}** No | string | The input field ID is usually autogenerated but can be overridden by providing this property. |
+
+#### List
+
+Use as `widget: list`. This inserts a `list` in the YAML file.
+
+| Name | Required | Type | Description |
+|-------------------|------------------------|---------|-----------------------|
+| `label` | **{check-circle}** Yes | string | The label for the input field. |
+| `description` | **{dotted-circle}** No | string | Help text related to the input field. |
+| `required` | **{dotted-circle}** No | boolean | Whether or not the user must provide a value before proceeding to the next step. `false` if not defined. |
+| `placeholder` | **{dotted-circle}** No | string | A placeholder for the input field. |
+| `pattern` | **{dotted-circle}** No | string | A regular expression that the user's input must match before they can proceed to the next step. |
+| `invalidFeedback` | **{dotted-circle}** No | string | Help text displayed when the pattern validation fails. |
+| `default` | **{dotted-circle}** No | list | The default value for the list. |
+| `id` | **{dotted-circle}** No | string | The input field ID is usually autogenerated but can be overridden by providing this property. |
diff --git a/doc/development/cicd/schema.md b/doc/development/cicd/schema.md
index 0e456a25a7a..ee5b5e4359a 100644
--- a/doc/development/cicd/schema.md
+++ b/doc/development/cicd/schema.md
@@ -77,7 +77,7 @@ For example, this defines the `retry` keyword:
         }
       ]
     }
-  }
+  }
 }
 ```
@@ -106,7 +106,7 @@ under the topmost **properties** key.
       }
     },
   }
-  }
+  }
 }
 ```
diff --git a/doc/development/code_review.md b/doc/development/code_review.md
index 252bd1daf55..a6976271ddf 100644
--- a/doc/development/code_review.md
+++ b/doc/development/code_review.md
@@ -15,35 +15,33 @@ code is effective, understandable, maintainable, and secure.
 
 ## Getting your merge request reviewed, approved, and merged
 
-You are strongly encouraged to get your code **reviewed** by a
-[reviewer](https://about.gitlab.com/handbook/engineering/workflow/code-review/#reviewer) as soon as
-there is any code to review, to get a second opinion on the chosen solution and
-implementation, and an extra pair of eyes looking for bugs, logic problems, or
-uncovered edge cases.
-
-The default approach is to choose a reviewer from your group or team for the first review.
-This is only a recommendation and the reviewer may be from a different team.
-However, it is recommended to pick someone who is a [domain expert](#domain-experts).
-If your merge request touches more than one domain (for example, Dynamic Analysis and GraphQL), ask for reviews from an expert from each domain.
+Before you begin:
 
-You can read more about the importance of involving reviewers in the section on the responsibility of the author below.
+- Familiarize yourself with the [contribution acceptance criteria](contributing/merge_request_workflow.md#contribution-acceptance-criteria).
+- If you need some guidance (for example, if it's your first merge request), feel free to ask
+  one of the [Merge request coaches](https://about.gitlab.com/company/team/?department=merge-request-coach).
 
-If you need some guidance (for example, it's your first merge request), feel free to ask
-one of the [Merge request coaches](https://about.gitlab.com/company/team/).
+As soon as you have code to review, have the code **reviewed** by a [reviewer](https://about.gitlab.com/handbook/engineering/workflow/code-review/#reviewer).
+This reviewer can be from your group or team, or a [domain expert](#domain-experts).
+The reviewer can:
 
-If you need assistance with security scans or comments, feel free to include the
-Application Security Team (`@gitlab-com/gl-security/appsec`) in the review.
+- Give you a second opinion on the chosen solution and implementation.
+- Help look for bugs, logic problems, or uncovered edge cases.
 
-Depending on the areas your merge request touches, it must be **approved** by one
-or more [maintainers](https://about.gitlab.com/handbook/engineering/workflow/code-review/#maintainer).
+For assistance with security scans or comments, include the Application Security Team (`@gitlab-com/gl-security/appsec`).
 
-For approvals, we use the approval functionality found in the merge request
-widget. For reviewers, we use the [reviewer functionality](../user/project/merge_requests/getting_started.md#reviewer) in the sidebar.
+The reviewers use the [reviewer functionality](../user/project/merge_requests/getting_started.md#reviewer) in the sidebar.
 Reviewers can add their approval by [approving additionally](../user/project/merge_requests/approvals/index.md#approve-a-merge-request).
 
+Depending on the areas your merge request touches, it must be **approved** by one
+or more [maintainers](https://about.gitlab.com/handbook/engineering/workflow/code-review/#maintainer).
+The **Approved** button is in the merge request widget.
+
 Getting your merge request **merged** also requires a maintainer. If it requires
 more than one approval, the last maintainer to review and approve merges it.
 
+Read more about [author responsibilities](#the-responsibility-of-the-merge-request-author) below.
+
 ### Domain experts
 
 Domain experts are team members who have substantial experience with a specific technology,
@@ -90,7 +88,7 @@ page, with these behaviors:
 
 1. People whose [GitLab status](../user/profile/index.md#set-your-current-status) emoji
    is 🔶 `:large_orange_diamond:` or 🔸 `:small_orange_diamond:` are half as likely to be picked.
 1. It always picks the same reviewers and maintainers for the same
-   branch name (unless their out-of-office (OOO) status changes, as in point 1). It
+   branch name (unless their out-of-office (`OOO`) status changes, as in point 1). It
    removes leading `ce-` and `ee-`, and trailing `-ce` and `-ee`, so that it can be stable
    for backport branches.
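As an illustrative aside, the branch-name normalization this hunk describes amounts to stripping the `ce-`/`ee-` affixes so that backport branches map to the same key. The sketch below is hypothetical (the method name is invented and the real roulette lives in the project's Danger tooling, not in this form):

```ruby
# Hypothetical sketch: produce the same key for "my-feature",
# "ce-my-feature", and "my-feature-ee", keeping the roulette stable
# across backport branches.
def stable_roulette_key(branch_name)
  branch_name
    .sub(/\A(ce|ee)-/, '') # drop a leading "ce-" or "ee-"
    .sub(/-(ce|ee)\z/, '') # drop a trailing "-ce" or "-ee"
end

stable_roulette_key('ce-my-feature') # => "my-feature"
stable_roulette_key('my-feature-ee') # => "my-feature"
```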
@@ -110,14 +108,14 @@ As described in the section on the responsibility of the maintainer below, you
 are recommended to get your merge request approved and merged by maintainers
 with [domain expertise](#domain-experts).
 
-1. If your merge request includes backend changes (*1*), it must be
+1. If your merge request includes `~backend` changes (*1*), it must be
    **approved by a [backend maintainer](https://about.gitlab.com/handbook/engineering/projects/#gitlab_maintainers_backend)**.
 1. If your merge request includes database migrations or changes to expensive queries (*2*), it must be
    **approved by a [database maintainer](https://about.gitlab.com/handbook/engineering/projects/#gitlab_maintainers_database)**.
    Read the [database review guidelines](database_review.md) for more details.
-1. If your merge request includes frontend changes (*1*), it must be
+1. If your merge request includes `~frontend` changes (*1*), it must be
    **approved by a [frontend maintainer](https://about.gitlab.com/handbook/engineering/projects/#gitlab_maintainers_frontend)**.
-1. If your merge request includes user-facing changes (*3*), it must be
+1. If your merge request includes (`~UX`) user-facing changes (*3*), it must be
    **approved by a [Product Designer](https://about.gitlab.com/handbook/engineering/projects/#gitlab_reviewers_UX)**.
    See the [design and user interface guidelines](contributing/design.md) for details.
 1. If your merge request includes adding a new JavaScript library (*1*)...
@@ -143,7 +141,7 @@ with [domain expertise](#domain-experts).
 1. If your merge request introduces a new service to GitLab (Puma, Sidekiq, Gitaly are examples), it must be **approved by a [product manager](https://about.gitlab.com/company/team/)**. See the [process for adding a service component to GitLab](adding_service_component.md) for details.
 1. If your merge request includes changes related to authentication or authorization, it must be **approved by a [Manage:Authentication and Authorization team member](https://about.gitlab.com/company/team/)**. Check the [code review section on the group page](https://about.gitlab.com/handbook/engineering/development/dev/manage/authentication-and-authorization/#additional-considerations) for more details. Patterns for files known to require review from the team are listed in the `Authentication and Authorization` section of the [`CODEOWNERS`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/CODEOWNERS) file, and the team will be listed in the approvers section of all merge requests that modify these files.
 
-- (*1*): Specs other than JavaScript specs are considered backend code.
+- (*1*): Specs other than JavaScript specs are considered `~backend` code.
   Haml markup is considered `~frontend` code. However, Ruby code within Haml templates is considered `~backend` code.
 - (*2*): We encourage you to seek guidance from a database maintainer if your merge
   request is potentially introducing expensive queries. It is most efficient to comment
   on the line of code in question with the SQL queries so they can give their advice.
@@ -185,6 +183,7 @@ See the [test engineering process](https://about.gitlab.com/handbook/engineering
 
 ##### Observability instrumentation
 
 1. I have included enough instrumentation to facilitate debugging and proactive performance improvements through observability.
+   See [example](https://gitlab.com/gitlab-org/gitlab/-/issues/346124#expectations) of adding feature flags, logging, and instrumentation.
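For instance, the kind of instrumentation this checklist item points at might look like the hedged sketch below; the flag name, log fields, and helper method are hypothetical and not taken from the linked example:

```ruby
# Hypothetical illustration: gate a new code path behind a feature flag and
# emit a structured log line so the rollout can be observed and debugged.
if Feature.enabled?(:my_new_behavior, project)
  Gitlab::AppJsonLogger.info(
    message: 'my_new_behavior executed', # event marker to filter logs by
    project_id: project.id
  )
  run_my_new_behavior(project) # hypothetical helper for the new code path
end
```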
 
 ##### Documentation
@@ -237,6 +236,8 @@ up confusion or verify that the end result matches what they had in mind, to
 database specialists to get input on the data model or specific queries, or to
 any other developer to get an in-depth review of the solution.
 
+If your merge request touches more than one domain (for example, Dynamic Analysis and GraphQL), ask for reviews from an expert from each domain.
+
 If an author is unsure if a merge request needs a [domain expert's](#domain-experts)
 opinion, then that indicates it does. Without it, it's unlikely they have the required level of
 confidence in their solution.
@@ -246,7 +247,7 @@ request diff alerting the reviewer to anything important as well as for anything
 that demands further explanation or attention. Examples of content that may
 warrant a comment could be:
 
-- The addition of a linting rule (Rubocop, JS etc).
+- The addition of a linting rule (RuboCop, JS etc).
 - The addition of a library (Ruby gem, JS lib etc).
 - Where not obvious, a link to the parent class or method.
 - Any benchmarking performed to complement the change.
@@ -269,10 +270,18 @@ This saves reviewers time and helps authors catch mistakes earlier.
 
 ### The responsibility of the reviewer
 
-[Review the merge request](#reviewing-a-merge-request) thoroughly. When you are confident
+[Review the merge request](#reviewing-a-merge-request) thoroughly.
+
+Verify that the merge request meets all [contribution acceptance criteria](contributing/merge_request_workflow.md#contribution-acceptance-criteria).
+
+If a merge request is too large, fixes more than one issue, or implements more
+than one feature, you should guide the author towards splitting the merge request
+into smaller merge requests.
+
+When you are confident
 that it meets all requirements, you should:
 
-- Click the Approve button.
+- Select **Approve**.
 - `@` mention the author to generate a to-do notification, and advise them that their merge request has been reviewed and approved.
 - Request a review from a maintainer. Default to requests for a maintainer with [domain expertise](#domain-experts),
   however, if one isn't available or you think the merge request doesn't need a review by a [domain expert](#domain-experts), feel free to follow the [Reviewer roulette](#reviewer-roulette) suggestion.
@@ -291,6 +300,12 @@ Because a maintainer's job only depends on their knowledge of the overall GitLab
 codebase, and not that of any specific domain, they can review, approve, and merge
 merge requests from any team and in any product area.
 
+If a merge request is too large, fixes more than one issue, or implements more
+than one feature, the maintainer can ask the author to make the merge request
+smaller. Request the previous reviewer, or a merge request coach to help guide
+the author on how to split the merge request, and to review the resulting
+changes.
+
 Maintainers do their best to also review the specifics of the chosen solution
 before merging, but as they are not necessarily [domain experts](#domain-experts), they may be poorly
 placed to do so without an unreasonable investment of time. In those cases, they
diff --git a/doc/development/contributing/design.md b/doc/development/contributing/design.md
index 7f5c800216a..ce013a9254b 100644
--- a/doc/development/contributing/design.md
+++ b/doc/development/contributing/design.md
@@ -54,7 +54,7 @@ Check visual design properties using your browser's _elements inspector_ ([Chrom
 
 - Use recommended [colors](https://design.gitlab.com/product-foundations/colors/)
   and [typography](https://design.gitlab.com/product-foundations/type-fundamentals/).
 - Follow [layout guidelines](https://design.gitlab.com/layout/grid/).
-- Use existing [icons](http://gitlab-org.gitlab.io/gitlab-svgs/) and [illustrations](http://gitlab-org.gitlab.io/gitlab-svgs/illustrations/)
+- Use existing [icons](https://gitlab-org.gitlab.io/gitlab-svgs/) and [illustrations](https://gitlab-org.gitlab.io/gitlab-svgs/illustrations/)
   or propose new ones according to [iconography](https://design.gitlab.com/product-foundations/iconography/)
   and [illustration](https://design.gitlab.com/product-foundations/illustration/) guidelines.
 
@@ -98,7 +98,7 @@ Check accessibility using your browser's _accessibility inspector_ ([Chrome](htt
 
 When the design is ready, _before_ starting its implementation:
 
-- Share design specifications in the related issue, preferably through a [Figma link](https://help.figma.com/hc/en-us/articles/360040531773-Share-Files-with-anyone-using-Link-Sharing#Copy_links)
+- Share design specifications in the related issue, preferably through a [Figma link](https://help.figma.com/hc/en-us/articles/360040531773-Share-Files-with-anyone-using-Link-Sharing#copy-link)
   link or [GitLab Designs feature](../../user/project/issues/design_management.md).
   See [when you should use each tool](https://about.gitlab.com/handbook/engineering/ux/product-designer/#deliver).
 - Document user flow and states (for example, using [Mermaid flowcharts in Markdown](../../user/markdown.md#mermaid)).
diff --git a/doc/development/contributing/index.md b/doc/development/contributing/index.md
index 8a4b06840a4..182d00d52ab 100644
--- a/doc/development/contributing/index.md
+++ b/doc/development/contributing/index.md
@@ -37,7 +37,7 @@ Report suspected security vulnerabilities by following the
 [disclosure process on the GitLab.com website](https://about.gitlab.com/security/disclosure/).
 
 WARNING:
-Do **NOT** create publicly viewable issues for suspected security vulnerabilities.
+Do **not** create publicly viewable issues for suspected security vulnerabilities.
 
 ## Code of conduct
 
@@ -128,9 +128,12 @@ The general flow of contributing to GitLab is:
 1. In the merge request's description:
    - Ensure you provide complete and accurate information.
    - Review the provided checklist.
-1. Assign the merge request (if possible) to, or [mention](../../user/discussions/index.md#mentions),
-   one of the [code owners](../../user/project/code_owners.md) for the relevant project,
-   and explain that you are ready for review.
+1. Once you're ready, mark your MR as ready for review with `@gitlab-bot ready`.
+   - This will add the `~"workflow::ready for review"` label, and then automatically assign a merge request coach as reviewer.
+   - If you know a relevant reviewer (for example, someone who was involved in a related issue), you can also
+     assign them directly with `@gitlab-bot ready @username`.
+
+#### Review process
 
 When you submit code to GitLab, we really want it to get merged! However, we always review
 submissions carefully, and this takes time.
 Code submissions will usually be reviewed by two
@@ -139,7 +142,11 @@ submissions carefully, and this takes time. Code submissions will usually be rev
 
 - A [reviewer](../code_review.md#the-responsibility-of-the-reviewer).
 - A [maintainer](../code_review.md#the-responsibility-of-the-maintainer).
 
-Keep the following in mind when submitting merge requests:
+After review, the reviewer could ask the author to update the merge request. In that case, the reviewer would set the `~"workflow::in dev"` label.
+Once the merge request has been updated and set as ready for review again (for example, with `@gitlab-bot ready`), they will review the code again.
+This process may repeat any number of times before merge, to help make the contribution the best it can be.
+
+Lastly, keep the following in mind when submitting merge requests:
 
 - When reviewers are reading through a merge request they may request guidance from other
   reviewers.
@@ -154,13 +161,11 @@ Keep the following in mind when submitting merge requests:
   [approval](../../user/project/merge_requests/approvals/index.md) of merge requests, the maintainer may require
   [approvals from certain reviewers](../code_review.md#approval-guidelines) before merging a merge request.
-- After review, the author may be asked to update the merge request. Once the merge request has been
-  updated and reassigned to the reviewer, they will review the code again. This process may repeat
-  any number of times before merge, to help make the contribution the best it can be.
+- Sometimes a maintainer may choose to close a merge request. They will fully disclose why it will not
+  be merged, as well as some guidance. The maintainers will be open to discussion about how to change
+  the code so it can be approved and merged in the future.
 
-Sometimes a maintainer may choose to close a merge request. They will fully disclose why it will not
-be merged, as well as some guidance. The maintainers will be open to discussion about how to change
-the code so it can be approved and merged in the future.
+#### Getting attention on your merge request
 
 GitLab will do its best to review community contributions as quickly as possible. Specially
 appointed developers review community contributions daily. Look at the
@@ -170,8 +175,9 @@ written some front-end code, you should mention the frontend merge request coach
 your code has multiple disciplines, you may mention multiple merge request coaches.
 
 GitLab receives a lot of community contributions. If your code has not been reviewed within two
-working days of its initial submission, feel free to mention all merge request coaches with
-`@gitlab-org/coaches` to get their attention.
+working days of its initial submission, you can ask for help with `@gitlab-bot help`.
+
+#### Addition of external libraries
 
 When submitting code to GitLab, you may feel that your contribution requires the aid of an external
 library. If your code includes an external library, please provide a link to the library, as well as
diff --git a/doc/development/contributing/merge_request_workflow.md b/doc/development/contributing/merge_request_workflow.md
index ee1ed744cd4..eff1d2e671d 100644
--- a/doc/development/contributing/merge_request_workflow.md
+++ b/doc/development/contributing/merge_request_workflow.md
@@ -281,7 +281,7 @@ requirements.
 
 1. The change is tested in a review app where possible and if appropriate.
 1. The new feature does not degrade the user experience of the product.
The change is evaluated to [limit the impact of far-reaching work](https://about.gitlab.com/handbook/engineering/development/#reducing-the-impact-of-far-reaching-work). -1. An agreed-upon [rollout plan](https://about.gitlab.com/handbook/engineering/development/processes/rollout-plans/). +1. An agreed-upon [rollout plan](https://about.gitlab.com/handbook/engineering/development/processes/rollout-plans/). 1. Merged by a project maintainer. ### Production use @@ -292,7 +292,7 @@ requirements. 1. If there is a performance risk in the change, I have analyzed the performance of the system before and after the change. 1. *If the merge request uses feature flags, per-project or per-group enablement, and a staged rollout:* - Confirmed to be working on GitLab projects. - - Confirmed to be working at each stage for all projects added. + - Confirmed to be working at each stage for all projects added. 1. Added to the [release post](https://about.gitlab.com/handbook/marketing/blog/release-posts/), if relevant. 1. Added to [the website](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/features.yml), if relevant. @@ -322,7 +322,7 @@ issue) that are incremental improvements, such as: 1. Unprioritized bug fixes (for example, [Banner alerting of project move is showing up everywhere](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/18985)) 1. Documentation improvements -1. Rubocop or Code Quality improvements +1. RuboCop or Code Quality improvements Tag a merge request with ~"Stuff that should Just Work" to track work in this area. diff --git a/doc/development/contributing/verify/index.md b/doc/development/contributing/verify/index.md index 01aacffd00f..09b206d59aa 100644 --- a/doc/development/contributing/verify/index.md +++ b/doc/development/contributing/verify/index.md @@ -53,7 +53,7 @@ and they serve us and our users well. Some examples of these principles are that - The feedback delivered by GitLab CI/CD and data produced by the platform should be accurate. If a job fails and we notify a user that it was successful, it can have severe negative consequences. - Feedback needs to be available when a user needs it and data can not disappear unexpectedly when engineers need it. -- It all doesn’t matter if the platform is not secure and we +- It all doesn't matter if the platform is not secure and we are leaking credentials or secrets. - When a user provides a set of preconditions in a form of CI/CD configuration, the result should be deterministic each time a pipeline runs, because otherwise the platform might not be trustworthy. - If it is fast, simple to use and has a great UX it will serve our users well. @@ -134,7 +134,7 @@ applied to many other technical implementations. GitLab is a DevOps platform. We popularize DevOps because it helps companies be more efficient and achieve better results. One important component of DevOps culture is to take ownership over features and code that you are -building. It is very difficult to do that when you don’t know how your features +building. It is very difficult to do that when you don't know how your features perform and behave in the production environment. This is why we want to make our features and code observable. It @@ -164,15 +164,29 @@ data from the database, file system, or object storage, you should get an extra of eyes on your changes. When you are defining a new retention policy, you should double check with PMs and EMs. 
+### Get your design reviewed + +When you are designing a subsystem for pipeline processing and transitioning +CI/CD statuses, request an additional opinion on the design from a Verify maintainer (`@gitlab-org/maintainers/cicd-verify`) +as early as possible and hold others accountable for doing the same. Having your +design reviewed by a Verify maintainer helps to identify any blind spots you might +have overlooked as early as possible and possibly leads to a better solution. + +By having the design reviewed before any development work is started, it also helps to +make merge request review more efficient. You would be less likely to encounter +significantly differing opinions or change requests during the maintainer review +if the design has been reviewed by a Verify maintainer. As a result, the merge request +could be merged sooner. + ### Get your changes reviewed -When your merge request is ready for reviews you must assign -reviewers and then maintainers. Depending on the complexity of a change, you -might want to involve the people that know the most about the codebase area you are -changing. We do have many domain experts in Verify and it is absolutely acceptable to -ask them to review your code when you are not certain if a reviewer or -maintainer assigned by the Reviewer Roulette has enough context about the -change. +When your merge request is ready for review, you must assign reviewers and then +maintainers. Depending on the complexity of a change, you might want to involve +the people that know the most about the codebase area you are changing. We do +have many domain experts and maintainers in Verify and it is absolutely +acceptable to ask them to review your code when you are not certain if a +reviewer or maintainer assigned by the Reviewer Roulette has enough context +about the change. The reviewer roulette offers useful suggestions, but as assigning the right reviewers is important it should not be done automatically every time. It might @@ -181,9 +195,19 @@ updating, because their feedback might be limited to code style and syntax. Depending on the complexity and impact of a change, assigning the right people to review your changes might be very important. -If you don’t know who to assign, consult `git blame` or ask in the `#verify` +If you don't know who to assign, consult `git blame` or ask in the `#s_verify` Slack channel (GitLab team members only). +There are two kinds of changes / merge requests that require additional +attention from reviewers and an additional reviewer: + +1. Merge requests changing code around pipelines / stages / builds statuses. +1. Merge requests changing code around authentication / security features. + +In both cases engineers are expected to request a review from a maintainer and +a domain expert. If the maintainer is the domain expert, involving another person +is recommended. + ### Incremental rollouts After your merge request is merged by a maintainer, it is time to release it to @@ -220,7 +244,7 @@ scenario relating to a software being built by one of our [early customers](http That would be quite an undesirable outcome of a small bug in GitLab CI/CD status processing. Please take extra care when you are working on CI/CD statuses, -we don’t want to implode our Universe! +we don't want to implode our Universe! This is an extreme and unlikely scenario, but presenting data that is not accurate can potentially cause a myriad of problems through the @@ -230,6 +254,22 @@ can have disastrous consequences.
GitLab CI/CD is being used by companies building medical, aviation, and automotive software. Continuous Integration is a mission critical part of software engineering. -When you are working on a subsystem for pipeline processing and transitioning -CI/CD statuses, request an additional opinion on the design from a domain expert -as early as possible and hold others accountable for doing the same. +### Definition of Done + +In Verify, we follow our Development team's [Definition of Done](../merge_request_workflow.md#definition-of-done). +We also want to keep things efficient and [DRY](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself) when we answer questions +and solve problems for our users. + +For any issue that is resolved because the solution is supported with existing `.gitlab-ci.yml` syntax, +create a project in the [`ci-sample-projects`](https://gitlab.com/gitlab-org/ci-sample-projects) group +that demonstrates the solution. + +The project must have: + +- A simple title. +- A clear description. +- A `README.md` with: + - A link to the resolved issue. You should also direct users to collaborate in the + resolved issue if any questions arise. + - A link to any relevant documentation. + - A detailed explanation of what the example is doing. diff --git a/doc/development/creating_enums.md b/doc/development/creating_enums.md index 1f04f4c9712..450cb97d978 100644 --- a/doc/development/creating_enums.md +++ b/doc/development/creating_enums.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -98,7 +98,7 @@ end This works as a workaround; however, this approach has some downsides: - Features could move from EE to FOSS or vice versa. Therefore, the offset might be mixed between FOSS and EE in the future. - For example, when you move `activity_limit_exceeded` to FOSS, you'll see `{ unknown_failure: 0, config_error: 1, activity_limit_exceeded: 1_000 }`. + For example, when you move `activity_limit_exceeded` to FOSS, you see `{ unknown_failure: 0, config_error: 1, activity_limit_exceeded: 1_000 }`. - The integer column for the `enum` is likely created [as `SMALLINT`](#creating-enums). Therefore, you need to be careful that the offset doesn't exceed the maximum value of a 2-byte integer.
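To make the offset pattern concrete, here is a minimal sketch, assuming a simplified module layout (the enum values come from the example above; everything else is illustrative):

```ruby
module Ci
  module PipelineEnums
    # FOSS values occupy the low range of the enum.
    def self.failure_reasons
      { unknown_failure: 0, config_error: 1 }
    end
  end
end

module EE
  module Ci
    module PipelineEnums
      # EE values start at the 1_000 offset so they cannot collide with
      # FOSS values added later.
      def failure_reasons
        super.merge(activity_limit_exceeded: 1_000)
      end
    end
  end
end

# Prepending to the singleton class makes `super` resolve to the FOSS method.
Ci::PipelineEnums.singleton_class.prepend(EE::Ci::PipelineEnums)

class Pipeline < ApplicationRecord
  enum failure_reason: Ci::PipelineEnums.failure_reasons
end
```

With a `SMALLINT` column, every value, including the offset ones, must stay at or below 32,767.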
diff --git a/doc/development/database/add_foreign_key_to_existing_column.md b/doc/development/database/add_foreign_key_to_existing_column.md index bfd455ef9da..9842814816f 100644 --- a/doc/development/database/add_foreign_key_to_existing_column.md +++ b/doc/development/database/add_foreign_key_to_existing_column.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/avoiding_downtime_in_migrations.md b/doc/development/database/avoiding_downtime_in_migrations.md index 3cf9ab1ab5c..2d079656e23 100644 --- a/doc/development/database/avoiding_downtime_in_migrations.md +++ b/doc/development/database/avoiding_downtime_in_migrations.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -15,7 +15,7 @@ requiring downtime. ## Dropping Columns Removing columns is tricky because running GitLab processes may still be using -the columns. To work around this safely, you will need three steps in three releases: +the columns. To work around this safely, you need three steps in three releases: 1. Ignoring the column (release M) 1. Dropping the column (release M+1) @@ -77,7 +77,7 @@ bundle exec rails g post_deployment_migration remove_users_updated_at_column There are two scenarios that you need to consider to write a migration that removes a column: -#### A. The removed column has no indexes or constraints that belong to it +#### A. The removed column has no indexes or constraints that belong to it In this case, a **transactional migration** can be used. Something as simple as: @@ -170,12 +170,12 @@ class RenameUsersUpdatedAtToUpdatedAtTimestamp < Gitlab::Database::Migration[1.0 end ``` -This will take care of renaming the column, ensuring data stays in sync, and +This takes care of renaming the column, ensuring data stays in sync, and copying over indexes and foreign keys. If a column contains one or more indexes that don't contain the name of the -original column, the previously described procedure will fail. In that case, -you'll first need to rename these indexes. +original column, the previously described procedure fails. In that case, +you need to rename these indexes. ### Step 2: Add A Post-Deployment Migration @@ -270,7 +270,7 @@ And that's it, we're done! Some type changes require casting data to a new type. For example when changing from `text` to `jsonb`. In this case, use the `type_cast_function` option. -Make sure there is no bad data and the cast will always succeed. You can also provide a custom function that handles +Make sure there is no bad data and the cast always succeeds. You can also provide a custom function that handles casting errors. Example migration: @@ -291,8 +291,9 @@ They can also produce a lot of pressure on the database due to it rapidly updating many rows in sequence. To reduce database pressure you should instead use a background migration -when migrating a column in a large table (for example, `issues`). This will -spread the work / load over a longer time period, without slowing down deployments. +when migrating a column in a large table (for example, `issues`). 
Background +migrations spread the work / load over a longer time period, without slowing +down deployments. For more information, see [the documentation on cleaning up background migrations](background_migrations.md#cleaning-up). @@ -533,7 +534,7 @@ step approach: Usually this works, but not always. For example, if a field's format is to be changed from JSON to something else we have a bit of a problem. If we were to -change existing data before deploying application code we'll most likely run +change existing data before deploying application code we would most likely run into errors. On the other hand, if we were to migrate after deploying the application code we could run into the same problems. diff --git a/doc/development/database/background_migrations.md b/doc/development/database/background_migrations.md index 80ba0336bda..0124dbae51f 100644 --- a/doc/development/database/background_migrations.md +++ b/doc/development/database/background_migrations.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -65,7 +65,7 @@ and idempotent. See [Sidekiq best practices guidelines](https://github.com/mperham/sidekiq/wiki/Best-Practices) for more details. -Make sure that in case that your migration job is going to be retried data +Make sure that if your migration job is retried, data integrity is guaranteed. ## Background migrations for EE-only features @@ -77,7 +77,7 @@ as explained in the [guidelines for implementing Enterprise Edition features](.. ## How It Works Background migrations are simple classes that define a `perform` method. A -Sidekiq worker will then execute such a class, passing any arguments to it. All +Sidekiq worker then executes such a class, passing any arguments to it. All migration classes must be defined in the namespace `Gitlab::BackgroundMigration`, the files should be placed in the directory `lib/gitlab/background_migration/`. @@ -100,13 +100,13 @@ to automatically split the job into batches: ```ruby queue_background_migration_jobs_by_range_at_intervals( ClassName, - BackgroundMigrationClassName, + 'BackgroundMigrationClassName', 2.minutes, batch_size: 10_000 ) ``` -You'll also need to make sure that newly created data is either migrated, or +You also need to make sure that newly created data is either migrated, or saved in both the old and new version upon creation. For complex and time consuming migrations it's best to schedule a background job using an `after_create` hook so this doesn't affect response timings. The same applies to @@ -142,7 +142,7 @@ or minor release, you _must not_ do this in a patch release. Because background migrations can take a long time you can't immediately clean things up after scheduling them. For example, you can't drop a column that's used in the migration process as this would cause jobs to fail. This means that -you'll need to add a separate _post deployment_ migration in a future release +you need to add a separate _post deployment_ migration in a future release that finishes any remaining jobs before cleaning things up (for example, removing a column). @@ -189,7 +189,7 @@ extract the `url` key from this JSON object and store it in the `integrations.ur column. There are millions of integrations and parsing JSON is slow, thus you can't do this in a regular migration.
-To do this using a background migration we'll start with defining our migration +To do this using a background migration we start with defining our migration class: ```ruby @@ -213,7 +213,7 @@ class Gitlab::BackgroundMigration::ExtractIntegrationsUrl end ``` -Next we'll need to adjust our code so we schedule the above migration for newly +Next we need to adjust our code so we schedule the above migration for newly created and updated integrations. We can do this using something along the lines of the following: @@ -232,7 +232,7 @@ We're using `after_commit` here to ensure the Sidekiq job is not scheduled before the transaction completes as doing so can lead to race conditions where the changes are not yet visible to the worker. -Next we'll need a post-deployment migration that schedules the migration for +Next we need a post-deployment migration that schedules the migration for existing data. ```ruby @@ -254,11 +254,11 @@ class ScheduleExtractIntegrationsUrl < Gitlab::Database::Migration[1.0] end end ``` -Once deployed our application will continue using the data as before but at the -same time will ensure that both existing and new data is migrated. +After deployment, our application continues using the data as before, but at the +same time ensures that both existing and new data is migrated. In the next release we can remove the `after_commit` hooks and related code. We -will also need to add a post-deployment migration that consumes any remaining +also need to add a post-deployment migration that consumes any remaining jobs and manually run on any un-migrated rows. Such a migration would look like this: @@ -292,7 +292,7 @@ If the application does not depend on the data being 100% migrated (for instance, the data is advisory, and not mission-critical), then this final step can be skipped. -This migration will then process any jobs for the ExtractIntegrationsUrl migration +This migration then processes any jobs for the `ExtractIntegrationsUrl` migration and continues once all jobs have been processed. Once done you can safely remove the `integrations.properties` column. @@ -325,13 +325,13 @@ for more details. 1. Make sure that tests you write are not false positives. 1. Make sure that if the data being migrated is critical and cannot be lost, the clean-up migration also checks the final state of the data before completing. -1. When migrating many columns, make sure it won't generate too many +1. When migrating many columns, make sure it does not generate too many dead tuples in the process (you may need to directly query the number of dead tuples and adjust the scheduling according to this piece of data). 1. Make sure to discuss the numbers with a database specialist, the migration may add more pressure on DB than you expect (measure on staging, or ask someone to measure on production). -1. Make sure to know how much time it'll take to run all scheduled migrations. +1. Make sure to know how much time it takes to run all scheduled migrations. 1. Provide an estimation section in the description, estimating both the total migration run time and the query times for each background migration job. Explain plans for each query should also be provided. @@ -503,6 +503,6 @@ View the production Sidekiq log and filter for: - `json.meta.caller_id: <MyBackgroundMigrationSchedulingMigrationClassName>` - `json.args: <MyBackgroundMigrationClassName>` -Looking at the `json.error_class`, `json.error_message` and `json.error_backtrace` values may be helpful in understanding why the jobs failed.
+Looking at the `json.exception.class`, `json.exception.message`, `json.exception.backtrace`, and `json.exception.sql` values may be helpful in understanding why the jobs failed. Depending on when and how the failure occurred, you may find other helpful information by filtering with `json.class: <MyBackgroundMigrationClassName>`. diff --git a/doc/development/database/batched_background_migrations.md b/doc/development/database/batched_background_migrations.md index 3a0fa77eff9..6d3d5fa7f92 100644 --- a/doc/development/database/batched_background_migrations.md +++ b/doc/development/database/batched_background_migrations.md @@ -1,6 +1,6 @@ --- type: reference, dev -stage: Enablement +stage: Data Stores group: Database info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments-to-development-guidelines" --- @@ -152,9 +152,7 @@ When you start the second post-deployment migration, delete the previously batched migration with the provided code: ```ruby -Gitlab::Database::BackgroundMigration::BatchedMigration - .for_configuration(MIGRATION_NAME, TABLE_NAME, COLUMN, JOB_ARGUMENTS) - .delete_all +delete_batched_background_migration(MIGRATION_NAME, TABLE_NAME, COLUMN, JOB_ARGUMENTS) ``` ## Cleaning up @@ -192,7 +190,7 @@ data to be in the new format. The `routes` table has a `source_type` field that's used for a polymorphic relationship. As part of a database redesign, we're removing the polymorphic relationship. One step of -the work will be migrating data from the `source_id` column into a new singular foreign key. +the work is migrating data from the `source_id` column into a new singular foreign key. Because we intend to delete old rows later, there's no need to update them as part of the background migration. @@ -221,9 +219,9 @@ background migration. NOTE: Job classes must be subclasses of `BatchedMigrationJob` to be correctly handled by the batched migration framework. Any subclass of - `BatchedMigrationJob` will be initialized with necessary arguments to + `BatchedMigrationJob` is initialized with necessary arguments to execute the batch, as well as a connection to the tracking database. - Additional `job_arguments` set on the migration will be passed to the + Additional `job_arguments` set on the migration are passed to the job's `perform` method. 1. Add a new trigger to the database to update newly created and updated routes, @@ -245,12 +243,14 @@ background migration. 1. Create a post-deployment migration that queues the migration for existing data: ```ruby - class QueueBackfillRoutesNamespaceId < Gitlab::Database::Migration[1.0] + class QueueBackfillRoutesNamespaceId < Gitlab::Database::Migration[2.0] disable_ddl_transaction! MIGRATION = 'BackfillRouteNamespaceId' DELAY_INTERVAL = 2.minutes + restrict_gitlab_migration gitlab_schema: :gitlab_main + def up queue_batched_background_migration( MIGRATION, @@ -261,12 +261,19 @@ background migration. end def down - Gitlab::Database::BackgroundMigration::BatchedMigration - .for_configuration(MIGRATION, :routes, :id, []).delete_all + delete_batched_background_migration(MIGRATION, :routes, :id, []) end end ``` + NOTE: + When queuing a batched background migration, you need to restrict + the schema to the database where you make the actual changes. + In this case, we are updating `routes` records, so we set + `restrict_gitlab_migration gitlab_schema: :gitlab_main`. 
If, however, + you need to perform a CI data migration, you would set + `restrict_gitlab_migration gitlab_schema: :gitlab_ci`. + After deployment, our application: - Continues using the data as before. - Ensures that both existing and new data are migrated. @@ -275,16 +282,19 @@ background migration. that checks that the batched background migration is completed. For example: ```ruby - class FinalizeBackfillRouteNamespaceId < Gitlab::Database::Migration[1.0] + class FinalizeBackfillRouteNamespaceId < Gitlab::Database::Migration[2.0] MIGRATION = 'BackfillRouteNamespaceId' disable_ddl_transaction! + restrict_gitlab_migration gitlab_schema: :gitlab_main + def up ensure_batched_background_migration_is_finished( job_class_name: MIGRATION, table_name: :routes, column_name: :id, - job_arguments: [] + job_arguments: [], + finalize: true ) end @@ -294,6 +304,11 @@ background migration. end ``` + NOTE: + If the batched background migration is not finished, the system will + execute the batched background migration inline. If you don't want + to see this behavior, you need to pass `finalize: false`. + If the application does not depend on the data being 100% migrated (for instance, the data is advisory, and not mission-critical), then you can skip this final step. This step confirms that the migration is completed, and all of the rows were migrated. diff --git a/doc/development/database/client_side_connection_pool.md b/doc/development/database/client_side_connection_pool.md index 60c8665df87..dc52a551407 100644 --- a/doc/development/database/client_side_connection_pool.md +++ b/doc/development/database/client_side_connection_pool.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/constraint_naming_convention.md b/doc/development/database/constraint_naming_convention.md index a22ddc1551c..72f16c20559 100644 --- a/doc/development/database/constraint_naming_convention.md +++ b/doc/development/database/constraint_naming_convention.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/database_lab.md b/doc/development/database/database_lab.md index 1c8694b113d..5346df2690d 100644 --- a/doc/development/database/database_lab.md +++ b/doc/development/database/database_lab.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/database_migration_pipeline.md b/doc/development/database/database_migration_pipeline.md index ce7e1801abc..496bd09bf1d 100644 --- a/doc/development/database/database_migration_pipeline.md +++ b/doc/development/database/database_migration_pipeline.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git 
a/doc/development/database/database_reviewer_guidelines.md b/doc/development/database/database_reviewer_guidelines.md index ca9ca36b156..b6bbfe690c1 100644 --- a/doc/development/database/database_reviewer_guidelines.md +++ b/doc/development/database/database_reviewer_guidelines.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -47,7 +47,7 @@ As a database reviewer, join the internal `#database` Slack channel and ask ques database related issues with other database reviewers and maintainers. There is also an optional database office hours call held bi-weekly, alternating between -European/US and APAC friendly hours. You can join the office hours call and bring topics +European/US and Asia-Pacific (APAC) friendly hours. You can join the office hours call and bring topics that require a more in-depth discussion between the database reviewers and maintainers: - [Database Office Hours Agenda](https://docs.google.com/document/d/1wgfmVL30F8SdMg-9yY6Y8djPSxWNvKmhR5XmsvYX1EI/edit). diff --git a/doc/development/database/dbcheck-migrations-job.md b/doc/development/database/dbcheck-migrations-job.md index af72e28a875..49f8b183272 100644 --- a/doc/development/database/dbcheck-migrations-job.md +++ b/doc/development/database/dbcheck-migrations-job.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/deleting_migrations.md b/doc/development/database/deleting_migrations.md index be9009f365d..8354cb62d0c 100644 --- a/doc/development/database/deleting_migrations.md +++ b/doc/development/database/deleting_migrations.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/efficient_in_operator_queries.md b/doc/development/database/efficient_in_operator_queries.md index 2503be826ea..a2481577e8c 100644 --- a/doc/development/database/efficient_in_operator_queries.md +++ b/doc/development/database/efficient_in_operator_queries.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -26,7 +26,7 @@ Pagination may be used to fetch subsequent records. Example tasks requiring querying nested domain objects from the group level: - Show first 20 issues by creation date or due date from the group `gitlab-org`. -- Show first 20 merge_requests by merged at date from the group `gitlab-com`. +- Show first 20 merge requests by merged at date from the group `gitlab-com`. Unfortunately, ordered group-level queries typically perform badly as their executions require heavy I/O, memory, and computations. @@ -163,7 +163,7 @@ The technique can only optimize `IN` queries that satisfy the following requirem (the combination of the columns uniquely identifies one particular row in the table).
WARNING: -This technique will not improve the performance of the `COUNT(*)` queries. +This technique does not improve the performance of the `COUNT(*)` queries. ## The `InOperatorOptimization` module @@ -183,7 +183,7 @@ in `Gitlab::Pagination::Keyset::InOperatorOptimization`. ### Basic usage of `QueryBuilder` -To illustrate a basic usage, we will build a query that +To illustrate basic usage, we build a query that fetches 20 issues with the oldest `created_at` from the group `gitlab-org`. The following ActiveRecord query would produce a query similar to @@ -226,10 +226,10 @@ Gitlab::Pagination::Keyset::InOperatorOptimization::QueryBuilder.new( the order by column expressions is available for locating the record. In this example, the yielded values are `created_at` and `id` SQL expressions. Finding a record is very fast via the primary key, so we don't use the `created_at` value. Providing the `finder_query` lambda is optional. - If it's not given, the IN operator optimization will only make the ORDER BY columns available to + If it's not given, the `IN` operator optimization only makes the `ORDER BY` columns available to the end-user and not the full database row. The following database index on the `issues` table must be present @@ -416,7 +416,7 @@ scope = Issue .limit(20) -To construct the array scope, we'll need to take the Cartesian product of the `project_id IN` and +To construct the array scope, we need to take the Cartesian product of the `project_id IN` and the `issue_type IN` queries. `issue_type` is an ActiveRecord enum, so we need to construct the following table: @@ -589,7 +589,7 @@ LIMIT 20 NOTE: To make the query efficient, the following columns need to be covered with an index: `project_id`, `issue_type`, `created_at`, and `id`. -#### Using calculated ORDER BY expression +#### Using calculated `ORDER BY` expression The following example orders epic records by the duration between the creation time and closed time. It is calculated with the following formula: @@ -766,7 +766,7 @@ using the generalized `IN` optimization technique. ### Array CTE -As the first step, we use a common table expression (CTE) for collecting the `projects.id` values. +As the first step, we use a Common Table Expression (CTE) for collecting the `projects.id` values. This is done by wrapping the incoming `array_scope` ActiveRecord relation parameter with a CTE. ```sql @@ -792,7 +792,7 @@ This query produces the following result set with only one column (`projects.id` ### Array mapping For each project (that is, each record storing a project ID in `array_cte`), -we will fetch the cursor value identifying the first issue respecting the `ORDER BY` clause. +we fetch the cursor value identifying the first issue respecting the `ORDER BY` clause. As an example, let's pick the first record `ID=9` from `array_cte`. The following query should fetch the cursor value `(created_at, id)` identifying @@ -805,7 +805,7 @@ ORDER BY "issues"."created_at" ASC, "issues"."id" ASC LIMIT 1; -We will use `LATERAL JOIN` to loop over the records in the `array_cte` and find the +We use `LATERAL JOIN` to loop over the records in the `array_cte` and find the cursor value for each project. The query would be built using the `array_mapping_scope` lambda function.
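As a sketch of what that lambda can look like for the `issues` example used throughout this page (illustrative, not the exact production code): it receives the project ID expression produced by the array CTE and returns a relation restricted to that one project. The `ORDER BY` and `LIMIT 1` parts are added by the optimization itself.

```ruby
# Maps one row of array_cte (a project ID) to the issues of that project.
array_mapping_scope = lambda do |id_expression|
  Issue.where(Issue.arel_table[:project_id].eq(id_expression))
end
```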
@@ -854,11 +854,11 @@ The table shows the cursor values (`created_at, id`) of the first record for eac respecting the `ORDER BY` clause. At this point, we have the initial data. To start collecting the actual records from the database, -we'll use a recursive CTE query where each recursion locates one row until +we use a recursive CTE query where each recursion locates one row until the `LIMIT` is reached or no more data can be found. -Here's an outline of the steps we will take in the recursive CTE query -(expressing the steps in SQL is non-trivial but will be explained next): +Here's an outline of the steps we take in the recursive CTE query +(expressing the steps in SQL is non-trivial but is explained next): 1. Sort the initial resultset according to the `ORDER BY` clause. 1. Pick the top cursor to fetch the record, this is our first record. In the example, @@ -877,7 +877,7 @@ this cursor would be (`2020-01-05`, `3`) for `project_id=9`. ### Initializing the recursive CTE query -For the initial recursive query, we'll need to produce exactly one row, we call this the +For the initial recursive query, we need to produce exactly one row; we call this the initializer query (`initializer_query`). Use the `ARRAY_AGG` function to compact the initial result set into a single row @@ -994,7 +994,7 @@ After this, the recursion starts again by finding the next lowest cursor value. ### Finalizing the query -For producing the final `issues` rows, we're going to wrap the query with another `SELECT` statement: +For producing the final `issues` rows, we wrap the query with another `SELECT` statement: ```sql SELECT "issues".* @@ -1031,17 +1031,17 @@ Optimized `IN` query: | issue lookup query | 519 | 20 | 10 000 | The group and project queries are not using sorting, the necessary columns are read from database -indexes. These values are accessed frequently so it's very likely that most of the data will be +indexes. These values are accessed frequently so it's very likely that most of the data is in PostgreSQL's buffer cache. -The optimized `IN` query will read maximum 519 entries (cursor values) from the index: +The optimized `IN` query reads a maximum of 519 entries (cursor values) from the index: - 500 index-only scans for populating the arrays for each project. The cursor values of the first -record will be here. +record are here. - Maximum 19 additional index-only scans for the consecutive records. -The optimized `IN` query will sort the array (cursor values per project array) 20 times, which -means we'll sort 20 x 500 rows. However, this might be a less memory-intensive task than +The optimized `IN` query sorts the array (cursor values per project array) 20 times, which +means we sort 20 x 500 rows. However, this might be a less memory-intensive task than sorting 10 000 rows at once. Performance comparison for the `gitlab-org` group: @@ -1053,5 +1053,5 @@ Performance comparison for the `gitlab-org` group: NOTE: Before taking measurements, the group lookup query was executed separately in order to make -the group data available in the buffer cache. Since it's a frequently called query, it's going to -hit many shared buffers during the query execution in the production environment. +the group data available in the buffer cache. Since it's a frequently called query, it +hits many shared buffers during the query execution in the production environment.
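If you want to reproduce such measurements, one way is to ask PostgreSQL for buffer statistics directly (a sketch; the group ID is a placeholder):

```sql
-- Reports how many shared buffers were hit versus read from disk.
EXPLAIN (ANALYZE, BUFFERS)
SELECT "projects"."id"
FROM "projects"
WHERE "projects"."namespace_id" = 1;
```

Running the statement twice makes the warm-cache effect described above visible: the second run reports mostly buffer hits instead of reads.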
diff --git a/doc/development/database/index.md b/doc/development/database/index.md index 0363d13ed4c..b427f54ff3c 100644 --- a/doc/development/database/index.md +++ b/doc/development/database/index.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/keyset_pagination.md b/doc/development/database/keyset_pagination.md index 88928feb927..4aec64b8cce 100644 --- a/doc/development/database/keyset_pagination.md +++ b/doc/development/database/keyset_pagination.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -178,7 +178,7 @@ To make keyset pagination work, we must configure custom order objects, to do so collect information about the order columns: - `relative_position` can have duplicated values because no unique index is present. -- `relative_position` can have null values because we don't have a not null constraint on the column. For this, we must determine where we see NULL values, at the beginning of the result set, or the end (`NULLS LAST`). +- `relative_position` can have null values because we don't have a not null constraint on the column. For this, we must determine where we see `NULL` values, at the beginning of the result set, or the end (`NULLS LAST`). - Keyset pagination requires distinct order columns, so we must add the primary key (`id`) to make the order distinct. - Jumping to the last page and paginating backwards actually reverses the `ORDER BY` clause. For this, we must provide the reversed `ORDER BY` clause. diff --git a/doc/development/database/layout_and_access_patterns.md b/doc/development/database/layout_and_access_patterns.md index a3e2fefb2a3..99a50b503aa 100644 --- a/doc/development/database/layout_and_access_patterns.md +++ b/doc/development/database/layout_and_access_patterns.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/loose_foreign_keys.md b/doc/development/database/loose_foreign_keys.md index 3db24793f1b..dec51d484fd 100644 --- a/doc/development/database/loose_foreign_keys.md +++ b/doc/development/database/loose_foreign_keys.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -255,7 +255,7 @@ When the loose foreign key definition is no longer needed (parent table is remov we need to remove the definition from the YAML file and ensure that we don't leave pending deleted records in the database. -1. Remove the loose foreign key definition from the config (`config/gitlab_loose_foreign_keys.yml`). +1. Remove the loose foreign key definition from the configuration (`config/gitlab_loose_foreign_keys.yml`). 1. Remove the deletion tracking trigger from the parent table (if the parent table is still there). 1. 
Remove leftover deleted records from the `loose_foreign_keys_deleted_records` table. @@ -429,7 +429,7 @@ ALTER TABLE ONLY vulnerability_occurrence_pipelines In this example we expect to delete all associated `vulnerability_occurrence_pipelines` records whenever we delete the `ci_pipelines` record associated with them. In this case you might end up with some vulnerability page in GitLab which shows an occurrence -of a vulnerability. However, when you try to click a link to the pipeline, you get +of a vulnerability. However, when you try to select a link to the pipeline, you get a 404, because the pipeline is deleted. Then, when you navigate back you might find the occurrence has disappeared too. @@ -515,13 +515,13 @@ referenced child tables. ### Database structure The feature relies on triggers installed on the parent tables. When a parent record is deleted, -the trigger will automatically insert a new record into the `loose_foreign_keys_deleted_records` +the trigger automatically inserts a new record into the `loose_foreign_keys_deleted_records` database table. -The inserted record will store the following information about the deleted record: +The inserted record stores the following information about the deleted record: - `fully_qualified_table_name`: name of the database table where the record was located. -- `primary_key_value`: the ID of the record, the value will be present in the child tables as +- `primary_key_value`: the ID of the record, the value is present in the child tables as the foreign key value. At the moment, composite primary keys are not supported, the parent table must have an `id` column. - `status`: defaults to pending, represents the status of the cleanup process. @@ -532,7 +532,7 @@ several runs. #### Database decomposition -The `loose_foreign_keys_deleted_records` table will exist on both database servers (Ci and Main) +The `loose_foreign_keys_deleted_records` table exists on both database servers (`ci` and `main`) after the [database decomposition](https://gitlab.com/groups/gitlab-org/-/epics/6168). The worker will determine which parent tables belong to which database by reading the `lib/gitlab/database/gitlab_schemas.yml` YAML file. Example: - `ci_builds` - `ci_pipelines` -When the worker is invoked for the Ci database, the worker will load deleted records only from the +When the worker is invoked for the `ci` database, the worker loads deleted records only from the `ci_builds` and `ci_pipelines` tables. During the cleanup process, `DELETE` and `UPDATE` queries -will mostly run on tables located in the Main database. In this example, one `UPDATE` query will -nullify the `merge_requests.head_pipeline_id` column. +mostly run on tables located in the Main database. In this example, one `UPDATE` query +nullifies the `merge_requests.head_pipeline_id` column. #### Database partitioning @@ -561,7 +561,7 @@ strategy was considered for the feature but due to the large data volume we deci new strategy. A deleted record is considered fully processed when all its direct child records have been -cleaned up. When this happens, the loose foreign key worker will update the `status` column of +cleaned up. When this happens, the loose foreign key worker updates the `status` column of the deleted record. After this step, the record is no longer needed.
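Schematically, that bookkeeping step looks like the following statement. This is illustrative only: the columns are the ones listed above, but the concrete status codes and row values are assumptions.

```sql
-- Mark a deleted parent record as fully processed once all of its
-- direct child records have been cleaned up.
UPDATE loose_foreign_keys_deleted_records
SET status = 2 -- assumed code for "processed"; "pending" is the default
WHERE fully_qualified_table_name = 'public.ci_pipelines'
  AND primary_key_value = 42;
```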
The sliding partitioning strategy provides an efficient way of cleaning up old, unused data by @@ -591,7 +591,7 @@ Partitions: gitlab_partitions_dynamic.loose_foreign_keys_deleted_records_84 FOR ``` The `partition` column controls the insert direction, the `partition` value determines which -partition will get the deleted rows inserted via the trigger. Notice that the default value of +partition gets the deleted rows inserted via the trigger. Notice that the default value of the `partition` column matches the value of the list partition (84). In the `INSERT` query within the trigger, the value of `partition` is omitted; the trigger always relies on the default value of the column. @@ -607,20 +607,20 @@ SELECT TG_TABLE_SCHEMA || '.' || TG_TABLE_NAME, old_table.id FROM old_table; The partition "sliding" process is controlled by two regularly executed callbacks. These callbacks are defined within the `LooseForeignKeys::DeletedRecord` model. -The `next_partition_if` callback controls when to create a new partition. A new partition will -be created when the current partition has at least one record older than 24 hours. A new partition +The `next_partition_if` callback controls when to create a new partition. A new partition is +created when the current partition has at least one record older than 24 hours. A new partition is added by the [`PartitionManager`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/partitioning/partition_manager.rb) using the following steps: 1. Create a new partition, where the `VALUE` for the partition is `CURRENT_PARTITION + 1`. 1. Update the default value of the `partition` column to `CURRENT_PARTITION + 1`. -With these steps, new `INSERT`-s via the triggers will end up in the new partition. At this point, +With these steps, all new `INSERT` queries via the triggers end up in the new partition. At this point, the database table has two partitions. The `detach_partition_if` callback determines if the old partitions can be detached from the table. A partition is detachable if there are no pending (unprocessed) records in the partition -(`status = 1`). The detached partitions will be available for some time, you can see the list +(`status = 1`). The detached partitions are available for some time, you can see the list of detached partitions in the `detached_partitions` table: ```sql @@ -663,7 +663,7 @@ WHERE ("merge_requests"."id") IN These queries are batched, which means that in many cases, several invocations are needed to clean up all associated child records. -The batching is implemented with loops, the processing will stop when all associated child records +The batching is implemented with loops, the processing stops when all associated child records are cleaned up or the limit is reached. ```ruby @@ -682,14 +682,14 @@ end The loop-based batch processing is preferred over `EachBatch` for the following reasons: -- The records in the batch are modified, so the next batch will contain different records. +- The records in the batch are modified, so the next batch contains different records. - There is always an index on the foreign key column however, the column is usually not unique. `EachBatch` requires a unique column for the iteration. - The record order doesn't matter for the cleanup. -Notice that we have two loops. The initial loop will process records with the `SKIP LOCKED` clause. -The query will skip rows that are locked by other application processes. This will ensure that the -cleanup worker will less likely to become blocked.
The second loop will execute the database +Notice that we have two loops. The initial loop processes records with the `SKIP LOCKED` clause. +The query skips rows that are locked by other application processes. This ensures that the +cleanup worker is less likely to become blocked. The second loop executes the database queries without `SKIP LOCKED` to ensure that all records have been processed. #### Processing limits @@ -709,19 +709,19 @@ To mitigate these issues, several limits are applied when the worker runs. The limit rules are implemented in the `LooseForeignKeys::ModificationTracker` class. When one of the limits (record modification count, time limit) is reached the processing is stopped -immediately. After some time, the next scheduled worker will continue the cleanup process. +immediately. After some time, the next scheduled worker continues the cleanup process. #### Performance characteristics -The database trigger on the parent tables will **decrease** the record deletion speed. Each -statement that removes rows from the parent table will invoke the trigger to insert records +The database trigger on the parent tables **decreases** the record deletion speed. Each +statement that removes rows from the parent table invokes the trigger to insert records into the `loose_foreign_keys_deleted_records` table. The queries within the cleanup worker are fairly efficient index scans, with limits in place they're unlikely to affect other parts of the application. The database queries are not running in transaction, when an error happens for example a statement -timeout or a worker crash, the next job will continue the processing. +timeout or a worker crash, the next job continues the processing. ## Troubleshooting @@ -730,13 +730,13 @@ timeout or a worker crash, the next job will continue the processing. There can be cases where the workers need to process an unusually large amount of data. This can happen under normal usage, for example when a large project or group is deleted. In this scenario, there can be several million rows to be deleted or nullified. Due to the limits enforced by the -worker, processing this data will take some time. +worker, processing this data takes some time. When cleaning up "heavy-hitters", the feature ensures fair processing by rescheduling larger batches for later. This gives time for other deleted records to be processed. For example, a project with millions of `ci_builds` records is deleted. The `ci_builds` records -will be deleted by the loose foreign keys feature. +is deleted by the loose foreign keys feature. 1. The cleanup worker is scheduled and picks up a batch of deleted `projects` records. The large project is part of the batch. @@ -746,7 +746,7 @@ project is part of the batch. 1. Go to step 1. The next cleanup worker continues the cleanup. 1. When the `cleanup_attempts` reaches 3, the batch is re-scheduled 10 minutes later by updating the `consume_after` column. -1. The next cleanup worker will process a different batch. +1. The next cleanup worker processes a different batch. We have Prometheus metrics in place to monitor the deleted record cleanup: @@ -812,7 +812,7 @@ runtime. LooseForeignKeys::CleanupWorker.new.perform ``` -When the cleanup is done, the older partitions will be automatically detached by the +When the cleanup is done, the older partitions are automatically detached by the `PartitionManager`. 
### PartitionManager bug diff --git a/doc/development/database/maintenance_operations.md b/doc/development/database/maintenance_operations.md index 9e7a35531ca..85df185c024 100644 --- a/doc/development/database/maintenance_operations.md +++ b/doc/development/database/maintenance_operations.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/migrations_for_multiple_databases.md b/doc/development/database/migrations_for_multiple_databases.md index ce326a6ce4a..df9607f5672 100644 --- a/doc/development/database/migrations_for_multiple_databases.md +++ b/doc/development/database/migrations_for_multiple_databases.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -13,11 +13,6 @@ for [the decomposed GitLab application using multiple databases](https://gitlab. Learn more about general multiple databases support in a [separate document](multiple_databases.md). -WARNING: -If you experience any issues using `Gitlab::Database::Migration[2.0]`, -you can temporarily revert back to the previous behavior by changing the version to `Gitlab::Database::Migration[1.0]`. -Please report any issues with `Gitlab::Database::Migration[2.0]` in [this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/358430). - The design for multiple databases (except for the Geo database) assumes that all decomposed databases have **the same structure** (for example, schema), but **the data is different** in each database. This means that some tables do not contain data on each database. @@ -78,6 +73,30 @@ class AddUserIdAndStateIndexToMergeRequestReviewers < Gitlab::Database::Migratio end ``` +#### Example: Add a new table to store in a single database + +1. Define the [GitLab Schema](multiple_databases.md#gitlab-schema) of the table in [`lib/gitlab/database/gitlab_schemas.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/database/gitlab_schemas.yml): + + ```yaml + ssh_signatures: :gitlab_main + ``` + +1. Create the table in a schema migration: + + ```ruby + class CreateSshSignatures < Gitlab::Database::Migration[2.0] + def change + create_table :ssh_signatures do |t| + t.timestamps_with_timezone null: false + t.bigint :project_id, null: false, index: true + t.bigint :key_id, null: false, index: true + t.integer :verification_status, default: 0, null: false, limit: 2 + t.binary :commit_sha, null: false, index: { unique: true } + end + end + end + ``` + ### Data Manipulation Language (DML) The DML migrations are all migrations that: @@ -241,7 +260,7 @@ the `database_tasks: false` set. `gitlab:db:validate_config` always runs before ## Validation -Validation in a nutshell uses [pg_query](https://github.com/pganalyze/pg_query) to analyze +Validation in a nutshell uses [`pg_query`](https://github.com/pganalyze/pg_query) to analyze each query and classify tables with information from [`gitlab_schema.yml`](multiple_databases.md#gitlab-schema). The migration is skipped if the specified `gitlab_schema` is outside of a list of schemas managed by a given database connection (`Gitlab::Database::gitlab_schemas_for_connection`). 
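In pseudocode, that skip decision reads roughly as follows. Only `gitlab_schemas_for_connection` is named on this page; the surrounding variable names are made up for illustration.

```ruby
# Schemas the current connection manages,
# for example [:gitlab_ci, :gitlab_shared] on a ci: connection.
allowed_schemas = Gitlab::Database.gitlab_schemas_for_connection(connection)

# migration_schema would come from restrict_gitlab_migration.
run_migration = allowed_schemas.include?(migration_schema)
```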
@@ -408,7 +427,7 @@ updating all `ci_pipelines`, you would set As with all DML migrations, you cannot query another database outside of `restrict_gitlab_migration` or `gitlab_shared`. If you need to query another database, -you'll likely need to separate these into two migrations somehow. +separate the migrations. Because the actual migration logic (not the queueing step) for background migrations runs in a Sidekiq worker, the logic can perform DML queries on diff --git a/doc/development/database/multiple_databases.md b/doc/development/database/multiple_databases.md index c622d4f50ff..7badd7f76fa 100644 --- a/doc/development/database/multiple_databases.md +++ b/doc/development/database/multiple_databases.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Sharding info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -23,7 +23,8 @@ Each table of GitLab needs to have a `gitlab_schema` assigned: - `gitlab_main`: describes all tables that are being stored in the `main:` database (for example, like `projects`, `users`). - `gitlab_ci`: describes all CI tables that are being stored in the `ci:` database (for example, `ci_pipelines`, `ci_builds`). -- `gitlab_shared`: describe all application tables that contain data across all decomposed databases (for example, `loose_foreign_keys_deleted_records`). +- `gitlab_shared`: describes all application tables that contain data across all decomposed databases (for example, `loose_foreign_keys_deleted_records`) for models that inherit from `Gitlab::Database::SharedModel`. +- `gitlab_internal`: describes all internal tables of Rails and PostgreSQL (for example, `ar_internal_metadata`, `schema_migrations`, `pg_*`). - `...`: more schemas to be introduced with additional decomposed databases The usage of schema enforces the base class to be used: @@ -44,10 +45,8 @@ This is used as a primary source of classification for: ### The special purpose of `gitlab_shared` -`gitlab_shared` is a special case describing tables or views that by design contain data across -all decomposed databases. This does describe application-defined tables (like `loose_foreign_keys_deleted_records`), -Rails-defined tables (like `schema_migrations` or `ar_internal_metadata` as well as internal PostgreSQL tables -(for example, `pg_attribute`). +`gitlab_shared` is a special case that describes tables or views that, by design, contain data across +all decomposed databases. This classification describes application-defined tables (like `loose_foreign_keys_deleted_records`). **Be careful** to use `gitlab_shared` as it requires special handling while accessing data. Since `gitlab_shared` shares not only structure but also data, the application needs to be written in a way @@ -62,6 +61,11 @@ end As such, migrations modifying data of `gitlab_shared` tables are expected to run across all decomposed databases. +### The special purpose of `gitlab_internal` + +`gitlab_internal` describes Rails-defined tables (like `schema_migrations` or `ar_internal_metadata`), as well as internal PostgreSQL tables (for example, `pg_attribute`). Its primary purpose is to [support other databases](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/85842#note_943453682), like Geo, that +might be missing some of those application-defined `gitlab_shared` tables (like `loose_foreign_keys_deleted_records`), but are valid Rails databases.
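Tying the special schemas back to application code: tables classified as `gitlab_shared` are backed by models that inherit from `Gitlab::Database::SharedModel`, as mentioned above. A minimal sketch (the `table_name` line is inferred from the table's name, not quoted from this page):

```ruby
module LooseForeignKeys
  # SharedModel lets the record operate on whichever decomposed
  # database connection is currently in use.
  class DeletedRecord < Gitlab::Database::SharedModel
    self.table_name = 'loose_foreign_keys_deleted_records'
  end
end
```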
+ ## Migrations Read [Migrations for Multiple Databases](migrations_for_multiple_databases.md). @@ -597,3 +601,21 @@ way to replace cascading deletes so we don't end up with orphaned data or records that point to nowhere, which might lead to bugs. As such we created ["loose foreign keys"](loose_foreign_keys.md) which is an asynchronous process of cleaning up orphaned records. + +## Locking writes on the tables that don't belong to the database schemas + +When the CI database is promoted and the two databases are fully split, +as an extra safeguard against creating a split-brain situation, +run the Rake task `gitlab:db:lock_writes`. This command locks writes on: + +- The `gitlab_main` tables on the CI Database. +- The `gitlab_ci` tables on the Main Database. + +This Rake task adds triggers to all the tables to prevent any +`INSERT`, `UPDATE`, `DELETE`, or `TRUNCATE` statements from running +against the tables that need to be locked. + +If this task is run against a GitLab setup that uses only a single database +for both `gitlab_main` and `gitlab_ci` tables, then no tables are locked. + +To undo the operation, run the opposite Rake task: `gitlab:db:unlock_writes`. diff --git a/doc/development/database/not_null_constraints.md b/doc/development/database/not_null_constraints.md index af7d569e282..3962307f80d 100644 --- a/doc/development/database/not_null_constraints.md +++ b/doc/development/database/not_null_constraints.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -99,8 +99,8 @@ such records, so we would follow the same process either way. We first add the `NOT NULL` constraint with a `NOT VALID` parameter, which enforces consistency when new records are inserted or current records are updated. -In the example above, the existing epics with a `NULL` description will not be affected and you'll -still be able to update records in the `epics` table. However, when you try to update or insert +In the example above, the existing epics with a `NULL` description are not affected and you are +still able to update records in the `epics` table. However, when you try to update or insert an epic without providing a description, the constraint causes a database error. Adding or removing a `NOT NULL` clause requires that any application changes are deployed _first_. @@ -129,7 +129,7 @@ end #### Data migration to fix existing records (current release) The approach here depends on the data volume and the cleanup strategy. The number of records that -must be fixed on GitLab.com is a nice indicator that will help us decide whether to use a +must be fixed on GitLab.com is a nice indicator that helps us decide whether to use a post-deployment migration or a background data migration: - If the data volume is less than `1000` records, then the data migration can be executed within the post-migration. @@ -138,7 +138,7 @@ post-deployment migration or a background data migration: When unsure about which option to use, please contact the Database team for advice.
Back to our example, the epics table is neither considerably large nor frequently accessed, -so we are going to add a post-deployment migration for the 13.0 milestone (current), +so we add a post-deployment migration for the 13.0 milestone (current), `db/post_migrate/20200501000002_cleanup_epics_with_null_description.rb`: ```ruby @@ -173,7 +173,7 @@ end #### Validate the `NOT NULL` constraint (next release) -Validating the `NOT NULL` constraint will scan the whole table and make sure that each record is correct. +Validating the `NOT NULL` constraint scans the whole table and makes sure that each record is correct. Still in our example, for the 13.1 milestone (next), we run the `validate_not_null_constraint` migration helper in a final post-deployment migration, @@ -196,11 +196,11 @@ end ## `NOT NULL` constraints on large tables If you have to clean up a nullable column for a [high-traffic table](../migration_style_guide.md#high-traffic-tables) -(for example, the `artifacts` in `ci_builds`), your background migration will go on for a while and -it will need an additional [background migration cleaning up](background_migrations.md#cleaning-up) +(for example, the `artifacts` in `ci_builds`), your background migration goes on for a while and +it needs an additional [background migration cleaning up](background_migrations.md#cleaning-up) in the release after adding the data migration. -In that rare case you will need 3 releases end-to-end: +In that rare case you need 3 releases end-to-end: 1. Release `N.M` - Add the `NOT NULL` constraint and the background-migration to fix the existing records. 1. Release `N.M+1` - Cleanup the background migration. diff --git a/doc/development/database/pagination_guidelines.md b/doc/development/database/pagination_guidelines.md index 08840124535..1641708ce01 100644 --- a/doc/development/database/pagination_guidelines.md +++ b/doc/development/database/pagination_guidelines.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -28,9 +28,9 @@ We have two options for rendering the content: Rendering long lists can significantly affect both the frontend and backend performance: -- The database will need to read a lot of data from the disk. -- The result of the query (records) will eventually be transformed to Ruby objects which increases memory allocation. -- Large responses will take more time to send over the wire, to the user's browser. +- The database reads a lot of data from the disk. +- The result of the query (records) is eventually transformed to Ruby objects which increases memory allocation. +- Large responses take more time to send over the wire, to the user's browser. - Rendering long lists might freeze the browser (bad user experience). With pagination, the data is split into equal pieces (pages). On the first visit, the user receives only a limited number of items (page size). The user can see more items by paginating forward which results in a new HTTP request and a new database query. @@ -127,17 +127,17 @@ We can produce the same query in Rails: Issue.where(project_id: 1).page(1).per(20) ``` -The SQL query will return a maximum of 20 rows from the database. However, it doesn't mean that the database will only read 20 rows from the disk to produce the result. +The SQL query returns a maximum of 20 rows from the database.
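For reference, the Rails call above translates to SQL along these lines (a simplified sketch; Kaminari derives the `LIMIT` and `OFFSET` values from the page number and page size):

```sql
SELECT "issues".*
FROM "issues"
WHERE "issues"."project_id" = 1
LIMIT 20 OFFSET 0;
```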
However, it doesn't mean that the database only reads 20 rows from the disk to produce the result. -This is what will happen: +This is what happens: -1. The database will try to plan the execution in the most efficient way possible based on the table statistics and the available indexes. +1. The database tries to plan the execution in the most efficient way possible based on the table statistics and the available indexes. 1. The planner knows that we have an index covering the `project_id` column. -1. The database will read all rows using the index on `project_id`. -1. The rows at this point are not sorted, so the database will need to sort the rows. +1. The database reads all rows using the index on `project_id`. +1. The rows at this point are not sorted, so the database sorts the rows. 1. The database returns the first 20 rows. -In case the project has 10_000 rows, the database will read 10_000 rows and sort them in memory (or on disk). This is not going to scale well in the long term. +In case the project has 10,000 rows, the database reads 10,000 rows and sorts them in memory (or on disk). This does not scale well in the long term. To fix this we need the following index: @@ -145,16 +145,16 @@ CREATE INDEX index_on_issues_project_id ON issues (project_id, id); ``` -By making the `id` column part of the index, the previous query will read maximum 20 rows. The query will perform well regardless of the number of issues within a project. So with this change, we've also improved the initial page load (when the user loads the issue page). +By making the `id` column part of the index, the previous query reads a maximum of 20 rows. The query performs well regardless of the number of issues within a project. So with this change, we've also improved the initial page load (when the user loads the issue page). NOTE: -Here we're leveraging the ordered property of the b-tree database index. Values in the index are sorted so reading 20 rows will not require further sorting. +Here we're leveraging the ordered property of the b-tree database index. Values in the index are sorted so reading 20 rows does not require further sorting. #### Limitations ##### `COUNT(*)` on a large dataset -Kaminari by default executes a count query to determine the number of pages for rendering the page links. Count queries can be quite expensive for a large table, in an unfortunate scenario the queries will simply time out. +Kaminari by default executes a count query to determine the number of pages for rendering the page links. Count queries can be quite expensive for a large table. In an unfortunate scenario the queries simply time out. To work around this, we can run Kaminari without invoking the count SQL query. @@ -162,11 +162,11 @@ Issue.where(project_id: 1).page(1).per(20).without_count ``` -In this case, the count query will not be executed and the pagination will no longer render the page numbers. We'll see only the next and previous links. +In this case, the count query is not executed and the pagination no longer renders the page numbers. We see only the next and previous links. ##### `OFFSET` on a large dataset -When we paginate over a large dataset, we might notice that the response time will get slower and slower. +When we paginate over a large dataset, we might notice that the response time gets slower and slower.
This is due to the `OFFSET` clause that seeks through the rows and skips N rows. From the user's point of view, this might not always be noticeable. As the user paginates forward, the previous rows might still be in the buffer cache of the database. If the user shares the link with someone else and it's opened after a few minutes or hours, the response time might be significantly higher or it would even time out. @@ -214,7 +214,7 @@ Limit (cost=137878.89..137881.65 rows=20 width=1309) (actual time=5523.588..552 (8 rows) ``` -We can argue that a normal user will not be going to visit these pages, however, API users could easily navigate to very high page numbers (scraping, collecting data). +We can argue that a normal user does not visit these pages, however, API users could easily navigate to very high page numbers (scraping, collecting data). ### Keyset pagination @@ -279,7 +279,7 @@ eyJpZCI6Ijk0NzMzNTk0IiwidXBkYXRlZF9hdCI6IjIwMjEtMDQtMDkgMDg6NTA6MDUuODA1ODg0MDAw ``` NOTE: -Pagination parameters will be visible to the user, so we need to be careful about which columns we order by. +Pagination parameters are visible to the user, so be careful about which columns we order by. Keyset pagination can only provide the next, previous, first, and last pages. @@ -302,7 +302,7 @@ LIMIT 20 ##### Tooling -A generic keyset pagination library is available within the GitLab project which can most of the cases easily replace the existing, kaminari based pagination with significant performance improvements when dealing with large datasets. +A generic keyset pagination library is available within the GitLab project which can, in most cases, easily replace the existing Kaminari-based pagination with significant performance improvements when dealing with large datasets. Example: diff --git a/doc/development/database/pagination_performance_guidelines.md b/doc/development/database/pagination_performance_guidelines.md index 90e4faf2de7..b5040e499e4 100644 --- a/doc/development/database/pagination_performance_guidelines.md +++ b/doc/development/database/pagination_performance_guidelines.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -55,13 +55,13 @@ LIMIT 20 OFFSET 0 ``` -With PostgreSQL version 11, the planner will first look up all issues matching the `project_id` filter and then join all `issue_metrics` rows. The ordering of rows will happen in memory. In case the joined relation is always present (1:1 relationship), the database will read `N * 2` rows where N is the number of rows matching the `project_id` filter. +With PostgreSQL version 11, the planner first looks up all issues matching the `project_id` filter and then joins all `issue_metrics` rows. The ordering of rows happens in memory. In case the joined relation is always present (1:1 relationship), the database reads `N * 2` rows where N is the number of rows matching the `project_id` filter. For performance reasons, we should avoid mixing columns from different tables when specifying the `ORDER BY` clause. -In this particular case there is no simple way (like index creation) to improve the query. We might think that changing the `issues.id` column to `issue_metrics.issue_id` will help, however, this will likely make the query perform worse because it might force the database to process all rows in the `issue_metrics` table.
+In this particular case there is no simple way (like index creation) to improve the query. We might think that changing the `issues.id` column to `issue_metrics.issue_id` helps, however, this likely makes the query perform worse because it might force the database to process all rows in the `issue_metrics` table. -One idea to address this problem is denormalization. Adding the `project_id` column to the `issue_metrics` table will make the filtering and sorting efficient: +One idea to address this problem is denormalization. Adding the `project_id` column to the `issue_metrics` table makes the filtering and sorting efficient: ```sql SELECT issues.* FROM issues @@ -73,7 +73,7 @@ OFFSET 0 ``` NOTE: -The query will require an index on `issue_metrics` table with the following column configuration: `(project_id, first_mentioned_in_commit_at DESC, issue_id DESC)`. +The query requires an index on the `issue_metrics` table with the following column configuration: `(project_id, first_mentioned_in_commit_at DESC, issue_id DESC)`. ## Filtering @@ -81,7 +81,7 @@ The query will require an index on `issue_metrics` table with the following colu ### By project Filtering by a project is a very common use case since we have many features on the project level. Examples: merge requests, issues, boards, iterations. -These features will have a filter on `project_id` in their base query. Loading issues for a project: +These features have a filter on `project_id` in their base query. Loading issues for a project: ```ruby project = Project.find(5) @@ -108,9 +108,9 @@ This index fully covers the database query and the pagination. ### By group -Unfortunately, there is no efficient way to sort and paginate on the group level. The database query execution time will increase based on the number of records in the group. +Unfortunately, there is no efficient way to sort and paginate on the group level. The database query execution time increases based on the number of records in the group. -Things get worse when group level actually means group and its subgroups. To load the first page, the database needs to look up the group hierarchy, find all projects and then look up all issues. +Things get worse when group level actually means group and its subgroups. To load the first page, the database looks up the group hierarchy, finds all projects, and then looks up all issues. The main reason behind the inefficient queries on the group level is the way our database schema is designed; our core domain models are associated with a project, and projects are associated with groups. This doesn't mean that the database structure is bad; it's just in a well-normalized form that is not optimized for efficient group level queries. We might need to look into denormalization in the long term. @@ -184,7 +184,7 @@ LIMIT 20 OFFSET 0 ``` -Keep in mind that the index above will not support the following project level query: +The index above does not support the following project level query: ```sql SELECT "issues".* @@ -213,7 +213,7 @@ OFFSET 0 We might be tempted to add an index on `project_id`, `confidential`, and `iid` to improve the database query, however, in this case it's probably unnecessary. Based on the data distribution in the table, confidential issues are rare. Filtering them out does not make the database query significantly slower. The database might read a few extra rows; the performance difference might not even be visible to the end-user.
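Should such a confidential-only filter ever be needed (as discussed next), a partial index is one possible approach; the index name and column choice below are illustrative only:

```sql
-- Index only the rare confidential rows, so the common
-- non-confidential queries pay no extra write or storage cost.
CREATE INDEX index_issues_on_project_id_iid_confidential
ON issues (project_id, iid)
WHERE confidential = true;
```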
-On the other hand, if we would implement a special filter where we only show confidential issues, we will surely need the index. Finding 20 confidential issues might require the database to scan hundreds of rows or in the worst case, all issues in the project. +On the other hand, if we implemented a special filter where we only show confidential issues, we need the index. Finding 20 confidential issues might require the database to scan hundreds of rows or, in the worst case, all issues in the project. NOTE: Be aware of the data distribution and the table access patterns (how features work) when introducing a new database index. Sampling production data might be necessary to make the right decision. @@ -253,7 +253,7 @@ Example database (oversimplified) execution plan: - `SELECT "issues".* FROM "issues" WHERE "issues"."project_id" = 5` 1. The database estimates the number of rows and the costs to run these queries. 1. The database executes the cheapest query first. -1. Using the query result, load the rows from the other table (from the other query) using the JOIN column and filter the rows further. +1. Using the query result, load the rows from the other table (from the other query) using the `JOIN` column and filter the rows further. In this particular example, the `issue_assignees` query would likely be executed first. @@ -276,17 +276,17 @@ Running the query in production for the GitLab project produces the following ex (13 rows) ``` -The query looks up the `assignees` first, filtered by the `user_id` (`user_id = 4156052`) and it finds 215 rows. Using that 215 rows, the database will look up the 215 associated issue rows by the primary key. Notice that the filter on the `project_id` column is not backed by an index. +The query looks up the `assignees` first, filtered by the `user_id` (`user_id = 4156052`) and it finds 215 rows. Using those 215 rows, the database looks up the 215 associated issue rows by the primary key. Notice that the filter on the `project_id` column is not backed by an index. -In most cases, we are lucky that the joined relation will not be going to return too many rows, therefore, we will end up with a relatively efficient database query that accesses low number of rows. As the database grows, these queries might start to behave differently. Let's say the number `issue_assignees` records for a particular user is very high (millions), then this join query will not perform well, and it will likely time out. +In most cases, we are lucky that the joined relation does not return too many rows, therefore, we end up with a relatively efficient database query that accesses a small number of rows. As the database grows, these queries might start to behave differently. Let's say the number of `issue_assignees` records for a particular user is very high, in the millions. This join query does not perform well, and it likely times out. -A similar problem could be a double join, where the filter exists in the 2nd JOIN query. Example: `Issue -> LabelLink -> Label(name=bug)`. +A similar problem could be a double join, where the filter exists in the 2nd `JOIN` query. Example: `Issue -> LabelLink -> Label(name=bug)`. There is no easy way to fix these problems. Denormalization of data could help significantly, however, it also has negative effects (data duplication and keeping the data up to date). Ideas for improving the `issue_assignees` filter: -- Add `project_id` column to the `issue_assignees` table so when JOIN-ing, the extra `project_id` filter will further filter the rows.
The sorting will likely happen in memory: +- Add a `project_id` column to the `issue_assignees` table so when performing the `JOIN`, the extra `project_id` filter further filters the rows. The sorting likely happens in memory: ```sql SELECT "issues".* diff --git a/doc/development/database/post_deployment_migrations.md b/doc/development/database/post_deployment_migrations.md index 799eefdb875..a49c77ca047 100644 --- a/doc/development/database/post_deployment_migrations.md +++ b/doc/development/database/post_deployment_migrations.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/rename_database_tables.md b/doc/development/database/rename_database_tables.md index 7a76c028042..cbcbd507204 100644 --- a/doc/development/database/rename_database_tables.md +++ b/doc/development/database/rename_database_tables.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -135,4 +135,4 @@ database, ActiveRecord fetches the column information again. At this time, our previously marked table (`TABLES_TO_BE_RENAMED`) instructs ActiveRecord to use the new database table name when fetching the database table information. -The new version of the application will use the new database table. +The new version of the application uses the new database table. diff --git a/doc/development/database/setting_multiple_values.md b/doc/development/database/setting_multiple_values.md index 0f23aae9f79..cba15a73430 100644 --- a/doc/development/database/setting_multiple_values.md +++ b/doc/development/database/setting_multiple_values.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/database/strings_and_the_text_data_type.md b/doc/development/database/strings_and_the_text_data_type.md index 7aa529e1518..73e023f8d45 100644 --- a/doc/development/database/strings_and_the_text_data_type.md +++ b/doc/development/database/strings_and_the_text_data_type.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -8,7 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w > [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/30453) in GitLab 13.0. -When adding new columns that will be used to store strings or other textual information: +When adding new columns to store strings or other textual information: 1. We always use the `text` data type instead of the `string` data type. 1. `text` columns should always have a limit set, either by using the `create_table` with @@ -142,8 +142,8 @@ instance of GitLab could have such records, so we would follow the same process We first add the limit as a `NOT VALID` check constraint to the table, which enforces consistency when new records are inserted or current records are updated.
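For intuition, the helper-generated constraint boils down to SQL along these lines (the constraint name is hypothetical):

```sql
-- NOT VALID: the limit is enforced for new inserts and updates,
-- but existing rows are not scanned until a later validation step.
ALTER TABLE issues
ADD CONSTRAINT check_issues_title_html_limit
CHECK (char_length(title_html) <= 1024) NOT VALID;
```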
-In the example above, the existing issues with more than 1024 characters in their title will not be -affected and you'll be still able to update records in the `issues` table. However, when you'd try +In the example above, the existing issues with more than 1024 characters in their title are not +affected, and you are still able to update records in the `issues` table. However, when you try to update the `title_html` with a title that has more than 1024 characters, the constraint causes a database error. @@ -182,7 +182,7 @@ end #### Data migration to fix existing records (current release) The approach here depends on the data volume and the cleanup strategy. The number of records that must -be fixed on GitLab.com is a nice indicator that will help us decide whether to use a post-deployment +be fixed on GitLab.com is a nice indicator that helps us decide whether to use a post-deployment migration or a background data migration: - If the data volume is less than `1,000` records, then the data migration can be executed within the post-migration. @@ -233,7 +233,7 @@ You can find more information on the guide about [background migrations](backgro #### Validate the text limit (next release) -Validating the text limit will scan the whole table and make sure that each record is correct. +Validating the text limit scans the whole table and makes sure that each record is correct. Still in our example, for the 13.1 milestone (next), we run the `validate_text_limit` migration helper in a final post-deployment migration, @@ -276,11 +276,11 @@ end ## Text limit constraints on large tables If you have to clean up a text column for a really [large table](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/rubocop-migrations.yml#L3) -(for example, the `artifacts` in `ci_builds`), your background migration will go on for a while and -it will need an additional [background migration cleaning up](background_migrations.md#cleaning-up) +(for example, the `artifacts` in `ci_builds`), your background migration goes on for a while and +it needs an additional [background migration cleaning up](background_migrations.md#cleaning-up) in the release after adding the data migration. -In that rare case you will need 3 releases end-to-end: +In that rare case you need 3 releases end-to-end: 1. Release `N.M` - Add the text limit and the background migration to fix the existing records. 1. Release `N.M+1` - Cleanup the background migration. diff --git a/doc/development/database/table_partitioning.md b/doc/development/database/table_partitioning.md index 34cb73978bc..582c988bef9 100644 --- a/doc/development/database/table_partitioning.md +++ b/doc/development/database/table_partitioning.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -36,23 +36,23 @@ before attempting to leverage this feature. While partitioning can be very useful when properly applied, it's imperative to identify if the data and workload of a table naturally fit a -partitioning scheme. There are a few details you'll have to understand -in order to decide if partitioning is a good fit for your particular +partitioning scheme. There are a few details you have to understand +to decide if partitioning is a good fit for your particular problem.
First, a table is partitioned on a partition key, which is a column or -set of columns which determine how the data will be split across the +set of columns which determine how the data is split across the partitions. The partition key is used by the database when reading or -writing data, to decide which partitions need to be accessed. The +writing data, to decide which partitions must be accessed. The partition key should be a column that would be included in a `WHERE` clause on almost all queries accessing that table. -Second, it's necessary to understand the strategy the database will -use to split the data across the partitions. The scheme supported by the +Second, it's necessary to understand the strategy the database uses +to split the data across the partitions. The scheme supported by the GitLab migration helpers is date-range partitioning, where each partition in the table contains data for a single month. In this case, the partitioning -key would need to be a timestamp or date column. In order for this type of -partitioning to work well, most queries would need to access data within a +key must be a timestamp or date column. In order for this type of +partitioning to work well, most queries must access data in a certain date range. For a more concrete example, the `audit_events` table can be used, which @@ -73,7 +73,7 @@ CREATE TABLE audit_events ( created_at timestamptz NOT NULL); ``` -Now imagine typical queries in the UI would display the data within a +Now imagine typical queries in the UI would display the data in a certain date range, like a single week: ```sql @@ -117,7 +117,7 @@ partition key falls in the specified range. For example, the partition greater than or equal to `2020-01-01` and less than `2020-02-01`. Now, if we look at the previous example query again, the database can -use the `WHERE` to recognize that all matching rows will be in the +use the `WHERE` to recognize that all matching rows are in the `audit_events_202001` partition. Rather than searching all of the data in all of the partitions, it can search only the single month's worth of data in the appropriate partition. In a large table, this can @@ -136,11 +136,11 @@ LIMIT 100 In this example, the database can't prune any partitions from the search, because matching data could exist in any of them. As a result, it has to query each partition individually, and aggregate the rows into a single result -set. Since `author_id` would be indexed, the performance impact could +set. Because `author_id` would be indexed, the performance impact could likely be acceptable, but on more complex queries the overhead can be substantial. Partitioning should only be leveraged if the access patterns -of the data support the partitioning strategy, otherwise performance will -suffer. +of the data support the partitioning strategy, otherwise performance +suffers. ## Partitioning a table @@ -158,15 +158,15 @@ migration to copy data into the new table. Changes to the original table schema can be made in parallel with the partitioning migration, but they must take care to not break the underlying mechanism that makes the migration work. For example, if a column is added to the table that is being -partitioned, both the partitioned table and the trigger definition need to +partitioned, both the partitioned table and the trigger definition must be updated to match. ### Step 1: Creating the partitioned copy (Release N) The first step is to add a migration to create the partitioned copy of -the original table. 
This migration will also create the appropriate +the original table. This migration creates the appropriate -partitions based on the data in the original table, and install a +partitions based on the data in the original table, and installs a -trigger that will sync writes from the original table into the +trigger that syncs writes from the original table into the partitioned copy. An example migration of partitioning the `audit_events` table by its @@ -186,15 +186,15 @@ class PartitionAuditEvents < Gitlab::Database::Migration[1.0] end ``` -Once this has executed, any inserts, updates or deletes in the -original table will also be duplicated in the new table. For updates and -deletes, the operation will only have an effect if the corresponding row +After this has executed, any inserts, updates, or deletes in the +original table are also duplicated in the new table. For updates and +deletes, the operation only has an effect if the corresponding row exists in the partitioned table. ### Step 2: Backfill the partitioned copy (Release N) -The second step is to add a post-deployment migration that will schedule -the background jobs that will backfill existing data from the original table +The second step is to add a post-deployment migration that schedules +the background jobs that backfill existing data from the original table into the partitioned copy. Continuing the above example, the migration would look like: @@ -225,7 +225,7 @@ partitioning migration. The third step must occur at least one release after the release that includes the background migration. This gives time for the background migration to execute properly in self-managed installations. In this step, -add another post-deployment migration that will cleanup after the +add another post-deployment migration that cleans up after the background migration. This includes forcing any remaining jobs to execute, and copying data that may have been missed due to dropped or failed jobs. @@ -248,12 +248,11 @@ end After this migration has completed, the original table and partitioned table should contain identical data. The trigger installed on the -original table guarantees that the data will remain in sync going -forward. +original table guarantees that the data remains in sync going forward. ### Step 4: Swap the partitioned and non-partitioned tables (Release N+1) -The final step of the migration will make the partitioned table ready +The final step of the migration makes the partitioned table ready for use by the application. This section will be updated when the migration helper is ready; for now, development can be followed in the [Tracking Issue](https://gitlab.com/gitlab-org/gitlab/-/issues/241267). diff --git a/doc/development/database/transaction_guidelines.md b/doc/development/database/transaction_guidelines.md index 2806bd217db..d96d11f05a5 100644 --- a/doc/development/database/transaction_guidelines.md +++ b/doc/development/database/transaction_guidelines.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -15,8 +15,8 @@ For further reference, check PostgreSQL documentation about [transactions](https The [sharding group](https://about.gitlab.com/handbook/engineering/development/enablement/sharding/) plans to split the main GitLab database and move some of the database tables to other database servers. -We'll start decomposing the `ci_*`-related database tables first.
To maintain the current application -development experience, we'll add tooling and static analyzers to the codebase to ensure correct +We start decomposing the `ci_*`-related database tables first. To maintain the current application +development experience, we add tooling and static analyzers to the codebase to ensure correct data access and data modification methods. By using the correct form for defining database transactions, we can save significant refactoring work in the future. @@ -60,7 +60,7 @@ end The database tries to acquire the `FOR UPDATE` lock for the referenced `issue` and `project` records. In our case, we have two competing transactions for these locks, -and only one of them will successfully acquire them. The other transaction will have +and only one of them successfully acquires them. The other transaction has to wait in the lock queue until the first transaction finishes. The execution of the second transaction is blocked at this point. @@ -139,5 +139,5 @@ end ``` The `ActiveRecord::Base` class uses a different database connection than the `Ci::Build` records. -The two statements in the transaction block will not be part of the transaction and will not be +The two statements in the transaction block are not part of the transaction and are not rolled back in case something goes wrong. They act as third-party calls. diff --git a/doc/development/database_debugging.md b/doc/development/database_debugging.md index 426d355bd82..5d46ade98bb 100644 --- a/doc/development/database_debugging.md +++ b/doc/development/database_debugging.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -9,7 +9,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w This section is to help give some copy-pasta you can use as a reference when you run into some head-banging database problems. -An easy first step is to search for your error in Slack, or search for `GitLab <my error>` with Google. +A first step is to search for your error in Slack, or search for `GitLab <my error>` with Google. Available `RAILS_ENV`: @@ -57,7 +57,7 @@ bundle exec rake db:reset RAILS_ENV=test Access the database via one of these commands (they all get you to the same place): -```ruby +```shell gdk psql -d gitlabhq_development bundle exec rails dbconsole -e development bundle exec rails db -e development @@ -72,7 +72,7 @@ bundle exec rails db -e development ## Access the database with a GUI -Most GUIs (DataGrid, RubyMine, DBeaver) require a TCP connection to the database, but by default +Most GUIs (DataGrip, RubyMine, DBeaver) require a TCP connection to the database, but by default the database runs on a UNIX socket. To be able to access the database from these tools, some steps are needed: @@ -106,8 +106,8 @@ Use these instructions for exploring the GitLab database while developing with t 1. Install or open [Visual Studio Code](https://code.visualstudio.com/download). 1. Install the [PostgreSQL VSCode Extension](https://marketplace.visualstudio.com/items?itemName=ckolkman.vscode-postgres). -1. In Visual Studio Code click on the PostgreSQL Explorer button in the left toolbar. -1. In the top bar of the new window, click on the `+` to **Add Database Connection**, and follow the prompts to fill in the details: +1. In Visual Studio Code select **PostgreSQL Explorer** in the left toolbar. +1.
In the top bar of the new window, select `+` to **Add Database Connection**, and follow the prompts to fill in the details: 1. **Hostname**: the path to the PostgreSQL folder in your GDK directory (for example `/dev/gitlab-development-kit/postgresql`). 1. **PostgreSQL user to authenticate as**: usually your local username, unless otherwise specified during PostgreSQL installation. 1. **Password of the PostgreSQL user**: the password you set when installing PostgreSQL. @@ -169,7 +169,7 @@ possible to migrate GitLab from every previous version. In some cases you may want to bypass this check. For example, if you were on a version of GitLab schema later than the `MIN_SCHEMA_VERSION`, and then rolled back -to an older migration, from before. In this case, in order to migrate forward again, +to an older migration from before. In this case, to migrate forward again, you should set the `SKIP_SCHEMA_VERSION_CHECK` environment variable. ```shell diff --git a/doc/development/database_query_comments.md b/doc/development/database_query_comments.md index e4133633a77..2798071bc06 100644 --- a/doc/development/database_query_comments.md +++ b/doc/development/database_query_comments.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -12,7 +12,7 @@ queries generated by ActiveRecord. It is very useful for tracing problematic queries back to the application source. -An engineer during an on-call incident will have the full context of a query +An engineer during an on-call incident has the full context of a query and its application source from the comments. ## Metadata information in comments @@ -24,7 +24,7 @@ Queries generated from **Rails** include the following metadata in comments: - `endpoint_id` - `line` -Queries generated from **Sidekiq** workers will include the following metadata +Queries generated from **Sidekiq** workers include the following metadata in comments: - `application` diff --git a/doc/development/database_review.md b/doc/development/database_review.md index fd0e2e17623..2b215190e6d 100644 --- a/doc/development/database_review.md +++ b/doc/development/database_review.md @@ -1,12 +1,12 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- # Database Review Guidelines -This page is specific to database reviews. Please refer to our +This page is specific to database reviews. Refer to our [code review guide](code_review.md) for broader advice and best practices for code review in general. @@ -25,7 +25,7 @@ A database review is required for: generally up to the author of a merge request to decide whether or not complex queries are being introduced and if they require a database review. -- Changes in Service Data metrics that use `count`, `distinct_count` and `estimate_batch_distinct_count`. +- Changes in Service Data metrics that use `count`, `distinct_count`, `estimate_batch_distinct_count`, and `sum`. These metrics could have complex queries over large tables. See the [Product Intelligence Guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/) for implementation details. @@ -39,7 +39,7 @@ migration only.
### Required You must provide the following artifacts when you request a ~database review. -If your merge request description does not include these items, the review will be reassigned back to the author. +If your merge request description does not include these items, the review is reassigned back to the author. #### Migrations @@ -47,7 +47,7 @@ If new migrations are introduced, in the MR **you are required to provide**: - The output of both migrating (`db:migrate`) and rolling back (`db:rollback`) for all migrations. -Note that we have automated tooling for +We have automated tooling for [GitLab](https://gitlab.com/gitlab-org/gitlab) (provided by the [`db:check-migrations`](database/dbcheck-migrations-job.md) pipeline job) that provides this output for migrations on ~database merge requests. You do not need to provide this information manually @@ -88,7 +88,7 @@ A database **maintainer**'s role is to: database reviewer and the MR author. - Finally approve the MR and relabel the MR with ~"database::approved" - Merge the MR if no other approvals are pending or pass it on to - other maintainers as required (frontend, backend, docs). + other maintainers as required (frontend, backend, documentation). - If not merging, remove yourself as a reviewer. ### Distributing review workload @@ -96,7 +96,7 @@ A database **maintainer**'s role is to: Review workload is distributed using [reviewer roulette](code_review.md#reviewer-roulette) ([example](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/25181#note_147551725)). The MR author should request a review from the suggested database -**reviewer**. When they give their sign-off, they will hand over to +**reviewer**. When they sign off, they hand over to the suggested database **maintainer**. If reviewer roulette didn't suggest a database reviewer & maintainer, @@ -106,7 +106,7 @@ make sure you have applied the ~database label and rerun the ### How to prepare the merge request for a database review -In order to make reviewing easier and therefore faster, please take +To make reviewing easier and therefore faster, take the following preparations into account. #### Preparation when adding migrations @@ -124,10 +124,10 @@ the following preparations into account. - When adding an index to a [large table](https://gitlab.com/gitlab-org/gitlab/-/blob/master/rubocop/rubocop-migrations.yml#L3), test its execution using `CREATE INDEX CONCURRENTLY` in the `#database-lab` Slack channel and add the execution time to the MR description: - Execution time largely varies between `#database-lab` and GitLab.com, but an elevated execution time from `#database-lab` - can give a hint that the execution on GitLab.com will also be considerably high. + can give a hint that the execution on GitLab.com is also considerably high. - If the execution from `#database-lab` is longer than `1h`, the index should be moved to a [post-migration](database/post_deployment_migrations.md). Keep in mind that in this case you may need to split the migration and the application changes in separate releases to ensure the index - will be in place when the code that needs it will be deployed. + is in place when the code that needs it is deployed. - Manually trigger the [database testing](database/database_migration_pipeline.md) job (`db:gitlabcom-database-testing`) in the `test` stage. - This job runs migrations in a production-like environment (similar to `#database_lab`) and posts to the MR its findings (queries, runtime, size change). - Review migration runtimes and any warnings. 
@@ -139,10 +139,10 @@ of error that would result in corruption or loss of production data. Include in the MR description: -- If the migration itself is not reversible, details of how data changes could be reverted in the event of an incident. For example, in the case of a migration that deletes records (an operation that most of the times is not automatically revertable), how _could_ the deleted records be recovered. +- If the migration itself is not reversible, details of how data changes could be reverted in the event of an incident. For example, in the case of a migration that deletes records (an operation that most of the time is not automatically reversible), how _could_ the deleted records be recovered. - If the migration deletes data, apply the label `~data-deletion`. - Concise descriptions of possible user experience impact of an error; for example, "Issues would unexpectedly go missing from Epics". -- Relevant data from the [query plans](#query-plans) that indicate the query works as expected; such as the approximate number of records that will be modified/deleted. +- Relevant data from the [query plans](#query-plans) that indicate the query works as expected; such as the approximate number of records that are modified or deleted. #### Preparation when adding or modifying queries Include in the MR description: - Write the raw SQL in the MR description. Preferably formatted nicely with [pgFormatter](https://sqlformat.darold.net) or [paste.depesz.com](https://paste.depesz.com) and using regular quotes - <!-- vale off --> +<!-- vale gitlab.NonStandardQuotes = NO --> (for example, `"projects"."id"`) and avoiding smart quotes (for example, `“projects”.“id”`). - <!-- vale on --> +<!-- vale gitlab.NonStandardQuotes = YES --> - In case of queries generated dynamically by using parameters, there should be one raw SQL query for each variation. For example, a finder for issues that may take as a parameter an optional filter on projects, - should include both the version of the simple query over issues and the one that joins issues + should include both the version of the query over issues and the one that joins issues and projects and applies the filter. There are finders or other methods that can generate a very large number of permutations. @@ -167,7 +167,7 @@ Include in the MR description: For example, if joins or a group by clause are optional, the versions without the group by clause and with fewer joins should also be included, while keeping the appropriate filters for the remaining tables. -- If a query is going to be always used with a limit and an offset, those should always be +- If a query is always used with a limit and an offset, those should always be included with the maximum allowed limit used and a non-zero offset. ##### Query Plans - The query plan for each raw SQL query included in the merge request along with the link to the query plan following each raw SQL snippet. - Provide a public link to the plan from either: - [postgres.ai](https://postgres.ai/): Follow the link in `#database-lab` and generate a shareable, public link - by clicking the **Share** button in the upper right corner. + by clicking **Share** in the upper right corner. - [explain.depesz.com](https://explain.depesz.com) or [explain.dalibo.com](https://explain.dalibo.com): Paste both the plan and the query used in the form.
- When providing query plans, make sure the query hits enough data: - You can use a GitLab production replica to test your queries on a large scale, @@ -204,11 +204,11 @@ Include in the MR description: - Add foreign keys to any columns pointing to data in other tables, including [an index](migration_style_guide.md#adding-foreign-key-constraints). - Add indexes for fields that are used in statements such as `WHERE`, `ORDER BY`, `GROUP BY`, and `JOIN`s. - New tables and columns are not necessarily risky, but over time some access patterns are inherently - difficult to scale. To identify these risky patterns in advance, we need to document expectations for + difficult to scale. To identify these risky patterns in advance, we must document expectations for access and size. Include in the MR description answers to these questions: - What is the anticipated growth for the new table over the next 3 months, 6 months, 1 year? What assumptions are these based on? - How many reads and writes per hour would you expect this table to have in 3 months, 6 months, 1 year? Under what circumstances are rows updated? What assumptions are these based on? - - Based on the anticipated data volume and access patterns, does the new table pose an availability risk to GitLab.com or self-managed instances? Will the proposed design scale to support the needs of GitLab.com and self-managed customers? + - Based on the anticipated data volume and access patterns, does the new table pose an availability risk to GitLab.com or self-managed instances? Does the proposed design scale to support the needs of GitLab.com and self-managed customers? #### Preparation when removing columns, tables, indexes, or other structures @@ -235,7 +235,7 @@ Include in the MR description: - Check consistency with `db/structure.sql` and that migrations are [reversible](migration_style_guide.md#reversibility) - Check that the relevant version files under `db/schema_migrations` were added or removed. - Check query timing (if any): In a single transaction, cumulative query time executed in a migration - needs to fit comfortably within `15s` - preferably much less than that - on GitLab.com. + needs to fit comfortably in `15s` - preferably much less than that - on GitLab.com. - For column removals, make sure the column has been [ignored in a previous release](database/avoiding_downtime_in_migrations.md#dropping-columns) - Check [background migrations](database/background_migrations.md): - Establish a time estimate for execution on GitLab.com.
For historical purposes, @@ -266,7 +266,7 @@ Include in the MR description: - Query performance - Check for any overly complex queries and queries the author specifically points out for review (if any) - - If not present yet, ask the author to provide SQL queries and query plans + - If not present, ask the author to provide SQL queries and query plans (for example, by using [ChatOps](understanding_explain_plans.md#chatops) or direct database access) - For given queries, review parameters regarding data distribution diff --git a/doc/development/db_dump.md b/doc/development/db_dump.md index 0c63bf06e07..f2076cbc410 100644 --- a/doc/development/db_dump.md +++ b/doc/development/db_dump.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/deprecation_guidelines/img/deprecation_removal_process.png b/doc/development/deprecation_guidelines/img/deprecation_removal_process.png Binary files differnew file mode 100644 index 00000000000..99642ebbae0 --- /dev/null +++ b/doc/development/deprecation_guidelines/img/deprecation_removal_process.png diff --git a/doc/development/deprecation_guidelines/index.md b/doc/development/deprecation_guidelines/index.md index cafc40ccc68..7fbe2261f4d 100644 --- a/doc/development/deprecation_guidelines/index.md +++ b/doc/development/deprecation_guidelines/index.md @@ -11,13 +11,31 @@ changes](../contributing/index.md#breaking-changes) to GitLab features. ## Terminology -It's important to understand the difference between **deprecation** and -**removal**: +**Deprecation**: -**Deprecation** is the process of flagging/marking/announcing that a feature is no longer fully supported and may be removed in a future version of GitLab. +- Feature not recommended for use. +- Development restricted to Priority 1 / Severity 1 bug fixes. +- Will be removed in a future major release. +- Begins after a deprecation announcement outlining an end-of-support date. +- Ends after the end-of-support date or removal date has passed. -**Removal** is the process of actually removing a feature that was previously -deprecated. +**End of Support**: + +- Feature usage strongly discouraged. +- No support or fixes provided. +- No longer tested internally. +- Will be removed in a future major release. +- Begins after an end-of-support date has passed. +- Ends after all relevant code has been removed. + +**Removal**: + +- Feature usage impossible. +- Happens in a major release in line with our + [semantic versioning policy](../../policy/maintenance.md). +- Begins after removal date has passed. + +![Deprecation, End of Support, Removal process](img/deprecation_removal_process.png) ## When can a feature be deprecated? diff --git a/doc/development/developing_with_solargraph.md b/doc/development/developing_with_solargraph.md index 877fbad8ab2..d7e41187ace 100644 --- a/doc/development/developing_with_solargraph.md +++ b/doc/development/developing_with_solargraph.md @@ -8,7 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w Gemfile packages [Solargraph](https://github.com/castwide/solargraph) language server for additional IntelliSense and code formatting capabilities with editors that support it. 
-Example configuration for Solargraph can be found in [.solargraph.yml.example](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.solargraph.yml.example) file. Copy the contents of this file to `.solargraph.yml` file for language server to pick this configuration up. Since `.solargraph.yml` configuration file is ignored by Git, it's possible to adjust configuration according to your needs. +Example configuration for Solargraph can be found in the [`.solargraph.yml.example`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.solargraph.yml.example) file. Copy the contents of this file to `.solargraph.yml` for the language server to pick this configuration up. Since the `.solargraph.yml` configuration file is ignored by Git, it's possible to adjust the configuration according to your needs. Refer to particular IDE plugin documentation on how to integrate it with Solargraph language server: diff --git a/doc/development/distributed_tracing.md b/doc/development/distributed_tracing.md index b4f347449cc..116071cdfd9 100644 --- a/doc/development/distributed_tracing.md +++ b/doc/development/distributed_tracing.md @@ -71,7 +71,7 @@ GITLAB_TRACING=opentracing://<driver>?<param_name>=<param_value>&<param_name_2>= In this example, we have the following hypothetical values: -- `driver`: the driver such a jaegar. +- `driver`: the driver, such as Jaeger. - `param_name`, `param_value`: these are driver specific configuration values. Configuration parameters for Jaeger are documented [further on in this document](#2-configure-the-gitlab_tracing-environment-variable); they should be URL encoded. @@ -87,7 +87,7 @@ The easiest way to access tracing from a GDK environment is through the [performance-bar](../administration/monitoring/performance/performance_bar.md). This can be shown by typing `p` `b` in the browser window. -Once the performance bar is enabled, click on the **Trace** link in the performance bar to go to +Once the performance bar is enabled, select **Trace** in the performance bar to go to the Jaeger UI. The Jaeger search UI returns a query for the `Correlation-ID` of the current request. Normally, diff --git a/doc/development/documentation/feature_flags.md b/doc/development/documentation/feature_flags.md index c5ea1985fc7..89e54183e50 100644 --- a/doc/development/documentation/feature_flags.md +++ b/doc/development/documentation/feature_flags.md @@ -29,11 +29,11 @@ When the state of a flag changes (for example, disabled by default to enabled by Possible version history entries are: ```markdown -> - [Introduced](issue-link) in GitLab X.X [with a flag](../../administration/feature_flags.md) named <flag name>. Disabled by default. +> - [Introduced](issue-link) in GitLab X.X [with a flag](../../administration/feature_flags.md) named `flag_name`. Disabled by default. > - [Enabled on GitLab.com](issue-link) in GitLab X.X. > - [Enabled on GitLab.com](issue-link) in GitLab X.X. Available to GitLab.com administrators only. > - [Enabled on self-managed](issue-link) in GitLab X.X. -> - [Generally available](issue-link) in GitLab X.Y. [Feature flag <flag name>](issue-link) removed. +> - [Generally available](issue-link) in GitLab X.Y. [Feature flag `flag_name`](issue-link) removed. ``` You can combine entries if they happened in the same release: @@ -60,15 +60,15 @@ FLAG: | If the feature is... | Use this text | |--------------------------|---------------| -| Available | `On self-managed GitLab, by default this feature is available.
To hide the feature, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Unavailable | `On self-managed GitLab, by default this feature is not available. To make it available, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Available to some users | `On self-managed GitLab, by default this feature is available to a subset of users. To show or hide the feature for all, ask an administrator to [change the status of the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Available, per-group | `On self-managed GitLab, by default this feature is available. To hide the feature per group, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Unavailable, per-group | `On self-managed GitLab, by default this feature is not available. To make it available per group, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Available, per-project | `On self-managed GitLab, by default this feature is available. To hide the feature per project or for your entire instance, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Unavailable, per-project | `On self-managed GitLab, by default this feature is not available. To make it available per project or for your entire instance, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Available, per-user | `On self-managed GitLab, by default this feature is available. To hide the feature per user, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | -| Unavailable, per-user | `On self-managed GitLab, by default this feature is not available. To make it available per user, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named <flag name>.` | +| Available | ``On self-managed GitLab, by default this feature is available. To hide the feature, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Unavailable | ``On self-managed GitLab, by default this feature is not available. To make it available, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Available to some users | ``On self-managed GitLab, by default this feature is available to a subset of users. To show or hide the feature for all, ask an administrator to [change the status of the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Available, per-group | ``On self-managed GitLab, by default this feature is available. To hide the feature per group, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Unavailable, per-group | ``On self-managed GitLab, by default this feature is not available. To make it available per group, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Available, per-project | ``On self-managed GitLab, by default this feature is available. 
To hide the feature per project or for your entire instance, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Unavailable, per-project | ``On self-managed GitLab, by default this feature is not available. To make it available per project or for your entire instance, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Available, per-user | ``On self-managed GitLab, by default this feature is available. To hide the feature per user, ask an administrator to [disable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | +| Unavailable, per-user | ``On self-managed GitLab, by default this feature is not available. To make it available per user, ask an administrator to [enable the feature flag](<path to>/administration/feature_flags.md) named `flag_name`.`` | ### GitLab.com availability information @@ -114,5 +114,5 @@ And, when the feature is done and fully available to all users: > - Introduced in GitLab 13.7 [with a flag](../../administration/feature_flags.md) named `forti_token_cloud`. Disabled by default. > - [Enabled on self-managed](https://gitlab.com/issue/etc) in GitLab 13.8. > - [Enabled on GitLab.com](https://gitlab.com/issue/etc) in GitLab 13.9. -> - [Generally available](issue-link) in GitLab 14.0. [Feature flag <flag name>](issue-link) removed. +> - [Generally available](issue-link) in GitLab 14.0. [Feature flag `forti_token_cloud`](issue-link) removed. ``` diff --git a/doc/development/documentation/index.md b/doc/development/documentation/index.md index c6afcdbddd0..ee439e93011 100644 --- a/doc/development/documentation/index.md +++ b/doc/development/documentation/index.md @@ -143,6 +143,71 @@ Nanoc layout), which is displayed at the top of the page if defined. The `type` metadata parameter is deprecated but still exists in documentation pages. You can safely remove the `type` metadata parameter and its values. +### Batch updates for TW metadata + +NOTE: +This task is an MVC, and requires significant manual preparation of the output. +While the task can be time consuming, it is still faster than doing the work +entirely manually. + +It's important to keep the [`CODEOWNERS`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/CODEOWNERS) +file in the `gitlab` project up to date with the current Technical Writing team assignments. +This information is used in merge requests that contain documentation: + +- To populate the eligible approvers section. +- By GitLab Bot to ping reviewers for community contributions. + +GitLab cannot automatically associate the stage and group metadata in our documentation +pages with the technical writer assigned to that group, so we use a Rake task to +generate entries for the `CODEOWNERS` file. Declaring code owners for pages reduces +the number of times GitLab Bot pings the entire Technical Writing team. + +The `tw:codeowners` Rake task, located in [`lib/tasks/gitlab/tw/codeowners.rake`](https://gitlab.com/gitlab-org/gitlab/blob/master/lib/tasks/gitlab/tw/codeowners.rake), +contains an array of groups and their assigned technical writer. This task: + +- Outputs a line for each doc with metadata that matches a group in `lib/tasks/gitlab/tw/codeowners.rake`. + Files not matching a group are skipped. +- Adds the full path to the page, and the assigned technical writer. + +To prepare an update to the `CODEOWNERS` file: + +1. 
Update `lib/tasks/gitlab/tw/codeowners.rake` with the latest [TW team assignments](https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments).
   Make this update in a standalone merge request, as it runs a long pipeline and
   requires backend maintainer review. Make sure this is merged before you update
   `CODEOWNERS` in another merge request.
1. Run the task from the root directory of the `gitlab` repository, and save the output in a file:

   ```shell
   bundle exec rake tw:codeowners > ~/Desktop/updates.md
   ```

1. Open the file you just created (`~/Desktop/updates.md` in this example), and prepare the output:
   - Find and replace `./` with `/`.
   - Sort the lines in alphabetical (ascending) order. If you use VS Code, you can
     select everything, press <kbd>F1</kbd>, type `sort`, and select **Sort lines (ascending, case insensitive)**.
1. Create a new branch for your `CODEOWNERS` updates.
1. Replace the documentation-related lines in the `^[Documentation Pages]` section
   with the output you prepared.

   WARNING:
   The documentation section is not the last section of the `CODEOWNERS` file. Don't
   delete data that isn't ours!

1. Create a commit with the raw changes.
1. From the command line, run `git diff master`.
1. In the diff, look for directory-level assignments to manually restore to the
   `CODEOWNERS` file. If all files in a single directory are assigned to the same
   technical writer, we simplify these entries. Remove all the lines for the individual
   files, and leave a single entry for the directory, for example: `/doc/directory/ @tech.writer`.
1. In the diff, look for changes that don't match your expectations:
   - New pages, or newly moved pages, show up as added lines.
   - Deleted pages, and pages that are now redirects, show up as deleted lines.
   - If you see an unusual number of changes to pages that all seem related,
     check the metadata for the pages. A group might have been renamed and the Rake task
     must be updated to match.
1. Create another commit with your manual changes, and create a second merge request
   with your changes to the `CODEOWNERS` file. Assign it to a technical writing manager for review.

## Move, rename, or delete a page

See [redirects](redirects.md).

diff --git a/doc/development/documentation/restful_api_styleguide.md b/doc/development/documentation/restful_api_styleguide.md
index 0a24f9b67be..1f270a2b5ee 100644
--- a/doc/development/documentation/restful_api_styleguide.md
+++ b/doc/development/documentation/restful_api_styleguide.md
@@ -41,12 +41,18 @@ Use the following template to help you get started. Be sure to list any
required attributes first in the table.

````markdown
-## Descriptive title
+## API name

> Version history note.

One or two sentence description of what the endpoint does.

+### Method title
+
+> Version history note.
+
+Description of the method.
+
```plaintext
METHOD /endpoint
```
@@ -83,8 +89,40 @@ Example response:
```
````

-Adjust the [version history note accordingly](versions.md#add-a-version-history-item)
-to describe the GitLab release that introduced the API call.
+## Version history
+
+Add [version history](versions.md#documenting-version-specific-features)
+to describe new or updated API calls.
+
+To add version history for an individual attribute, include it in the version history
+for the section. For example:
+
+```markdown
+### Edit a widget
+
+> `widget_message` [introduced](<link-to-issue>) in GitLab 14.3.
+```
+
+## Attribute deprecation
+
+To deprecate an attribute:
+
+1.
Add a version history note. + + ```markdown + > - `widget_name` [deprecated](<link-to-issue>) in GitLab 14.7. + ``` + +1. Add inline deprecation text to the description. + + ```markdown + | Attribute | Type | Required | Description | + |:--------------|:-------|:-----------------------|:----------------------------------------------------------------------------------------------------------------------------------------------------| + | `widget_name` | string | **{dotted-circle}** No | [Deprecated](<link-to-issue>) in GitLab 14.7 and is planned for removal in 15.4. Use `widget_id` instead. The name of the widget. | + ``` + +1. Optional. To widely announce the change, or if it's a breaking change, + [update the deprecations and removals documentation](../deprecation_guidelines/#update-the-deprecations-and-removals-documentation). ## Method description diff --git a/doc/development/documentation/site_architecture/global_nav.md b/doc/development/documentation/site_architecture/global_nav.md index 6d2b93b9462..e1e0da03abc 100644 --- a/doc/development/documentation/site_architecture/global_nav.md +++ b/doc/development/documentation/site_architecture/global_nav.md @@ -103,7 +103,7 @@ The global nav has five levels: - Doc - Doc -You can view this structure in [the navigation.yml file](https://gitlab.com/gitlab-org/gitlab-docs/-/blob/main/content/_data/navigation.yaml). +You can view this structure in [the `navigation.yml` file](https://gitlab.com/gitlab-org/gitlab-docs/-/blob/main/content/_data/navigation.yaml). **Do not** [add items](#add-a-navigation-entry) to the global nav without the consent of one of the technical writers. diff --git a/doc/development/documentation/site_architecture/index.md b/doc/development/documentation/site_architecture/index.md index 3566ab82379..05015fe7c5f 100644 --- a/doc/development/documentation/site_architecture/index.md +++ b/doc/development/documentation/site_architecture/index.md @@ -114,12 +114,12 @@ pipeline in the main `gitlab` repository as well as in `gitlab-docs`. Create an a different name first and test it to ensure you do not break the pipelines. 1. In [`gitlab-docs`](https://gitlab.com/gitlab-org/gitlab-docs), go to **{rocket}** **CI/CD > Pipelines**. -1. Click the **Run pipeline** button. +1. Select **Run pipeline**. 1. See that a new pipeline is running. The jobs that build the images are in the first - stage, `build-images`. You can click the pipeline number to see the larger pipeline - graph, or click the first (`build-images`) stage in the mini pipeline graph to + stage, `build-images`. You can select the pipeline number to see the larger pipeline + graph, or select the first (`build-images`) stage in the mini pipeline graph to expose the jobs that build the images. -1. Click the **play** (**{play}**) button next to the images you want to rebuild. +1. Select the **play** (**{play}**) button next to the images you want to rebuild. - Normally, you do not need to rebuild the `image:gitlab-docs-base` image, as it rarely changes. If it does need to be rebuilt, be sure to only run `image:docs-lint` after it is finished rebuilding. @@ -133,7 +133,7 @@ and deploys it to <https://docs.gitlab.com>. To build and deploy the site immediately (must have the Maintainer role): 1. In [`gitlab-docs`](https://gitlab.com/gitlab-org/gitlab-docs), go to **{rocket}** **CI/CD > Schedules**. -1. For the `Build docs.gitlab.com every 4 hours` scheduled pipeline, click the **play** (**{play}**) button. +1. 
For the `Build docs.gitlab.com every 4 hours` scheduled pipeline, select the **play** (**{play}**) button. Read more about [documentation deployments](deployment_process.md). diff --git a/doc/development/documentation/structure.md b/doc/development/documentation/structure.md index 329fd279b99..a02046d4466 100644 --- a/doc/development/documentation/structure.md +++ b/doc/development/documentation/structure.md @@ -37,9 +37,6 @@ Don't tell them **how** to do this thing. Tell them **what it is**. If you start describing another concept, start a new concept and link to it. -Also, do not use **Overview** or **Introduction** for the title. Instead, -use a noun or phrase that someone would search for. - Concepts should be in this format: ```markdown @@ -53,6 +50,19 @@ Remember, if you start to describe about another concept, stop yourself. Each concept should be about one concept only. ``` +### Concept headings + +For the heading text, use a noun. For example, `Widgets` or `GDK dependency management`. + +If a noun is ambiguous, you can add a gerund. For example, `Documenting versions` instead of `Versions`. + +Avoid these heading titles: + +- `Overview` or `Introduction`. Instead, use a more specific + noun or phrase that someone would search for. +- `Use cases`. Instead, incorporate the information as part of the concept. +- `How it works`. Instead, use a noun followed by `workflow`. For example, `Merge request workflow`. + ## Task A task gives instructions for how to complete a procedure. @@ -101,8 +111,13 @@ To create an issue: The issue is created. You can view it by going to **Issues > List**. ``` +### Task headings + +For the heading text, use the structure `active verb` + `noun`. +For example, `Create an issue`. + If you have several tasks on a page that share prerequisites, you can use the title -**Prerequisites**, and link to it. +`Prerequisites` and link to it. ## Reference @@ -119,8 +134,17 @@ Introductory sentence. | **Name** | Descriptive sentence about the setting. | ``` -If a feature or concept has its own prerequisites, you can use reference -content to create a **Prerequisites** header for the information. +### Reference headings + +Reference headings are usually nouns. + +Avoid these heading titles: + +- `Important notes`. Instead, incorporate this information + closer to where it belongs. For example, this information might be a prerequisite + for a task, or information about a concept. +- `Limitations`. Instead, move the content near other similar information. + If you must, you can use the title `Known issues`. ## Troubleshooting @@ -142,6 +166,10 @@ This issue occurs when... The workaround is... ``` +If multiple causes or workarounds exist, consider putting them into a table format. + +### Troubleshooting headings + For the heading: - Consider including at least a partial error message. @@ -149,7 +177,17 @@ For the heading: If you do not put the full error in the title, include it in the body text. -If multiple causes or workarounds exist, consider putting them into a table format. +## General heading text guidelines + +In general, for heading text: + +- Be clear and direct. Make every word count. +- Use articles and prepositions. +- Follow [capitalization](styleguide/index.md#capitalization) guidelines. +- Do not repeat text from earlier headings. For example, if the page is about merge requests, + instead of `Troubleshooting merge requests`, use only `Troubleshooting`. + +See also [guidelines for headings in Markdown](styleguide/index.md#headings-in-markdown). 
## Other types of content diff --git a/doc/development/documentation/styleguide/index.md b/doc/development/documentation/styleguide/index.md index c11d1422167..700d64c30d1 100644 --- a/doc/development/documentation/styleguide/index.md +++ b/doc/development/documentation/styleguide/index.md @@ -86,7 +86,7 @@ move in this direction, so we can address these issues: information into a format that is geared toward helping others, rather than documenting how a feature was implemented. -GitLab uses these [topic type templates](../structure.md). +GitLab uses these [topic types](../structure.md). ### Link instead of repeating text @@ -143,6 +143,25 @@ Hard-coded HTML is valid, although it's discouraged from being used. HTML is per - Special styling is required. - Reviewed and approved by a technical writer. +### Headings in Markdown + +Each documentation page begins with a level 1 heading (`#`). This becomes the `h1` element when +the page is rendered to HTML. There can be only **one** level 1 heading per page. + +- For each subsection, increment the heading level. In other words, increment the number of `#` characters + in front of the heading. +- Do not skip a level. For example: `##` > `####`. +- Leave one blank line before and after the heading. + +When you change heading text, the anchor link changes. To avoid broken links: + +- Do not use step numbers in headings. +- When possible, do not use words that might change in the future. + +Also, do not use links as part of heading text. + +See also [heading guidelines for specific topic types](../structure.md). + ### Markdown Rules GitLab ensures that the Markdown used across all documentation is consistent, as @@ -191,6 +210,8 @@ GitLab documentation should be clear and easy to understand. ### Capitalization +As a company, we tend toward lowercase. + #### Headings Use sentence case. For example: @@ -220,7 +241,7 @@ create an issue or an MR to propose a change to the user interface text. If the term is not in the word list, ask a GitLab Technical Writer for advice. Do not match the capitalization of terms or phrases on the [Features page](https://about.gitlab.com/features/) -or [features.yml](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/features.yml) +or [`features.yml`](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/features.yml) by default. #### Other terms @@ -589,6 +610,10 @@ Consider installing a plugin or extension in your editor for formatting tables: - [Markdown Table Formatter](https://packagecontrol.io/packages/Markdown%20Table%20Formatter) for Sublime Text - [Markdown Table Formatter](https://atom.io/packages/markdown-table-formatter) for Atom +### Table headings + +Use sentence case for table headings. For example, `Keyword value` or `Project name`. + ### Feature tables When creating tables of lists of features (such the features @@ -642,45 +667,6 @@ For other punctuation rules, refer to the [Pajamas Design System Punctuation section](https://design.gitlab.com/content/punctuation/). This is overridden by the [documentation-specific punctuation rules](#punctuation). -## Headings - -In the Markdown document: - -- Add one H1 (`#`) at the start of the page. The `h1` becomes the document `<title>`. -- After the H1, follow the order `h2` > `h3` > `h4` > `h5` > `h6`. -- Do not skip a level. For example: `h2` > `h4`. -- Leave one blank line before and after the heading. - -For the heading text, **do**: - -- Be clear and direct. Make every word count. 
-- Use active, imperative verbs for [tasks](../structure.md#task). For example, `Create an issue`. -- Use `ing` (gerund) verbs only when you need a topic that introduces tasks. For example, `Configuring GDK`. -- Use nouns for [concepts](../structure.md#concept). For example, `GDK dependency management`. If a noun is - ambiguous, you can add a gerund. For example, `Documenting versions` instead of `Versions`. -- Talk about what the product does, realistically but from a positive perspective. Instead of - `Limitations`, move the content near other similar information. If you must, you can - use the title `Known issues`. -- Use articles and prepositions. -- Add the [product badge](#product-tier-badges) that corresponds to the license tier. -- Follow [capitalization](#capitalization) guidelines. - -For the heading text, **do not**: - -- Use generic words like `Overview` or `Use cases`. Instead, incorporate - the information under a concept heading. -- Use `How it works`. Incorporate this information under a concept, or use a - noun followed by `workflow`. For example, `Merge request workflow`. -- Use `Important Notes`. Incorporate this information closer to where it belongs. -- Use numbers to indicate steps. If the numbers change, the anchor links changes, - which eventually leads to dead links. If you think you must add numbers in headings, - at least discuss it with a writer in the merge request. -- Use words that might change in the future. Changing - a heading changes its anchor URL, which affects other linked pages. -- Repeat text from earlier headings. For example, instead of `Troubleshooting merge requests`, - use `Troubleshooting`. -- Use links. - ### Anchor links Headings generate anchor links when rendered. `## This is an example` generates @@ -1193,7 +1179,7 @@ This is how it renders on the GitLab documentation site: > Notes: > -> - The `figure` tag is required for semantic SEO and the `video_container` +> - The `figure` tag is required for semantic SEO and the `video-container` class is necessary to make sure the video is responsive and displays on different mobile devices. > - The `<div class="video-fallback">` is a fallback necessary for diff --git a/doc/development/documentation/styleguide/word_list.md b/doc/development/documentation/styleguide/word_list.md index e7d927de2cf..c753c39b727 100644 --- a/doc/development/documentation/styleguide/word_list.md +++ b/doc/development/documentation/styleguide/word_list.md @@ -139,6 +139,16 @@ Do not use **and so on**. Instead, be more specific. For details, see Use [**section**](#section) instead of **area**. The only exception is [the Admin Area](#admin-area). +## associate + +Do not use **associate** when describing adding issues to epics, or users to issues, merge requests, +or epics. + +Instead, use **assign**. For example: + +- Assign the issue to an epic. +- Assign a user to the issue. + ## below Try to avoid **below** when referring to an example or table in a documentation page. If required, use **following** instead. For example: @@ -347,6 +357,8 @@ See also [**type**](#type). Use lowercase for **epic**. +See also [associate](#associate). + ## epic board Use lowercase for **epic board**. @@ -395,6 +407,13 @@ of the fields at once. For example: Learn more about [documenting multiple fields at once](index.md#documenting-multiple-fields-at-once). +## filter + +When you are viewing a list of items, like issues or merge requests, you filter the list by +the available attributes. For example, you might filter by assignee or reviewer. 
+ +Filtering is different from [searching](#search). + ## foo Do not use **foo** in product documentation. You can use it in our API and contributor documentation, but try to use a clearer and more meaningful example instead. @@ -851,6 +870,13 @@ Do not use **scalability** when talking about increasing GitLab performance for are sometimes acceptable, but references to increasing GitLab performance for additional users should direct readers to the GitLab [reference architectures](../../../administration/reference_architectures/index.md) page. +## search + +When you search, you type a string in the search box on the top bar. +The search results are displayed on a search page. + +Searching is different from [filtering](#filter). + ## section Use **section** to describe an area on a page. For example, if a page has lines that separate the UI diff --git a/doc/development/documentation/testing.md b/doc/development/documentation/testing.md index 81e1eca8724..feb10845aea 100644 --- a/doc/development/documentation/testing.md +++ b/doc/development/documentation/testing.md @@ -276,6 +276,22 @@ guidelines: | UI text from GitLab | Verify it correctly matches the UI, then: If it does not match the UI, update it. If it matches the UI, but the UI seems incorrect, create an issue to see if the UI needs to be fixed. If it matches the UI and seems correct, add it to the [vale spelling exceptions list](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/spelling-exceptions.txt). | | UI text from a third-party product | Rewrite the sentence to avoid it, or [add the vale exception code in-line](#disable-vale-tests). | +#### Vale uppercase (acronym) test + +The [`Uppercase.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/Uppercase.yml) +test checks for incorrect usage of words in all capitals. For example, avoid usage +like `This is NOT important`. + +If the word must be in all capitals, follow these guidelines: + +| Flagged word | Guideline | +|----------------------------------------------------------------|-----------| +| Acronym (likely known by the average visitor to that page) | Add the acronym to the list of words and acronyms in `Uppercase.yml`. | +| Acronym (likely not known by the average visitor to that page) | The first time the acronym is used, write it out fully followed by the acronym in parentheses. In later uses, use just the acronym by itself. For example: `This feature uses the File Transfer Protocol (FTP). FTP is...`. | +| Correctly capitalized name of a product or service | Add the name to the list of words and acronyms in `Uppercase.yml`. | +| Command, variable, code, or similar | Put it in backticks or a code block. For example: ``Use `FALSE` as the variable value.`` | +| UI text from a third-party product | Rewrite the sentence to avoid it, or [add the vale exception code in-line](#disable-vale-tests). | + #### Vale readability score In [`ReadingLevel.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/ReadingLevel.yml), @@ -321,12 +337,12 @@ To match the versions of `markdownlint-cli` and `vale` used in the GitLab projec [versions used (see `variables:` section)](https://gitlab.com/gitlab-org/gitlab-docs/-/blob/main/.gitlab-ci.yml) when building the `image:docs-lint-markdown` Docker image containing these tools for CI/CD. 
-| Tool | Version | Command | Additional information | -|--------------------|-----------|-------------------------------------------|------------------------| -| `markdownlint-cli` | Latest | `yarn global add markdownlint-cli` | n/a | -| `markdownlint-cli` | Specific | `yarn global add markdownlint-cli@0.23.2` | The `@` indicates a specific version, and this example updates the tool to version `0.23.2`. | -| Vale | Latest | `brew update && brew upgrade vale` | This command is for macOS only. | -| Vale | Specific | n/a | Not possible using `brew`, but can be [directly downloaded](https://github.com/errata-ai/vale/releases). | +| Tool | Version | Command | Additional information | +|--------------------|-----------|--------------------------------------------------------|------------------------| +| `markdownlint-cli` | Latest | `yarn global add markdownlint-cli` | None. | +| `markdownlint-cli` | Specific | `yarn global add markdownlint-cli@0.23.2` | The `@` indicates a specific version, and this example updates the tool to version `0.23.2`. | +| Vale | Latest | `brew update && brew upgrade vale` | This command is for macOS only. | +| Vale | Specific | Not applicable. | Binaries can be [directly downloaded](https://github.com/errata-ai/vale/releases). | ### Configure editors diff --git a/doc/development/documentation/versions.md b/doc/development/documentation/versions.md index 0f2bdca4c73..067c37d30aa 100644 --- a/doc/development/documentation/versions.md +++ b/doc/development/documentation/versions.md @@ -25,7 +25,7 @@ To view versions that are not available on `docs.gitlab.com`: ## Documenting version-specific features When a feature is added or updated, you can include its version information -either as a **Version history** bullet or as an inline text reference. +either as a **Version history** list item or as an inline text reference. You do not need to add version information on the pages in the `/development` directory. @@ -132,7 +132,7 @@ To remove a page: ```markdown --- - stage: Enablement + stage: Data Stores group: Global Search info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments remove_date: '2022-08-02' @@ -141,8 +141,8 @@ To remove a page: # Title (removed) **(ULTIMATE SELF)** - This feature was [deprecated](https://gitlab.com/gitlab-org/gitlab/-/issues/351963) in GitLab 14.8 - and [removed](https://gitlab.com/gitlab-org/gitlab/-/issues/351963) in 15.0. + This feature was [deprecated](<link-to-issue>) in GitLab X.Y + and [removed](<link-to-issue>) in X.Y. Use [feature X](<link-to-issue>) instead. ``` @@ -162,12 +162,12 @@ To remove a topic: For the `remove_date`, set a date three months after the release where it was removed. ```markdown - <!--- start_remove The following content will be removed on remove_date: '2023-08-22' --> + <!--- start_remove The following content will be removed on remove_date: 'YYYY-MM-DD' --> ## Title (removed) **(ULTIMATE SELF)** - This feature was [deprecated](https://gitlab.com/gitlab-org/gitlab/-/issues/351963) in GitLab 14.8 - and [removed](https://gitlab.com/gitlab-org/gitlab/-/issues/351963) in 15.0. + This feature was [deprecated](<link-to-issue>) in GitLab X.Y + and [removed](<link-to-issue>) in X.Y. Use [feature X](<link-to-issue>) instead. 
<!--- end_remove --> @@ -179,8 +179,8 @@ This content is removed from the documentation as part of the Technical Writing ## Which versions are removed GitLab supports the current major version and two previous major versions. -For example, if 14.0 is the current major version, all major and minor releases of -GitLab 14.0, 13.0 and 12.0 are supported. +For example, if 15.0 is the current major version, all major and minor releases of +GitLab 15.0, 14.0, and 13.0 are supported. [View the list of supported versions](https://about.gitlab.com/support/statement-of-support.html#version-support). diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md index 7c67b3495ba..d32ceb43ce9 100644 --- a/doc/development/elasticsearch.md +++ b/doc/development/elasticsearch.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Global Search info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -9,17 +9,17 @@ info: To determine the technical writer assigned to the Stage/Group associated w This area is to maintain a compendium of useful information when working with Elasticsearch. Information on how to enable Elasticsearch and perform the initial indexing is in -the [Elasticsearch integration documentation](../integration/elasticsearch.md#enable-advanced-search). +the [Elasticsearch integration documentation](../integration/advanced_search/elasticsearch.md#enable-advanced-search). ## Deep Dive -In June 2019, Mario de la Ossa hosted a Deep Dive (GitLab team members only: `https://gitlab.com/gitlab-org/create-stage/issues/1`) on the GitLab [Elasticsearch integration](../integration/elasticsearch.md) to share his domain specific knowledge with anyone who may work in this part of the codebase in the future. You can find the <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [recording on YouTube](https://www.youtube.com/watch?v=vrvl-tN2EaA), and the slides on [Google Slides](https://docs.google.com/presentation/d/1H-pCzI_LNrgrL5pJAIQgvLX8Ji0-jIKOg1QeJQzChug/edit) and in [PDF](https://gitlab.com/gitlab-org/create-stage/uploads/c5aa32b6b07476fa8b597004899ec538/Elasticsearch_Deep_Dive.pdf). Everything covered in this deep dive was accurate as of GitLab 12.0, and while specific details may have changed since then, it should still serve as a good introduction. +In June 2019, Mario de la Ossa hosted a Deep Dive (GitLab team members only: `https://gitlab.com/gitlab-org/create-stage/issues/1`) on the GitLab [Elasticsearch integration](../integration/advanced_search/elasticsearch.md) to share his domain specific knowledge with anyone who may work in this part of the codebase in the future. You can find the <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [recording on YouTube](https://www.youtube.com/watch?v=vrvl-tN2EaA), and the slides on [Google Slides](https://docs.google.com/presentation/d/1H-pCzI_LNrgrL5pJAIQgvLX8Ji0-jIKOg1QeJQzChug/edit) and in [PDF](https://gitlab.com/gitlab-org/create-stage/uploads/c5aa32b6b07476fa8b597004899ec538/Elasticsearch_Deep_Dive.pdf). Everything covered in this deep dive was accurate as of GitLab 12.0, and while specific details may have changed since then, it should still serve as a good introduction. In August 2020, a second Deep Dive was hosted, focusing on [GitLab-specific architecture for multi-indices support](#zero-downtime-reindexing-with-multiple-indices). 
The <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> [recording on YouTube](https://www.youtube.com/watch?v=0WdPR9oB2fg) and the [slides](https://lulalala.gitlab.io/gitlab-elasticsearch-deepdive/) are available. Everything covered in this deep dive was accurate as of GitLab 13.3. ## Supported Versions -See [Version Requirements](../integration/elasticsearch.md#version-requirements). +See [Version Requirements](../integration/advanced_search/elasticsearch.md#version-requirements). Developers making significant changes to Elasticsearch queries should test their features against all our supported versions. @@ -69,7 +69,7 @@ The `whitespace` tokenizer was selected in order to have more control over how t Please see the `code` filter for an explanation on how tokens are split. NOTE: -The [Elasticsearch code_analyzer doesn't account for all code cases](../integration/elasticsearch.md#elasticsearch-code_analyzer-doesnt-account-for-all-code-cases). +The [Elasticsearch code_analyzer doesn't account for all code cases](../integration/advanced_search/elasticsearch_troubleshooting.md#elasticsearch-code_analyzer-doesnt-account-for-all-code-cases). #### `code_search_analyzer` diff --git a/doc/development/event_store.md b/doc/development/event_store.md index afd5640271e..fa7208ead04 100644 --- a/doc/development/event_store.md +++ b/doc/development/event_store.md @@ -316,11 +316,11 @@ RSpec.describe MergeRequests::UpdateHeadPipelineWorker do let(:pipeline_created_event) { Ci::PipelineCreatedEvent.new(data: ({ pipeline_id: pipeline.id })) } # This shared example ensures that an event is published and correctly processed by - # the current subscriber (`described_class`). + # the current subscriber (`described_class`). It also ensures that the worker is idempotent. it_behaves_like 'subscribes to event' do let(:event) { pipeline_created_event } end - + it 'does something' do # This helper directly executes `perform` ensuring that `handle_event` is called correctly. consume_event(subscriber: described_class, event: pipeline_created_event) diff --git a/doc/development/experiment_guide/experiment_code_reviews.md b/doc/development/experiment_guide/experiment_code_reviews.md index fdde89caa34..eda316db9d4 100644 --- a/doc/development/experiment_guide/experiment_code_reviews.md +++ b/doc/development/experiment_guide/experiment_code_reviews.md @@ -22,4 +22,4 @@ but is acceptable for now, mention your concerns with a note that there's no need to change the code. The author can then add a comment to this piece of code and link to the issue that resolves the experiment. The author or reviewer can add a link to this concern in the experiment rollout issue under the `Experiment Successful Cleanup Concerns` section of the description. -If the experiment is successful and becomes part of the product, any items that appear under this section will be addressed. +If the experiment is successful and becomes part of the product, any items that appear under this section are addressed. diff --git a/doc/development/experiment_guide/experiment_rollout.md b/doc/development/experiment_guide/experiment_rollout.md index afa32d75221..ff0844f9d3c 100644 --- a/doc/development/experiment_guide/experiment_rollout.md +++ b/doc/development/experiment_guide/experiment_rollout.md @@ -12,7 +12,7 @@ Each experiment should have an [experiment rollout](https://gitlab.com/groups/gi The rollout issue is similar to a feature flag rollout issue, and is also used to track the status of an experiment. 
When an experiment is deployed, the due date of the issue should be set (this depends on the experiment but can be up to a few weeks in the future). -After the deadline, the issue needs to be resolved and either: +After the deadline, the issue must be resolved and either: - It was successful and the experiment becomes the new default. - It was not successful and all code related to the experiment is removed. @@ -29,7 +29,7 @@ This can be done via ChatOps: - [disable](../feature_flags/controls.md#disabling-feature-flags): `/chatops run feature set gitlab_experiment false` - [enable](../feature_flags/controls.md#process): `/chatops run feature delete gitlab_experiment` -- This allows the `default_enabled` [value of true in the yml](https://gitlab.com/gitlab-org/gitlab/-/blob/016430f6751b0c34abb24f74608c80a1a8268f20/config/feature_flags/ops/gitlab_experiment.yml#L8) to be honored. +- This allows the `default_enabled` [value of true in the YAML](https://gitlab.com/gitlab-org/gitlab/-/blob/016430f6751b0c34abb24f74608c80a1a8268f20/config/feature_flags/ops/gitlab_experiment.yml#L8) to be honored. ## Notes on feature flags @@ -42,8 +42,8 @@ You may already be familiar with the concept of feature flags in GitLab, but usi feature flags in experiments is a bit different. While in general terms, a feature flag is viewed as being either `on` or `off`, this isn't accurate for experiments. -Generally, `off` means that when we ask if a feature flag is enabled, it will always -return `false`, and `on` means that it will always return `true`. An interim state, +Generally, `off` means that when we ask if a feature flag is enabled, it always +returns `false`, and `on` means that it always returns `true`. An interim state, considered `conditional`, also exists. We take advantage of this trinary state of feature flags. To understand this `conditional` aspect: consider that either of these settings puts a feature flag into this state: @@ -64,7 +64,7 @@ We don't refer to this as being enabled, because that's a confusing and overload term here. In the experiment terms, our experiment is _running_, and the feature flag is `conditional`. -When a feature flag is enabled (meaning the state is `on`), the candidate will always be +When a feature flag is enabled (meaning the state is `on`), the candidate is always assigned. We should try to be consistent with our terms, and so for experiments, we have an diff --git a/doc/development/experiment_guide/implementing_experiments.md b/doc/development/experiment_guide/implementing_experiments.md index 3c33d015108..c9e277873dc 100644 --- a/doc/development/experiment_guide/implementing_experiments.md +++ b/doc/development/experiment_guide/implementing_experiments.md @@ -13,14 +13,14 @@ info: To determine the technical writer assigned to the Stage/Group associated w Start by generating a feature flag using the `bin/feature-flag` command as you normally would for a development feature flag, making sure to use `experiment` for the type. For the sake of documentation let's name our feature flag (and experiment) -"pill_color". +`pill_color`. ```shell bin/feature-flag pill_color -t experiment ``` After you generate the desired feature flag, you can immediately implement an -experiment in code. An experiment implementation can be as simple as: +experiment in code. 
A basic experiment implementation can be: ```ruby experiment(:pill_color, actor: current_user) do |e| @@ -30,8 +30,8 @@ experiment(:pill_color, actor: current_user) do |e| end ``` -When this code executes, the experiment is run, a variant is assigned, and (if within a -controller or view) a `window.gl.experiments.pill_color` object will be available in the +When this code executes, the experiment is run, a variant is assigned, and (if in a +controller or view) a `window.gl.experiments.pill_color` object is available in the client layer, with details like: - The assigned variant. @@ -102,7 +102,7 @@ contexts to simplify reporting: - `{ actor: current_user }`: Assigns a variant and is "sticky" to each user (or "client" if `current_user` is nil) who enters the experiment. -- `{ project: project }`: Assigns a variant and is "sticky" to the project currently +- `{ project: project }`: Assigns a variant and is "sticky" to the project being viewed. If running your experiment is more useful when viewing a project, rather than when a specific user is viewing any project, consider this approach. - `{ group: group }`: Similar to the project example, but applies to a wider @@ -151,7 +151,7 @@ wouldn't be resolvable. There are two ways to implement an experiment: -1. The simple experiment style described previously. +1. The basic experiment style described previously. 1. A more advanced style where an experiment class is provided. The advanced style is handled by naming convention, and works similar to what you @@ -224,8 +224,8 @@ end When an experiment runs, the segmentation rules are executed in the order they're defined. The first segmentation rule to produce a truthy result assigns the variant. -In our example, any user named `'Richard'`, regardless of account age, will always -be assigned the _red_ variant. If you want the opposite logic, flip the order. +In our example, any user named `'Richard'`, regardless of account age, is always +assigned the _red_ variant. If you want the opposite logic, flip the order. NOTE: Keep in mind when defining segmentation rules: after a truthy result, the remaining @@ -275,7 +275,7 @@ end One of the most important aspects of experiments is gathering data and reporting on it. You can use the `track` method to track events across an experimental implementation. You can track events consistently to an experiment if you provide the same context between -calls to your experiment. If you do not yet understand context, you should read +calls to your experiment. If you do not understand context, you should read about contexts now. We can assume we run the experiment in one or a few places, but @@ -295,7 +295,7 @@ experiment have a special added to the event. This can be used - typically by the data team - to create a connection between the events on a given experiment. -If our current user hasn't encountered the experiment yet (meaning where the experiment +If our user hasn't encountered the experiment (meaning where the experiment is run), and we track an event for them, they are assigned a variant and see that variant if they ever encountered the experiment later, when an `:assignment` event would be tracked at that time for them. @@ -316,9 +316,9 @@ so it can be used when resolving experimentation in the client layer. Given that we've defined a class for our experiment, and have defined the variants for it, we can publish that experiment in a couple ways. -The first way is simply by running the experiment. 
Assuming the experiment has been run, it will surface in the client layer without having to do anything special. +The first way is by running the experiment. Assuming the experiment has been run, it surfaces in the client layer without having to do anything special. -The second way doesn't run the experiment and is intended to be used if the experiment only needs to surface in the client layer. To accomplish this we can simply `.publish` the experiment. This won't run any logic, but does surface the experiment details in the client layer so they can be utilized there. +The second way doesn't run the experiment and is intended to be used if the experiment must only surface in the client layer. To accomplish this we can `.publish` the experiment. This does not run any logic, but does surface the experiment details in the client layer so they can be utilized there. An example might be to publish an experiment in a `before_action` in a controller. Assuming we've defined the `PillColorExperiment` class, like we have above, we can surface it to the client by publishing it instead of running it: @@ -329,7 +329,7 @@ before_action -> { experiment(:pill_color).publish }, only: [:show] You can then see this surface in the JavaScript console: ```javascript -window.gl.experiments // => { pill_color: { excluded: false, experiment: "pill_color", key: "ca63ac02", variant: "candidate" } } +window.gl.experiments // => { pill_color: { excluded: false, experiment: "pill_color", key: "ca63ac02", variant: "candidate" } } ``` ### Using experiments in Vue @@ -366,4 +366,4 @@ export default { ``` NOTE: -When there is no experiment data in the `window.gl.experiments` object for the given experiment name, the `control` slot will be used, if it exists. +When there is no experiment data in the `window.gl.experiments` object for the given experiment name, the `control` slot is used, if it exists. diff --git a/doc/development/experiment_guide/index.md b/doc/development/experiment_guide/index.md index b140cce34fc..163cd009c51 100644 --- a/doc/development/experiment_guide/index.md +++ b/doc/development/experiment_guide/index.md @@ -6,8 +6,8 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Experiment Guide -Experiments can be conducted by any GitLab team, most often the teams from the -[Growth Sub-department](https://about.gitlab.com/handbook/engineering/development/growth/). +Experiments can be conducted by any GitLab team, most often the teams from the +[Growth Sub-department](https://about.gitlab.com/handbook/engineering/development/growth/). Experiments are not tied to releases because they primarily target GitLab.com. Experiments are run as an A/B/n test, and are behind an [experiment feature flag](../feature_flags/#experiment-type) @@ -27,7 +27,7 @@ sometimes referred to as GLEX, to run our experiments. The gem exists in a separ so it can be shared across any GitLab property that uses Ruby. You should feel comfortable reading the documentation on that project if you want to dig into more advanced topics or open issues. Be aware that the documentation there reflects what's in the main branch and may not be the same as -the version being used within GitLab. +the version being used in GitLab. 
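When in doubt, you can check which version of the gem your checkout actually uses before relying on the upstream documentation. This is a sketch that assumes a standard GitLab development environment where the `gitlab-experiment` gem is bundled; run it in a Rails console:

```ruby
# Prints the gitlab-experiment version loaded by the current checkout,
# so you can read the matching version of the gem's documentation.
spec = Gem.loaded_specs['gitlab-experiment']
puts spec ? spec.version : 'gitlab-experiment is not loaded'
```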
## Glossary of terms @@ -43,7 +43,7 @@ when communicating about experiments: ## Implementing an experiment -[`GLEX`](https://gitlab.com/gitlab-org/ruby/gems/gitlab-experiment) - or `Gitlab::Experiment`, the `gitlab-experiment` gem - is the preferred option for implementing an experiment in GitLab. +[GLEX](https://gitlab.com/gitlab-org/ruby/gems/gitlab-experiment) - or `Gitlab::Experiment`, the `gitlab-experiment` gem - is the preferred option for implementing an experiment in GitLab. For more information, see [Implementing an A/B/n experiment using GLEX](implementing_experiments.md). diff --git a/doc/development/experiment_guide/testing_experiments.md b/doc/development/experiment_guide/testing_experiments.md index 08ff91a3deb..a73896c8436 100644 --- a/doc/development/experiment_guide/testing_experiments.md +++ b/doc/development/experiment_guide/testing_experiments.md @@ -8,7 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w ## Testing experiments with RSpec -In the course of working with experiments, you'll probably want to utilize the RSpec +In the course of working with experiments, you might want to use the RSpec tooling that's built in. This happens automatically for files in `spec/experiments`, but for other files and specs you want to include it in, you can specify the `:experiment` type: @@ -48,7 +48,7 @@ segmentations using the matchers. class ExampleExperiment < ApplicationExperiment control { } candidate { '_candidate_' } - + exclude { context.actor.first_name == 'Richard' } segment(variant: :candidate) { context.actor.username == 'jejacks0n' } end @@ -84,7 +84,7 @@ expect(subject).to track(:my_event) subject.track(:my_event) ``` -You can use the `on_next_instance` chain method to specify that it will happen +You can use the `on_next_instance` chain method to specify that it happens on the next instance of the experiment. This helps you if you're calling `experiment(:example).track` downstream: @@ -127,7 +127,7 @@ describe('when my_experiment is enabled', () => { ``` NOTE: -This method of stubbing in Jest specs will not automatically un-stub itself at the end of the test. We merge our stubbed experiment in with all the other global data in `window.gl`. If you need to remove the stubbed experiments after your test or ensure a clean global object before your test, you'll need to manage the global object directly yourself: +This method of stubbing in Jest specs does not automatically un-stub itself at the end of the test. We merge our stubbed experiment in with all the other global data in `window.gl`. If you must remove the stubbed experiments after your test or ensure a clean global object before your test, you must manage the global object directly yourself: ```javascript describe('tests that care about global state', () => { diff --git a/doc/development/export_csv.md b/doc/development/export_csv.md index 998e5b1fb3b..0f50d1438fc 100644 --- a/doc/development/export_csv.md +++ b/doc/development/export_csv.md @@ -15,7 +15,7 @@ This document lists the different implementations of CSV export in GitLab codeba | As email attachment | - Asynchronously process the query with background job.<br>- Email uses the export as an attachment. | - Asynchronous processing. | - Requires users use a different app (email) to download the CSV.<br>- Email providers may limit attachment size. 
| - [Export issues](../user/project/issues/csv_export.md)<br>- [Export merge requests](../user/project/merge_requests/csv_export.md) | | As downloadable link in email (*) | - Asynchronously process the query with background job.<br>- Email uses an export link. | - Asynchronous processing.<br>- Bypasses email provider attachment size limit. | - Requires users use a different app (email).<br>- Requires additional storage and cleanup. | [Export User Permissions](https://gitlab.com/gitlab-org/gitlab/-/issues/1772) | | Polling (non-persistent state) | - Asynchronously processes the query with the background job.<br>- Frontend(FE) polls every few seconds to check if CSV file is ready. | - Asynchronous processing.<br>- Automatically downloads to local machine on completion.<br>- In-app solution. | - Non-persistable request - request expires when user navigates to a different page.<br>- API is processed for each polling request. | [Export Vulnerabilities](../user/application_security/vulnerability_report/#export-vulnerability-details) | -| Polling (persistent state) (*) | - Asynchronously processes the query with background job.<br>- Backend (BE) maintains the export state<br>- FE polls every few seconds to check status.<br>- FE shows 'Download link' when export is ready.<br>- User can download or regenerate a new report. | - Asynchronous processing.<br>- No database calls made during the polling requests (HTTP 304 status is returned until export status changes).<br>- Does not require user to stay on page until export is complete.<br>- In-app solution.<br>- Can be expanded into a generic CSV feature (such as dashboard / CSV API). | - Requires to maintain export states in DB.<br>- Does not automatically download the CSV export to local machine, requires users to click 'Download' button. | [Export Merge Commits Report](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/43055) | +| Polling (persistent state) (*) | - Asynchronously processes the query with background job.<br>- Backend (BE) maintains the export state<br>- FE polls every few seconds to check status.<br>- FE shows 'Download link' when export is ready.<br>- User can download or regenerate a new report. | - Asynchronous processing.<br>- No database calls made during the polling requests (HTTP 304 status is returned until export status changes).<br>- Does not require user to stay on page until export is complete.<br>- In-app solution.<br>- Can be expanded into a generic CSV feature (such as dashboard / CSV API). | - Requires to maintain export states in DB.<br>- Does not automatically download the CSV export to local machine, requires users to select 'Download'. | [Export Merge Commits Report](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/43055) | NOTE: Export types marked as * are currently work in progress. diff --git a/doc/development/fe_guide/emojis.md b/doc/development/fe_guide/emojis.md index 7ef88c5ca19..3c7fc20440b 100644 --- a/doc/development/fe_guide/emojis.md +++ b/doc/development/fe_guide/emojis.md @@ -24,6 +24,10 @@ when your platform does not support it. 1. Ensure new sprite sheets generated for 1x and 2x - `app/assets/images/emoji.png` - `app/assets/images/emoji@2x.png` + 1. Update `fixtures/emojis/intents.json` with any new emoji that we would like to highlight as having positive or negative intent. + - Positive intent should be set to `0.5`. + - Neutral intent can be set to `1`. This is applied to all emoji automatically so there is no need to set this explicitly. + - Negative intent should be set to `1.5`. 1. 
Ensure you see new individual images copied into `app/assets/images/emoji/` 1. Ensure you can see the new emojis and their aliases in the GitLab Flavored Markdown (GLFM) Autocomplete 1. Ensure you can see the new emojis and their aliases in the award emoji menu diff --git a/doc/development/fe_guide/frontend_faq.md b/doc/development/fe_guide/frontend_faq.md index 1e8f7f5fb81..39c39894dac 100644 --- a/doc/development/fe_guide/frontend_faq.md +++ b/doc/development/fe_guide/frontend_faq.md @@ -187,12 +187,12 @@ Be sure to add these polyfills to `app/assets/javascripts/commons/polyfills.js`. To see what polyfills are being used: 1. Navigate to your merge request. -1. In the secondary menu below the title of the merge request, click **Pipelines**, then - click the pipeline you want to view, to display the jobs in that pipeline. -1. Click the [`compile-production-assets`](https://gitlab.com/gitlab-org/gitlab/-/jobs/641770154) job. -1. In the right-hand sidebar, scroll to **Job Artifacts**, and click **Browse**. -1. Click the **webpack-report** folder to open it, and click **index.html**. -1. In the upper left corner of the page, click the right arrow **{angle-right}** +1. In the secondary menu below the title of the merge request, select **Pipelines**, then + select the pipeline you want to view, to display the jobs in that pipeline. +1. Select the [`compile-production-assets`](https://gitlab.com/gitlab-org/gitlab/-/jobs/641770154) job. +1. In the right-hand sidebar, scroll to **Job Artifacts**, and select **Browse**. +1. Select the **webpack-report** folder to open it, and select **index.html**. +1. In the upper left corner of the page, select the right arrow **{chevron-lg-right}** to display the explorer. 1. In the **Search modules** field, enter `gitlab/node_modules/core-js` to see which polyfills are being loaded and where: diff --git a/doc/development/fe_guide/graphql.md b/doc/development/fe_guide/graphql.md index 5cfdaff0448..67b53fa0299 100644 --- a/doc/development/fe_guide/graphql.md +++ b/doc/development/fe_guide/graphql.md @@ -416,8 +416,8 @@ query getLocalData { } ``` -Similar to resolvers, your `typePolicies` will execute when the `@client` query is used. However, -using `makeVar` will trigger every relevant active Apollo query to reactively update when the state +Similar to resolvers, your `typePolicies` execute when the `@client` query is used. However, +using `makeVar` triggers every relevant active Apollo query to reactively update when the state mutates. ```javascript @@ -462,7 +462,7 @@ export const createLocalState = () => { }; ``` -Pass the cache config to your Apollo Client: +Pass the cache configuration to your Apollo Client: ```javascript // index.js @@ -490,7 +490,7 @@ return new Vue({ }); ``` -Wherever used, the local query will update as the state updates thanks to the **reactive variable**. +Wherever used, the local query updates as the state updates thanks to the **reactive variable**. ### Using with Vuex @@ -522,7 +522,7 @@ of the backend. #### Implementing frontend queries and mutations ahead of the backend -In such case, the frontend will define GraphQL schemas or fields that do not correspond to any +In such case, the frontend defines GraphQL schemas or fields that do not correspond to any backend resolver yet. This is fine as long as the implementation is properly feature-flagged so it does not translate to public-facing errors in the product. However, we do validate client-side queries/mutations against the backend GraphQL schema with the `graphql-verify` CI job. 
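For context, when the backend implementation catches up, the client-only field is replaced by a real resolver. The sketch below shows what such a field can look like on the backend using the `graphql-ruby` field DSL; the type, field name, and resolver logic are hypothetical and not part of the GitLab schema:

```ruby
# Hypothetical backend counterpart to a field that the frontend first
# shipped behind the `@client` directive. Follows graphql-ruby
# conventions; names and logic are illustrative only.
module Types
  class WidgetType < Types::BaseObject
    graphql_name 'Widget'

    field :linting_enabled, GraphQL::Types::Boolean,
      null: true,
      description: 'Whether linting is enabled for the widget.'

    def linting_enabled
      object.linting_enabled?
    end
  end
end
```

Once the field exists server-side, the `@client` directive can be removed from the frontend query, and the `graphql-verify` job then validates it against the real schema.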
@@ -535,7 +535,7 @@ The preferred approach is to use the `@client` directive on any new query, mutat
isn't yet supported by the backend. Any entity with the directive is skipped
by the `graphql-verify` validation job.

-Additionally Apollo will attempt to resolve them client-side, which can be used in conjunction with
+Additionally, Apollo attempts to resolve them client-side, which can be used in conjunction with
[Mocking API response with local Apollo cache](#mocking-api-response-with-local-apollo-cache). This
provides a convenient way of testing your feature with fake data defined client-side.
When opening a merge request for your changes, it can be a good idea to provide local resolvers as a
@@ -550,7 +550,7 @@ GraphQL queries/mutations validation can be completely turned off for specific f
paths to the
[`config/known_invalid_graphql_queries.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/known_invalid_graphql_queries.yml)
file, much like you would disable ESLint for some files via an `.eslintignore` file.

-Bear in mind that any file listed in here will not be validated at all. So if you're only adding
+Bear in mind that any file listed here is not validated at all. So if you're only adding
fields to an existing query, use the `@client` directive approach so that the rest of the query
is still validated.

diff --git a/doc/development/fe_guide/haml.md b/doc/development/fe_guide/haml.md
index 803bb89118c..00096ce7fdc 100644
--- a/doc/development/fe_guide/haml.md
+++ b/doc/development/fe_guide/haml.md
@@ -39,6 +39,15 @@ For example:
      %span
        = s_('GroupSettings|Prevent members from sending invitations to groups outside of %{group} and its subgroups.').html_safe % { group: link_to_group(@group) }
      %p.help-text= prevent_sharing_groups_outside_hierarchy_help_text(@group)
+
+  .form-group.gl-mb-3
+    .gl-form-checkbox.custom-control.custom-checkbox
+      = f.check_box :lfs_enabled, checked: @group.lfs_enabled?, class: 'custom-control-input'
+      = f.label :lfs_enabled, class: 'custom-control-label' do
+        %span
+          = _('Allow projects within this group to use Git LFS')
+          = link_to sprite_icon('question-o'), help_page_path('topics/git/lfs/index')
+        %p.help-text= _('This setting can be overridden in each project.')
```

- After:

@@ -50,6 +59,14 @@ For example:
        s_('GroupSettings|Prevent members from sending invitations to groups outside of %{group} and its subgroups.').html_safe % { group: link_to_group(@group) },
        help_text: prevent_sharing_groups_outside_hierarchy_help_text(@group),
        checkbox_options: { disabled: !can_change_prevent_sharing_groups_outside_hierarchy?(@group) }
+
+  .form-group.gl-mb-3
+    = f.gitlab_ui_checkbox_component :lfs_enabled, checkbox_options: { checked: @group.lfs_enabled? } do |c|
+      = c.label do
+        = _('Allow projects within this group to use Git LFS')
+        = link_to sprite_icon('question-o'), help_page_path('topics/git/lfs/index')
+      = c.help_text do
+        = _('This setting can be overridden in each project.')
```

### Available components

@@ -67,16 +84,27 @@ Currently only the listed components are available but more components are plann

[GitLab UI Docs](https://gitlab-org.gitlab.io/gitlab-ui/?path=/story/base-form-form-checkbox--default)

+##### Arguments
+
| Argument | Description | Type | Required (default value) |
|---|---|---|---|
| `method` | Attribute on the object passed to `gitlab_ui_form_for`. | `Symbol` | `true` |
-| `label` | Checkbox label. | `String` | `true` |
-| `help_text` | Help text displayed below the checkbox. | `String` | `false` (`nil`) |
+| `label` | Checkbox label. `label` slot can be used instead of this argument if HTML is needed. | `String` | `false` (`nil`) |
+| `help_text` | Help text displayed below the checkbox. `help_text` slot can be used instead of this argument if HTML is needed. | `String` | `false` (`nil`) |
| `checkbox_options` | Options that are passed to [Rails `check_box` method](https://api.rubyonrails.org/classes/ActionView/Helpers/FormBuilder.html#method-i-check_box). | `Hash` | `false` (`{}`) |
| `checked_value` | Value when checkbox is checked. | `String` | `false` (`'1'`) |
| `unchecked_value` | Value when checkbox is unchecked. | `String` | `false` (`'0'`) |
| `label_options` | Options that are passed to [Rails `label` method](https://api.rubyonrails.org/classes/ActionView/Helpers/FormBuilder.html#method-i-label). | `Hash` | `false` (`{}`) |

+##### Slots
+
+This component supports [ViewComponent slots](https://viewcomponent.org/guide/slots.html).
+
+| Slot | Description |
+|---|---|
+| `label` | Checkbox label content. This slot can be used instead of the `label` argument. |
+| `help_text` | Help text content displayed below the checkbox. This slot can be used instead of the `help_text` argument. |
+
<!-- vale gitlab.Spelling = NO -->

#### gitlab_ui_radio_component

@@ -85,11 +113,22 @@ Currently only the listed components are available but more components are plann

[GitLab UI Docs](https://gitlab-org.gitlab.io/gitlab-ui/?path=/story/base-form-form-radio--default)

+##### Arguments
+
| Argument | Description | Type | Required (default value) |
|---|---|---|---|
| `method` | Attribute on the object passed to `gitlab_ui_form_for`. | `Symbol` | `true` |
| `value` | The value of the radio tag. | `Symbol` | `true` |
-| `label` | Radio label. | `String` | `true` |
-| `help_text` | Help text displayed below the radio button. | `String` | `false` (`nil`) |
+| `label` | Radio label. `label` slot can be used instead of this argument if HTML content is needed inside the label. | `String` | `false` (`nil`) |
+| `help_text` | Help text displayed below the radio button. `help_text` slot can be used instead of this argument if HTML content is needed inside the help text. | `String` | `false` (`nil`) |
| `radio_options` | Options that are passed to [Rails `radio_button` method](https://api.rubyonrails.org/classes/ActionView/Helpers/FormBuilder.html#method-i-radio_button). | `Hash` | `false` (`{}`) |
| `label_options` | Options that are passed to [Rails `label` method](https://api.rubyonrails.org/classes/ActionView/Helpers/FormBuilder.html#method-i-label). | `Hash` | `false` (`{}`) |
+
+##### Slots
+
+This component supports [ViewComponent slots](https://viewcomponent.org/guide/slots.html).
+
+| Slot | Description |
+|---|---|
+| `label` | Radio label content. This slot can be used instead of the `label` argument. |
+| `help_text` | Help text content displayed below the radio button. This slot can be used instead of the `help_text` argument. |
diff --git a/doc/development/fe_guide/registry_architecture.md b/doc/development/fe_guide/registry_architecture.md
index 56d67e094b7..be14d5d920c 100644
--- a/doc/development/fe_guide/registry_architecture.md
+++ b/doc/development/fe_guide/registry_architecture.md
@@ -64,7 +64,7 @@ main pieces of the desired UI and UX of a registry page. The most important comp

- `code-instruction`: represents a copyable box containing code. Supports multiline and single line
  code boxes. Snowplow tracks the code copy event.
-- `details-row`: represents a row of details. Used to add additional info in the details area of
+- `details-row`: represents a row of details. Used to add additional information in the details area of
  the `list-item` component.
- `history-item`: represents a history list item used to build a timeline.
- `list-item`: represents a list element in the registry. It supports: left action, left primary and
diff --git a/doc/development/fe_guide/security.md b/doc/development/fe_guide/security.md
index 79452327673..6f500c8f0fa 100644
--- a/doc/development/fe_guide/security.md
+++ b/doc/development/fe_guide/security.md
@@ -41,7 +41,7 @@ Security Policy headers in the GitLab Rails app.
Some resources on implementing Content Security Policy:

- [MDN Article on CSP](https://developer.mozilla.org/en-US/docs/Web/Security/CSP)
-- [GitHub's CSP Journey on the GitHub Engineering Blog](http://githubengineering.com/githubs-csp-journey/)
+- [GitHub's CSP Journey on the GitHub Engineering Blog](https://github.blog/2016-04-12-githubs-csp-journey/)
- The Dropbox Engineering Blog's series on CSP: [1](https://blogs.dropbox.com/tech/2015/09/on-csp-reporting-and-filtering/),
  [2](https://blogs.dropbox.com/tech/2015/09/unsafe-inline-and-nonce-deployment/),
  [3](https://blogs.dropbox.com/tech/2015/09/csp-the-unexpected-eval/),
  [4](https://blogs.dropbox.com/tech/2015/09/csp-third-party-integrations-and-privilege-separation/)

### Subresource Integrity (SRI)

@@ -59,7 +59,7 @@ All CSS and JavaScript assets should use Subresource Integrity.
Some resources on implementing Subresource Integrity:

- [MDN Article on SRI](https://developer.mozilla.org/en-us/docs/web/security/subresource_integrity)
-- [Subresource Integrity on the GitHub Engineering Blog](http://githubengineering.com/subresource-integrity/)
+- [Subresource Integrity on the GitHub Engineering Blog](https://github.blog/2015-09-19-subresource-integrity/)

-->

diff --git a/doc/development/fe_guide/storybook.md b/doc/development/fe_guide/storybook.md
index 9c4bcf02389..4c0e7b2612b 100644
--- a/doc/development/fe_guide/storybook.md
+++ b/doc/development/fe_guide/storybook.md
@@ -53,6 +53,6 @@ To add a story:

## Mock backend APIs

-GitLab's Storybook uses [MirajeJS](https://miragejs.com/) to mock REST and GraphQL APIs. Storybook shares the MirajeJS server
+The GitLab Storybook uses [Mirage JS](https://miragejs.com/) to mock REST and GraphQL APIs. Storybook shares the Mirage JS server
with the [frontend integration tests](../testing_guide/testing_levels.md#frontend-integration-tests). You can find the Mirage JS
configuration files in `spec/frontend_integration/mock_server`.
diff --git a/doc/development/fe_guide/style/javascript.md b/doc/development/fe_guide/style/javascript.md
index d04d1879476..d93dc8292d4 100644
--- a/doc/development/fe_guide/style/javascript.md
+++ b/doc/development/fe_guide/style/javascript.md
@@ -100,27 +100,31 @@ class a {

## Converting Strings to Integers

-When converting strings to integers, `parseInt` has a slight performance advantage over `Number`, but `Number` is semantic and can be more readable. Prefer `parseInt`, but do not discourage `Number` if it significantly helps readability.
+When converting strings to integers, `Number` is semantic and can be more readable. Both `parseInt` and `Number` are allowable, but `Number` has a slight maintainability advantage.

**WARNING:** `parseInt` **must** include the [radix argument](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/parseInt).
```javascript
-// bad
+// bad (missing radix argument)
parseInt('10');

-// bad
-things.map(parseInt)
-
-// ok
-Number("106")
+// good
+parseInt("106", 10);

// good
-things.map(Number)
+Number("106");
+```
+
+```javascript
+// bad (missing radix argument)
+things.map(parseInt);

// good
-parseInt("106", 10)
+things.map(Number);
```
+
+**PLEASE NOTE:** If the String could represent a non-integer (i.e., it includes a decimal), **do not** use `parseInt`. Consider `Number` or `parseFloat` instead.
+
## CSS Selectors - Use `js-` prefix

If a CSS class is only being used in JavaScript as a reference to the element, prefix
diff --git a/doc/development/fe_guide/vue.md b/doc/development/fe_guide/vue.md
index fecb0af936d..ae13e3fc8c5 100644
--- a/doc/development/fe_guide/vue.md
+++ b/doc/development/fe_guide/vue.md
@@ -583,7 +583,7 @@ This is the template for the example component which is tested in the
      data-testid="text-input"
    >
    <gl-button
-      variant="success"
+      variant="confirm"
      data-testid="add-button"
      @click="addTodo"
    >Add</gl-button>
diff --git a/doc/development/feature_flags/controls.md b/doc/development/feature_flags/controls.md
index 68c14c1b0c9..07c3c83912a 100644
--- a/doc/development/feature_flags/controls.md
+++ b/doc/development/feature_flags/controls.md
@@ -95,6 +95,7 @@ Guidelines:

- Consider notifying `#support_gitlab-com` beforehand. That way, if the feature has any side effects on user experience, they can mitigate the impact by disabling the feature flag.
- If the feature meets the requirements for creating a [Change Management](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#feature-flags-and-the-change-management-process) issue, create a Change Management issue per [criticality guidelines](https://about.gitlab.com/handbook/engineering/infrastructure/change-management/#change-request-workflows).
- For simple, low-risk, easily reverted features, proceed and [enable the feature in `#production`](#process).
+- For support requests to toggle feature flags for specific groups or projects, follow the process outlined in the [support workflows](https://about.gitlab.com/handbook/support/workflows/saas_feature_flags.html).

#### Process

@@ -198,6 +199,14 @@ For groups the `--group` flag is available:
/chatops run feature set --group=gitlab-org some_feature true
```

+Note that `--group` does not work with user namespaces. To enable a feature flag for a
+generic namespace (including groups) use `--namespace`:
+
+```shell
+/chatops run feature set --namespace=gitlab-org some_feature true
+/chatops run feature set --namespace=myusername some_feature true
+```
+
Note that actor-based gates are applied before percentages. For example, considering the
`group/project` as `gitlab-org/gitlab` and a given example feature as `some_feature`, if you run
these 2 commands:
@@ -215,6 +224,16 @@ actors.
Feature.enabled?(:some_feature, group)
```

+Multiple actors can be passed together in a comma-separated form:
+
+```shell
+/chatops run feature set --project=gitlab-org/gitlab,example-org/example-project some_feature true
+
+/chatops run feature set --group=gitlab-org,example-org some_feature true
+
+/chatops run feature set --namespace=gitlab-org,example-org some_feature true
+```
+
Lastly, to verify that the feature is deemed stable in as many cases as
possible, you should fully roll out the feature by enabling the flag **globally** by running:

@@ -267,7 +286,7 @@ To disable a feature flag that has been enabled for a specific project you can r
You cannot selectively disable feature flags for a specific project/group/user without applying
a [specific method of implementing](index.md#selectively-disable-by-actor) the feature flags.

-If a feature flag is disabled via ChatOps, that will take precedence over the `default_enabled` value in the YML. In other words, you could have a feature enabled for on-premise installations but not for GitLab.com.
+If a feature flag is disabled via ChatOps, that will take precedence over the `default_enabled` value in the YAML. In other words, you could have a feature enabled for on-premises installations but not for GitLab.com.

### Feature flag change logging

diff --git a/doc/development/feature_flags/index.md b/doc/development/feature_flags/index.md
index 54158de6893..d21a46142a2 100644
--- a/doc/development/feature_flags/index.md
+++ b/doc/development/feature_flags/index.md
@@ -141,7 +141,7 @@ push_frontend_feature_flag(:my_ops_flag, project, type: :ops)

An `experiment` feature flag should conform to the same standards as a `development` feature flag,
although the interface has some differences. An experiment feature flag should have a rollout issue,
-created using the [Experiment Tracking template](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/issue_templates/experiment_tracking_template.md). More information can be found in the [experiment guide](../experiment_guide/index.md).
+created using the [Experiment Tracking template](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/issue_templates/Experiment%20Rollout.md). More information can be found in the [experiment guide](../experiment_guide/index.md).

## Feature flag definition and validation

@@ -226,6 +226,16 @@ Feature flags **must** be used in the MR that introduces them. Not doing so caus
[broken master](https://about.gitlab.com/handbook/engineering/workflow/#broken-master) scenario due
to the `rspec:feature-flags` job that only runs on the `master` branch.

+## List all the feature flags
+
+To [use ChatOps](../../ci/chatops/index.md) to output all the feature flags in an environment to Slack, you can use the `run feature list`
+command.
+For example:
+
+```shell
+/chatops run feature list --dev
+/chatops run feature list --staging
+```
+
## Delete a feature flag

See [cleaning up feature flags](controls.md#cleaning-up) for more information about
diff --git a/doc/development/fips_compliance.md b/doc/development/fips_compliance.md
index d4274c6275b..5b6f6ba0d98 100644
--- a/doc/development/fips_compliance.md
+++ b/doc/development/fips_compliance.md
@@ -1,6 +1,6 @@
---
-stage: none
-group: unassigned
+stage: Create
+group: Source Code
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

@@ -100,7 +100,7 @@ fips-mode-setup --disable

#### Detect FIPS enablement in code

-You can query `GitLab::FIPS` in Ruby code to determine if the instance is FIPS-enabled:
+You can query `Gitlab::FIPS` in Ruby code to determine if the instance is FIPS-enabled:

```ruby
def default_min_key_size(name)
@@ -191,11 +191,11 @@ to ignore AMI changes.

#### Ansible: Specify the FIPS Omnibus builds

-The standard Omnibus GitLab releases build their own OpenSSL library,
-which is not FIPS-validated. However, we have nightly builds that create
-Omnibus packages that link against the operating system's OpenSSL library. To
-use this package, update the `gitlab_repo_script_url` field in the
-Ansible `vars.yml`. For example, you might modify
+The standard Omnibus GitLab releases build their own OpenSSL library, which is
+not FIPS-validated. However, we have nightly builds that create Omnibus packages
+that link against the operating system's OpenSSL library. To use this package,
+update the `gitlab_edition` and `gitlab_repo_script_url` fields in the Ansible
+`vars.yml`. For example, you might modify
`gitlab-environment-toolkit/ansible/environments/gitlab-10k/inventory/vars.yml`
in this way:

@@ -204,6 +204,7 @@ all:
  vars:
    ...
    gitlab_repo_script_url: "https://packages.gitlab.com/install/repositories/gitlab/nightly-fips-builds/script.deb.sh"
+    gitlab_edition: "gitlab-fips"
```

### Cloud Native Hybrid

@@ -300,7 +301,7 @@ all:
      gitlab_charts_custom_config_file: '/path/to/gitlab-environment-toolkit/ansible/environments/gitlab-10k/inventory/charts.yml'
```

-Now create `charts.yml` in the location specified above and specify tags with a `-ubi8` suffix. For example:
+Now create `charts.yml` in the location specified above and specify tags with a `-fips` suffix. For example:

```yaml
global:
@@ -308,35 +309,38 @@ global:
    pullPolicy: Always
  certificates:
    image:
-      tag: master-ubi8
+      tag: master-fips
+  kubectl:
+    image:
+      tag: master-fips
gitlab:
  gitaly:
    image:
-      tag: master-ubi8
+      tag: master-fips
  gitlab-exporter:
    image:
-      tag: master-ubi8
+      tag: master-fips
  gitlab-shell:
    image:
-      tag: main-ubi8 # The default branch is main, not master
+      tag: main-fips # The default branch is main, not master
  gitlab-mailroom:
    image:
-      tag: master-ubi8
+      tag: master-fips
  migrations:
    image:
-      tag: master-ubi8
+      tag: master-fips
  sidekiq:
    image:
-      tag: master-ubi8
+      tag: master-fips
  toolbox:
    image:
-      tag: master-ubi8
+      tag: master-fips
  webservice:
    image:
-      tag: master-ubi8
+      tag: master-fips
    workhorse:
-      tag: master-ubi8
+      tag: master-fips

nginx-ingress:
  controller:
@@ -352,41 +356,44 @@ See [this issue](https://gitlab.com/gitlab-org/charts/gitlab/-/issues/3153#note_
how to build NGINX and the Ingress Controller.

You can also use release tags, but the versioning is tricky because each
For example, for GitLab v14.10: +component may use its own versioning scheme. For example, for GitLab v15.1: ```yaml global: certificates: image: - tag: 20191127-r2-ubi8 + tag: 20211220-r0-fips + kubectl: + image: + tag: 1.18.20-fips gitlab: gitaly: image: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips gitlab-exporter: image: - tag: 11.14.0-ubi8 + tag: 11.15.2-fips gitlab-shell: image: - tag: v13.25.1-ubi8 + tag: v15.1.0-fips gitlab-mailroom: image: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips migrations: image: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips sidekiq: image: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips toolbox: image: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips webservice: image: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips workhorse: - tag: v14.10.0-ubi8 + tag: v15.1.0-fips ``` ## Verify FIPS @@ -508,3 +515,13 @@ the `webservice` container has the following tags: - `master` - `master-ubi8` - `master-fips` + +### Testing merge requests with a FIPS pipeline + +Merge requests that can trigger Package and QA, can trigger a FIPS package and a +Reference Architecture test pipeline. The base image used for the trigger is +Ubuntu 20.04 FIPS: + +1. Trigger `package-and-qa`, if not already triggered. +1. On the `gitlab-omnibus-mirror` child pipeline, manually trigger `Trigger:package:fips`. +1. When the package job is complete, manually trigger the `RAT:FIPS` job. diff --git a/doc/development/foreign_keys.md b/doc/development/foreign_keys.md index c20c70623ae..77df6fbfb0d 100644 --- a/doc/development/foreign_keys.md +++ b/doc/development/foreign_keys.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -15,7 +15,7 @@ class User < ActiveRecord::Base end ``` -Here you will need to add a foreign key on column `posts.user_id`. This ensures +Add a foreign key here on column `posts.user_id`. This ensures that data consistency is enforced on database level. Foreign keys also mean that the database can very quickly remove associated data (for example, when removing a user), instead of Rails having to do this. @@ -28,7 +28,7 @@ Guide](migration_style_guide.md) for more information. Keep in mind that you can only safely add foreign keys to existing tables after you have removed any orphaned rows. The method `add_concurrent_foreign_key` -does not take care of this so you'll need to do so manually. See +does not take care of this so you need to do so manually. See [adding foreign key constraint to an existing column](database/add_foreign_key_to_existing_column.md). ## Cascading Deletes @@ -39,7 +39,7 @@ this should be set to `CASCADE`. ## Indexes When adding a foreign key in PostgreSQL the column is not indexed automatically, -thus you must also add a concurrent index. Not doing so will result in cascading +thus you must also add a concurrent index. Not doing so results in cascading deletes being very slow. ## Naming foreign keys @@ -48,7 +48,7 @@ By default Ruby on Rails uses the `_id` suffix for foreign keys. So we should only use this suffix for associations between two tables. If you want to reference an ID on a third party platform the `_xid` suffix is recommended. -The spec `spec/db/schema_spec.rb` will test if all columns with the `_id` suffix +The spec `spec/db/schema_spec.rb` tests if all columns with the `_id` suffix have a foreign key constraint. 
So if that spec fails, don't add the column to `IGNORED_FK_COLUMNS`, but instead add the FK constraint, or consider naming it differently. @@ -56,7 +56,7 @@ differently. ## Dependent Removals Don't define options such as `dependent: :destroy` or `dependent: :delete` when -defining an association. Defining these options means Rails will handle the +defining an association. Defining these options means Rails handles the removal of data, instead of letting the database handle this in the most efficient way possible. @@ -80,13 +80,13 @@ foreign keys to remove the data as this would result in the file system data being left behind. In such a case you should use a service class instead that takes care of removing non database data. -In cases where the relation spans multiple databases you will have even +In cases where the relation spans multiple databases you have even further problems using `dependent: :destroy` or the above hooks. You can read more about alternatives at [Avoid `dependent: :nullify` and `dependent: :destroy` across databases](database/multiple_databases.md#avoid-dependent-nullify-and-dependent-destroy-across-databases). -## Alternative primary keys with has_one associations +## Alternative primary keys with `has_one` associations Sometimes a `has_one` association is used to create a one-to-one relationship: @@ -112,9 +112,9 @@ create_table :user_configs, id: false do |t| end ``` -Setting `default: nil` will ensure a primary key sequence is not created, and since the primary key -will automatically get an index, we set `index: false` to avoid creating a duplicate. -You will also need to add the new primary key to the model: +Setting `default: nil` ensures a primary key sequence is not created, and since the primary key +automatically gets an index, we set `index: false` to avoid creating a duplicate. +You also need to add the new primary key to the model: ```ruby class UserConfig < ActiveRecord::Base @@ -126,4 +126,4 @@ end Using a foreign key as primary key saves space but can make [batch counting](service_ping/implement.md#batch-counters) in [Service Ping](service_ping/index.md) less efficient. -Consider using a regular `id` column if the table will be relevant for Service Ping. +Consider using a regular `id` column if the table is relevant for Service Ping. diff --git a/doc/development/gemfile.md b/doc/development/gemfile.md index e0f5a905831..0fcfb88c9cd 100644 --- a/doc/development/gemfile.md +++ b/doc/development/gemfile.md @@ -89,6 +89,7 @@ When upgrading the Rails gem and its dependencies, you also should update the fo You should also update npm packages that follow the current version of Rails: - `@rails/ujs` + - Run `yarn patch-package @rails/ujs` after updating this to ensure our local patch file version matches. - `@rails/actioncable` ## Upgrading dependencies because of vulnerabilities @@ -138,8 +139,8 @@ To avoid upgrading indirect dependencies, we can use [`bundle update When submitting a merge request including a dependency update, include a link to the Gem diff between the 2 versions in the merge request -description. You can find this link on `rubygems.org` under -**Review Changes**. When you click it, RubyGems generates a comparison +description. You can find this link on `rubygems.org`, select +**Review Changes** to generate a comparison between the versions on `diffend.io`. For example, this is the gem diff for [`thor` 1.0.0 vs 1.0.1](https://my.diffend.io/gems/thor/1.0.0/1.0.1). 
diff --git a/doc/development/gemfile.md b/doc/development/gemfile.md
index e0f5a905831..0fcfb88c9cd 100644
--- a/doc/development/gemfile.md
+++ b/doc/development/gemfile.md
@@ -89,6 +89,7 @@ When upgrading the Rails gem and its dependencies, you also should update the fo
You should also update npm packages that follow the current version of Rails:

- `@rails/ujs`
+  - Run `yarn patch-package @rails/ujs` after updating this to ensure our local patch file version matches.
- `@rails/actioncable`

## Upgrading dependencies because of vulnerabilities

@@ -138,8 +139,8 @@ To avoid upgrading indirect dependencies, we can use [`bundle update
When submitting a merge request including a dependency
update, include a link to the Gem diff between the 2 versions in the merge request
-description. You can find this link on `rubygems.org` under
-**Review Changes**. When you click it, RubyGems generates a comparison
+description. You can find this link on `rubygems.org`: select
+**Review Changes** to generate a comparison
between the versions on `diffend.io`. For example, this is the gem diff for
[`thor` 1.0.0 vs 1.0.1](https://my.diffend.io/gems/thor/1.0.0/1.0.1). Use the
diff --git a/doc/development/geo.md b/doc/development/geo.md
index f62b2de30db..18dffe42177 100644
--- a/doc/development/geo.md
+++ b/doc/development/geo.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Systems
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

@@ -33,15 +33,15 @@ for new events and creates background jobs for each specific event type.

For example when a repository is updated, the Geo **primary** site creates
a Geo event with an associated repository updated event. The Geo Log Cursor daemon
-picks the event up and schedules a `Geo::ProjectSyncWorker` job which will
-use the `Geo::RepositorySyncService` and `Geo::WikiSyncService` classes
+picks the event up and schedules a `Geo::ProjectSyncWorker` job which
+uses the `Geo::RepositorySyncService` and `Geo::WikiSyncService` classes
to update the repository and the wiki respectively.

The Geo Log Cursor daemon can operate in High Availability mode automatically.
-The daemon will try to acquire a lock from time to time and once acquired, it
-will behave as the *active* daemon.
+The daemon tries to acquire a lock from time to time and once acquired, it
+behaves as the *active* daemon.

-Any additional running daemons on the same site, will be in standby
+Any additional running daemons on the same site are in standby
mode, ready to resume work if the *active* daemon releases its lock. We use the
[`ExclusiveLease`](https://www.rubydoc.info/github/gitlabhq/gitlabhq/Gitlab/ExclusiveLease) lock type with a small TTL, that is renewed at every
@@ -188,16 +188,20 @@ needs to be applied to the tracking database on each **secondary** site.

### Configuration

-The database configuration is set in [`config/database_geo.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/database_geo.yml.postgresql).
+The database configuration is set in [`config/database.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/database.yml.postgresql).
The directory [`ee/db/geo`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/db/geo)
contains the schema and migrations for this database.

-To write a migration for the database, use the `GeoMigrationGenerator`:
+To write a migration for the database, run:

```shell
-rails g geo_migration [args] [options]
+rails g migration [args] [options] --database geo
```

+Geo should continue using `Gitlab::Database::Migration[1.0]` until the `gitlab_geo` schema is supported, and is for the time being exempt from being validated by `Gitlab::Database::Migration[2.0]`. This requires a developer to manually amend the migration file to change from `[2.0]` to `[1.0]` due to the migration defaults being 2.0.
+
+For more information, see the [Enable Geo migrations to use Migration[2.0]](https://gitlab.com/gitlab-org/gitlab/-/issues/363491) issue.
+
To migrate the tracking database, run:

```shell
diff --git a/doc/development/geo/framework.md b/doc/development/geo/framework.md
index 055c2cd4ea8..18774b9b3fd 100644
--- a/doc/development/geo/framework.md
+++ b/doc/development/geo/framework.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Systems
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

@@ -59,7 +59,7 @@ naming conventions:
It takes care of the communication between the primary site (where events are produced) and the secondary site (where events are consumed). The engineer who wants to incorporate - Geo in their feature will use the API of replicators to make this + Geo in their feature uses the API of replicators to make this happen. - **Geo Domain-Specific Language**: @@ -99,7 +99,7 @@ end The class name should be unique. It also is tightly coupled to the table name for the registry, so for this example the registry table -will be `package_file_registry`. +is `package_file_registry`. For the different data types Geo supports there are different strategies to include. Pick one that fits your needs. diff --git a/doc/development/gitlab_flavored_markdown/specification_guide/index.md b/doc/development/gitlab_flavored_markdown/specification_guide/index.md index 397d555c54f..cedf44cf1fc 100644 --- a/doc/development/gitlab_flavored_markdown/specification_guide/index.md +++ b/doc/development/gitlab_flavored_markdown/specification_guide/index.md @@ -52,7 +52,10 @@ this inconsistency. Some places in the code refer to both the GitLab and GitHub specifications simultaneous in the same areas of logic. In these situations, _GitHub_ Flavored Markdown may be referred to with variable or constant names like -`ghfm_` to avoid confusion. +`ghfm_` to avoid confusion. For example, we use the `ghfm` acronym for the +[`ghfm_spec_v_0.29.txt` GitHub Flavored Markdown specification file](#github-flavored-markdown-specification) +which is committed to the `gitlab` repo and used as input to the +[`update_specification.rb` script](#update-specificationrb-script). The original CommonMark specification is referred to as _CommonMark_ (no acronym). @@ -141,6 +144,8 @@ and the existing GLFM parser and render implementations. They may also be manually updated as necessary to test-drive incomplete implementations. Regarding the terminology used here: +<!-- vale gitlab.InclusionCultural = NO --> + 1. The Markdown snapshot tests can be considered a form of the [Golden Master Testing approach](https://www.google.com/search?q=golden+master+testing), which is also referred to as Approval Testing or Characterization Testing. @@ -167,6 +172,11 @@ Regarding the terminology used here: they are colocated under the `spec/fixtures` directory with the rest of the fixture data for the GitLab Rails application. +<!-- vale gitlab.InclusionCultural = YES --> + +See also the section on [normalization](#normalization) below, which is an important concept used +in the Markdown snapshot testing. + ## Parsing and Rendering The Markdown dialect used in the GitLab application has a dual requirement for rendering: @@ -187,7 +197,7 @@ implementations: It leverages the [`commonmarker`](https://github.com/gjtorikian/commonmarker) gem, which is a Ruby wrapper for [`libcmark-gfm`](https://github.com/github/cmark), GitHub's fork of the reference parser for CommonMark. `libcmark-gfm` is an extended - version of the C reference implementation of [CommonMark](http://commonmark.org/) + version of the C reference implementation of [CommonMark](https://commonmark.org/) 1. The frontend parser / renderer supports parsing and _WYSIWYG_ rendering for the Content Editor. It is implemented in JavaScript. Parsing is based on the [Remark](https://github.com/remarkjs/remark) Markdown parser, which produces a @@ -213,32 +223,42 @@ HTML which differs from the canonical HTML examples from the specification. 
For every Markdown example in the GLFM specification, three versions of HTML can
potentially be rendered from the example:

-1. **Static HTML**: HTML produced by the backend (Ruby) renderer, which
-   contains extra styling and behavioral HTML. For example, **Create task** buttons
-   added for dynamically creating an issue from a task list item.
-   The GitLab [Markdown API](../../../api/markdown.md) generates HTML
-   for a given Markdown string using this method.
-1. **WYSIWYG HTML**: HTML produced by the frontend (JavaScript) Content Editor,
-   which includes parsing and rendering logic. Used to present an editable document
-   in the ProseMirror WYSIWYG editor.
-1. **Canonical HTML**: The clean, basic version of HTML rendered from Markdown.
-   1. For the examples which come from the CommonMark specification and
-      GFM extensions specification,
-      the canonical HTML is the exact identical HTML found in the
-      GFM
-      `spec.txt` example blocks.
-   1. For GLFM extensions to the <abbr title="GitHub Flavored Markdown">GFM</abbr> / CommonMark
-      specification, a `glfm_canonical_examples.txt`
-      [input specification file](#input-specification-files) contains the
-      Markdown examples and corresponding canonical HTML examples.
+- Static HTML.
+- WYSIWYG HTML.
+- Canonical HTML.
+
+#### Static HTML
+
+**Static HTML** is HTML produced by the backend (Ruby) renderer, which
+contains extra styling and behavioral HTML. For example, **Create task** buttons
+added for dynamically creating an issue from a task list item.
+The GitLab [Markdown API](../../../api/markdown.md) generates HTML
+for a given Markdown string using this method.
+
+#### WYSIWYG HTML
+
+**WYSIWYG HTML** is HTML produced by the frontend (JavaScript) Content Editor,
+which includes parsing and rendering logic. It is used to present an editable document
+in the ProseMirror WYSIWYG editor.

-As the rendered static and WYSIWYG HTML from the backend (Ruby) and frontend (JavaScript)
-renderers contain extra HTML, their rendered HTML can be converted to canonical HTML
-by a [canonicalization](#canonicalization-of-html) process.
+#### Canonical HTML

-#### Canonicalization of HTML
+**Canonical HTML** is the clean, basic version of HTML rendered from Markdown.

-Neither the backend (Ruby) nor the frontend (JavaScript) rendered can directly render canonical HTML.
+1. For the examples which come from the CommonMark specification and
+   GFM extensions specification, the canonical HTML is the exact identical HTML found in the
+   GFM `spec.txt` example blocks.
+1. For GLFM extensions to the <abbr title="GitHub Flavored Markdown">GFM</abbr> / CommonMark
+   specification, a `glfm_canonical_examples.txt` [input specification file](#input-specification-files)
+   contains the Markdown examples and corresponding canonical HTML examples.
+
+### Canonicalization of HTML
+
+The rendered [static HTML](#static-html) and [WYSIWYG HTML](#wysiwyg-html)
+from the backend (Ruby) and frontend (JavaScript) renderers usually contain extra styling
+or HTML elements, to support specific appearance and behavioral requirements.
+
+Neither the backend nor the frontend rendering logic can directly render the clean, basic canonical HTML.
Nor should they be able to, because:

- It's not a direct requirement to support any GitLab application feature.
@@ -258,6 +278,49 @@ HTML. (For example, when they are represented as an image.)
In these cases, the conformance test for the example can be skipped by setting
`skip_update_example_snapshots: true` for the example in
`glfm_specification/input/gitlab_flavored_markdown/glfm_example_status.yml`.

+### Normalization
+
+Versions of the rendered HTML and ProseMirror JSON can vary for a number of reasons.
+Differences in styling or HTML structure can occur, but the values of attributes or nodes may
+also vary across different test runs or environments. For example:
+
+1. Database record identifiers
+1. Namespace or project identifiers
+1. Portions of URIs
+1. File paths or names
+1. Random values
+
+For the [Markdown snapshot testing](#markdown-snapshot-testing) to work
+properly, you must account for these differences in a way that ensures the tests are reliable,
+and always behave the same across different test runs or environments.
+
+To account for these differences, there is a process called **_normalization_**. Normalization
+allows custom regular expressions with
+[_capturing groups_](https://ruby-doc.org/core-3.1.2/Regexp.html#class-Regexp-label-Capturing)
+to be applied to two different versions of HTML or JSON for a given Markdown example,
+and the contents of the captured groups can be replaced with the same fixed values.
+
+Then, the two normalized versions can be compared to each other to ensure all other non-variable
+content is identical.
+
+NOTE:
+We don't care about verifying specific attribute values here, so
+it's OK if the normalizations discard and replace these variable values with fixed values.
+Different testing levels have different purposes:
+
+1. [Markdown snapshot testing](#markdown-snapshot-testing) is intended to enforce the structure of
+   the rendered HTML/JSON, and to ensure that it conforms to the canonical specification.
+1. Individual unit tests of the implementation for a specific Markdown example are responsible for
+   specific and targeted testing of these variable values.
+
+We also use this same regex capture-and-replace normalization approach for
+[canonicalization of HTML](#canonicalization-of-html), because it is essentially the same process.
+With canonicalization, instead of just replacing variable values, we are removing non-canonical
+portions of the HTML.
+
+Refer to [`glfm_example_normalizations.yml`](#glfm_example_normalizationsyml) for a detailed explanation
+of how the normalizations are specified.
+
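To make the capture-and-replace idea concrete, here is a minimal Ruby sketch. It is not the actual GLFM tooling: the regex and replacement mirror the sample `glfm_example_normalizations.yml` entries shown later, and `static_html`/`canonical_html` are hypothetical variables.

```ruby
# Sketch: apply the same regex/replacement pair to two versions of the HTML,
# replacing only the captured variable URI portion with a fixed placeholder,
# then compare the normalized results.
regex = /(href|data-src)(=")(.*?)(test-file\.(png|zip)")/
replacement = '\1\2URI_PREFIX\4'

normalize = ->(html) { html.gsub(regex, replacement) }

normalize.call(static_html) == normalize.call(canonical_html)
```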
## Goals

Given the constraints above, we have a few goals related to the GLFM
@@ -374,7 +437,7 @@ subgraph script:
  A --> B{Backend Markdown API}
end
subgraph input:<br/>input specification files
-  C[gfm_spec_v_0.29.txt] --> A
+  C[ghfm_spec_v_0.29.txt] --> A
  D[glfm_intro.txt] --> A
  E[glfm_canonical_examples.txt] --> A
end
@@ -512,12 +575,16 @@ updated, as in the case of all GFM files.

##### GitHub Flavored Markdown specification

-[`glfm_specification/input/github_flavored_markdown/gfm_spec_v_0.29.txt`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/glfm_specification/input/github_flavored_markdown/gfm_spec_v_0.29.txt)
+[`glfm_specification/input/github_flavored_markdown/ghfm_spec_v_0.29.txt`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/glfm_specification/input/github_flavored_markdown/ghfm_spec_v_0.29.txt)
is the official latest [GFM `spec.txt`](https://github.com/github/cmark-gfm/blob/master/test/spec.txt).

- It is automatically downloaded and updated by `update-specification.rb` script.
- When it is downloaded, the version number is added to the filename.

+NOTE:
+This file uses the `ghfm` acronym instead of `gfm`, as
+explained in the [Acronyms section](#acronyms-glfm-ghfm-gfm-commonmark).
+
##### `glfm_intro.txt`

[`glfm_specification/input/gitlab_flavored_markdown/glfm_intro.txt`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/glfm_specification/input/gitlab_flavored_markdown/glfm_intro.txt)
@@ -578,23 +645,114 @@ The actual file should not have these prefixed `|` characters.
controls the behavior of the [scripts](#scripts) and
[tests](#types-of-markdown-tests-driven-by-the-glfm-specification).

- It is manually updated.
-- It controls the status of automatic generation of files based on Markdown examples.
-- It allows example snapshot generation, Markdown conformance tests, or
-  Markdown snapshot tests to be skipped for individual examples. For example, if
-  they are unimplemented, broken, or cannot be tested for some reason.
+- The `skip_update_example_snapshot*` fields control the status of automatic generation of
+  snapshot example entries based on Markdown examples.
+- The `skip_running_*` fields allow Markdown conformance tests or
+  Markdown snapshot tests to be skipped for individual examples.
+- This allows control over skipping this processing or testing of various examples when they
+  are unimplemented, partially implemented, broken, cannot be generated, or cannot be tested for some reason.
+- All entries default to false. They can be set to true by specifying a Ruby
+  value which evaluates as truthy. This could be the boolean `true` value, but ideally should
+  be a string describing why the example's updating or testing is being skipped.
+- When a `skip_update_example_snapshot*` entry is true, the existing value is preserved.
+  However, since the YAML is re-written, the style of the string value and its
+  [Block Chomping Indicator (`|`)](https://yaml.org/spec/1.2.2/#8112-block-chomping-indicator)
+  may be modified, because the Ruby `psych` YAML library automatically determines this.
+
+The following optional entries are supported for each example. They all default to `false`:
+
+- `skip_update_example_snapshots`: When true, skips any addition or update of this example's entries
+  in the [`spec/fixtures/glfm/example_snapshots/html.yml`](#specfixturesglfmexample_snapshotshtmlyml) file
+  or the [`spec/fixtures/glfm/example_snapshots/prosemirror_json.yml`](#specfixturesglfmexample_snapshotsprosemirror_jsonyml) file.
+  If this value is truthy, then no other `skip_update_example_snapshot_*` entries can be truthy,
+  and an error is raised if any of them are.
+- `skip_update_example_snapshot_html_static`: When true, skips addition or update of this example's [static HTML](#static-html)
+  entry in the [`spec/fixtures/glfm/example_snapshots/html.yml`](#specfixturesglfmexample_snapshotshtmlyml) file.
+- `skip_update_example_snapshot_html_wysiwyg`: When true, skips addition or update of this example's [WYSIWYG HTML](#wysiwyg-html)
+  entry in the [`spec/fixtures/glfm/example_snapshots/html.yml`](#specfixturesglfmexample_snapshotshtmlyml) file.
+- `skip_update_example_snapshot_prosemirror_json`: When true, skips addition or update of this example's
+  entry in the [`spec/fixtures/glfm/example_snapshots/prosemirror_json.yml`](#specfixturesglfmexample_snapshotsprosemirror_jsonyml) file.
+- `skip_running_conformance_static_tests`: When true, skips running the [Markdown conformance tests](#markdown-conformance-testing)
+  of the [static HTML](#static-html) for this example.
+- `skip_running_conformance_wysiwyg_tests`: When true, skips running the [Markdown conformance tests](#markdown-conformance-testing)
+  of the [WYSIWYG HTML](#wysiwyg-html) for this example.
+- `skip_running_snapshot_static_html_tests`: When true, skips running the [Markdown snapshot tests](#markdown-snapshot-testing)
+  of the [static HTML](#multiple-versions-of-rendered-html) for this example.
+- `skip_running_snapshot_wysiwyg_html_tests`: When true, skips running the [Markdown snapshot tests](#markdown-snapshot-testing)
+  of the [WYSIWYG HTML](#wysiwyg-html) for this example.
+- `skip_running_snapshot_prosemirror_json_tests`: When true, skips running the [Markdown snapshot tests](#markdown-snapshot-testing)
+  of the [ProseMirror JSON](#specfixturesglfmexample_snapshotsprosemirror_jsonyml) for this example.

`glfm_specification/input/gitlab_flavored_markdown/glfm_example_status.yml` sample entry:

```yaml
07_99_an_example_with_incomplete_wysiwyg_implementation_1:
-  skip_update_example_snapshots: false
-  skip_update_example_snapshot_html_static: false
-  skip_update_example_snapshot_html_wysiwyg: false
-  skip_running_conformance_static_tests: false
-  skip_running_conformance_wysiwyg_tests: false
-  skip_running_snapshot_static_html_tests: false
-  skip_running_snapshot_wysiwyg_html_tests: false
-  skip_running_snapshot_prosemirror_json_tests: false
+  skip_update_example_snapshots: 'An explanation of the reason for skipping.'
+  skip_update_example_snapshot_html_static: 'An explanation of the reason for skipping.'
+  skip_update_example_snapshot_html_wysiwyg: 'An explanation of the reason for skipping.'
+  skip_update_example_snapshot_prosemirror_json: 'An explanation of the reason for skipping.'
+  skip_running_conformance_static_tests: 'An explanation of the reason for skipping.'
+  skip_running_conformance_wysiwyg_tests: 'An explanation of the reason for skipping.'
+  skip_running_snapshot_static_html_tests: 'An explanation of the reason for skipping.'
+  skip_running_snapshot_wysiwyg_html_tests: 'An explanation of the reason for skipping.'
+  skip_running_snapshot_prosemirror_json_tests: 'An explanation of the reason for skipping.'
```
+
+##### `glfm_example_normalizations.yml`
+
+[`glfm_specification/input/gitlab_flavored_markdown/glfm_example_normalizations.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/glfm_specification/input/gitlab_flavored_markdown/glfm_example_normalizations.yml)
+controls the [normalization](#normalization) process. It allows one or more `regex`/`replacement` pairs
+to be specified for a Markdown example.
+
+- It is manually updated.
+- It has a nested structure corresponding to the example and type of entry it refers to.
+- It extensively uses [YAML anchors and aliases](https://yaml.org/spec/1.2.2/#692-node-anchors)
+  to avoid duplication of `regex`/`replacement` pairs and allow them to be shared across multiple examples.
+- The YAML anchors use a naming convention based on the index number of the example, to
+  ensure unique anchor names and avoid naming conflicts.
+
+`glfm_specification/input/gitlab_flavored_markdown/glfm_example_normalizations.yml` sample entries:
+
+```yaml
+# NOTE: All YAML anchors which are shared across one or more examples are defined in the `00_shared` section.
+00_shared:
+  00_uri: &00_uri
+    - regex: '(href|data-src)(=")(.*?)(test-file\.(png|zip)")'
+      replacement: '\1\2URI_PREFIX\4'
+01_01__section_one__example_containing_a_uri__001:
+  html:
+    static:
+      canonical:
+        01_01_uri: *00_uri
+      snapshot:
+        01_01_uri: *00_uri
+    wysiwyg:
+      01_01_uri: *00_uri
+  prosemirror_json:
+    01_01_uri: *00_uri
+07_01__gitlab_specific_markdown__footnotes__001:
+  # YAML anchors which are only shared within a single example should be defined within the example
+  shared:
+    07_01_href: &07_01_href
+      - regex: '(href)(=")(.+?)(")'
+        replacement: '\1\2REF\4'
+    07_01_id: &07_01_id
+      - regex: '(id)(=")(.+?)(")'
+        replacement: '\1\2ID\4'
+  html:
+    static:
+      canonical:
+        07_01_href: *07_01_href
+        07_01_id: *07_01_id
+      snapshot:
+        07_01_href: *07_01_href
+        07_01_id: *07_01_id
+    wysiwyg:
+      07_01_href: *07_01_href
+      07_01_id: *07_01_id
+  prosemirror_json:
+    07_01_href: *07_01_href
+    07_01_id: *07_01_id
+```

#### Output specification files

@@ -610,7 +768,8 @@ are colocated under the same parent folder `glfm_specification` with the other
a mix of manually edited and generated files.

In GFM, `spec.txt` is [located in the test dir](https://github.com/github/cmark-gfm/blob/master/test/spec.txt),
-and in CommonMark it's located [in the project root](https://github.com/github/cmark-gfm/blob/master/test/spec.txt). No precedent exists for a standard location. In the future, we may decide to
+and in CommonMark it's located [in the project root](https://github.com/github/cmark-gfm/blob/master/test/spec.txt).
+No precedent exists for a standard location. In the future, we may decide to
move or copy a hosted version of the rendered HTML `spec.html` version to another location or site.

##### spec.txt

@@ -748,12 +907,12 @@ Any exceptions or failures which occur when generating HTML are replaced with an

```yaml
06_04_inlines_emphasis_and_strong_emphasis_1:
-    canonical: |
-      <p><em>foo bar</em></p>
-    static: |
-      <p data-sourcepos="1:1-1:9" dir="auto"><strong>foo bar</strong></p>
-    wysiwyg: |
-      <p><strong>foo bar</strong></p>
+  canonical: |
+    <p><em>foo bar</em></p>
+  static: |
+    <p data-sourcepos="1:1-1:9" dir="auto"><strong>foo bar</strong></p>
+  wysiwyg: |
+    <p><strong>foo bar</strong></p>
```

NOTE:
diff --git a/doc/development/go_guide/index.md b/doc/development/go_guide/index.md
index a5661a77da3..f5b0da2f162 100644
--- a/doc/development/go_guide/index.md
+++ b/doc/development/go_guide/index.md
@@ -88,7 +88,7 @@ of the "GitLab" project on the Engineering Projects page in the handbook.

To add yourself to this list, add the following to your profile in the
-[team.yml](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/team.yml)
+[`team.yml`](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/team.yml)
file and ask your manager to review and merge.

```yaml
@@ -406,9 +406,8 @@ variable).

Since daemons are long-running applications, they should have mechanisms to
manage cancellations, and avoid unnecessary resource consumption (which could
-lead to DDOS vulnerabilities). [Go
-Context](https://github.com/golang/go/wiki/CodeReviewComments#contexts) should
-be used in functions that can block and passed as the first parameter.
+lead to DDoS vulnerabilities). [Go Context](https://github.com/golang/go/wiki/CodeReviewComments#contexts)
+should be used in functions that can block and passed as the first parameter.
## Dockerfiles

diff --git a/doc/development/graphql_guide/batchloader.md b/doc/development/graphql_guide/batchloader.md
index 0e90f89ff7a..492d3bc9007 100644
--- a/doc/development/graphql_guide/batchloader.md
+++ b/doc/development/graphql_guide/batchloader.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

@@ -12,7 +12,7 @@ It is the properties of the GraphQL query tree that create opportunities for bat

## When should you use it?

-We should try to batch DB requests as much as possible during GraphQL **query** execution. There is no need to batch loading during **mutations** because they are executed serially. If you need to make a database query, and it is possible to combine two similar (but not identical) queries, then consider using the batch-loader.
+We should try to batch DB requests as much as possible during GraphQL **query** execution. There is no need to batch loading during **mutations** because they are executed serially. If you need to make a database query, and it is possible to combine two similar (but not necessarily identical) queries, then consider using the batch-loader.

When implementing a new endpoint we should aim to minimise the number of SQL queries. For stability and scalability we must also ensure that our queries do not suffer from N+1 performance issues.

@@ -20,7 +20,7 @@ When implementing a new endpoint we should aim to minimise the number of SQL que

Batch loading is useful when a series of queries for inputs `Qα, Qβ, ... Qω` can be combined to a single query for `Q[α, β, ... ω]`. An example of this is lookups by ID, where we can find two users by usernames as cheaply as one, but real-world examples can be more complex.

-Batch loading is not suitable when the result sets have different sort-orders, grouping, aggregation or other non-composable features.
+Batch loading is not suitable when the result sets have different sort orders, grouping, aggregation, or other non-composable features.

There are two ways to use the batch-loader in your code. For simple ID lookups, use `::Gitlab::Graphql::Loaders::BatchModelLoader.new(model, id).find`. For more complex cases, you can use the batch API directly.

@@ -47,9 +47,29 @@ end

Here is an [example MR](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/46549) illustrating how to use our `BatchLoading` mechanism.

+## The `BatchModelLoader`
+
+For ID lookups, the advice is to use the `BatchModelLoader`:
+
+```ruby
+def project
+  ::Gitlab::Graphql::Loaders::BatchModelLoader.new(::Project, object.project_id).find
+end
+```
+
+To preload associations, you can pass an array of them:
+
+```ruby
+def issue(lookahead:)
+  preloads = [:author] if lookahead.selects?(:author)
+
+  ::Gitlab::Graphql::Loaders::BatchModelLoader.new(::Issue, object.issue_id, preloads).find
+end
+```
+
## How does it work exactly?

-Each lazy object knows which data it needs to load and how to batch the query. When we need to use the lazy objects (which we announce by calling `#sync`), they will be loaded along with all other similar objects in the current batch.
+Each lazy object knows which data it needs to load and how to batch the query. When we need to use the lazy objects (which we announce by calling `#sync`), they are loaded along with all other similar objects in the current batch.
Inside the block we execute a batch query for our items (`User`). After that, all we have to do is to call the loader by passing an item which was used in the `BatchLoader::GraphQL.for` method (`usernames`) and the loaded object itself (`user`):

@@ -61,9 +81,28 @@ BatchLoader::GraphQL.for(username).batch do |usernames, loader|
end
```

+The batch-loader uses the source code location of the block to determine
+which requests belong in the same queue, but only one instance of the block
+is evaluated for each batch. You do not control which one.
+
+For this reason, it is important that:
+
+- The block must not refer to (close over) any instance state on objects. The best practice
+  is to pass all data the block needs through to it in the `for(data)` call.
+- The block must be specific to a kind of batched data. Implementing generic
+  loaders (such as the `BatchModelLoader`) is possible, but it requires the use
+  of an injective `key` argument.
+- Batches are not shared unless they refer to the same block: two identical blocks
+  with the same behavior, parameters, and keys do not get shared. For this reason,
+  never implement batched ID lookups on your own; instead, use the `BatchModelLoader` for
+  maximum sharing. If you see two fields define the same batch-loading, consider
+  extracting that out to a new `Loader`, and enabling them to share.
+
### What does lazy mean?

-It is important to avoid syncing batches too early. In the example below we can see how calling sync too early can eliminate opportunities for batching:
+It is important to avoid syncing batches (forcing their evaluation) too early. The following example shows how calling sync too early can eliminate opportunities for batching.
+
+This example calls sync on `x` too early:

```ruby
x = find_lazy(1)
@@ -80,6 +119,8 @@ z.sync

# => will run 2 queries
```

+However, this example waits until all requests are queued, and eliminates the extra query:
+
```ruby
x = find_lazy(1)
y = find_lazy(2)
@@ -92,9 +133,38 @@ z.sync

# => will run 1 query
```

+NOTE:
+There is no dependency analysis in the use of batch-loading. There is simply
+a pending queue of requests, and as soon as any one result is needed, all pending
+requests are evaluated.
+
+You should never call `batch.sync` or use `Lazy.force` in resolver code.
+If you depend on a lazy value, use `Lazy.with_value` instead:
+
+```ruby
+def publisher
+  ::Gitlab::Graphql::Loaders::BatchModelLoader.new(::Publisher, object.publisher_id).find
+end
+
+# Here we need the publisher in order to generate the catalog URL
+def catalog_url
+  ::Gitlab::Graphql::Lazy.with_value(publisher) do |p|
+    UrlHelpers.book_catalog_url(p, object.isbn)
+  end
+end
+```
+
## Testing

-Any GraphQL field that supports `BatchLoading` should be tested using the `batch_sync` method available in [GraphQLHelpers](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/support/helpers/graphql_helpers.rb).
+Ideally, do all your testing using request specs, and using `Schema.execute`. If
+you do so, you do not need to manage the lifecycle of lazy values yourself, and
+you are assured accurate results.
+
+GraphQL fields that return lazy values may need these values forced in tests.
+Forcing refers to explicit demands for evaluation, where this would normally
+be arranged by the framework.
+
+You can force a lazy value with the `GraphqlHelpers#batch_sync` method available in [GraphQLHelpers](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/support/helpers/graphql_helpers.rb), or by using `Gitlab::Graphql::Lazy.force`.
For example:

```ruby
it 'returns data as a batch' do
@@ -114,8 +184,8 @@ We can also use [QueryRecorder](../query_recorder.md) to make sure we are perfor

```ruby
it 'executes only 1 SQL query' do
-  query_count = ActiveRecord::QueryRecorder.new { subject }.count
+  query_count = ActiveRecord::QueryRecorder.new { subject }

-  expect(query_count).to eq(1)
+  expect(query_count).not_to exceed_query_limit(1)
end
```
diff --git a/doc/development/graphql_guide/pagination.md b/doc/development/graphql_guide/pagination.md
index 1f40a605cfe..72f321a1fd2 100644
--- a/doc/development/graphql_guide/pagination.md
+++ b/doc/development/graphql_guide/pagination.md
@@ -23,7 +23,7 @@ and used across much of GitLab.

You can recognize it by a list of page numbers near the bottom of a page,
which, when clicked, take you to that page of results.

-For example, when you click **Page 100**, we send `100` to the
+For example, when you select **Page 100**, we send `100` to the
backend. If each page has, say, 20 items, the backend
calculates `20 * 100 = 2000`,
and it queries the database by offsetting (skipping) the first 2000
diff --git a/doc/development/hash_indexes.md b/doc/development/hash_indexes.md
index 881369e429b..731639b6f06 100644
--- a/doc/development/hash_indexes.md
+++ b/doc/development/hash_indexes.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
diff --git a/doc/development/i18n/proofreader.md b/doc/development/i18n/proofreader.md
index afc04045763..8231cf4316b 100644
--- a/doc/development/i18n/proofreader.md
+++ b/doc/development/i18n/proofreader.md
@@ -60,6 +60,7 @@ are very appreciative of the work done by translators and proofreaders!
- German
  - Michael Hahnle - [GitLab](https://gitlab.com/mhah), [Crowdin](https://crowdin.com/profile/mhah)
  - Katrin Leinweber - [GitLab](https://gitlab.com/katrinleinweber), [Crowdin](https://crowdin.com/profile/katrinleinweber)
+  - Justman10000 - [GitLab](https://gitlab.com/Justman10000), [Crowdin](https://crowdin.com/profile/Justman10000)
- Greek
  - Proofreaders needed.
- Hebrew
@@ -101,7 +102,7 @@ are very appreciative of the work done by translators and proofreaders!
  - Horberlan Brito - [GitLab](https://gitlab.com/horberlan), [Crowdin](https://crowdin.com/profile/horberlan)
- Romanian
  - Mircea Pop - [GitLab](https://gitlab.com/eeex), [Crowdin](https://crowdin.com/profile/eex)
-  - Rareș Pița - [GitLab](https://gitlab.com/dlphin), [Crowdin](https://crowdin.com/profile/dlphin)
+  - Rareș Pița - [GitLab](https://gitlab.com/dlphin)
  - Nicolae Liviu - [GitLab](https://gitlab.com/nicklcanada), [Crowdin](https://crowdin.com/profile/nicklcanada)
- Russian
  - Nikita Grylov - [GitLab](https://gitlab.com/nixel2007), [Crowdin](https://crowdin.com/profile/nixel2007)
@@ -147,7 +148,7 @@ translations to the GitLab project.

1. Request proofreader permissions by opening a merge request
   to add yourself to the list of proofreaders.

-   Open the [`proofreader.md` source file](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/development/i18n/proofreader.md) and click **Edit**.
+   Open the [`proofreader.md` source file](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/development/i18n/proofreader.md) and select **Edit**.
Add your language in alphabetical order and add yourself to the list, including: diff --git a/doc/development/image_scaling.md b/doc/development/image_scaling.md index e1ffbdb766a..93575429369 100644 --- a/doc/development/image_scaling.md +++ b/doc/development/image_scaling.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Memory info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/insert_into_tables_in_batches.md b/doc/development/insert_into_tables_in_batches.md index cd659a3d19b..ebed3d16319 100644 --- a/doc/development/insert_into_tables_in_batches.md +++ b/doc/development/insert_into_tables_in_batches.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments description: "Sometimes it is necessary to store large amounts of records at once, which can be inefficient @@ -48,7 +48,7 @@ records = [MyModel.new, ...] MyModel.bulk_insert!(records) ``` -Note that calls to `bulk_insert!` will always attempt to insert _new records_. If instead +Calls to `bulk_insert!` always attempt to insert _new records_. If instead you would like to replace existing records with new values, while still inserting those that do not already exist, then you can use `bulk_upsert!`: @@ -59,9 +59,9 @@ MyModel.bulk_upsert!(records, unique_by: [:name]) ``` In this example, `unique_by` specifies the columns by which records are considered to be -unique and as such will be updated if they existed prior to insertion. For example, if +unique and as such are updated if they existed prior to insertion. For example, if `existing_model` has a `name` attribute, and if a record with the same `name` value already -exists, its fields will be updated with those of `existing_model`. +exists, its fields are updated with those of `existing_model`. The `unique_by` parameter can also be passed as a `Symbol`, in which case it specifies a database index by which a column is considered unique: @@ -72,8 +72,8 @@ MyModel.bulk_insert!(records, unique_by: :index_on_name) ### Record validation -The `bulk_insert!` method guarantees that `records` will be inserted transactionally, and -will run validations on each record prior to insertion. If any record fails to validate, +The `bulk_insert!` method guarantees that `records` are inserted transactionally, and +runs validations on each record prior to insertion. If any record fails to validate, an error is raised and the transaction is rolled back. You can turn off validations via the `:validate` option: @@ -83,7 +83,7 @@ MyModel.bulk_insert!(records, validate: false) ### Batch size configuration -In those cases where the number of `records` is above a given threshold, insertions will +In those cases where the number of `records` is above a given threshold, insertions occur in multiple batches. The default batch size is defined in [`BulkInsertSafe::DEFAULT_BATCH_SIZE`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/bulk_insert_safe.rb). Assuming a default threshold of 500, inserting 950 records @@ -95,7 +95,7 @@ MyModel.bulk_insert!(records, batch_size: 100) ``` Assuming the same number of 950 records, this would result in 10 batches being written instead. 
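To make the batch arithmetic concrete, here is a short sketch (reusing the hypothetical `MyModel` and the 950 records from the example above):

```ruby
records = Array.new(950) { MyModel.new }

# Default batch size of 500: (950 / 500.0).ceil => 2 INSERT statements.
MyModel.bulk_insert!(records)

# Explicit batch size of 100: (950 / 100.0).ceil => 10 INSERT statements.
MyModel.bulk_insert!(records, batch_size: 100)
```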
-Since this will also affect the number of `INSERT`s that occur, make sure you measure the
+Since this also affects the number of `INSERT` statements that occur, make sure you measure the
performance impact this might have on your code. There is a trade-off between the number of
`INSERT` statements the database has to process and the size and cost of each `INSERT`.

@@ -127,7 +127,7 @@ records are inserted in bulk, we currently prevent their use.

The specifics around which callbacks are explicitly allowed are defined in
[`BulkInsertSafe`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/bulk_insert_safe.rb).
Consult the module source code for details. If your class uses callbacks that are not explicitly designated
-safe and you `include BulkInsertSafe` the application will fail with an error.
+safe and you `include BulkInsertSafe` the application fails with an error.

### `BulkInsertSafe` versus `InsertAll`

@@ -155,7 +155,7 @@ owner = OwnerModel.new(owned_relations: array_of_owned_relations)
owner.save!
```

-This will issue a single `INSERT`, and transaction, for every record in `owned_relations`, which is inefficient if
+This issues a separate `INSERT` statement, and transaction, for every record in `owned_relations`, which is inefficient if
`array_of_owned_relations` is large. To remedy this, the `BulkInsertableAssociations` concern can be
used to declare that the owner defines associations that are safe for bulk insertion:

@@ -180,8 +180,8 @@ BulkInsertableAssociations.with_bulk_insert do
end
```

-Note that you can still save relations that are not `BulkInsertSafe` in this block; they will
-simply be treated as if you had invoked `save` from outside the block.
+You can still save relations that are not `BulkInsertSafe` in this block; they
+are simply treated as if you had invoked `save` from outside the block.

## Known limitations

There are a few restrictions to how these APIs can be used:

- It does not yet support `has_many through: ...` relations.

Moreover, input data should either be limited to around 1000 records at most,
-or already batched prior to calling bulk insert. The `INSERT` statement will run in a single
+or already batched prior to calling bulk insert. The `INSERT` statement runs in a single
transaction, so for large amounts of records it may negatively affect database stability.
diff --git a/doc/development/integrations/index.md b/doc/development/integrations/index.md
index e595fea6d96..604e481a809 100644
--- a/doc/development/integrations/index.md
+++ b/doc/development/integrations/index.md
@@ -43,7 +43,7 @@ if you need clarification or spot any outdated information.

### Define properties

Integrations can define arbitrary properties to store their configuration with the class method `Integration.prop_accessor`.
-The values are stored as a serialized JSON hash in the `integrations.properties` column.
+The values are stored as an encrypted JSON hash in the `integrations.encrypted_properties` column.

For example:
diff --git a/doc/development/integrations/jenkins.md b/doc/development/integrations/jenkins.md
index 8a3f64f0a0d..f430fc380b1 100644
--- a/doc/development/integrations/jenkins.md
+++ b/doc/development/integrations/jenkins.md
@@ -36,8 +36,8 @@ GitLab does not allow requests to localhost or the local network by default. Whe

Jenkins uses the GitLab API and needs an access token.

1. Sign in to your GitLab instance.
-1. Click on your profile picture, then click **Settings**.
-1. Click **Access Tokens**.
+1. 
Select your profile picture, then select **Settings**.
+1. Select **Access Tokens**.
1. Create a new Access Token with the **API** scope enabled. Note the value of the token.

## Configure Jenkins
diff --git a/doc/development/integrations/jira_connect.md b/doc/development/integrations/jira_connect.md
index 26ef67c937c..ade81e29ffb 100644
--- a/doc/development/integrations/jira_connect.md
+++ b/doc/development/integrations/jira_connect.md
@@ -37,13 +37,13 @@ To install the app in Jira:

   Marketplace:

   1. In Jira, navigate to **Jira settings > Apps > Manage apps**.
-   1. Scroll to the bottom of the **Manage apps** page and click **Settings**.
-   1. Select **Enable development mode** and click **Apply**.
+   1. Scroll to the bottom of the **Manage apps** page and select **Settings**.
+   1. Select **Enable development mode** and select **Apply**.

1. Install the app:

   1. In Jira, navigate to **Jira settings > Apps > Manage apps**.
-   1. Click **Upload app**.
+   1. Select **Upload app**.
   1. In the **From this URL** field, provide a link to the app descriptor. The host and port must point to your GitLab instance.

      For example:

@@ -52,10 +52,10 @@ To install the app in Jira:
      https://xxxx.gitpod.io/-/jira_connect/app_descriptor.json
      ```

-   1. Click **Upload**.
+   1. Select **Upload**.

   If the install was successful, you should see the **GitLab.com for Jira Cloud** app under **Manage apps**.
-   You can also click **Getting Started** to open the configuration page rendered from your GitLab instance.
+   You can also select **Getting Started** to open the configuration page rendered from your GitLab instance.

   _Note that any changes to the app descriptor require you to uninstall then reinstall the app._

@@ -106,11 +106,7 @@ The following steps describe setting up an environment to test the GitLab OAuth

   - Trusted: **No**
   - Confidential: **No**

1. Copy the Application ID.
+1. Go to **Admin > Settings > General**.
+1. Scroll down and expand the **GitLab for Jira App** section.
-1. Go to [gitpod.io/variables](https://gitpod.io/variables).
-1. Create a new variable named `JIRA_CONNECT_OAUTH_CLIENT_ID`, with a scope of `*/*`, and paste the Application ID as the value.
-
-If you already have an active Gitpod instance, use the following command in the Gitpod terminal to set the environment variable:
-
-```shell
-eval $(gp env -e JIRA_CONNECT_OAUTH_CLIENT_ID=$YOUR_APPLICATION_ID)
-```
+1. Paste the Application ID into the **Jira Connect Application ID** field and select **Save changes**.
diff --git a/doc/development/integrations/secure.md b/doc/development/integrations/secure.md
index 0f4fa1a97a8..1a51ee88c58 100644
--- a/doc/development/integrations/secure.md
+++ b/doc/development/integrations/secure.md
@@ -312,8 +312,7 @@ The format is extensively described in the documentation of
[SAST](../../user/application_security/sast/index.md#reports-json-format),
[DAST](../../user/application_security/dast/#reports),
[Dependency Scanning](../../user/application_security/dependency_scanning/index.md#reports-json-format),
-[Container Scanning](../../user/application_security/container_scanning/index.md#reports-json-format),
-and [Cluster Image Scanning](../../user/application_security/cluster_image_scanning/index.md#reports-json-format).
+and [Container Scanning](../../user/application_security/container_scanning/index.md#reports-json-format).

You can find the schemas for these scanners here:

@@ -333,33 +332,16 @@ GitLab has the following retention policies for vulnerabilities on non-default b

To view vulnerabilities, either:

-- Re-run the pipeline.
+- Run a new pipeline.
- Download the related CI job artifacts if they are available.

NOTE:
This does not apply to the vulnerabilities existing on the default branch.

-### Enable report validation
-
-> [Deprecated](https://gitlab.com/gitlab-org/gitlab/-/issues/354928) in GitLab 14.9, and planned for removal in GitLab 15.0.
-DISCLAIMER:
-This page contains information related to upcoming products, features, and functionality.
-It is important to note that the information presented is for informational purposes only.
-Please do not rely on this information for purchasing or planning purposes.
-As with all projects, the items mentioned on this page are subject to change or delay.
-The development, release, and timing of any products, features, or functionality remain at the
-sole discretion of GitLab Inc.
-In GitLab 15.0 and later, report validation is enabled and enforced. Reports that fail validation
-are not ingested, and an error message displays on the corresponding pipeline.
-
-In GitLab 14.10 and later, report validation against the schemas is enabled but not enforced.
-Reports that fail validation are ingested but display a warning in the pipeline security tab.
-
-To enforce report validation for GitLab version 14.10 and earlier, set
-[`VALIDATE_SCHEMA`](../../user/application_security/#enable-security-report-validation) to `"true"`.
-
### Report validation

+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/351000) in GitLab 15.0.
+
You must ensure that reports generated by the scanner pass validation against the schema version
declared in your reports. Reports that don't pass validation are not ingested by GitLab, and an
error message displays on the corresponding pipeline.
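As a rough illustration of what this validation involves (this is not GitLab's actual ingestion code; the `json_schemer` gem and the file names are assumptions for the sketch):

```ruby
require 'json'
require 'pathname'
require 'json_schemer'

# Hypothetical file names for the sketch.
schema = JSONSchemer.schema(Pathname.new('sast-report-format.json'))
report = JSON.parse(File.read('gl-sast-report.json'))

# A report must conform to the schema version it declares,
# otherwise it is rejected rather than ingested.
errors = schema.validate(report).to_a
puts errors.empty? ? 'report is valid' : "#{errors.size} schema violations found"
```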
- - Documentation for [`cluster_image_scanning` reports](../../user/application_security/cluster_image_scanning/index.md#reports-json-format). - See this [example secure job definition that also defines the artifact created](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/ci/templates/Security/Container-Scanning.gitlab-ci.yml). - If you need a new kind of scan or report, [create an issue](https://gitlab.com/gitlab-org/gitlab/-/issues/new#) and add the label `devops::secure`. - Once the job is completed, the data can be seen: - - In the [Merge Request Security Report](../../user/project/merge_requests/testing_and_reports_in_merge_requests.md#security-reports) ([MR Security Report data flow](https://gitlab.com/snippets/1910005#merge-request-view)). + - In the [Merge Request Security Report](../../ci/testing/index.md#security-reports) ([MR Security Report data flow](https://gitlab.com/snippets/1910005#merge-request-view)). - While [browsing a Job Artifact](../../ci/pipelines/job_artifacts.md). - In the [Security Dashboard](../../user/application_security/security_dashboard/index.md) ([Dashboard data flow](https://gitlab.com/snippets/1910005#project-and-group-dashboards)). 1. Optional: Provide a way to interact with results as Vulnerabilities: diff --git a/doc/development/internal_api/index.md b/doc/development/internal_api/index.md index dca71413564..288c0056821 100644 --- a/doc/development/internal_api/index.md +++ b/doc/development/internal_api/index.md @@ -254,7 +254,7 @@ recovery codes based on their SSH key. | Attribute | Type | Required | Description | |:----------|:-------|:---------|:------------| | `key_id` | integer | no | The ID of the SSH key used as found in the authorized-keys file or through the `/authorized_keys` check | -| `user_id` | integer | no | **Deprecated** User_id for which to generate new recovery codes | +| `user_id` | integer | no | **Deprecated** User ID for which to generate new recovery codes | ```plaintext GET /internal/two_factor_recovery_codes @@ -331,6 +331,37 @@ Example response: - GitLab Shell +## Authenticate Error Tracking requests + +This endpoint is called by the error tracking Go REST API application to authenticate a project. + +| Attribute | Type | Required | Description | +|:-------------|:--------|:---------|:-------------------------------------------------------------------| +| `project_id` | integer | yes | The ID of the project which has the associated key. | +| `public_key` | string | yes | The public key generated by the integrated error tracking feature. | + +```plaintext +POST /internal/error_tracking_allowed +``` + +Example request: + +```shell +curl --request POST --header "Gitlab-Shared-Secret: <Base64 encoded secret>" \ + --data "project_id=111&public_key=generated-error-tracking-key" \ + "http://localhost:3001/api/v4/internal/error_tracking_allowed" +``` + +Example response: + +```json +{ "enabled": true } +``` + +### Known consumers + +- OpsTrace + ## Incrementing counter on pre-receive This is called from the Gitaly hooks increasing the reference counter @@ -559,6 +590,39 @@ curl --request POST --header "Gitlab-Kas-Api-Request: <JWT token>" \ --data '{ "uuids": ["102e8a0a-fe29-59bd-b46c-57c3e9bc6411", "5eb12985-0ed5-51f4-b545-fd8871dc2870"] }' ``` +### Scan Execution Policies + +Called from GitLab agent server (`kas`) to retrieve `scan_execution_policies` +configured for the project belonging to the agent token. GitLab `kas` uses +this to configure the agent to scan images in the Kubernetes cluster based on the policy. 
+
+```plaintext
+GET /internal/kubernetes/modules/starboard_vulnerability/scan_execution_policies
+```
+
+Example request:
+
+```shell
+curl --request GET --header "Gitlab-Kas-Api-Request: <JWT token>" \
+     --header "Authorization: Bearer <agent token>" "http://localhost:3000/api/v4/internal/kubernetes/modules/starboard_vulnerability/scan_execution_policies"
+```
+
+Example response:
+
+```json
+{
+  "policies": [
+    {
+      "name": "Policy",
+      "description": "Policy description",
+      "enabled": true,
+      "yaml": "---\nname: Policy\ndescription: 'Policy description'\nenabled: true\nactions:\n- scan: container_scanning\nrules:\n- type: pipeline\n branches:\n - main\n",
+      "updated_at": "2022-06-02T05:36:26+00:00"
+    }
+  ]
+}
+```
+
## Subscriptions

The subscriptions endpoint is used by [CustomersDot](https://gitlab.com/gitlab-org/customers-gitlab-com) (`customers.gitlab.com`)
@@ -763,7 +827,7 @@ Example response:

### Moving additional packs

-Use a PATCH to move additional packs from one namespace to another.
+Use a `PATCH` to move additional packs from one namespace to another.

```plaintext
PATCH /namespaces/:id/minutes/move/:target_id
```

@@ -816,7 +880,7 @@ Each array element contains:

| Attribute | Type | Required | Description |
|:-------------------|:-----------|:---------|:------------|
| `namespace_id` | integer | yes | ID of the namespace to be reconciled |
-| `next_reconciliation_date` | date | yes | Date when next reconciliation will happen |
+| `next_reconciliation_date` | date | yes | Date of the next reconciliation |
| `display_alert_from` | date | yes | Start date to display alert of upcoming reconciliation |

Example request:
diff --git a/doc/development/iterating_tables_in_batches.md b/doc/development/iterating_tables_in_batches.md
index 8813fe560db..b4459b53efa 100644
--- a/doc/development/iterating_tables_in_batches.md
+++ b/doc/development/iterating_tables_in_batches.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
@@ -31,7 +31,7 @@ User.each_batch(of: 10) do |relation|
end
```

-This will end up producing queries such as:
+This produces queries such as:

```plaintext
User Load (0.7ms) SELECT "users"."id" FROM "users" WHERE ("users"."id" >= 41654) ORDER BY "users"."id" ASC LIMIT 1 OFFSET 1000
@@ -46,7 +46,7 @@ all of the arguments that `in_batches` supports. You should always use

One should proceed with extra caution, and possibly avoid iterating over a column that can contain
duplicate values. When you iterate over an attribute that is not unique, even with the applied max
-batch size, there is no guarantee that the resulting batches will not surpass it. The following
+batch size, there is no guarantee that the resulting batches do not surpass it. The following
snippet demonstrates this situation when one attempts to select `Ci::Build` entries for users with
`id` between `1` and `10,000`; the database returns `1 215 178` matching rows.

@@ -67,7 +67,7 @@ SELECT "ci_builds".* FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND

even though the range size is limited to a certain threshold (`10,000` in the previous example),
this threshold does not translate to the size of the returned dataset. That happens because when
taking `n` possible values of attributes, one can't tell for sure that the number of records that contains
-them will be less than `n`.
+them is less than `n`.
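A sketch of the access pattern that triggers this, mirroring the `Ci::Build` example above (the block body is illustrative):

```ruby
Ci::Build.where(user_id: 1..10_000).each_batch(of: 10_000, column: :user_id) do |relation|
  # Each batch is bounded by user_id values, not by row count, so one
  # batch can contain far more than 10,000 build rows when users have
  # many builds.
  puts relation.count
end
```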
## Column definition

@@ -99,8 +99,8 @@ determines the data ranges (slices) and schedules the background jobs uses `each

## Efficient usage of `each_batch`

-`EachBatch` helps to iterate over large tables. It's important to highlight that `EachBatch` is
-not going to magically solve all iteration related performance problems and it might not help at
+`EachBatch` helps to iterate over large tables. It's important to highlight that `EachBatch`
+does not magically solve all iteration-related performance problems, and it might not help at
all in some scenarios. From the database point of view, correctly configured database indexes are
also necessary to make `EachBatch` perform well.

@@ -108,7 +108,7 @@ also necessary to make `EachBatch` perform well.

Let's consider that we want to iterate over the `users` table and print the `User` records to the
standard output. The `users` table contains millions of records, thus running one query to fetch
-the users will likely time out.
+the users likely times out.

![Users table overview](img/each_batch_users_table_v13_7.png)

@@ -171,7 +171,7 @@ SELECT "users".* FROM "users" WHERE "users"."id" >= 1 AND "users"."id" < 302

![Reading the rows from the `users` table](img/each_batch_users_table_iteration_3_v13_7.png)

Notice the `<` sign. Previously six items were read from the index and in this query, the last
-value is "excluded". The query will look at the index to get the location of the five `user`
+value is "excluded". The query looks at the index to get the location of the five `user`
rows on the disk and reads the rows from the table. The returned array is processed in Ruby.

The first iteration is done. For the next iteration, the last `id` value is reused from the
@@ -204,13 +204,13 @@ users.each_batch(of: 5) do |relation|
end
```

-`each_batch` will produce the following SQL query for the start `id` value:
+`each_batch` produces the following SQL query for the start `id` value:

```sql
SELECT "users"."id" FROM "users" WHERE "users"."sign_in_count" = 0 ORDER BY "users"."id" ASC LIMIT 1
```

-Selecting only the `id` column and ordering by `id` is going to "force" the database to use the
+Selecting only the `id` column and ordering by `id` forces the database to use the
index on the `id` (primary key index) column. However, we also have an extra condition on the
`sign_in_count` column. The column is not part of the index, so the database needs to look into
the actual table to find the first matching row.

@@ -225,7 +225,7 @@ The number of scanned rows depends on the data distribution in the table.

In this particular example, the database had to read 10 rows (regardless of our batch size setting)
to determine the first `id` value. In a "real-world" application it's hard to predict whether the
-filtering is going to cause problems or not. In the case of GitLab, verifying the data on a
+filtering causes problems or not. In the case of GitLab, verifying the data on a
production replica is a good start, but keep in mind that data distribution on GitLab.com can be
different from self-managed instances.

@@ -289,7 +289,7 @@ CREATE INDEX index_on_users_never_logged_in ON users (sign_in_count, id)

![Reading a good index](img/each_batch_users_table_good_index_v13_7.png)

-The following index definition is not going to work well with `each_batch` (avoid).
+The following index definition does not work well with `each_batch` (avoid).
```sql
CREATE INDEX index_on_users_never_logged_in ON users (sign_in_count)
diff --git a/doc/development/licensed_feature_availability.md b/doc/development/licensed_feature_availability.md
index 6df5c2164e8..09c32fc4244 100644
--- a/doc/development/licensed_feature_availability.md
+++ b/doc/development/licensed_feature_availability.md
@@ -22,9 +22,17 @@ it should be restricted on namespace scope.

1. Check using:

```ruby
-project.feature_available?(:feature_symbol)
+project.licensed_feature_available?(:feature_symbol)
```

+or
+
+```ruby
+group.licensed_feature_available?(:feature_symbol)
+```
+
+For projects, `licensed_feature_available?` delegates to its associated `namespace`.
+
## Restricting global features (instance)

However, for features such as [Geo](../administration/geo/index.md) and
diff --git a/doc/development/logging.md b/doc/development/logging.md
index 6a0b50d6970..749f85c9e2d 100644
--- a/doc/development/logging.md
+++ b/doc/development/logging.md
@@ -344,7 +344,7 @@ provides helper methods to track exceptions:

1. `Gitlab::ErrorTracking.track_exception`: this method only logs
   and sends exception to Sentry (if configured),
1. `Gitlab::ErrorTracking.log_exception`: this method only logs the exception,
-   and DOES NOT send the exception to Sentry,
+   and does not send the exception to Sentry,
1. `Gitlab::ErrorTracking.track_and_raise_for_dev_exception`: this method logs,
   sends exception to Sentry (if configured) and re-raises the exception
   for development and test environments.
diff --git a/doc/development/maintenance_mode.md b/doc/development/maintenance_mode.md
index a118d9cf0ad..f70cca1040e 100644
--- a/doc/development/maintenance_mode.md
+++ b/doc/development/maintenance_mode.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Systems
group: Geo
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
diff --git a/doc/development/merge_request_concepts/index.md b/doc/development/merge_request_concepts/index.md
index 90e8ff41368..8df0da5123e 100644
--- a/doc/development/merge_request_concepts/index.md
+++ b/doc/development/merge_request_concepts/index.md
@@ -1,7 +1,7 @@
---
type: reference, dev
-stage: create
-group: code_review
+stage: Create
+group: Code Review
info: "See the Technical Writers assigned to Development Guidelines: https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments-to-development-guidelines"
---
diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md
index aebecd90574..c9b59ba66b5 100644
--- a/doc/development/migration_style_guide.md
+++ b/doc/development/migration_style_guide.md
@@ -52,9 +52,9 @@ work it needs to perform and how long it takes to complete:

   - Clean-ups, like removing unused columns.
   - Adding non-critical indices on high-traffic tables.
   - Adding non-critical indices that take a long time to create.
-1. [**Background migrations.**](database/background_migrations.md) These aren't regular Rails migrations, but application code that is
+1. [**Batched background migrations.**](database/batched_background_migrations.md) These aren't regular Rails migrations, but application code that is
   executed via Sidekiq jobs, although a post-deployment migration is used to schedule them. Use them only for data migrations that
+ exceed the timing guidelines for post-deploy migrations. Batched background migrations should _not_ change the schema.

Use the following diagram to guide your decision, but keep in mind that it is just a tool, and the final outcome will always be dependent on the specific changes being made:

@@ -495,7 +495,7 @@ def up
end
```

-The RuboCop rule generally allows standard Rails migration methods, listed below. This example causes a Rubocop offense:
+The RuboCop rule generally allows standard Rails migration methods, listed below. This example causes a RuboCop offense:

```ruby
disable_ddl_transaction!

@@ -898,6 +898,44 @@ def down
end
```

+## Dropping a sequence
+
+> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/88387) in GitLab 15.1.
+
+Dropping a sequence is uncommon, but you can use the `drop_sequence` method provided by the database team.
+
+Under the hood, it works like this:
+
+Remove a sequence:
+
+- Remove the default value if the sequence is actually used.
+- Execute `DROP SEQUENCE`.
+
+Re-add a sequence:
+
+- Create the sequence, with the possibility of specifying the current value.
+- Change the default value of the column.
+
+A Rails migration example:
+
+```ruby
+class DropSequenceTest < Gitlab::Database::Migration[2.0]
+  def up
+    drop_sequence(:ci_pipelines_config, :pipeline_id, :ci_pipelines_config_pipeline_id_seq)
+  end
+
+  def down
+    default_value = Ci::Pipeline.maximum(:id) + 10_000
+
+    add_sequence(:ci_pipelines_config, :pipeline_id, :ci_pipelines_config_pipeline_id_seq, default_value)
+  end
+end
+```
+
+NOTE:
+`add_sequence` should be avoided for columns with foreign keys.
+Adding a sequence to these columns is **only allowed** in the down method (to restore the previous schema state).
+
## Integer column type

By default, an integer column can hold up to a 4-byte (32-bit) number. That is
diff --git a/doc/development/new_fe_guide/modules/widget_extensions.md b/doc/development/new_fe_guide/modules/widget_extensions.md
index d3be8981abb..4bae0ac70c4 100644
--- a/doc/development/new_fe_guide/modules/widget_extensions.md
+++ b/doc/development/new_fe_guide/modules/widget_extensions.md
@@ -46,6 +46,7 @@ export default {
  methods: {
    fetchCollapsedData(props) {}, // Required: Fetches data required for collapsed state
    fetchFullData(props) {}, // Required: Fetches data for the full expanded content
+    fetchMultiData() {}, // Optional: Works in conjunction with `enablePolling` and allows polling multiple endpoints
  },
};
```

@@ -232,6 +233,30 @@
};
```

+If the extension needs to poll multiple endpoints at the same time, then `fetchMultiData`
+can be used to return an array of functions. A new `poll` object is created for each
+endpoint and they are polled separately. After all endpoints are resolved, polling is
+stopped and `setCollapsedData` is called with an array of `response.data`.
+
+```javascript
+export default {
+  //...
+  enablePolling: true,
+  methods: {
+    fetchMultiData() {
+      return [
+        () => axios.get(this.reportPath1),
+        () => axios.get(this.reportPath2),
+        () => axios.get(this.reportPath3)
+      ];
+    },
+  },
+};
+```
+
+**Important**: The function needs to return a `Promise` that resolves to the `response` object.
+The implementation relies on the `POLL-INTERVAL` header to keep polling, therefore it is
+important not to alter the status code and headers.
+
### Errors

If the `fetchCollapsedData()` or `fetchFullData()` methods throw an error:
diff --git a/doc/development/omnibus.md b/doc/development/omnibus.md
index dc83b0ea257..b62574e34e5 100644
--- a/doc/development/omnibus.md
+++ b/doc/development/omnibus.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Systems
group: Distribution
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
diff --git a/doc/development/ordering_table_columns.md b/doc/development/ordering_table_columns.md
index 00ce15fcc10..7cd3d4fb208 100644
--- a/doc/development/ordering_table_columns.md
+++ b/doc/development/ordering_table_columns.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
@@ -24,15 +24,15 @@ The first column is a 4-byte integer. The next is text of variable length. The

bytes. To meet the alignment requirements, four zeros are to be added right
after the first column, so `id` occupies 4 bytes, then 4 bytes of alignment
padding, and only next `name` is being stored. Therefore, in this case, 8 bytes
-will be spent for storing a 4-byte integer.
+are spent for storing a 4-byte integer.

The space between rows is also subject to alignment padding. The `user_id`
-column takes only 4 bytes, and on 64-bit platform, 4 zeroes will be added for
+column takes only 4 bytes, and on 64-bit platform, 4 zeroes are added for
alignment padding, to allow storing the next row beginning with the "clear" word.

As a result, the actual size of each column would be (omitting variable length
data and 24-byte tuple header): 8 bytes, variable, 8 bytes. This means that
-each row will require at least 16 bytes for the two 4-byte integers. If a table
+each row requires at least 16 bytes for the two 4-byte integers. If a table
has a few rows this is not an issue. However, once you start storing millions of
rows you can save space by using a different order. For the above example, the
ideal column order would be the following:

@@ -49,7 +49,7 @@
or

In these examples, the `id` and `user_id` columns are packed together, which
means we only need 8 bytes to store _both_ of them. This in turn means each row
-will require 8 bytes less space.
+requires 8 bytes less space.

Since Ruby on Rails 5.1, the default data type for IDs is `bigint`, which uses 8 bytes.
We are using `integer` in the examples to showcase a more realistic reordering scenario.

## Type Sizes

While the [PostgreSQL documentation](https://www.postgresql.org/docs/current/datatype.html) contains plenty
-of information we will list the sizes of common types here so it's easier to
+of information, we list the sizes of common types here so it's easier to
look them up. Here "word" refers to the word size, which is 4 bytes for a 32
bits platform and 8 bytes for a 64 bits platform.

@@ -69,7 +69,7 @@
| `real` | 4 bytes | 1 word |
| `double precision` | 8 bytes | 8 bytes |
| `boolean` | 1 byte | not needed |
-| `text` / `string`  | variable, 1 byte plus the data | 1 word |
+| `text` / `string` | variable, 1 byte plus the data | 1 word |
| `bytea` | variable, 1 or 4 bytes plus the data | 1 word |
| `timestamp` | 8 bytes | 8 bytes |
| `timestamptz` | 8 bytes | 8 bytes |
@@ -77,7 +77,7 @@

A "variable" size means the actual size depends on the value being stored. If
PostgreSQL determines this can be embedded directly into a row it may do so, but
-for very large values it will store the data externally and store a pointer (of
+for very large values it stores the data externally and stores a pointer (of
1 word in size) in the column. Because of this, variable sized columns should
always be at the end of a table.
diff --git a/doc/development/packages.md b/doc/development/packages.md
index 6526bdd45a1..a79f5f09677 100644
--- a/doc/development/packages.md
+++ b/doc/development/packages.md
@@ -1,302 +1,11 @@
---
-stage: Package
-group: Package
-info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+redirect_to: 'packages/index.md'
+remove_date: '2022-08-19'
---

-# Packages
-
-This document guides you through adding support to GitLab for a new a [package management system](../administration/packages/index.md).
-
-See the already supported formats in the [Packages & Registries documentation](../user/packages/index.md)
-
-It is possible to add a new format with only backend changes.
-This guide is superficial and does not cover the way the code should be written.
-However, you can find a good example by looking at the following merge requests:
-
-- [npm registry support](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/8673)
-- [Maven repository](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/6607)
-- [Instance-level API for Maven repository](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/8757)
-- [NuGet group-level API](https://gitlab.com/gitlab-org/gitlab/-/issues/36423)
-
-## General information
-
-The existing database model requires the following:
-
-- Every package belongs to a project.
-- Every package file belongs to a package.
-- A package can have one or more package files.
-- The package model is based on storing information about the package and its version.
-
-### API endpoints
-
-Package systems work with GitLab via API. For example `lib/api/npm_project_packages.rb`
-implements API endpoints to work with npm clients. So, the first thing to do is to
-add a new `lib/api/your_name_project_packages.rb` file with API endpoints that are
-necessary to make the package system client to work. Usually that means having
-endpoints like:
-
-- GET package information.
-- GET package file content.
-- PUT upload package.
-
-Since the packages belong to a project, it's expected to have project-level endpoint (remote)
-for uploading and downloading them. For example:
-
-```plaintext
-GET https://gitlab.com/api/v4/projects/<your_project_id>/packages/npm/
-PUT https://gitlab.com/api/v4/projects/<your_project_id>/packages/npm/
-```
-
-Group-level and instance-level endpoints should only be considered after the project-level endpoint is available in production.
- -#### Remote hierarchy - -Packages are scoped within various levels of access, which is generally configured by setting your remote. A -remote endpoint may be set at the project level, meaning when installing packages, only packages belonging to that -project are visible. Alternatively, a group-level endpoint may be used to allow visibility to all packages -within a given group. Lastly, an instance-level endpoint can be used to allow visibility to all packages within an -entire GitLab instance. - -As an MVC, we recommend beginning with a project-level endpoint. A typical iteration plan for remote hierarchies is to go from: - -- Publish and install in a project -- Install from a group -- Publish and install in an Instance (this is for Self-Managed customers) - -Using instance-level endpoints requires [stricter naming conventions](#naming-conventions). - -NOTE: -Composer package naming scope is Instance Level. - -### Naming conventions - -To avoid name conflict for instance-level endpoints you must define a package naming convention -that gives a way to identify the project that the package belongs to. This generally involves using the project -ID or full project path in the package name. See -[Conan's naming convention](../user/packages/conan_repository/index.md#package-recipe-naming-convention-for-instance-remotes) as an example. - -For group and project-level endpoints, naming can be less constrained and it is up to the group and project -members to be certain that there is no conflict between two package names. However, the system should prevent -a user from reusing an existing name within a given scope. - -Otherwise, naming should follow the package manager's naming conventions and include a validation in the `package.md` -model for that package type. - -### Services and finders - -Logic for performing tasks such as creating package or package file records or finding packages should not live -within the API file, but should live in services and finders. Existing services and finders should be used or -extended when possible to keep the common package logic grouped as much as possible. - -### Configuration - -GitLab has a `packages` section in its configuration file (`gitlab.rb`). -It applies to all package systems supported by GitLab. Usually you don't need -to add anything there. - -Packages can be configured to use object storage, therefore your code must support it. - -## MVC Approach - -The way new package systems are integrated in GitLab is using an [MVC](https://about.gitlab.com/handbook/values/#minimum-viable-change-mvc). Therefore, the first iteration should support the bare minimum user actions: - -- Authentication with a GitLab job, personal access, project access, or deploy token -- Uploading a package and displaying basic metadata in the user interface -- Pulling a package -- Required actions - -Required actions are all the additional requests that GitLab needs to handle so the corresponding package manager CLI can work properly. It could be a search feature or an endpoint providing meta information about a package. For example: - -- For NuGet, the search request was implemented during the first MVC iteration, to support Visual Studio. -- For npm, there is a metadata endpoint used by `npm` to get the tarball URL. - -For the first MVC iteration, it's recommended to stay at the project level of the [remote hierarchy](#remote-hierarchy). Other levels can be tackled with [future Merge Requests](#future-work). 
- -There are usually 2 phases for the MVC: - -- [Analysis](#analysis) -- [Implementation](#implementation) - -### Keep iterations small - -When implementing a new package manager, it is tempting to create one large merge request containing all of the -necessary endpoints and services necessary to support basic usage. Instead, put the -API endpoints behind a [feature flag](feature_flags/index.md) and -submit each endpoint or behavior (download, upload, etc) in a different merge request to shorten the review -process. - -### Analysis - -During this phase, the idea is to collect as much information as possible about the API used by the package system. Here some aspects that can be useful to include: - -- **Authentication**: What authentication mechanisms are available (OAuth, Basic - Authorization, other). Keep in mind that GitLab users often want to use their - [Personal Access Tokens](../user/profile/personal_access_tokens.md). - Although not needed for the MVC first iteration, the [CI/CD job tokens](../ci/jobs/ci_job_token.md) - have to be supported at some point in the future. -- **Requests**: Which requests are needed to have a working MVC. Ideally, produce - a list of all the requests needed for the MVC (including required actions). Further - investigation could provide an example for each request with the request and the response bodies. -- **Upload**: Carefully analyze how the upload process works. This is likely the most - complex request to implement. A detailed analysis is desired here as uploads can be - encoded in different ways (body or multipart) and can even be in a totally different - format (for example, a JSON structure where the package file is a Base64 value of - a particular field). These different encodings lead to slightly different implementations - on GitLab and GitLab Workhorse. For more detailed information, review [file uploads](#file-uploads). -- **Endpoints**: Suggest a list of endpoint URLs to implement in GitLab. -- **Split work**: Suggest a list of changes to do to incrementally build the MVC. - This gives a good idea of how much work there is to be done. Here is an example - list that would need to be adapted on a case by case basis: - 1. Empty file structure (API file, base service for this package) - 1. Authentication system for "logging in" to the package manager - 1. Identify metadata and create applicable tables - 1. Workhorse route for [object storage direct upload](uploads/index.md#direct-upload) - 1. Endpoints required for upload/publish - 1. Endpoints required for install/download - 1. Endpoints required for required actions - -The analysis usually takes a full milestone to complete, though it's not impossible to start the implementation in the same milestone. - -In particular, the upload request can have some [requirements in the GitLab Workhorse project](#file-uploads). This project has a different release cycle than the rails backend. It's **strongly** recommended that you open an issue there as soon as the upload request analysis is done. This way GitLab Workhorse is already ready when the upload request is implemented on the rails backend. - -### Implementation - -The implementation of the different Merge Requests varies between different package system integrations. Contributors should take into account some important aspects of the implementation phase. - -#### Authentication - -The MVC must support [Personal Access Tokens](../user/profile/personal_access_tokens.md) right from the start. 
We currently support two options for these tokens: OAuth and Basic Access. - -OAuth authentication is already supported. You can see an example in the [npm API](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/api/npm_project_packages.rb). - -[Basic Access authentication](https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication) -support is done by overriding a specific function in the API helpers, like -[this example in the Conan API](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/api/conan_packages.rb). -For this authentication mechanism, keep in mind that some clients can send an unauthenticated -request first, wait for the 401 Unauthorized response with the [`WWW-Authenticate`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/WWW-Authenticate) -field, then send an updated (authenticated) request. This case is more involved as -GitLab needs to handle the 401 Unauthorized response. The [NuGet API](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/api/nuget_packages.rb) -supports this case. - -#### Authorization - -There are project and group level permissions for `read_package`, `create_package`, and `destroy_package`. Each -endpoint should -[authorize the requesting user](https://gitlab.com/gitlab-org/gitlab/-/blob/398fef1ca26ae2b2c3dc89750f6b20455a1e5507/ee/lib/api/conan_packages.rb) -against the project or group before continuing. - -#### Database and handling metadata - -The current database model allows you to store a name and a version for each package. -Every time you upload a new package, you can either create a new record of `Package` -or add files to existing record. `PackageFile` should be able to store all file-related -information like the file `name`, `side`, `sha1`, and so on. - -If there is specific data necessary to be stored for only one package system support, -consider creating a separate metadata model. See `packages_maven_metadata` table -and `Packages::Maven::Metadatum` model as an example for package specific data, and `packages_conan_file_metadata` table -and `Packages::Conan::FileMetadatum` model as an example for package file specific data. - -If there is package specific behavior for a given package manager, add those methods to the metadata models and -delegate from the package model. - -Note that the existing package UI only displays information within the `packages_packages` and `packages_package_files` -tables. If the data stored in the metadata tables need to be displayed, a ~frontend change is required. - -#### File uploads - -File uploads should be handled by GitLab Workhorse using object accelerated uploads. What this means is that -the workhorse proxy that checks all incoming requests to GitLab intercept the upload request, -upload the file, and forward a request to the main GitLab codebase only containing the metadata -and file location rather than the file itself. An overview of this process can be found in the -[development documentation](uploads/index.md#direct-upload). - -In terms of code, this means a route must be added to the -[GitLab Workhorse project](https://gitlab.com/gitlab-org/gitlab-workhorse) for each upload endpoint being added -(instance, group, project). [This merge request](https://gitlab.com/gitlab-org/gitlab-workhorse/-/merge_requests/412/diffs) -demonstrates adding an instance-level endpoint for Conan to workhorse. You can also see the Maven project level endpoint -implemented in the same file. 
- -Once the route has been added, you must add an additional `/authorize` version of the upload endpoint to your API file. -[This example](https://gitlab.com/gitlab-org/gitlab/-/blob/398fef1ca26ae2b2c3dc89750f6b20455a1e5507/ee/lib/api/maven_packages.rb#L164) -shows the additional endpoint added for Maven. The `/authorize` endpoint verifies and authorizes the request from workhorse, -then the normal upload endpoint is implemented below, consuming the metadata that workhorse provides in order to -create the package record. Workhorse provides a variety of file metadata such as type, size, and different checksum formats. - -For testing purposes, you may want to [enable object storage](https://gitlab.com/gitlab-org/gitlab-development-kit/blob/main/doc/howto/object_storage.md) -in your local development environment. - -#### File size limits - -Files uploaded to the GitLab Package Registry are [limited by format](../administration/instance_limits.md#package-registry-limits). -On GitLab.com, these are typically set to 5GB to help prevent timeout issues and abuse. - -When a new package type is added to the `Packages::Package` model, a size limit must be added -similar to [this example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/52639/diffs#382f879fb09b0212e3cedd99e6c46e2083867216), -or the [related test](https://gitlab.com/gitlab-org/gitlab/-/blob/fe4ba43766781371cebfacd78364a1de762917cd/spec/models/packages/package_spec.rb#L761) -must be updated if file size limits do not apply. The only reason a size limit does not apply is if -the package format does not upload and store package files. - -#### Rate Limits on GitLab.com - -Package manager clients can make rapid requests that exceed the -[GitLab.com standard API rate limits](../user/gitlab_com/index.md#gitlabcom-specific-rate-limits). -This results in a `429 Too Many Requests` error. - -We have opened a set of paths to allow higher rate limits. Unless it is not possible, -new package managers should follow these conventions so they can take advantage of the -expanded package rate limit. - -These route prefixes guarantee a higher rate limit: - -```plaintext -/api/v4/packages/ -/api/v4/projects/:project_id/packages/ -/api/v4/groups/:group_id/-/packages/ -``` - -### MVC Checklist - -When adding support to GitLab for a new package manager, the first iteration must contain the -following features. You can add the features through many merge requests as needed, but all the -features must be implemented when the feature flag is removed. 
- -- Project-level API -- Push event tracking -- Pull event tracking -- Authentication with Personal Access Tokens -- Authentication with Job Tokens -- Authentication with Deploy Tokens (group and project) -- File size [limit](#file-size-limits) -- File format guards (only accept valid file formats for the package type) -- Name regex with validation -- Version regex with validation -- Workhorse route for [accelerated](uploads/working_with_uploads.md) uploads -- Background workers for extracting package metadata (if applicable) -- Documentation (how to use the feature) -- API Documentation (individual endpoints with curl examples) -- Seeding in [`db/fixtures/development/26_packages.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/fixtures/development/26_packages.rb) -- Update the [runbook](https://gitlab.com/gitlab-com/runbooks/-/blob/31fb4959e89db25fddf865bc81734c222daf32dd/dashboards/stage-groups/package.dashboard.jsonnet#L74) for the Grafana charts -- End-to-end feature tests for (at the minimum) publishing and installing a package - -### Future Work - -While working on the MVC, contributors might find features that are not mandatory for the MVC but can provide a better user experience. It's generally a good idea to keep an eye on those and open issues. - -Here are some examples - -1. Endpoints required for search -1. Front end updates to display additional package information and metadata -1. Limits on file sizes -1. Tracking for metrics -1. Read more metadata fields from the package to make it available to the front end. For example, it's usual to be able to tag a package. Those tags can be read and saved by backend and then displayed on the packages UI. -1. Endpoints for the upper levels of the [remote hierarchy](#remote-hierarchy). This step might need to create a [naming convention](#naming-conventions) - -## Exceptions - -This documentation is just guidelines on how to implement a package manager to match the existing structure and logic -already present within GitLab. While the structure is intended to be extendable and flexible enough to allow for -any given package manager, if there is good reason to stray due to the constraints or needs of a given package -manager, then it should be raised and discussed within the implementation issue or merge request to work towards -the most efficient outcome. +<!-- This redirect file can be deleted after <2022-08-19>. --> +<!-- Redirects that point to other docs in the same project expire in three months. --> +<!-- Redirects that point to docs in a different project or site (for example, link is not relative and starts with `https:`) expire in one year. 
-->
+<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/redirects.html -->
diff --git a/doc/development/packages/index.md b/doc/development/packages/index.md
new file mode 100644
index 00000000000..55deaa229ba
--- /dev/null
+++ b/doc/development/packages/index.md
@@ -0,0 +1,25 @@
+---
+stage: Package
+group: Package
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Package Registry Development
+
+Development and architectural documentation for the package registry:
+
+- [Developing a new format](new_format_development.md)
+- [Settings](settings.md)
+- [Structure / Schema](structure.md)
+- API documentation
+  - [Composer](../../api/packages/composer.md)
+  - [Conan](../../api/packages/conan.md)
+  - [Debian](../../api/packages/debian.md)
+  - [Generic](../../user/packages/generic_packages/index.md)
+  - [Go Proxy](../../api/packages/go_proxy.md)
+  - [Helm](../../api/packages/helm.md)
+  - [Maven](../../api/packages/maven.md)
+  - [npm](../../api/packages/npm.md)
+  - [NuGet](../../api/packages/nuget.md)
+  - [PyPI](../../api/packages/pypi.md)
+  - [Ruby Gems](../../api/packages/rubygems.md)
diff --git a/doc/development/packages/new_format_development.md b/doc/development/packages/new_format_development.md
new file mode 100644
index 00000000000..f7d02f9160b
--- /dev/null
+++ b/doc/development/packages/new_format_development.md
@@ -0,0 +1,302 @@
+---
+stage: Package
+group: Package
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Developing support for a new package format
+
+This document guides you through adding support to GitLab for a new [package management system](../../administration/packages/index.md).
+
+See the already supported formats in the [Packages & Registries documentation](../../user/packages/index.md).
+
+It is possible to add a new format with only backend changes.
+This guide is superficial and does not cover the way the code should be written.
+However, you can find a good example by looking at the following merge requests:
+
+- [npm registry support](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/8673)
+- [Maven repository](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/6607)
+- [Instance-level API for Maven repository](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/8757)
+- [NuGet group-level API](https://gitlab.com/gitlab-org/gitlab/-/issues/36423)
+
+## General information
+
+The existing database model requires the following:
+
+- Every package belongs to a project.
+- Every package file belongs to a package.
+- A package can have one or more package files.
+- The package model is based on storing information about the package and its version.
+
+### API endpoints
+
+Package systems work with GitLab via API. For example `lib/api/npm_project_packages.rb`
+implements API endpoints to work with npm clients. So, the first thing to do is to
+add a new `lib/api/your_name_project_packages.rb` file with API endpoints that are
+necessary to make the package system client work. Usually that means having
+endpoints like:
+
+- GET package information.
+- GET package file content.
+- PUT upload package.
+
+Because the packages belong to a project, it's expected to have a project-level endpoint (remote)
+for uploading and downloading them.
For example:
+
+```plaintext
+GET https://gitlab.com/api/v4/projects/<your_project_id>/packages/npm/
+PUT https://gitlab.com/api/v4/projects/<your_project_id>/packages/npm/
+```
+
+Group-level and instance-level endpoints should only be considered after the project-level endpoint is available in production.
+
+#### Remote hierarchy
+
+Packages are scoped within various levels of access, which is generally configured by setting your remote. A
+remote endpoint may be set at the project level, meaning that when installing packages, only packages belonging to that
+project are visible. Alternatively, a group-level endpoint may be used to allow visibility to all packages
+in a given group. Lastly, an instance-level endpoint can be used to allow visibility to all packages in an
+entire GitLab instance.
+
+As an MVC, we recommend beginning with a project-level endpoint. A typical iteration plan for remote hierarchies is to go from:
+
+- Publish and install in a project
+- Install from a group
+- Publish and install in an instance (this is for self-managed customers)
+
+Using instance-level endpoints requires [stricter naming conventions](#naming-conventions).
+
+NOTE:
+Composer package naming is scoped at the instance level.
+
+### Naming conventions
+
+To avoid name conflicts for instance-level endpoints, you must define a package naming convention
+that gives a way to identify the project that the package belongs to. This generally involves using the project
+ID or full project path in the package name. See
+[Conan's naming convention](../../user/packages/conan_repository/index.md#package-recipe-naming-convention-for-instance-remotes) as an example.
+
+For group and project-level endpoints, naming can be less constrained, and it is up to the group and project
+members to be certain that there is no conflict between two package names. However, the system should prevent
+a user from reusing an existing name within a given scope.
+
+Otherwise, naming should follow the package manager's naming conventions and include a validation in the `package.rb`
+model for that package type.
+
+### Services and finders
+
+Logic for performing tasks such as creating package or package file records or finding packages should not live
+in the API file, but should live in services and finders. Existing services and finders should be used or
+extended when possible to keep the common package logic grouped as much as possible.
+
+### Configuration
+
+GitLab has a `packages` section in its configuration file (`gitlab.rb`).
+It applies to all package systems supported by GitLab. Usually you don't need
+to add anything there.
+
+Packages can be configured to use object storage, therefore your code must support it.
+
+## MVC Approach
+
+The way new package systems are integrated in GitLab is using an [MVC](https://about.gitlab.com/handbook/values/#minimum-viable-change-mvc). Therefore, the first iteration should support the bare minimum user actions:
+
+- Authentication with a GitLab job, personal access, project access, or deploy token
+- Uploading a package and displaying basic metadata in the user interface
+- Pulling a package
+- Required actions
+
+Required actions are all the additional requests that GitLab must handle so the corresponding package manager CLI can work properly. It could be a search feature or an endpoint providing meta information about a package. For example:
+
+- For NuGet, the search request was implemented during the first MVC iteration, to support Visual Studio.
+- For npm, there is a metadata endpoint used by `npm` to get the tarball URL.
+
+For the first MVC iteration, it's recommended to stay at the project level of the [remote hierarchy](#remote-hierarchy). Other levels can be tackled with [future merge requests](#future-work).
+
+The MVC usually has two phases:
+
+- [Analysis](#analysis)
+- [Implementation](#implementation)
+
+### Keep iterations small
+
+When implementing a new package manager, it is tempting to create one large merge request containing all of the
+endpoints and services necessary to support basic usage. Instead:
+
+1. Put the API endpoints behind a [feature flag](../feature_flags/index.md).
+1. Submit each endpoint or behavior (download, upload, and so on) in a different merge request to shorten the review process.
+
+### Analysis
+
+During this phase, the idea is to collect as much information as possible about the API used by the package system. Here are some aspects that can be useful to include:
+
+- **Authentication**: What authentication mechanisms are available (OAuth, Basic
+  Authorization, or other). Keep in mind that GitLab users often want to use their
+  [Personal Access Tokens](../../user/profile/personal_access_tokens.md).
+  Although not needed for the first MVC iteration, the [CI/CD job tokens](../../ci/jobs/ci_job_token.md)
+  have to be supported at some point in the future.
+- **Requests**: Which requests are needed to have a working MVC. Ideally, produce
+  a list of all the requests needed for the MVC (including required actions). Further
+  investigation could provide an example for each request with the request and the response bodies.
+- **Upload**: Carefully analyze how the upload process works. This request is likely the most
+  complex to implement. A detailed analysis is desired here as uploads can be
+  encoded in different ways (body or multipart) and can even be in a totally different
+  format (for example, a JSON structure where the package file is a Base64 value of
+  a particular field). These different encodings lead to slightly different implementations
+  on GitLab and GitLab Workhorse. For more detailed information, review [file uploads](#file-uploads).
+- **Endpoints**: Suggest a list of endpoint URLs to implement in GitLab.
+- **Split work**: Suggest a list of changes that incrementally build the MVC.
+  This gives a good idea of how much work there is to be done. Here is an example
+  list that must be adapted on a case-by-case basis:
+  1. Empty file structure (API file, base service for this package)
+  1. Authentication system for "logging in" to the package manager
+  1. Identify metadata and create applicable tables
+  1. Workhorse route for [object storage direct upload](../uploads/index.md#direct-upload)
+  1. Endpoints required for upload/publish
+  1. Endpoints required for install/download
+  1. Endpoints required for required actions
+
+The analysis usually takes a full milestone to complete, though it's not impossible to start the implementation in the same milestone.
+
+In particular, the upload request can have some [requirements in the GitLab Workhorse project](#file-uploads). This project has a different release cycle than the Rails backend. It's **strongly** recommended that you open an issue there as soon as the upload request analysis is done. This way, GitLab Workhorse is ready when the upload request is implemented on the Rails backend.
+
+### Implementation
+
+The implementation of the different merge requests varies between package system integrations.
Contributors should take into account some important aspects of the implementation phase.
+
+#### Authentication
+
+The MVC must support [Personal Access Tokens](../../user/profile/personal_access_tokens.md) right from the start. We support two options for these tokens: OAuth and Basic Access.
+
+OAuth authentication is already supported. You can see an example in the [npm API](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/api/npm_project_packages.rb).
+
+[Basic Access authentication](https://developer.mozilla.org/en-US/docs/Web/HTTP/Authentication)
+support is done by overriding a specific function in the API helpers, like
+[this example in the Conan API](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/api/conan_packages.rb).
+For this authentication mechanism, keep in mind that some clients can send an unauthenticated
+request first, wait for the `401 Unauthorized` response with the [`WWW-Authenticate`](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/WWW-Authenticate)
+field, then send an updated (authenticated) request. This case is more involved as
+GitLab must handle the `401 Unauthorized` response. The [NuGet API](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/api/nuget_packages.rb)
+supports this case.
+
+#### Authorization
+
+Project- and group-level permissions exist for `read_package`, `create_package`, and `destroy_package`. Each
+endpoint should
+[authorize the requesting user](https://gitlab.com/gitlab-org/gitlab/-/blob/398fef1ca26ae2b2c3dc89750f6b20455a1e5507/ee/lib/api/conan_packages.rb)
+against the project or group before continuing.
+
+#### Database and handling metadata
+
+The current database model allows you to store a name and a version for each package.
+Every time you upload a new package, you can either create a new `Package` record
+or add files to an existing record. `PackageFile` should be able to store all file-related
+information like the file `name`, `size`, `sha1`, and so on.
+
+If specific data must be stored for only one package system,
+consider creating a separate metadata model. See the `packages_maven_metadata` table
+and `Packages::Maven::Metadatum` model as an example for package-specific data, and the `packages_conan_file_metadata` table
+and `Packages::Conan::FileMetadatum` model as an example for package file-specific data.
+
+If there is package-specific behavior for a given package manager, add those methods to the metadata models and
+delegate from the package model.
+
+The existing package UI only displays information from the `packages_packages` and `packages_package_files`
+tables. If the data stored in the metadata tables must be displayed, a `~frontend` change is required.
+
+#### File uploads
+
+File uploads should be handled by GitLab Workhorse using accelerated object storage uploads. This means that
+the Workhorse proxy that checks all incoming requests to GitLab intercepts the upload request,
+uploads the file, and forwards a request to the main GitLab codebase containing only the metadata
+and file location rather than the file itself. An overview of this process can be found in the
+[development documentation](../uploads/index.md#direct-upload).
+
+In terms of code, this means a route must be added to the
+[GitLab Workhorse project](https://gitlab.com/gitlab-org/gitlab-workhorse) for each upload endpoint being added
+(instance, group, project).
[This merge request](https://gitlab.com/gitlab-org/gitlab-workhorse/-/merge_requests/412/diffs)
+demonstrates adding an instance-level endpoint for Conan to Workhorse. You can also see the Maven project-level endpoint
+implemented in the same file.
+
+After the route has been added, you must add an additional `/authorize` version of the upload endpoint to your API file.
+[This example](https://gitlab.com/gitlab-org/gitlab/-/blob/398fef1ca26ae2b2c3dc89750f6b20455a1e5507/ee/lib/api/maven_packages.rb#L164)
+shows the additional endpoint added for Maven. The `/authorize` endpoint verifies and authorizes the request from Workhorse,
+then the normal upload endpoint is implemented below, consuming the metadata that Workhorse provides to
+create the package record. Workhorse provides a variety of file metadata such as type, size, and different checksum formats.
+
+For testing purposes, you may want to [enable object storage](https://gitlab.com/gitlab-org/gitlab-development-kit/blob/main/doc/howto/object_storage.md)
+in your local development environment.
+
+#### File size limits
+
+Files uploaded to the GitLab Package Registry are [limited by format](../../administration/instance_limits.md#package-registry-limits).
+On GitLab.com, these are typically set to 5 GB to help prevent timeout issues and abuse.
+
+When a new package type is added to the `Packages::Package` model, a size limit must be added
+similar to [this example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/52639/diffs#382f879fb09b0212e3cedd99e6c46e2083867216),
+or the [related test](https://gitlab.com/gitlab-org/gitlab/-/blob/fe4ba43766781371cebfacd78364a1de762917cd/spec/models/packages/package_spec.rb#L761)
+must be updated if file size limits do not apply. The only reason a size limit does not apply is if
+the package format does not upload and store package files.
+
+#### Rate Limits on GitLab.com
+
+Package manager clients can make rapid requests that exceed the
+[GitLab.com standard API rate limits](../../user/gitlab_com/index.md#gitlabcom-specific-rate-limits).
+This results in a `429 Too Many Requests` error.
+
+We have opened a set of paths to allow higher rate limits. Whenever possible,
+new package managers should follow these conventions so they can take advantage of the
+expanded package rate limit.
+
+These route prefixes guarantee a higher rate limit:
+
+```plaintext
+/api/v4/packages/
+/api/v4/projects/:project_id/packages/
+/api/v4/groups/:group_id/-/packages/
+```
+
+### MVC Checklist
+
+When adding support to GitLab for a new package manager, the first iteration must contain the
+following features. You can add the features through many merge requests as needed, but all the
+features must be implemented when the feature flag is removed.
+
+- Project-level API
+- Push event tracking
+- Pull event tracking
+- Authentication with Personal Access Tokens
+- Authentication with Job Tokens
+- Authentication with Deploy Tokens (group and project)
+- File size [limit](#file-size-limits)
+- File format guards (only accept valid file formats for the package type)
+- Name regex with validation
+- Version regex with validation
+- Workhorse route for [accelerated](../uploads/working_with_uploads.md) uploads
+- Background workers for extracting package metadata (if applicable)
+- Documentation (how to use the feature)
+- API documentation (individual endpoints with cURL examples)
+- Seeding in [`db/fixtures/development/26_packages.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/db/fixtures/development/26_packages.rb)
+- Update the [runbook](https://gitlab.com/gitlab-com/runbooks/-/blob/31fb4959e89db25fddf865bc81734c222daf32dd/dashboards/stage-groups/package.dashboard.jsonnet#L74) for the Grafana charts
+- End-to-end feature tests for (at minimum) publishing and installing a package
+
+### Future Work
+
+While working on the MVC, contributors might find features that are not mandatory for the MVC but can provide a better user experience. It's generally a good idea to keep an eye on those and open issues.
+
+Here are some examples:
+
+1. Endpoints required for search
+1. Front end updates to display additional package information and metadata
+1. Limits on file sizes
+1. Tracking for metrics
+1. Read more metadata fields from the package to make them available to the front end. For example, it's common to be able to tag a package. Those tags can be read and saved by the backend and then displayed on the packages UI.
+1. Endpoints for the upper levels of the [remote hierarchy](#remote-hierarchy). This step might require you to create a [naming convention](#naming-conventions).
+
+## Exceptions
+
+This documentation provides guidelines on how to implement a package manager to match the existing structure and logic
+already present in GitLab. While the structure is intended to be extendable and flexible enough to allow for
+any given package manager, if there is good reason to stray due to the constraints or needs of a given package
+manager, then it should be raised and discussed in the implementation issue or merge request to work towards
+the most efficient outcome.
diff --git a/doc/development/packages/settings.md b/doc/development/packages/settings.md
new file mode 100644
index 00000000000..37961c0504c
--- /dev/null
+++ b/doc/development/packages/settings.md
@@ -0,0 +1,82 @@
+---
+stage: Package
+group: Package
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Package Settings
+
+This page includes an exhaustive list of settings related to and maintained by the package stage.
+
+## Instance Settings
+
+### Package Registry
+
+Setting | Table | Description
+------- | ----- | -----------
+`npm_package_requests_forwarding` | `application_settings` | Enables or disables npm package forwarding at the instance level.
+`pypi_package_requests_forwarding` | `application_settings` | Enables or disables PyPI package forwarding at the instance level.
+`packages_cleanup_package_file_worker_capacity` | `application_settings` | Number of concurrent workers allowed for package file cleanup.
+`throttle_unauthenticated_packages_api_requests_per_period` | `application_settings` | Request limit for unauthenticated package API requests in the period defined by `throttle_unauthenticated_packages_api_period_in_seconds`.
+`throttle_unauthenticated_packages_api_period_in_seconds` | `application_settings` | Period in seconds to measure unauthenticated package API requests.
+`throttle_authenticated_packages_api_requests_per_period` | `application_settings` | Request limit for authenticated package API requests in the period defined by `throttle_authenticated_packages_api_period_in_seconds`.
+`throttle_authenticated_packages_api_period_in_seconds` | `application_settings` | Period in seconds to measure authenticated package API requests.
+`throttle_unauthenticated_packages_api_enabled` | `application_settings` | Enables or disables request limits/throttling for unauthenticated requests to the package API.
+`throttle_authenticated_packages_api_enabled` | `application_settings` | Enables or disables request limits/throttling for authenticated requests to the package API.
+`conan_max_file_size` | `plan_limits` | Maximum file size for a Conan package file.
+`maven_max_file_size` | `plan_limits` | Maximum file size for a Maven package file.
+`npm_max_file_size` | `plan_limits` | Maximum file size for an npm package file.
+`nuget_max_file_size` | `plan_limits` | Maximum file size for a NuGet package file.
+`pypi_max_file_size` | `plan_limits` | Maximum file size for a PyPI package file.
+`generic_packages_max_file_size` | `plan_limits` | Maximum file size for a generic package file.
+`golang_max_file_size` | `plan_limits` | Maximum file size for a GoProxy package file.
+`debian_max_file_size` | `plan_limits` | Maximum file size for a Debian package file.
+`rubygems_max_file_size` | `plan_limits` | Maximum file size for a RubyGems package file.
+`terraform_module_max_file_size` | `plan_limits` | Maximum file size for a Terraform module package file.
+`helm_max_file_size` | `plan_limits` | Maximum file size for a Helm package file.
+
+### Container Registry
+
+Setting | Table | Description
+------- | ----- | -----------
+`container_registry_token_expire_delay` | `application_settings` | The time in minutes before the container registry auth token (JWT) expires.
+`container_expiration_policies_enable_historic_entries` | `application_settings` | Allow or prevent projects created before GitLab 12.8 from using container cleanup policies.
+`container_registry_vendor` | `application_settings` | The vendor of the container registry: `gitlab` for the GitLab container registry, other values for external registries.
+`container_registry_version` | `application_settings` | The current version of the container registry.
+`container_registry_features` | `application_settings` | Features supported by the connected container registry. For example, tag deletion.
+`container_registry_delete_tags_service_timeout` | `application_settings` | The maximum time (in seconds) that the cleanup process can take to delete a batch of tags.
+`container_registry_expiration_policies_worker_capacity` | `application_settings` | Number of concurrent container image cleanup policy workers allowed.
+`container_registry_cleanup_tags_service_max_list_size` | `application_settings` | The maximum number of tags that can be deleted in a single execution of a cleanup policy. Additional tags must be deleted in another execution.
+`container_registry_expiration_policies_caching` | `application_settings` | Enable or disable tag creation timestamp caching during execution of cleanup policies.
+`container_registry_import_max_tags_count` | `application_settings` | The maximum number of tags that can be migrated.
+`container_registry_import_max_retries` | `application_settings` | The maximum number of retries for an aborted migration.
+`container_registry_import_start_max_retries` | `application_settings` | The maximum number of requests sent to the Container Registry API to start an import step.
+`container_registry_import_max_step_duration` | `application_settings` | The maximum number of seconds before an ongoing migration is considered stale.
+`container_registry_import_target_plan` | `application_settings` | The target subscription plan from which container repositories are selected for migration.
+`container_registry_import_created_before` | `application_settings` | Only image repositories created before this timestamp are eligible for the migration.
+`container_registry_pre_import_timeout` | `application_settings` | The timeout for long-running `pre_imports` before they are canceled by the `GuardWorker`.
+`container_registry_import_timeout` | `application_settings` | The timeout for long-running imports before they are canceled by the `GuardWorker`.
+`dependency_proxy_ttl_group_policy_worker_capacity` | `application_settings` | Number of concurrent dependency proxy cleanup policy workers allowed.
+
+## Namespace/Group Settings
+
+Setting | Table | Description
+------- | ----- | -----------
+`maven_duplicates_allowed` | `namespace_package_settings` | Allow or prevent duplicate Maven packages.
+`maven_duplicate_exception_regex` | `namespace_package_settings` | Regex defining Maven packages that are allowed to be duplicated when duplicates are not allowed. This matches the name and version of the package.
+`generic_duplicates_allowed` | `namespace_package_settings` | Allow or prevent duplicate generic packages.
+`generic_duplicate_exception_regex` | `namespace_package_settings` | Regex defining generic packages that are allowed to be duplicated when duplicates are not allowed.
+Dependency Proxy Cleanup Policies - `ttl` | `dependency_proxy_image_ttl_group_policies` | Number of days to retain an unused Dependency Proxy file before it is removed.
+Dependency Proxy - `enabled` | `dependency_proxy_image_ttl_group_policies` | Enable or disable the Dependency Proxy cleanup policy.
+
+## Project Settings
+
+Setting | Table | Description
+------- | ----- | -----------
+Container Cleanup Policies - `next_run_at` | `container_expiration_policies` | When the project next qualifies for the container cleanup policy cron worker.
+Container Cleanup Policies - `name_regex` | `container_expiration_policies` | Regex defining image names to remove with the container cleanup policy.
+Container Cleanup Policies - `cadence` | `container_expiration_policies` | How often the container cleanup policy should run.
+Container Cleanup Policies - `older_than` | `container_expiration_policies` | Age of images to remove with the container cleanup policy.
+Container Cleanup Policies - `keep_n` | `container_expiration_policies` | Number of images to retain in a container cleanup policy.
+Container Cleanup Policies - `enabled` | `container_expiration_policies` | Enable or disable a container cleanup policy.
+Container Cleanup Policies - `name_regex_keep` | `container_expiration_policies` | Regex defining image names to always keep, regardless of other rules, with the container cleanup policy.
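+
+Most of these instance-level settings are columns on the `application_settings` or `plan_limits` tables, so you can inspect or adjust them from the Rails console. A minimal sketch, assuming a development instance (the setting names come from the tables above; exact console calls may vary by version):
+
+```ruby
+# Read an instance-level package setting.
+Gitlab::CurrentSettings.npm_package_requests_forwarding
+
+# Update a setting, for example the package file cleanup worker capacity.
+ApplicationSetting.current.update!(packages_cleanup_package_file_worker_capacity: 4)
+
+# File size limits live on plan_limits and are read per plan.
+Plan.default.actual_limits.conan_max_file_size
+```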
diff --git a/doc/development/packages/structure.md b/doc/development/packages/structure.md new file mode 100644 index 00000000000..a2716232b11 --- /dev/null +++ b/doc/development/packages/structure.md @@ -0,0 +1,79 @@ +--- +stage: Package +group: Package +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Package Structure + +## Package Registry + +```mermaid +erDiagram + projects }|--|| namespaces : "" + packages_package_files }o--|| packages_packages : "" + packages_package_file_build_infos }o--|| packages_package_files : "" + packages_build_infos }o--|| packages_packages : "" + packages_tags }o--|| packages_packages : "" + packages_packages }|--|| projects : "" + packages_maven_metadata |o--|| packages_packages : "" + packages_nuget_metadata |o--|| packages_packages : "" + packages_composer_metadata |o--|| packages_packages : "" + packages_conan_metadata |o--|| packages_packages : "" + packages_pypi_metadata |o--|| packages_packages : "" + packages_npm_metadata |o--|| packages_packages : "" + package_conan_file_metadatum |o--|| packages_package_files : "" + package_helm_file_metadatum |o--|| packages_package_files : "" + packages_nuget_dependency_link_metadata |o--|| packages_dependency_links: "" + packages_dependencies ||--o| packages_dependency_links: "" + packages_packages ||--o{ packages_dependency_links: "" + namespace_package_settings |o--|| namespaces: "" +``` + +### Debian packages + +Debian contains a higher number of dedicated tables, so it is displayed here separately: + +```mermaid +erDiagram + projects }|--|| namespaces : "" + packages_packages }|--|| projects : "" + packages_package_files }o--|| packages_packages : "" + package_debian_file_metadatum |o--|| packages_package_files : "" + packages_debian_group_architectures }|--|| packages_debian_group_distributions : "" + packages_debian_group_component_files }|--|| packages_debian_group_components : "" + packages_debian_group_component_files }|--|| packages_debian_group_architectures : "" + packages_debian_group_components }|--|| packages_debian_group_distributions : "" + packages_debian_group_distribution_keys }|--|| packages_debian_group_distributions : "" + packages_debian_group_distributions }o--|| namespaces : "" + packages_debian_project_architectures }|--|| packages_debian_project_distributions : "" + packages_debian_project_component_files }|--|| packages_debian_project_components : "" + packages_debian_project_component_files }|--|| packages_debian_project_architectures : "" + packages_debian_project_components }|--|| packages_debian_project_distributions : "" + packages_debian_project_distribution_keys }|--|| packages_debian_project_distributions : "" + packages_debian_project_distributions }o--|| projects : "" + packages_debian_publications }|--|| packages_debian_project_distributions : "" + packages_debian_publications |o--|| packages_packages : "" + packages_debian_project_distributions |o--|| packages_packages : "" + packages_debian_group_distributions |o--|| namespaces : "" + packages_debian_file_metadata |o--|| packages_package_files : "" +``` + +## Container Registry + +```mermaid +erDiagram + projects }|--|| namespaces : "" + container_repositories }|--|| projects : "" + container_expiration_policy |o--|| projects : "" +``` + +## Dependency Proxy + +```mermaid +erDiagram + dependency_proxy_blobs }o--|| namespaces : "" + dependency_proxy_manifests }o--|| namespaces : "" + 
dependency_proxy_image_ttl_group_policies |o--|| namespaces : ""
+    dependency_proxy_group_settings |o--|| namespaces : ""
+```
diff --git a/doc/development/pages/index.md b/doc/development/pages/index.md
new file mode 100644
index 00000000000..02019db48ba
--- /dev/null
+++ b/doc/development/pages/index.md
@@ -0,0 +1,238 @@
+---
+type: reference, dev
+stage: Create
+group: Editor
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+description: "GitLab's development guidelines for GitLab Pages"
+---
+
+# Getting started with development
+
+## Configuring GitLab Pages hostname
+
+GitLab Pages needs a hostname or domain, as each different GitLab Pages site is accessed via a
+subdomain. The GitLab Pages hostname can be set in different ways:
+
+- [Without wildcard, editing your hosts file](#without-wildcard-editing-your-hosts-file).
+- [With DNS wildcard alternatives](#with-dns-wildcard-alternatives).
+
+### Without wildcard, editing your hosts file
+
+Because `/etc/hosts` doesn't support wildcard hostnames, you must configure one entry
+for GitLab Pages, and then one entry for each page site:
+
+  ```plaintext
+  127.0.0.1 gdk.test           # If you're using GDK
+  127.0.0.1 pages.gdk.test     # Pages host
+  # Any namespace/group/user needs to be added
+  # as a subdomain to the pages host. This is because
+  # /etc/hosts doesn't accept wildcards
+  127.0.0.1 root.pages.gdk.test # for the root pages
+  ```
+
+### With DNS wildcard alternatives
+
+If you'd prefer to use a DNS wildcard instead of editing your `/etc/hosts`, you can use:
+
+- [`nip.io`](https://nip.io)
+- [`dnsmasq`](https://wiki.debian.org/dnsmasq)
+
+## Configuring GitLab Pages without GDK
+
+Create a `gitlab-pages.conf` in the root of the GitLab Pages site, like:
+
+```toml
+# Default port is 3010, but you can use any other
+listen-http=:3010
+
+# Your local GitLab Pages domain
+pages-domain=pages.gdk.test
+
+# Directory where the pages are stored
+pages-root=shared/pages
+
+# Show more information in the logs
+log-verbose=true
+```
+
+To see more options, you can check
+[`internal/config/flags.go`](https://gitlab.com/gitlab-org/gitlab-pages/blob/master/internal/config/flags.go)
+or run `gitlab-pages --help`.
+
+### Running GitLab Pages manually
+
+For any changes in the code, you must run `make` to build the app. It's best to just always run
+it before you start the app. It's quick to build so don't worry!
+
+```shell
+make && ./gitlab-pages -config=gitlab-pages.conf
+```
+
+## Configuring GitLab Pages with GDK
+
+In the following steps, `$GDK_ROOT` is the directory where you cloned GDK.
+
+1. Set up the [GDK hostname](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/local_network.md).
+1. 
Add a [GitLab Pages hostname](#configuring-gitlab-pages-hostname) to the `gdk.yml`:
+
+   ```yaml
+   gitlab_pages:
+     enabled: true        # enable GitLab Pages to be managed by GDK
+     port: 3010           # default port is 3010
+     host: pages.gdk.test # the GitLab Pages domain
+     auto_update: true    # whether GDK should keep GitLab Pages up to date
+     verbose: true        # show more information in the logs
+   ```
+
+### Running GitLab Pages with GDK
+
+After these configurations are set, GDK manages a GitLab Pages process, giving you access to
+it with commands like:
+
+- Start: `gdk start gitlab-pages`
+- Stop: `gdk stop gitlab-pages`
+- Restart: `gdk restart gitlab-pages`
+- Tail logs: `gdk tail gitlab-pages`
+
+### Running GitLab Pages manually
+
+You can also build and start the app independently of GDK process management.
+
+For any changes in the code, you must run `make` to build the app. It's best to just always run
+it before you start the app. It's quick to build so don't worry!
+
+```shell
+make && ./gitlab-pages -config=gitlab-pages.conf
+```
+
+#### Building GitLab Pages in FIPS mode
+
+```shell
+FIPS_MODE=1 make && ./gitlab-pages -config=gitlab-pages.conf
+```
+
+### Creating a GitLab Pages site
+
+To build a GitLab Pages site locally, you must
+[configure `gitlab-runner`](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/howto/runner.md).
+
+Check the [user manual](../../user/project/pages/index.md).
+
+### Enabling access control
+
+GitLab Pages supports private sites. Private sites can be accessed only by users
+who have access to your GitLab project.
+
+GitLab Pages access control is disabled by default. To enable it:
+
+1. Enable the GitLab Pages access control in GitLab itself, which can be done by either:
+   - If you're not using GDK, editing `gitlab.yml`:
+
+     ```yaml
+     # gitlab/config/gitlab.yml
+     pages:
+       access_control: true
+     ```
+
+   - Editing `gdk.yml` if you're using GDK:
+
+     ```yaml
+     # $GDK_ROOT/gdk.yml
+     gitlab_pages:
+       enabled: true
+       access_control: true
+     ```
+
+1. Restart GitLab (if running through the GDK, run `gdk restart`). Running
+   `gdk reconfigure` overwrites the value of `access_control` in `config/gitlab.yml`.
+1. In your local GitLab instance, go to `http://gdk.test:3000/admin/applications` in the browser.
+1. Create an [Instance-wide OAuth application](../../integration/oauth_provider.md#instance-wide-applications)
+   with the `api` scope.
+1. Set the value of your `redirect-uri` to the `pages-domain` authorization endpoint:
+   - `http://pages.gdk.test:3010/auth`, for example.
+   - The `redirect-uri` must not contain any GitLab Pages site domain.
+1. Add the auth client configuration:
+
+   - With GDK, in `gdk.yml`:
+
+     ```yaml
+     gitlab_pages:
+       enabled: true
+       access_control: true
+       auth_client_id: $CLIENT_ID         # the OAuth application ID created in http://gdk.test:3000/admin/applications
+       auth_client_secret: $CLIENT_SECRET # the OAuth application secret created in http://gdk.test:3000/admin/applications
+     ```
+
+     GDK generates a random `auth_secret` and builds the `auth_redirect_uri` based on the GitLab Pages
+     host configuration.
+
+   - Without GDK, in `gitlab-pages.conf`:
+
+     ```conf
+     ## the following are only needed if you want to test auth for private projects
+     auth-client-id=$CLIENT_ID                         # the OAuth application ID created in http://gdk.test:3000/admin/applications
+     auth-client-secret=$CLIENT_SECRET                 # the OAuth application secret created in http://gdk.test:3000/admin/applications
+     auth-secret=$SOME_RANDOM_STRING                   # should be at least 32 bytes long
+     auth-redirect-uri=http://pages.gdk.test:3010/auth # the authentication callback URL for GitLab Pages
+     ```
+
+1. If running Pages inside the GDK, you can use GDK's `protected_config_files` section under `gdk` in
+   your `gdk.yml` to prevent your `gitlab-pages.conf` configuration from being rewritten:
+
+   ```yaml
+   gdk:
+     protected_config_files:
+       - 'gitlab-pages/gitlab-pages.conf'
+   ```
+
+### Enabling object storage
+
+GitLab Pages supports using object storage for storing artifacts, but object storage
+is disabled by default. You can enable it in the GDK:
+
+1. Edit `gdk.yml` to enable object storage in GitLab itself:
+
+   ```yaml
+   # $GDK_ROOT/gdk.yml
+   object_store:
+     enabled: true
+   ```
+
+1. Reconfigure and restart GitLab by running the commands `gdk reconfigure` and `gdk restart`.
+
+For more information, refer to the [GDK documentation](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/main/doc/configuration.md#object-storage-configuration).
+
+## Linting
+
+```shell
+# Run the linter locally
+make lint
+
+# Run linter and fix issues (if supported by the linter)
+make format
+```
+
+## Testing
+
+To run tests, you can use these commands:
+
+```shell
+# Run all of the tests in the codebase
+make test
+
+# Run a specific test file
+go test ./internal/serving/disk/
+
+# Run a specific test in a file
+go test ./internal/serving/disk/ -run TestDisk_ServeFileHTTP
+
+# Run all unit tests except acceptance_test.go
+go test ./... -short
+
+# Run acceptance_test.go only
+make acceptance
+# Run specific acceptance tests
+# We add `make` here because acceptance tests use the last binary that was compiled,
+# so we want to have the latest changes in the build that is tested
+make && go test ./ -run TestRedirect
+```
diff --git a/doc/development/permissions.md b/doc/development/permissions.md
index f3818e92fec..ed95456c4f9 100644
--- a/doc/development/permissions.md
+++ b/doc/development/permissions.md
@@ -95,11 +95,9 @@ can still view the groups and their entities (like epics).
Project membership (where the group membership is already taken into account)
is stored in the `project_authorizations` table.
-WARNING:
-Due to [an issue](https://gitlab.com/gitlab-org/gitlab/-/issues/219299),
-projects in personal namespace do not show owner (`50`) permission in
-`project_authorizations` table. Note however that [`user.owned_projects`](https://gitlab.com/gitlab-org/gitlab/-/blob/0d63823b122b11abd2492bca47cc26858eee713d/app/models/user.rb#L906-916)
-is calculated properly.
+NOTE:
+In [GitLab 14.9](https://gitlab.com/gitlab-org/gitlab/-/issues/351211) and later, projects in personal namespaces have a maximum role of Owner.
+Because of a [known issue](https://gitlab.com/gitlab-org/gitlab/-/issues/219299) in GitLab 14.8 and earlier, projects in personal namespaces have a maximum role of Maintainer.
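+
+To see what these authorization rows look like for a given user, you can query the table from the Rails console. A minimal sketch, assuming a local development instance (`ProjectAuthorization` is the model backing `project_authorizations`):
+
+```ruby
+user = User.find_by(username: 'root')
+
+# Returns [project_id, access_level] pairs for every project the user can access.
+# An access level of 40 corresponds to Maintainer, and 50 to Owner.
+ProjectAuthorization.where(user_id: user.id).pluck(:project_id, :access_level)
+```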
### Confidential issues diff --git a/doc/development/pipelines.md b/doc/development/pipelines.md index b70f07ea7d9..436977a7f38 100644 --- a/doc/development/pipelines.md +++ b/doc/development/pipelines.md @@ -37,7 +37,7 @@ flowchart LR subgraph backend be["Backend code"]--tested with-->rspec end - + be--generates-->fixtures["frontend fixtures"] fixtures--used in-->jest ``` @@ -67,9 +67,9 @@ In the `detect-tests` job, we use this mapping to identify the minimal tests nee In addition, there are a few circumstances where we would always run the full RSpec tests: - when the `pipeline:run-all-rspec` label is set on the merge request -- when the merge request is created by an automation (e.g. Gitaly update or MR targeting a stable branch) +- when the merge request is created by an automation (for example, Gitaly update or MR targeting a stable branch) - when the merge request is created in a security mirror -- when any CI configuration file is changed (i.e. `.gitlab-ci.yml` or `.gitlab/ci/**/*`) +- when any CI configuration file is changed (for example, `.gitlab-ci.yml` or `.gitlab/ci/**/*`) ### Jest minimal jobs @@ -83,11 +83,11 @@ In this mode, `jest` would resolve all the dependencies of related to the change In addition, there are a few circumstances where we would always run the full Jest tests: - when the `pipeline:run-all-jest` label is set on the merge request -- when the merge request is created by an automation (e.g. Gitaly update or MR targeting a stable branch) +- when the merge request is created by an automation (for example, Gitaly update or MR targeting a stable branch) - when the merge request is created in a security mirror -- when any CI configuration file is changed (i.e. `.gitlab-ci.yml` or `.gitlab/ci/**/*`) -- when any frontend "core" file is changed (i.e. `package.json`, `yarn.lock`, `babel.config.js`, `jest.config.*.js`, `config/helpers/**/*.js`) -- when any vendored JavaScript file is changed (i.e. `vendor/assets/javascripts/**/*`) +- when any CI configuration file is changed (for example, `.gitlab-ci.yml` or `.gitlab/ci/**/*`) +- when any frontend "core" file is changed (for example, `package.json`, `yarn.lock`, `babel.config.js`, `jest.config.*.js`, `config/helpers/**/*.js`) +- when any vendored JavaScript file is changed (for example, `vendor/assets/javascripts/**/*`) - when any backend file is changed ([see the patterns list for details](https://gitlab.com/gitlab-org/gitlab/-/blob/3616946936c1adbd9e754c1bd06f86ba670796d8/.gitlab/ci/rules.gitlab-ci.yml#L205-216)) ### Fork pipelines @@ -97,6 +97,18 @@ label is set on the MR. The goal is to reduce the CI/CD minutes consumed by fork See the [experiment issue](https://gitlab.com/gitlab-org/quality/team-tasks/-/issues/1170). +## Faster feedback when reverting merge requests + +When you need to revert a merge request, to get accelerated feedback, you can add the `~pipeline:revert` label to your merge request. + +When this label is assigned, the following steps of the CI/CD pipeline are skipped: + +- The `package-and-qa` job. +- The `rspec:undercoverage` job. +- The entire [Review Apps process](testing_guide/review_apps.md). + +Apply the label to the merge request, and run a new pipeline for the MR. + ## Fail-fast job in merge request pipelines To provide faster feedback when a merge request breaks existing tests, we are experimenting with a @@ -226,7 +238,7 @@ of `gitlab-org/gitlab-foss`. 
These jobs are only created in the following cases:

- when the `pipeline:run-as-if-foss` label is set on the merge request
- when the merge request is created in the `gitlab-org/security/gitlab` project
-- when any CI configuration file is changed (i.e. `.gitlab-ci.yml` or `.gitlab/ci/**/*`)
+- when any CI configuration file is changed (for example, `.gitlab-ci.yml` or `.gitlab/ci/**/*`)

The `* as-if-foss` jobs are run in addition to the regular EE-context jobs. They have the
`FOSS_ONLY='1'` variable set and get the `ee/` folder removed before the tests start running.
@@ -277,15 +289,23 @@
In the event of an emergency, or false positive from this job, add the
`pipeline:skip-undercoverage` label to the merge request to allow this job to fail.

-You can disable the `undercover` code coverage check by wrapping the desired block of code in `# :nocov:` lines:
+### Troubleshooting `rspec:undercoverage` failures

-```ruby
-# :nocov:
-def some_method
-  # code coverage for this method will be skipped
-end
-# :nocov:
-```
+The `rspec:undercoverage` job has [known bugs](https://gitlab.com/groups/gitlab-org/-/epics/8254)
+that can cause false positive failures. You can test coverage locally to determine if it's
+safe to apply `~"pipeline:skip-undercoverage"`. For example, using `<spec>` as the name of the
+test causing the failure:
+
+1. Run `SIMPLECOV=1 bundle exec rspec <spec>`.
+1. Run `scripts/undercoverage`.
+
+If these commands return `undercover: ✅ No coverage is missing in latest changes`, then you can apply `~"pipeline:skip-undercoverage"` to bypass pipeline failures.
+
+## Ruby versions testing
+
+Our test suite runs against Ruby 2 in merge requests and default branch pipelines.
+
+We also run our test suite against Ruby 3 on scheduled pipelines every two hours, because GitLab.com will soon run on Ruby 3.

## PostgreSQL versions testing

@@ -339,7 +359,7 @@ In general, pipelines for an MR fall into one of the following types (from shorter to longer running time):

- [Frontend pipeline](#frontend-pipeline): For MRs that touch frontend code.
- [End-to-end pipeline](#end-to-end-pipeline): For MRs that touch code in the `qa/` folder.

-A "pipeline type" is an abstract term that mostly describes the "critical path" (i.e. the chain of jobs for which the sum
+A "pipeline type" is an abstract term that mostly describes the "critical path" (that is, the chain of jobs for which the sum
of individual duration equals the pipeline's duration). We use these "pipeline types" in
[metrics dashboards](https://app.periscopedata.com/app/gitlab/858266/GitLab-Pipeline-Durations)
in order to detect what types and jobs need to be optimized first.
@@ -719,11 +739,11 @@ This job tries to download a generic package that contains GitLab Workhorse binaries.

We also changed the `setup-test-env` job to:

1. First download the GitLab Workhorse generic package built and uploaded by `build-components`.
-1. If the package is retrieved successfully, its content is placed in the right folder (i.e. `tmp/tests/gitlab-workhorse`), preventing the building of the binaries when `scripts/setup-test-env` is run later on.
+1. If the package is retrieved successfully, its content is placed in the right folder (that is, `tmp/tests/gitlab-workhorse`), preventing the building of the binaries when `scripts/setup-test-env` is run later on.
1. If the package URL returns a 404, the behavior doesn't change compared to the current one: the GitLab Workhorse binaries are built as part of `scripts/setup-test-env`.

NOTE:
-The version of the package is the workhorse tree SHA (i.e. 
`git rev-parse HEAD:workhorse`). +The version of the package is the workhorse tree SHA (for example, `git rev-parse HEAD:workhorse`). ### Pre-clone step diff --git a/doc/development/product_qualified_lead_guide/index.md b/doc/development/product_qualified_lead_guide/index.md index dcd8b33e5c5..25634876aef 100644 --- a/doc/development/product_qualified_lead_guide/index.md +++ b/doc/development/product_qualified_lead_guide/index.md @@ -21,7 +21,7 @@ A hand-raise PQL is a user who requests to speak to sales from within the produc 1. Enter the credentials on CustomersDot development to Platypus in your `/config/secrets.yml` and restart. Credentials for the Platypus Staging are in the 1Password Growth vault. The URL for staging is `https://staging.ci.nexus.gitlabenvironment.cloud`. ```yaml - platypus_url: "<%= ENV['PLATYPUS_URL'] %>" + platypus_url: "<%= ENV['PLATYPUS_URL'] %>" platypus_client_id: "<%= ENV['PLATYPUS_CLIENT_ID'] %>" platypus_client_secret: "<%= ENV['PLATYPUS_CLIENT_SECRET'] %>" ``` @@ -42,7 +42,7 @@ A hand-raise PQL is a user who requests to speak to sales from within the produc - Check the application and Sidekiq logs on `gitlab.com` and CustomersDot to monitor leads. - Check the `leads` table in CustomersDot. -- Set up staging credentials for Platypus, and track the leads on the [Platypus Dashboard](https://staging.ci.nexus.gitlabenvironment.cloud/admin/queues/queue/new-lead-queue). +- Set up staging credentials for Platypus, and track the leads on the Platypus Dashboard: `https://staging.ci.nexus.gitlabenvironment.cloud/admin/queues/queue/new-lead-queue`. - Ask for access to the Marketo Sandbox and validate the leads there, [to this example request](https://gitlab.com/gitlab-com/team-member-epics/access-requests/-/issues/13162). ## Embed a hand-raise lead form diff --git a/doc/development/project_templates.md b/doc/development/project_templates.md index 74ded9c93fc..f688d54ad4f 100644 --- a/doc/development/project_templates.md +++ b/doc/development/project_templates.md @@ -4,59 +4,56 @@ group: Workspace info: "To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments" --- -# Contribute to GitLab project templates +# Contribute a built-in project template -Thanks for considering a contribution to the GitLab -[built-in project templates](../user/project/working_with_projects.md#create-a-project-from-a-built-in-template). +This page provides instructions about how to contribute a +[built-in project template](../user/project/working_with_projects.md#create-a-project-from-a-built-in-template). + +To contribute a built-in project template, you must complete the following tasks: + +1. [Create a project template for GitLab review](#create-a-project-template-for-review) +1. [Add the template SVG icon to GitLab SVGs](#add-the-template-svg-icon-to-gitlab-svgs) +1. [Create a merge request with vendor details](#create-a-merge-request-with-vendor-details) + +You can contribute the following types of project templates: + +- Enterprise: For users with GitLab Premium and above. +- Non-enterprise: For users with GitLab Free and above. 
## Prerequisites -To add a new or update an existing template, you must have the following tools +To add or update an existing template, you must have the following tools installed: - `wget` - `tar` -- `jq` - -## Create a new project -To contribute a new built-in project template to be distributed with GitLab: +## Create a project template for review -1. Create a new public project with the project content you'd like to contribute - in a namespace of your choosing. You can [view a working example](https://gitlab.com/gitlab-org/project-templates/dotnetcore). - Projects should be as simple as possible and free of any unnecessary assets or dependencies. -1. When the project is ready for review, [create a new issue](https://gitlab.com/gitlab-org/gitlab/issues) with a link to your project. - In your issue, `@` mention the relevant Backend Engineering Manager and Product - Manager for the [Templates feature](https://about.gitlab.com/handbook/product/categories/#source-code-group). +1. In your selected namespace, create a public project. +1. Add the project content you want to use in the template. Do not include unnecessary assets or dependencies. For an example, +[see this project](https://gitlab.com/gitlab-org/project-templates/dotnetcore). +1. When the project is ready for review, [create an issue](https://gitlab.com/gitlab-org/gitlab/issues) with a link to your project. + In your issue, mention the relevant [Backend Engineering Manager and Product Manager](https://about.gitlab.com/handbook/product/categories/#source-code-group) + for the Templates feature. -## Add the SVG icon to GitLab SVGs +## Add the template SVG icon to GitLab SVGs -If the template you're adding has an SVG icon, you need to first add it to -<https://gitlab.com/gitlab-org/gitlab-svgs>: +If the project template has an SVG icon, you must add it to the +[GitLab SVGs project](https://gitlab.com/gitlab-org/gitlab-svgs/-/blob/main/README.md#adding-icons-or-illustrations) +before you can create a merge request with vendor details. -1. Follow the steps outlined in the - [GitLab SVGs project](https://gitlab.com/gitlab-org/gitlab-svgs/-/blob/main/README.md#adding-icons-or-illustrations) - and submit a merge request. -1. When the merge request is merged, `gitlab-bot` will pull the new changes in - the `gitlab-org/gitlab` project. -1. You can now continue on the vendoring process. +## Create a merge request with vendor details -## Vendoring process - -To make the project template available when creating a new project, the vendoring -process will have to be completed: +Before GitLab can implement the project template, you must [create a merge request](../user/project/merge_requests/creating_merge_requests.md) in [`gitlab-org/gitlab`](https://gitlab.com/gitlab-org/gitlab) that includes vendor details about the project. 1. [Export the project](../user/project/settings/import_export.md#export-a-project-and-its-data) - you created in the previous step and save the file as `<name>.tar.gz`, where - `<name>` is the short name of the project. -1. Edit the following files to include the project template. Two types of built-in - templates are available within GitLab: - - **Normal templates**: Available in GitLab Free and above (this is the most common type of built-in template). - See MR [!25318](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/25318) for an example. - - To add a normal template: - - 1. 
Open `lib/gitlab/project_template.rb` and add details of the template + and save the file as `<name>.tar.gz`, where `<name>` is the short name of the project. + Move this file to the root directory of `gitlab-org/gitlab`. +1. In `gitlab-org/gitlab`, create and checkout a new branch. +1. Edit the following files to include the project template: + - For **non-Enterprise** project templates: + - In `lib/gitlab/project_template.rb`, add details about the template in the `localized_templates_table` method. In the following example, the short name of the project is `hugo`: @@ -64,11 +61,11 @@ process will have to be completed: ProjectTemplate.new('hugo', 'Pages/Hugo', _('Everything you need to create a GitLab Pages site using Hugo'), 'https://gitlab.com/pages/hugo', 'illustrations/logos/hugo.svg'), ``` - If the vendored project doesn't have an SVG icon, omit `, 'illustrations/logos/hugo.svg'`. + If the project doesn't have an SVG icon, exclude `, 'illustrations/logos/hugo.svg'`. - 1. Open `spec/lib/gitlab/project_template_spec.rb` and add the short name - of the template in the `.all` test. - 1. Open `app/assets/javascripts/projects/default_project_templates.js` and + - In `spec/support/helpers/project_template_test_helper.rb`, append the short name + of the template in the `all_templates` method. + - In `app/assets/javascripts/projects/default_project_templates.js`, add details of the template. For example: ```javascript @@ -78,25 +75,19 @@ process will have to be completed: }, ``` - If the vendored project doesn't have an SVG icon, use `.icon-gitlab_logo` + If the project doesn't have an SVG icon, use `.icon-gitlab_logo` instead. - - - **Enterprise templates**: Introduced in GitLab 12.10, that are available only in GitLab Premium and above. - See MR [!28187](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/28187) for an example. - - To add an Enterprise template: - - 1. Open `ee/lib/ee/gitlab/project_template.rb` and add details of the template - in the `localized_ee_templates_table` method. For example: + - For **Enterprise** project templates: + - In `ee/lib/ee/gitlab/project_template.rb`, in the `localized_ee_templates_table` method, add details about the template. For example: ```ruby ::Gitlab::ProjectTemplate.new('hipaa_audit_protocol', 'HIPAA Audit Protocol', _('A project containing issues for each audit inquiry in the HIPAA Audit Protocol published by the U.S. Department of Health & Human Services'), 'https://gitlab.com/gitlab-org/project-templates/hipaa-audit-protocol', 'illustrations/logos/asklepian.svg') ``` - 1. Open `ee/spec/lib/gitlab/project_template_spec.rb` and add the short name + - In `ee/spec/lib/gitlab/project_template_spec.rb`, add the short name of the template in the `.all` test. - 1. Open `ee/app/assets/javascripts/projects/default_project_templates.js` and - add details of the template. For example: + - In `ee/app/assets/javascripts/projects/default_project_templates.js`, + add the template details. For example: ```javascript hipaa_audit_protocol: { @@ -105,10 +96,11 @@ process will have to be completed: }, ``` -1. Run the `vendor_template` script. Make sure to pass the correct arguments: +1. Run the following Rake task, where `<path>/<name>` is the + name you gave the template in `lib/gitlab/project_template.rb`: ```shell - scripts/vendor_template <git_repo_url> <name> <comment> + bin/rake gitlab:update_project_templates\[<path>/<name>\] ``` 1. Regenerate `gitlab.pot`: @@ -117,41 +109,24 @@ process will have to be completed: bin/rake gettext:regenerate ``` -1. 
By now, there should be one new file under `vendor/project_templates/` and - 4 changed files. Commit all of them in a new branch and create a merge - request. +1. After you run the scripts, there is one new file in `vendor/project_templates/` and four changed files. Commit all changes and push your branch to update the merge request. For an example, see this [merge request](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/25318). -## Test with GDK +## Test your built-in project with the GitLab Development Kit -If you are using the GitLab Development Kit (GDK) you must disable `praefect` -and regenerate the Procfile, as the Rake task is not currently compatible with it: +Complete the following steps to test the project template in your own GitLab Development Kit instance: -```yaml -# gitlab-development-kit/gdk.yml -praefect: - enabled: false -``` - -1. Follow the steps described in the [vendoring process](#vendoring-process). -1. Run the following Rake task where `<path>/<name>` is the +1. Run the following Rake task, where `<path>/<name>` is the name you gave the template in `lib/gitlab/project_template.rb`: ```shell - bin/rake gitlab:update_project_templates[<path>/<name>] + bin/rake gitlab:update_project_templates\[<path>/<name>\] ``` -You can now test to create a new project by importing the new template in GDK. - ## Contribute an improvement to an existing template -Existing templates are imported from the following groups: - -- [`project-templates`](https://gitlab.com/gitlab-org/project-templates) -- [`pages`](htps://gitlab.com/pages) - -To contribute a change, open a merge request in the relevant project -and mention `@gitlab-org/manage/import/backend` when you are ready for a review. +To update an existing built-in project template: -Then, if your merge request gets accepted, either [open an issue](https://gitlab.com/gitlab-org/gitlab/-/issues) -to ask for it to get updated, or open a merge request updating -the [vendored template](#vendoring-process). +1. Create a merge request in the relevant project of the `project-templates` and `pages` group and mention `@gitlab-org/manage/import/backend` when you are ready for a review. +1. If your merge request is accepted, either: + - [Create an issue](https://gitlab.com/gitlab-org/gitlab/-/issues) to ask for the template to get updated. + - [Create a merge request with vendor details](#create-a-merge-request-with-vendor-details) to update the template. diff --git a/doc/development/query_count_limits.md b/doc/development/query_count_limits.md index fec6f9022ee..49509727337 100644 --- a/doc/development/query_count_limits.md +++ b/doc/development/query_count_limits.md @@ -58,7 +58,7 @@ By using a `before_action` you don't have to modify the controller method in question, reducing the likelihood of merge conflicts. For Grape API endpoints there unfortunately is not a reliable way of running a -hook before a specific endpoint. This means that you have to add the whitelist +hook before a specific endpoint. 
This means that you have to add the allowlist call directly into the endpoint like so: ```ruby diff --git a/doc/development/query_performance.md b/doc/development/query_performance.md index bc1f753c012..4fe27d42c38 100644 --- a/doc/development/query_performance.md +++ b/doc/development/query_performance.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -11,7 +11,7 @@ This document describes various guidelines to follow when optimizing SQL queries When you are optimizing your SQL queries, there are two dimensions to pay attention to: 1. The query execution time. This is paramount as it reflects how the user experiences GitLab. -1. The query plan. Optimizing the query plan is important in allowing queries to independently scale over time. Realizing that an index will keep a query performing well as the table grows before the query degrades is an example of why we analyze these plans. +1. The query plan. Optimizing the query plan is important in allowing queries to independently scale over time. Realizing that an index keeps a query performing well as the table grows before the query degrades is an example of why we analyze these plans. ## Timing guidelines for queries @@ -39,9 +39,9 @@ cache, or what PostgreSQL calls shared buffers. This is the "warm cache" query. When analyzing an [`EXPLAIN` plan](understanding_explain_plans.md), you can see the difference not only in the timing, but by looking at the output for `Buffers` by running your explain with `EXPLAIN(analyze, buffers)`. [Database Lab](understanding_explain_plans.md#database-lab-engine) -will automatically include these options. +automatically includes these options. -If you are making a warm cache query, you will only see the `shared hits`. +If you are making a warm cache query, you see only the `shared hits`. For example in #database-lab: @@ -57,7 +57,7 @@ Or in the explain plan from `psql`: Buffers: shared hit=7323 ``` -If the cache is cold, you will also see `reads`. +If the cache is cold, you also see `reads`. In #database-lab: diff --git a/doc/development/query_recorder.md b/doc/development/query_recorder.md index 17f2fecc1bc..371d6e0e49e 100644 --- a/doc/development/query_recorder.md +++ b/doc/development/query_recorder.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -24,10 +24,27 @@ it "avoids N+1 database queries" do end ``` +You can if you wish, have both the expectation and the control as +`QueryRecorder` instances: + +```ruby +it "avoids N+1 database queries" do + control = ActiveRecord::QueryRecorder.new { visit_some_page } + create_list(:issue, 5) + action = ActiveRecord::QueryRecorder.new { visit_some_page } + + expect(action).not_to exceed_query_limit(control) +end +``` + As an example you might create 5 issues in between counts, which would cause the query count to increase by 5 if an N+1 problem exists. In some cases the query count might change slightly between runs for unrelated reasons. In this case you might need to test `exceed_query_limit(control_count + acceptable_change)`, but this should be avoided if possible. 
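+A minimal sketch of that pattern, assuming the same `visit_some_page` helper as in the examples above and a tolerance of one extra query:
+
+```ruby
+it "adds at most one query per page load" do
+  control_count = ActiveRecord::QueryRecorder.new { visit_some_page }.count
+  create_list(:issue, 5)
+
+  # One additional query is tolerated here; anything more fails the spec.
+  expect { visit_some_page }.not_to exceed_query_limit(control_count + 1)
+end
+```
+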
+If this test fails, and the control was passed as a `QueryRecorder`, then the +failure message indicates where the extra queries are by matching queries on +the longest common prefix, grouping similar queries together. + ## Cached queries By default, QueryRecorder ignores [cached queries](merge_request_performance_guidelines.md#cached-queries) in the count. However, it may be better to count diff --git a/doc/development/rails_initializers.md b/doc/development/rails_initializers.md index 9bf4109f1cb..68f3c07e45a 100644 --- a/doc/development/rails_initializers.md +++ b/doc/development/rails_initializers.md @@ -24,13 +24,13 @@ Some examples where you would need to do this are: ## Database connections in initializers Ideally, database connections are not opened from Rails initializers. Opening a -database connection (e.g. checking the database exists, or making a database +database connection (for example, checking the database exists, or making a database query) from an initializer means that tasks like `db:drop`, and `db:test:prepare` will fail because an active session prevents the database from being dropped. To help detect when database connections are opened from initializers, we now -warn in stderr. For example: +warn in `STDERR`. For example: ```shell DEPRECATION WARNING: Database connection should not be called during initializers (called from block in <module:HasVariable> at app/models/concerns/ci/has_variable.rb:22) diff --git a/doc/development/rails_update.md b/doc/development/rails_update.md index 8999ac90f4c..36ffae97377 100644 --- a/doc/development/rails_update.md +++ b/doc/development/rails_update.md @@ -24,6 +24,7 @@ We strive to run GitLab using the latest Rails releases to benefit from performa 1. Run `bundle update --conservative activesupport` in the `qa` folder. 1. Resolve any Bundler conflicts. 1. Ensure that `@rails/ujs` and `@rails/actioncable` npm packages match the new rails version in [`package.json`](https://gitlab.com/gitlab-org/gitlab/blob/master/package.json). +1. Run `yarn patch-package @rails/ujs` after updating this to ensure our local patch file version matches. 1. Create an MR with the `pipeline:run-all-rspec` label and see if pipeline breaks. 1. To resolve and debug spec failures use `git bisect` against the rails repository. See the [debugging section](#git-bisect-against-rails) below. 1. Include links to the Gem diffs between the two versions in the merge request description. For example, this is the gem diff for [`activesupport` 6.1.3.2 to diff --git a/doc/development/rake_tasks.md b/doc/development/rake_tasks.md index 0538add59b5..13c4bdaedca 100644 --- a/doc/development/rake_tasks.md +++ b/doc/development/rake_tasks.md @@ -166,7 +166,7 @@ There are a few caveats for this Rake task: - The pipeline must have been completed. - You may need to wait for the test report to be parsed and retry again. -This Rake task depends on the [unit test reports](../ci/unit_test_reports.md) feature, +This Rake task depends on the [unit test reports](../ci/testing/unit_test_reports.md) feature, which only gets parsed when it is requested for the first time. ### Speed up tests, Rake tasks, and migrations diff --git a/doc/development/redis.md b/doc/development/redis.md index d5f526f2d32..e48048be624 100644 --- a/doc/development/redis.md +++ b/doc/development/redis.md @@ -56,8 +56,7 @@ the entry, instead of relying on the key changing. 
### Multi-key commands

-We don't use [Redis Cluster](https://redis.io/topics/cluster-tutorial) at the
-moment, but may wish to in the future: [#118820](https://gitlab.com/gitlab-org/gitlab/-/issues/118820).
+We don't use Redis Cluster, but support for it is tracked in [this issue](https://gitlab.com/gitlab-org/gitlab/-/issues/118820).

This imposes an additional constraint on naming: where GitLab is performing
operations that require several keys to be held on the same Redis server - for
@@ -118,12 +117,15 @@ NOTE:
There is a [video showing how to see the slow log](https://youtu.be/BBI68QuYRH8) (GitLab internal)
on GitLab.com

-On GitLab.com, entries from the [Redis
-slow log](https://redis.io/commands/slowlog) are available in the
+<!-- vale gitlab.Substitutions = NO -->
+
+On GitLab.com, entries from the [Redis slow log](https://redis.io/commands/slowlog/) are available in the
`pubsub-redis-inf-gprd*` index with the
[`redis.slowlog` tag](https://log.gprd.gitlab.net/app/kibana#/discover?_g=(filters:!(),refreshInterval:(pause:!t,value:0),time:(from:now-1d,to:now))&_a=(columns:!(json.type,json.command,json.exec_time_s),filters:!(('$state':(store:appState),meta:(alias:!n,disabled:!f,index:AWSQX_Vf93rHTYrsexmk,key:json.tag,negate:!f,params:(query:redis.slowlog),type:phrase),query:(match:(json.tag:(query:redis.slowlog,type:phrase))))),index:AWSQX_Vf93rHTYrsexmk)).
This shows commands that have taken a long time and may be a performance concern.

+<!-- vale gitlab.Substitutions = YES -->
+
The
[`fluent-plugin-redis-slowlog`](https://gitlab.com/gitlab-org/fluent-plugin-redis-slowlog)
project is responsible for taking the `slowlog` entries from Redis and
@@ -183,9 +185,9 @@ makes sure that booleans are encoded and decoded consistently.

### `Gitlab::Redis::HLL`

-The Redis [`PFCOUNT`](https://redis.io/commands/pfcount),
-[`PFADD`](https://redis.io/commands/pfadd), and
-[`PFMERGE`](https://redis.io/commands/pfmergge) commands operate on
+The Redis [`PFCOUNT`](https://redis.io/commands/pfcount/),
+[`PFADD`](https://redis.io/commands/pfadd/), and
+[`PFMERGE`](https://redis.io/commands/pfmerge/) commands operate on
HyperLogLogs, a data structure that allows estimating the number of unique
elements with low memory usage. (In addition to the `PFCOUNT` documentation,
Thoughtbot's article on [HyperLogLogs in Redis](https://thoughtbot.com/blog/hyperloglogs-in-redis)
@@ -200,7 +202,7 @@ For cases where we need to efficiently check whether an item is in a group
of items, we can use a Redis set.
[`Gitlab::SetCache`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/set_cache.rb)
provides an `#include?` method that uses the
-[`SISMEMBER`](https://redis.io/commands/sismember) command, as well as `#read`
+[`SISMEMBER`](https://redis.io/commands/sismember/) command, as well as `#read`
to fetch all entries in the set.

This is used by the
diff --git a/doc/development/redis/new_redis_instance.md b/doc/development/redis/new_redis_instance.md
index 389cddbb4e5..4900755b58c 100644
--- a/doc/development/redis/new_redis_instance.md
+++ b/doc/development/redis/new_redis_instance.md
@@ -179,9 +179,12 @@ bin/feature-flag use_primary_store_as_default_for_foo
```

By enabling `use_primary_and_secondary_stores_for_foo` feature flag, our `Gitlab::Redis::Foo` will use `MultiStore` to write to both the new Redis instance
and the [old (fallback-instance)](#fallback-instance). If we fail to fetch data from the new instance, we will fall back and read from the old Redis instance.
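The read behavior can be pictured with a simplified sketch. This is an illustration
only, not the actual `MultiStore` implementation; `log_error` and
`increment_read_fallback_count` are hypothetical helpers, while the error class and
counter name match the tables below:

```ruby
# Illustrative only: roughly how a MultiStore-style read falls back.
def read_with_fallback(command, *args)
  value = primary_store.public_send(command, *args)
  return value unless value.nil?

  # Make the miss visible before consulting the old instance.
  log_error(Gitlab::Redis::MultiStore::ReadFromPrimaryError.new)
  increment_read_fallback_count(command)

  secondary_store.public_send(command, *args)
end
```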
- We can monitor logs for `Gitlab::Redis::MultiStore::ReadFromPrimaryError`, and also the Prometheus counter `gitlab_redis_multi_store_read_fallback_total`.
-Once we stop seeing them, this means that we are no longer relying on the data stored on the old Redis store.
+
+For pipelined commands (`pipelined` and `multi`), we execute the entire operation in both stores and then compare the results. If they differ, we emit a
+`Gitlab::Redis::MultiStore::PipelinedDiffError` error, and track it in the `gitlab_redis_multi_store_pipelined_diff_error_total` Prometheus counter.
+
+Once we stop seeing those errors, this means that we are no longer relying on the data stored on the old Redis store.
At this point, we are probably safe to move the traffic to the new Redis store.

By enabling `use_primary_store_as_default_for_foo` feature flag, the `MultiStore` will use `primary_store` (new instance) as default Redis store.
@@ -213,6 +216,15 @@ MultiStore implements read and write Redis commands separately.

- `del`
- `pipelined`
- `flushdb`
+- `rpush`
+
+##### Pipelined commands
+
+**NOTE:** The Ruby block passed to these commands will be executed twice, once per each store.
+Thus, excluding the Redis operations performed, the block should be idempotent.
+
+- `pipelined`
+- `multi`

When a command outside of the supported list is used, `method_missing` will pass it to the old Redis instance and keep track of it. This ensures that anything unexpected behaves like it would before.

@@ -223,17 +235,19 @@ a developer will need to add an implementation for missing Redis commands before

##### Errors

-| error                                             | message                                                                 |
-|---------------------------------------------------|-------------------------------------------------------------------------|
+| error                                             | message                                                                                       |
+|---------------------------------------------------|---------------------------------------------------------------------------------------------|
| `Gitlab::Redis::MultiStore::ReadFromPrimaryError` | Value not found on the Redis primary store. Read from the Redis secondary store successful. |
-| `Gitlab::Redis::MultiStore::MethodMissingError`   | Method missing. Falling back to execute method on the Redis secondary store. |
+| `Gitlab::Redis::MultiStore::PipelinedDiffError`   | Pipelined command executed on both stores successfully but results differ between them. |
+| `Gitlab::Redis::MultiStore::MethodMissingError`   | Method missing. Falling back to execute method on the Redis secondary store.
| ##### Metrics -| metrics name | type | labels | description | -|-------------------------------------------------|--------------------|------------------------|----------------------------------------------------| -| `gitlab_redis_multi_store_read_fallback_total` | Prometheus Counter | command, instance_name | Client side Redis MultiStore reading fallback total| -| `gitlab_redis_multi_store_method_missing_total` | Prometheus Counter | command, instance_name | Client side Redis MultiStore method missing total | +| metrics name | type | labels | description | +|-------------------------------------------------------|--------------------|------------------------|--------------------------------------------------------| +| `gitlab_redis_multi_store_read_fallback_total` | Prometheus Counter | command, instance_name | Client side Redis MultiStore reading fallback total | +| `gitlab_redis_multi_store_pipelined_diff_error_total` | Prometheus Counter | command, instance_name | Redis MultiStore pipelined command diff between stores | +| `gitlab_redis_multi_store_method_missing_total` | Prometheus Counter | command, instance_name | Client side Redis MultiStore method missing total | ## Step 4: clean up after the migration diff --git a/doc/development/refactoring_guide/index.md b/doc/development/refactoring_guide/index.md index a6ed83258f3..9793db3bb85 100644 --- a/doc/development/refactoring_guide/index.md +++ b/doc/development/refactoring_guide/index.md @@ -71,7 +71,7 @@ expect(cleanForSnapshot(wrapper.element)).toMatchSnapshot(); ### Resources -[Unofficial wiki explanation](http://wiki.c2.com/?PinningTests) +[Unofficial wiki explanation](https://wiki.c2.com/?PinningTests) ### Examples diff --git a/doc/development/reference_processing.md b/doc/development/reference_processing.md index ad6552e88fe..1dfe6496e79 100644 --- a/doc/development/reference_processing.md +++ b/doc/development/reference_processing.md @@ -41,7 +41,7 @@ For example, the class is responsible for handling references to issues, such as `gitlab-org/gitlab#123` and `https://gitlab.com/gitlab-org/gitlab/-/issues/200048`. -All reference filters are instances of [`HTML::Pipeline::Filter`](https://www.rubydoc.info/github/jch/html-pipeline/HTML/Pipeline/Filter), +All reference filters are instances of [`HTML::Pipeline::Filter`](https://www.rubydoc.info/gems/html-pipeline), and inherit (often indirectly) from [`Banzai::Filter::ReferenceFilter`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/banzai/filter/reference_filter.rb). `HTML::Pipeline::Filter` has a simple interface consisting of `#call`, a void diff --git a/doc/development/routing.md b/doc/development/routing.md index 41961c2288f..2b3ecd8127b 100644 --- a/doc/development/routing.md +++ b/doc/development/routing.md @@ -104,6 +104,6 @@ To get started, see an [example merge request](https://gitlab.com/gitlab-org/git ## Useful links -- [Routing improvements master plan](https://gitlab.com/gitlab-org/gitlab/-/issues/215362) +- [Routing improvements main plan](https://gitlab.com/gitlab-org/gitlab/-/issues/215362) - [Scoped routing explained](https://gitlab.com/gitlab-org/gitlab/-/issues/214217) - [Removal of deprecated routes](https://gitlab.com/gitlab-org/gitlab/-/issues/28848) diff --git a/doc/development/ruby3_gotchas.md b/doc/development/ruby3_gotchas.md index e4ed5039e3c..dbe6fa13eee 100644 --- a/doc/development/ruby3_gotchas.md +++ b/doc/development/ruby3_gotchas.md @@ -138,3 +138,28 @@ installed Ruby manually or via tools like `asdf`. 
Users of the `gitlab-development-kit`
are also affected by this problem. Build images are not affected because they
include the patch set addressing this bug.
+
+## Deprecations are not caught in DeprecationToolkit if the method is stubbed
+
+We rely on `deprecation_toolkit` to fail fast when using functionality that is deprecated in Ruby 2 and removed in Ruby 3.
+A common issue caught during the transition from Ruby 2 to Ruby 3 relates to
+the [separation of positional and keyword arguments in Ruby 3.0](https://www.ruby-lang.org/en/news/2019/12/12/separation-of-positional-and-keyword-arguments-in-ruby-3-0/).
+
+Unfortunately, if the author has stubbed such methods in tests, deprecations are not caught.
+Our automated detection relies on `deprecation_toolkit` intercepting `Kernel#warn`:
+stubbing out the deprecated method also removes its call to `warn`, so
+`deprecation_toolkit` never sees the deprecation warning and the build stays green.
+
+Please refer to [issue 364099](https://gitlab.com/gitlab-org/gitlab/-/issues/364099) for more context.
+
+## Testing in `irb` and `rails console`
+
+Another pitfall is that testing in `irb`/`rails c` silences the deprecation warning,
+since `irb` in Ruby 2.7.x has a [bug](https://bugs.ruby-lang.org/issues/17377) that prevents deprecation warnings from showing.
+
+When writing code and performing code reviews, pay extra attention to method calls of the form `f({k: v})`.
+This is valid in Ruby 2 when `f` takes either a `Hash` or keyword arguments, but Ruby 3 only considers this valid if `f` takes a `Hash`.
+For Ruby 3 compliance, this should be changed to one of the following invocations if `f` takes keyword arguments:
+
+- `f(**{k: v})`
+- `f(k: v)`
diff --git a/doc/development/scalability.md b/doc/development/scalability.md
index 4450df0399d..39cd0ecfcdd 100644
--- a/doc/development/scalability.md
+++ b/doc/development/scalability.md
@@ -222,7 +222,7 @@ only when the primary fails.

### Redis Sentinels

-[Redis Sentinel](https://redis.io/topics/sentinel) provides high
+[Redis Sentinel](https://redis.io/docs/manual/sentinel/) provides high
availability for Redis by watching the primary. If multiple Sentinels
detect that the primary has gone away, the Sentinels perform an
election to determine a new leader.

@@ -232,8 +232,7 @@ election to determine a new leader.

No leader: A Redis cluster can get into a mode where there are no
primaries. For example, this can happen if Redis nodes are misconfigured
to follow the wrong node. Sometimes this requires forcing one node to
-become a primary by using the [`REPLICAOF NO ONE`
-command](https://redis.io/commands/replicaof).
+become a primary by using the [`REPLICAOF NO ONE` command](https://redis.io/commands/replicaof/).

### Sidekiq

@@ -275,8 +274,8 @@ in a timely manner:

  this to `ProcessCommitWorker`.
- Redistribute/gerrymander Sidekiq processes by queue types. Long-running
  jobs (for example, relating to project import) can often
-  squeeze out jobs that run fast (for example, delivering email). [This technique
-  was used in to optimize our existing Sidekiq deployment](https://gitlab.com/gitlab-com/gl-infra/infrastructure/-/issues/7219#note_218019483).
+  squeeze out jobs that run fast (for example, delivering email).
+  [We used this technique to optimize our existing Sidekiq deployment](https://gitlab.com/gitlab-com/gl-infra/reliability/-/issues/7219#note_218019483).
- Optimize jobs. Eliminating unnecessary work, reducing network calls (including SQL and Gitaly), and optimizing processor time can yield significant benefits. diff --git a/doc/development/secure_coding_guidelines.md b/doc/development/secure_coding_guidelines.md index 3e46891d20e..d8e2352bd93 100644 --- a/doc/development/secure_coding_guidelines.md +++ b/doc/development/secure_coding_guidelines.md @@ -59,7 +59,7 @@ Some example of well implemented access controls and tests: 1. [example2](https://dev.gitlab.org/gitlab/gitlabhq/-/merge_requests/2511/diffs#ed3aaab1510f43b032ce345909a887e5b167e196_142_155) 1. [example3](https://dev.gitlab.org/gitlab/gitlabhq/-/merge_requests/3170/diffs?diff_id=17494) -**NB:** any input from development team is welcome, for example, about Rubocop rules. +**NB:** any input from development team is welcome, for example, about RuboCop rules. ## Regular Expressions guidelines @@ -637,14 +637,11 @@ We recommend using the ciphers that Mozilla is providing in their [recommended S - `ECDHE-RSA-AES128-GCM-SHA256` - `ECDHE-ECDSA-AES256-GCM-SHA384` - `ECDHE-RSA-AES256-GCM-SHA384` -- `ECDHE-ECDSA-CHACHA20-POLY1305` -- `ECDHE-RSA-CHACHA20-POLY1305` And the following cipher suites (according to the [RFC 8446](https://datatracker.ietf.org/doc/html/rfc8446#appendix-B.4)) for TLS 1.3: - `TLS_AES_128_GCM_SHA256` - `TLS_AES_256_GCM_SHA384` -- `TLS_CHACHA20_POLY1305_SHA256` *Note*: **Golang** does [not support](https://github.com/golang/go/blob/go1.17/src/crypto/tls/cipher_suites.go#L676) all cipher suites with TLS 1.3. @@ -665,7 +662,7 @@ For **Ruby**, you can use [`HTTParty`](https://github.com/jnunemaker/httparty) a Whenever possible this example should be **avoided** for security purposes: ```ruby -response = HTTParty.get('https://gitlab.com', ssl_version: :TLSv1_3, ciphers: ['TLS_AES_128_GCM_SHA256', 'TLS_AES_256_GCM_SHA384', 'TLS_CHACHA20_POLY1305_SHA256']) +response = HTTParty.get('https://gitlab.com', ssl_version: :TLSv1_3, ciphers: ['TLS_AES_128_GCM_SHA256', 'TLS_AES_256_GCM_SHA384']) ``` When using [`GitLab::HTTP`](#gitlab-http-library), the code looks like: @@ -673,7 +670,7 @@ When using [`GitLab::HTTP`](#gitlab-http-library), the code looks like: This is the **recommended** implementation to avoid security issues such as SSRF: ```ruby -response = GitLab::HTTP.perform_request(Net::HTTP::Get, 'https://gitlab.com', ssl_version: :TLSv1_3, ciphers: ['TLS_AES_128_GCM_SHA256', 'TLS_AES_256_GCM_SHA384', 'TLS_CHACHA20_POLY1305_SHA256']) +response = GitLab::HTTP.perform_request(Net::HTTP::Get, 'https://gitlab.com', ssl_version: :TLSv1_3, ciphers: ['TLS_AES_128_GCM_SHA256', 'TLS_AES_256_GCM_SHA384']) ``` ##### TLS 1.2 @@ -687,8 +684,6 @@ func secureCipherSuites() []uint16 { tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384, tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384, - tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305, - tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305, } ``` @@ -703,12 +698,12 @@ tls.Config{ } ``` -This example was taken [here](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/871b52dc700f1a66f6644fbb1e78a6d463a6ff83/internal/tool/tlstool/tlstool.go#L72). +This example was taken [from the GitLab Agent](https://gitlab.com/gitlab-org/cluster-integration/gitlab-agent/-/blob/871b52dc700f1a66f6644fbb1e78a6d463a6ff83/internal/tool/tlstool/tlstool.go#L72). 
For **Ruby**, you can again use [`HTTParty`](https://github.com/jnunemaker/httparty), this time specifying TLS 1.2 alongside the recommended ciphers:

```ruby
-response = GitLab::HTTP.perform_request(Net::HTTP::Get, 'https://gitlab.com', ssl_version: :TLSv1_2, ciphers: ['ECDHE-ECDSA-AES128-GCM-SHA256', 'ECDHE-RSA-AES128-GCM-SHA256', 'ECDHE-ECDSA-AES256-GCM-SHA384', 'ECDHE-RSA-AES256-GCM-SHA384', 'ECDHE-ECDSA-CHACHA20-POLY1305', 'ECDHE-RSA-CHACHA20-POLY1305'])
+response = GitLab::HTTP.perform_request(Net::HTTP::Get, 'https://gitlab.com', ssl_version: :TLSv1_2, ciphers: ['ECDHE-ECDSA-AES128-GCM-SHA256', 'ECDHE-RSA-AES128-GCM-SHA256', 'ECDHE-ECDSA-AES256-GCM-SHA384', 'ECDHE-RSA-AES256-GCM-SHA384'])
```

## GitLab Internal Authorization
diff --git a/doc/development/serializing_data.md b/doc/development/serializing_data.md
index 48e756d015b..97e6f665484 100644
--- a/doc/development/serializing_data.md
+++ b/doc/development/serializing_data.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
@@ -20,7 +20,7 @@ end
```

While it may be tempting to store serialized data in the database there are many
-problems with this. This document will outline these problems and provide an
+problems with this. This document outlines these problems and provides an
alternative.

## Serialized Data Is Less Powerful

@@ -34,13 +34,13 @@ turn there's no way to query the data at all.

## Waste Of Space

-Storing serialized data such as JSON or YAML will end up wasting a lot of space.
+Storing serialized data such as JSON or YAML ends up wasting a lot of space.
This is because these formats often include additional characters (for example,
double quotes or newlines) besides the data that you are storing.

## Difficult To Manage

-There comes a time where you will need to add a new field to the serialized
+There comes a time where you must add a new field to the serialized
data, or change an existing one. Using serialized data this becomes difficult
and very time consuming as the only way of doing so is to re-write all the
stored values. To do so you would have to:

@@ -51,8 +51,7 @@ stored values. To do so you would have to:

1. Serialize it back to a String
1. Store it in the database

-On the other hand, if one were to use regular columns adding a column would be
-as easy as this:
+On the other hand, if one were to use regular columns, adding a column would be:

```sql
ALTER TABLE table_name ADD COLUMN column_name type;

@@ -62,9 +61,9 @@ Such a query would take very little to no time and would immediately apply to
all rows, without having to re-write large JSON or YAML structures.

Finally, there comes a time when the JSON or YAML structure is no longer
-sufficient and you need to migrate away from it. When storing only a few rows
+sufficient and you must migrate away from it. When storing only a few rows
this may not be a problem, but when storing millions of rows such a migration
-can easily take hours or even days to complete.
+can take hours or even days to complete.

## Relational Databases Are Not Document Stores

@@ -85,7 +84,7 @@ you don't need.

## The Solution

-The solution is very simple: just use separate columns and/or separate tables.
-This will allow you to use all the features provided by your database, it will
-make it easier to manage and migrate the data, you'll conserve space, you can
+The solution is to use separate columns and/or separate tables.
+This allows you to use all the features provided by your database, makes the
+data easier to manage and migrate, conserves space, and lets you
index the data efficiently and so forth.
diff --git a/doc/development/service_ping/implement.md b/doc/development/service_ping/implement.md
index 27bc4d2e8ca..6948eb20e00 100644
--- a/doc/development/service_ping/implement.md
+++ b/doc/development/service_ping/implement.md
@@ -46,7 +46,7 @@ boards: add_metric('CountBoardsMetric', time_frame: 'all'),

There are several types of counters for metrics:

-- **[Batch counters](#batch-counters)**: Used for counts and sums.
+- **[Batch counters](#batch-counters)**: Used for counts, sums, and averages.
- **[Redis counters](#redis-counters):** Used for in-memory counts.
- **[Alternative counters](#alternative-counters):** Used for settings and configurations.

@@ -102,34 +102,32 @@ Examples using `usage_data.rb` have been [deprecated](usage_data.md). We recomme

#### Sum batch operation

-There is no support for `sum` for database metrics.
-
Sums the values of a given `ActiveRecord_Relation` on a given column and handles
the `ActiveRecord::StatementInvalid` error.

Method:

```ruby
-sum(relation, column, batch_size: nil, start: nil, finish: nil)
+add_metric('JiraImportsTotalImportedIssuesCountMetric')
```

-Arguments:
+#### Average batch operation

-- `relation`: the ActiveRecord_Relation to perform the operation
-- `column`: the column to sum on
-- `batch_size`: if none set it uses default value 1000 from `Gitlab::Database::BatchCounter`
-- `start`: custom start of the batch counting to avoid complex min calculations
-- `end`: custom end of the batch counting to avoid complex min calculations
+Averages the values of a given `ActiveRecord_Relation` on a given column and handles errors.

-Examples:
+Method:

```ruby
-sum(JiraImportState.finished, :imported_issues_count)
+add_metric('CountIssuesWeightAverageMetric')
```

+Examples:
+
+Examples using `usage_data.rb` have been [deprecated](usage_data.md). We recommend using [instrumentation classes](metrics_instrumentation.md).
+
#### Grouping and batch operations

-The `count`, `distinct_count`, and `sum` batch counters can accept an `ActiveRecord::Relation`
+The `count`, `distinct_count`, `sum`, and `average` batch counters can accept an `ActiveRecord::Relation`
object, which groups by a specified column. With a grouped relation, the methods do batch counting,
handle errors, and return a hash table of key-value pairs.

@@ -144,6 +142,9 @@ distinct_count(Project.group(:visibility_level), :creator_id)

sum(Issue.group(:state_id), :weight)
# returns => {1=>3542, 2=>6820}
+
+average(Issue.group(:state_id), :weight)
+# returns => {1=>3.5, 2=>2.5}
```

#### Add operation
@@ -286,7 +287,7 @@ Enabled by default in GitLab 13.7 and later.

Increment event count using an ordinary Redis counter, for a given event name.

API requests are protected by checking for a valid CSRF token.
- + ```plaintext POST /usage_data/increment_counter ``` @@ -652,9 +653,10 @@ We return fallback values in these cases: | Case | Value | |-----------------------------|-------| -| Deprecated Metric | -1000 | +| Deprecated Metric ([Removed with version 14.3](https://gitlab.com/gitlab-org/gitlab/-/issues/335894)) | -1000 | | Timeouts, general failures | -1 | | Standard errors in counters | -2 | +| Histogram metrics failure | { '-1' => -1 } | ## Test counters manually using your Rails console diff --git a/doc/development/service_ping/index.md b/doc/development/service_ping/index.md index 1e09dada36e..e776b78b710 100644 --- a/doc/development/service_ping/index.md +++ b/doc/development/service_ping/index.md @@ -464,7 +464,7 @@ To generate Service Ping, use [Teleport](https://goteleport.com/docs/) or a deta 1. Get the metrics duration from logs: -Search in Google Console logs for `time_elapsed`. Query example [here](https://cloudlogging.app.goo.gl/nWheZvD8D3nWazNe6). +Search in Google Console logs for `time_elapsed`. [Query example](https://cloudlogging.app.goo.gl/nWheZvD8D3nWazNe6). ### Verification (After approx 30 hours) diff --git a/doc/development/service_ping/metrics_dictionary.md b/doc/development/service_ping/metrics_dictionary.md index ead11a412fa..fee3bb571c2 100644 --- a/doc/development/service_ping/metrics_dictionary.md +++ b/doc/development/service_ping/metrics_dictionary.md @@ -35,7 +35,7 @@ Each metric is defined in a separate YAML file consisting of a number of fields: | `name` | no | Metric name suggestion. Can replace the last part of `key_path`. | | `description` | yes | | | `product_section` | yes | The [section](https://gitlab.com/gitlab-com/www-gitlab-com/-/blob/master/data/sections.yml). | -| `product_stage` | no | The [stage](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/stages.yml) for the metric. | +| `product_stage` | yes | The [stage](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/stages.yml) for the metric. | | `product_group` | yes | The [group](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/stages.yml) that owns the metric. | | `product_category` | no | The [product category](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/categories.yml) for the metric. | | `value_type` | yes | `string`; one of [`string`, `number`, `boolean`, `object`](https://json-schema.org/understanding-json-schema/reference/type.html). | @@ -43,11 +43,11 @@ Each metric is defined in a separate YAML file consisting of a number of fields: | `time_frame` | yes | `string`; may be set to a value like `7d`, `28d`, `all`, `none`. | | `data_source` | yes | `string`; may be set to a value like `database`, `redis`, `redis_hll`, `prometheus`, `system`. | | `data_category` | yes | `string`; [categories](#data-category) of the metric, may be set to `operational`, `optional`, `subscription`, `standard`. The default value is `optional`.| -| `instrumentation_class` | no | `string`; [the class that implements the metric](metrics_instrumentation.md). | +| `instrumentation_class` | yes | `string`; [the class that implements the metric](metrics_instrumentation.md). | | `distribution` | yes | `array`; may be set to one of `ce, ee` or `ee`. The [distribution](https://about.gitlab.com/handbook/marketing/strategic-marketing/tiers/#definitions) where the tracked feature is available. 
|
| `performance_indicator_type` | no | `array`; may be set to one of [`gmau`, `smau`, `paid_gmau`, or `umau`](https://about.gitlab.com/handbook/business-technology/data-team/data-catalog/xmau-analysis/). |
| `tier` | yes | `array`; may contain one or a combination of `free`, `premium` or `ultimate`. The [tier]( https://about.gitlab.com/handbook/marketing/strategic-marketing/tiers/) where the tracked feature is available. This should be verbose and contain all tiers where a metric is available. |
-| `milestone` | no | The milestone when the metric is introduced and when it's available to self-managed instances with the official GitLab release. |
+| `milestone` | yes | The milestone when the metric is introduced and when it's available to self-managed instances with the official GitLab release. |
| `milestone_removed` | no | The milestone when the metric is removed. |
| `introduced_by_url` | no | The URL to the merge request that introduced the metric to be available for self-managed instances. |
| `removed_by_url` | no | The URL to the merge request that removed the metric. |
diff --git a/doc/development/service_ping/metrics_instrumentation.md b/doc/development/service_ping/metrics_instrumentation.md
index e718d972fba..4fd03eea84f 100644
--- a/doc/development/service_ping/metrics_instrumentation.md
+++ b/doc/development/service_ping/metrics_instrumentation.md
@@ -8,10 +8,13 @@ info: To determine the technical writer assigned to the Stage/Group associated w

This guide describes how to develop Service Ping metrics using metrics instrumentation.

+<i class="fa fa-youtube-play youtube" aria-hidden="true"></i>
+For a video tutorial, see the [Adding Service Ping metric via instrumentation class](https://youtu.be/p2ivXhNxUoY).
+
## Nomenclature

- **Instrumentation class**:
-  - Inherits one of the metric classes: `DatabaseMetric`, `RedisMetric`, `RedisHLLMetric` or `GenericMetric`.
+  - Inherits one of the metric classes: `DatabaseMetric`, `RedisMetric`, `RedisHLLMetric`, `NumbersMetric`, or `GenericMetric`.
  - Implements the logic that calculates the value for a Service Ping metric.

- **Metric definition**
@@ -24,7 +27,7 @@ This guide describes how to develop Service Ping metrics using metrics instrumen

A metric definition has the [`instrumentation_class`](metrics_dictionary.md) field, which can be set to a class.

-The defined instrumentation class should inherit one of the existing metric classes: `DatabaseMetric`, `RedisMetric`, `RedisHLLMetric`, or `GenericMetric`.
+The defined instrumentation class should inherit one of the existing metric classes: `DatabaseMetric`, `RedisMetric`, `RedisHLLMetric`, `NumbersMetric`, or `GenericMetric`.

The current convention is that a single instrumentation class corresponds to a single metric. On rare occasions, there are exceptions to that convention like [Redis metrics](#redis-metrics). To use a single instrumentation class for more than one metric, please reach out to one of the `@gitlab-org/growth/product-intelligence/engineers` members to consult about your case.

@@ -35,7 +38,7 @@ We have built a domain-specific language (DSL) to define the metrics instrumenta

## Database metrics

-- `operation`: Operations for the given `relation`, one of `count`, `distinct_count`.
+- `operation`: Operations for the given `relation`, one of `count`, `distinct_count`, `sum`, and `average`.
- `relation`: `ActiveRecord::Relation` for the objects we want to perform the `operation`.
- `start`: Specifies the start value of the batch counting, which by default is `relation.minimum(:id)`.
- `finish`: Specifies the end value of the batch counting, which by default is `relation.maximum(:id)`.

@@ -104,6 +107,46 @@ module Gitlab
end
```

+### Sum Example
+
+```ruby
+# frozen_string_literal: true
+
+module Gitlab
+  module Usage
+    module Metrics
+      module Instrumentations
+        class JiraImportsTotalImportedIssuesCountMetric < DatabaseMetric
+          operation :sum, column: :imported_issues_count
+
+          relation { JiraImportState.finished }
+        end
+      end
+    end
+  end
+end
+```
+
+### Average Example
+
+```ruby
+# frozen_string_literal: true
+
+module Gitlab
+  module Usage
+    module Metrics
+      module Instrumentations
+        class CountIssuesWeightAverageMetric < DatabaseMetric
+          operation :average, column: :weight
+
+          relation { Issue }
+        end
+      end
+    end
+  end
+end
+```
+
## Redis metrics

[Example of a merge request that adds a `Redis` metric](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/66582).

@@ -201,6 +244,43 @@ options:
  - i_quickactions_approve
```

+## Numbers metrics
+
+- `operation`: Operations for the given `data` block. Currently, only the `add` operation is supported.
+- `data`: a `block` which contains an array of numbers.
+- `available?`: Specifies whether the metric should be reported. The default is `true`.
+
+```ruby
+# frozen_string_literal: true
+
+module Gitlab
+  module Usage
+    module Metrics
+      module Instrumentations
+        class IssuesBoardsCountMetric < NumbersMetric
+          operation :add
+
+          data do |time_frame|
+            [
+              CountIssuesMetric.new(time_frame: time_frame).value,
+              CountBoardsMetric.new(time_frame: time_frame).value
+            ]
+          end
+        end
+      end
+    end
+  end
+end
+```
+
+You must also include the instrumentation class name in the YAML setup.
+
+```yaml
+time_frame: 28d
+instrumentation_class: 'IssuesBoardsCountMetric'
+```
+
## Generic metrics

- `value`: Specifies the value of the metric.

@@ -228,14 +308,15 @@ end

There is support for:

-- `count`, `distinct_count`, `estimate_batch_distinct_count` for [database metrics](#database-metrics).
+- `count`, `distinct_count`, `estimate_batch_distinct_count`, `sum`, and `average` for [database metrics](#database-metrics).
- [Redis metrics](#redis-metrics).
- [Redis HLL metrics](#redis-hyperloglog-metrics).
+- `add` for [numbers metrics](#numbers-metrics).
- [Generic metrics](#generic-metrics), which are metrics based on settings or configurations.

There is no support for:

-- `add`, `sum`, `histogram` for database metrics.
+- `add`, `histogram` for database metrics.

You can [track the progress to support these](https://gitlab.com/groups/gitlab-org/-/epics/6118).

@@ -245,8 +326,10 @@ To create a stub instrumentation for a Service Ping metric, you can use a dedica

The generator takes the class name as an argument and the following options:

-- `--type=TYPE` Required. Indicates the metric type. It must be one of: `database`, `generic`, `redis`.
-- `--operation` Required for `database` type. It must be one of: `count`, `distinct_count`, `estimate_batch_distinct_count`.
+- `--type=TYPE` Required. Indicates the metric type. It must be one of: `database`, `generic`, `redis`, `numbers`.
+- `--operation` Required for `database` and `numbers` types.
+  - For `database` it must be one of: `count`, `distinct_count`, `estimate_batch_distinct_count`, `sum`, `average`.
+  - For `numbers` it must be: `add`.
- `--ee` Indicates if the metric is for EE.
```shell @@ -264,6 +347,7 @@ This guide describes how to migrate a Service Ping metric from [`lib/gitlab/usag - [Database metric](#database-metrics) - [Redis HyperLogLog metrics](#redis-hyperloglog-metrics) - [Redis metric](#redis-metrics) +- [Numbers metric](#numbers-metrics) - [Generic metric](#generic-metrics) 1. Determine the location of instrumentation class: either under `ee` or outside `ee`. diff --git a/doc/development/shell_commands.md b/doc/development/shell_commands.md index 6f56e60f619..fcb8c20bdd3 100644 --- a/doc/development/shell_commands.md +++ b/doc/development/shell_commands.md @@ -17,7 +17,7 @@ These guidelines are meant to make your code more reliable _and_ secure. ## Use File and FileUtils instead of shell commands -Sometimes we invoke basic Unix commands via the shell when there is also a Ruby API for doing it. Use the Ruby API if it exists. <http://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions> +Sometimes we invoke basic Unix commands via the shell when there is also a Ruby API for doing it. Use the Ruby API if it exists. <https://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions> ```ruby # Wrong diff --git a/doc/development/sidekiq/logging.md b/doc/development/sidekiq/logging.md index 015376b0fc6..474ea5de951 100644 --- a/doc/development/sidekiq/logging.md +++ b/doc/development/sidekiq/logging.md @@ -25,7 +25,7 @@ need to do anything. There are however some instances when there would be no context present when the job is scheduled, or the context that is present is -likely to be incorrect. For these instances, we've added Rubocop rules +likely to be incorrect. For these instances, we've added RuboCop rules to draw attention and avoid incorrect metadata in our logs. As with most our cops, there are perfectly valid reasons for disabling diff --git a/doc/development/sidekiq/worker_attributes.md b/doc/development/sidekiq/worker_attributes.md index 3bd6d313e2c..6820627f761 100644 --- a/doc/development/sidekiq/worker_attributes.md +++ b/doc/development/sidekiq/worker_attributes.md @@ -6,6 +6,11 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Sidekiq worker attributes +Worker classes can define certain attributes to control their behavior and add metadata. + +Child classes inheriting from other workers also inherit these attributes, so you only +have to redefine them if you want to override their values. + ## Job urgency Jobs can have an `urgency` attribute set, which can be `:high`, diff --git a/doc/development/single_table_inheritance.md b/doc/development/single_table_inheritance.md index 0783721e628..c8d082e8a67 100644 --- a/doc/development/single_table_inheritance.md +++ b/doc/development/single_table_inheritance.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- diff --git a/doc/development/snowplow/implementation.md b/doc/development/snowplow/implementation.md index f4123e3ba86..88fb1d5cfe4 100644 --- a/doc/development/snowplow/implementation.md +++ b/doc/development/snowplow/implementation.md @@ -464,7 +464,10 @@ Page titles are hardcoded as `GitLab` for the same reason. #### Snowplow Inspector Chrome Extension -Snowplow Inspector Chrome Extension is a browser extension for testing frontend events. 
This works in production, staging, and local development environments. +Snowplow Inspector Chrome Extension is a browser extension for testing frontend events. This works in production, staging, and local development environments. + +<i class="fa fa-youtube-play youtube" aria-hidden="true"></i> +For a video tutorial, see the [Snowplow plugin walk through](https://www.youtube.com/watch?v=g4rqnIZ1Mb4). 1. Install [Snowplow Inspector](https://chrome.google.com/webstore/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm?hl=en). 1. To open the extension, select the Snowplow Inspector icon beside the address bar. diff --git a/doc/development/snowplow/index.md b/doc/development/snowplow/index.md index 9b684757fe1..d6a7b900629 100644 --- a/doc/development/snowplow/index.md +++ b/doc/development/snowplow/index.md @@ -78,6 +78,8 @@ sequenceDiagram Snowflake DW->>Sisense Dashboards: Data available for querying ``` +For more details about the architecture, see [Snowplow infrastructure](infrastructure.md). + ## Structured event taxonomy Click events must be consistent. If each feature captures events differently, it can be difficult @@ -184,19 +186,6 @@ LIMIT 100 Snowplow JavaScript adds [web-specific parameters](https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/snowplow-tracker-protocol/#Web-specific_parameters) to all web events by default. -## Snowplow monitoring - -For different stages in the processing pipeline, there are several tools that monitor Snowplow events tracking: - -- [Product Intelligence Grafana dashboard](https://dashboards.gitlab.net/d/product-intelligence-main/product-intelligence-product-intelligence?orgId=1) monitors backend events sent from GitLab.com instance to collectors fleet. This dashboard provides information about: - - The number of events that successfully reach Snowplow collectors. - - The number of events that failed to reach Snowplow collectors. - - The number of backend events that were sent. -- [AWS CloudWatch dashboard](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#dashboards:name=SnowPlow;start=P3D) monitors the state of the events processing pipeline. The pipeline starts from Snowplow collectors, through to enrichers and pseudonymization, and up to persistence on S3 bucket from which events are imported to Snowflake Data Warehouse. To view this dashboard AWS access is required, follow this [instruction](https://gitlab.com/gitlab-org/growth/product-intelligence/snowplow-pseudonymization#monitoring) if you are interested in getting one. -- [SiSense dashboard](https://app.periscopedata.com/app/gitlab/417669/Snowplow-Summary-Dashboard) provides information about the number of good and bad events imported into the Data Warehouse, in addition to the total number of imported Snowplow events. - -For more information, see this [video walk-through](https://www.youtube.com/watch?v=NxPS0aKa_oU). 
-
## Related topics

- [Snowplow data structure](https://docs.snowplowanalytics.com/docs/understanding-your-pipeline/canonical-event/)
diff --git a/doc/development/snowplow/infrastructure.md b/doc/development/snowplow/infrastructure.md
new file mode 100644
index 00000000000..28541874e98
--- /dev/null
+++ b/doc/development/snowplow/infrastructure.md
@@ -0,0 +1,101 @@
+---
+stage: Growth
+group: Product Intelligence
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Snowplow infrastructure
+
+Snowplow events on GitLab SaaS fired by a [tracker](implementation.md) go through an AWS pipeline, managed by GitLab.
+
+## Event flow in the AWS pipeline
+
+Every event goes through a collector, enricher, and pseudonymization lambda. The event is then dumped to S3 storage where it can be picked up by the Snowflake data warehouse.
+
+Deploying and managing the infrastructure is automated using Terraform in the current [Terraform repository](https://gitlab.com/gitlab-com/gl-infra/config-mgmt/-/tree/master/environments/aws-snowplow).
+
+```mermaid
+graph LR
+    GL[GitLab.com]-->COL
+
+    subgraph aws-cloud[AWS]
+      COL[Collector]-->|snowplow-raw-good|ENR
+      COL[Collector]-->|snowplow-raw-bad|FRBE
+      subgraph firehoserbe[Firehose]
+        FRBE[AWS Lambda]
+      end
+      FRBE-->S3RBE
+
+      ENR[Enricher]-->|snowplow-enriched-bad|FEBE
+      subgraph firehoseebe[Firehose]
+        FEBE[AWS Lambda]
+      end
+      FEBE-->S3EBE
+
+      ENR[Enricher]-->|snowplow-enriched-good|FRGE
+      subgraph firehosege[Firehose]
+        FRGE[AWS Lambda]
+      end
+      FRGE-->S3GE
+    end
+
+    subgraph snowflake[Data warehouse]
+      S3RBE[S3 raw-bad]-->BE[gitlab_bad_events]
+      S3EBE[S3 enriched-bad]-->BE[gitlab_bad_events]
+      S3GE[S3 output]-->GE[gitlab_events]
+    end
+```
+
+See [Snowplow technology 101](https://github.com/snowplow/snowplow/#snowplow-technology-101) for Snowplow's own documentation and an overview of how collectors and enrichers work.
+
+### Pseudonymization
+
+In contrast to a typical Snowplow pipeline, after enrichment, GitLab Snowplow events go through a [pseudonymization service](https://gitlab.com/gitlab-org/growth/product-intelligence/snowplow-pseudonymization) in the form of an AWS Lambda service before they are stored in S3 storage.
+
+#### Why events need to be pseudonymized
+
+GitLab is bound by its [obligations to community](https://about.gitlab.com/handbook/product/product-intelligence-guide/service-usage-data-commitment/)
+and by [legal regulations](https://about.gitlab.com/handbook/legal/privacy/services-usage-data/) to protect the privacy of its users.
+
+GitLab must provide valuable insights for business decisions, and there is a need
+for a better understanding of different users' behavior patterns. The
+pseudonymization process helps you find a compromise between these two requirements.
+
+Pseudonymization processes personally identifiable information inside a Snowplow event in an irreversible fashion,
+maintaining deterministic output for a given input while masking any relation to that input.
+
+#### How events are pseudonymized
+
+Pseudonymization uses an allowlist that provides privacy by default. Therefore, each
+attribute received as part of a Snowplow event is pseudonymized unless the attribute
+is an allowed exception.
+
+Pseudonymization is done using the HMAC-SHA256 keyed hash algorithm.
+Attributes are combined with a secret salt to replace each piece of identifiable information with a pseudonym.
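+
+For illustration, a minimal sketch of that keyed-hash step in Ruby. This is not the
+Lambda's actual code; `SECRET_SALT` is a stand-in for the secret managed by the
+pseudonymization service:
+
+```ruby
+require 'openssl'
+
+# Assumption: the real salt is provisioned and rotated by the service itself.
+SECRET_SALT = 'replace-with-managed-secret'
+
+# The same input always maps to the same pseudonym, but the original
+# value cannot be recovered from the output.
+def pseudonymize(attribute_value)
+  OpenSSL::HMAC.hexdigest('SHA256', SECRET_SALT, attribute_value.to_s)
+end
+
+pseudonymize('user_123') == pseudonymize('user_123') # => true (deterministic)
+```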
+
+### S3 bucket data lake to Snowflake
+
+See [Data team's Snowplow Overview](https://about.gitlab.com/handbook/business-technology/data-team/platform/snowplow/) for further details on how data is ingested into our Snowflake data warehouse.
+
+## Monitoring
+
+There are several tools that monitor Snowplow events tracking in different stages of the processing pipeline:
+
+- [Product Intelligence Grafana dashboard](https://dashboards.gitlab.net/d/product-intelligence-main/product-intelligence-product-intelligence?orgId=1) monitors backend events sent from a GitLab.com instance to a collectors fleet. This dashboard provides information about:
+  - The number of events that successfully reach Snowplow collectors.
+  - The number of events that failed to reach Snowplow collectors.
+  - The number of backend events that were sent.
+- [AWS CloudWatch dashboard](https://console.aws.amazon.com/cloudwatch/home?region=us-east-1#dashboards:name=SnowPlow;start=P3D) monitors the state of the events in a processing pipeline. The pipeline starts from Snowplow collectors, goes through to enrichers and pseudonymization, and then up to persistence in an S3 bucket. From S3, the events are imported into the Snowflake Data Warehouse. You must have AWS access rights to view this dashboard. For more information, see [monitoring](https://gitlab.com/gitlab-org/growth/product-intelligence/snowplow-pseudonymization#monitoring) in the Snowplow Events pseudonymization service documentation.
+- [Sisense dashboard](https://app.periscopedata.com/app/gitlab/417669/Snowplow-Summary-Dashboard) provides information about the number of good and bad events imported into the Data Warehouse, in addition to the total number of imported Snowplow events.
+
+For more information, see this [video walk-through](https://www.youtube.com/watch?v=NxPS0aKa_oU).
+
+## Related topics
+
+- [Snowplow technology 101](https://github.com/snowplow/snowplow/#snowplow-technology-101)
+- [Snowplow pseudonymization AWS Lambda project](https://gitlab.com/gitlab-org/growth/product-intelligence/snowplow-pseudonymization)
+- [Product Intelligence Guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/)
+- [Data Infrastructure](https://about.gitlab.com/handbook/business-technology/data-team/platform/infrastructure/)
+- [Snowplow architecture overview (internal)](https://www.youtube.com/watch?v=eVYJjzspsLU)
+- [Snowplow architecture overview slide deck (internal)](https://docs.google.com/presentation/d/16gQEO5CAg8Tx4NBtfnZj-GF4juFI6HfEPWcZgH4Rn14/edit?usp=sharing)
+- [AWS Lambda implementation (internal)](https://youtu.be/cQd0mdMhkQA)
diff --git a/doc/development/sql.md b/doc/development/sql.md
index 4b6153b7205..8553e2a5500 100644
--- a/doc/development/sql.md
+++ b/doc/development/sql.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
@@ -9,7 +9,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w

This document describes various guidelines to follow when writing SQL queries,
either using ActiveRecord/Arel or raw SQL queries.

-## Using LIKE Statements
+## Using `LIKE` Statements

The most common way to search for data is using the `LIKE` statement.
For example, to get all issues with a title starting with "Draft:" you'd write the @@ -56,10 +56,10 @@ FROM issues WHERE (title ILIKE 'Draft:%' OR foo ILIKE 'Draft:%') ``` -## LIKE & Indexes +## `LIKE` & Indexes -PostgreSQL won't use any indexes when using `LIKE` / `ILIKE` with a wildcard at -the start. For example, this will not use any indexes: +PostgreSQL does not use any indexes when using `LIKE` / `ILIKE` with a wildcard at +the start. For example, this does not use any indexes: ```sql SELECT * @@ -145,7 +145,7 @@ The query: Project.select("path, user_id").joins(:merge_requests) # SELECT path, user_id FROM "projects" ... ``` -Later on, a new feature adds an extra column to the `projects` table: `user_id`. During deployment there might be a short time window where the database migration is already executed, but the new version of the application code is not deployed yet. When the query mentioned above executes during this period, the query will fail with the following error message: `PG::AmbiguousColumn: ERROR: column reference "user_id" is ambiguous` +Later on, a new feature adds an extra column to the `projects` table: `user_id`. During deployment there might be a short time window where the database migration is already executed, but the new version of the application code is not deployed yet. When the query mentioned above executes during this period, the query fails with the following error message: `PG::AmbiguousColumn: ERROR: column reference "user_id" is ambiguous` The problem is caused by the way the attributes are selected from the database. The `user_id` column is present in both the `users` and `merge_requests` tables. The query planner cannot decide which table to use when looking up the `user_id` column. @@ -210,7 +210,7 @@ Project.select(:path, :user_id).joins(:merge_requests) # SELECT "projects"."path", "user_id" FROM "projects" ... ``` -When a column list is given, ActiveRecord tries to match the arguments against the columns defined in the `projects` table and prepend the table name automatically. In this case, the `id` column is not going to be a problem, but the `user_id` column could return unexpected data: +When a column list is given, ActiveRecord tries to match the arguments against the columns defined in the `projects` table and prepend the table name automatically. In this case, the `id` column is not a problem, but the `user_id` column could return unexpected data: ```ruby Project.select(:id, :user_id).joins(:merge_requests) @@ -225,7 +225,7 @@ Project.select(:id, :user_id).joins(:merge_requests) ## Plucking IDs Never use ActiveRecord's `pluck` to pluck a set of values into memory only to -use them as an argument for another query. For example, this will execute an +use them as an argument for another query. For example, this executes an extra unnecessary database query and load a lot of unnecessary data into memory: ```ruby @@ -314,10 +314,10 @@ union = Gitlab::SQL::Union.new([projects, more_projects, ...]) Project.from("(#{union.to_sql}) projects") ``` -### Uneven columns in the UNION sub-queries +### Uneven columns in the `UNION` sub-queries -When the UNION query has uneven columns in the SELECT clauses, the database returns an error. -Consider the following UNION query: +When the `UNION` query has uneven columns in the `SELECT` clauses, the database returns an error. 
+Consider the following `UNION` query: ```sql SELECT id FROM users WHERE id = 1 @@ -333,7 +333,7 @@ each UNION query must have the same number of columns ``` This problem is apparent and it can be easily fixed during development. One edge-case is when -UNION queries are combined with explicit column listing where the list comes from the +`UNION` queries are combined with explicit column listing where the list comes from the `ActiveRecord` schema cache. Example (bad, avoid it): @@ -387,17 +387,17 @@ User.connection.execute(Gitlab::SQL::Union.new([scope1, scope2]).to_sql) When ordering records based on the time they were created, you can order by the `id` column instead of ordering by `created_at`. Because IDs are always -unique and incremented in the order that rows are created, doing so will produce the +unique and incremented in the order that rows are created, doing so produces the exact same results. This also means there's no need to add an index on `created_at` to ensure consistent performance as `id` is already indexed by default. -## Use WHERE EXISTS instead of WHERE IN +## Use `WHERE EXISTS` instead of `WHERE IN` While `WHERE IN` and `WHERE EXISTS` can be used to produce the same data it is recommended to use `WHERE EXISTS` whenever possible. While in many cases PostgreSQL can optimise `WHERE IN` quite well there are also many cases where -`WHERE EXISTS` will perform (much) better. +`WHERE EXISTS` performs (much) better. In Rails you have to use this by creating SQL fragments: @@ -446,7 +446,7 @@ method. This method differs from our `.safe_find_or_create_by` methods because it performs the `INSERT`, and then performs the `SELECT` commands only if that call fails. -If the `INSERT` fails, it will leave a dead tuple around and +If the `INSERT` fails, it leaves a dead tuple around and increment the primary key sequence (if any), among [other downsides](https://api.rubyonrails.org/classes/ActiveRecord/Relation.html#method-i-create_or_find_by). We prefer `.safe_find_or_create_by` if the common path is that we diff --git a/doc/development/swapping_tables.md b/doc/development/swapping_tables.md index cb038a3b85a..efb481ccf35 100644 --- a/doc/development/swapping_tables.md +++ b/doc/development/swapping_tables.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -10,12 +10,12 @@ Sometimes you need to replace one table with another. For example, when migrating data in a very large table it's often better to create a copy of the table and insert & migrate the data into this new table in the background. -Let's say you want to swap the table "events" with "events_for_migration". In +Let's say you want to swap the table `events` with `events_for_migration`. In this case you need to follow 3 steps: -1. Rename "events" to "events_temporary" -1. Rename "events_for_migration" to "events" -1. Rename "events_temporary" to "events_for_migration" +1. Rename `events` to `events_temporary` +1. Rename `events_for_migration` to `events` +1. Rename `events_temporary` to `events_for_migration` Rails allows you to do this using the `rename_table` method: @@ -27,7 +27,7 @@ rename_table :events_temporary, :events_for_migration This does not require any downtime as long as the 3 `rename_table` calls are executed in the _same_ database transaction. 
Rails by default uses database -transactions for migrations, but if it doesn't you'll need to start one +transactions for migrations, but if it doesn't you need to start one manually: ```ruby @@ -45,7 +45,7 @@ PostgreSQL you can use the `reset_pk_sequence!` method like so: reset_pk_sequence!('events') ``` -Failure to reset the primary keys will result in newly created rows starting +Failure to reset the primary keys results in newly created rows starting with an ID value of 1. Depending on the existing data this can then lead to duplicate key constraints from popping up, preventing users from creating new data. diff --git a/doc/development/testing_guide/best_practices.md b/doc/development/testing_guide/best_practices.md index 7ae49d33e91..eda1c8c3d10 100644 --- a/doc/development/testing_guide/best_practices.md +++ b/doc/development/testing_guide/best_practices.md @@ -215,6 +215,39 @@ In this case, the `total time` and `top-level time` numbers match more closely: 8 8 0.0477s 0.0477s 0.0477s namespace ``` +#### Stubbing methods within factories + +You should avoid using `allow(object).to receive(:method)` in factories, as this makes the factory unable to be used with `let_it_be`. + +Instead, you can use `stub_method` to stub the method: + +```ruby + before(:create) do |user, evaluator| + # Stub a method. + stub_method(user, :some_method) { 'stubbed!' } + # Or with arguments, including named ones + stub_method(user, :some_method) { |var1| "Returning #{var1}!" } + stub_method(user, :some_method) { |var1: 'default'| "Returning #{var1}!" } + end + + # Un-stub the method. + # This may be useful where the stubbed object is created with `let_it_be` + # and you want to reset the method between tests. + after(:create) do |user, evaluator| + restore_original_method(user, :some_method) + # or + restore_original_methods(user) + end +``` + +NOTE: +`stub_method` does not work when used in conjunction with `let_it_be_with_refind`. This is because `stub_method` will stub a method on an instance and `let_it_be_with_refind` will create a new instance of the object for each run. + +`stub_method` does not support method existence and method arity checks. + +WARNING: +`stub_method` is supposed to be used in factories only. It's strongly discouraged to be used elsewhere. Please consider using [RSpec's mocks](https://relishapp.com/rspec/rspec-mocks/v/3-10/docs/basics) if available. + #### Identify slow tests Running a spec with profiling is a good way to start optimizing a spec. This can @@ -981,7 +1014,7 @@ is used to delete data in all indices in between examples to ensure a clean inde Note that Elasticsearch indexing uses [`Gitlab::Redis::SharedState`](../../../ee/development/redis.md#gitlabrediscachesharedstatequeues). Therefore, the Elasticsearch traits dynamically use the `:clean_gitlab_redis_shared_state` trait. -You do NOT need to add `:clean_gitlab_redis_shared_state` manually. +You do not need to add `:clean_gitlab_redis_shared_state` manually. Specs using Elasticsearch require that you: @@ -1305,7 +1338,7 @@ GitLab uses [factory_bot](https://github.com/thoughtbot/factory_bot) as a test f See [issue #262624](https://gitlab.com/gitlab-org/gitlab/-/issues/262624) for further context. - Factories don't have to be limited to `ActiveRecord` objects. [See example](https://gitlab.com/gitlab-org/gitlab-foss/commit/0b8cefd3b2385a21cfed779bd659978c0402766d). 
-- Factories and their traits should produce valid objects that are [verified by specs](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/factories_spec.rb). +- Factories and their traits should produce valid objects that are [verified by specs](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/models/factories_spec.rb). - Avoid the use of [`skip_callback`](https://api.rubyonrails.org/classes/ActiveSupport/Callbacks/ClassMethods.html#method-i-skip_callback) in factories. See [issue #247865](https://gitlab.com/gitlab-org/gitlab/-/issues/247865) for details. diff --git a/doc/development/testing_guide/contract/consumer_tests.md b/doc/development/testing_guide/contract/consumer_tests.md new file mode 100644 index 00000000000..b4d6882a655 --- /dev/null +++ b/doc/development/testing_guide/contract/consumer_tests.md @@ -0,0 +1,308 @@ +--- +stage: none +group: Development +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Writing consumer tests + +This tutorial guides you through writing a consumer test from scratch. To start, the consumer tests are written using [`jest-pact`](https://github.com/pact-foundation/jest-pact) that builds on top of [`pact-js`](https://github.com/pact-foundation/pact-js). This tutorial shows you how to write a consumer test for the `/discussions.json` endpoint, which is actually `/:namespace_name/:project_name/-/merge_requests/:id/discussions.json`. + +## Create the skeleton + +Start by creating the skeleton of a consumer test. Create a file under `spec/contracts/consumer/specs` called `discussions.spec.js`. +Then, populate it with the following function and parameters: + +- [`pactWith`](#the-pactwith-function) +- [`PactOptions`](#the-pactoptions-parameter) +- [`PactFn`](#the-pactfn-parameter) + +### The `pactWith` function + +The Pact consumer test is defined through the `pactWith` function that takes `PactOptions` and the `PactFn`. + +```javascript +const { pactWith } = require('jest-pact'); + +pactWith(PactOptions, PactFn); +``` + +### The `PactOptions` parameter + +`PactOptions` with `jest-pact` introduces [additional options](https://github.com/pact-foundation/jest-pact/blob/dce370c1ab4b7cb5dff12c4b62246dc229c53d0e/README.md#defaults) that build on top of the ones [provided in `pact-js`](https://github.com/pact-foundation/pact-js#constructor). In most cases, you define the `consumer`, `provider`, `log`, and `dir` options for these tests. + +```javascript +const { pactWith } = require('jest-pact'); + +pactWith( + { + consumer: 'Merge Request Page', + provider: 'Merge Request Discussions Endpoint', + log: '../logs/consumer.log', + dir: '../contracts', + }, + PactFn +); +``` + +### The `PactFn` parameter + +The `PactFn` is where your tests are defined. This is where you set up the mock provider and where you can use the standard Jest methods like [`Jest.describe`](https://jestjs.io/docs/api#describename-fn), [`Jest.beforeEach`](https://jestjs.io/docs/api#beforeeachfn-timeout), and [`Jest.it`](https://jestjs.io/docs/api#testname-fn-timeout). For more information, see [https://jestjs.io/docs/api](https://jestjs.io/docs/api). 
+ +```javascript +const { pactWith } = require('jest-pact'); + +pactWith( + { + consumer: 'Merge Request Page', + provider: 'Merge Request Discussions Endpoint', + log: '../logs/consumer.log', + dir: '../contracts', + }, + + (provider) => { + describe('Discussions Endpoint', () => { + beforeEach(() => { + + }); + + it('return a successful body', () => { + + }); + }); + }, +); +``` + +## Set up the mock provider + +Before you run your test, set up the mock provider that handles the specified requests and returns a specified response. To do that, define the state and the expected request and response in an [`Interaction`](https://github.com/pact-foundation/pact-js/blob/master/src/dsl/interaction.ts). + +For this tutorial, define four attributes for the `Interaction`: + +1. `state`: A description of what the prerequisite state is before the request is made. +1. `uponReceiving`: A description of what kind of request this `Interaction` is handling. +1. `withRequest`: Where you define the request specifications. It contains the request `method`, `path`, and any `headers`, `body`, or `query`. +1. `willRespondWith`: Where you define the expected response. It contains the response `status`, `headers`, and `body`. + +After you define the `Interaction`, add that interaction to the mock provider by calling `addInteraction`. + +```javascript +const { pactWith } = require('jest-pact'); +const { Matchers } = require('@pact-foundation/pact'); + +pactWith( + { + consumer: 'Merge Request Page', + provider: 'Merge Request Discussions Endpoint', + log: '../logs/consumer.log', + dir: '../contracts', + }, + + (provider) => { + describe('Discussions Endpoint', () => { + beforeEach(() => { + const interaction = { + state: 'a merge request with discussions exists', + uponReceiving: 'a request for discussions', + withRequest: { + method: 'GET', + path: '/gitlab-org/gitlab-qa/-/merge_requests/1/discussions.json', + headers: { + Accept: '*/*', + }, + }, + willRespondWith: { + status: 200, + headers: { + 'Content-Type': 'application/json; charset=utf-8', + }, + body: Matchers.eachLike({ + id: Matchers.string('fd73763cbcbf7b29eb8765d969a38f7d735e222a'), + project_id: Matchers.integer(6954442), + ... + resolved: Matchers.boolean(true) + }), + }, + }; + provider.addInteraction(interaction); + }); + + it('return a successful body', () => { + + }); + }); + }, +); +``` + +### Response body `Matchers` + +Notice how we use `Matchers` in the `body` of the expected response. This allows us to be flexible enough to accept different values but still be strict enough to distinguish between valid and invalid values. We must ensure that we have a tight definition that is neither too strict nor too lax. Read more about the [different types of `Matchers`](https://github.com/pact-foundation/pact-js#using-the-v3-matching-rules). + +## Write the test + +After the mock provider is set up, you can write the test. For this test, you make a request and expect a particular response. + +First, set up the client that makes the API request. To do that, either create or find an existing file under `spec/contracts/consumer/endpoints` and add the following API request. 
+ +```javascript +const axios = require('axios'); + +exports.getDiscussions = (endpoint) => { + const url = endpoint.url; + + return axios + .request({ + method: 'GET', + baseURL: url, + url: '/gitlab-org/gitlab-qa/-/merge_requests/1/discussions.json', + headers: { Accept: '*/*' }, + }) + .then((response) => response.data); +}; +``` + +After that's set up, import it to the test file and call it to make the request. Then, you can make the request and define your expectations. + +```javascript +const { pactWith } = require('jest-pact'); +const { Matchers } = require('@pact-foundation/pact'); + +const { getDiscussions } = require('../endpoints/merge_requests'); + +pactWith( + { + consumer: 'Merge Request Page', + provider: 'Merge Request Discussions Endpoint', + log: '../logs/consumer.log', + dir: '../contracts', + }, + + (provider) => { + describe('Discussions Endpoint', () => { + beforeEach(() => { + const interaction = { + state: 'a merge request with discussions exists', + uponReceiving: 'a request for discussions', + withRequest: { + method: 'GET', + path: '/gitlab-org/gitlab-qa/-/merge_requests/1/discussions.json', + headers: { + Accept: '*/*', + }, + }, + willRespondWith: { + status: 200, + headers: { + 'Content-Type': 'application/json; charset=utf-8', + }, + body: Matchers.eachLike({ + id: Matchers.string('fd73763cbcbf7b29eb8765d969a38f7d735e222a'), + project_id: Matchers.integer(6954442), + ... + resolved: Matchers.boolean(true) + }), + }, + }; + }); + + it('return a successful body', () => { + return getDiscussions({ + url: provider.mockService.baseUrl, + }).then((discussions) => { + expect(discussions).toEqual(Matchers.eachLike({ + id: 'fd73763cbcbf7b29eb8765d969a38f7d735e222a', + project_id: 6954442, + ... + resolved: true + })); + }); + }); + }); + }, +); +``` + +There we have it! The consumer test is now set up. You can now try [running this test](index.md#run-the-consumer-tests). + +## Improve test readability + +As you may have noticed, the request and response definitions can get large. This results in the test being difficult to read, with a lot of scrolling to find what you want. You can make the test easier to read by extracting these out to a `fixture`. + +Create a file under `spec/contracts/consumer/fixtures` called `discussions.fixture.js`. You place the `request` and `response` definitions here. + +```javascript +const { Matchers } = require('@pact-foundation/pact'); + +const body = Matchers.eachLike({ + id: Matchers.string('fd73763cbcbf7b29eb8765d969a38f7d735e222a'), + project_id: Matchers.integer(6954442), + ... 
+  resolved: Matchers.boolean(true)
+});
+
+const Discussions = {
+  body: Matchers.extractPayload(body),
+
+  success: {
+    status: 200,
+    headers: {
+      'Content-Type': 'application/json; charset=utf-8',
+    },
+    body: body,
+  },
+
+  request: {
+    uponReceiving: 'a request for discussions',
+    withRequest: {
+      method: 'GET',
+      path: '/gitlab-org/gitlab-qa/-/merge_requests/1/discussions.json',
+      headers: {
+        Accept: '*/*',
+      },
+    },
+  },
+};
+
+exports.Discussions = Discussions;
+```
+
+With all of that moved to the `fixture`, you can simplify the test to the following:
+
+```javascript
+const { pactWith } = require('jest-pact');
+
+const { Discussions } = require('../fixtures/discussions.fixture');
+const { getDiscussions } = require('../endpoints/merge_requests');
+
+pactWith(
+  {
+    consumer: 'Merge Request Page',
+    provider: 'Merge Request Discussions Endpoint',
+    log: '../logs/consumer.log',
+    dir: '../contracts',
+  },
+
+  (provider) => {
+    describe('Discussions Endpoint', () => {
+      beforeEach(() => {
+        const interaction = {
+          state: 'a merge request with discussions exists',
+          ...Discussions.request,
+          willRespondWith: Discussions.success,
+        };
+        return provider.addInteraction(interaction);
+      });
+
+      it('return a successful body', () => {
+        return getDiscussions({
+          url: provider.mockService.baseUrl,
+        }).then((discussions) => {
+          expect(discussions).toEqual(Discussions.body);
+        });
+      });
+    });
+  },
+);
+```
diff --git a/doc/development/testing_guide/contract/index.md b/doc/development/testing_guide/contract/index.md
new file mode 100644
index 00000000000..6556bd85624
--- /dev/null
+++ b/doc/development/testing_guide/contract/index.md
@@ -0,0 +1,39 @@
+---
+stage: none
+group: Development
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
+---
+
+# Contract testing
+
+Contract tests consist of two parts: consumer tests and provider tests. A simple example of a consumer and provider relationship is between the frontend and backend. The frontend is the consumer and the backend is the provider. The frontend consumes the API that is provided by the backend. The tests help ensure that these two sides follow an agreed-upon contract, and any divergence from the contract triggers a meaningful conversation to prevent breaking changes from slipping through.
+
+Consumer tests are similar to unit tests, with each spec defining a request and an expected mock response and creating a contract based on those definitions. Provider tests, on the other hand, are similar to integration tests, as each spec takes the request defined in the contract and runs it against the actual service; the response is then matched against the contract to validate it.
+
+You can check out the existing contract tests at:
+
+- [`spec/contracts/consumer/specs`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/spec/contracts/consumer/specs) for the consumer tests.
+- [`spec/contracts/provider/specs`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/spec/contracts/provider/specs) for the provider tests.
+
+The contracts themselves are stored in [`/spec/contracts/contracts`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/spec/contracts/contracts) at the moment. The plan is to use [PactBroker](https://docs.pact.io/pact_broker/docker_images) hosted in AWS or another similar service.
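+
+For orientation, a contract file records the consumer, the provider, and their expected interactions as JSON. A minimal sketch of what such a file might contain, using the names from the tutorials (illustrative values only, not a verbatim GitLab contract):
+
+```json
+{
+  "consumer": { "name": "Merge Request Page" },
+  "provider": { "name": "Merge Request Discussions Endpoint" },
+  "interactions": [
+    {
+      "description": "a request for discussions",
+      "providerState": "a merge request with discussions exists",
+      "request": {
+        "method": "GET",
+        "path": "/gitlab-org/gitlab-qa/-/merge_requests/1/discussions.json"
+      },
+      "response": { "status": 200 }
+    }
+  ],
+  "metadata": { "pactSpecification": { "version": "2.0.0" } }
+}
+```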
+ +## Write the tests + +- [Writing consumer tests](consumer_tests.md) +- [Writing provider tests](provider_tests.md) + +### Run the consumer tests + +Before running the consumer tests, go to `spec/contracts/consumer` and run `npm install`. To run all the consumer tests, you just need to run `npm test -- /specs`. Otherwise, to run a specific spec file, replace `/specs` with the specific spec filename. + +### Run the provider tests + +Before running the provider tests, make sure your GDK (GitLab Development Kit) is fully set up and running. You can follow the setup instructions detailed in the [GDK repository](https://gitlab.com/gitlab-org/gitlab-development-kit/-/tree/main). To run the provider tests, you use Rake tasks that are defined in [`./lib/tasks/contracts.rake`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/tasks/contracts.rake). To get a list of all the Rake tasks related to the provider tests, run `bundle exec rake -T contracts`. For example: + +```shell +$ bundle exec rake -T contracts +rake contracts:mr:pact:verify:diffs # Verify provider against the consumer pacts for diffs +rake contracts:mr:pact:verify:discussions # Verify provider against the consumer pacts for discussions +rake contracts:mr:pact:verify:metadata # Verify provider against the consumer pacts for metadata +rake contracts:mr:test:merge_request[contract_mr] # Run all merge request contract tests +``` diff --git a/doc/development/testing_guide/contract/provider_tests.md b/doc/development/testing_guide/contract/provider_tests.md new file mode 100644 index 00000000000..0da5bcb4aef --- /dev/null +++ b/doc/development/testing_guide/contract/provider_tests.md @@ -0,0 +1,177 @@ +--- +stage: none +group: Development +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Writing provider tests + +This tutorial guides you through writing a provider test from scratch. It is a continuation of the [consumer test tutorial](consumer_tests.md). To start, the provider tests are written using [`pact-ruby`](https://github.com/pact-foundation/pact-ruby). In this tutorial, you write a provider test that addresses the contract generated by `discussions.spec.js`. + +## Create the skeleton + +Provider tests are quite simple. The goal is to set up the test data and then link that with the corresponding contract. Start by creating a file called `discussions_helper.rb` under `spec/contracts/provider/specs`. Note that the files are called `helpers` to match how they are called by Pact in the Rake tasks, which are set up at the end of this tutorial. + +### The `service_provider` block + +The `service_provider` block is where the provider test is defined. For this block, put in a description of the service provider. Name it exactly as it is called in the contracts that are derived from the consumer tests. + +```ruby +require_relative '../spec_helper' + +module Provider + module DiscussionsHelper + Pact.service_provider 'Merge Request Discussions Endpoint' do + + end + end +end +``` + +### The `honours_pact_with` block + +The `honours_pact_with` block describes which consumer this provider test is addressing. Similar to the `service_provider` block, name this exactly the same as it's called in the contracts that are derived from the consumer tests. 
+
+```ruby
+require_relative '../spec_helper'
+
+module Provider
+  module DiscussionsHelper
+    Pact.service_provider 'Merge Request Discussions Endpoint' do
+      honours_pact_with 'Merge Request Page' do
+
+      end
+    end
+  end
+end
+```
+
+## Configure the test app
+
+For the provider tests to verify the contracts, you must hook them up to a test app that makes the actual request and returns a response to verify against the contract. To do this, configure the `app` the test uses as `Environment::Test.app`, which is defined in [`spec/contracts/provider/environments/test.rb`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/spec/contracts/provider/environments/test.rb).
+
+```ruby
+require_relative '../spec_helper'
+
+module Provider
+  module DiscussionsHelper
+    Pact.service_provider 'Merge Request Discussions Endpoint' do
+      app { Environment::Test.app }
+
+      honours_pact_with 'Merge Request Page' do
+
+      end
+    end
+  end
+end
+```
+
+## Define the contract to verify
+
+Now that the test app is configured, all that is left is to define which contract this provider test is verifying. To do this, set the `pact_uri`.
+
+```ruby
+require_relative '../spec_helper'
+
+module Provider
+  module DiscussionsHelper
+    Pact.service_provider 'Merge Request Discussions Endpoint' do
+      app { Environment::Test.app }
+
+      honours_pact_with 'Merge Request Page' do
+        pact_uri '../contracts/merge_request_page-merge_request_discussions_endpoint.json'
+      end
+    end
+  end
+end
+```
+
+## Add / update the Rake tasks
+
+Now that you have a test created, you must create Rake tasks that run this test. The Rake tasks are defined in [`lib/tasks/contracts.rake`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/lib/tasks/contracts.rake), where we have individual Rake tasks to run individual specs, but also Rake tasks that run a group of tests.
+
+Under the `contracts:mr` namespace, introduce the Rake task to run this new test specifically. In it, call `pact.uri` to define the location of the contract and the provider test that tests that contract. Notice here that `pact.uri` takes a parameter called `pact_helper`. This is why the provider tests are called `_helper.rb`.
+
+```ruby
+Pact::VerificationTask.new(:discussions) do |pact|
+  pact.uri(
+    "#{contracts}/contracts/merge_request_page-merge_request_discussions_endpoint.json",
+    pact_helper: "#{provider}/specs/discussions_helper.rb"
+  )
+end
+```
+
+At the same time, add your new `:discussions` Rake task to the `test:merge_request` Rake task. In that Rake task, there is an array defined (`%w[metadata diffs]`). You must add `discussions` to that list.
+
+## Create test data
+
+As the last step, create the test data that allows the provider test to return the contract's expected response. You might wonder why you create the test data last. It's really a matter of preference: with the test already configured, you can easily run it to verify that all the necessary test data is created to produce the expected response.
+
+You can read more about [provider states](https://docs.pact.io/implementation_guides/ruby/provider_states). Provider states can be global, but for this tutorial the provider state applies to one specific `state`.
+
+To create the test data, create `discussions_state.rb` under `spec/contracts/provider/states`. As a quick aside, make sure to also import this state file in the `discussions_helper.rb` file, as shown in the sketch below.
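+
+A minimal sketch of that import in `discussions_helper.rb` (the `require_relative` path for the state file is an assumption based on the directory layout above):
+
+```ruby
+# spec/contracts/provider/specs/discussions_helper.rb
+require_relative '../spec_helper'
+# Load the provider state definitions created in this section (path assumed).
+require_relative '../states/discussions_state'
+```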
+
+### Default user in `spec/contracts/provider/spec_helper.rb`
+
+Before you create the test data, note that a default user is created in the [`spec_helper`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/spec/contracts/provider/spec_helper.rb); this is the user used for the test runs. The user is configured using `RSpec.configure`, because Pact is built on top of RSpec. This step allows us to configure the user before any of the tests run.
+
+```ruby
+RSpec.configure do |config|
+  config.include Devise::Test::IntegrationHelpers
+  config.before do
+    user = FactoryBot.create(:user, name: "Contract Test").tap do |user|
+      user.current_sign_in_at = Time.current
+    end
+    sign_in user
+  end
+end
+```
+
+Any further modifications to the user that are needed can be made through the individual provider state files.
+
+### The `provider_states_for` block
+
+In the state file, you must define which consumer this provider state is for. You can do that with `provider_states_for`. Make sure that the `name` provided matches the name defined for the consumer.
+
+```ruby
+Pact.provider_states_for 'Merge Request Page' do
+end
+```
+
+### The `provider_state` block
+
+In the `provider_states_for` block, you then define the state the test data is for. These states are also defined in the consumer test. In this case, the state is `'a merge request with discussions exists'`.
+
+```ruby
+Pact.provider_states_for "Merge Request Page" do
+  provider_state "a merge request with discussions exists" do
+
+  end
+end
+```
+
+### The `set_up` block
+
+This is where you define the test data creation steps. Use `FactoryBot` to create the data. As you create the test data, you can keep [running the provider test](index.md#run-the-provider-tests) to check on the status of the test and figure out what else is missing in your data setup.
+
+```ruby
+Pact.provider_states_for "Merge Request Page" do
+  provider_state "a merge request with discussions exists" do
+    set_up do
+      user = User.find_by(name: Provider::UsersHelper::CONTRACT_USER_NAME)
+      namespace = create(:namespace, name: 'gitlab-org')
+      project = create(:project, name: 'gitlab-qa', namespace: namespace)
+
+      project.add_maintainer(user)
+
+      merge_request = create(:merge_request_with_diffs, id: 1, source_project: project, author: user)
+
+      create(:discussion_note_on_merge_request, noteable: merge_request, project: project, author: user)
+    end
+  end
+end
+```
+
+Note that `Provider::UsersHelper::CONTRACT_USER_NAME`, used here to fetch the user, comes from the [`spec_helper`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/spec/contracts/provider/spec_helper.rb), which sets up that user before any of these tests run.
+
+And with that, the provider tests for `discussions_helper.rb` should now pass.
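+
+As a quick check, you can run the new provider test through the Rake task introduced earlier in this tutorial:
+
+```shell
+bundle exec rake contracts:mr:pact:verify:discussions
+```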
diff --git a/doc/development/testing_guide/end_to_end/beginners_guide.md b/doc/development/testing_guide/end_to_end/beginners_guide.md index 39a3e3445ea..a13011d0101 100644 --- a/doc/development/testing_guide/end_to_end/beginners_guide.md +++ b/doc/development/testing_guide/end_to_end/beginners_guide.md @@ -343,8 +343,8 @@ Before running the spec, make sure that: To run the spec, run the following command: -```ruby -GITLAB_PASSWORD=<GDK root password> bundle exec bin/qa Test::Instance::All http://localhost:3000 -- <test_file> +```shell +GITLAB_PASSWORD=<GDK root password> bundle exec rspec <test_file> ``` Where `<test_file>` is: @@ -352,6 +352,8 @@ Where `<test_file>` is: - `qa/specs/features/browser_ui/1_manage/login/log_in_spec.rb` when running the Login example. - `qa/specs/features/browser_ui/2_plan/issue/create_issue_spec.rb` when running the Issue example. +Additional information on test execution and possible options are described in ["QA framework README"](https://gitlab.com/gitlab-org/gitlab/-/blob/master/qa/README.md#run-the-end-to-end-tests-in-a-local-development-environment) + ## End-to-end test merge request template When submitting a new end-to-end test, use the ["New End to End Test"](https://gitlab.com/gitlab-org/gitlab/-/blob/master/.gitlab/merge_request_templates/New%20End%20To%20End%20Test.md) diff --git a/doc/development/testing_guide/end_to_end/best_practices.md b/doc/development/testing_guide/end_to_end/best_practices.md index bd9896934c7..85f8beeacad 100644 --- a/doc/development/testing_guide/end_to_end/best_practices.md +++ b/doc/development/testing_guide/end_to_end/best_practices.md @@ -189,9 +189,9 @@ Attach the `:aggregate_failures` metadata to the example if multiple expectation it 'searches', :aggregate_failures do Page::Search::Results.perform do |search| expect(search).to have_file_in_project(template[:file_name], project.name) - + search.switch_to_code - + expect(search).to have_file_with_content(template[:file_name], content[0..33]) end end @@ -208,6 +208,54 @@ it 'searches' do end ``` +## Avoid multiple actions in `expect do ... raise_error` blocks + +When you wrap multiple actions in a single `expect do ... end.not_to raise_error` or `expect do ... end.to raise_error` block, +it can be hard to debug the actual cause of the failure, because of how the logs are printed. Important information can be truncated +or missing altogether. + +For example, if you encapsulate some actions and expectations in a private method in the test, like `expect_owner_permissions_allow_delete_issue`: + +```ruby +it "has Owner role with Owner permissions" do + Page::Dashboard::Projects.perform do |projects| + projects.filter_by_name(project.name) + + expect(projects).to have_project_with_access_role(project.name, 'Owner') + end + + expect_owner_permissions_allow_delete_issue +end +``` + +Then, in the method itself: + +```ruby +#=> Good +def expect_owner_permissions_allow_delete_issue + issue.visit! + + Page::Project::Issue::Show.perform(&:delete_issue) + + Page::Project::Issue::Index.perform do |index| + expect(index).not_to have_issue(issue) + end +end + +#=> Bad +def expect_owner_permissions_allow_delete_issue + expect do + issue.visit! + + Page::Project::Issue::Show.perform(&:delete_issue) + + Page::Project::Issue::Index.perform do |index| + expect(index).not_to have_issue(issue) + end + end.not_to raise_error +end +``` + ## Prefer to split tests across multiple files Our framework includes a couple of parallelization mechanisms that work by executing spec files in parallel. 
diff --git a/doc/development/testing_guide/end_to_end/capybara_to_chemlab_migration_guide.md b/doc/development/testing_guide/end_to_end/capybara_to_chemlab_migration_guide.md index 9c7e0ef73a8..a71e076b57f 100644 --- a/doc/development/testing_guide/end_to_end/capybara_to_chemlab_migration_guide.md +++ b/doc/development/testing_guide/end_to_end/capybara_to_chemlab_migration_guide.md @@ -12,7 +12,7 @@ Given the view: ```html <form id="my-form"> - <label for="first-name">First name</label> + <label for="first-name">First name</label> <input type="text" name="first-name" data-qa-selector="first_name" /> <label for="last-name">Last name</label> @@ -26,7 +26,7 @@ Given the view: <label for="password">Password</label> <input type="password" name="password" data-qa-selector="password" /> - + <input type="submit" value="Continue" data-qa-selector="continue"/> </form> ``` diff --git a/doc/development/testing_guide/end_to_end/feature_flags.md b/doc/development/testing_guide/end_to_end/feature_flags.md index b4ec9e8ccd3..cb4c8e8a6e8 100644 --- a/doc/development/testing_guide/end_to_end/feature_flags.md +++ b/doc/development/testing_guide/end_to_end/feature_flags.md @@ -30,7 +30,7 @@ feature flag is under test. - Format: `feature_flag: { name: 'feature_flag_name', scope: :project }` - When `scope` is set to `:global`, the test will be **skipped on all live .com environments**. This is to avoid issues with feature flag changes affecting other tests or users on that environment. -- When `scope` is set to any other value (such as `:project`, `:group` or `:user`), or if no `scope` is specified, the test will only be **skipped on canary and production**. +- When `scope` is set to any other value (such as `:project`, `:group` or `:user`), or if no `scope` is specified, the test will only be **skipped on canary, production, and preprod**. This is due to the fact that administrator access is not available there. **WARNING:** You are strongly advised to first try and [enable feature flags only for a group, project, user](../../feature_flags/index.md#feature-actors), @@ -192,10 +192,26 @@ End-to-end tests should pass with a feature flag enabled before it is enabled on ### Automatic test execution when a feature flag definition changes -If a merge request adds or edits a [feature flag definition file](../../feature_flags/index.md#feature-flag-definition-and-validation), -two `package-and-qa` jobs will be included automatically in the merge request pipeline. One job will enable the defined -feature flag and the other will disable it. The jobs execute the same suite of tests to confirm that they pass with if -the feature flag is either enabled or disabled. +There are two ways to confirm that end-to-end tests pass: + +- If a merge request adds or edits a [feature flag definition file](../../feature_flags/index.md#feature-flag-definition-and-validation), + two `package-and-qa` jobs (`package-and-qa-ff-enabled` and `package-and-qa-ff-disabled`) are included automatically in the merge request + pipeline. One job enables the defined feature flag and the other job disables it. The jobs execute the same suite of tests to confirm + that they pass with the feature flag either enabled or disabled. +- In some cases, if `package-and-qa` hasn't been triggered automatically, or if it has run the tests with the default feature flag values + (which might not be desired), you can create a Draft MR that enables the feature flag to ensure that all E2E tests pass with the feature + flag enabled. 
+ +### Troubleshooting end-to-end test failures with feature flag enabled + +If enabling the feature flag results in E2E test failures, you can browse the artifacts in the failed pipeline to see screenshots of the failed tests. After which, you can either: + +- Identify tests that need to be updated and contact the relevant [counterpart Software Engineer in Test](https://about.gitlab.com/handbook/engineering/quality/#individual-contributors) responsible for updating the tests or assisting another engineer to do so. However, if a change does not go through [quad-planning](https://about.gitlab.com/handbook/engineering/quality/quality-engineering/quad-planning/) and a required test update is not made, test failures could block deployment. +- Run the failed tests [locally](https://gitlab.com/gitlab-org/gitlab/-/tree/master/qa#run-the-end-to-end-tests-in-a-local-development-environment) + with the [feature flag enabled](https://gitlab.com/gitlab-org/gitlab/-/tree/master/qa#running-tests-with-a-feature-flag-enabled-or-disabled). + This option requires considerable amount of setup, but you'll be able to see what the browser is doing as it's running the failed + tests, which can help debug the problem faster. You can also refer to the [Troubleshooting Guide for E2E tests](troubleshooting.md) for + support for common blockers. ### Test execution during feature development diff --git a/doc/development/testing_guide/end_to_end/page_objects.md b/doc/development/testing_guide/end_to_end/page_objects.md index 85ab4d479f9..c93e8c9d13f 100644 --- a/doc/development/testing_guide/end_to_end/page_objects.md +++ b/doc/development/testing_guide/end_to_end/page_objects.md @@ -150,7 +150,7 @@ In our case, `data-qa-selector="login_field"`, `data-qa-selector="password_field ```haml = f.text_field :login, class: "form-control top", autofocus: "autofocus", autocapitalize: "off", autocorrect: "off", required: true, title: "This field is required.", data: { qa_selector: 'login_field' } = f.password_field :password, class: "form-control bottom", required: true, title: "This field is required.", data: { qa_selector: 'password_field' } -= f.submit "Sign in", class: "btn btn-success", data: { qa_selector: 'sign_in_button' } += f.submit "Sign in", class: "btn btn-confirm", data: { qa_selector: 'sign_in_button' } ``` Things to note: diff --git a/doc/development/testing_guide/end_to_end/resources.md b/doc/development/testing_guide/end_to_end/resources.md index e2b005d8a1b..dacc428aec6 100644 --- a/doc/development/testing_guide/end_to_end/resources.md +++ b/doc/development/testing_guide/end_to_end/resources.md @@ -531,7 +531,7 @@ When you implement a new type of reusable resource there are two `private` metho can be validated. They are: - `reference_resource`: creates a new instance of the resource that can be compared with the one that was used during the tests. -- `unique_identifiers`: returns an array of attributes that allow the resource to be identified (e.g., name) and that are therefore +- `unique_identifiers`: returns an array of attributes that allow the resource to be identified (for example, name) and that are therefore expected to differ when comparing the reference resource with the resource reused in the tests. The following example shows the implementation of those two methods in `QA::Resource::ReusableProject`. 
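A simplified sketch, with assumed attribute names rather than the verbatim implementation:

```ruby
private

# Fabricate a fresh project via the API so it can be compared with the
# project instance that was reused throughout the tests.
def reference_resource
  Resource::Project.fabricate_via_api! do |project|
    project.name = 'reference_project' # assumed value for the sketch
  end
end

# Attributes that identify the resource (for example, the name) and are
# therefore expected to differ between the reference resource and the
# reused resource.
def unique_identifiers
  [:name, :path]
end
```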
diff --git a/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md b/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md index 0163f2e648c..591d03db7b8 100644 --- a/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md +++ b/doc/development/testing_guide/end_to_end/rspec_metadata_tests.md @@ -15,13 +15,14 @@ This is a partial list of the [RSpec metadata](https://relishapp.com/rspec/rspec |-----------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | `:elasticsearch` | The test requires an Elasticsearch service. It is used by the [instance-level scenario](https://gitlab.com/gitlab-org/gitlab-qa#definitions) [`Test::Integration::Elasticsearch`](https://gitlab.com/gitlab-org/gitlab/-/blob/72b62b51bdf513e2936301cb6c7c91ec27c35b4d/qa/qa/ee/scenario/test/integration/elasticsearch.rb) to include only tests that require Elasticsearch. | | `:except` | The test is to be run in their typical execution contexts _except_ as specified. See [test execution context selection](execution_context_selection.md) for more information. | -| `:feature_flag` | The test uses a feature flag and therefore requires an administrator account to run. When `scope` is set to `:global`, the test will be skipped on all live .com environments. Otherwise, it will be skipped only on Canary and Production. See [testing with feature flags](../../../development/testing_guide/end_to_end/feature_flags.md) for more details. | +| `:feature_flag` | The test uses a feature flag and therefore requires an administrator account to run. When `scope` is set to `:global`, the test will be skipped on all live .com environments. Otherwise, it will be skipped only on Canary, Production, and Preprod. See [testing with feature flags](../../../development/testing_guide/end_to_end/feature_flags.md) for more details. | | `:geo` | The test requires two GitLab Geo instances - a primary and a secondary - to be spun up. | | `:gitaly_cluster` | The test runs against a GitLab instance where repositories are stored on redundant Gitaly nodes behind a Praefect node. All nodes are [separate containers](../../../administration/gitaly/praefect.md#requirements). Tests that use this tag have a longer setup time since there are three additional containers that need to be started. | | `:github` | The test requires a GitHub personal access token. | | `:group_saml` | The test requires a GitLab instance that has SAML SSO enabled at the group level. Interacts with an external SAML identity provider. Paired with the `:orchestrated` tag. | | `:instance_saml` | The test requires a GitLab instance that has SAML SSO enabled at the instance level. Interacts with an external SAML identity provider. Paired with the `:orchestrated` tag. | | `:integrations` | This aims to test the available [integrations](../../../user/project/integrations/index.md#available-integrations). The test requires Docker to be installed in the run context. 
It will provision the containers and can be run against a local instance or using the `gitlab-qa` scenario `Test::Integration::Integrations` | +| `:issue`, `:issue_${num}` | Optional links to issues which might be related to the spec. Helps keep track of related issues and can also be used by tools that create test reports. Currently added automatically to `Allure` test report. Multiple tags can be used by adding an optional numeric suffix like `issue_1`, `issue_2` etc. | | `:service_ping_disabled` | The test interacts with the GitLab configuration service ping at the instance level to turn Admin Area setting service ping checkbox on or off. This tag will have the test run only in the `service_ping_disabled` job and must be paired with the `:orchestrated` and `:requires_admin` tags. | | `:jira` | The test requires a Jira Server. [GitLab-QA](https://gitlab.com/gitlab-org/gitlab-qa) provisions the Jira Server in a Docker container when the `Test::Integration::Jira` test scenario is run. | | `:kubernetes` | The test includes a GitLab instance that is configured to be run behind an SSH tunnel, allowing a TLS-accessible GitLab. This test also includes provisioning of at least one Kubernetes cluster to test against. _This tag is often be paired with `:orchestrated`._ | @@ -42,6 +43,7 @@ This is a partial list of the [RSpec metadata](https://relishapp.com/rspec/rspec | `:requires_git_protocol_v2` | The test requires that Git protocol version 2 is enabled on the server. It's assumed to be enabled by default but if not the test can be skipped by setting `QA_CAN_TEST_GIT_PROTOCOL_V2` to `false`. | | `:requires_praefect` | The test requires that the GitLab instance uses [Gitaly Cluster](../../../administration/gitaly/praefect.md) (a.k.a. Praefect) as the repository storage . It's assumed to be used by default but if not the test can be skipped by setting `QA_CAN_TEST_PRAEFECT` to `false`. | | `:runner` | The test depends on and sets up a GitLab Runner instance, typically to run a pipeline. | +| `:sanity_feature_flags` | The test verifies the functioning of the feature flag handling part of the test framework | | `:skip_live_env` | The test is excluded when run against live deployed environments such as Staging, Canary, and Production. | | `:skip_fips_env` | The test is excluded when run against an environment in FIPS mode. | | `:skip_signup_disabled` | The test uses UI to sign up a new user and is skipped in any environment that does not allow new user registration via the UI. | @@ -49,4 +51,3 @@ This is a partial list of the [RSpec metadata](https://relishapp.com/rspec/rspec | `:smtp` | The test requires a GitLab instance to be configured to use an SMTP server. Tests SMTP notification email delivery from GitLab by using MailHog. | | `:testcase` | The link to the test case issue in the [GitLab Project Test Cases](https://gitlab.com/gitlab-org/gitlab/-/quality/test_cases). | | `:transient` | The test tests transient bugs. It is excluded by default. | -| `:issue`, `:issue_${num}` | Optional links to issues which might be related to the spec. Helps keep track of related issues and can also be used by tools that create test reports. Currently added automatically to `Allure` test report. Multiple tags can be used by adding an optional numeric suffix like `issue_1`, `issue_2` etc. 
| diff --git a/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md b/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md index 599e1104b72..438294161ac 100644 --- a/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md +++ b/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md @@ -153,7 +153,7 @@ Examples of tests which require a runner: Example: ```shell -docker run \ +docker run \ --detach \ --hostname interface_ip_address \ --publish 80:80 \ @@ -274,7 +274,7 @@ Geo requires an EE license. To visit the Geo sites in your browser, you need a r 1. To run end-to-end tests from your local GDK, run the [`EE::Scenario::Test::Geo` scenario](https://gitlab.com/gitlab-org/gitlab/-/blob/f7272b77e80215c39d1ffeaed27794c220dbe03f/qa/qa/ee/scenario/test/geo.rb) from the [`gitlab/qa/` directory](https://gitlab.com/gitlab-org/gitlab/-/blob/f7272b77e80215c39d1ffeaed27794c220dbe03f/qa). Include `--without-setup` to skip the Geo configuration steps. ```shell - QA_DEBUG=true GITLAB_QA_ACCESS_TOKEN=[add token here] GITLAB_QA_ADMIN_ACCESS_TOKEN=[add token here] bundle exec bin/qa QA::EE::Scenario::Test::Geo \ + QA_LOG_LEVEL=debug GITLAB_QA_ACCESS_TOKEN=[add token here] GITLAB_QA_ADMIN_ACCESS_TOKEN=[add token here] bundle exec bin/qa QA::EE::Scenario::Test::Geo \ --primary-address http://gitlab-primary.geo \ --secondary-address http://gitlab-secondary.geo \ --without-setup @@ -283,7 +283,7 @@ Geo requires an EE license. To visit the Geo sites in your browser, you need a r If the containers need to be configured first (for example, if you used the `--no-tests` option in the previous step), run the `QA::EE::Scenario::Test::Geo scenario` as shown below to first do the Geo configuration steps, and then run Geo end-to-end tests. Make sure that `EE_LICENSE` is (still) defined in your shell session. ```shell - QA_DEBUG=true bundle exec bin/qa QA::EE::Scenario::Test::Geo \ + QA_LOG_LEVEL=debug bundle exec bin/qa QA::EE::Scenario::Test::Geo \ --primary-address http://gitlab-primary.geo \ --primary-name gitlab-primary \ --secondary-address http://gitlab-secondary.geo \ @@ -354,7 +354,7 @@ To run the LDAP tests on your local with TLS enabled, follow these steps: 1. Run an LDAP test from [`gitlab/qa`](https://gitlab.com/gitlab-org/gitlab/-/tree/d5447ebb5f99d4c72780681ddf4dc25b0738acba/qa) directory: ```shell - GITLAB_LDAP_USERNAME="tanuki" GITLAB_LDAP_PASSWORD="password" QA_DEBUG=true WEBDRIVER_HEADLESS=false bin/qa Test::Instance::All https://gitlab.test qa/specs/features/browser_ui/1_manage/login/log_into_gitlab_via_ldap_spec.rb + GITLAB_LDAP_USERNAME="tanuki" GITLAB_LDAP_PASSWORD="password" QA_LOG_LEVEL=debug WEBDRIVER_HEADLESS=false bin/qa Test::Instance::All https://gitlab.test qa/specs/features/browser_ui/1_manage/login/log_into_gitlab_via_ldap_spec.rb ``` ### Running LDAP tests with TLS disabled @@ -382,7 +382,7 @@ To run the LDAP tests on your local with TLS disabled, follow these steps: 1. 
Run an LDAP test from [`gitlab/qa`](https://gitlab.com/gitlab-org/gitlab/-/tree/d5447ebb5f99d4c72780681ddf4dc25b0738acba/qa) directory: ```shell - GITLAB_LDAP_USERNAME="tanuki" GITLAB_LDAP_PASSWORD="password" QA_DEBUG=true WEBDRIVER_HEADLESS=false bin/qa Test::Instance::All http://localhost qa/specs/features/browser_ui/1_manage/login/log_into_gitlab_via_ldap_spec.rb + GITLAB_LDAP_USERNAME="tanuki" GITLAB_LDAP_PASSWORD="password" QA_LOG_LEVEL=debug WEBDRIVER_HEADLESS=false bin/qa Test::Instance::All http://localhost qa/specs/features/browser_ui/1_manage/login/log_into_gitlab_via_ldap_spec.rb ``` ## Guide to the mobile suite diff --git a/doc/development/testing_guide/end_to_end/troubleshooting.md b/doc/development/testing_guide/end_to_end/troubleshooting.md index 951fb056a4c..76d19dc0159 100644 --- a/doc/development/testing_guide/end_to_end/troubleshooting.md +++ b/doc/development/testing_guide/end_to_end/troubleshooting.md @@ -25,12 +25,12 @@ WEBDRIVER_HEADLESS=false bundle exec bin/qa Test::Instance::All http://localhost Sometimes a test might fail and the failure stack trace doesn't provide enough information to determine what went wrong. You can get more information by enabling -debug logs by setting `QA_DEBUG=true`, to see what the test framework is attempting. +debug logs by setting `QA_LOG_LEVEL=debug`, to see what the test framework is attempting. For example: ```shell cd gitlab/qa -QA_DEBUG=true bundle exec bin/qa Test::Instance::All http://localhost:3000 +QA_LOG_LEVEL=debug bundle exec bin/qa Test::Instance::All http://localhost:3000 ``` The test framework then outputs many logs showing the actions taken during diff --git a/doc/development/testing_guide/index.md b/doc/development/testing_guide/index.md index 2e00a00c454..fa9f1f1ac3e 100644 --- a/doc/development/testing_guide/index.md +++ b/doc/development/testing_guide/index.md @@ -70,4 +70,8 @@ Everything you should know about how to run end-to-end tests using Everything you should know about how to test migrations. +## [Contract tests](contract/index.md) + +Introduction to contract testing, how to run the tests, and how to write them. + [Return to Development documentation](../index.md) diff --git a/doc/development/testing_guide/review_apps.md b/doc/development/testing_guide/review_apps.md index ff4b77dec2c..f1083c23406 100644 --- a/doc/development/testing_guide/review_apps.md +++ b/doc/development/testing_guide/review_apps.md @@ -6,13 +6,13 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Review Apps -Review Apps are deployed using the `start-review-app-pipeline` job. This job triggers a child pipeline containing a series of jobs to perform the various tasks needed to deploy a Review App. +Review Apps are deployed using the `start-review-app-pipeline` job which triggers a child pipeline containing a series of jobs to perform the various tasks needed to deploy a Review App. 
![start-review-app-pipeline job](img/review-app-parent-pipeline.png) For any of the following scenarios, the `start-review-app-pipeline` job would be automatically started: -- for merge requests with CI config changes +- for merge requests with CI configuration changes - for merge requests with frontend changes - for merge requests with changes to `{,ee/,jh/}{app/controllers}/**/*` - for merge requests with changes to `{,ee/,jh/}{app/models}/**/*` @@ -27,6 +27,8 @@ On every [pipeline](https://gitlab.com/gitlab-org/gitlab/pipelines/125315730) in `review` stage), the `review-qa-smoke` and `review-qa-reliable` jobs are automatically started. The `review-qa-smoke` runs the QA smoke suite and the `review-qa-reliable` executes E2E tests identified as [reliable](https://about.gitlab.com/handbook/engineering/quality/quality-engineering/reliable-tests). +`review-qa-*` jobs ensure that end-to-end tests for the changes in the merge request pass in a live environment. This shifts the identification of e2e failures from an environment on the path to production to the merge request, to prevent breaking features on GitLab.com or costly GitLab.com deployment blockers. `review-qa-*` failures should be investigated with counterpart SET involvement if needed to help determine the root cause of the error. + You can also manually start the `review-qa-all`: it runs the full QA suite. After the end-to-end test runs have finished, [Allure reports](https://github.com/allure-framework/allure2) are generated and published by @@ -34,6 +36,10 @@ the `allure-report-qa-smoke`, `allure-report-qa-reliable`, and `allure-report-qa Errors can be found in the `gitlab-review-apps` Sentry project and [filterable by Review App URL](https://sentry.gitlab.net/gitlab/gitlab-review-apps/?query=url%3A%22https%3A%2F%2Fgitlab-review-require-ve-u92nn2.gitlab-review.app%2F%22) or [commit SHA](https://sentry.gitlab.net/gitlab/gitlab-review-apps/releases/6095b501da7/all-events/). +### Bypass failed review app deployment to merge a broken `master` fix + +Maintainers can elect to use the [process for merging during broken `master`](https://about.gitlab.com/handbook/engineering/workflow/#instructions-for-the-maintainer) if a customer-critical merge request is blocked by pipelines failing due to review app deployment failures. + ## Performance Metrics On every [pipeline](https://gitlab.com/gitlab-org/gitlab/pipelines/125315730) in the `qa` stage, the @@ -94,8 +100,8 @@ the GitLab handbook information for the [shared 1Password account](https://about 1. Make sure you [have access to the cluster](#get-access-to-the-gcp-review-apps-cluster) and the `container.pods.exec` permission first. 1. [Filter Workloads by your Review App slug](https://console.cloud.google.com/kubernetes/workload?project=gitlab-review-apps). For example, `review-qa-raise-e-12chm0`. 1. Find and open the `toolbox` Deployment. For example, `review-qa-raise-e-12chm0-toolbox`. -1. Click on the Pod in the "Managed pods" section. For example, `review-qa-raise-e-12chm0-toolbox-d5455cc8-2lsvz`. -1. Click on the `KUBECTL` dropdown, then `Exec` -> `toolbox`. +1. Select the Pod in the "Managed pods" section. For example, `review-qa-raise-e-12chm0-toolbox-d5455cc8-2lsvz`. +1. Select the `KUBECTL` dropdown, then `Exec` -> `toolbox`. 1. 
Replace `-c toolbox -- ls` with `-it -- gitlab-rails console` from the default command or - Run `kubectl exec --namespace review-qa-raise-e-12chm0 review-qa-raise-e-12chm0-toolbox-d5455cc8-2lsvz -it -- gitlab-rails console` and @@ -107,8 +113,8 @@ the GitLab handbook information for the [shared 1Password account](https://about 1. Make sure you [have access to the cluster](#get-access-to-the-gcp-review-apps-cluster) and the `container.pods.getLogs` permission first. 1. [Filter Workloads by your Review App slug](https://console.cloud.google.com/kubernetes/workload?project=gitlab-review-apps). For example, `review-qa-raise-e-12chm0`. 1. Find and open the `migrations` Deployment. For example, `review-qa-raise-e-12chm0-migrations.1`. -1. Click on the Pod in the "Managed pods" section. For example, `review-qa-raise-e-12chm0-migrations.1-nqwtx`. -1. Click on the `Container logs` link. +1. Select the Pod in the "Managed pods" section. For example, `review-qa-raise-e-12chm0-migrations.1-nqwtx`. +1. Select `Container logs`. Alternatively, you could use the [Logs Explorer](https://console.cloud.google.com/logs/query;query=?project=gitlab-review-apps) which provides more utility to search logs. An example query for a pod name is as follows: @@ -199,7 +205,7 @@ subgraph "CNG-mirror pipeline" issue with a link to your merge request. Note that the deployment failure can reveal an actual problem introduced in your merge request (that is, this isn't necessarily a transient failure)! -- If the `review-qa-smoke` or `review-qa-reliable` job keeps failing, +- If the `review-qa-smoke` or `review-qa-reliable` job keeps failing (note that we already retry them once), please check the job's logs: you could discover an actual problem introduced in your merge request. You can also download the artifacts to see screenshots of the page at the time the failures occurred. If you don't find the cause of the @@ -237,7 +243,7 @@ due to this [known issue on the Kubernetes executor for GitLab Runner](https://g ### Helm The Helm version used is defined in the -[`registry.gitlab.com/gitlab-org/gitlab-build-images:gitlab-helm3-kubectl1.14` image](https://gitlab.com/gitlab-org/gitlab-build-images/-/blob/master/Dockerfile.gitlab-helm3-kubectl1.14#L7) +[`registry.gitlab.com/gitlab-org/gitlab-build-images:gitlab-helm3.5-kubectl1.17` image](https://gitlab.com/gitlab-org/gitlab-build-images/-/blob/master/Dockerfile.gitlab-helm3.5-kubectl1.17#L6) used by the `review-deploy` and `review-stop` jobs. ## Diagnosing unhealthy Review App releases diff --git a/doc/development/testing_guide/testing_levels.md b/doc/development/testing_guide/testing_levels.md index 9ca2d0db93c..02f32a031dc 100644 --- a/doc/development/testing_guide/testing_levels.md +++ b/doc/development/testing_guide/testing_levels.md @@ -55,7 +55,6 @@ records should use stubs/doubles as much as possible. 
| `lib/` | `spec/lib/` | RSpec | | | `lib/tasks/` | `spec/tasks/` | RSpec | | | `rubocop/` | `spec/rubocop/` | RSpec | | -| `spec/factories` | `spec/factories_spec.rb` | RSpec | | ### Frontend unit tests diff --git a/doc/development/understanding_explain_plans.md b/doc/development/understanding_explain_plans.md index e06ece38135..17fcd5b3e88 100644 --- a/doc/development/understanding_explain_plans.md +++ b/doc/development/understanding_explain_plans.md @@ -1,5 +1,5 @@ --- -stage: Enablement +stage: Data Stores group: Database info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments --- @@ -7,7 +7,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w # Understanding EXPLAIN plans PostgreSQL allows you to obtain query plans using the `EXPLAIN` command. This -command can be invaluable when trying to determine how a query will perform. +command can be invaluable when trying to determine how a query performs. You can use this command directly in your SQL query, as long as the query starts with it: @@ -26,7 +26,7 @@ Aggregate (cost=922411.76..922411.77 rows=1 width=8) Filter: (visibility_level = ANY ('{0,20}'::integer[])) ``` -When using _just_ `EXPLAIN`, PostgreSQL won't actually execute our query, +When using _just_ `EXPLAIN`, PostgreSQL does not actually execute our query, instead it produces an _estimated_ execution plan based on the available statistics. This means the actual plan can differ quite a bit. Fortunately, PostgreSQL provides us with the option to execute the query as well. To do so, @@ -39,7 +39,7 @@ FROM projects WHERE visibility_level IN (0, 20); ``` -This will produce: +This produces: ```sql Aggregate (cost=922420.60..922420.61 rows=1 width=8) (actual time=3428.535..3428.535 rows=1 loops=1) @@ -54,7 +54,7 @@ As we can see this plan is quite different, and includes a lot more data. Let's discuss this step by step. Because `EXPLAIN ANALYZE` executes the query, care should be taken when using a -query that will write data or might time out. If the query modifies data, +query that writes data or might time out. If the query modifies data, consider wrapping it in a transaction that rolls back automatically like so: ```sql @@ -73,7 +73,7 @@ FROM projects WHERE visibility_level IN (0, 20); ``` -This will then produce: +This then produces: ```sql Aggregate (cost=922420.60..922420.61 rows=1 width=8) (actual time=3428.535..3428.535 rows=1 loops=1) @@ -120,10 +120,10 @@ Aggregate (cost=922411.76..922411.77 rows=1 width=8) Here the first node executed is `Seq scan on projects`. The `Filter:` is an additional filter applied to the results of the node. A filter is very similar to Ruby's `Array#select`: it takes the input rows, applies the filter, and -produces a new list of rows. Once the node is done, we perform the `Aggregate` +produces a new list of rows. After the node is done, we perform the `Aggregate` above it. -Nested nodes will look like this: +Nested nodes look like this: ```sql Aggregate (cost=176.97..176.98 rows=1 width=8) (actual time=0.252..0.252 rows=1 loops=1) @@ -152,7 +152,7 @@ number of rows produced, the number of loops performed, and more. 
For example:

Seq Scan on projects (cost=0.00..908044.47 rows=5746914 width=0)
```

-Here we can see that our cost ranges from `0.00..908044.47` (we'll cover this in
+Here we can see that our cost ranges from `0.00..908044.47` (we cover this in
a moment), and we estimate (since we're using `EXPLAIN` and not `EXPLAIN ANALYZE`) a total of 5,746,914 rows to be produced by this node. The `width` statistic describes the estimated width of each row, in bytes.
@@ -171,7 +171,7 @@ The startup cost states how expensive it was to start the node, with the total
cost describing how expensive the entire node was. In general: the greater the values, the more expensive the node.

-When using `EXPLAIN ANALYZE`, these statistics will also include the actual time
+When using `EXPLAIN ANALYZE`, these statistics also include the actual time
(in milliseconds) spent, and other runtime statistics (for example, the actual number of produced rows):
@@ -183,7 +183,7 @@ Here we can see we estimated 5,746,969 rows to be returned, but in reality we
returned 5,746,940 rows. We can also see that _just_ this sequential scan took 2.98 seconds to run.

-Using `EXPLAIN (ANALYZE, BUFFERS)` will also give us information about the
+Using `EXPLAIN (ANALYZE, BUFFERS)` also gives us information about the
number of rows removed by a filter, the number of buffers used, and more. For example:
@@ -242,7 +242,7 @@ retrieving lots of rows, so it's best to avoid these for large tables.

A scan on an index that did not require fetching anything from the table. In certain cases an index only scan may still fetch data from the table; in this
-case the node will include a `Heap Fetches:` statistic.
+case the node includes a `Heap Fetches:` statistic.

### Index Scan

@@ -273,7 +273,7 @@ Sorts the input rows as specified using an `ORDER BY` statement.

### Nested Loop

-A nested loop will execute its child nodes for every row produced by a node that
+A nested loop executes its child nodes for every row produced by a node that
precedes it. For example:

```sql
@@ -316,7 +316,7 @@ FROM users
WHERE twitter != '';
```

-This will produce the following plan:
+This produces the following plan:

```sql
Aggregate (cost=845110.21..845110.22 rows=1 width=8) (actual time=1271.157..1271.158 rows=1 loops=1)
@@ -435,7 +435,7 @@ This index would only index the `email` value of rows that match `WHERE id <

CREATE INDEX CONCURRENTLY twitter_test ON users (twitter) WHERE twitter != '';
```

-Once created, if we run our query again we will be given the following plan:
+After the index is created, running our query again gives the following plan:

```sql
Aggregate (cost=1608.26..1608.27 rows=1 width=8) (actual time=19.821..19.821 rows=1 loops=1)
@@ -466,7 +466,7 @@ be used for comparison (for example, it depends a lot on the state of cache).

When optimizing a query, we usually need to reduce the amount of data we're dealing with. Indexes are the way to work with fewer pages (buffers) to get the result, so, during optimization, look at the number of buffers used (read and hit),
-and work on reducing these numbers. Reduced timing will be the consequence of reduced
+and work on reducing these numbers. Reduced timing is the consequence of reduced
buffer numbers. [Database Lab Engine](#database-lab-engine) guarantees that the plan is structurally identical to production (and overall number of buffers is the same as on production), but differences in cache state and I/O speed may lead to different timings.
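Putting the buffers advice into practice: a minimal sketch that combines the rollback pattern shown earlier with `BUFFERS`, reusing the `users.twitter` query from above. Run it once before and once after creating the partial index, and compare the `Buffers:` lines:

```sql
-- Wrap in a transaction so nothing is modified, then inspect buffer counts.
-- Compare the output before and after CREATE INDEX CONCURRENTLY twitter_test.
BEGIN;
EXPLAIN (ANALYZE, BUFFERS)
SELECT COUNT(*)
FROM users
WHERE twitter != '';
ROLLBACK;
```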
@@ -508,8 +508,8 @@ index on `projects.visibility_level` to somehow turn this Sequential scan +
filter into an index-only scan.

Unfortunately, doing so is unlikely to improve anything. Contrary to what some
-might believe, an index being present _does not guarantee_ that PostgreSQL will
-actually use it. For example, when doing a `SELECT * FROM projects` it is much
+might believe, an index being present _does not guarantee_ that PostgreSQL
+actually uses it. For example, when doing a `SELECT * FROM projects` it is much
cheaper to just scan the entire table, instead of using an index and then fetching data from the table. In such cases PostgreSQL may decide to not use an index.
@@ -539,7 +539,7 @@ For GitLab.com this produces:

Here the total number of projects is 5,811,804, and 5,746,126 of those are of level 0 or 20. That's 98% of the entire table!

-So no matter what we do, this query will retrieve 98% of the entire table. Since
+So no matter what we do, this query retrieves 98% of the entire table. Since
most time is spent doing exactly that, there isn't really much we can do to improve this query, other than _not_ running it at all.
@@ -589,7 +589,7 @@ Foreign-key constraints:
"fk_rails_722ceba4f7" FOREIGN KEY (project_id) REFERENCES projects(id) ON DELETE CASCADE
```

-Let's rewrite our query to JOIN this table onto our projects, and get the
+Let's rewrite our query to `JOIN` this table onto our projects, and get the
projects for a specific user:

```sql
@@ -604,7 +604,7 @@ AND user_interacted_projects.user_id = 1;

What we do here is the following:

1. Get our projects.
-1. INNER JOIN `user_interacted_projects`, meaning we're only left with rows in
+1. `INNER JOIN` `user_interacted_projects`, meaning we're only left with rows in
`projects` that have a corresponding row in `user_interacted_projects`.
1. Limit this to the projects with `visibility_level` of 0 or 20, and to projects that the user with ID 1 interacted with.
@@ -765,7 +765,7 @@ The web interface comes with the following execution plan visualizers included:

#### Tips & Tricks

-The database connection is now maintained during your whole session, so you can use `exec set ...` for any session variables (such as `enable_seqscan` or `work_mem`). These settings will be applied to all subsequent commands until you reset them. For example you can disable parallel queries with
+The database connection is now maintained during your whole session, so you can use `exec set ...` for any session variables (such as `enable_seqscan` or `work_mem`). These settings are applied to all subsequent commands until you reset them.
For example, you can disable parallel queries with

```sql
exec SET max_parallel_workers_per_gather = 0
```

diff --git a/doc/development/uploads/working_with_uploads.md b/doc/development/uploads/working_with_uploads.md
index 4e907530a9f..d44f2f69168 100644
--- a/doc/development/uploads/working_with_uploads.md
+++ b/doc/development/uploads/working_with_uploads.md
@@ -101,23 +101,24 @@ Therefore, document new uploads here by slotting them into the following tables:
| Pipeline artifacts | `carrierwave` | `sidekiq` | `/artifacts/<proj_id_hash>/pipelines/<pipeline_id>/artifacts/<artifact_id>` |
| Live job traces | `fog` | `sidekiq` | `/artifacts/tmp/builds/<job_id>/chunks/<chunk_index>.log` |
| Job traces archive | `carrierwave` | `sidekiq` | `/artifacts/<proj_id_hash>/<date>/<job_id>/<artifact_id>/job.log` |
-| Autoscale runner caching | N/A | `gitlab-runner` | `/gitlab-com-[platform-]runners-cache/???` |
-| Backups | N/A | `s3cmd`, `awscli`, or `gcs` | `/gitlab-backups/???` |
-| Git LFS | `direct upload` | `workhorse` | `/lsf-objects/<lfs_obj_oid[0:2]>/<lfs_obj_oid[2:2]>` |
+| Autoscale runner caching | Not applicable | `gitlab-runner` | `/gitlab-com-[platform-]runners-cache/???` |
+| Backups | Not applicable | `s3cmd`, `awscli`, or `gcs` | `/gitlab-backups/???` |
+| Git LFS | `direct upload` | `workhorse` | `/lfs-objects/<lfs_obj_oid[0:2]>/<lfs_obj_oid[2:2]>` |
| Design management files | `disk buffering` | `rails controller` | `/lfs-objects/<lfs_obj_oid[0:2]>/<lfs_obj_oid[2:2]>` |
-| Design management thumbnails | `carrierwave` | `sidekiq` | `/uploads/design_management/action/image_v432x230/<model_id>` |
+| Design management thumbnails | `carrierwave` | `sidekiq` | `/uploads/design_management/action/image_v432x230/<model_id>/<original_lfs_obj_oid[2:2]>` |
| Generic file uploads | `direct upload` | `workhorse` | `/uploads/@hashed/[0:2]/[2:4]/<hash1>/<hash2>/file` |
| Generic file uploads - personal snippets | `direct upload` | `workhorse` | `/uploads/personal_snippet/<snippet_id>/<filename>` |
| Global appearance settings | `disk buffering` | `rails controller` | `/uploads/appearance/...` |
| Topics | `disk buffering` | `rails controller` | `/uploads/projects/topic/...` |
| Avatar images | `direct upload` | `workhorse` | `/uploads/[user,group,project]/avatar/<model_id>` |
-| Import/export | `direct upload` | `workhorse` | `/uploads/import_export_upload/???` |
+| Import | `direct upload` | `workhorse` | `/uploads/import_export_upload/import_file/<model_id>/<file_name>` |
+| Export | `carrierwave` | `sidekiq` | `/uploads/import_export_upload/export_file/<model_id>/<timestamp>_<namespace>-<project_name>_export.tar.gz` |
| GitLab Migration | `carrierwave` | `sidekiq` | `/uploads/bulk_imports/???` |
| MR diffs | `carrierwave` | `sidekiq` | `/external-diffs/merge_request_diffs/mr-<mr_id>/diff-<diff_id>` |
-| Package manager archives | `direct upload` | `sidekiq` | `/packages/<proj_id_hash>/packages/<pkg_segment>/files/<pkg_file_id>` |
-| Package manager archives | `direct upload` | `sidekiq` | `/packages/<container_id_hash>/debian_*_component_file/<component_file_id>` |
-| Package manager archives | `direct upload` | `sidekiq` | `/packages/<container_id_hash>/debian_*_distribution/<distribution_file_id>` |
-| Container image cache (?) | `direct upload` | `workhorse` | `/dependency-proxy/<group_id_hash>/dependency_proxy/<group_id>/files/<proxy_id>/<blob_id or manifest_id>` |
+| [Package manager assets (except for NPM)](../../user/packages/package_registry/index.md) | `direct upload` | `workhorse` | `/packages/<proj_id_hash>/packages/<package_id>/files/<package_file_id>` |
+| [NPM Package manager assets](../../user/packages/npm_registry/index.md) | `carrierwave` | `grape API` | `/packages/<proj_id_hash>/packages/<package_id>/files/<package_file_id>` |
+| [Debian Package manager assets](../../user/packages/debian_repository/index.md) | `direct upload` | `workhorse` | `/packages/<group_id or project_id_hash>/debian_*/<group_id or project_id or distribution_file_id>` |
+| [Dependency Proxy cache](../../user/packages/dependency_proxy/index.md) | [`send_dependency`](https://gitlab.com/gitlab-org/gitlab/-/blob/6ed73615ff1261e6ed85c8f57181a65f5b4ffada/workhorse/internal/dependencyproxy/dependencyproxy.go) | `workhorse` | `/dependency-proxy/<group_id_hash>/dependency_proxy/<group_id>/files/<blob_id or manifest_id>` |
| Terraform state files | `carrierwave` | `rails controller` | `/terraform/<proj_id_hash>/<terraform_state_id>` |
| Pages content archives | `carrierwave` | `sidekiq` | `/gitlab-gprd-pages/<proj_id_hash>/pages_deployments/<deployment_id>/` |
| Secure Files | `carrierwave` | `sidekiq` | `/ci-secure-files/<proj_id_hash>/secure_files/<secure_file_id>/` |

diff --git a/doc/development/verifying_database_capabilities.md b/doc/development/verifying_database_capabilities.md
index bda9c68eae5..55347edf4ec 100644
--- a/doc/development/verifying_database_capabilities.md
+++ b/doc/development/verifying_database_capabilities.md
@@ -1,5 +1,5 @@
---
-stage: Enablement
+stage: Data Stores
group: Database
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---
@@ -34,5 +34,5 @@ to be wrapped in a `Gitlab::Database.read_only?` or `Gitlab::Database.read_write
guard, to make sure it doesn't run on read-only databases.

We have a Rails Middleware that filters any potentially writing
-operations (the CUD operations of CRUD) and prevent the user from trying
+operations (the `CUD` operations of CRUD) and prevents the user from trying
to update the database and getting a 500 error (see `Gitlab::Middleware::ReadOnly`).

diff --git a/doc/development/windows.md b/doc/development/windows.md
index fb095b68939..3eed9c057ab 100644
--- a/doc/development/windows.md
+++ b/doc/development/windows.md
@@ -65,21 +65,21 @@ Build a Google Cloud image with the above shared runners repository by doing the

1. In a web browser, go to the [Google Cloud Platform console](https://console.cloud.google.com/compute/images).
1. Filter images by the name you used when creating the image; `windows` is likely all you need to filter by.
-1. Click the image's name.
-1. Click the **CREATE INSTANCE** link.
+1. Select the image's name.
+1. Select **CREATE INSTANCE**.
1. Important: Change the name to what you'd like, as you can't change it later.
1. Optional: Change the region to the one closest to you, and adjust any other options you'd like.
-1. Click **Create** at the bottom of the page.
-1. Click the name of your newly created VM Instance (optionally you can filter to find it).
-1. Click **Set Windows password**.
+1. Select **Create** at the bottom of the page.
+1. Select the name of your newly created VM Instance (optionally you can filter to find it).
+1. Select **Set Windows password**.
1. Optional: Set a username or use the default.
-1. Click **Next**.
+1. Select **Next**.
1. Copy and save the password, as it is not shown again.
-1. Click **RDP** down arrow.
-1. Click **Download the RDP file**.
+1. Select the **RDP** down arrow.
+1. Select **Download the RDP file**.
1. Open the downloaded RDP file with the Windows remote desktop app (<https://docs.microsoft.com/en-us/windows-server/remote/remote-desktop-services/clients/remote-desktop-clients>).
-1. Click **Continue** to accept the certificate.
-1. Enter the password and click **Next**.
+1. Select **Continue** to accept the certificate.
+1. Enter the password and select **Next**.

You should now be connected to a Windows machine with a command prompt.

diff --git a/doc/development/work_items.md b/doc/development/work_items.md
index d4a1073461a..9a17a152525 100644
--- a/doc/development/work_items.md
+++ b/doc/development/work_items.md
@@ -45,12 +45,24 @@ Here are some problems with current issues usage and why we are looking into wor
- Codebase maintainability and feature development become bigger challenges as we grow issues beyond their core role of issue tracking into supporting the different types and subtle differences between them.

-## Work item and work item type terms
+## Work item terminology

-Using the terms "issue" or "issuable" to reference the types of collaboration objects
-(for example, issue, bug, feature, or epic) often creates confusion. To avoid confusion, we will use the term
-work item type (WIT) when referring to the type of a collaboration object.
-An instance of a WIT is a work item (WI). For example, `issue#123`, `bug#456`, `requirement#789`.
+To avoid confusion and ensure communication is efficient, we will use the following terms exclusively when discussing work items.
+
+| Term | Description | Example of misuse | Should be |
+| --- | --- | --- | --- |
+| work item type | Classes of work item; for example: issue, requirement, test case, incident, or task | _Epics will eventually become issues_ | _Epics will eventually become a **work item type**_ |
+| work item | An instance of a work item type | | |
+| work item view | The new frontend view that renders work items of any type | | |
+| legacy issue view | The existing view used to render issues and incidents | | |
+| issue | The existing issue model | | |
+| issuable | Any model currently using the issuable module (issues, epics, and MRs) | _Incidents are an **issuable**_ | _Incidents are a **work item type**_ |
+
+Some terms have been used in the past but have since become confusing and are now discouraged.
+
+| Term | Description | Example of misuse | Should be |
+| --- | --- | --- | --- |
+| issue type | A former way to refer to classes of work item | _Tasks are an **issue type**_ | _Tasks are a **work item type**_ |

### Migration strategy

diff --git a/doc/development/workhorse/configuration.md b/doc/development/workhorse/configuration.md
index ce80a155489..b86bb824ea1 100644
--- a/doc/development/workhorse/configuration.md
+++ b/doc/development/workhorse/configuration.md
@@ -21,47 +21,49 @@ Add any new Workhorse configuration options into the configuration file.
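For instance, file-based options live in the TOML file — a minimal sketch reusing options documented later on this page (the `[redis]` section and its URL value are illustrative placeholders, not required settings):

```toml
# trusted_cidrs_for_x_forwarded_for is documented in "Trusted proxies" below.
trusted_cidrs_for_x_forwarded_for = ["10.0.0.0/8", "127.0.0.1/32"]

# Illustrative: a Redis connection, assuming the [redis] section used for
# build and runner registration events mentioned later on this page.
[redis]
URL = "unix:/var/run/gitlab/redis.sock"
```

The command-line flags below cover the remaining options: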
Options:
  -apiCiLongPollingDuration duration
-	Long polling duration for job requesting for runners (default 50ns)
+    Long polling duration for job requesting for runners (default 50ns)
  -apiLimit uint
-	Number of API requests allowed at single time
+    Number of API requests allowed at single time
  -apiQueueDuration duration
-	Maximum queueing duration of requests (default 30s)
+    Maximum queueing duration of requests (default 30s)
  -apiQueueLimit uint
-	Number of API requests allowed to be queued
+    Number of API requests allowed to be queued
  -authBackend string
-	Authentication/authorization backend (default "http://localhost:8080")
+    Authentication/authorization backend (default "http://localhost:8080")
  -authSocket string
-	Optional: Unix domain socket to dial authBackend at
+    Optional: Unix domain socket to dial authBackend at
  -cableBackend string
-	Optional: ActionCable backend (default authBackend)
+    ActionCable backend
  -cableSocket string
-	Optional: Unix domain socket to dial cableBackend at (default authSocket)
+    Optional: Unix domain socket to dial cableBackend at
  -config string
-	TOML file to load config from
+    TOML file to load config from
  -developmentMode
-	Allow the assets to be served from Rails app
+    Allow the assets to be served from Rails app
  -documentRoot string
-	Path to static files content (default "public")
+    Path to static files content (default "public")
  -listenAddr string
-	Listen address for HTTP server (default "localhost:8181")
+    Listen address for HTTP server (default "localhost:8181")
  -listenNetwork string
-	Listen 'network' (tcp, tcp4, tcp6, unix) (default "tcp")
+    Listen 'network' (tcp, tcp4, tcp6, unix) (default "tcp")
  -listenUmask int
-	Umask for Unix socket
+    Umask for Unix socket
  -logFile string
-	Log file location
+    Log file location
  -logFormat string
-	Log format to use defaults to text (text, json, structured, none) (default "text")
+    Log format to use defaults to text (text, json, structured, none) (default "text")
  -pprofListenAddr string
-	pprof listening address, e.g. 'localhost:6060'
+    pprof listening address, e.g. 'localhost:6060'
  -prometheusListenAddr string
-	Prometheus listening address, e.g. 'localhost:9229'
+    Prometheus listening address, e.g. 'localhost:9229'
+  -propagateCorrelationID X-Request-ID
+    Reuse existing Correlation-ID from the incoming request header X-Request-ID if present
  -proxyHeadersTimeout duration
-	How long to wait for response headers when proxying the request (default 5m0s)
+    How long to wait for response headers when proxying the request (default 5m0s)
  -secretPath string
-	File with secret key to authenticate with authBackend (default "./.gitlab_workhorse_secret")
+    File with secret key to authenticate with authBackend (default "./.gitlab_workhorse_secret")
  -version
-	Print version and exit
+    Print version and exit
```

The 'auth backend' refers to the GitLab Rails application. The name is
@@ -70,7 +72,7 @@ HTTP.

GitLab Workhorse can listen on either a TCP or a Unix domain socket. It can also open a second TCP listening socket with the Go
-[`net/http/pprof` profiler server](http://golang.org/pkg/net/http/pprof/).
+[`net/http/pprof` profiler server](https://pkg.go.dev/net/http/pprof).

GitLab Workhorse can listen for Redis build and runner registration events if you pass a valid TOML configuration file through the `-config` flag.

@@ -147,6 +149,19 @@ addr = "localhost:3443"

The `certificate` file should contain the concatenation of the server's certificate, any intermediates, and the CA's certificate.
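For instance, such a bundle can be assembled by concatenating the PEM files in order — a sketch with illustrative file names:

```shell
# Order matters: server certificate first, then intermediates, then the CA.
cat server.crt intermediate.crt ca.crt > /path/to/certificate
```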
+Metrics endpoints can be configured similarly:
+
+```toml
+[metrics_listener]
+network = "tcp"
+addr = "localhost:9229"
+[metrics_listener.tls]
+  certificate = "/path/to/certificate"
+  key = "/path/to/private/key"
+  min_version = "tls1.2"
+  max_version = "tls1.3"
+```
+

## Interaction of authBackend and authSocket

The interaction between `authBackend` and `authSocket` can be confusing.
@@ -213,6 +228,53 @@ configuration with the `GITLAB_TRACING` environment variable, like this:

GITLAB_TRACING=opentracing://jaeger ./gitlab-workhorse
```

+### Propagate correlation IDs
+
+When a user makes an HTTP request, such as creating a new project, the
+initial request is routed through Workhorse to another service, which
+may, in turn, make other requests. To help trace the request as it flows
+across services, Workhorse generates a random value called a
+[correlation ID](../../administration/troubleshooting/tracing_correlation_id.md).
+Workhorse sends this correlation ID via the `X-Request-Id` HTTP header.
+
+Some GitLab services, such as GitLab Shell, generate their own
+correlation IDs. In addition, other services, such as Gitaly, make
+internal API calls that pass along a correlation ID from the original
+request. In either case, the correlation ID is also passed via the
+`X-Request-Id` HTTP header.
+
+By default, Workhorse ignores this header and always generates a new
+correlation ID. This makes debugging harder and prevents distributed
+tracing from working properly, since the new correlation ID is
+completely unrelated to the original one.
+
+Workhorse can be configured to propagate an incoming correlation ID via
+the `-propagateCorrelationID` command-line flag. It is highly
+recommended that this option be used with an IP allow list, to ensure
+arbitrary values cannot be injected by untrusted clients.
+
+An IP allow list is specified via the `trusted_cidrs_for_propagation`
+option in the Workhorse configuration file. Specify a list of CIDR blocks
+that can be trusted. For example:
+
+```toml
+trusted_cidrs_for_propagation = ["10.0.0.0/8", "127.0.0.1/32"]
+```
+
+NOTE:
+The `-propagateCorrelationID` flag must be used for the `trusted_cidrs_for_propagation` option to work.
+
+### Trusted proxies
+
+If Workhorse is behind a reverse proxy such as NGINX, the
+`trusted_cidrs_for_x_forwarded_for` option is needed to specify which
+CIDR blocks can be trusted to provide the originating IP address
+via the `X-Forwarded-For` HTTP header. For example:
+
+```toml
+trusted_cidrs_for_x_forwarded_for = ["10.0.0.0/8", "127.0.0.1/32"]
+```
+

## Continuous profiling

Workhorse supports continuous profiling through [LabKit](https://gitlab.com/gitlab-org/labkit/)

diff --git a/doc/development/workhorse/gitlab_features.md b/doc/development/workhorse/gitlab_features.md
index 2aa8d9d2399..365cc7991d8 100644
--- a/doc/development/workhorse/gitlab_features.md
+++ b/doc/development/workhorse/gitlab_features.md
@@ -53,14 +53,14 @@ memory than it costs to have Workhorse look after it.
  for example, JavaScript files and CSS files are served straight from disk.
- Workhorse can modify responses sent by Rails: for example, if you use
-  `send_file` in Rails then GitLab Workhorse will open the file on
+  `send_file` in Rails then GitLab Workhorse opens the file on
  disk and sends its contents as the response body to the client.
- Workhorse can take over requests after asking permission from Rails. Example: handling `git clone`.
- Workhorse can modify requests before passing them to Rails.
  Example: when handling a Git LFS upload Workhorse first asks permission from
-  Rails, then it stores the request body in a tempfile, then it sends
-  a modified request containing the tempfile path to Rails.
+  Rails, then it stores the request body in a temporary file, then it sends
+  a modified request containing the file path to Rails.
- Workhorse can manage long-lived WebSocket connections for Rails. Example: handling the terminal WebSocket for environments.
- Workhorse does not connect to PostgreSQL, only to Rails and (optionally) Redis.