author | Robert Speicher <rspeicher@gmail.com> | 2021-01-20 13:34:23 -0600 |
---|---|---|
committer | Robert Speicher <rspeicher@gmail.com> | 2021-01-20 13:34:23 -0600 |
commit | 6438df3a1e0fb944485cebf07976160184697d72 (patch) | |
tree | 00b09bfd170e77ae9391b1a2f5a93ef6839f2597 /doc/development | |
parent | 42bcd54d971da7ef2854b896a7b34f4ef8601067 (diff) | |
download | gitlab-ce-13.8.0-rc42.tar.gz |
Add latest changes from gitlab-org/gitlab@13-8-stable-ee v13.8.0-rc42
Diffstat (limited to 'doc/development')
78 files changed, 3222 insertions, 2008 deletions
diff --git a/doc/development/README.md b/doc/development/README.md
index 2e4674b5288..0d3c1b3cbe9 100644
--- a/doc/development/README.md
+++ b/doc/development/README.md
@@ -256,11 +256,11 @@ See [database guidelines](database/index.md).
 - [Externalization](i18n/externalization.md)
 - [Translation](i18n/translation.md)
-## Product Analytics guides
+## Product Intelligence guides
-- [Product Analytics guide](https://about.gitlab.com/handbook/product/product-analytics-guide/)
-- [Usage Ping guide](product_analytics/usage_ping.md)
-- [Snowplow guide](product_analytics/snowplow.md)
+- [Product Intelligence guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/)
+- [Usage Ping guide](usage_ping.md)
+- [Snowplow guide](snowplow.md)
 ## Experiment guide
@@ -292,6 +292,7 @@ See [database guidelines](database/index.md).
 - [Reference processing](reference_processing.md)
 - [Compatibility with multiple versions of the application running at the same time](multi_version_compatibility.md)
 - [Features inside `.gitlab/`](features_inside_dot_gitlab.md)
+- [Dashboards for stage groups](stage_group_dashboards.md)
 ## Other GitLab Development Kit (GDK) guides
diff --git a/doc/development/adding_database_indexes.md b/doc/development/adding_database_indexes.md
index 0991c4740cc..01904d37883 100644
--- a/doc/development/adding_database_indexes.md
+++ b/doc/development/adding_database_indexes.md
@@ -195,3 +195,34 @@ Without an explicit name argument, Rails can return a false positive
 for `index_exists?`, causing a required index to not be created properly. By always requiring a name for certain types of indexes, the chance of error is greatly reduced.
+
+## Temporary indexes
+
+There may be times when an index is only needed temporarily.
+
+For example, in a migration, a column of a table might be conditionally
+updated.
To query which columns need to be updated within the
+[query performance guidelines](query_performance.md), an index is needed that would otherwise
+not be used.
+
+In these cases, a temporary index should be considered. To specify a
+temporary index:
+
+1. Prefix the index name with `tmp_` and follow the [naming conventions](database/constraint_naming_convention.md) and [requirements for naming indexes](#requirements-for-naming-indexes) for the rest of the name.
+1. Create a follow-up issue to remove the index in the next (or future) milestone.
+1. Add a comment in the migration mentioning the removal issue.
+
+A temporary migration would look like:
+
+```ruby
+INDEX_NAME = 'tmp_index_projects_on_owner_where_emails_disabled'
+
+def up
+  # Temporary index to be removed in 13.9 https://gitlab.com/gitlab-org/gitlab/-/issues/1234
+  add_concurrent_index :projects, :creator_id, where: 'emails_disabled = false', name: INDEX_NAME
+end
+
+def down
+  remove_concurrent_index_by_name :projects, INDEX_NAME
+end
+```
diff --git a/doc/development/agent/identity.md b/doc/development/agent/identity.md
index 65de1a6f0c8..884ce015a02 100644
--- a/doc/development/agent/identity.md
+++ b/doc/development/agent/identity.md
@@ -37,9 +37,9 @@ has a different configuration. Some may enable features A and B, and some
 may enable features B and C. This flexibility enables different groups of people to use different features of the agent in the same cluster.
-For example, [Priyanka (Platform Engineer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#priyanka-platform-engineer)
+For example, [Priyanka (Platform Engineer)](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#priyanka-platform-engineer)
 may want to use cluster-wide features of the agent, while
-[Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/product-marketing/roles-personas/#sasha-software-developer)
+[Sasha (Software Developer)](https://about.gitlab.com/handbook/marketing/strategic-marketing/roles-personas/#sasha-software-developer)
 uses the agent that only has access to a particular namespace.
 Each agent is likely running using a
diff --git a/doc/development/agent/local.md b/doc/development/agent/local.md
index 75d45366238..47246a6a6d3 100644
--- a/doc/development/agent/local.md
+++ b/doc/development/agent/local.md
@@ -38,7 +38,7 @@ You can run `kas` and `agentk` locally to test the [Kubernetes Agent](index.md)
 gdk start
 # Stop GDK's version of kas
 gdk stop gitlab-k8s-agent
-
+
 # Start kas
 bazel run //cmd/kas -- --configuration-file="$(pwd)/cfg.yaml"
 ```
@@ -56,3 +56,45 @@ for more targets.
 <i class="fa fa-youtube-play youtube" aria-hidden="true"></i> To learn more about how the repository is structured, see [GitLab Kubernetes Agent repository overview](https://www.youtube.com/watch?v=j8CyaCWroUY).
+
+## Run tests locally
+
+You can run all tests, or a subset of tests, locally.
+
+- **To run all tests**: Run the command `make test`.
+- **To run all test targets in the directory**: Run the command
+  `bazel test //internal/module/gitops/server:all`.
+
+  You can use `*` in the command, instead of `all`, but it must be quoted to
+  avoid shell expansion: `bazel test '//internal/module/gitops/server:*'`.
+- **To run all tests in a directory and its subdirectories**: Run the command
+  `bazel test //internal/module/gitops/server/...`.
+
+### Run specific test scenarios
+
+To run only a specific test scenario, you need the directory name and the target
+name of the test. For example, to run the tests at
+`internal/module/gitops/server/module_test.go`, the `BUILD.bazel` file that
+defines the test's target name lives at `internal/module/gitops/server/BUILD.bazel`.
+In the latter, the target name is defined like:
+
+```bazel
+go_test(
+    name = "server_test",
+    size = "small",
+    srcs = [
+        "module_test.go",
+```
+
+The target name is `server_test` and the directory is `internal/module/gitops/server/`.
+Run the test scenario with this command:
+
+```shell
+bazel test //internal/module/gitops/server:server_test
+```
+
+### Additional resources
+
+- Bazel documentation about [specifying targets to build](https://docs.bazel.build/versions/master/guide.html#specifying-targets-to-build).
+- [The Bazel query](https://docs.bazel.build/versions/master/query.html)
+- [Bazel query how to](https://docs.bazel.build/versions/master/query-how-to.html)
diff --git a/doc/development/api_graphql_styleguide.md b/doc/development/api_graphql_styleguide.md
index 832a89ecac1..d73c3a8d6f6 100644
--- a/doc/development/api_graphql_styleguide.md
+++ b/doc/development/api_graphql_styleguide.md
@@ -760,7 +760,7 @@ To limit the amount of queries performed, we can use [BatchLoader](graphql_guide
 ### Writing resolvers
-Our code should aim to be thin declarative wrappers around finders and services. You can
+Our code should aim to be thin declarative wrappers around finders and [services](../development/reusing_abstractions.md#service-classes). You can
 repeat lists of arguments, or extract them to concerns. Composition is preferred over inheritance in most cases. Treat resolvers like controllers: resolvers should be a DSL that compose other application abstractions.
@@ -1256,6 +1256,10 @@ single mutation when multiple are performed within a single request.
 ### The `resolve` method
+Similar to [writing resolvers](#writing-resolvers), the `resolve` method of a mutation
+should aim to be a thin declarative wrapper around a
+[service](../development/reusing_abstractions.md#service-classes).
+
 The `resolve` method receives the mutation's arguments as keyword arguments. From here, we can call the service that modifies the resource.
@@ -1352,6 +1356,7 @@ Key points:
 - Errors may be reported to users either at `$root.errors` (top-level error) or at `$root.data.mutationName.errors` (mutation errors). The location depends on what kind of error this is, and what information it holds.
+- Mutation fields [must have `null: true`](https://graphql-ruby.org/mutations/mutation_errors#nullable-mutation-payload-fields)
 Consider an example mutation `doTheThing` that returns a response with two fields: `errors: [String]`, and `thing: ThingType`. The specific nature of
diff --git a/doc/development/changelog.md b/doc/development/changelog.md
index 894ae5a1893..8fad32ed163 100644
--- a/doc/development/changelog.md
+++ b/doc/development/changelog.md
@@ -47,7 +47,7 @@ the `author` field. GitLab team members **should not**.
 - Any user-facing change **must** have a changelog entry. This includes both visual changes (regardless of how minor), and changes to the rendered DOM which impact how a screen reader may announce the content.
 - Any client-facing change to our REST and GraphQL APIs **must** have a changelog entry.
 - Performance improvements **should** have a changelog entry.
-- Changes that need to be documented in the Product Analytics [Event Dictionary](https://about.gitlab.com/handbook/product/product-analytics-guide/#event-dictionary)
+- Changes that need to be documented in the Product Intelligence [Event Dictionary](https://about.gitlab.com/handbook/product/product-intelligence-guide/#event-dictionary)
   also require a changelog entry.
 - _Any_ contribution from a community member, no matter how small, **may** have a changelog entry regardless of these guidelines if the contributor wants one.
@@ -55,7 +55,7 @@ the `author` field. GitLab team members **should not**.
 - Any docs-only changes **should not** have a changelog entry.
 - Any change behind a disabled feature flag **should not** have a changelog entry.
 - Any change behind an enabled feature flag **should** have a changelog entry.
-- Any change that adds new usage data metrics and changes that needs to be documented in Product Analytics [Event Dictionary](https://about.gitlab.com/handbook/product/product-analytics-guide/#event-dictionary) **should** have a changelog entry.
+- Any change that adds new usage data metrics and changes that needs to be documented in Product Intelligence [Event Dictionary](https://about.gitlab.com/handbook/product/product-intelligence-guide/#event-dictionary) **should** have a changelog entry.
 - A change that adds snowplow events **should** have a changelog entry -
 - A change that [removes a feature flag](feature_flags/development.md) **should** have a changelog entry - only if the feature flag did not default to true already.
diff --git a/doc/development/cicd/templates.md b/doc/development/cicd/templates.md
index 1ab569ba0df..94b03634e25 100644
--- a/doc/development/cicd/templates.md
+++ b/doc/development/cicd/templates.md
@@ -34,7 +34,7 @@ Also, all templates must be named with the `*.gitlab-ci.yml` suffix.
 ### Backward compatibility
 A template might be dynamically included with the `include:template:` keyword. If
-you make a change to an *existing* template, you **must** make sure that it won't break
+you make a change to an *existing* template, you **must** make sure that it doesn't break
 CI/CD in existing projects. For example, changing a job name in a template could break pipelines in an existing project.
@@ -59,12 +59,20 @@ performance:
 ```
 If the job name `performance` in the template is renamed to `browser-performance`,
-user's `.gitlab-ci.yml` will immediately cause a lint error because there
+the user's `.gitlab-ci.yml` immediately causes a lint error because there
 are no such jobs named `performance` in the included template anymore. Therefore, users have to fix their `.gitlab-ci.yml` that could annoy their workflow.
 Please read [versioning](#versioning) section for introducing breaking change safely.
+### Best practices
+
+- Avoid using [global keywords](../../ci/yaml/README.md#global-keywords),
+  such as `image`, `stages` and `variables` at top-level.
+  When a root `.gitlab-ci.yml` [includes](../../ci/yaml/README.md#include)
+  multiple templates, these global keywords could be overridden by the
+  others and cause an unexpected behavior.
+
 ## Versioning
 Versioning allows you to introduce a new template without modifying the existing
@@ -103,7 +111,7 @@ If the `latest` template does not exist yet, you can copy [the stable template](
 Users may want to use an older [stable template](#stable-version) that is not bundled in the current GitLab package. For example, the stable templates in GitLab v13.0 and
-GitLab v14.0 could be so different that a user will want to continue using the v13.0 template even
+GitLab v14.0 could be so different that a user wants to continue using the v13.0 template even
 after upgrading to GitLab 14.0.
 You can add a note in the template or in documentation explaining how to use `include:remote`
@@ -152,7 +160,7 @@ When you add a template into one of those directories, make sure that it correct
 ### Write an RSpec test
-You should write an RSpec test to make sure that pipeline jobs will be generated correctly:
+You should write an RSpec test to make sure that pipeline jobs are generated correctly:
 1. Add a test file at `spec/lib/gitlab/ci/templates/<template-category>/<template-name>_spec.rb`
 1.
Test that pipeline jobs are properly created via `Ci::CreatePipelineService`.
@@ -163,10 +171,10 @@ When you introduce a breaking change to [a `latest` template](#latest-version),
 you must:
 1. Test the upgrade path from [the stable template](#stable-version).
-1. Verify what kind of errors users will encounter.
+1. Verify what kind of errors users encounter.
 1. Document it as a troubleshooting guide.
-This information will be important for users when [a stable template](#stable-version)
+This information is important for users when [a stable template](#stable-version)
 is updated in a major version GitLab release.
 ## Security
diff --git a/doc/development/code_review.md b/doc/development/code_review.md
index 00f4cf90481..fe395dc2304 100644
--- a/doc/development/code_review.md
+++ b/doc/development/code_review.md
@@ -24,6 +24,7 @@ uncovered edge cases.
 The default approach is to choose a reviewer from your group or team for the first review. This is only a recommendation and the reviewer may be from a different team. However, it is recommended to pick someone who is a [domain expert](#domain-experts).
+If your merge request touches more than one domain (for example, Dynamic Analysis and GraphQL), ask for reviews from an expert from each domain.
 You can read more about the importance of involving reviewer(s) in the section on the responsibility of the author below.
@@ -69,14 +70,17 @@ It picks reviewers and maintainers from the list at the
 [engineering projects](https://about.gitlab.com/handbook/engineering/projects/)
 page, with these behaviors:
-1. It doesn't pick people whose [GitLab status](../user/profile/index.md#current-status)
-   contains the string 'OOO', or the emoji is `:palm_tree:` or `:beach:`.
+1.
It doesn't pick people whose Slack or [GitLab status](../user/profile/index.md#current-status):
+   - contains the string 'OOO', 'PTO', 'Parental Leave', or 'Friends and Family'
+   - emoji is `:palm_tree:`, `:beach:`, `:beach_umbrella:`, `:beach_with_umbrella:`, `:ferris_wheel:`, `:thermometer:`, `:face_with_thermometer:`, `:red_circle:`, `:bulb:`, `:sun_with_face:`.
 1. [Trainee maintainers](https://about.gitlab.com/handbook/engineering/workflow/code-review/#trainee-maintainer) are three times as likely to be picked as other reviewers.
-1. People whose [GitLab status](../user/profile/index.md#current-status) emoji
-   is `:large_blue_circle:` are more likely to be picked. This applies to both reviewers and trainee maintainers.
+1. Team members whose Slack or [GitLab status](../user/profile/index.md#current-status) emoji
+   is 🔵 `:large_blue_circle:` are more likely to be picked. This applies to both reviewers and trainee maintainers.
    - Reviewers with `:large_blue_circle:` are two times as likely to be picked as other reviewers.
    - Trainee maintainers with `:large_blue_circle:` are four times as likely to be picked as other reviewers.
+1. People whose [GitLab status](../user/profile/index.md#current-status) emoji
+   is 🔶 `:large_orange_diamond:` are half as likely to be picked. This applies to both reviewers and trainee maintainers.
 1. It always picks the same reviewers and maintainers for the same branch name (unless their OOO status changes, as in point 1). It removes leading `ce-` and `ee-`, and trailing `-ce` and `-ee`, so
@@ -116,7 +120,7 @@ with [domain expertise](#domain-experts).
 by a [Software Engineer in Test](https://about.gitlab.com/handbook/engineering/quality/#individual-contributors)**.
 1.
If your merge request only includes end-to-end changes (*3*) **or** if the MR author is a [Software Engineer in Test](https://about.gitlab.com/handbook/engineering/quality/#individual-contributors), it must be **approved by a [Quality maintainer](https://about.gitlab.com/handbook/engineering/projects/#gitlab_maintainers_qa)**
 1. If your merge request includes a new or updated [application limit](https://about.gitlab.com/handbook/product/product-processes/#introducing-application-limits), it must be **approved by a [product manager](https://about.gitlab.com/company/team/)**.
-1. If your merge request includes Product Analytics (telemetry) changes, it should be reviewed and approved by a [Product analytics engineer](https://gitlab.com/gitlab-org/growth/product-analytics/engineers).
+1. If your merge request includes Product Intelligence (telemetry or analytics) changes, it should be reviewed and approved by a [Product Intelligence engineer](https://gitlab.com/gitlab-org/growth/product_intelligence/engineers).
 - (*1*): Please note that specs other than JavaScript specs are considered backend code.
 - (*2*): We encourage you to seek guidance from a database maintainer if your merge
@@ -336,6 +340,7 @@ experience, refactors the existing code). Then:
   convey your intent.
   - For non-mandatory suggestions, decorate with (non-blocking) so the author knows they can optionally resolve within the merge request or follow-up at a later stage.
+  - There's a [Chrome/Firefox addon](https://gitlab.com/conventionalcomments/conventional-comments-button) which you can use to apply [Conventional Comment](https://conventionalcomments.org/) prefixes.
 - After a round of line notes, it can be helpful to post a summary note such as "Looks good to me", or "Just a couple things to address."
 - Assign the merge request to the author if changes are required following your
@@ -505,7 +510,7 @@ and get on with their work quickly.
 If you think you are at capacity and are unable to accept any more reviews until some have been completed, communicate this through your GitLab status by setting
-the `:red_circle:` emoji and mentioning that you are at capacity in the status
+the 🔴 `:red_circle:` emoji and mentioning that you are at capacity in the status
 text. This guides contributors to pick a different reviewer, helping us to meet the SLO.
diff --git a/doc/development/contributing/style_guides.md b/doc/development/contributing/style_guides.md
index bfaee407cb8..c316d50c88c 100644
--- a/doc/development/contributing/style_guides.md
+++ b/doc/development/contributing/style_guides.md
@@ -112,14 +112,14 @@ the `.rubocop_todo.yml`. This also allows us greater visibility into the excepti
 which are currently being resolved.
 One way to generate the initial list is to run the todo auto generation,
-with `exclude limit` set to a high number.
+with `exclude limit` set to a high number.
 ```shell
 bundle exec rubocop --auto-gen-config --auto-gen-only-exclude --exclude-limit=10000
 ```
-You can then move the list from the freshly generated `.rubocop_todo.yml` for the Cop being actively
-resolved and place it in the `.rubocop_manual_todo.yml`. In this scenario, do not commit auto generated
+You can then move the list from the freshly generated `.rubocop_todo.yml` for the Cop being actively
+resolved and place it in the `.rubocop_manual_todo.yml`. In this scenario, do not commit auto generated
 changes to the `.rubocop_todo.yml` as an `exclude limit` that is higher than 15 will make the `.rubocop_todo.yml` hard to parse.
diff --git a/doc/development/database/strings_and_the_text_data_type.md b/doc/development/database/strings_and_the_text_data_type.md
index 8b839e929c7..33a0fd2ebb7 100644
--- a/doc/development/database/strings_and_the_text_data_type.md
+++ b/doc/development/database/strings_and_the_text_data_type.md
@@ -11,11 +11,11 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 When adding new columns that will be used to store strings or other textual information:
 1. We always use the `text` data type instead of the `string` data type.
-1. `text` columns should always have a limit set by using the `add_text_limit` migration helper.
+1. `text` columns should always have a limit set, either by using the `create_table_with_constraints` helper
+when creating a table, or by using the `add_text_limit` when altering an existing table.
-The `text` data type can not be defined with a limit, so `add_text_limit` is enforcing that by
-adding a [check constraint](https://www.postgresql.org/docs/11/ddl-constraints.html) on the
-column and then validating it at a followup step.
+The `text` data type can not be defined with a limit, so `create_table_with_constraints` and `add_text_limit` enforce
+that by adding a [check constraint](https://www.postgresql.org/docs/11/ddl-constraints.html) on the column.
 ## Background information
@@ -48,20 +48,15 @@ class CreateDbGuides < ActiveRecord::Migration[6.0]
   DOWNTIME = false
-  disable_ddl_transaction!
-
   def up
-    unless table_exists?(:db_guides)
-      create_table :db_guides do |t|
-        t.bigint :stars, default: 0, null: false
-        t.text :title
-        t.text :notes
-      end
-    end
+    create_table_with_constraints :db_guides do |t|
+      t.bigint :stars, default: 0, null: false
+      t.text :title
+      t.text :notes
-    # The following add the constraints and validate them immediately (no data in the table)
-    add_text_limit :db_guides, :title, 128
-    add_text_limit :db_guides, :notes, 1024
+      t.text_limit :title, 128
+      t.text_limit :notes, 1024
+    end
   end
   def down
@@ -71,12 +66,8 @@ class CreateDbGuides < ActiveRecord::Migration[6.0]
 end
 ```
-Adding a check constraint requires an exclusive lock while the `ALTER TABLE` that adds is running.
-As we don't want the exclusive lock to be held for the duration of a transaction, `add_text_limit`
-must always run in a migration with `disable_ddl_transaction!`.
-
-Also, note that we have to add a check that the table exists so that the migration can be repeated
-in case of a failure.
+Note that the `create_table_with_constraints` helper uses the `with_lock_retries` helper
+internally, so we don't need to manually wrap the method call in the migration.
 ## Add a text column to an existing table
diff --git a/doc/development/database_review.md b/doc/development/database_review.md
index f0c265df9ab..da2c93cc1fd 100644
--- a/doc/development/database_review.md
+++ b/doc/development/database_review.md
@@ -25,9 +25,9 @@ A database review is required for:
   generally up to the author of a merge request to decide whether or not complex queries are being introduced and if they require a database review.
-- Changes in usage data metrics that use `count` and `distinct_count`.
+- Changes in usage data metrics that use `count`, `distinct_count` and `estimate_batch_distinct_count`.
   These metrics could have complex queries over large tables.
-  See the [Product Analytics Guide](https://about.gitlab.com/handbook/product/product-analytics-guide/)
+  See the [Product Intelligence Guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/)
   for implementation details.
 A database reviewer is expected to look out for obviously complex
diff --git a/doc/development/documentation/feature_flags.md b/doc/development/documentation/feature_flags.md
index 59298c5345f..7547ec59fb2 100644
--- a/doc/development/documentation/feature_flags.md
+++ b/doc/development/documentation/feature_flags.md
@@ -37,12 +37,10 @@ therefore, it indicates that it cannot be done by regular users of GitLab.com.
 ### Features disabled by default
-For features disabled by default, if they cannot be used yet due to lack of
-completeness, or if they're still under internal evaluation (for example, for
-performance implications) do **not document them**: add (or merge) the docs
-only when the feature is safe and ready to use and test by end-users.
+For features disabled by default, add or improve the docs with every change in line with the
+[definition of done](../contributing/merge_request_workflow.md#definition-of-done).
-For feature flags disabled by default, if they can be used by end users:
+Include details of the feature flag in the documentation:
 - Say that it's disabled by default.
 - Say whether it's enabled on GitLab.com.
diff --git a/doc/development/documentation/index.md b/doc/development/documentation/index.md
index 5fb5e9b433a..55f5d43b175 100644
--- a/doc/development/documentation/index.md
+++ b/doc/development/documentation/index.md
@@ -173,7 +173,7 @@ There are two types of redirects:
 - Redirect files added into the docs themselves, for users who view the docs in `/help` on self-managed instances. For example, [`/help` on GitLab.com](https://gitlab.com/help).
 - Redirects in a [`_redirects`](../../user/project/pages/redirects.md) file, for users
-  who view the docs on <http://docs.gitlab.com>.
+  who view the docs on <https://docs.gitlab.com>.
 To add a redirect:
@@ -201,6 +201,9 @@ To add a redirect:
    1. If the document being moved has any Disqus comments on it, follow the steps described in [Redirections for pages with Disqus comments](#redirections-for-pages-with-disqus-comments).
+   1. If a documentation page you're removing includes images that aren't used
+      with any other documentation pages, be sure to use your MR to delete
+      those images from the repository.
 1. Assign the MR to a technical writer for review and merge.
 1. If the redirect is to one of the 4 internal docs projects (not an external URL), create an MR in [`gitlab-docs`](https://gitlab.com/gitlab-org/gitlab-docs):
@@ -366,6 +369,19 @@ You can combine one or more of the following:
 = link_to 'Help page', help_page_path('user/permissions')
 ```
+#### Linking to `/help` in JavaScript
+
+To link to the documentation from a JavaScript or a Vue component, use the `helpPagePath` function from [`help_page_helper.js`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/assets/javascripts/helpers/help_page_helper.js):
+
+```javascript
+import { helpPagePath } from '~/helpers/help_page_helper';
+
+helpPagePath('user/permissions', { anchor: 'anchor-link' })
+// evaluates to '/help/user/permissions#anchor-link' for GitLab.com
+```
+
+This is preferred over static paths, as the helper also works on instances installed under a [relative URL](../../install/relative_url.md).
+
 ### GitLab `/help` tests
 Several [RSpec tests](https://gitlab.com/gitlab-org/gitlab/blob/master/spec/features/help_pages_spec.rb)
diff --git a/doc/development/documentation/styleguide/index.md b/doc/development/documentation/styleguide/index.md
index 971652f76d3..bba94c7de7e 100644
--- a/doc/development/documentation/styleguide/index.md
+++ b/doc/development/documentation/styleguide/index.md
@@ -22,7 +22,7 @@ You can also view a list of [recent updates to this guide](https://gitlab.com/da
 If you can't find what you need:
 - View the GitLab Handbook for [writing style guidelines](https://about.gitlab.com/handbook/communication/#writing-style-guidelines) that apply to all GitLab content.
-- Refer to one of the following:
+- Refer to:
   - [Microsoft Style Guide](https://docs.microsoft.com/en-us/style-guide/welcome/).
   - [Google Developer Documentation Style Guide](https://developers.google.com/style).
@@ -161,7 +161,7 @@ Markdown rendering engine. For a complete Kramdown reference, see the
 The [`gitlab-kramdown`](https://gitlab.com/gitlab-org/gitlab_kramdown) Ruby gem plans to support all [GitLab Flavored Markdown](../../../user/markdown.md) in the future, which is all Markdown supported for display in the GitLab application itself. For now, use
-regular Markdown, following the rules in the linked style guide.
+regular Markdown and follow the rules in the linked style guide.
 Kramdown-specific markup (for example, `{:.class}`) doesn't render properly on GitLab instances under [`/help`](../index.md#gitlab-help).
@@ -207,9 +207,9 @@ Some examples fail if incorrect capitalization is used:
 Additionally, commands, parameters, values, filenames, and so on must be included in backticks. For example:
-- "Change the `needs` keyword in your `.gitlab.yml`..."
-  - `needs` is a parameter, and `.gitlab.yml` is a file, so both need backticks.
-    Additionally, `.gitlab.yml` without backticks fails markdownlint because it
+- "Change the `needs` keyword in your `.gitlab-ci.yml`..."
+  - `needs` is a parameter, and `.gitlab-ci.yml` is a file, so both need backticks.
+    Additionally, `.gitlab-ci.yml` without backticks fails markdownlint because it
     does not have capital G or L.
 - "Run `git clone` to clone a Git repository..."
   - `git clone` is a command, so it must be lowercase, while Git is the product,
@@ -252,7 +252,7 @@ Put files for a specific product area into the related folder:
 ### Work with directories and files
-Refer to the following items when working with directories and files:
+When working with directories and files:
 1. When you create a new directory, always start with an `index.md` file. Don't use another filename and _do not_ create `README.md` files.
@@ -332,7 +332,7 @@ GitLab documentation should be clear and easy to understand.
 ### Trademark
 Only use the GitLab name and trademarks in accordance with
-[GitLab Brand Guidelines](https://about.gitlab.com/handbook/marketing/inbound-marketing/digital-experience/brand-guidelines/#trademark).
+[GitLab Brand Guidelines](https://about.gitlab.com/handbook/marketing/corporate-marketing/brand-activation/brand-guidelines/#trademark).
 Don't use the possessive form of the word GitLab (`GitLab's`).
@@ -412,7 +412,7 @@ references to user interface elements. For example:
 ### Inclusive language
 We strive to create documentation that's inclusive. This section includes
-guidance and examples for the following categories:
+guidance and examples for these categories:
 - [Gender-specific wording](#avoid-gender-specific-wording). (Tested in [`InclusionGender.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/InclusionGender.yml).)
@@ -481,7 +481,7 @@ more precise and functional, such as `primary` and `secondary`.
 <!-- vale gitlab.InclusionCultural = YES -->
-For more information see the following [Internet Draft specification](https://tools.ietf.org/html/draft-knodel-terminology-02).
+For more information see the [Internet Draft specification](https://tools.ietf.org/html/draft-knodel-terminology-02).
 ### Fake user information
@@ -499,7 +499,8 @@ addresses and names, do use:
 When including sample URLs in the documentation, use:
 - `example.com` when the domain name is generic.
-- `gitlab.example.com` when referring to self-managed instances of GitLab.
+- `gitlab.example.com` when referring only to self-managed GitLab instances.
+  Use `gitlab.com` for GitLab SaaS instances.
 ### Fake tokens
@@ -507,12 +508,11 @@ There may be times where a token is needed to demonstrate an API call using cURL
 or a variable used in CI. It is strongly advised not to use real tokens in documentation even if the probability of a token being exploited is low.
-You can use the following fake tokens as examples:
+You can use these fake tokens as examples:
 | Token type | Token value |
 |:----------------------|:-------------------------------------------------------------------|
-| Private user token | `<your_access_token>` |
-| Personal access token | `n671WNGecHugsdEDPsyo` |
+| Personal access token | `<your_access_token>` |
 | Application ID | `2fcb195768c39e9a94cec2c2e32c59c0aad7a3365c10892e8116b5d83d4096b6` |
 | Application secret | `04f294d1eaca42b8692017b426d53bbc8fe75f827734f0260710b83a556082df` |
 | CI/CD variable | `Li8j-mLUVA3eZYjPfd_H` |
@@ -526,11 +526,14 @@ You can use the following fake tokens as examples:
 ### Usage list
 <!-- vale off -->
-| Usage | Guidance |
-|-----------------------|-----|
-| admin, admin area | Use **administration**, **administrator**, **administer**, or **Admin Area** instead. |.
+| Usage | Guidance |
+|-----------------------|----------|
+| above | Try to avoid extra words when referring to an example or table in a documentation page, but if required, use **previously** instead. |
+| admin, admin area | Use **administration**, **administrator**, **administer**, or **Admin Area** instead.
([Vale](../testing.md#vale) rule: [`Admin.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/Admin.yml)) | +| allow, enable | Try to avoid, unless you are talking about security-related features. For example, instead of "This feature allows you to create a pipeline," use "Use this feature to create a pipeline." This phrasing is more active and is from the user perspective, rather than the person who implemented the feature. [View details](https://docs.microsoft.com/en-us/style-guide/a-z-word-list-term-collections/a/allow-allows). | | and/or | Use **or** instead, or another sensible construction. | -| currently | Do not use when talking about the product or its features. The documentation describes the product as it is today. | +| below | Try to avoid extra words when referring to an example or table in a documentation page, but if required, use **following** instead. | +| currently | Do not use when talking about the product or its features. The documentation describes the product as it is today. ([Vale](../testing.md#vale) rule: [`CurrentStatus.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/CurrentStatus.yml)) | | easily | Do not use. If the user doesn't find the process to be these things, we lose their trust. | | e.g. | Do not use Latin abbreviations. Use **for example**, **such as**, **for instance**, or **like** instead. ([Vale](../testing.md#vale) rule: [`LatinTerms.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/LatinTerms.yml)) | | future tense | When possible, use present tense instead. For example, use `after you execute this command, GitLab displays the result` instead of `after you execute this command, GitLab will display the result`. 
([Vale](../testing.md#vale) rule: [`FutureTense.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/doc/.vale/gitlab/FutureTense.yml)) | @@ -865,7 +868,7 @@ Consider installing a plugin or extension in your editor for formatting tables: When creating tables of lists of features (such the features available to each role on the [Permissions](../../../user/permissions.md#project-members-permissions) -page), use the following phrases: +page), use these phrases: | Option | Markdown | Displayed result | |--------|--------------------------|------------------------| @@ -967,7 +970,7 @@ Links are important in GitLab documentation. They allow you to [link instead of summarizing](#link-instead-of-summarize) to help preserve a [single source of truth](#why-a-single-source-of-truth) in GitLab documentation. -We include guidance for links in the following categories: +We include guidance for links in these categories: - How to set up [anchor links](#anchor-links) for headings. - How to set up [criteria](#basic-link-criteria) for configuring a link. @@ -1137,14 +1140,14 @@ When documenting navigation through the user interface: - Use the exact wording as shown in the UI, including any capital letters as-is. - Use bold text for navigation items and the char "greater than" (`>`) as a - separator. For example: `Navigate to your project's **Settings > CI/CD**`. + separator. For example: `From your project, go to **Settings > CI/CD**`. - If there are any expandable menus, make sure to mention that the user needs to expand the tab to find the settings you're referring to. For example: - `Navigate to your project's **Settings > CI/CD** and expand **General pipelines**`. + `From your group, go to **Settings > CI/CD** and expand **General pipelines**`. 
 
 ### Navigational elements
 
-Use the following terms when referring to the main GitLab user interface
+Use these terms when referring to the main GitLab user interface
 elements:
 
 - **Top menu**: This is the top menu that spans the width of the user interface.
@@ -1183,7 +1186,7 @@ When you take screenshots:
 
 - Save the image with a lowercase filename that's descriptive of the feature
   or concept in the image. If the image is of the GitLab interface, append the
-  GitLab version to the filename, based on the following format:
+  GitLab version to the filename, based on this format:
   `image_name_vX_Y.png`. For example, for a screenshot taken from the pipelines
   page of GitLab 11.1, a valid name is `pipelines_v11_1.png`. If you're adding an
   illustration that doesn't include parts of the user interface, add the release
@@ -1365,7 +1368,7 @@ hidden on the documentation site, but is displayed by `/help`.
 <!-- vale on -->
 
 Syntax highlighting is required for fenced code blocks added to the GitLab
-documentation. Refer to the following table for the most common language classes,
+documentation. Refer to this table for the most common language classes,
 or check the [complete list](https://github.com/rouge-ruby/rouge/wiki/List-of-supported-languages-and-lexers)
 of available language classes:
 
@@ -1433,15 +1436,15 @@ Usage examples:
 Icons should be used sparingly, and only in ways that aid and do not hinder the
 readability of the text.
 
-For example, the following adds little to the accompanying text:
+For example, this Markdown adds little to the accompanying text:
 
 ```markdown
-1. Go to **{home}** **Project overview > Details**
+1. Go to **{home}** **Project overview > Details**.
 ```
 
-1. Go to **{home}** **Project overview > Details**
+1. Go to **{home}** **Project overview > Details**.
 
-However, the following might help the reader connect the text to the user
+However, these tables might help the reader connect the text to the user
 interface:
 
 ```markdown
@@ -1555,14 +1558,12 @@ It renders on the GitLab documentation site as:
 
 ## Terms
 
-To maintain consistency through GitLab documentation, the following guides
-documentation authors on agreed styles and usage of terms.
+To maintain consistency through GitLab documentation, use these styles and terms.
 
 ### Merge requests (MRs)
 
 Merge requests allow you to exchange changes you made to source code and
-collaborate with other people on the same project. This term is used in
-the following ways:
+collaborate with other people on the same project.
 
 - Use lowercase _merge requests_ regardless of whether referring to the feature
   or individual merge requests.
@@ -1580,7 +1581,7 @@ Examples:
 
 ### Describe UI elements
 
-The following are styles to follow when describing user interface elements in an
+Follow these styles when you're describing user interface elements in an
 application:
 
 - For elements with a visible label, use that label in bold with matching case.
@@ -1590,7 +1591,7 @@ application:
 
 ### Verbs for UI elements
 
-The following are recommended verbs for specific uses with user interface
+Use these verbs for specific uses with user interface
 elements:
 
 | Recommended | Used for | Replaces |
@@ -1637,7 +1638,7 @@ displayed for the page or feature.
 
 #### Version text in the **Version History**
 
-If all content in a section is related, add version text following the header
+If all content in a section is related, add version text after the header
 for the section. The version information must be surrounded by blank lines, and
 each entry should be on its own line.
 
@@ -1670,8 +1671,8 @@ the blockquote to use a bulleted list:
 If a feature is moved to another tier:
 
 ```markdown
-> - [Moved](<link-to-issue>) from [GitLab Premium](https://about.gitlab.com/pricing/) to [GitLab Starter](https://about.gitlab.com/pricing/) in 11.8.
-> - [Moved](<link-to-issue>) from [GitLab Starter](https://about.gitlab.com/pricing/) to GitLab Core in 12.0.
+> - [Moved](<link-to-issue>) from GitLab Premium to GitLab Starter in 11.8.
+> - [Moved](<link-to-issue>) from GitLab Starter to GitLab Core in 12.0.
 ```
 
 If a feature is deprecated, include a link to a replacement (when available):
@@ -1709,7 +1710,7 @@ voters to agree.
 #### End-of-life for features or products
 
 When a feature or product enters its end-of-life, indicate its status by
-creating a [warning alert](#alert-boxes) directly following its relevant header.
+creating a [warning alert](#alert-boxes) directly after its relevant header.
 If possible, link to its deprecation and removal issues.
 
 For example:
diff --git a/doc/development/documentation/testing.md b/doc/development/documentation/testing.md
index d2e3f473532..561727648f0 100644
--- a/doc/development/documentation/testing.md
+++ b/doc/development/documentation/testing.md
@@ -183,7 +183,8 @@ Vale configuration is found in the following projects:
 - [`charts`](https://gitlab.com/gitlab-org/charts/gitlab/-/tree/master/doc/.vale/gitlab)
 - [`gitlab-development-kit`](https://gitlab.com/gitlab-org/gitlab-development-kit/-/tree/master/doc/.vale/gitlab)
 
-This configuration is also used within build pipelines.
+This configuration is also used within build pipelines, where
+[error-level rules](#vale-result-types) are enforced.
 
 You can use Vale:
 
@@ -197,14 +198,17 @@ You can use Vale:
 Vale returns three types of results: `suggestion`, `warning`, and `error`:
 
 - **Suggestion**-level results are writing tips and aren't displayed in CI
-  job output. Suggestions don't break CI.
+  job output. Suggestions don't break CI. See a list of
+  [suggestion-level rules](https://gitlab.com/search?utf8=✓&snippets=false&scope=&repository_ref=master&search=path%3Adoc%2F.vale%2Fgitlab+Suggestion%3A&group_id=9970&project_id=278964).
 - **Warning**-level results are [Style Guide](styleguide/index.md) violations, aren't displayed in CI
   job output, and should contain clear explanations of how to resolve the warning.
   Warnings may be technical debt, or can be future error-level test items
-  (after the Technical Writing team completes its cleanup). Warnings don't break CI.
+  (after the Technical Writing team completes its cleanup). Warnings don't break CI. See a list of
+  [warning-level rules](https://gitlab.com/search?utf8=✓&snippets=false&scope=&repository_ref=master&search=path%3Adoc%2F.vale%2Fgitlab+Warning%3A&group_id=9970&project_id=278964).
 - **Error**-level results are Style Guide violations, and should contain clear explanations
   about how to resolve the error. Errors break CI and are displayed in CI job output.
-  of how to resolve the error. Errors break CI and are displayed in CI job output.
+  of how to resolve the error. Errors break CI and are displayed in CI job output. See a list of
+  [error-level rules](https://gitlab.com/search?utf8=✓&snippets=false&scope=&repository_ref=master&search=path%3Adoc%2F.vale%2Fgitlab+Error%3A&group_id=9970&project_id=278964).
 
 ### Install linters
 
diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md
index 1c92601dde9..8bf8a5fccb8 100644
--- a/doc/development/elasticsearch.md
+++ b/doc/development/elasticsearch.md
@@ -216,6 +216,9 @@ cron worker sequentially.
 
 Any update to the Elastic index mappings should be replicated in [`Elastic::Latest::Config`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/elastic/latest/config.rb).
 
+Migrations can be built with a retry limit and have the ability to be [failed and marked as halted](https://gitlab.com/gitlab-org/gitlab/-/blob/66e899b6637372a4faf61cfd2f254cbdd2fb9f6d/ee/lib/elastic/migration.rb#L40).
+Any data or index cleanup needed to support migration retries should be handled within the migration.
+
 ### Migration options supported by the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb)
 
 - `batched!` - Allow the migration to run in batches. If set, the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb)
@@ -337,3 +340,48 @@ cluster.routing.allocation.disk.watermark.high: 10gb
 Restart Elasticsearch, and the `read_only_allow_delete` will clear on it's own.
 
 _from "Disk-based Shard Allocation | Elasticsearch Reference" [5.6](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/disk-allocator.html#disk-allocator) and [6.x](https://www.elastic.co/guide/en/elasticsearch/reference/6.7/disk-allocator.html)_
+
+### Disaster recovery/data loss/backups
+
+The use of Elasticsearch in GitLab is only ever as a secondary data store.
+This means that all of the data stored in Elasticsearch can always be derived
+again from other data sources, specifically PostgreSQL and Gitaly. Therefore if
+the Elasticsearch data store is ever corrupted for whatever reason you can
+simply reindex everything from scratch.
+
+If your Elasticsearch index is incredibly large it may be too time consuming or
+cause too much downtime to reindex from scratch. There aren't any built in
+mechanisms for automatically finding discrepencies and resyncing an
+Elasticsearch index if it gets out of sync but one tool that may be useful is
+looking at the logs for all the updates that occurred in a time range you
+believe may have been missed. This information is very low level and only
+useful for operators that are familiar with the GitLab codebase. It is
+documented here in case it is useful for others. The relevant logs that could
+theoretically be used to figure out what needs to be replayed are:
+
+1. All non-repository updates that were synced can be found in
+   [`elasticsearch.log`](../administration/logs.md#elasticsearchlog) by
+   searching for
+   [`track_items`](https://gitlab.com/gitlab-org/gitlab/-/blob/1e60ea99bd8110a97d8fc481e2f41cab14e63d31/ee/app/services/elastic/process_bookkeeping_service.rb#L25)
+   and these can be replayed by sending these items again through
+   `::Elastic::ProcessBookkeepingService.track!`
+1. All repository updates that occurred can be found in
+   [`elasticsearch.log`](../administration/logs.md#elasticsearchlog) by
+   searching for
+   [`indexing_commit_range`](https://gitlab.com/gitlab-org/gitlab/-/blob/6f9d75dd3898536b9ec2fb206e0bd677ab59bd6d/ee/lib/gitlab/elastic/indexer.rb#L41).
+   Replaying these requires resetting the
+   [`IndexStatus#last_commit/last_wiki_commit`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/models/index_status.rb)
+   to the oldest `from_sha` in the logs and then triggering another index of
+   the project using
+   [`ElasticCommitIndexerWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic_commit_indexer_worker.rb)
+1. All project deletes that occurred can be found in
+   [`sidekiq.log`](../administration/logs.md#sidekiqlog) by searching for
+   [`ElasticDeleteProjectWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic_delete_project_worker.rb).
+   These updates can be replayed by triggering another
+   `ElasticDeleteProjectWorker`.
+
+With the above methods and taking regular [Elasticsearch
+snapshots](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html)
+we should be able to recover from different kinds of data loss issues in a
+relatively short period of time compared to indexing everything from
+scratch.
diff --git a/doc/development/event_tracking/backend.md b/doc/development/event_tracking/backend.md
index 24e83ffc524..e8b8e0c4885 100644
--- a/doc/development/event_tracking/backend.md
+++ b/doc/development/event_tracking/backend.md
@@ -1,8 +1,8 @@
 ---
-redirect_to: '../product_analytics/index.md'
+redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/'
 ---
 
-This document was moved to [another location](../product_analytics/index.md).
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/).
 
-<!-- This redirect file can be deleted after February 1, 2021. -->
+<!-- This redirect file can be deleted after December 1, 2021. -->
 <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/development/event_tracking/frontend.md b/doc/development/event_tracking/frontend.md
index 24e83ffc524..e8b8e0c4885 100644
--- a/doc/development/event_tracking/frontend.md
+++ b/doc/development/event_tracking/frontend.md
@@ -1,8 +1,8 @@
 ---
-redirect_to: '../product_analytics/index.md'
+redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/'
 ---
 
-This document was moved to [another location](../product_analytics/index.md).
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/).
 
-<!-- This redirect file can be deleted after February 1, 2021. -->
+<!-- This redirect file can be deleted after December 1, 2021. -->
 <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/development/event_tracking/index.md b/doc/development/event_tracking/index.md
index 24e83ffc524..e8b8e0c4885 100644
--- a/doc/development/event_tracking/index.md
+++ b/doc/development/event_tracking/index.md
@@ -1,8 +1,8 @@
 ---
-redirect_to: '../product_analytics/index.md'
+redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/'
 ---
 
-This document was moved to [another location](../product_analytics/index.md).
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/).
 
-<!-- This redirect file can be deleted after February 1, 2021. -->
+<!-- This redirect file can be deleted after December 1, 2021. -->
 <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/development/experiment_guide/index.md b/doc/development/experiment_guide/index.md
index 35cd55b199c..a1899ab5f18 100644
--- a/doc/development/experiment_guide/index.md
+++ b/doc/development/experiment_guide/index.md
@@ -145,7 +145,7 @@ addressed.
 
 To determine whether the experiment is a success or not, we must implement tracking events
 to acquire data for analyzing. We can send events to Snowplow via either the backend or frontend.
-Read the [product analytics guide](https://about.gitlab.com/handbook/product/product-analytics-guide/) for more details.
+Read the [product intelligence guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/) for more details.
 
 #### Track backend events
 
@@ -281,13 +281,19 @@ Note that this data is completely separate from the [events tracking data](#impl
 
 #### Add context
 
-You can add arbitrary context data in a hash which gets stored as part of the experiment user record.
+You can add arbitrary context data in a hash which gets stored as part of the experiment user record. New calls to the `record_experiment_user` with newer contexts get merged deeply into the existing context.
+
 This data can then be used by data analytics dashboards.
 
 ```ruby
 before_action do
-  record_experiment_user(:signup_flow, foo: 42)
+  record_experiment_user(:signup_flow, foo: 42, bar: { a: 22})
+  # context is { "foo" => 42, "bar" => { "a" => 22 }}
 end
+
+# Additional contexts for newer record calls are merged deeply
+record_experiment_user(:signup_flow, foo: 40, bar: { b: 2 }, thor: 3)
+# context becomes { "foo" => 40, "bar" => { "a" => 22, "b" => 2 }, "thor" => 3}
 ```
 
 ### Record experiment conversion event
@@ -337,6 +343,27 @@ to the URL:
 https://gitlab.com/<EXPERIMENT_ENTRY_URL>?force_experiment=<EXPERIMENT_KEY>
 ```
 
+### A cookie-based approach to force an experiment
+
+It's possible to force the current user to be in the experiment group for `<EXPERIMENT_KEY>`
+during the browser session by using your browser's developer tools:
+
+```javascript
+document.cookie = "force_experiment=<EXPERIMENT_KEY>; path=/";
+```
+
+Use a comma to list more than one experiment to be forced:
+
+```javascript
+document.cookie = "force_experiment=<EXPERIMENT_KEY>,<ANOTHER_EXPERIMENT_KEY>; path=/";
+```
+
+Clear the experiments by unsetting the `force_experiment` cookie:
+
+```javascript
+document.cookie = "force_experiment=; path=/";
+```
+
 ### Testing and test helpers
 
 #### RSpec
diff --git a/doc/development/fe_guide/dependencies.md b/doc/development/fe_guide/dependencies.md
index 0ec10399ae0..b036819cde1 100644
--- a/doc/development/fe_guide/dependencies.md
+++ b/doc/development/fe_guide/dependencies.md
@@ -8,12 +8,7 @@ info: To determine the technical writer assigned to the Stage/Group associated w
 
 ## Package manager
 
-We use [Yarn](https://yarnpkg.com/) to manage frontend dependencies. There are a few exceptions:
-
-- [FontAwesome](https://fontawesome.com/), installed via the `font-awesome-rails` gem: we are working to replace it with
-  [GitLab SVGs](https://gitlab-org.gitlab.io/gitlab-svgs/) icons library.
-- [ACE](https://ace.c9.io/) editor, installed via the `ace-rails-ap` gem.
-- Other dependencies found under `vendor/assets/`.
+We use [Yarn](https://yarnpkg.com/) to manage frontend dependencies. There are a few exceptions, stored in `vendor/assets/`.
 
 ## Updating dependencies
 
diff --git a/doc/development/fe_guide/editor_lite.md b/doc/development/fe_guide/editor_lite.md
index 465d64ff63c..47ef85d8737 100644
--- a/doc/development/fe_guide/editor_lite.md
+++ b/doc/development/fe_guide/editor_lite.md
@@ -104,7 +104,14 @@ someActionFunction() {
 
 ## Extensions
 
-Editor Lite has been built to provide a universal, extensible editing tool to the whole product, which would not depend on any particular group. Even though the Editor Lite's core is owned by [Create::Editor FE Team](https://about.gitlab.com/handbook/engineering/development/dev/create-editor-fe/), the main functional elements — extensions — can be owned by any group. Editor Lite extensions' main idea is that the core of the editor remains very slim and stable. At the same time, whatever new functionality is needed can be added as an extension to this core, without touching the core itself. It allows any group to build and own any new editing functionality without being afraid of it being broken or overridden with the Editor Lite changes.
+Editor Lite has been built to provide a universal, extensible editing tool to the whole product,
+which would not depend on any particular group. Even though the Editor Lite's core is owned by
+[Create::Editor FE Team](https://about.gitlab.com/handbook/engineering/development/dev/create-editor/),
+the main functional elements — extensions — can be owned by any group. Editor Lite extensions' main idea
+is that the core of the editor remains very slim and stable. At the same time, whatever new functionality
+is needed can be added as an extension to this core, without touching the core itself. It allows any group
+to build and own any new editing functionality without being afraid of it being broken or overridden with
+the Editor Lite changes.
 
 Structurally, the complete implementation of Editor Lite could be presented as the following diagram:
 
diff --git a/doc/development/fe_guide/event_tracking.md b/doc/development/fe_guide/event_tracking.md
index 24e83ffc524..e8b8e0c4885 100644
--- a/doc/development/fe_guide/event_tracking.md
+++ b/doc/development/fe_guide/event_tracking.md
@@ -1,8 +1,8 @@
 ---
-redirect_to: '../product_analytics/index.md'
+redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/'
 ---
 
-This document was moved to [another location](../product_analytics/index.md).
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/).
 
-<!-- This redirect file can be deleted after February 1, 2021. -->
+<!-- This redirect file can be deleted after December 1, 2021. -->
 <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/development/fe_guide/graphql.md b/doc/development/fe_guide/graphql.md
index b1896863af9..cbaa648570c 100644
--- a/doc/development/fe_guide/graphql.md
+++ b/doc/development/fe_guide/graphql.md
@@ -815,7 +815,7 @@ it('calls mutation on submitting form ', () => {
 
 ### Testing with mocked Apollo Client
 
-To test the logic of Apollo cache updates, we might want to mock an Apollo Client in our unit tests. We use [`mock-apollo-client`](https://www.npmjs.com/package/mock-apollo-client) library to mock Apollo client and [`createMockApollo` helper](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/frontend/helpers/mock_apollo_helper.js) we created on top of it.
+To test the logic of Apollo cache updates, we might want to mock an Apollo Client in our unit tests. We use [`mock-apollo-client`](https://www.npmjs.com/package/mock-apollo-client) library to mock Apollo client and [`createMockApollo` helper](https://gitlab.com/gitlab-org/gitlab/-/blob/master/spec/frontend/__helpers__/mock_apollo_helper.js) we created on top of it.
 
 To separate tests with mocked client from 'usual' unit tests, it's recommended to create an additional factory and pass the created `mockApollo` as an option to the `createComponent`-factory. This way we only create Apollo Client instance when it's necessary.
 
@@ -887,7 +887,7 @@ describe('Some component with Apollo mock', () => {
 After this, we need to create a mock Apollo Client instance using a helper:
 
 ```javascript
-import createMockApollo from 'jest/helpers/mock_apollo_helper';
+import createMockApollo from 'helpers/mock_apollo_helper';
 
 describe('Some component', () => {
   let wrapper;
@@ -1031,7 +1031,6 @@ the following Apollo Client warning when passing only handlers:
 
 ```shell
 Unexpected call of console.warn() with:
-
 Warning: mock-apollo-client - The query is entirely client-side (using @client directives) and resolvers have been configured. The request handler will not be called.
 ```
 
diff --git a/doc/development/fe_guide/icons.md b/doc/development/fe_guide/icons.md
index 1468e886220..af587a31bbb 100644
--- a/doc/development/fe_guide/icons.md
+++ b/doc/development/fe_guide/icons.md
@@ -18,8 +18,6 @@ We are using SVG Icons in GitLab with a SVG Sprite.
 This means the icons are only loaded once, and are referenced through an ID.
 The sprite SVG is located under `/assets/icons.svg`.
 
-Our goal is to replace one by one all inline SVG Icons (as those currently bloat the HTML) and also all Font Awesome icons.
-
 ### Usage in HAML/Rails
 
 To use a sprite Icon in HAML or Rails we use a specific helper function:
@@ -90,11 +88,6 @@ Please use the following function inside JS to render an icon:
 
 ### Usage in HAML/Rails
 
-WARNING:
-Do not use the `spinner` or `icon('spinner spin')` rails helpers to insert
-loading icons. These helpers rely on the Font Awesome icon library which is
-deprecated.
-
 To insert a loading spinner in HAML or Rails use the `loading_icon` helper:
 
 ```haml
diff --git a/doc/development/fe_guide/performance.md b/doc/development/fe_guide/performance.md
index 7825c89b7cf..aac2258f3a3 100644
--- a/doc/development/fe_guide/performance.md
+++ b/doc/development/fe_guide/performance.md
@@ -43,7 +43,7 @@ It takes several arguments of which the measurement’s name is the only one req
   performance.measure('My component', 'my-component-start', 'my-component-end')
   ```
 
-- Duration between a mark and the moment the measurement is taken. The end mark is omitted in
+- Duration between a mark and the moment the measurement is taken. The end mark is omitted in
   this case.
 
   ```javascript
@@ -197,7 +197,7 @@ app-*-end // for an end ‘mark’
 app-* // for ‘measure’
 ```
 
-For example, `'webide-init-editor-start`, `mr-diffs-mark-file-tree-end`, and so on. We do it to
+For example, `'webide-init-editor-start`, `mr-diffs-mark-file-tree-end`, and so on. We do it to
 help identify marks and measures coming from the different apps on the same page.
 
 ## Best Practices
diff --git a/doc/development/fe_guide/style/javascript.md b/doc/development/fe_guide/style/javascript.md
index 8e3538e891d..faf03a03101 100644
--- a/doc/development/fe_guide/style/javascript.md
+++ b/doc/development/fe_guide/style/javascript.md
@@ -7,7 +7,7 @@ disqus_identifier: 'https://docs.gitlab.com/ee/development/fe_guide/style_guide_
 
 # JavaScript style guide
 
-We use [Airbnb's JavaScript Style Guide](https://github.com/airbnb/javascript) and it's accompanying
+We use [Airbnb's JavaScript Style Guide](https://github.com/airbnb/javascript) and its accompanying
 linter to manage most of our JavaScript style guidelines.
 
 In addition to the style guidelines set by Airbnb, we also have a few specific rules
diff --git a/doc/development/fe_guide/style/vue.md b/doc/development/fe_guide/style/vue.md
index b85c1b1de35..0288238a9e5 100644
--- a/doc/development/fe_guide/style/vue.md
+++ b/doc/development/fe_guide/style/vue.md
@@ -119,7 +119,8 @@ Please check this [rules](https://github.com/vuejs/eslint-plugin-vue#bulb-rules)
 
 ## Naming
 
-1. **Extensions**: Use `.vue` extension for Vue components. Do not use `.js` as file extension ([#34371](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/34371)).
+1. **Extensions**: Use `.vue` extension for Vue components. Do not use `.js` as file extension
+([#34371](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/34371)).
 1. **Reference Naming**: Use PascalCase for their instances:
 
    ```javascript
@@ -402,7 +403,8 @@ When using `v-for` you need to provide a *unique* `:key` attribute for each item
    </div>
    ```
 
-1. When using `v-for` with `template` and there is more than one child element, the `:key` values must be unique. It's advised to use `kebab-case` namespaces.
+1. When using `v-for` with `template` and there is more than one child element, the `:key` values
+must be unique. It's advised to use `kebab-case` namespaces.
 
    ```html
    <template v-for="(item, index) in items">
@@ -468,9 +470,10 @@ Useful links:
 
 ## Vue testing
 
-Over time, a number of programming patterns and style preferences have emerged in our efforts to effectively test Vue components.
-The following guide describes some of these. **These are not strict guidelines**, but rather a collection of suggestions and
-good practices that aim to provide insight into how we write Vue tests at GitLab.
+Over time, a number of programming patterns and style preferences have emerged in our efforts to
+effectively test Vue components. The following guide describes some of these.
+**These are not strict guidelines**, but rather a collection of suggestions and good practices that
+aim to provide insight into how we write Vue tests at GitLab.
 
 ### Mounting a component
 
@@ -479,8 +482,10 @@ Typically, when testing a Vue component, the component should be "re-mounted" in
 To achieve this:
 
 1. Create a mutable `wrapper` variable inside the top-level `describe` block.
-1. Mount the component using [`mount`](https://vue-test-utils.vuejs.org/api/#mount)/[`shallowMount`](https://vue-test-utils.vuejs.org/api/#shallowMount).
-1. Reassign the resulting [`Wrapper`](https://vue-test-utils.vuejs.org/api/wrapper/#wrapper) instance to our `wrapper` variable.
+1. Mount the component using [`mount`](https://vue-test-utils.vuejs.org/api/#mount)/
+[`shallowMount`](https://vue-test-utils.vuejs.org/api/#shallowMount).
+1. Reassign the resulting [`Wrapper`](https://vue-test-utils.vuejs.org/api/wrapper/#wrapper)
+instance to our `wrapper` variable.
 
 Creating a global, mutable wrapper provides a number of advantages, including the ability to:
 
@@ -497,14 +502,16 @@ Creating a global, mutable wrapper provides a number of advantages, including th
   })
   ```
 
-- Use a `beforeEach` block to mount the component (see [the `createComponent` factory](#the-createcomponent-factory) for more information).
+- Use a `beforeEach` block to mount the component (see +[the `createComponent` factory](#the-createcomponent-factory) for more information). - Use an `afterEach` block to destroy the component, for example, `wrapper.destroy()`. #### The `createComponent` factory To avoid duplicating our mounting logic, it's useful to define a `createComponent` factory function that we can reuse in each test block. This is a closure which should reassign our `wrapper` variable -to the result of [`mount`](https://vue-test-utils.vuejs.org/api/#mount) and [`shallowMount`](https://vue-test-utils.vuejs.org/api/#shallowMount): +to the result of [`mount`](https://vue-test-utils.vuejs.org/api/#mount) and +[`shallowMount`](https://vue-test-utils.vuejs.org/api/#shallowMount): ```javascript import MyComponent from '~/path/to/my_component.vue'; @@ -568,7 +575,8 @@ describe('MyComponent', () => { 1. Consider using a single (or a limited number of) object arguments over many arguments. Defining single parameters for common data like `props` is okay, - but keep in mind our [JavaScript style guide](javascript.md#limit-number-of-parameters) and stay within the parameter number limit: + but keep in mind our [JavaScript style guide](javascript.md#limit-number-of-parameters) and + stay within the parameter number limit: ```javascript // bad @@ -591,6 +599,19 @@ the mounting function (`mount` or `shallowMount`) to be used to mount the compon function createComponent({ mountFn = shallowMount } = {}) { } ``` +1. 
Wrap calls to `mount` and `shallowMount` in `extendedWrapper`; this exposes `wrapper.findByTestId()`: + + ```javascript + import { shallowMount } from '@vue/test-utils'; + import { extendedWrapper } from 'helpers/vue_test_utils_helper'; + import SomeComponent from 'components/some_component.vue'; + + let wrapper; + + const createWrapper = () => { wrapper = extendedWrapper(shallowMount(SomeComponent)); }; + const someButton = () => wrapper.findByTestId('someButtonTestId'); + ``` + ### Setting component state 1. Avoid using [`setProps`](https://vue-test-utils.vuejs.org/api/wrapper/#setprops) to set @@ -609,12 +630,13 @@ component state wherever possible. Instead, set the component's ``` The exception here is when you wish to test component reactivity in some way. - For example, you may want to test the output of a component when after a particular watcher has executed. - Using `setProps` to test such behavior is okay. + For example, you may want to test the output of a component after a particular watcher has + executed. Using `setProps` to test such behavior is okay. ### Accessing component state -1. When accessing props or attributes, prefer the `wrapper.props('myProp')` syntax over `wrapper.props().myProp`: +1. When accessing props or attributes, prefer the `wrapper.props('myProp')` syntax over +`wrapper.props().myProp` or `wrapper.vm.myProp`: ```javascript // good @@ -626,7 +648,8 @@ component state wherever possible. Instead, set the component's expect(wrapper.attributes('myAttr')).toBe(true); ``` -1. When asserting multiple props, check the deep equality of the `props()` object with [`toEqual`](https://jestjs.io/docs/en/expect#toequalvalue): +1. When asserting multiple props, check the deep equality of the `props()` object with +[`toEqual`](https://jestjs.io/docs/en/expect#toequalvalue): ```javascript // good @@ -642,8 +665,9 @@ component state wherever possible. Instead, set the component's }); ``` -1. 
If you are only interested in some of the props, you can use [`toMatchObject`](https://jestjs.io/docs/en/expect#tomatchobjectobject). -Prefer `toMatchObject` over [`expect.objectContaining`](https://jestjs.io/docs/en/expect#expectobjectcontainingobject): +1. If you are only interested in some of the props, you can use +[`toMatchObject`](https://jestjs.io/docs/en/expect#tomatchobjectobject). Prefer `toMatchObject` +over [`expect.objectContaining`](https://jestjs.io/docs/en/expect#expectobjectcontainingobject): ```javascript // good @@ -664,12 +688,24 @@ Prefer `toMatchObject` over [`expect.objectContaining`](https://jestjs.io/docs/e The goal of this accord is to make sure we are all on the same page. 1. When writing Vue, you may not use jQuery in your application. - 1. If you need to grab data from the DOM, you may query the DOM 1 time while bootstrapping your application to grab data attributes using `dataset`. You can do this without jQuery. + 1. If you need to grab data from the DOM, you may query the DOM 1 time while bootstrapping your + application to grab data attributes using `dataset`. You can do this without jQuery. 1. You may use a jQuery dependency in Vue.js following [this example from the docs](https://vuejs.org/v2/examples/select2.html). - 1. If an outside jQuery Event needs to be listen to inside the Vue application, you may use jQuery event listeners. - 1. We avoid adding new jQuery events when they are not required. Instead of adding new jQuery events take a look at [different methods to do the same task](https://vuejs.org/v2/api/#vm-emit). -1. You may query the `window` object one time, while bootstrapping your application for application specific data (e.g. `scrollTo` is ok to access anytime). Do this access during the bootstrapping of your application. -1. You may have a temporary but immediate need to create technical debt by writing code that does not follow our standards, to be refactored later. 
Maintainers need to be ok with the tech debt in the first place. An issue should be created for that tech debt to evaluate it further and discuss. In the coming months you should fix that tech debt, with its priority to be determined by maintainers. -1. When creating tech debt you must write the tests for that code before hand and those tests may not be rewritten. e.g. jQuery tests rewritten to Vue tests. -1. You may choose to use VueX as a centralized state management. If you choose not to use VueX, you must use the *store pattern* which can be found in the [Vue.js documentation](https://vuejs.org/v2/guide/state-management.html#Simple-State-Management-from-Scratch). -1. Once you have chosen a centralized state-management solution you must use it for your entire application. i.e. Don't mix and match your state-management solutions. + 1. If an outside jQuery Event needs to be listen to inside the Vue application, you may use + jQuery event listeners. + 1. We avoid adding new jQuery events when they are not required. Instead of adding new jQuery + events take a look at [different methods to do the same task](https://vuejs.org/v2/api/#vm-emit). +1. You may query the `window` object one time, while bootstrapping your application for application +specific data (for example, `scrollTo` is ok to access anytime). Do this access during the +bootstrapping of your application. +1. You may have a temporary but immediate need to create technical debt by writing code that does +not follow our standards, to be refactored later. Maintainers need to be ok with the tech debt in +the first place. An issue should be created for that tech debt to evaluate it further and discuss. +In the coming months you should fix that tech debt, with its priority to be determined by maintainers. +1. When creating tech debt you must write the tests for that code before hand and those tests may +not be rewritten. For example, jQuery tests rewritten to Vue tests. +1. 
You may choose to use VueX as a centralized state management. If you choose not to use VueX, you +must use the *store pattern* which can be found in the +[Vue.js documentation](https://vuejs.org/v2/guide/state-management.html#Simple-State-Management-from-Scratch). +1. Once you have chosen a centralized state-management solution you must use it for your entire +application. Don't mix and match your state-management solutions. diff --git a/doc/development/fe_guide/vue.md b/doc/development/fe_guide/vue.md index 41fbd128631..b3fbb9556a9 100644 --- a/doc/development/fe_guide/vue.md +++ b/doc/development/fe_guide/vue.md @@ -22,7 +22,8 @@ All new features built with Vue.js must follow a [Flux architecture](https://fac The main goal we are trying to achieve is to have only one data flow and only one data entry. In order to achieve this goal we use [vuex](#vuex). -You can also read about this architecture in Vue docs about [state management](https://vuejs.org/v2/guide/state-management.html#Simple-State-Management-from-Scratch) +You can also read about this architecture in Vue docs about +[state management](https://vuejs.org/v2/guide/state-management.html#Simple-State-Management-from-Scratch) and about [one way data flow](https://vuejs.org/v2/guide/components.html#One-Way-Data-Flow). ### Components and Store @@ -62,14 +63,15 @@ Be sure to read about [page-specific JavaScript](performance.md#page-specific-ja While mounting a Vue application, you might need to provide data from Rails to JavaScript. To do that, you can use the `data` attributes in the HTML element and query them while mounting the application. -You should only do this while initializing the application, because the mounted element is replaced with a Vue-generated DOM. +You should only do this while initializing the application, because the mounted element is replaced +with a Vue-generated DOM. 
-The advantage of providing data from the DOM to the Vue instance through `props` in the `render` function -instead of querying the DOM inside the main Vue component is avoiding the need to create a fixture or an HTML element in the unit test, -which makes the tests easier. +The advantage of providing data from the DOM to the Vue instance through `props` in the `render` +function instead of querying the DOM inside the main Vue component is avoiding the need to create a +fixture or an HTML element in the unit test, which makes the tests easier. -See the following example, also, please refer to our [Vue style guide](style/vue.md#basic-rules) for additional -information on why we explicitly declare the data being passed into the Vue app; +See the following example, also, please refer to our [Vue style guide](style/vue.md#basic-rules) for +additional information on why we explicitly declare the data being passed into the Vue app; ```javascript // haml @@ -94,13 +96,15 @@ return new Vue({ }); ``` -> When adding an `id` attribute to mount a Vue application, please make sure this `id` is unique across the codebase +> When adding an `id` attribute to mount a Vue application, please make sure this `id` is unique +across the codebase. #### Accessing the `gl` object -When we need to query the `gl` object for data that doesn't change during the application's life cycle, we should do it in the same place where we query the DOM. -By following this practice, we can avoid the need to mock the `gl` object, which makes tests easier. -It should be done while initializing our Vue instance, and the data should be provided as `props` to the main component: +When we need to query the `gl` object for data that doesn't change during the application's life +cycle, we should do it in the same place where we query the DOM. By following this practice, we can +avoid the need to mock the `gl` object, which makes tests easier. 
It should be done while +initializing our Vue instance, and the data should be provided as `props` to the main component: ```javascript return new Vue({ @@ -192,13 +196,18 @@ Check this [page](vuex.md) for more details. In the [Vue documentation](https://vuejs.org/v2/api/#Options-Data) the Data function/object is defined as follows: -> The data object for the Vue instance. Vue recursively converts its properties into getter/setters to make it “reactive”. The object must be plain: native objects such as browser API objects and prototype properties are ignored. A rule of thumb is that data should just be data - it is not recommended to observe objects with their own stateful behavior. +> The data object for the Vue instance. Vue recursively converts its properties into getter/setters +to make it “reactive”. The object must be plain: native objects such as browser API objects and +prototype properties are ignored. A rule of thumb is that data should just be data - it is not +recommended to observe objects with their own stateful behavior. Based on the Vue guidance: -- **Do not** use or create a JavaScript class in your [data function](https://vuejs.org/v2/api/#data), such as `user: new User()`. +- **Do not** use or create a JavaScript class in your [data function](https://vuejs.org/v2/api/#data), +such as `user: new User()`. - **Do not** add new JavaScript class implementations. -- **Do** use [GraphQL](../api_graphql_styleguide.md), [Vuex](vuex.md) or a set of components if cannot use simple primitives or objects. +- **Do** use [GraphQL](../api_graphql_styleguide.md), [Vuex](vuex.md) or a set of components if +cannot use simple primitives or objects. - **Do** maintain existing implementations using such approaches. - **Do** Migrate components to a pure object model when there are substantial changes to it. - **Do** add business logic to helpers or utils, so you can test them separately from your component. 
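The guidance above about keeping `data` plain and moving business logic into helpers can be sketched roughly as follows. This is only an illustration: `summarizeTodos` and the shape of the todo items are hypothetical names made up for this example, not part of the GitLab codebase.

```javascript
// Hypothetical helper, for example in a `utils.js` file next to the component.
// It keeps the business rule in a pure function instead of a class instance
// stored in the component's `data`.
function summarizeTodos(todos) {
  const done = todos.filter((todo) => todo.done).length;
  return { total: todos.length, done, pending: todos.length - done };
}

// The component's `data` function then returns only plain objects and primitives.
const data = () => ({
  todos: [
    { text: 'Write docs', done: true },
    { text: 'Review MR', done: false },
  ],
});

// The helper can be exercised without mounting any component.
const summary = summarizeTodos(data().todos);
```

Because the helper takes plain data in and returns plain data out, it can be covered by an ordinary unit test, and the component test only needs to check what is rendered.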
@@ -209,7 +218,8 @@ There are additional reasons why having a JavaScript class presents maintainabil - Once a class is created, it is easy to extend it in a way that can infringe Vue reactivity and best practices. - A class adds a layer of abstraction, which makes the component API and its inner workings less clear. -- It makes it harder to test. Since the class is instantiated by the component data function, it is harder to 'manage' component and class separately. +- It makes it harder to test. Since the class is instantiated by the component data function, it is +harder to 'manage' component and class separately. - Adding OOP to a functional codebase adds yet another way of writing code, reducing consistency and clarity. ## Style guide @@ -231,6 +241,7 @@ Here's an example of a well structured unit test for [this Vue component](#appen ```javascript import { shallowMount } from '@vue/test-utils'; +import { extendedWrapper } from 'helpers/vue_test_utils_helper'; import { GlLoadingIcon } from '@gitlab/ui'; import MockAdapter from 'axios-mock-adapter'; import axios from '~/lib/utils/axios_utils'; @@ -263,19 +274,21 @@ describe('~/todos/app.vue', () => { }); // It is very helpful to separate setting up the component from - // its collaborators (i.e. Vuex, axios, etc.) + // its collaborators (for example, Vuex and axios). const createWrapper = (props = {}) => { - wrapper = shallowMount(App, { - propsData: { - path: TEST_TODO_PATH, - ...props, - }, - }); + wrapper = extendedWrapper( + shallowMount(App, { + propsData: { + path: TEST_TODO_PATH, + ...props, + }, + }) + ); }; // Helper methods greatly help test maintainability and readability. 
const findLoader = () => wrapper.find(GlLoadingIcon); - const findAddButton = () => wrapper.find('[data-testid="add-button"]'); - const findTextInput = () => wrapper.find('[data-testid="text-input"]'); + const findAddButton = () => wrapper.findByTestId('add-button'); + const findTextInput = () => wrapper.findByTestId('text-input'); const findTodoData = () => wrapper.findAll('[data-testid="todo-item"]').wrappers.map(wrapper => ({ text: wrapper.text() })); describe('when mounted and loading', () => { @@ -323,11 +336,41 @@ describe('~/todos/app.vue', () => { The main return value of a Vue component is the rendered output. In order to test the component we need to test the rendered output. Visit the [Vue testing guide](https://vuejs.org/v2/guide/testing.html#Unit-Testing). +### Child components + +1. Test any directive that defines if/how child component is rendered (for example, `v-if` and `v-for`). +1. Test any props we are passing to child components (especially if the prop is calculated in the +component under test, with the `computed` property, for example). Remember to use `.props()` and not `.vm.someProp`. +1. Test we react correctly to any events emitted from child components: + + ```javascript + const checkbox = wrapper.findByTestId('checkboxTestId'); + + expect(checkbox.attributes('disabled')).not.toBeDefined(); + + findChildComponent().vm.$emit('primary'); + await nextTick(); + + expect(checkbox.attributes('disabled')).toBeDefined(); + ``` + +1. **Do not** test the internal implementation of the child components: + + ```javascript + // bad + expect(findChildComponent().find('.error-alert').exists()).toBe(false); + + // good + expect(findChildComponent().props('withAlertContainer')).toBe(false); + ``` + ### Events -We should test for events emitted in response to an action within our component, this is useful to verify the correct events are being fired with the correct arguments. 
+We should test for events emitted in response to an action within our component. This is useful to +verify the correct events are being fired with the correct arguments. -For any DOM events we should use [`trigger`](https://vue-test-utils.vuejs.org/api/wrapper/#trigger) to fire out event. +For any DOM events we should use [`trigger`](https://vue-test-utils.vuejs.org/api/wrapper/#trigger) +to fire our event. ```javascript // Assuming SomeButton renders: <button>Some button</button> @@ -342,7 +385,8 @@ it('should fire the click event', () => { }) ``` -When we need to fire a Vue event, we should use [`emit`](https://vuejs.org/v2/guide/components-custom-events.html) to fire our event. +When we need to fire a Vue event, we should use [`emit`](https://vuejs.org/v2/guide/components-custom-events.html) +to fire our event. ```javascript wrapper = shallowMount(DropdownItem); @@ -355,7 +399,8 @@ it('should fire the itemClicked event', () => { }) ``` -We should verify an event has been fired by asserting against the result of the [`emitted()`](https://vue-test-utils.vuejs.org/api/wrapper/#emitted) method +We should verify an event has been fired by asserting against the result of the +[`emitted()`](https://vue-test-utils.vuejs.org/api/wrapper/#emitted) method. 
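To make the `emitted()` assertion more concrete, here is a small self-contained sketch of the data shape involved. It assumes the shape described in the Vue Test Utils documentation, where `emitted()` returns an object keyed by event name and each value is a list of argument lists, one per emission. `createEventRecorder` is a hypothetical stand-in written for this example so that it runs without mounting a component; it is not a Vue Test Utils API.

```javascript
// Hypothetical stand-in for a mounted wrapper. It records events in the same
// shape that Vue Test Utils documents for `wrapper.emitted()`:
// { eventName: [argsOfFirstEmit, argsOfSecondEmit, ...] }
function createEventRecorder() {
  const events = {};
  return {
    emit(name, ...args) {
      if (!events[name]) events[name] = [];
      events[name].push(args);
    },
    emitted() {
      return events;
    },
  };
}

const recorder = createEventRecorder();
recorder.emit('itemClicked'); // emitted without a payload
recorder.emit('itemClicked', { id: 1 }); // emitted with one argument

// One entry per emission, each entry holding that emission's arguments.
const itemClickedCalls = recorder.emitted().itemClicked;
```

In a real spec the same idea would usually be an assertion on `wrapper.emitted().itemClicked`, checking both how many times the event fired and which arguments it carried.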
## Vue.js Expert Role @@ -371,7 +416,8 @@ You should only apply to be a Vue.js expert when your own merge requests and you > This section is added temporarily to support the efforts to migrate the codebase from Vue 2.x to Vue 3.x -Currently, we recommend to minimize adding certain features to the codebase to prevent increasing the tech debt for the eventual migration: +Currently, we recommend to minimize adding certain features to the codebase to prevent increasing +the tech debt for the eventual migration: - filters; - event buses; @@ -382,7 +428,8 @@ You can find more details on [Migration to Vue 3](vue3_migration.md) ## Appendix - Vue component subject under test -This is the template for the example component which is tested in the [Testing Vue components](#testing-vue-components) section: +This is the template for the example component which is tested in the +[Testing Vue components](#testing-vue-components) section: ```html <template> diff --git a/doc/development/feature_categorization/index.md b/doc/development/feature_categorization/index.md index dd69d7bcf80..2f0f8101b53 100644 --- a/doc/development/feature_categorization/index.md +++ b/doc/development/feature_categorization/index.md @@ -122,7 +122,7 @@ the actions used in configuration still exist as routes. ## API endpoints The [GraphQL API](../../api/graphql/index.md) is currently categorized -as `not_owned`. For now, no extra specification is needed. For more +as `not_owned`. For now, no extra specification is needed. For more information, see [`gitlab-com/gl-infra/scalability#583`](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/583/). 
diff --git a/doc/development/feature_flags/controls.md b/doc/development/feature_flags/controls.md index 7551199aa58..adcf3175c45 100644 --- a/doc/development/feature_flags/controls.md +++ b/doc/development/feature_flags/controls.md @@ -250,6 +250,7 @@ Changes to the issue format can be submitted in the Any feature flag change that affects any GitLab instance is automatically logged in [features_json.log](../../administration/logs.md#features_jsonlog). You can search the change history in [Kibana](https://about.gitlab.com/handbook/support/workflows/kibana.html). +You can access the feature flag change history for GitLab.com [here](https://log.gprd.gitlab.net/goto/d060337c017723084c6d97e09e591fc6). ## Cleaning up diff --git a/doc/development/feature_flags/development.md b/doc/development/feature_flags/development.md index 7c5333c9aa6..dd732a08c72 100644 --- a/doc/development/feature_flags/development.md +++ b/doc/development/feature_flags/development.md @@ -378,6 +378,18 @@ You can also enable a feature flag for a given gate: Feature.enable(:feature_flag_name, Project.find_by_full_path("root/my-project")) ``` +### Removing a feature flag locally (in development) + +When manually enabling or disabling a feature flag from the Rails console, its default value gets overwritten. +This can cause confusion when changing the flag's `default_enabled` attribute. + +To reset the feature flag to the default status, you can remove it in the rails console (`rails c`) +as follows: + +```ruby +Feature.remove(:feature_flag_name) +``` + ## Feature flags in tests Introducing a feature flag into the codebase creates an additional code path that should be tested. 
diff --git a/doc/development/feature_flags/index.md b/doc/development/feature_flags/index.md index 270e07ed755..e93a5b3de1b 100644 --- a/doc/development/feature_flags/index.md +++ b/doc/development/feature_flags/index.md @@ -6,24 +6,34 @@ info: "See the Technical Writers assigned to Development Guidelines: https://abo # Feature flags in development of GitLab +**NOTE**: +The documentation below covers feature flags used by GitLab to deploy its own features, which **is not** the same +as the [feature flags offered as part of the product](../../operations/feature_flags.md). + ## When to use feature flags -Starting with GitLab 11.4, developers are required to use feature flags for -non-trivial changes. Such changes include: +Developers are required to use feature flags for changes that could affect availability of existing GitLab functionality (if it only affects the new feature you're making that is probably acceptable). +Such changes include: + +1. New features in high traffic areas (e.g. a new merge request widget, new option in issues/epics, new CI functionality). +1. Complex performance improvements that may require additional testing in production (e.g. rewriting complex queries, changes to frequently used API endpoints). +1. Invasive changes to the user interface (e.g. introducing a new navigation bar, removal of a sidebar, UI element change in issues or MR interface). +1. Introducing dependencies on third-party services (e.g. adding support for importing projects). +1. Changes to features that can cause data corruption or cause data loss (e.g. features processing repository data or user uploaded content). + +Situations where you might consider not using a feature flag: + +1. Adding a new API endpoint +1. Introducing new features in low traffic areas (e.g. adding a new export functionality in the admin area/group settings/project settings) +1. Non-invasive frontend changes (e.g. 
changing the color of a button, or moving a UI element in a low traffic area) + +In all cases, those working on the changes should ask themselves: -- New features (e.g. a new merge request widget, epics, etc). -- Complex performance improvements that may require additional testing in - production, such as rewriting complex queries. -- Invasive changes to the user interface, such as a new navigation bar or the - removal of a sidebar. -- Adding support for importing projects from a third-party service. -- Risk of data loss +> Why do I need to add a feature flag? If I don't add one, what options do I have to control the impact on application reliability, and user experience? -In all cases, those working on the changes can best decide if a feature flag is -necessary. For example, changing the color of a button doesn't need a feature -flag, while changing the navigation bar definitely needs one. In case you are -uncertain if a feature flag is necessary, simply ask about this in the merge -request, and those reviewing the changes will likely provide you with an answer. +For perspective on why we limit our use of feature flags please see the following [video](https://www.youtube.com/watch?v=DQaGqyolOd8). + +In case you are uncertain if a feature flag is necessary, simply ask about this in an early merge request, and those reviewing the changes will likely provide you with an answer. When using a feature flag for UI elements, make sure to _also_ use a feature flag for the underlying backend code, if there is any. This ensures there is @@ -36,35 +46,29 @@ they are new features or performance improvements. By using feature flags, you can determine the impact of GitLab-directed changes, while still being able to disable those changes without having to revert an entire release. 
-Before using feature flags for GitLab development, review the following development guides: - -NOTE: -The feature flags used by GitLab to deploy its own features **are not** the same -as the [feature flags offered as part of the product](../../operations/feature_flags.md). - For an overview about starting with feature flags in GitLab development, use this [training template](https://gitlab.com/gitlab-com/www-gitlab-com/-/blob/master/.gitlab/issue_templates/feature-flag-training.md). -Development guides: +Before using feature flags for GitLab development, review the following development guides: -- [Process for using features flags](process.md): When you should use +1. [Process for using features flags](process.md): When you should use feature flags in the development of GitLab, what's the cost of using them, and how to include them in a release. -- [Developing with feature flags](development.md): Learn about the types of +1. [Developing with feature flags](development.md): Learn about the types of feature flags, their definition and validation, how to create them, frontend and backend details, and other information. -- [Documenting features deployed behind feature flags](../documentation/feature_flags.md): +1. [Documenting features deployed behind feature flags](../documentation/feature_flags.md): How to document features deployed behind feature flags, and how to update the documentation for features' flags when their states change. -- [Controlling feature flags](controls.md): Learn the process for deploying +1. [Controlling feature flags](controls.md): Learn the process for deploying a new feature, enabling it on GitLab.com, communicating the change, logging, and cleaning up. User guides: -- [How GitLab administrators can enable and disable features behind flags](../../administration/feature_flags.md): +1. 
[How GitLab administrators can enable and disable features behind flags](../../administration/feature_flags.md): An explanation for GitLab administrators about how they can enable or disable GitLab features behind feature flags. -- [What "features deployed behind flags" means to the GitLab user](../../user/feature_flags.md): +1. [What "features deployed behind flags" means to the GitLab user](../../user/feature_flags.md): An explanation for GitLab users regarding how certain features might not be available to them until they are enabled. diff --git a/doc/development/feature_flags/process.md b/doc/development/feature_flags/process.md index 2e3680bb103..7e6299c193c 100644 --- a/doc/development/feature_flags/process.md +++ b/doc/development/feature_flags/process.md @@ -148,3 +148,30 @@ they speed up the process as managing incidents now becomes _much_ easier. Once continuous deployments are easier to perform, the time to iterate on a feature is reduced even further, as you no longer need to wait weeks before your changes are available on GitLab.com. + +### The benefits of feature flags + +It may seem like feature flags are configuration, which goes against our [convention-over-configuration](https://about.gitlab.com/handbook/product/product-principles/#convention-over-configuration) +principle. However, configuration is by definition something that is user-manageable. +Feature flags are not intended to be user-editable. Instead, they are intended as a tool for Engineers +and Site Reliability Engineers to use to de-risk their changes. Feature flags are the shim that gets us +to Continuous Delivery with our mono repo and without having to deploy the entire codebase on every change. +Feature flags are created to ensure that we can safely rollout our work on our terms. +If we use Feature Flags as a configuration, we are doing it wrong and are indeed in violation of our +principles. 
If something needs to be configured, we should intentionally make it configuration from the +first moment. + +Some of the benefits of using development-type feature flags are: + +1. It enables Continuous Delivery for GitLab.com. +1. It significantly reduces Mean-Time-To-Recovery. +1. It helps engineers to monitor and reduce the impact of their changes gradually, at any scale, + allowing us to be more metrics-driven and execute good DevOps practices, [shifting some responsibility "left"](https://devops.com/why-its-time-for-site-reliability-engineering-to-shift-left/). +1. Controlled feature rollout timing: without feature flags, we would need to wait until a specific + deployment was complete (which at GitLab could be at any time). +1. Increased psychological safety: when a feature flag is used, an engineer has the confidence that if anything goes wrong they can quickly disable the code and minimize the impact of a change that might be risky. +1. Improved throughput: when a change is less risky because a flag exists, theoretical tests about + scalability can potentially become unnecessary or less important. This allows an engineer to + potentially test a feature on a small project, monitor the impact, and proceed. The alternative might + be to build complex benchmarks locally, or on staging, or on another GitLab deployment, which has an + outsized impact on the time it can take to build and release a feature. diff --git a/doc/development/foreign_keys.md b/doc/development/foreign_keys.md index 0f100c6b66e..37764a12f97 100644 --- a/doc/development/foreign_keys.md +++ b/doc/development/foreign_keys.md @@ -105,6 +105,8 @@ create_table :user_configs, id: false do |t| end ``` +Setting `default: nil` will ensure a primary key sequence is not created, and since the primary key +will automatically get an index, we set `index: false` to avoid creating a duplicate. 
You will also need to add the new primary key to the model: ```ruby diff --git a/doc/development/geo/framework.md b/doc/development/geo/framework.md index e4518ce1b57..148953dc418 100644 --- a/doc/development/geo/framework.md +++ b/doc/development/geo/framework.md @@ -287,7 +287,7 @@ For example, to add support for files referenced by a `Widget` model with a t.datetime_with_timezone :created_at, null: false t.text :last_sync_failure - t.index :widget_id, name: :index_widget_registry_on_widget_id + t.index :widget_id, name: :index_widget_registry_on_widget_id, unique: true t.index :retry_at t.index :state end @@ -743,6 +743,8 @@ available in the Admin UI. #### Releasing the feature +1. In `ee/config/feature_flags/development/geo_widget_replication.yml`, set `default_enabled: true` + 1. In `ee/app/replicators/geo/widget_replicator.rb`, delete the `self.replication_enabled_by_default?` method: ```ruby @@ -770,3 +772,260 @@ available in the Admin UI. description: 'Find widget registries on this Geo node', feature_flag: :geo_widget_replication # REMOVE THIS LINE ``` + +### Repository Replicator Strategy + +Models that refer to any repository on the disk +can be easily supported by Geo with the `Geo::RepositoryReplicatorStrategy` module. + +For example, to add support for files referenced by a `Gizmos` model with a +`gizmos` table, you would perform the following steps. + +#### Replication + +1. Include `Gitlab::Geo::ReplicableModel` in the `Gizmo` class, and specify + the Replicator class `with_replicator Geo::GizmoReplicator`. 
+ + At this point the `Gizmo` class should look like this: + + ```ruby + # frozen_string_literal: true + + class Gizmo < ApplicationRecord + include ::Gitlab::Geo::ReplicableModel + + with_replicator Geo::GizmoReplicator + + # @param primary_key_in [Range, Gizmo] arg to pass to primary_key_in scope + # @return [ActiveRecord::Relation<Gizmo>] everything that should be synced to this node, restricted by primary key + def self.replicables_for_current_secondary(primary_key_in) + # Should be implemented. The idea of the method is to restrict + # the set of synced items depending on synchronization settings + end + + # Geo checks this method in FrameworkRepositorySyncService to avoid + # snapshotting repositories using object pools + def pool_repository + nil + end + ... + end + ``` + + Pay some attention to method `pool_repository`. Not every repository type uses + repository pooling. As Geo prefers to use repository snapshotting, it can lead to data loss. + Make sure to overwrite `pool_repository` so it returns nil for repositories that do not + have pools. + + If there is a common constraint for records to be available for replication, + make sure to also overwrite the `available_replicables` scope. + +1. Create `ee/app/replicators/geo/gizmo_replicator.rb`. Implement the + `#repository` method which should return a `<Repository>` instance, + and implement the class method `.model` to return the `Gizmo` class: + + ```ruby + # frozen_string_literal: true + + module Geo + class GizmoReplicator < Gitlab::Geo::Replicator + include ::Geo::RepositoryReplicatorStrategy + + def self.model + ::Gizmo + end + + def repository + model_record.repository + end + + def self.git_access_class + ::Gitlab::GitAccessGizmo + end + + # The feature flag follows the format `geo_#{replicable_name}_replication`, + # so here it would be `geo_gizmo_replication` + def self.replication_enabled_by_default? + false + end + end + end + ``` + +1. 
Generate the feature flag definition file by running the feature flag command + and running through the steps: + + ```shell + bin/feature-flag --ee geo_gizmo_replication --type development --group 'group::geo' + ``` + +1. Make sure Geo push events are created. Usually it needs some + change in the `app/workers/post_receive.rb` file. Example: + + ```ruby + def replicate_gizmo_changes(gizmo) + if ::Gitlab::Geo.primary? + gizmo.replicator.handle_after_update if gizmo + end + end + ``` + + See `app/workers/post_receive.rb` for more examples. + +1. Make sure the repository removal is also handled. You may need to add something + like the following in the destroy service of the repository: + + ```ruby + gizmo.replicator.handle_after_destroy if gizmo.repository + ``` + +1. Add this replicator class to the method `replicator_classes` in + `ee/lib/gitlab/geo.rb`: + + ```ruby + REPLICATOR_CLASSES = [ + ... + ::Geo::PackageFileReplicator, + ::Geo::GizmoReplicator + ] + end + ``` + +1. Create `ee/spec/replicators/geo/gizmo_replicator_spec.rb` and perform + the necessary setup to define the `model_record` variable for the shared + examples: + + ```ruby + # frozen_string_literal: true + + require 'spec_helper' + + RSpec.describe Geo::GizmoReplicator do + let(:model_record) { build(:gizmo) } + + include_examples 'a repository replicator' + end + ``` + +1. Create the `gizmo_registry` table, with columns ordered according to [our guidelines](../ordering_table_columns.md) so Geo secondaries can track the sync and + verification state of each Gizmo. This migration belongs in `ee/db/geo/migrate`: + + ```ruby + # frozen_string_literal: true + + class CreateGizmoRegistry < ActiveRecord::Migration[6.0] + include Gitlab::Database::MigrationHelpers + + DOWNTIME = false + + disable_ddl_transaction! 
+ + def up + create_table :gizmo_registry, id: :bigserial, force: :cascade do |t| + t.datetime_with_timezone :retry_at + t.datetime_with_timezone :last_synced_at + t.datetime_with_timezone :created_at, null: false + t.bigint :gizmo_id, null: false + t.integer :state, default: 0, null: false, limit: 2 + t.integer :retry_count, default: 0, limit: 2 + t.text :last_sync_failure + t.boolean :force_to_redownload + t.boolean :missing_on_primary + + t.index :gizmo_id, name: :index_gizmo_registry_on_gizmo_id, unique: true + t.index :retry_at + t.index :state + end + + add_text_limit :gizmo_registry, :last_sync_failure, 255 + end + + def down + drop_table :gizmo_registry + end + end + ``` + +1. Create `ee/app/models/geo/gizmo_registry.rb`: + + ```ruby + # frozen_string_literal: true + + class Geo::GizmoRegistry < Geo::BaseRegistry + include Geo::ReplicableRegistry + + MODEL_CLASS = ::Gizmo + MODEL_FOREIGN_KEY = :gizmo_id + + belongs_to :gizmo, class_name: 'Gizmo' + end + ``` + +1. Update `REGISTRY_CLASSES` in `ee/app/workers/geo/secondary/registry_consistency_worker.rb`. +1. Add `gizmo_registry` to `ActiveSupport::Inflector.inflections` in `config/initializers_before_autoloader/000_inflections.rb`. +1. Create `ee/spec/factories/geo/gizmo_registry.rb`: + + ```ruby + # frozen_string_literal: true + + FactoryBot.define do + factory :geo_gizmo_registry, class: 'Geo::GizmoRegistry' do + gizmo + state { Geo::GizmoRegistry.state_value(:pending) } + + trait :synced do + state { Geo::GizmoRegistry.state_value(:synced) } + last_synced_at { 5.days.ago } + end + + trait :failed do + state { Geo::GizmoRegistry.state_value(:failed) } + last_synced_at { 1.day.ago } + retry_count { 2 } + last_sync_failure { 'Random error' } + end + + trait :started do + state { Geo::GizmoRegistry.state_value(:started) } + last_synced_at { 1.day.ago } + retry_count { 0 } + end + end + end + ``` + +1. 
Create `ee/spec/models/geo/gizmo_registry_spec.rb`: + + ```ruby + # frozen_string_literal: true + + require 'spec_helper' + + RSpec.describe Geo::GizmoRegistry, :geo, type: :model do + let_it_be(:registry) { create(:geo_gizmo_registry) } + + specify 'factory is valid' do + expect(registry).to be_valid + end + + include_examples 'a Geo framework registry' + end + ``` + +1. Make sure the newly added repository type can be accessed by a secondary. + You may need to make some changes to one of the Git access classes. + + Gizmos should now be replicated by Geo. + +#### Metrics + +You need to make the same changes as for Blob Replicator Strategy. +You need to make the same changes for the [metrics as in the Blob Replicator Strategy](#metrics). + +#### GraphQL API + +You need to make the same changes for the GraphQL API [as in the Blob Replicator Strategy](#graphql-api). + +#### Releasing the feature + +You need to make the same changes for [releasing the feature as in the Blob Replicator Strategy](#releasing-the-feature). diff --git a/doc/development/gitaly.md b/doc/development/gitaly.md index 57a4e24679c..5d062d7404e 100644 --- a/doc/development/gitaly.md +++ b/doc/development/gitaly.md @@ -27,8 +27,9 @@ have changed since then, it should still serve as a good introduction. ## Beginner's guide Start by reading the Gitaly repository's -[Beginner's guide to Gitaly contributions](https://gitlab.com/gitlab-org/gitaly/blob/master/doc/beginners_guide.md). -It describes how to set up Gitaly, the various components of Gitaly and what they do, and how to run its test suites. +[Beginner's guide to Gitaly contributions](https://gitlab.com/gitlab-org/gitaly/-/blob/master/doc/beginners_guide.md). +It describes how to set up Gitaly, the various components of Gitaly and what +they do, and how to run its test suites. ## Developing new Git features @@ -36,33 +37,17 @@ To read or write Git data, a request has to be made to Gitaly. 
This means that if you're developing a new feature where you need data that's not yet available in `lib/gitlab/git` changes have to be made to Gitaly. -> This is a new process that is not clearly defined yet. If you want -to contribute a Git feature and you're getting stuck, reach out to the -Gitaly team or `@jacobvosmaer-gitlab`. +There should be no new code that touches Git repositories via disk access (for example, +Rugged, `git`, `rm -rf`) anywhere in the `gitlab` repository. Anything that +needs direct access to the Git repository *must* be implemented in Gitaly, and +exposed via an RPC. -By 'new feature' we mean any method or class in `lib/gitlab/git` that is -called from outside `lib/gitlab/git`. For new methods that are called -from inside `lib/gitlab/git`, see 'Modifying existing Git features' -below. +It's often easier to develop a new feature in Gitaly if you make the changes to +GitLab that will use the new feature in a separate merge request, to be merged +immediately after the Gitaly one. This allows you to test your changes before +they are merged. -There should be no new code that touches Git repositories via -disk access (e.g. Rugged, `git`, `rm -rf`) anywhere outside -`lib/gitlab/git`. - -The process for adding new Gitaly features is: - -- exploration / prototyping -- design and create a new Gitaly RPC in [`gitaly-proto`](https://gitlab.com/gitlab-org/gitaly-proto) -- release a new version of `gitaly-proto` -- write implementation and tests for the RPC [in Gitaly](https://gitlab.com/gitlab-org/gitaly), in Go or Ruby -- release a new version of Gitaly -- write client code in GitLab CE/EE, GitLab Workhorse or GitLab Shell that calls the new Gitaly RPC - -These steps often overlap. It is possible to use an unreleased version -of Gitaly and `gitaly-proto` during testing and development. 
- -- See the [Gitaly repository](https://gitlab.com/gitlab-org/gitaly/blob/master/CONTRIBUTING.md#development-and-testing-with-a-custom-gitaly-proto) for instructions on writing server side code with an unreleased protocol. -- See [below](#running-tests-with-a-locally-modified-version-of-gitaly) for instructions on running GitLab CE tests with a modified version of Gitaly. +- See [below](#running-tests-with-a-locally-modified-version-of-gitaly) for instructions on running GitLab tests with a modified version of Gitaly. - In GDK run `gdk install` and restart `gdk run` (or `gdk run app`) to use a locally modified Gitaly version for development ### `gitaly-ruby` @@ -208,7 +193,7 @@ to manually run `make` again. Note that CI tests do not use your locally modified version of Gitaly. To use a custom Gitaly version in CI you need to update -GITALY_SERVER_VERSION as described at the beginning of this paragraph. +GITALY_SERVER_VERSION as described at the beginning of this section. To use a different Gitaly repository, e.g., if your changes are present on a fork, you can specify a `GITALY_REPO_URL` environment variable when @@ -244,6 +229,9 @@ the branch with the changes (`new-feature-branch`, for example): 1. Run `bundle install` to use the modified RPC client. +Re-run `bundle install` in the `gitlab` project each time the Gitaly branch +changes to embed a new SHA in the `Gemfile.lock` file. + --- [Return to Development documentation](README.md) diff --git a/doc/development/go_guide/index.md b/doc/development/go_guide/index.md index b2405f4ce2a..68210c08a00 100644 --- a/doc/development/go_guide/index.md +++ b/doc/development/go_guide/index.md @@ -171,7 +171,7 @@ sure to use at least this version to avoid `checksum mismatch` errors. We don't use object-relational mapping libraries (ORMs) at GitLab (except [ActiveRecord](https://guides.rubyonrails.org/active_record_basics.html) in Ruby on Rails). Projects can be structured with services to avoid them. 
-[PQ](https://github.com/lib/pq) should be enough to interact with PostgreSQL +[pgx](https://github.com/jackc/pgx) should be enough to interact with PostgreSQL databases. ### Migrations @@ -449,11 +449,10 @@ changes between minor versions can expose bugs or cause problems in our projects Once you've picked a new Go version to use, the steps to update Omnibus and CNG are: -- [Create a merge request in the CNG project](https://gitlab.com/gitlab-org/build/CNG/edit/master/ci_files/variables.yml?branch_name=update-go-version), +- [Create a merge request in the CNG project](https://gitlab.com/gitlab-org/build/CNG/-/edit/master/ci_files/variables.yml?branch_name=update-go-version), updating the `GO_VERSION` in `ci_files/variables.yml`. -- Create a merge request in the [`gitlab-omnibus-builder` project](https://gitlab.com/gitlab-org/gitlab-omnibus-builder), - updating every file in the `docker/` directory so the `GO_VERSION` is set - appropriately. [Here's an example](https://gitlab.com/gitlab-org/gitlab-omnibus-builder/-/merge_requests/125/diffs). +- [Create a merge request in the `gitlab-omnibus-builder` project](https://gitlab.com/gitlab-org/gitlab-omnibus-builder/-/edit/master/docker/VERSIONS?branch_name=update-go-version), + updating the `GO_VERSION` in `docker/VERSIONS`. - Tag a new release of `gitlab-omnibus-builder` containing the change. - [Create a merge request in the `omnibus-gitlab` project](https://gitlab.com/gitlab-org/omnibus-gitlab/edit/master/.gitlab-ci.yml?branch_name=update-gitlab-omnibus-builder-version), updating the `BUILDER_IMAGE_REVISION` to match the newly-created tag. 
diff --git a/doc/development/gotchas.md b/doc/development/gotchas.md index 2b34aedddf6..a506b67d89d 100644 --- a/doc/development/gotchas.md +++ b/doc/development/gotchas.md @@ -270,7 +270,7 @@ This problem disappears as soon as we upgrade to Rails 6 and use the Zeitwerk au ### Further reading - Rails Guides: [Autoloading and Reloading Constants (Classic Mode)](https://guides.rubyonrails.org/autoloading_and_reloading_constants_classic_mode.html) -- Ruby Constant lookup: [Everything you ever wanted to know about constant lookup in Ruby](http://cirw.in/blog/constant-lookup) +- Ruby Constant lookup: [Everything you ever wanted to know about constant lookup in Ruby](https://cirw.in/blog/constant-lookup) - Rails 6 and Zeitwerk autoloader: [Understanding Zeitwerk in Rails 6](https://medium.com/cedarcode/understanding-zeitwerk-in-rails-6-f168a9f09a1f) ## Storing assets that do not require pre-compiling diff --git a/doc/development/img/snowplow_flow.png b/doc/development/img/snowplow_flow.png Binary files differ index 5996cf01537..aae597edc13 100644 --- a/doc/development/img/snowplow_flow.png +++ b/doc/development/img/snowplow_flow.png diff --git a/doc/development/img/stage_group_dashboards_annotation.png b/doc/development/img/stage_group_dashboards_annotation.png Binary files differ new file mode 100644 index 00000000000..3776d87e5bb --- /dev/null +++ b/doc/development/img/stage_group_dashboards_annotation.png diff --git a/doc/development/img/stage_group_dashboards_debug_1.png b/doc/development/img/stage_group_dashboards_debug_1.png Binary files differ new file mode 100644 index 00000000000..309fad89120 --- /dev/null +++ b/doc/development/img/stage_group_dashboards_debug_1.png diff --git a/doc/development/img/stage_group_dashboards_debug_2.png b/doc/development/img/stage_group_dashboards_debug_2.png Binary files differ new file mode 100644 index 00000000000..2aad9ab5592 --- /dev/null +++ b/doc/development/img/stage_group_dashboards_debug_2.png diff --git 
a/doc/development/img/stage_group_dashboards_debug_3.png b/doc/development/img/stage_group_dashboards_debug_3.png Binary files differ new file mode 100644 index 00000000000..38647410ffd --- /dev/null +++ b/doc/development/img/stage_group_dashboards_debug_3.png diff --git a/doc/development/img/stage_group_dashboards_filters.png b/doc/development/img/stage_group_dashboards_filters.png Binary files differ new file mode 100644 index 00000000000..27a836bc36d --- /dev/null +++ b/doc/development/img/stage_group_dashboards_filters.png diff --git a/doc/development/img/stage_group_dashboards_metrics.png b/doc/development/img/stage_group_dashboards_metrics.png Binary files differ new file mode 100644 index 00000000000..6b6faff6e3b --- /dev/null +++ b/doc/development/img/stage_group_dashboards_metrics.png diff --git a/doc/development/img/stage_group_dashboards_time_customization.png b/doc/development/img/stage_group_dashboards_time_customization.png Binary files differ new file mode 100644 index 00000000000..49e61183b7c --- /dev/null +++ b/doc/development/img/stage_group_dashboards_time_customization.png diff --git a/doc/development/img/stage_group_dashboards_time_filter.png b/doc/development/img/stage_group_dashboards_time_filter.png Binary files differ new file mode 100644 index 00000000000..81a3dc789f1 --- /dev/null +++ b/doc/development/img/stage_group_dashboards_time_filter.png diff --git a/doc/development/instrumentation.md b/doc/development/instrumentation.md index 8fb7f29c86c..94b56e10d9e 100644 --- a/doc/development/instrumentation.md +++ b/doc/development/instrumentation.md @@ -11,7 +11,7 @@ blocks of Ruby code. Method instrumentation is the primary form of instrumentation with block-based instrumentation only being used when we want to drill down to specific regions of code within a method. -Please refer to [Product Analytics](https://about.gitlab.com/handbook/product/product-analytics-guide/) if you are tracking product usage patterns. 
+Please refer to [Product Intelligence](https://about.gitlab.com/handbook/product/product-intelligence-guide/) if you are tracking product usage patterns. ## Instrumenting Methods diff --git a/doc/development/integrations/codesandbox.md b/doc/development/integrations/codesandbox.md index 1641f4656a0..faa1ec0ee3f 100644 --- a/doc/development/integrations/codesandbox.md +++ b/doc/development/integrations/codesandbox.md @@ -1,7 +1,7 @@ # Set up local Codesandbox development environment -This guide walks through setting up a local [Codesandbox repository](https://github.com/codesandbox/codesandbox-client) and integrating it with a local GitLab instance. Codesandbox -is used to power the Web IDE's [Live Preview feature](../../user/project/web_ide/index.md#live-preview). Having a local Codesandbox setup is useful for debugging upstream issues or +This guide walks through setting up a local [Codesandbox repository](https://github.com/codesandbox/codesandbox-client) and integrating it with a local GitLab instance. Codesandbox +is used to power the Web IDE's [Live Preview feature](../../user/project/web_ide/index.md#live-preview). Having a local Codesandbox setup is useful for debugging upstream issues or creating upstream contributions like [this one](https://github.com/codesandbox/codesandbox-client/pull/5137). ## Initial setup @@ -59,7 +59,7 @@ to use a locally-built module. To build and use a local `smooshpack` module: yarn run start ``` - Now, in the GitLab project, you can run `yarn link "smooshpack"`. `yarn` looks + Now, in the GitLab project, you can run `yarn link "smooshpack"`. `yarn` looks for `smooshpack` **on disk** as opposed to the one hosted remotely. 1. 
In the `gitlab` project directory, run: @@ -110,7 +110,7 @@ npx http-server --proxy http://localhost:3000 -S -C $PATH_TO_CERT_PEM -K $PATH_T ### Update `bundler_url` setting in GitLab -We need to update our `application_setting_implementation.rb` to point to the server that hosts the +We need to update our `application_setting_implementation.rb` to point to the server that hosts the Codesandbox `sandpack` assets. For instance, if these assets are hosted by a server at `https://sandpack.local:8044`: ```patch @@ -125,7 +125,7 @@ index 6eed627b502..1824669e881 100644 - 'https://sandbox-prod.gitlab-static.net' + 'https://sandpack.local:8044' end - + private ``` diff --git a/doc/development/integrations/jenkins.md b/doc/development/integrations/jenkins.md index c87b15e192a..a9a1026f1a8 100644 --- a/doc/development/integrations/jenkins.md +++ b/doc/development/integrations/jenkins.md @@ -90,7 +90,7 @@ option because the Jenkins plugin updates the build status on GitLab. In a **Pip ## Configure your GitLab project -To activate the Jenkins service you must have a Starter subscription or higher. +To activate the Jenkins service: 1. Go to your project's page, then **Settings > Integrations > Jenkins CI**. 1. Check the **Active** checkbox and the triggers for **Push** and **Merge request**. diff --git a/doc/development/integrations/jira_connect.md b/doc/development/integrations/jira_connect.md index 408b0e6068e..48beb526774 100644 --- a/doc/development/integrations/jira_connect.md +++ b/doc/development/integrations/jira_connect.md @@ -10,13 +10,20 @@ The following are required to install and test the app: - A Jira Cloud instance. Atlassian provides [free instances for development and testing](https://developer.atlassian.com/platform/marketplace/getting-started/#free-developer-instances-to-build-and-test-your-app). - A GitLab instance available over the internet. For the app to work, Jira Cloud should - be able to connect to the GitLab instance through the internet. 
To easily expose your - local development environment, you can use tools like: - - [serveo](https://medium.com/automationmaster/how-to-forward-my-local-port-to-public-using-serveo-4979f352a3bf) - - [ngrok](https://ngrok.com). + be able to connect to the GitLab instance through the internet. For this we + recommend using Gitpod or a similar cloud development environment. For more + information on using Gitpod with GDK, see the: - These also take care of SSL for you because Jira requires all connections to the app - host to be over SSL. + - [GDK in Gitpod](https://www.loom.com/share/9c9711d4876a40869b9294eecb24c54d) + video. + - [GDK with Gitpod](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/master/doc/howto/gitpod.md) + documentation. + + You **must not** use tunneling tools such as Serveo or `ngrok`. These are + security risks, and must not be run on developer laptops. + + Jira requires all connections to the app host to be over SSL, so if you set up + your own environment, remember to enable SSL and an appropriate certificate. ## Install the app in Jira @@ -38,7 +45,7 @@ To install the app in Jira: For example: ```plaintext - https://xxxx.serveo.net/-/jira_connect/app_descriptor.json + https://xxxx.gitpod.io/-/jira_connect/app_descriptor.json ``` 1. Click **Upload**. diff --git a/doc/development/internal_api.md b/doc/development/internal_api.md index 4971e4d629d..43655c37048 100644 --- a/doc/development/internal_api.md +++ b/doc/development/internal_api.md @@ -448,3 +448,22 @@ Example Request: ```shell curl --request POST --header "Gitlab-Kas-Api-Request: <JWT token>" --header "Content-Type: application/json" --data '{"gitops_sync_count":1}' "http://localhost:3000/api/v4/internal/kubernetes/usage_metrics" ``` + +### Kubernetes agent alert metrics + +Called from GitLab Kubernetes Agent Server (KAS) to save alerts derived from Cilium on Kubernetes +Cluster. 
+ +| Attribute | Type | Required | Description | +|:----------|:-------|:---------|:------------| +| `alert` | Hash | yes | Alert details. Currently uses the same format as a [3rd party alert](../operations/incident_management/alert_integrations.md#customize-the-alert-payload-outside-of-gitlab). | + +```plaintext +POST internal/kubernetes/modules/cilium_alert +``` + +Example Request: + +```shell +curl --request POST --header "Gitlab-Kas-Api-Request: <JWT token>" --header "Authorization: Bearer <agent token>" --header "Content-Type: application/json" --data '"{\"alert\":{\"title\":\"minimal\",\"message\":\"network problem\",\"evalMatches\":[{\"value\":1,\"metric\":\"Count\",\"tags\":{}}]}}"' "http://localhost:3000/api/v4/internal/kubernetes/modules/cilium_alert" +``` diff --git a/doc/development/iterating_tables_in_batches.md b/doc/development/iterating_tables_in_batches.md index 3953e7097dd..43d7f32ad7f 100644 --- a/doc/development/iterating_tables_in_batches.md +++ b/doc/development/iterating_tables_in_batches.md @@ -42,6 +42,29 @@ The API of this method is similar to `in_batches`, though it doesn't support all of the arguments that `in_batches` supports. You should always use `each_batch` _unless_ you have a specific need for `in_batches`. +## Avoid iterating over non-unique columns + +Proceed with extra caution when iterating over a column that can contain duplicate values, and avoid doing so where possible. +When you iterate over an attribute that is not unique, even with the applied max batch size, there is no guarantee that the resulting batches will not surpass it. 
+The following snippet demonstrates this situation: when one attempts to select `Ci::Build` entries for users with `id` between `1` and `10,000`, the database returns `1,215,178` +matching rows: + +```ruby +[ gstg ] production> Ci::Build.where(user_id: (1..10_000)).size +=> 1215178 +``` + +This happens because the built relation is translated into the following query: + +```ruby +[ gstg ] production> puts Ci::Build.where(user_id: (1..10_000)).to_sql +SELECT "ci_builds".* FROM "ci_builds" WHERE "ci_builds"."type" = 'Ci::Build' AND "ci_builds"."user_id" BETWEEN 1 AND 10000 +=> nil +``` + +For queries that filter a non-unique column by range, such as `WHERE "ci_builds"."user_id" BETWEEN ? AND ?`, limiting the range size to a certain threshold (`10,000` in the previous example) does not limit the size of the returned dataset. That happens because when taking `n` possible values of an attribute, +one can't tell for sure that the number of records that contain them will be less than `n`. + ## Column definition `EachBatch` uses the primary key of the model by default for the iteration. This works most of the cases, however in some cases, you might want to use a different column for the iteration. @@ -55,7 +78,7 @@ end The query above iterates over the project creators and prints them out without duplications. NOTE: -In case the column is not unique (no unique index definition), calling the `distinct` method on the relation is necessary. +In case the column is not unique (no unique index definition), calling the `distinct` method on the relation is necessary. 
Using a non-unique column without `distinct` may result in `each_batch` falling into an endless loop, as described in the following [issue](https://gitlab.com/gitlab-org/gitlab/-/issues/285097) ## `EachBatch` in data migrations diff --git a/doc/development/migration_style_guide.md b/doc/development/migration_style_guide.md index 8cdfbd558ca..e1205346585 100644 --- a/doc/development/migration_style_guide.md +++ b/doc/development/migration_style_guide.md @@ -516,12 +516,14 @@ class MyMigration < ActiveRecord::Migration[6.0] disable_ddl_transaction! + INDEX_NAME = 'index_name' + def up - add_concurrent_index :table, :column + add_concurrent_index :table, :column, name: INDEX_NAME end def down - remove_concurrent_index :table, :column, name: index_name + remove_concurrent_index :table, :column, name: INDEX_NAME end end ``` diff --git a/doc/development/new_fe_guide/development/accessibility.md b/doc/development/new_fe_guide/development/accessibility.md index 81f3773dd5c..65485104efe 100644 --- a/doc/development/new_fe_guide/development/accessibility.md +++ b/doc/development/new_fe_guide/development/accessibility.md @@ -42,7 +42,7 @@ In forms we should use the `for` attribute in the label statement: ## Testing -1. On MacOS you can use [VoiceOver](http://www.apple.com/accessibility/vision/) by pressing `cmd+F5`. +1. On MacOS you can use [VoiceOver](https://www.apple.com/accessibility/vision/) by pressing `cmd+F5`. 1. On Windows you can use [Narrator](https://www.microsoft.com/en-us/accessibility/windows) by pressing Windows logo key + Control + Enter. ## Online resources diff --git a/doc/development/packages.md b/doc/development/packages.md index 689dc6b4141..aadd71c9ffa 100644 --- a/doc/development/packages.md +++ b/doc/development/packages.md @@ -242,6 +242,24 @@ create the package record. 
Workhorse provides a variety of file metadata such as For testing purposes, you may want to [enable object storage](https://gitlab.com/gitlab-org/gitlab-development-kit/blob/master/doc/howto/object_storage.md) in your local development environment. +#### Rate Limits on GitLab.com + +Package manager clients can make rapid requests that exceed the +[GitLab.com standard API rate limits](../user/gitlab_com/index.md#gitlabcom-specific-rate-limits). +This results in a `429 Too Many Requests` error. + +We have opened a set of paths to allow higher rate limits. Unless it is not possible, +new package managers should follow these conventions so they can take advantage of the +expanded package rate limit. + +These route prefixes guarantee a higher rate limit: + +```plaintext +/api/v4/packages/ +/api/v4/projects/:project_id/packages/ +/api/v4/groups/:group_id/-/packages/ +``` + ### Future Work While working on the MVC, contributors might find features that are not mandatory for the MVC but can provide a better user experience. It's generally a good idea to keep an eye on those and open issues. 
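As a rough illustration of the route prefixes listed under "Rate Limits on GitLab.com" in the `packages.md` hunk above, the following sketch checks whether a request path would fall under one of them. The constant `PACKAGE_ROUTE_PATTERNS` and the method `package_rate_limit_path?` are hypothetical names invented for this example; they are not part of GitLab, and the real rate limiting is configured elsewhere.

```ruby
# frozen_string_literal: true

# Hypothetical helper, not GitLab code: one pattern per documented route prefix.
# `[^/]+` stands in for the `:project_id` and `:group_id` path segments.
PACKAGE_ROUTE_PATTERNS = [
  %r{\A/api/v4/packages/},
  %r{\A/api/v4/projects/[^/]+/packages/},
  %r{\A/api/v4/groups/[^/]+/-/packages/}
].freeze

# Returns true when the path starts with one of the package route prefixes.
def package_rate_limit_path?(path)
  PACKAGE_ROUTE_PATTERNS.any? { |pattern| pattern.match?(path) }
end

puts package_rate_limit_path?('/api/v4/projects/42/packages/npm/foo')
puts package_rate_limit_path?('/api/v4/projects/42/issues')
```

A new package manager endpoint whose path does not match any of these patterns would, under this reading of the conventions, fall back to the standard API rate limits.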
diff --git a/doc/development/pipelines.md b/doc/development/pipelines.md index 3243c7ec753..0354e703357 100644 --- a/doc/development/pipelines.md +++ b/doc/development/pipelines.md @@ -60,7 +60,7 @@ Reference pipeline: <https://gitlab.com/gitlab-org/gitlab/pipelines/135236627> ```mermaid graph LR subgraph "No needed jobs"; - 1-1["danger-review (3.5 minutes)"]; + 1-1["danger-review (2.3 minutes)"]; click 1-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8100542&udv=0" 1-50["docs lint (9 minutes)"]; click 1-50 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356757&udv=0" @@ -76,23 +76,23 @@ graph RL; classDef criticalPath fill:#f66; subgraph "No needed jobs"; - 1-1["danger-review (3.5 minutes)"]; + 1-1["danger-review (2.3 minutes)"]; click 1-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8100542&udv=0" - 1-2["build-qa-image (2.4 minutes)"]; + 1-2["build-qa-image (1.6 minutes)"]; click 1-2 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914325&udv=0" - 1-3["compile-test-assets (8.5 minutes)"]; + 1-3["compile-test-assets (7 minutes)"]; click 1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914317&udv=0" - 1-4["compile-test-assets as-if-foss (8.35 minutes)"]; + 1-4["compile-test-assets as-if-foss (7 minutes)"]; click 1-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356616&udv=0" 1-5["compile-production-assets (19 minutes)"]; click 1-5 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914312&udv=0" - 1-6["setup-test-env (7.4 minutes)"]; + 1-6["setup-test-env (9 minutes)"]; click 1-6 
"https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914315&udv=0" 1-7["review-stop-failed-deployment"]; 1-8["dependency_scanning"]; 1-9["qa:internal, qa:internal-as-if-foss"]; 1-11["qa:selectors, qa:selectors-as-if-foss"]; - 1-14["retrieve-tests-metadata (1.9 minutes)"]; + 1-14["retrieve-tests-metadata (1 minutes)"]; click 1-14 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356697&udv=0" 1-15["code_quality"]; 1-16["brakeman-sast"]; @@ -100,7 +100,7 @@ graph RL; 1-18["kubesec-sast"]; 1-19["nodejs-scan-sast"]; 1-20["secrets-sast"]; - 1-21["static-analysis (17 minutes)"]; + 1-21["static-analysis (30 minutes)"]; click 1-21 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914471&udv=0" class 1-3 criticalPath; @@ -111,26 +111,26 @@ graph RL; click 2_1-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356715&udv=0" 2_1-2["memory-static (4.75 minutes)"]; click 2_1-2 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356721&udv=0" - 2_1-3["run-dev-fixtures (5 minutes)"]; + 2_1-3["run-dev-fixtures (6 minutes)"]; click 2_1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356729&udv=0" - 2_1-4["run-dev-fixtures-ee (5 minutes)"]; + 2_1-4["run-dev-fixtures-ee (6.75 minutes)"]; click 2_1-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356731&udv=0" subgraph "Needs `setup-test-env`"; 2_1-1 & 2_1-2 & 2_1-3 & 2_1-4 --> 1-6; end - 2_2-2["frontend-fixtures (16.5 minutes)"]; + 2_2-2["rspec frontend_fixture/rspec-ee frontend_fixture (12 minutes)"]; class 2_2-2 criticalPath; click 2_2-2 
"https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=7910143&udv=0" - 2_2-4["memory-on-boot (7.19 minutes)"]; + 2_2-4["memory-on-boot (6 minutes)"]; click 2_2-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356727&udv=0" - 2_2-5["webpack-dev-server (6.1 minutes)"]; + 2_2-5["webpack-dev-server (4.5 minutes)"]; click 2_2-5 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8404303&udv=0" subgraph "Needs `setup-test-env` & `compile-test-assets`"; 2_2-2 & 2_2-4 & 2_2-5 --> 1-6 & 1-3; end - 2_3-1["build-assets-image (2.5 minutes)"]; + 2_3-1["build-assets-image (1.6 minutes)"]; subgraph "Needs `compile-production-assets`"; 2_3-1 --> 1-5 end @@ -153,17 +153,17 @@ graph RL; click 3_1-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914204&udv=0" 3_1-2["karma (4 minutes)"]; click 3_1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914200&udv=0" - subgraph "Needs `frontend-fixtures`"; + subgraph "Needs `rspec frontend_fixture/rspec-ee frontend_fixture`"; 3_1-1 & 3_1-2 --> 2_2-2; end - 3_2-1["rspec:coverage (7.5 minutes)"]; + 3_2-1["rspec:coverage (4.6 minutes)"]; subgraph "Depends on `rspec` jobs"; 3_2-1 -.->|"(don't use needs because of limitations)"| 2_5-1; click 3_2-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=7248745&udv=0" end - 4_1-1["coverage-frontend (3.6 minutes)"]; + 4_1-1["coverage-frontend (2.75 minutes)"]; subgraph "Needs `jest`"; 4_1-1 --> 3_1-1; class 4_1-1 criticalPath; @@ -180,23 +180,23 @@ graph RL; classDef criticalPath fill:#f66; subgraph "No needed jobs"; - 1-1["danger-review (3.5 minutes)"]; + 1-1["danger-review (2.3 minutes)"]; click 1-1 
"https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8100542&udv=0" - 1-2["build-qa-image (2.4 minutes)"]; + 1-2["build-qa-image (1.6 minutes)"]; click 1-2 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914325&udv=0" - 1-3["compile-test-assets (8.5 minutes)"]; + 1-3["compile-test-assets (7 minutes)"]; click 1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914317&udv=0" - 1-4["compile-test-assets as-if-foss (8.35 minutes)"]; + 1-4["compile-test-assets as-if-foss (7 minutes)"]; click 1-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356616&udv=0" 1-5["compile-production-assets (19 minutes)"]; click 1-5 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914312&udv=0" - 1-6["setup-test-env (7.4 minutes)"]; + 1-6["setup-test-env (9 minutes)"]; click 1-6 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914315&udv=0" 1-7["review-stop-failed-deployment"]; 1-8["dependency_scanning"]; 1-9["qa:internal, qa:internal-as-if-foss"]; 1-11["qa:selectors, qa:selectors-as-if-foss"]; - 1-14["retrieve-tests-metadata (1.9 minutes)"]; + 1-14["retrieve-tests-metadata (1 minutes)"]; click 1-14 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356697&udv=0" 1-15["code_quality"]; 1-16["brakeman-sast"]; @@ -204,7 +204,7 @@ graph RL; 1-18["kubesec-sast"]; 1-19["nodejs-scan-sast"]; 1-20["secrets-sast"]; - 1-21["static-analysis (17 minutes)"]; + 1-21["static-analysis (30 minutes)"]; click 1-21 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914471&udv=0" class 1-3 criticalPath; @@ -216,26 +216,26 @@ graph RL; click 2_1-1 
"https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356715&udv=0" 2_1-2["memory-static (4.75 minutes)"]; click 2_1-2 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356721&udv=0" - 2_1-3["run-dev-fixtures (5 minutes)"]; + 2_1-3["run-dev-fixtures (6 minutes)"]; click 2_1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356729&udv=0" - 2_1-4["run-dev-fixtures-ee (5 minutes)"]; + 2_1-4["run-dev-fixtures-ee (6.75 minutes)"]; click 2_1-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356731&udv=0" subgraph "Needs `setup-test-env`"; 2_1-1 & 2_1-2 & 2_1-3 & 2_1-4 --> 1-6; end - 2_2-2["frontend-fixtures (16.5 minutes)"]; + 2_2-2["rspec frontend_fixture/rspec-ee frontend_fixture (12 minutes)"]; class 2_2-2 criticalPath; click 2_2-2 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=7910143&udv=0" - 2_2-4["memory-on-boot (7.19 minutes)"]; + 2_2-4["memory-on-boot (6 minutes)"]; click 2_2-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356727&udv=0" - 2_2-5["webpack-dev-server (6.1 minutes)"]; + 2_2-5["webpack-dev-server (4.5 minutes)"]; click 2_2-5 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8404303&udv=0" subgraph "Needs `setup-test-env` & `compile-test-assets`"; 2_2-2 & 2_2-4 & 2_2-5 --> 1-6 & 1-3; end - 2_3-1["build-assets-image (2.5 minutes)"]; + 2_3-1["build-assets-image (1.6 minutes)"]; class 2_3-1 criticalPath; subgraph "Needs `compile-production-assets`"; 2_3-1 --> 1-5 @@ -266,17 +266,17 @@ graph RL; click 3_1-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914204&udv=0" 3_1-2["karma (4 
minutes)"]; click 3_1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914200&udv=0" - subgraph "Needs `frontend-fixtures`"; + subgraph "Needs `rspec frontend_fixture/rspec-ee frontend_fixture`"; 3_1-1 & 3_1-2 --> 2_2-2; end - 3_2-1["rspec:coverage (7.5 minutes)"]; + 3_2-1["rspec:coverage (4.6 minutes)"]; subgraph "Depends on `rspec` jobs"; 3_2-1 -.->|"(don't use needs because of limitations)"| 2_5-1; click 3_2-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=7248745&udv=0" end - 4_1-1["coverage-frontend (3.6 minutes)"]; + 4_1-1["coverage-frontend (2.75 minutes)"]; subgraph "Needs `jest`"; 4_1-1 --> 3_1-1; class 4_1-1 criticalPath; @@ -311,23 +311,23 @@ graph RL; classDef criticalPath fill:#f66; subgraph "No needed jobs"; - 1-1["danger-review (3.5 minutes)"]; + 1-1["danger-review (2.3 minutes)"]; click 1-1 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8100542&udv=0" - 1-2["build-qa-image (2.4 minutes)"]; + 1-2["build-qa-image (1.6 minutes)"]; click 1-2 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914325&udv=0" - 1-3["compile-test-assets (8.5 minutes)"]; + 1-3["compile-test-assets (7 minutes)"]; click 1-3 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914317&udv=0" - 1-4["compile-test-assets as-if-foss (8.35 minutes)"]; + 1-4["compile-test-assets as-if-foss (7 minutes)"]; click 1-4 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356616&udv=0" 1-5["compile-production-assets (19 minutes)"]; click 1-5 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914312&udv=0" - 1-6["setup-test-env (7.4 minutes)"]; + 1-6["setup-test-env (9 minutes)"]; click 
1-6 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914315&udv=0" 1-7["review-stop-failed-deployment"]; 1-8["dependency_scanning"]; 1-9["qa:internal, qa:internal-as-if-foss"]; 1-11["qa:selectors, qa:selectors-as-if-foss"]; - 1-14["retrieve-tests-metadata (1.9 minutes)"]; + 1-14["retrieve-tests-metadata (1 minutes)"]; click 1-14 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=8356697&udv=0" 1-15["code_quality"]; 1-16["brakeman-sast"]; @@ -335,7 +335,7 @@ graph RL; 1-18["kubesec-sast"]; 1-19["nodejs-scan-sast"]; 1-20["secrets-sast"]; - 1-21["static-analysis (17 minutes)"]; + 1-21["static-analysis (30 minutes)"]; click 1-21 "https://app.periscopedata.com/app/gitlab/652085/Engineering-Productivity---Pipeline-Build-Durations?widget=6914471&udv=0" class 1-5 criticalPath; @@ -347,13 +347,13 @@ graph RL; 2_1-1 --> 1-6; end - 2_3-1["build-assets-image (2.5 minutes)"]; + 2_3-1["build-assets-image (1.6 minutes)"]; subgraph "Needs `compile-production-assets`"; 2_3-1 --> 1-5 class 2_3-1 criticalPath; end - 2_4-1["package-and-qa (108 minutes)"]; + 2_4-1["package-and-qa (105 minutes)"]; subgraph "Needs `build-qa-image` & `build-assets-image`"; 2_4-1 --> 1-2 & 2_3-1; class 2_4-1 criticalPath; @@ -422,24 +422,29 @@ We are using a custom mapping between source file to test files, maintained in t ### PostgreSQL versions testing +Even though [Omnibus defaults to PG12 for new installs and upgrades](https://docs.gitlab.com/omnibus/package-information/postgresql_versions.md), +our test suite is currently running against PG11, since GitLab.com still runs on PG11. + +We do run our test suite against PG12 on nightly scheduled pipelines as well as upon specific +database library changes in MRs and `master` pipelines (with the `rspec db-library-code pg12` job). + #### Current versions testing | Where? 
| PostgreSQL version |
-| ------ | ------ |
-| MRs | 11 |
-| `master` (non-scheduled pipelines) | 11 |
-| 2-hourly scheduled pipelines | 11 |
+| ------ | ------------------ |
+| MRs | 11, 12 for DB library changes |
+| `master` (non-scheduled pipelines) | 11, 12 for DB library changes |
+| 2-hourly scheduled pipelines | 11, 12 for DB library changes |
 | `nightly` scheduled pipelines | 11, 12 |

 #### Long-term plan

 We follow the [PostgreSQL versions shipped with Omnibus GitLab](https://docs.gitlab.com/omnibus/package-information/postgresql_versions.html):

-| PostgreSQL version | 13.0 (May 2020) | 13.1 (June 2020) | 13.2 (July 2020) | 13.3 (August 2020) | 13.4, 13.5 | [13.7 (December 2020)](https://gitlab.com/gitlab-org/omnibus-gitlab/-/issues/5722) | 14.0 (May 2021?) |
-| ------ | --------------- | ---------------- | ---------------- | ------------------ | ------------ | -------------------- | ---------------- |
-| PG11 | MRs/`master`/`2-hour`/`nightly` | MRs/`master`/`2-hour`/`nightly` | MRs/`master`/`2-hour`/`nightly` | MRs/`master`/`2-hour`/`nightly` | MRs/`master`/`2-hour`/`nightly` | `nightly` | - |
-| PG12 | - | - | `nightly` | `2-hour`/`nightly` | `2-hour`/`nightly` | MRs/`2-hour`/`nightly` | `2-hour`/`nightly` |
-| PG13 | - | - | - | - | - | - | MRs/`2-hour`/`nightly` |
+| PostgreSQL version | 13.7 (December 2020) | 13.8 (January 2021) | 13.9 (February 2021) | 13.10 (March 2021) | 13.11 (April 2021) | 14.0 (May 2021?) |
+| -------------------| -------------------- | ------------------- | -------------------- | ------------------ | ------------------ | ---------------- |
+| PG11 | MRs/`2-hour`/`nightly` | MRs/`2-hour`/`nightly` | MRs/`2-hour`/`nightly` | MRs/`2-hour`/`nightly` | MRs/`2-hour`/`nightly` | MRs/`2-hour`/`nightly` |
+| PG12 | `nightly` | `nightly` | `nightly` | `nightly` | `nightly` | `nightly` |

 ### Test jobs

@@ -504,6 +509,10 @@ request, be sure to start the `dont-interrupt-me` job before pushing.
 - `update-yarn-cache`, defined in [`.gitlab/ci/frontend.gitlab-ci.yml`](https://gitlab.com/gitlab-org/gitlab/blob/master/.gitlab/ci/frontend.gitlab-ci.yml).
 1. These jobs run in merge requests whose title include `UPDATE CACHE`.

+### Artifacts strategy
+
+We limit the artifacts that are saved and retrieved by jobs to the minimum in order to reduce the upload/download time and costs, as well as the artifacts storage.
+
 ### Pre-clone step

 The `gitlab-org/gitlab` project on GitLab.com uses a [pre-clone step](https://gitlab.com/gitlab-org/gitlab/-/issues/39134)
@@ -671,7 +680,7 @@ and included in `rules` definitions via [YAML anchors](../ci/yaml/README.md#anch
 | `if-master-refs` | Matches if the current branch is `master`. | |
 | `if-master-push` | Matches if the current branch is `master` and pipeline source is `push`. | |
 | `if-master-schedule-2-hourly` | Matches if the current branch is `master` and pipeline runs on a 2-hourly schedule. | |
-| `if-master-schedule-2-nightly` | Matches if the current branch is `master` and pipeline runs on a nightly schedule. | |
+| `if-master-schedule-nightly` | Matches if the current branch is `master` and pipeline runs on a nightly schedule. | |
 | `if-auto-deploy-branches` | Matches if the current branch is an auto-deploy one. | |
 | `if-master-or-tag` | Matches if the pipeline is for the `master` branch or for a tag. | |
 | `if-merge-request` | Matches if the pipeline is for a merge request.
| |
diff --git a/doc/development/product_analytics/event_dictionary.md b/doc/development/product_analytics/event_dictionary.md
index 9c363f08cb4..e8b8e0c4885 100644
--- a/doc/development/product_analytics/event_dictionary.md
+++ b/doc/development/product_analytics/event_dictionary.md
@@ -1,8 +1,8 @@
 ---
-redirect_to: 'https://about.gitlab.com/handbook/product/product-analytics-guide/'
+redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/'
 ---

-This document was moved to [another location](https://about.gitlab.com/handbook/product/product-analytics-guide/).
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/).

-<!-- This redirect file can be deleted after February 1, 2021. -->
+<!-- This redirect file can be deleted after December 1, 2021. -->
 <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page -->
diff --git a/doc/development/product_analytics/index.md b/doc/development/product_analytics/index.md
index 9c363f08cb4..4d2168cf304 100644
--- a/doc/development/product_analytics/index.md
+++ b/doc/development/product_analytics/index.md
@@ -1,8 +1,13 @@
 ---
-redirect_to: 'https://about.gitlab.com/handbook/product/product-analytics-guide/'
+redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/'
 ---

-This document was moved to [another location](https://about.gitlab.com/handbook/product/product-analytics-guide/).
+This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/).

-<!-- This redirect file can be deleted after February 1, 2021. -->
+<!-- Needed by the Product Intelligence group
+
+Since our new standard for redirects otherwise lies within the gitlab-docs repo,
+as long as we need a redirect to the handbook, we need to retain this file.
+ -->
+<!-- This redirect file can be deleted after December 1, 2021.
--> <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/product_analytics/snowplow.md b/doc/development/product_analytics/snowplow.md index 48b816f0b83..bb056ffddfe 100644 --- a/doc/development/product_analytics/snowplow.md +++ b/doc/development/product_analytics/snowplow.md @@ -1,616 +1,8 @@ --- -stage: Growth -group: Product Analytics -info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +redirect_to: '../snowplow.md' --- -# Snowplow Guide +This document was moved to [another location](../snowplow.md). -This guide provides an overview of how Snowplow works, and implementation details. - -For more information about Product Analytics, see: - -- [Product Analytics Guide](https://about.gitlab.com/handbook/product/product-analytics-guide/) -- [Usage Ping Guide](usage_ping.md) - -More useful links: - -- [Product Analytics Direction](https://about.gitlab.com/direction/product-analytics/) -- [Data Analysis Process](https://about.gitlab.com/handbook/business-ops/data-team/#data-analysis-process/) -- [Data for Product Managers](https://about.gitlab.com/handbook/business-ops/data-team/programs/data-for-product-managers/) -- [Data Infrastructure](https://about.gitlab.com/handbook/business-ops/data-team/platform/infrastructure/) - -## What is Snowplow - -Snowplow is an enterprise-grade marketing and product analytics platform which helps track the way users engage with our website and application. - -[Snowplow](https://github.com/snowplow/snowplow) consists of the following loosely-coupled sub-systems: - -- **Trackers** fire Snowplow events. Snowplow has 12 trackers, covering web, mobile, desktop, server, and IoT. -- **Collectors** receive Snowplow events from trackers. 
We have three different event collectors, synchronizing events either to Amazon S3, Apache Kafka, or Amazon Kinesis. -- **Enrich** cleans up the raw Snowplow events, enriches them and puts them into storage. We have an Hadoop-based enrichment process, and a Kinesis-based or Kafka-based process. -- **Storage** is where the Snowplow events live. We store the Snowplow events in a flat file structure on S3, and in the Redshift and PostgreSQL databases. -- **Data modeling** is where event-level data is joined with other data sets and aggregated into smaller data sets, and business logic is applied. This produces a clean set of tables which make it easier to perform analysis on the data. We have data models for Redshift and Looker. -- **Analytics** are performed on the Snowplow events or on the aggregate tables. - -![snowplow_flow](../img/snowplow_flow.png) - -## Snowplow schema - -We have many definitions of Snowplow's schema. We have an active issue to [standardize this schema](https://gitlab.com/gitlab-org/gitlab/-/issues/207930) including the following definitions: - -- Frontend and backend taxonomy as listed below -- [Structured event taxonomy](#structured-event-taxonomy) -- [Self describing events](https://github.com/snowplow/snowplow/wiki/Custom-events#self-describing-events) -- [Iglu schema](https://gitlab.com/gitlab-org/iglu/) -- [Snowplow authored events](https://github.com/snowplow/snowplow/wiki/Snowplow-authored-events) - -## Enabling Snowplow - -Tracking can be enabled at: - -- The instance level, which enables tracking on both the frontend and backend layers. -- User level, though user tracking can be disabled on a per-user basis. GitLab tracking respects the [Do Not Track](https://www.eff.org/issues/do-not-track) standard, so any user who has enabled the Do Not Track option in their browser is not tracked at a user level. - -We use Snowplow for the majority of our tracking strategy and it is enabled on GitLab.com. 
On a self-managed instance, Snowplow can be enabled by navigating to: - -- **Admin Area > Settings > General** in the UI. -- `admin/application_settings/integrations` in your browser. - -The following configuration is required: - -| Name | Value | -|---------------|---------------------------| -| Collector | `snowplow.trx.gitlab.net` | -| Site ID | `gitlab` | -| Cookie domain | `.gitlab.com` | - -## Snowplow request flow - -The following example shows a basic request/response flow between the following components: - -- Snowplow JS / Ruby Trackers on GitLab.com -- [GitLab.com Snowplow Collector](https://gitlab.com/gitlab-com/gl-infra/readiness/-/blob/master/library/snowplow/index.md) -- The GitLab S3 Bucket -- The GitLab Snowflake Data Warehouse -- Sisense: - -```mermaid -sequenceDiagram - participant Snowplow JS (Frontend) - participant Snowplow Ruby (Backend) - participant GitLab.com Snowplow Collector - participant S3 Bucket - participant Snowflake DW - participant Sisense Dashboards - Snowplow JS (Frontend) ->> GitLab.com Snowplow Collector: FE Tracking event - Snowplow Ruby (Backend) ->> GitLab.com Snowplow Collector: BE Tracking event - loop Process using Kinesis Stream - GitLab.com Snowplow Collector ->> GitLab.com Snowplow Collector: Log raw events - GitLab.com Snowplow Collector ->> GitLab.com Snowplow Collector: Enrich events - GitLab.com Snowplow Collector ->> GitLab.com Snowplow Collector: Write to disk - end - GitLab.com Snowplow Collector ->> S3 Bucket: Kinesis Firehose - S3 Bucket->>Snowflake DW: Import data - Snowflake DW->>Snowflake DW: Transform data using dbt - Snowflake DW->>Sisense Dashboards: Data available for querying -``` - -## Structured event taxonomy - -When adding new click events, we should add them in a way that's internally consistent. If we don't, it is very painful to perform analysis across features since each feature captures events differently. - -The current method provides several attributes that are sent on each click event. 
Please try to follow these guidelines when specifying events to capture: - -| attribute | type | required | description | -| --------- | ------- | -------- | ----------- | -| category | text | true | The page or backend area of the application. Unless infeasible, please use the Rails page attribute by default in the frontend, and namespace + classname on the backend. | -| action | text | true | The action the user is taking, or aspect that's being instrumented. The first word should always describe the action or aspect: clicks should be `click`, activations should be `activate`, creations should be `create`, etc. Use underscores to describe what was acted on; for example, activating a form field would be `activate_form_input`. An interface action like clicking on a dropdown would be `click_dropdown`, while a behavior like creating a project record from the backend would be `create_project` | -| label | text | false | The specific element, or object that's being acted on. This is either the label of the element (e.g. a tab labeled 'Create from template' may be `create_from_template`) or a unique identifier if no text is available (e.g. closing the Groups dropdown in the top navbar might be `groups_dropdown_close`), or it could be the name or title attribute of a record being created. | -| property | text | false | Any additional property of the element, or object being acted on. | -| value | decimal | false | Describes a numeric value or something directly related to the event. This could be the value of an input (e.g. `10` when clicking `internal` visibility). | - -### Web-specific parameters - -Snowplow JS adds many [web-specific parameters](https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/snowplow-tracker-protocol/#Web-specific_parameters) to all web events by default. 
- -## Implementing Snowplow JS (Frontend) tracking - -GitLab provides `Tracking`, an interface that wraps the [Snowplow JavaScript Tracker](https://github.com/snowplow/snowplow/wiki/javascript-tracker) for tracking custom events. There are a few ways to use tracking, but each generally requires at minimum, a `category` and an `action`. Additional data can be provided that adheres to our [Structured event taxonomy](#structured-event-taxonomy). - -| field | type | default value | description | -|:-----------|:-------|:---------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `category` | string | document.body.dataset.page | Page or subsection of a page that events are being captured within. | -| `action` | string | 'generic' | Action the user is taking. Clicks should be `click` and activations should be `activate`, so for example, focusing a form field would be `activate_form_input`, and clicking a button would be `click_button`. | -| `data` | object | {} | Additional data such as `label`, `property`, `value`, and `context` as described in our [Structured event taxonomy](#structured-event-taxonomy). | - -### Tracking in HAML (or Vue Templates) - -When working within HAML (or Vue templates) we can add `data-track-*` attributes to elements of interest. All elements that have a `data-track-event` attribute automatically have event tracking bound on clicks. 
- -Below is an example of `data-track-*` attributes assigned to a button: - -```haml -%button.btn{ data: { track: { event: "click_button", label: "template_preview", property: "my-template" } } } -``` - -```html -<button class="btn" - data-track-event="click_button" - data-track-label="template_preview" - data-track-property="my-template" -/> -``` - -Event listeners are bound at the document level to handle click events on or within elements with these data attributes. This allows them to be properly handled on re-rendering and changes to the DOM. Note that because of the way these events are bound, click events should not be stopped from propagating up the DOM tree. If for any reason click events are being stopped from propagating, you need to implement your own listeners and follow the instructions in [Tracking in raw JavaScript](#tracking-in-raw-javascript). - -Below is a list of supported `data-track-*` attributes: - -| attribute | required | description | -|:----------------------|:---------|:------------| -| `data-track-event` | true | Action the user is taking. Clicks must be prepended with `click` and activations must be prepended with `activate`. For example, focusing a form field would be `activate_form_input` and clicking a button would be `click_button`. | -| `data-track-label` | false | The `label` as described in our [Structured event taxonomy](#structured-event-taxonomy). | -| `data-track-property` | false | The `property` as described in our [Structured event taxonomy](#structured-event-taxonomy). | -| `data-track-value` | false | The `value` as described in our [Structured event taxonomy](#structured-event-taxonomy). If omitted, this is the element's `value` property or an empty string. For checkboxes, the default value is the element's checked attribute or `false` when unchecked. | -| `data-track-context` | false | The `context` as described in our [Structured event taxonomy](#structured-event-taxonomy). 
| - -#### Caveats - -When using the GitLab helper method [`nav_link`](https://gitlab.com/gitlab-org/gitlab/-/blob/898b286de322e5df6a38d257b10c94974d580df8/app/helpers/tab_helper.rb#L69) be sure to wrap `html_options` under the `html_options` keyword argument. -Be careful, as this behavior can be confused with the `ActionView` helper method [`link_to`](https://api.rubyonrails.org/v5.2.3/classes/ActionView/Helpers/UrlHelper.html#method-i-link_to) that does not require additional wrapping of `html_options` - -`nav_link(controller: ['dashboard/groups', 'explore/groups'], html_options: { data: { track_label: "groups_dropdown", track_event: "click_dropdown" } })` - -vs - -`link_to assigned_issues_dashboard_path, title: _('Issues'), data: { track_label: 'main_navigation', track_event: 'click_issues_link' }` - -### Tracking within Vue components - -There's a tracking Vue mixin that can be used in components if more complex tracking is required. To use it, first import the `Tracking` library and request a mixin. - -```javascript -import Tracking from '~/tracking'; -const trackingMixin = Tracking.mixin({ label: 'right_sidebar' }); -``` - -You can provide default options that are passed along whenever an event is tracked from within your component. For instance, if all events within a component should be tracked with a given `label`, you can provide one at this time. Available defaults are `category`, `label`, `property`, and `value`. If no category is specified, `document.body.dataset.page` is used as the default. - -You can then use the mixin normally in your component with the `mixin` Vue declaration. The mixin also provides the ability to specify tracking options in `data` or `computed`. These override any defaults and allow the values to be dynamic from props, or based on state. - -```javascript -export default { - mixins: [trackingMixin], - // ...[component implementation]... 
- data() { - return { - expanded: false, - tracking: { - label: 'left_sidebar' - } - }; - }, -} -``` - -The mixin provides a `track` method that can be called within the template, or from component methods. An example of the whole implementation might look like the following. - -```javascript -export default { - mixins: [Tracking.mixin({ label: 'right_sidebar' })], - data() { - return { - expanded: false, - }; - }, - methods: { - toggle() { - this.expanded = !this.expanded; - this.track('click_toggle', { value: this.expanded }) - } - } -}; -``` - -And if needed within the template, you can use the `track` method directly as well. - -```html -<template> - <div> - <a class="toggle" @click.prevent="toggle">Toggle</a> - <div v-if="expanded"> - <p>Hello world!</p> - <a @click.prevent="track('click_action')">Track an event</a> - </div> - </div> -</template> -``` - -### Tracking in raw JavaScript - -Custom event tracking and instrumentation can be added by directly calling the `Tracking.event` static function. The following example demonstrates tracking a click on a button by calling `Tracking.event` manually. 
- -```javascript -import Tracking from '~/tracking'; - -const button = document.getElementById('create_from_template_button'); -button.addEventListener('click', () => { - Tracking.event('dashboard:projects:index', 'click_button', { - label: 'create_from_template', - property: 'template_preview', - value: 'rails', - }); -}) -``` - -### Tests and test helpers - -In Jest particularly in Vue tests, you can use the following: - -```javascript -import { mockTracking } from 'helpers/tracking_helper'; - -describe('MyTracking', () => { - let spy; - - beforeEach(() => { - spy = mockTracking('_category_', wrapper.element, jest.spyOn); - }); - - it('tracks an event when clicked on feedback', () => { - wrapper.find('.discover-feedback-icon').trigger('click'); - - expect(spy).toHaveBeenCalledWith('_category_', 'click_button', { - label: 'security-discover-feedback-cta', - property: '0', - }); - }); -}); -``` - -In obsolete Karma tests it's used as below: - -```javascript -import { mockTracking, triggerEvent } from 'spec/helpers/tracking_helper'; - -describe('my component', () => { - let trackingSpy; - - beforeEach(() => { - trackingSpy = mockTracking('_category_', vm.$el, spyOn); - }); - - const triggerEvent = () => { - // action which should trigger a event - }; - - it('tracks an event when toggled', () => { - expect(trackingSpy).not.toHaveBeenCalled(); - - triggerEvent('a.toggle'); - - expect(trackingSpy).toHaveBeenCalledWith('_category_', 'click_edit_button', { - label: 'right_sidebar', - property: 'confidentiality', - }); - }); -}); -``` - -## Implementing Snowplow Ruby (Backend) tracking - -GitLab provides `Gitlab::Tracking`, an interface that wraps the [Snowplow Ruby Tracker](https://github.com/snowplow/snowplow/wiki/ruby-tracker) for tracking custom events. 
- -Custom event tracking and instrumentation can be added by directly calling the `GitLab::Tracking.event` class method, which accepts the following arguments: - -| argument | type | default value | description | -|:-----------|:-------|:--------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| -| `category` | string | 'application' | Area or aspect of the application. This could be `HealthCheckController` or `Lfs::FileTransformer` for instance. | -| `action` | string | 'generic' | The action being taken, which can be anything from a controller action like `create` to something like an Active Record callback. | -| `data` | object | {} | Additional data such as `label`, `property`, `value`, and `context` as described in [Structured event taxonomy](#structured-event-taxonomy). These are set as empty strings if you don't provide them. | - -Tracking can be viewed as either tracking user behavior, or can be used for instrumentation to monitor and visualize performance over time in an area or aspect of code. - -For example: - -```ruby -class Projects::CreateService < BaseService - def execute - project = Project.create(params) - - Gitlab::Tracking.event('Projects::CreateService', 'create_project', - label: project.errors.full_messages.to_sentence, - value: project.valid? - ) - end -end -``` - -### Unit testing - -Use the `expect_snowplow_event` helper when testing backend Snowplow events. See [testing best practices]( -https://docs.gitlab.com/ee/development/testing_guide/best_practices.html#test-snowplow-events) for details. - -### Performance - -We use the [AsyncEmitter](https://github.com/snowplow/snowplow/wiki/Ruby-Tracker#52-the-asyncemitter-class) when tracking events, which allows for instrumentation calls to be run in a background thread. 
This is still an active area of development. - -## Developing and testing Snowplow - -There are several tools for developing and testing Snowplow Event - -| Testing Tool | Frontend Tracking | Backend Tracking | Local Development Environment | Production Environment | Production Environment | -|----------------------------------------------|--------------------|---------------------|-------------------------------|------------------------|------------------------| -| Snowplow Analytics Debugger Chrome Extension | **{check-circle}** | **{dotted-circle}** | **{check-circle}** | **{check-circle}** | **{check-circle}** | -| Snowplow Inspector Chrome Extension | **{check-circle}** | **{dotted-circle}** | **{check-circle}** | **{check-circle}** | **{check-circle}** | -| Snowplow Micro | **{check-circle}** | **{check-circle}** | **{check-circle}** | **{dotted-circle}** | **{dotted-circle}** | -| Snowplow Mini | **{check-circle}** | **{check-circle}** | **{dotted-circle}** | **{status_preparing}** | **{status_preparing}** | - -**Legend** - -**{check-circle}** Available, **{status_preparing}** In progress, **{dotted-circle}** Not Planned - -### Preparing your MR for Review - -1. For frontend events, in the MR description section, add a screenshot of the event's relevant section using the [Snowplow Analytics Debugger](https://chrome.google.com/webstore/detail/snowplow-analytics-debugg/jbnlcgeengmijcghameodeaenefieedm) Chrome browser extension. -1. For backend events, please use Snowplow Micro and add the output of the Snowplow Micro good events `GET http://localhost:9090/micro/good`. - -### Snowplow Analytics Debugger Chrome Extension - -Snowplow Analytics Debugger is a browser extension for testing frontend events. This works on production, staging and local development environments. - -1. Install the [Snowplow Analytics Debugger](https://chrome.google.com/webstore/detail/snowplow-analytics-debugg/jbnlcgeengmijcghameodeaenefieedm) Chrome browser extension. -1. 
Open Chrome DevTools to the Snowplow Analytics Debugger tab. -1. Learn more at [Igloo Analytics](https://www.iglooanalytics.com/blog/snowplow-analytics-debugger-chrome-extension.html). - -### Snowplow Inspector Chrome Extension - -Snowplow Inspector Chrome Extension is a browser extension for testing frontend events. This works on production, staging and local development environments. - -1. Install [Snowplow Inspector](https://chrome.google.com/webstore/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm?hl=en). -1. Open the Chrome extension by pressing the Snowplow Inspector icon beside the address bar. -1. Click around on a webpage with Snowplow and you should see JavaScript events firing in the inspector window. - -### Snowplow Micro - -Snowplow Micro is a very small version of a full Snowplow data collection pipeline: small enough that it can be launched by a test suite. Events can be recorded into Snowplow Micro just as they can a full Snowplow pipeline. Micro then exposes an API that can be queried. - -Snowplow Micro is a Docker-based solution for testing frontend and backend events in a local development environment. You need to modify GDK using the instructions below to set this up. - -- Read [Introducing Snowplow Micro](https://snowplowanalytics.com/blog/2019/07/17/introducing-snowplow-micro/) -- Look at the [Snowplow Micro repository](https://github.com/snowplow-incubator/snowplow-micro) -- Watch our [installation guide recording](https://www.youtube.com/watch?v=OX46fo_A0Ag) - -1. Ensure Docker is installed and running. - -1. Install [Snowplow Micro](https://github.com/snowplow-incubator/snowplow-micro) by cloning the settings in [this project](https://gitlab.com/gitlab-org/snowplow-micro-configuration): -1. Navigate to the directory with the cloned project, and start the appropriate Docker - container with the following script: - - ```shell - ./snowplow-micro.sh - ``` - -1. 
Update your instance's settings to enable Snowplow events and point to the Snowplow Micro collector: - - ```shell - gdk psql -d gitlabhq_development - update application_settings set snowplow_collector_hostname='localhost:9090', snowplow_enabled=true, snowplow_cookie_domain='.gitlab.com'; - ``` - -1. Update `DEFAULT_SNOWPLOW_OPTIONS` in `app/assets/javascripts/tracking.js` to remove `forceSecureTracker: true`: - - ```diff - diff --git a/app/assets/javascripts/tracking.js b/app/assets/javascripts/tracking.js - index 0a1211d0a76..3b98c8f28f2 100644 - --- a/app/assets/javascripts/tracking.js - +++ b/app/assets/javascripts/tracking.js - @@ -7,7 +7,6 @@ const DEFAULT_SNOWPLOW_OPTIONS = { - appId: '', - userFingerprint: false, - respectDoNotTrack: true, - - forceSecureTracker: true, - eventMethod: 'post', - contexts: { webPage: true, performanceTiming: true }, - formTracking: false, - - ``` - -1. Update `snowplow_options` in `lib/gitlab/tracking.rb` to add `protocol` and `port`: - - ```diff - diff --git a/lib/gitlab/tracking.rb b/lib/gitlab/tracking.rb - index 618e359211b..e9084623c43 100644 - --- a/lib/gitlab/tracking.rb - +++ b/lib/gitlab/tracking.rb - @@ -41,7 +41,9 @@ def snowplow_options(group) - cookie_domain: Gitlab::CurrentSettings.snowplow_cookie_domain, - app_id: Gitlab::CurrentSettings.snowplow_app_id, - form_tracking: additional_features, - - link_click_tracking: additional_features - + link_click_tracking: additional_features, - + protocol: 'http', - + port: 9090 - }.transform_keys! { |key| key.to_s.camelize(:lower).to_sym } - end - ``` - -1. 
Update `emitter` in `lib/gitlab/tracking/destinations/snowplow.rb` to change `protocol`: - - ```diff - diff --git a/lib/gitlab/tracking/destinations/snowplow.rb b/lib/gitlab/tracking/destinations/snowplow.rb - index 4fa844de325..5dd9d0eacfb 100644 - --- a/lib/gitlab/tracking/destinations/snowplow.rb - +++ b/lib/gitlab/tracking/destinations/snowplow.rb - @@ -40,7 +40,7 @@ def tracker - def emitter - SnowplowTracker::AsyncEmitter.new( - Gitlab::CurrentSettings.snowplow_collector_hostname, - - protocol: 'https' - + protocol: 'http' - ) - end - end - - ``` - -1. Restart GDK: - - ```shell - gdk restart - ``` - -1. Send a test Snowplow event from the Rails console: - - ```ruby - Gitlab::Tracking.self_describing_event('iglu:com.gitlab/pageview_context/jsonschema/1-0-0', data: { page_type: 'MY_TYPE' }, context: nil) - ``` - -1. Navigate to `localhost:9090/micro/good` to see the event. - -### Snowplow Mini - -[Snowplow Mini](https://github.com/snowplow/snowplow-mini) is an easily-deployable, single-instance version of Snowplow. - -Snowplow Mini can be used for testing frontend and backend events on a production, staging and local development environment. - -For GitLab.com, we're setting up a [QA and Testing environment](https://gitlab.com/gitlab-org/telemetry/-/issues/266) using Snowplow Mini. - -## Snowplow Schemas - -### Default Schema - -| Field Name | Required | Type | Description | -|--------------------------|---------------------|-----------|----------------------------------------------------------------------------------------------------------------------------------| -| app_id | **{check-circle}** | string | Unique identifier for website / application | -| base_currency | **{dotted-circle}** | string | Reporting currency | -| br_colordepth | **{dotted-circle}** | integer | Browser color depth | -| br_cookies | **{dotted-circle}** | boolean | Does the browser permit cookies?
| -| br_family | **{dotted-circle}** | string | Browser family | -| br_features_director | **{dotted-circle}** | boolean | Director plugin installed? | -| br_features_flash | **{dotted-circle}** | boolean | Flash plugin installed? | -| br_features_gears | **{dotted-circle}** | boolean | Google gears installed? | -| br_features_java | **{dotted-circle}** | boolean | Java plugin installed? | -| br_features_pdf | **{dotted-circle}** | boolean | Adobe PDF plugin installed? | -| br_features_quicktime | **{dotted-circle}** | boolean | Quicktime plugin installed? | -| br_features_realplayer | **{dotted-circle}** | boolean | Realplayer plugin installed? | -| br_features_silverlight | **{dotted-circle}** | boolean | Silverlight plugin installed? | -| br_features_windowsmedia | **{dotted-circle}** | boolean | Windows media plugin installed? | -| br_lang | **{dotted-circle}** | string | Language the browser is set to | -| br_name | **{dotted-circle}** | string | Browser name | -| br_renderengine | **{dotted-circle}** | string | Browser rendering engine | -| br_type | **{dotted-circle}** | string | Browser type | -| br_version | **{dotted-circle}** | string | Browser version | -| br_viewheight | **{dotted-circle}** | string | Browser viewport height | -| br_viewwidth | **{dotted-circle}** | string | Browser viewport width | -| collector_tstamp | **{dotted-circle}** | timestamp | Time stamp for the event recorded by the collector | -| contexts | **{dotted-circle}** | | | -| derived_contexts | **{dotted-circle}** | | Contexts derived in the Enrich process | -| derived_tstamp | **{dotted-circle}** | timestamp | Timestamp making allowance for inaccurate device clock | -| doc_charset | **{dotted-circle}** | string | Web page’s character encoding | -| doc_height | **{dotted-circle}** | string | Web page height | -| doc_width | **{dotted-circle}** | string | Web page width | -| domain_sessionid | **{dotted-circle}** | string | Unique identifier (UUID) for this visit of this user_id
to this domain | -| domain_sessionidx | **{dotted-circle}** | integer | Index of number of visits that this user_id has made to this domain (The first visit is `1`) | -| domain_userid | **{dotted-circle}** | string | Unique identifier for a user, based on a first party cookie (so domain specific) | -| dvce_created_tstamp | **{dotted-circle}** | timestamp | Timestamp when event occurred, as recorded by client device | -| dvce_ismobile | **{dotted-circle}** | boolean | Indicates whether device is mobile | -| dvce_screenheight | **{dotted-circle}** | string | Screen / monitor resolution | -| dvce_screenwidth | **{dotted-circle}** | string | Screen / monitor resolution | -| dvce_sent_tstamp | **{dotted-circle}** | timestamp | Timestamp when event was sent by client device to collector | -| dvce_type | **{dotted-circle}** | string | Type of device | -| etl_tags | **{dotted-circle}** | string | JSON of tags for this ETL run | -| etl_tstamp | **{dotted-circle}** | timestamp | Timestamp event began ETL | -| event | **{dotted-circle}** | string | Event type | -| event_fingerprint | **{dotted-circle}** | string | Hash client-set event fields | -| event_format | **{dotted-circle}** | string | Format for event | -| event_id | **{dotted-circle}** | string | Event UUID | -| event_name | **{dotted-circle}** | string | Event name | -| event_vendor | **{dotted-circle}** | string | The company who developed the event model | -| event_version | **{dotted-circle}** | string | Version of event schema | -| geo_city | **{dotted-circle}** | string | City of IP origin | -| geo_country | **{dotted-circle}** | string | Country of IP origin | -| geo_latitude | **{dotted-circle}** | string | An approximate latitude | -| geo_longitude | **{dotted-circle}** | string | An approximate longitude | -| geo_region | **{dotted-circle}** | string | Region of IP origin | -| geo_region_name | **{dotted-circle}** | string | Region of IP origin | -| geo_timezone | **{dotted-circle}** | string | Timezone of 
IP origin | -| geo_zipcode | **{dotted-circle}** | string | Zip (postal) code of IP origin | -| ip_domain | **{dotted-circle}** | string | Second level domain name associated with the visitor’s IP address | -| ip_isp | **{dotted-circle}** | string | Visitor’s ISP | -| ip_netspeed | **{dotted-circle}** | string | Visitor’s connection type | -| ip_organization | **{dotted-circle}** | string | Organization associated with the visitor’s IP address – defaults to ISP name if none is found | -| mkt_campaign | **{dotted-circle}** | string | The campaign ID | -| mkt_clickid | **{dotted-circle}** | string | The click ID | -| mkt_content | **{dotted-circle}** | string | The content or ID of the ad. | -| mkt_medium | **{dotted-circle}** | string | Type of traffic source | -| mkt_network | **{dotted-circle}** | string | The ad network to which the click ID belongs | -| mkt_source | **{dotted-circle}** | string | The company / website where the traffic came from | -| mkt_term | **{dotted-circle}** | string | Keywords associated with the referrer | -| name_tracker | **{dotted-circle}** | string | The tracker namespace | -| network_userid | **{dotted-circle}** | string | Unique identifier for a user, based on a cookie from the collector (so set at a network level and shouldn’t be set by a tracker) | -| os_family | **{dotted-circle}** | string | Operating system family | -| os_manufacturer | **{dotted-circle}** | string | Manufacturers of operating system | -| os_name | **{dotted-circle}** | string | Name of operating system | -| os_timezone | **{dotted-circle}** | string | Client operating system timezone | -| page_referrer | **{dotted-circle}** | string | Referrer URL | -| page_title | **{dotted-circle}** | string | Page title | -| page_url | **{dotted-circle}** | string | Page URL | -| page_urlfragment | **{dotted-circle}** | string | Fragment aka anchor | -| page_urlhost | **{dotted-circle}** | string | Host aka domain | -| page_urlpath | **{dotted-circle}** | string | Path to 
page | -| page_urlport | **{dotted-circle}** | integer | Port if specified, 80 if not | -| page_urlquery | **{dotted-circle}** | string | Query string | -| page_urlscheme | **{dotted-circle}** | string | Scheme (protocol name) | -| platform | **{dotted-circle}** | string | The platform the app runs on | -| pp_xoffset_max | **{dotted-circle}** | integer | Maximum page x offset seen in the last ping period | -| pp_xoffset_min | **{dotted-circle}** | integer | Minimum page x offset seen in the last ping period | -| pp_yoffset_max | **{dotted-circle}** | integer | Maximum page y offset seen in the last ping period | -| pp_yoffset_min | **{dotted-circle}** | integer | Minimum page y offset seen in the last ping period | -| refr_domain_userid | **{dotted-circle}** | string | The Snowplow domain_userid of the referring website | -| refr_dvce_tstamp | **{dotted-circle}** | timestamp | The time of attaching the domain_userid to the inbound link | -| refr_medium | **{dotted-circle}** | string | Type of referer | -| refr_source | **{dotted-circle}** | string | Name of referer if recognised | -| refr_term | **{dotted-circle}** | string | Keywords if source is a search engine | -| refr_urlfragment | **{dotted-circle}** | string | Referer URL fragment | -| refr_urlhost | **{dotted-circle}** | string | Referer host | -| refr_urlpath | **{dotted-circle}** | string | Referer page path | -| refr_urlport | **{dotted-circle}** | integer | Referer port | -| refr_urlquery | **{dotted-circle}** | string | Referer URL querystring | -| refr_urlscheme | **{dotted-circle}** | string | Referer scheme | -| se_action | **{dotted-circle}** | string | The action / event itself | -| se_category | **{dotted-circle}** | string | The category of event | -| se_label | **{dotted-circle}** | string | A label often used to refer to the ‘object’ the action is performed on | -| se_property | **{dotted-circle}** | string | A property associated with either the action or the object | -| se_value | 
**{dotted-circle}** | decimal | A value associated with the user action | -| ti_category | **{dotted-circle}** | string | Item category | -| ti_currency | **{dotted-circle}** | string | Currency | -| ti_name | **{dotted-circle}** | string | Item name | -| ti_orderid | **{dotted-circle}** | string | Order ID | -| ti_price | **{dotted-circle}** | decimal | Item price | -| ti_price_base | **{dotted-circle}** | decimal | Item price in base currency | -| ti_quantity | **{dotted-circle}** | integer | Item quantity | -| ti_sku | **{dotted-circle}** | string | Item SKU | -| tr_affiliation | **{dotted-circle}** | string | Transaction affiliation (such as channel) | -| tr_city | **{dotted-circle}** | string | Delivery address: city | -| tr_country | **{dotted-circle}** | string | Delivery address: country | -| tr_currency | **{dotted-circle}** | string | Transaction Currency | -| tr_orderid | **{dotted-circle}** | string | Order ID | -| tr_shipping | **{dotted-circle}** | decimal | Delivery cost charged | -| tr_shipping_base | **{dotted-circle}** | decimal | Shipping cost in base currency | -| tr_state | **{dotted-circle}** | string | Delivery address: state | -| tr_tax | **{dotted-circle}** | decimal | Transaction tax value (such as amount of VAT included) | -| tr_tax_base | **{dotted-circle}** | decimal | Tax applied in base currency | -| tr_total | **{dotted-circle}** | decimal | Transaction total value | -| tr_total_base | **{dotted-circle}** | decimal | Total amount of transaction in base currency | -| true_tstamp | **{dotted-circle}** | timestamp | User-set exact timestamp | -| txn_id | **{dotted-circle}** | string | Transaction ID | -| unstruct_event | **{dotted-circle}** | JSON | The properties of the event | -| uploaded_at | **{dotted-circle}** | | | -| user_fingerprint | **{dotted-circle}** | integer | User identifier based on (hopefully unique) browser features | -| user_id | **{dotted-circle}** | string | Unique identifier for user, set by the business using 
setUserId | -| user_ipaddress | **{dotted-circle}** | string | IP address | -| useragent | **{dotted-circle}** | string | User agent (expressed as a browser string) | -| v_collector | **{dotted-circle}** | string | Collector version | -| v_etl | **{dotted-circle}** | string | ETL version | -| v_tracker | **{dotted-circle}** | string | Identifier for Snowplow tracker | +<!-- This redirect file can be deleted after February 1, 2021. --> +<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/product_analytics/usage_ping.md b/doc/development/product_analytics/usage_ping.md index 37363bbabbc..5fbdb508bb1 100644 --- a/doc/development/product_analytics/usage_ping.md +++ b/doc/development/product_analytics/usage_ping.md @@ -1,1059 +1,8 @@ --- -stage: Growth -group: Product Analytics -info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +redirect_to: '../usage_ping.md' --- -# Usage Ping Guide +This document was moved to [another location](../usage_ping.md). -> - Introduced in GitLab Enterprise Edition 8.10. -> - More statistics were added in GitLab Enterprise Edition 8.12. -> - Moved to GitLab Core in 9.1. -> - More statistics were added in GitLab Ultimate 11.2. - -This guide describes Usage Ping's purpose and how it's implemented. 
- -For more information about Product Analytics, see: - -- [Product Analytics Guide](https://about.gitlab.com/handbook/product/product-analytics-guide/) -- [Snowplow Guide](snowplow.md) - -More useful links: - -- [Product Analytics Direction](https://about.gitlab.com/direction/product-analytics/) -- [Data Analysis Process](https://about.gitlab.com/handbook/business-ops/data-team/#data-analysis-process/) -- [Data for Product Managers](https://about.gitlab.com/handbook/business-ops/data-team/programs/data-for-product-managers/) -- [Data Infrastructure](https://about.gitlab.com/handbook/business-ops/data-team/platform/infrastructure/) - -## What is Usage Ping? - -- GitLab sends a weekly payload containing usage data to GitLab Inc. Usage Ping provides high-level data to help our product, support, and sales teams. It does not send any project names, usernames, or any other specific data. The information from the usage ping is not anonymous, it is linked to the hostname of the instance. Sending usage ping is optional, and any instance can disable analytics. -- The usage data is primarily composed of row counts for different tables in the instance’s database. By comparing these counts month over month (or week over week), we can get a rough sense for how an instance is using the different features within the product. In addition to counts, other facts - that help us classify and understand GitLab installations are collected. -- Usage ping is important to GitLab as we use it to calculate our Stage Monthly Active Users (SMAU) which helps us measure the success of our stages and features. -- While usage ping is enabled, GitLab gathers data from the other instances and can show usage statistics of your instance to your users. - -### Why should we enable Usage Ping? - -- The main purpose of Usage Ping is to build a better GitLab. 
Data about how GitLab is used is collected to better understand feature and stage adoption and usage. This helps us understand how GitLab adds value and why people use it, so that we can make better product decisions. -- As a benefit of having the usage ping active, GitLab lets you analyze the users’ activities over time of your GitLab installation. -- As a benefit of having the usage ping active, GitLab provides you with the DevOps Report, which gives you an overview of your entire instance’s adoption of Concurrent DevOps from planning to monitoring. -- You get better, more proactive support (assuming that our TAMs and support organization use the data to deliver more value). -- You get insight and advice into how to get the most value out of your investment in GitLab. Wouldn't you want to know that a number of features or values are not being adopted in your organization? -- You get a report that illustrates how you compare against other similar organizations (anonymized), with specific advice and recommendations on how to improve your DevOps processes. -- Usage Ping is enabled by default. To disable it, see [Disable Usage Ping](#disable-usage-ping). - -### Limitations - -- Usage Ping does not track frontend events like page views, link clicks, or user sessions, and only focuses on aggregated backend events. -- Because of these limitations, we recommend instrumenting your products with Snowplow for more detailed analytics on GitLab.com, and using Usage Ping to track aggregated backend events on self-managed instances. - -## Usage Ping payload - -You can view the exact JSON payload sent to GitLab Inc. in the administration panel. To view the payload: - -1. Navigate to **Admin Area > Settings > Metrics and profiling**. -1. Expand the **Usage statistics** section. -1. Click the **Preview payload** button. - -For an example payload, see [Example Usage Ping payload](#example-usage-ping-payload).
- -## Disable Usage Ping - -To disable Usage Ping in the GitLab UI, go to the **Settings** page of your administration panel and uncheck the **Usage Ping** checkbox. - -To disable Usage Ping and prevent it from being configured in the future through the administration panel, Omnibus installs can set the following in [`gitlab.rb`](https://docs.gitlab.com/omnibus/settings/configuration.html#configuration-options): - -```ruby -gitlab_rails['usage_ping_enabled'] = false -``` - -Source installations can set the following in `gitlab.yml`: - -```yaml -production: &base - # ... - gitlab: - # ... - usage_ping_enabled: false -``` - -## Usage Ping request flow - -The following example shows a basic request/response flow between a GitLab instance, the Versions Application, the License Application, Salesforce, the GitLab S3 Bucket, the GitLab Snowflake Data Warehouse, and Sisense: - -```mermaid -sequenceDiagram - participant GitLab Instance - participant Versions Application - participant Licenses Application - participant Salesforce - participant S3 Bucket - participant Snowflake DW - participant Sisense Dashboards - GitLab Instance->>Versions Application: Send Usage Ping - loop Process usage data - Versions Application->>Versions Application: Parse usage data - Versions Application->>Versions Application: Write to database - Versions Application->>Versions Application: Update license ping time - end - loop Process data for Salesforce - Versions Application-xLicenses Application: Request Zuora subscription id - Licenses Application-xVersions Application: Zuora subscription id - Versions Application-xSalesforce: Request Zuora account id by Zuora subscription id - Salesforce-xVersions Application: Zuora account id - Versions Application-xSalesforce: Usage data for the Zuora account - end - Versions Application->>S3 Bucket: Export Versions database - S3 Bucket->>Snowflake DW: Import data - Snowflake DW->>Snowflake DW: Transform data using dbt - Snowflake DW->>Sisense Dashboards: 
Data available for querying - Versions Application->>GitLab Instance: DevOps Report (Conversational Development Index) -``` - -## How Usage Ping works - -1. The Usage Ping [cron job](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/gitlab_usage_ping_worker.rb#L30) is set in Sidekiq to run weekly. -1. When the cron job runs, it calls [`Gitlab::UsageData.to_json`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/submit_usage_ping_service.rb#L22). -1. `Gitlab::UsageData.to_json` [cascades down](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb#L22) to more than 400 other counter method calls. -1. The responses of all method calls are [merged together](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb#L14) into a single JSON payload in `Gitlab::UsageData.to_json`. -1. The JSON payload is then [posted to the Versions application](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/submit_usage_ping_service.rb#L20). - If a firewall exception is needed, the required URL depends on several things. If - the hostname is `version.gitlab.com`, the protocol is `TCP`, and the port number is `443`, - the required URL is <https://version.gitlab.com/>. - -## Implementing Usage Ping - -Usage Ping consists of two kinds of data: counters and observations. Counters track how often a certain event -happened over time, such as how many CI pipelines have run. They are monotonic and always trend up. -Observations are facts collected from one or more GitLab instances and can carry arbitrary data. There are no -general guidelines around how to collect those, due to the individual nature of that data.
- -There are several types of counters which are all found in `usage_data.rb`: - -- **Ordinary Batch Counters:** Simple count of a given ActiveRecord_Relation -- **Distinct Batch Counters:** Distinct count of a given ActiveRecord_Relation on given column -- **Sum Batch Counters:** Sum the values of a given ActiveRecord_Relation on given column -- **Alternative Counters:** Used for settings and configurations -- **Redis Counters:** Used for in-memory counts. - -NOTE: -Only use the provided counter methods. Each counter method contains a built in fail safe to isolate each counter to avoid breaking the entire Usage Ping. - -### Why batch counting - -For large tables, PostgreSQL can take a long time to count rows due to MVCC [(Multi-version Concurrency Control)](https://en.wikipedia.org/wiki/Multiversion_concurrency_control). Batch counting is a counting method where a single large query is broken into multiple smaller queries. For example, instead of a single query querying 1,000,000 records, with batch counting, you can execute 100 queries of 10,000 records each. Batch counting is useful for avoiding database timeouts as each batch query is significantly shorter than one single long running query. - -For GitLab.com, there are extremely large tables with 15 second query timeouts, so we use batch counting to avoid encountering timeouts. Here are the sizes of some GitLab.com tables: - -| Table | Row counts in millions | -|------------------------------|------------------------| -| `merge_request_diff_commits` | 2280 | -| `ci_build_trace_sections` | 1764 | -| `merge_request_diff_files` | 1082 | -| `events` | 514 | - -There are two batch counting methods provided, `Ordinary Batch Counters` and `Distinct Batch Counters`. Batch counting requires indexes on columns to calculate max, min, and range queries. In some cases, a specialized index may need to be added on the columns involved in a counter. 
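The batch counting idea described above can be sketched in plain Ruby. This is only an illustration under stated assumptions, not the `Gitlab::Database::BatchCounter` implementation: the `batch_count` helper and the in-memory `ids` array are hypothetical stand-ins for a table's primary keys, and each loop iteration represents what would be one short `COUNT` query over an id range in a real database.

```ruby
# Hypothetical helper: counts rows by walking the id space in fixed-size windows.
# `ids` stands in for a table's primary-key column (gaps are allowed).
def batch_count(ids, batch_size:)
  return 0 if ids.empty?

  start = ids.min   # in a real database: one cheap, index-backed MIN(id) query
  finish = ids.max  # in a real database: one cheap, index-backed MAX(id) query
  total = 0

  # Each iteration covers the half-open range [start, start + batch_size),
  # so no single "query" ever scans more than batch_size ids.
  while start <= finish
    upper = start + batch_size
    total += ids.count { |id| id >= start && id < upper }
    start = upper
  end

  total
end

ids = (1..25).to_a - [7, 8, 19] # 22 rows with gaps in the id sequence
puts batch_count(ids, batch_size: 10) # prints 22
```

The sketch also shows why indexes matter here: the minimum, maximum, and per-range lookups are only cheap when the counted column is indexed.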
- -### Ordinary Batch Counters - -Handles `ActiveRecord::StatementInvalid` error - -Performs a simple, non-distinct batch count of a given ActiveRecord_Relation, automatically reduces `batch_size`, and handles errors. - -Method: `count(relation, column = nil, batch: true, start: nil, finish: nil)` - -Arguments: - -- `relation` the ActiveRecord_Relation to perform the count -- `column` the column to perform the count on, by default is the primary key -- `batch`: default `true` in order to use batch counting -- `start`: custom start of the batch counting in order to avoid complex min calculations -- `finish`: custom end of the batch counting in order to avoid complex max calculations - -Examples: - -```ruby -count(User.active) -count(::Clusters::Cluster.aws_installed.enabled, :cluster_id) -count(::Clusters::Cluster.aws_installed.enabled, :cluster_id, start: ::Clusters::Cluster.minimum(:id), finish: ::Clusters::Cluster.maximum(:id)) -``` - -### Distinct Batch Counters - -Handles `ActiveRecord::StatementInvalid` error - -Performs a distinct batch count of a given ActiveRecord_Relation on a given column, automatically reduces `batch_size`, and handles errors.
- -Method: `distinct_count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)` - -Arguments: - -- `relation` the ActiveRecord_Relation to perform the count -- `column` the column to perform the distinct count, by default is the primary key -- `batch`: default `true` in order to use batch counting -- `batch_size`: if not set, uses the default value of 10000 from `Gitlab::Database::BatchCounter` -- `start`: custom start of the batch counting in order to avoid complex min calculations -- `finish`: custom end of the batch counting in order to avoid complex max calculations - -Examples: - -```ruby -distinct_count(::Project, :creator_id) -distinct_count(::Note.with_suggestions.where(time_period), :author_id, start: ::User.minimum(:id), finish: ::User.maximum(:id)) -distinct_count(::Clusters::Applications::CertManager.where(time_period).available.joins(:cluster), 'clusters.user_id') -``` - -### Sum Batch Counters - -Handles `ActiveRecord::StatementInvalid` error - -Sums the values of a given ActiveRecord_Relation on a given column and handles errors. - -Method: `sum(relation, column, batch_size: nil, start: nil, finish: nil)` - -Arguments: - -- `relation` the ActiveRecord_Relation to perform the operation -- `column` the column to sum on -- `batch_size`: if not set, uses the default value of 1000 from `Gitlab::Database::BatchCounter` -- `start`: custom start of the batch counting in order to avoid complex min calculations -- `finish`: custom end of the batch counting in order to avoid complex max calculations - -Examples: - -```ruby -sum(JiraImportState.finished, :imported_issues_count) -``` - -### Grouping & Batch Operations - -The `count`, `distinct_count`, and `sum` batch counters can accept an `ActiveRecord::Relation` -object, which groups by a specified column. With a grouped relation, the methods do batch counting, -handle errors, and return a hash table of key-value pairs.
- -Examples: - -```ruby -count(Namespace.group(:type)) -# returns => {nil=>179, "Group"=>54} - -distinct_count(Project.group(:visibility_level), :creator_id) -# returns => {0=>1, 10=>1, 20=>11} - -sum(Issue.group(:state_id), :weight) -# returns => {1=>3542, 2=>6820} -``` - -### Redis Counters - -Handles `::Redis::CommandError` and `Gitlab::UsageDataCounters::BaseCounter::UnknownEvent`. -Returns -1 when a block is sent, or a hash with all values set to -1 when a counter from `Gitlab::UsageDataCounters` is sent. -The behavior differs because there are two different implementations of the Redis counter. - -Method: `redis_usage_data(counter, &block)` - -Arguments: - -- `counter`: a counter from `Gitlab::UsageDataCounters`, that has `fallback_totals` method implemented -- or a `block`: which is evaluated - -#### Ordinary Redis Counters - -Examples of implementation: - -- Using Redis methods [`INCR`](https://redis.io/commands/incr), [`GET`](https://redis.io/commands/get), and [`Gitlab::UsageDataCounters::WikiPageCounter`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/wiki_page_counter.rb) -- Using Redis methods [`HINCRBY`](https://redis.io/commands/hincrby), [`HGETALL`](https://redis.io/commands/hgetall), and [`Gitlab::UsageCounters::PodLogs`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_counters/pod_logs.rb) - -##### UsageData API Tracking - -<!-- There's nearly identical content in `##### Adding new events`. If you fix errors here, you may need to fix the same errors in the other location. --> - -1. Track event using `UsageData` API - - Increment event count using ordinary Redis counter, for given event name. - - Tracking events using the `UsageData` API requires the `usage_data_api` feature flag to be enabled, which is enabled by default. - - API requests are protected by checking for a valid CSRF token. - - In order to be able to increment the values the related feature `usage_data_<event_name>` should be enabled.
- - ```plaintext - POST /usage_data/increment_counter - ``` - - | Attribute | Type | Required | Description | - | :-------- | :--- | :------- | :---------- | - | `event` | string | yes | The name of the event to track | - - Response - - - `200` if event was tracked - - `400 Bad request` if event parameter is missing - - `401 Unauthorized` if user is not authenticated - - `403 Forbidden` if an invalid CSRF token is provided - -1. Track events using JavaScript/Vue API helper which calls the API above - - Note that `usage_data_api` and `usage_data_#{event_name}` should be enabled in order to be able to track events - - ```javascript - import api from '~/api'; - - api.trackRedisCounterEvent('my_already_defined_event_name'); - ``` - -#### Redis HLL Counters - -`Gitlab::UsageDataCounters::HLLRedisCounter` provides data structures used to count unique values. - -Implemented using Redis methods [PFADD](https://redis.io/commands/pfadd) and [PFCOUNT](https://redis.io/commands/pfcount). - -##### Adding new events - -1. Define events in [`known_events`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/). - - Example event: - - ```yaml - - name: i_compliance_credential_inventory - category: compliance - redis_slot: compliance - expiry: 42 # 6 weeks - aggregation: weekly - ``` - - Keys: - - - `name`: unique event name. - - Name format `<prefix>_<redis_slot>_name`. - - Use one of the following prefixes for the event's name: - - - `g_` for group, as an event which is tracked for group. - - `p_` for project, as an event which is tracked for project. - - `i_` for instance, as an event which is tracked for instance. - - `a_` for events encompassing all `g_`, `p_`, `i_`. - - `o_` for other. - - Consider including in the event's name the Redis slot in order to be able to count totals for a specific category. - - Example names: `i_compliance_credential_inventory`, `g_analytics_contribution`. - - - `category`: event category.
Used for getting total counts for events in a category, for easier - access to a group of events. - - `redis_slot`: optional Redis slot; default value: event name. Used if needed to calculate totals - for a group of metrics. Ensure keys are in the same slot. For example: - `i_compliance_credential_inventory` with `redis_slot: 'compliance'` builds Redis key - `i_{compliance}_credential_inventory-2020-34`. If `redis_slot` is not defined the Redis key will - be `{i_compliance_credential_inventory}-2020-34`. - - `expiry`: expiry time in days. Default: 29 days for daily aggregation and 6 weeks for weekly - aggregation. - - `aggregation`: may be set to a `:daily` or `:weekly` key. Defines how counting data is stored in Redis. - Aggregation on a `daily` basis does not pull more fine grained data. - - `feature_flag`: optional. For details, see our [GitLab internal Feature flags](../feature_flags/) documentation. - -1. Track event in controller using `RedisTracking` module with `track_redis_hll_event(*controller_actions, name:, feature:, feature_default_enabled: false)`. - - Arguments: - - - `controller_actions`: controller actions we want to track. - - `name`: event name. - - `feature`: feature name, all metrics we track should be under feature flag. - - `feature_default_enabled`: feature flag is disabled by default, set to `true` for it to be enabled by default. - - Example usage: - - ```ruby - # controller - class ProjectsController < Projects::ApplicationController - include RedisTracking - - skip_before_action :authenticate_user!, only: :show - track_redis_hll_event :index, :show, name: 'g_compliance_example_feature_visitors', feature: :compliance_example_feature, feature_default_enabled: true - - def index - render html: 'index' - end - - def new - render html: 'new' - end - - def show - render html: 'show' - end - end - ``` - -1. Track event in API using `increment_unique_values(event_name, values)` helper method. 
- - In order to be able to track the event, Usage Ping must be enabled and the event feature `usage_data_<event_name>` must be enabled. - - Arguments: - - - `event_name`: event name. - - `values`: values counted, one value or array of values. - - Example usage: - - ```ruby - get ':id/registry/repositories' do - repositories = ContainerRepositoriesFinder.new( - user: current_user, subject: user_group - ).execute - - increment_unique_values('i_list_repositories', current_user.id) - - present paginate(repositories), with: Entities::ContainerRegistry::Repository, tags: params[:tags], tags_count: params[:tags_count] - end - ``` - -1. Track event using `track_usage_event(event_name, values) in services and graphql - - Increment unique values count using Redis HLL, for given event name. - - Example: - - [Track usage event for incident created in service](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/issues/update_service.rb) - - [Track usage event for incident created in graphql](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/graphql/mutations/alert_management/update_alert_status.rb) - - ```ruby - track_usage_event(:incident_management_incident_created, current_user.id) - ``` - -<!-- There's nearly identical content in `##### UsageData API Tracking`. If you find / fix errors here, you may need to fix errors in that section too. --> - -1. Track event using `UsageData` API - - Increment unique users count using Redis HLL, for given event name. - - Tracking events using the `UsageData` API requires the `usage_data_api` feature flag to be enabled, which is enabled by default. - - API requests are protected by checking for a valid CSRF token. - - In order to increment the values, the related feature `usage_data_<event_name>` should be - set to `default_enabled: true`. For more information, see - [Feature flags in development of GitLab](../feature_flags/index.md). 
- - ```plaintext - POST /usage_data/increment_unique_users - ``` - - | Attribute | Type | Required | Description | - | :-------- | :--- | :------- | :---------- | - | `event` | string | yes | The event name it should be tracked | - - Response - - Return 200 if tracking failed for any reason. - - - `200` if event was tracked or any errors - - `400 Bad request` if event parameter is missing - - `401 Unauthorized` if user is not authenticated - - `403 Forbidden` for invalid CSRF token provided - -1. Track events using JavaScript/Vue API helper which calls the API above - - Example usage for an existing event already defined in [known events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/): - - Usage Data API is behind `usage_data_api` feature flag which, as of GitLab 13.7, is - now set to `default_enabled: true`. - - Each event tracked using Usage Data API is behind a feature flag `usage_data_#{event_name}` which should be `default_enabled: true` - - ```javascript - import api from '~/api'; - - api.trackRedisHllUserEvent('my_already_defined_event_name'), - ``` - -1. Track event using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event(values, event_name)`. - - Arguments: - - - `values`: One value or array of values we count. For example: user_id, visitor_id, user_ids. - - `event_name`: event name. - -1. Track event on context level using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event_in_context(entity_id, event_name, context)`. - - Arguments: - - - `entity_id`: value we count. For example: user_id, visitor_id. - - `event_name`: event name. - - `context`: context value. Allowed values are `default`, `free`, `bronze`, `silver`, `gold`, `starter`, `premium`, `ultimate` - -1. Get event data using `Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names:, start_date:, end_date:, context: '')`. - - Arguments: - - - `event_names`: the list of event names. 
- - `start_date`: start date of the period for which we want to get event data. - - `end_date`: end date of the period for which we want to get event data. - - `context`: context of the event. Allowed values are `default`, `free`, `bronze`, `silver`, `gold`, `starter`, `premium`, `ultimate`. - -1. Testing tracking and getting unique events - -Trigger events in rails console by using `track_event` method - - ```ruby - Gitlab::UsageDataCounters::HLLRedisCounter.track_event(1, 'g_compliance_audit_events') - Gitlab::UsageDataCounters::HLLRedisCounter.track_event(2, 'g_compliance_audit_events') - ``` - -Next, get the unique events for the current week. - - ```ruby - # Get unique events for metric for current_week - Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_audit_events', - start_date: Date.current.beginning_of_week, end_date: Date.current.end_of_week) - ``` - -##### Recommendations - -We have the following recommendations for [Adding new events](#adding-new-events): - -- Event aggregation: weekly. -- Key expiry time: - - Daily: 29 days. - - Weekly: 42 days. -- When adding new metrics, use a [feature flag](../../operations/feature_flags.md) to control the impact. -- For feature flags triggered by another service, set `default_enabled: false`, - - Events can be triggered using the `UsageData` API, which helps when there are > 10 events per change - -##### Enable/Disable Redis HLL tracking - -Events are tracked behind [feature flags](../feature_flags/index.md) due to concerns for Redis performance and scalability. - -For a full list of events and corresponding feature flags see, [known_events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/) files. - -To enable or disable tracking for specific event within <https://gitlab.com> or <https://about.staging.gitlab.com>, run commands such as the following to -[enable or disable the corresponding feature](../feature_flags/index.md). 
-
-```shell
-/chatops run feature set <feature_name> true
-/chatops run feature set <feature_name> false
-```
-
-##### Known events in usage data payload
-
-All events added in [`known_events/common.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/common.yml) are automatically added to usage data generation under the `redis_hll_counters` key. This column is stored in [version-app as a JSON](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L209).
-For each event we add metrics for the weekly and monthly time frames, and totals for each where applicable:
-
-- `#{event_name}_weekly`: Data for 7 days for daily [aggregation](#adding-new-events) events and data for the last complete week for weekly [aggregation](#adding-new-events) events.
-- `#{event_name}_monthly`: Data for 28 days for daily [aggregation](#adding-new-events) events and data for the last 4 complete weeks for weekly [aggregation](#adding-new-events) events.
-- `#{category}_total_unique_counts_weekly`: Total unique counts for events in the same category for the last 7 days or the last complete week, if events are in the same Redis slot and we have more than one metric.
-- `#{category}_total_unique_counts_monthly`: Total unique counts for events in the same category for the last 28 days or the last 4 complete weeks, if events are in the same Redis slot and we have more than one metric.
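As a rough illustration of the naming rules in the list above, the following plain-Ruby sketch builds the key names for one category. The helper `hll_payload_keys` is hypothetical and not part of GitLab. It also simplifies the totals rule to "more than one metric" and ignores the Redis slot condition.

```ruby
# Hypothetical helper (not GitLab code): builds the payload key names
# described above for a single category and its events.
def hll_payload_keys(category, event_names)
  # Every event gets a weekly and a monthly key.
  keys = event_names.flat_map { |name| ["#{name}_weekly", "#{name}_monthly"] }

  # Simplification: totals are added only when the category has more than
  # one metric; the "same Redis slot" condition is not modelled here.
  if event_names.size > 1
    keys << "#{category}_total_unique_counts_weekly"
    keys << "#{category}_total_unique_counts_monthly"
  end

  keys
end

puts hll_payload_keys('compliance', %w[g_compliance_dashboard g_compliance_audit_events])
```

For a category with a single event, such as `source_code` with `wiki_action`, the sketch yields only the two per-event keys.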
- -Example of `redis_hll_counters` data: - -```ruby -{:redis_hll_counters=> - {"compliance"=> - {"g_compliance_dashboard_weekly"=>0, - "g_compliance_dashboard_monthly"=>0, - "g_compliance_audit_events_weekly"=>0, - "g_compliance_audit_events_monthly"=>0, - "compliance_total_unique_counts_weekly"=>0, - "compliance_total_unique_counts_monthly"=>0}, - "analytics"=> - {"g_analytics_contribution_weekly"=>0, - "g_analytics_contribution_monthly"=>0, - "g_analytics_insights_weekly"=>0, - "g_analytics_insights_monthly"=>0, - "analytics_total_unique_counts_weekly"=>0, - "analytics_total_unique_counts_monthly"=>0}, - "ide_edit"=> - {"g_edit_by_web_ide_weekly"=>0, - "g_edit_by_web_ide_monthly"=>0, - "g_edit_by_sfe_weekly"=>0, - "g_edit_by_sfe_monthly"=>0, - "ide_edit_total_unique_counts_weekly"=>0, - "ide_edit_total_unique_counts_monthly"=>0}, - "search"=> - {"i_search_total_weekly"=>0, "i_search_total_monthly"=>0, "i_search_advanced_weekly"=>0, "i_search_advanced_monthly"=>0, "i_search_paid_weekly"=>0, "i_search_paid_monthly"=>0, "search_total_unique_counts_weekly"=>0, "search_total_unique_counts_monthly"=>0}, - "source_code"=>{"wiki_action_weekly"=>0, "wiki_action_monthly"=>0} - } -``` - -Example usage: - -```ruby -# Redis Counters -redis_usage_data(Gitlab::UsageDataCounters::WikiPageCounter) -redis_usage_data { ::Gitlab::UsageCounters::PodLogs.usage_totals[:total] } - -# Define events in common.yml https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/common.yml - -# Tracking events -Gitlab::UsageDataCounters::HLLRedisCounter.track_event(visitor_id, 'expand_vulnerabilities') - -# Get unique events for metric -redis_usage_data { Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'expand_vulnerabilities', start_date: 28.days.ago, end_date: Date.current) } -``` - -### Alternative Counters - -Handles `StandardError` and fallbacks into -1 this way not all measures fail if we encounter one exception. 
-Mainly used for settings and configurations. - -Method: `alt_usage_data(value = nil, fallback: -1, &block)` - -Arguments: - -- `value`: a simple static value in which case the value is simply returned. -- or a `block`: which is evaluated -- `fallback: -1`: the common value used for any metrics that are failing. - -Example of usage: - -```ruby -alt_usage_data { Gitlab::VERSION } -alt_usage_data { Gitlab::CurrentSettings.uuid } -alt_usage_data(999) -``` - -### Prometheus Queries - -In those cases where operational metrics should be part of Usage Ping, a database or Redis query is unlikely -to provide useful data. Instead, Prometheus might be more appropriate, since most GitLab architectural -components publish metrics to it that can be queried back, aggregated, and included as usage data. - -NOTE: -Prometheus as a data source for Usage Ping is currently only available for single-node Omnibus installations -that are running the [bundled Prometheus](../../administration/monitoring/prometheus/index.md) instance. - -To query Prometheus for metrics, a helper method is available to `yield` a fully configured -`PrometheusClient`, given it is available as per the note above: - -```ruby -with_prometheus_client do |client| - response = client.query('<your query>') - ... -end -``` - -Please refer to [the `PrometheusClient` definition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/prometheus_client.rb) -for how to use its API to query for data. - -## Developing and testing Usage Ping - -### 1. Naming and placing the metrics - -Add the metric in one of the top level keys - -- `license`: for license related metrics. -- `settings`: for settings related metrics. -- `counts_weekly`: for counters that have data for the most recent 7 days. -- `counts_monthly`: for counters that have data for the most recent 28 days. -- `counts`: for counters that have data for all time. - -### 2. 
Use your Rails console to manually test counters - -```ruby -# count -Gitlab::UsageData.count(User.active) -Gitlab::UsageData.count(::Clusters::Cluster.aws_installed.enabled, :cluster_id) - -# count distinct -Gitlab::UsageData.distinct_count(::Project, :creator_id) -Gitlab::UsageData.distinct_count(::Note.with_suggestions.where(time_period), :author_id, start: ::User.minimum(:id), finish: ::User.maximum(:id)) -``` - -### 3. Generate the SQL query - -Your Rails console returns the generated SQL queries. - -Example: - -```ruby -pry(main)> Gitlab::UsageData.count(User.active) - (2.6ms) SELECT "features"."key" FROM "features" - (15.3ms) SELECT MIN("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) - (2.4ms) SELECT MAX("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) - (1.9ms) SELECT COUNT("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) AND "users"."id" BETWEEN 1 AND 100000 -``` - -### 4. Optimize queries with #database-lab - -Paste the SQL query into `#database-lab` to see how the query performs at scale. - -- `#database-lab` is a Slack channel which uses a production-sized environment to test your queries. -- GitLab.com’s production database has a 15 second timeout. -- Any single query must stay below [1 second execution time](../query_performance.md#timing-guidelines-for-queries) with cold caches. -- Add a specialized index on columns involved to reduce the execution time. 
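The SQL shown in step 3 runs a `MIN`, a `MAX`, and then a `COUNT` restricted to an ID range. The plain-Ruby sketch below illustrates that batching idea with an in-memory array standing in for the table. The name `batch_count` and its `batch_size:` default are assumptions made for illustration, not the GitLab implementation.

```ruby
# Hypothetical sketch (not GitLab code): counts rows in fixed-size ID ranges,
# the way the generated SQL counts "id BETWEEN 1 AND 100000".
def batch_count(ids, batch_size: 100_000)
  return 0 if ids.empty?

  # Equivalent of the MIN and MAX queries.
  start = ids.min
  finish = ids.max

  total = 0
  lower = start
  while lower <= finish
    upper = lower + batch_size - 1
    # Equivalent of one COUNT ... BETWEEN lower AND upper query.
    total += ids.count { |id| id.between?(lower, upper) }
    lower = upper + 1
  end

  total
end

puts batch_count([1, 5, 100_000, 100_001, 250_000])
```

Each batch touches a bounded ID range, which is why a smaller `batch_size` or a specialized index can keep every individual query short.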
- -In order to have an understanding of the query's execution we add in the MR description the following information: - -- For counters that have a `time_period` test we add information for both cases: - - `time_period = {}` for all time periods - - `time_period = { created_at: 28.days.ago..Time.current }` for last 28 days period -- Execution plan and query time before and after optimization -- Query generated for the index and time -- Migration output for up and down execution - -We also use `#database-lab` and [explain.depesz.com](https://explain.depesz.com/). For more details, see the [database review guide](../database_review.md#preparation-when-adding-or-modifying-queries). - -#### Optimization recommendations and examples - -- Use specialized indexes [example 1](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26871), [example 2](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26445). -- Use defined `start` and `finish`, and simple queries, because these values can be memoized and reused, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37155). -- Avoid joins and write the queries as simply as possible, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/36316). -- Set a custom `batch_size` for `distinct_count`, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/38000). - -### 5. Add the metric definition - -When adding, changing, or updating metrics, please update the [Event Dictionary's **Usage Ping** table](https://about.gitlab.com/handbook/product/product-analytics-guide/#event-dictionary). - -### 6. Add new metric to Versions Application - -Check if new metrics need to be added to the Versions Application. See `usage_data` [schema](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L147) and usage data [parameters accepted](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/app/services/usage_ping.rb). 
Any metrics added under the `counts` key are saved in the `stats` column.
-
-### 7. Add the feature label
-
-Add the `feature` label to the Merge Request for new Usage Ping metrics. These are user-facing changes and are part of expanding the Usage Ping feature.
-
-### 8. Add a changelog file
-
-Ensure you comply with the [Changelog entries guide](../changelog.md).
-
-### 9. Ask for a Product Analytics Review
-
-On GitLab.com, we have DangerBot set up to monitor Product Analytics related files, and DangerBot recommends a Product Analytics review. Mention `@gitlab-org/growth/product_analytics/engineers` in your MR for a review.
-
-### 10. Verify your metric
-
-On GitLab.com, the Product Analytics team regularly monitors Usage Ping. They may alert you that your metrics need further optimization to run faster and with greater success. You may also use the [Usage Ping QA dashboard](https://app.periscopedata.com/app/gitlab/632033/Usage-Ping-QA) to check how well your metric performs. The dashboard allows filtering by GitLab version and by "Self-managed" and "SaaS", and shows you how many failures have occurred for each metric. Whenever you notice a high failure rate, you may re-optimize your metric.
-
-### Optional: Test Prometheus based Usage Ping
-
-If the data submitted includes metrics [queried from Prometheus](#prometheus-queries) that you would like to inspect and verify,
-then you need to ensure that a Prometheus server is running locally, and that furthermore the respective GitLab components
-are exporting metrics to it. If you do not need to test data coming from Prometheus, no further action
-is necessary, since Usage Ping should degrade gracefully in the absence of a running Prometheus server.
- -There are currently three kinds of components that may export data to Prometheus, and which are included in Usage Ping: - -- [`node_exporter`](https://github.com/prometheus/node_exporter) - Exports node metrics from the host machine -- [`gitlab-exporter`](https://gitlab.com/gitlab-org/gitlab-exporter) - Exports process metrics from various GitLab components -- various GitLab services such as Sidekiq and the Rails server that export their own metrics - -#### Test with an Omnibus container - -This is the recommended approach to test Prometheus based Usage Ping. - -The easiest way to verify your changes is to build a new Omnibus image from your code branch via CI, then download the image -and run a local container instance: - -1. From your merge request, click on the `qa` stage, then trigger the `package-and-qa` job. This job triggers an Omnibus -build in a [downstream pipeline of the `omnibus-gitlab-mirror` project](https://gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/-/pipelines). -1. In the downstream pipeline, wait for the `gitlab-docker` job to finish. -1. Open the job logs and locate the full container name including the version. It takes the following form: `registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:<VERSION>`. -1. On your local machine, make sure you are logged in to the GitLab Docker registry. You can find the instructions for this in -[Authenticate to the GitLab Container Registry](../../user/packages/container_registry/index.md#authenticate-with-the-container-registry). -1. Once logged in, download the new image via `docker pull registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:<VERSION>` -1. For more information about working with and running Omnibus GitLab containers in Docker, please refer to [GitLab Docker images](https://docs.gitlab.com/omnibus/docker/README.html) in the Omnibus documentation. 
-
-#### Test with GitLab development toolkits
-
-This is the less recommended approach, since it comes with a number of difficulties when emulating a real GitLab deployment.
-
-The [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit) is not currently set up to run a Prometheus server or `node_exporter` alongside other GitLab components. If you would
-like to do so, [Monitoring the GDK with Prometheus](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/master/doc/howto/prometheus/index.md#monitoring-the-gdk-with-prometheus) is a good start.
-
-The [GCK](https://gitlab.com/gitlab-org/gitlab-compose-kit) has limited support for testing Prometheus based Usage Ping.
-By default, it already comes with a fully configured Prometheus service that is set up to scrape a number of components,
-but with the following limitations:
-
-- It does not currently run a `gitlab-exporter` instance, so several `process_*` metrics from services such as Gitaly may be missing.
-- While it runs a `node_exporter`, `docker-compose` services emulate hosts, meaning that it would normally report itself as not being associated
-with any of the other services that are running. That is not how node metrics are reported in a production setup, where `node_exporter`
-always runs as a process alongside other GitLab components on any given node. From Usage Ping's perspective none of the node data would therefore
-appear to be associated with any of the services running, since they all appear to be running on different hosts. To alleviate this problem, the `node_exporter` in GCK was arbitrarily "assigned" to the `web` service, meaning that `node_*` metrics appear in Usage Ping only for this service.
-
-## Aggregated metrics
-
-> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/45979) in GitLab 13.6.
-> - It's [deployed behind a feature flag](../../user/feature_flags.md), disabled by default.
-> - It's enabled on GitLab.com.
-
-WARNING:
-This feature is intended solely for internal GitLab use.
-
-To add aggregated metrics to the Usage Ping payload, add a corresponding definition to the [`aggregated_metrics.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics.yml) file. Each aggregate definition includes the following parts:
-
-- name: unique name under which the aggregate metric is added to the Usage Ping payload.
-- operator: operator that defines how the aggregated metric data is counted. Available operators are:
-  - `OR`: removes duplicates and counts all entries that triggered any of the listed events.
-  - `AND`: removes duplicates and counts all entries that were observed triggering all of the listed events.
-- events: list of event names (from [`known_events.yml`](#known-events-in-usage-data-payload)) to aggregate into the metric. All events in this list must have the same `redis_slot` and `aggregation` attributes.
-- feature_flag: name of the [development feature flag](../feature_flags/development.md#development-type) that is checked before
-metrics aggregation is performed. The corresponding feature flag should have its `default_enabled` attribute set to `false`.
-The `feature_flag` attribute is **OPTIONAL** and can be omitted. When `feature_flag` is missing, no feature flag is checked.
-
-Example aggregated metric entries:
-
-```yaml
-- name: product_analytics_test_metrics_union
-  operator: OR
-  events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
-- name: product_analytics_test_metrics_intersection_with_feautre_flag
-  operator: AND
-  events: ['i_search_total', 'i_search_advanced', 'i_search_paid']
-  feature_flag: example_aggregated_metric
-```
-
-Aggregated metrics are added under the `aggregated_metrics` key in both the `counts_weekly` and `counts_monthly` top-level keys of the Usage Ping payload.
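The `OR` and `AND` operators described above can be pictured as a set union and a set intersection over the values recorded for each event. The sketch below uses exact in-memory sets and made-up user IDs. The real counters are HyperLogLog based, so their results are estimates, and the `aggregate` helper is hypothetical rather than part of GitLab.

```ruby
require 'set'

# Hypothetical helper (not GitLab code): counts unique values across events.
# `sets` is an array of Set objects, one per event.
def aggregate(operator, sets)
  case operator
  when 'OR'  then sets.reduce(:|).size # values seen for any of the events
  when 'AND' then sets.reduce(:&).size # values seen for all of the events
  else raise ArgumentError, "unknown operator: #{operator}"
  end
end

# Made-up user IDs per event, for illustration only.
recorded = {
  'i_search_total'    => Set[1, 2, 3, 4],
  'i_search_advanced' => Set[2, 3, 5],
  'i_search_paid'     => Set[3, 4, 5]
}

puts aggregate('OR', recorded.values)
puts aggregate('AND', recorded.values)
```

With these sample sets the union covers five distinct users and the intersection only one. Counts computed along these lines correspond to the per-aggregate values in the payload example that follows.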
- -```ruby -{ - :counts_monthly => { - :deployments => 1003, - :successful_deployments => 78, - :failed_deployments => 275, - :packages => 155, - :personal_snippets => 2106, - :project_snippets => 407, - :promoted_issues => 719, - :aggregated_metrics => { - :product_analytics_test_metrics_union => 7, - :product_analytics_test_metrics_intersection_with_feautre_flag => 2 - }, - :snippets => 2513 - } -} -``` - -## Example Usage Ping payload - -The following is example content of the Usage Ping payload. - -```json -{ - "uuid": "0000000-0000-0000-0000-000000000000", - "hostname": "example.com", - "version": "12.10.0-pre", - "installation_type": "omnibus-gitlab", - "active_user_count": 999, - "recorded_at": "2020-04-17T07:43:54.162+00:00", - "edition": "EEU", - "license_md5": "00000000000000000000000000000000", - "license_id": null, - "historical_max_users": 999, - "licensee": { - "Name": "ABC, Inc.", - "Email": "email@example.com", - "Company": "ABC, Inc." - }, - "license_user_count": 999, - "license_starts_at": "2020-01-01", - "license_expires_at": "2021-01-01", - "license_plan": "ultimate", - "license_add_ons": { - }, - "license_trial": false, - "counts": { - "assignee_lists": 999, - "boards": 999, - "ci_builds": 999, - ... - }, - "container_registry_enabled": true, - "dependency_proxy_enabled": false, - "gitlab_shared_runners_enabled": true, - "gravatar_enabled": true, - "influxdb_metrics_enabled": true, - "ldap_enabled": false, - "mattermost_enabled": false, - "omniauth_enabled": true, - "prometheus_enabled": false, - "prometheus_metrics_enabled": false, - "reply_by_email_enabled": "incoming+%{key}@incoming.gitlab.com", - "signup_enabled": true, - "web_ide_clientside_preview_enabled": true, - "ingress_modsecurity_enabled": true, - "projects_with_expiration_policy_disabled": 999, - "projects_with_expiration_policy_enabled": 999, - ... 
- "elasticsearch_enabled": true, - "license_trial_ends_on": null, - "geo_enabled": false, - "git": { - "version": { - "major": 2, - "minor": 26, - "patch": 1 - } - }, - "gitaly": { - "version": "12.10.0-rc1-93-g40980d40", - "servers": 56, - "clusters": 14, - "filesystems": [ - "EXT_2_3_4" - ] - }, - "gitlab_pages": { - "enabled": true, - "version": "1.17.0" - }, - "container_registry_server": { - "vendor": "gitlab", - "version": "2.9.1-gitlab" - }, - "database": { - "adapter": "postgresql", - "version": "9.6.15", - "pg_system_id": 6842684531675334351 - }, - "analytics_unique_visits": { - "g_analytics_contribution": 999, - ... - }, - "usage_activity_by_stage": { - "configure": { - "project_clusters_enabled": 999, - ... - }, - "create": { - "merge_requests": 999, - ... - }, - "manage": { - "events": 999, - ... - }, - "monitor": { - "clusters": 999, - ... - }, - "package": { - "projects_with_packages": 999 - }, - "plan": { - "issues": 999, - ... - }, - "release": { - "deployments": 999, - ... - }, - "secure": { - "user_container_scanning_jobs": 999, - ... - }, - "verify": { - "ci_builds": 999, - ... - } - }, - "usage_activity_by_stage_monthly": { - "configure": { - "project_clusters_enabled": 999, - ... - }, - "create": { - "merge_requests": 999, - ... - }, - "manage": { - "events": 999, - ... - }, - "monitor": { - "clusters": 999, - ... - }, - "package": { - "projects_with_packages": 999 - }, - "plan": { - "issues": 999, - ... - }, - "release": { - "deployments": 999, - ... - }, - "secure": { - "user_container_scanning_jobs": 999, - ... - }, - "verify": { - "ci_builds": 999, - ... 
- } - }, - "topology": { - "duration_s": 0.013836685999194742, - "application_requests_per_hour": 4224, - "query_apdex_weekly_average": 0.996, - "failures": [], - "nodes": [ - { - "node_memory_total_bytes": 33269903360, - "node_memory_utilization": 0.35, - "node_cpus": 16, - "node_cpu_utilization": 0.2, - "node_uname_info": { - "machine": "x86_64", - "sysname": "Linux", - "release": "4.19.76-linuxkit" - }, - "node_services": [ - { - "name": "web", - "process_count": 16, - "process_memory_pss": 233349888, - "process_memory_rss": 788220927, - "process_memory_uss": 195295487, - "server": "puma" - }, - { - "name": "sidekiq", - "process_count": 1, - "process_memory_pss": 734080000, - "process_memory_rss": 750051328, - "process_memory_uss": 731533312 - }, - ... - ], - ... - }, - ... - ] - } -} -``` - -## Notable changes - -In GitLab 13.5, `pg_system_id` was added to send the [PostgreSQL system identifier](https://www.2ndquadrant.com/en/blog/support-for-postgresqls-system-identifier-in-barman/). - -## Exporting Usage Ping SQL queries and definitions - -Two Rake tasks exist to export Usage Ping definitions. - -- The Rake tasks export the raw SQL queries for `count`, `distinct_count`, `sum`. -- The Rake tasks export the Redis counter class or the line of the Redis block for `redis_usage_data`. -- The Rake tasks calculate the `alt_usage_data` metrics. 
- -In the home directory of your local GitLab installation run the following Rake tasks for the YAML and JSON versions respectively: - -```shell -# for YAML export -bin/rake gitlab:usage_data:dump_sql_in_yaml - -# for JSON export -bin/rake gitlab:usage_data:dump_sql_in_json - -# You may pipe the output into a file -bin/rake gitlab:usage_data:dump_sql_in_yaml > ~/Desktop/usage-metrics-2020-09-02.yaml -``` - -## Generating and troubleshooting usage ping - -To get a usage ping, or to troubleshoot caching issues on your GitLab instance, please follow [instructions to generate usage ping](../../administration/troubleshooting/gitlab_rails_cheat_sheet.md#generate-usage-ping). +<!-- This redirect file can be deleted after February 1, 2021. --> +<!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/profiling.md b/doc/development/profiling.md index 76c89d361fc..ce9c1191648 100644 --- a/doc/development/profiling.md +++ b/doc/development/profiling.md @@ -128,8 +128,66 @@ console. As a follow up to finding `N+1` queries with Bullet, consider writing a [QueryRecoder test](query_recorder.md) to prevent a regression. +## System stats + +During or after profiling, you may want to get detailed information about the Ruby virtual machine process, +such as memory consumption, time spent on CPU, or garbage collector statistics. 
These are easy to produce individually +through various tools, but for convenience, a summary endpoint has been added that exports this data as a JSON payload: + +```shell +curl localhost:3000/-/metrics/system | jq +``` + +Example output: + +```json +{ + "version": "ruby 2.7.2p137 (2020-10-01 revision a8323b79eb) [x86_64-linux-gnu]", + "gc_stat": { + "count": 118, + "heap_allocated_pages": 11503, + "heap_sorted_length": 11503, + "heap_allocatable_pages": 0, + "heap_available_slots": 4688580, + "heap_live_slots": 3451712, + "heap_free_slots": 1236868, + "heap_final_slots": 0, + "heap_marked_slots": 3451450, + "heap_eden_pages": 11503, + "heap_tomb_pages": 0, + "total_allocated_pages": 11503, + "total_freed_pages": 0, + "total_allocated_objects": 32679478, + "total_freed_objects": 29227766, + "malloc_increase_bytes": 84760, + "malloc_increase_bytes_limit": 32883343, + "minor_gc_count": 88, + "major_gc_count": 30, + "compact_count": 0, + "remembered_wb_unprotected_objects": 114228, + "remembered_wb_unprotected_objects_limit": 228456, + "old_objects": 3185330, + "old_objects_limit": 6370660, + "oldmalloc_increase_bytes": 21838024, + "oldmalloc_increase_bytes_limit": 119181499 + }, + "memory_rss": 1326501888, + "memory_uss": 1048563712, + "memory_pss": 1139554304, + "time_cputime": 82.885264633, + "time_realtime": 1610459445.5579069, + "time_monotonic": 24001.23145713, + "worker_id": "puma_0" +} +``` + +NOTE: +This endpoint is only available for Rails web workers. Sidekiq workers can not be inspected this way. + ## Settings that impact performance +### Application settings + 1. `development` environment by default works with hot-reloading enabled, this makes Rails to check file changes every request, and create a potential contention lock, as hot reload is single threaded. 1. `development` environment can load code lazily once the request is fired which results in first request to always be slow. 
@@ -140,3 +198,34 @@ To disable those features for profiling/benchmarking set the `RAILS_PROFILE` env - restart GDK with `gdk restart` *This environment variable is only applicable for the development mode.* + +### GC settings + +Ruby's garbage collector (GC) can be tuned via a variety of environment variables that will directly impact application performance. + +The following table lists these variables along with their default values. + +| Environment variable | Default value | +|--|--| +| `RUBY_GC_HEAP_INIT_SLOTS` | `10000` | +| `RUBY_GC_HEAP_FREE_SLOTS` | `4096` | +| `RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO` | `0.20` | +| `RUBY_GC_HEAP_FREE_SLOTS_GOAL_RATIO` | `0.40` | +| `RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO` | `0.65` | +| `RUBY_GC_HEAP_GROWTH_FACTOR` | `1.8` | +| `RUBY_GC_HEAP_GROWTH_MAX_SLOTS` | `0 (disable)` | +| `RUBY_GC_HEAP_OLDOBJECT_LIMIT_FACTOR` | `2.0` | +| `RUBY_GC_MALLOC_LIMIT(_MIN)` | `(16 * 1024 * 1024 /* 16MB */)` | +| `RUBY_GC_MALLOC_LIMIT_MAX` | `(32 * 1024 * 1024 /* 32MB */)` | +| `RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR` | `1.4` | +| `RUBY_GC_OLDMALLOC_LIMIT(_MIN)` | `(16 * 1024 * 1024 /* 16MB */)` | +| `RUBY_GC_OLDMALLOC_LIMIT_MAX` | `(128 * 1024 * 1024 /* 128MB */)` | +| `RUBY_GC_OLDMALLOC_LIMIT_GROWTH_FACTOR` | `1.2` | + +([Source](https://github.com/ruby/ruby/blob/45b29754cfba8435bc4980a87cd0d32c648f8a2e/gc.c#L254-L308)) + +GitLab may decide to change these settings in order to speed up application performance, lower memory requirements, or both. + +You can see how each of these settings affects GC performance, memory use and application start-up time for an idle instance of +GitLab by running the `scripts/perf/gc/collect_gc_stats.rb` script. It will output GC stats and general timing data to standard +out as CSV. 
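As a rough illustration of how the `gc_stat` figures from the system stats endpoint can be read alongside the GC settings, the following Ruby sketch derives two ratios from a parsed payload. The helper name `summarize_gc_stat` is hypothetical and not part of GitLab; the sample values are copied from the example output shown earlier in this diff.

```ruby
require 'json'

# Hypothetical helper (not part of GitLab): derives a few ratios from the
# `gc_stat` section of the `/-/metrics/system` payload.
def summarize_gc_stat(gc_stat)
  available = gc_stat.fetch('heap_available_slots')
  free = gc_stat.fetch('heap_free_slots')

  {
    # Share of heap slots that are currently free.
    free_slot_ratio: (free.to_f / available).round(3),
    # Share of all GC runs that were major collections.
    major_gc_ratio: (gc_stat.fetch('major_gc_count').to_f / gc_stat.fetch('count')).round(3)
  }
end

# Values copied from the example output of the system stats endpoint.
payload = JSON.parse(<<~JSON)
  {
    "gc_stat": {
      "count": 118,
      "heap_available_slots": 4688580,
      "heap_free_slots": 1236868,
      "minor_gc_count": 88,
      "major_gc_count": 30
    }
  }
JSON

summary = summarize_gc_stat(payload.fetch('gc_stat'))
puts summary
```

For the sample payload, the free-slot ratio lies between the default values listed for `RUBY_GC_HEAP_FREE_SLOTS_MIN_RATIO` and `RUBY_GC_HEAP_FREE_SLOTS_MAX_RATIO`. Inside a running Ruby process, `GC.stat` returns comparable counters keyed by symbols rather than strings.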
diff --git a/doc/development/query_performance.md b/doc/development/query_performance.md index c61d2a0864f..3cb1b10c417 100644 --- a/doc/development/query_performance.md +++ b/doc/development/query_performance.md @@ -21,7 +21,7 @@ When you are optimizing your SQL queries, there are two dimensions to pay attent | Queries in a migration | `100ms` | This is different than the total [migration time](database_review.md#timing-guidelines-for-migrations). | | Concurrent operations in a migration | `5min` | Concurrent operations do not block the database, but they block the GitLab update. This includes operations such as `add_concurrent_index` and `add_concurrent_foreign_key`. | | Background migrations | `1s` | | -| Usage Ping | `1s` | See the [usage ping docs](product_analytics/usage_ping.md#developing-and-testing-usage-ping) for more details. | +| Usage Ping | `1s` | See the [usage ping docs](usage_ping.md#developing-and-testing-usage-ping) for more details. | - When analyzing your query's performance, pay attention to if the time you are seeing is on a [cold or warm cache](#cold-and-warm-cache). These guidelines apply for both cache types. - When working with batched queries, change the range and batch size to see how it effects the query timing and caching. diff --git a/doc/development/secure_coding_guidelines.md b/doc/development/secure_coding_guidelines.md index 44a95f6e820..bd98ea170e5 100644 --- a/doc/development/secure_coding_guidelines.md +++ b/doc/development/secure_coding_guidelines.md @@ -194,7 +194,7 @@ Go's [`regexp`](https://golang.org/pkg/regexp/) package uses `re2` and isn't vul - [Rubular](https://rubular.com/) is a nice online tool to fiddle with Ruby Regexps. 
- [Runaway Regular Expressions](https://www.regular-expressions.info/catastrophic.html) -- [The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale](http://people.cs.vt.edu/~davisjam/downloads/publications/DavisCoghlanServantLee-EcosystemREDOS-ESECFSE18.pdf). This research paper discusses approaches to automatically detect ReDoS vulnerabilities. +- [The impact of regular expression denial of service (ReDoS) in practice: an empirical study at the ecosystem scale](https://people.cs.vt.edu/~davisjam/downloads/publications/DavisCoghlanServantLee-EcosystemREDOS-ESECFSE18.pdf). This research paper discusses approaches to automatically detect ReDoS vulnerabilities. - [Freezing the web: A study of redos vulnerabilities in JavaScript-based web servers](https://www.usenix.org/system/files/conference/usenixsecurity18/sec18-staicu.pdf). Another research paper about detecting ReDoS vulnerabilities. ## Server Side Request Forgery (SSRF) diff --git a/doc/development/sidekiq_style_guide.md b/doc/development/sidekiq_style_guide.md index e4f07f732cf..e290eaee7c2 100644 --- a/doc/development/sidekiq_style_guide.md +++ b/doc/development/sidekiq_style_guide.md @@ -825,7 +825,7 @@ For the same reasons that removing workers is dangerous, care should be taken when renaming queues. 
When renaming queues, use the `sidekiq_queue_migrate` helper migration method, -as show in this example: +as shown in this example: ```ruby class MigrateTheRenamedSidekiqQueue < ActiveRecord::Migration[5.0] diff --git a/doc/development/snowplow.md b/doc/development/snowplow.md new file mode 100644 index 00000000000..6b37936cd93 --- /dev/null +++ b/doc/development/snowplow.md @@ -0,0 +1,623 @@ +--- +stage: Growth +group: Product Intelligence +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Snowplow Guide + +This guide provides an overview of how Snowplow works, and implementation details. + +For more information about Product Intelligence, see: + +- [Product Intelligence Guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/) +- [Usage Ping Guide](usage_ping.md) + +More useful links: + +- [Product Intelligence Direction](https://about.gitlab.com/direction/product-intelligence/) +- [Data Analysis Process](https://about.gitlab.com/handbook/business-ops/data-team/#data-analysis-process/) +- [Data for Product Managers](https://about.gitlab.com/handbook/business-ops/data-team/programs/data-for-product-managers/) +- [Data Infrastructure](https://about.gitlab.com/handbook/business-ops/data-team/platform/infrastructure/) + +## What is Snowplow + +Snowplow is an enterprise-grade marketing and Product Intelligence platform which helps track the way users engage with our website and application. + +[Snowplow](https://github.com/snowplow/snowplow) consists of the following loosely-coupled sub-systems: + +- **Trackers** fire Snowplow events. Snowplow has 12 trackers, covering web, mobile, desktop, server, and IoT. +- **Collectors** receive Snowplow events from trackers. We have three different event collectors, synchronizing events either to Amazon S3, Apache Kafka, or Amazon Kinesis. 
+- **Enrich** cleans up the raw Snowplow events, enriches them and puts them into storage. We have a Hadoop-based enrichment process, and a Kinesis-based or Kafka-based process. +- **Storage** is where the Snowplow events live. We store the Snowplow events in a flat file structure on S3, and in the Redshift and PostgreSQL databases. +- **Data modeling** is where event-level data is joined with other data sets and aggregated into smaller data sets, and business logic is applied. This produces a clean set of tables which make it easier to perform analysis on the data. We have data models for Redshift and Looker. +- **Analytics** are performed on the Snowplow events or on the aggregate tables. + +![snowplow_flow](img/snowplow_flow.png) + +## Snowplow schema + +We have many definitions of Snowplow's schema. We have an active issue to [standardize this schema](https://gitlab.com/gitlab-org/gitlab/-/issues/207930) including the following definitions: + +- Frontend and backend taxonomy as listed below +- [Structured event taxonomy](#structured-event-taxonomy) +- [Self describing events](https://github.com/snowplow/snowplow/wiki/Custom-events#self-describing-events) +- [Iglu schema](https://gitlab.com/gitlab-org/iglu/) +- [Snowplow authored events](https://github.com/snowplow/snowplow/wiki/Snowplow-authored-events) + +## Enabling Snowplow + +Tracking can be enabled at: + +- The instance level, which enables tracking on both the frontend and backend layers. +- The user level, though user tracking can be disabled on a per-user basis. GitLab tracking respects the [Do Not Track](https://www.eff.org/issues/do-not-track) standard, so any user who has enabled the Do Not Track option in their browser is not tracked at a user level. + +We use Snowplow for the majority of our tracking strategy and it is enabled on GitLab.com. On a self-managed instance, Snowplow can be enabled by navigating to: + +- **Admin Area > Settings > General** in the UI. 
+- `admin/application_settings/integrations` in your browser. + +The following configuration is required: + +| Name | Value | +|---------------|---------------------------| +| Collector | `snowplow.trx.gitlab.net` | +| Site ID | `gitlab` | +| Cookie domain | `.gitlab.com` | + +## Snowplow request flow + +The following example shows a basic request/response flow between the following components: + +- Snowplow JS / Ruby Trackers on GitLab.com +- [GitLab.com Snowplow Collector](https://gitlab.com/gitlab-com/gl-infra/readiness/-/blob/master/library/snowplow/index.md) +- The GitLab S3 Bucket +- The GitLab Snowflake Data Warehouse +- Sisense: + +```mermaid +sequenceDiagram + participant Snowplow JS (Frontend) + participant Snowplow Ruby (Backend) + participant GitLab.com Snowplow Collector + participant S3 Bucket + participant Snowflake DW + participant Sisense Dashboards + Snowplow JS (Frontend) ->> GitLab.com Snowplow Collector: FE Tracking event + Snowplow Ruby (Backend) ->> GitLab.com Snowplow Collector: BE Tracking event + loop Process using Kinesis Stream + GitLab.com Snowplow Collector ->> GitLab.com Snowplow Collector: Log raw events + GitLab.com Snowplow Collector ->> GitLab.com Snowplow Collector: Enrich events + GitLab.com Snowplow Collector ->> GitLab.com Snowplow Collector: Write to disk + end + GitLab.com Snowplow Collector ->> S3 Bucket: Kinesis Firehose + S3 Bucket->>Snowflake DW: Import data + Snowflake DW->>Snowflake DW: Transform data using dbt + Snowflake DW->>Sisense Dashboards: Data available for querying +``` + +## Structured event taxonomy + +When adding new click events, we should add them in a way that's internally consistent. If we don't, it is very painful to perform analysis across features since each feature captures events differently. + +The current method provides several attributes that are sent on each click event. 
Please try to follow these guidelines when specifying events to capture: + +| attribute | type | required | description | +| --------- | ------- | -------- | ----------- | +| category | text | true | The page or backend area of the application. Unless infeasible, please use the Rails page attribute by default in the frontend, and namespace + classname on the backend. | +| action | text | true | The action the user is taking, or aspect that's being instrumented. The first word should always describe the action or aspect: clicks should be `click`, activations should be `activate`, creations should be `create`, etc. Use underscores to describe what was acted on; for example, activating a form field would be `activate_form_input`. An interface action like clicking on a dropdown would be `click_dropdown`, while a behavior like creating a project record from the backend would be `create_project` | +| label | text | false | The specific element, or object that's being acted on. This is either the label of the element (e.g. a tab labeled 'Create from template' may be `create_from_template`) or a unique identifier if no text is available (e.g. closing the Groups dropdown in the top navbar might be `groups_dropdown_close`), or it could be the name or title attribute of a record being created. | +| property | text | false | Any additional property of the element, or object being acted on. | +| value | decimal | false | Describes a numeric value or something directly related to the event. This could be the value of an input (e.g. `10` when clicking `internal` visibility). | + +### Web-specific parameters + +Snowplow JS adds many [web-specific parameters](https://docs.snowplowanalytics.com/docs/collecting-data/collecting-from-own-applications/snowplow-tracker-protocol/#Web-specific_parameters) to all web events by default. 
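To make the structured event taxonomy above more concrete, here is a minimal Ruby sketch that checks an event hash against the attribute table (required `category` and `action`, optional `label`, `property`, and numeric `value`). The constant `STRUCTURED_EVENT_ATTRIBUTES` and the method `structured_event_errors` are hypothetical names used for illustration only; they are not part of GitLab or Snowplow.

```ruby
# Hypothetical validator (not part of GitLab): mirrors the attribute table of
# the structured event taxonomy.
STRUCTURED_EVENT_ATTRIBUTES = {
  category: { type: String,  required: true },
  action:   { type: String,  required: true },
  label:    { type: String,  required: false },
  property: { type: String,  required: false },
  value:    { type: Numeric, required: false }
}.freeze

# Returns a list of human-readable problems; an empty list means the event
# matches the taxonomy as modeled here.
def structured_event_errors(event)
  errors = []

  STRUCTURED_EVENT_ATTRIBUTES.each do |name, rule|
    value = event[name]

    if value.nil?
      errors << "#{name} is required" if rule[:required]
    elsif !value.is_a?(rule[:type])
      errors << "#{name} must be a #{rule[:type]}"
    end
  end

  unknown = event.keys - STRUCTURED_EVENT_ATTRIBUTES.keys
  errors << "unknown attributes: #{unknown.join(', ')}" unless unknown.empty?

  errors
end

valid_event = {
  category: 'dashboard:projects:index',
  action: 'click_dropdown',
  label: 'groups_dropdown_close'
}
invalid_event = { action: 'click_dropdown', value: 'ten' }

puts structured_event_errors(valid_event).inspect
puts structured_event_errors(invalid_event).inspect
```

The check only covers presence and type. Naming conventions, such as starting `action` with a verb like `click` or `create`, would still need review by a person.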
+ +## Implementing Snowplow JS (Frontend) tracking + +GitLab provides `Tracking`, an interface that wraps the [Snowplow JavaScript Tracker](https://github.com/snowplow/snowplow/wiki/javascript-tracker) for tracking custom events. There are a few ways to use tracking, but each generally requires at minimum, a `category` and an `action`. Additional data can be provided that adheres to our [Structured event taxonomy](#structured-event-taxonomy). + +| field | type | default value | description | +|:-----------|:-------|:---------------------------|:---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `category` | string | document.body.dataset.page | Page or subsection of a page that events are being captured within. | +| `action` | string | 'generic' | Action the user is taking. Clicks should be `click` and activations should be `activate`, so for example, focusing a form field would be `activate_form_input`, and clicking a button would be `click_button`. | +| `data` | object | {} | Additional data such as `label`, `property`, `value`, and `context` as described in our [Structured event taxonomy](#structured-event-taxonomy). | + +### Tracking in HAML (or Vue Templates) + +When working within HAML (or Vue templates) we can add `data-track-*` attributes to elements of interest. All elements that have a `data-track-event` attribute automatically have event tracking bound on clicks. 
+ +Below is an example of `data-track-*` attributes assigned to a button: + +```haml +%button.btn{ data: { track: { event: "click_button", label: "template_preview", property: "my-template" } } } +``` + +```html +<button class="btn" + data-track-event="click_button" + data-track-label="template_preview" + data-track-property="my-template" +/> +``` + +Event listeners are bound at the document level to handle click events on or within elements with these data attributes. This allows them to be properly handled on re-rendering and changes to the DOM. Note that because of the way these events are bound, click events should not be stopped from propagating up the DOM tree. If for any reason click events are being stopped from propagating, you need to implement your own listeners and follow the instructions in [Tracking in raw JavaScript](#tracking-in-raw-javascript). + +Below is a list of supported `data-track-*` attributes: + +| attribute | required | description | +|:----------------------|:---------|:------------| +| `data-track-event` | true | Action the user is taking. Clicks must be prepended with `click` and activations must be prepended with `activate`. For example, focusing a form field would be `activate_form_input` and clicking a button would be `click_button`. | +| `data-track-label` | false | The `label` as described in our [Structured event taxonomy](#structured-event-taxonomy). | +| `data-track-property` | false | The `property` as described in our [Structured event taxonomy](#structured-event-taxonomy). | +| `data-track-value` | false | The `value` as described in our [Structured event taxonomy](#structured-event-taxonomy). If omitted, this is the element's `value` property or an empty string. For checkboxes, the default value is the element's checked attribute or `false` when unchecked. | +| `data-track-context` | false | The `context` as described in our [Structured event taxonomy](#structured-event-taxonomy). 
| + +#### Caveats + +When using the GitLab helper method [`nav_link`](https://gitlab.com/gitlab-org/gitlab/-/blob/898b286de322e5df6a38d257b10c94974d580df8/app/helpers/tab_helper.rb#L69) be sure to wrap `html_options` under the `html_options` keyword argument. +Be careful, as this behavior can be confused with the `ActionView` helper method [`link_to`](https://api.rubyonrails.org/v5.2.3/classes/ActionView/Helpers/UrlHelper.html#method-i-link_to) that does not require additional wrapping of `html_options` + +`nav_link(controller: ['dashboard/groups', 'explore/groups'], html_options: { data: { track_label: "groups_dropdown", track_event: "click_dropdown" } })` + +vs + +`link_to assigned_issues_dashboard_path, title: _('Issues'), data: { track_label: 'main_navigation', track_event: 'click_issues_link' }` + +### Tracking within Vue components + +There's a tracking Vue mixin that can be used in components if more complex tracking is required. To use it, first import the `Tracking` library and request a mixin. + +```javascript +import Tracking from '~/tracking'; +const trackingMixin = Tracking.mixin({ label: 'right_sidebar' }); +``` + +You can provide default options that are passed along whenever an event is tracked from within your component. For instance, if all events within a component should be tracked with a given `label`, you can provide one at this time. Available defaults are `category`, `label`, `property`, and `value`. If no category is specified, `document.body.dataset.page` is used as the default. + +You can then use the mixin normally in your component with the `mixin` Vue declaration. The mixin also provides the ability to specify tracking options in `data` or `computed`. These override any defaults and allow the values to be dynamic from props, or based on state. + +```javascript +export default { + mixins: [trackingMixin], + // ...[component implementation]... 
+ data() { + return { + expanded: false, + tracking: { + label: 'left_sidebar' + } + }; + }, +} +``` + +The mixin provides a `track` method that can be called within the template, or from component methods. An example of the whole implementation might look like the following. + +```javascript +export default { + mixins: [Tracking.mixin({ label: 'right_sidebar' })], + data() { + return { + expanded: false, + }; + }, + methods: { + toggle() { + this.expanded = !this.expanded; + this.track('click_toggle', { value: this.expanded }) + } + } +}; +``` + +And if needed within the template, you can use the `track` method directly as well. + +```html +<template> + <div> + <a class="toggle" @click.prevent="toggle">Toggle</a> + <div v-if="expanded"> + <p>Hello world!</p> + <a @click.prevent="track('click_action')">Track an event</a> + </div> + </div> +</template> +``` + +### Tracking in raw JavaScript + +Custom event tracking and instrumentation can be added by directly calling the `Tracking.event` static function. The following example demonstrates tracking a click on a button by calling `Tracking.event` manually. 
+ +```javascript +import Tracking from '~/tracking'; + +const button = document.getElementById('create_from_template_button'); +button.addEventListener('click', () => { + Tracking.event('dashboard:projects:index', 'click_button', { + label: 'create_from_template', + property: 'template_preview', + value: 'rails', + }); +}) +``` + +### Tests and test helpers + +In Jest, particularly in Vue tests, you can use the following: + +```javascript +import { mockTracking } from 'helpers/tracking_helper'; + +describe('MyTracking', () => { + let spy; + + beforeEach(() => { + spy = mockTracking('_category_', wrapper.element, jest.spyOn); + }); + + it('tracks an event when clicked on feedback', () => { + wrapper.find('.discover-feedback-icon').trigger('click'); + + expect(spy).toHaveBeenCalledWith('_category_', 'click_button', { + label: 'security-discover-feedback-cta', + property: '0', + }); + }); +}); +``` + +In obsolete Karma tests it's used as below: + +```javascript +import { mockTracking } from 'spec/helpers/tracking_helper'; + +describe('my component', () => { + let trackingSpy; + + beforeEach(() => { + trackingSpy = mockTracking('_category_', vm.$el, spyOn); + }); + + const triggerEvent = () => { + // action which should trigger an event + }; + + it('tracks an event when toggled', () => { + expect(trackingSpy).not.toHaveBeenCalled(); + + triggerEvent('a.toggle'); + + expect(trackingSpy).toHaveBeenCalledWith('_category_', 'click_edit_button', { + label: 'right_sidebar', + property: 'confidentiality', + }); + }); +}); +``` + +## Implementing Snowplow Ruby (Backend) tracking + +GitLab provides `Gitlab::Tracking`, an interface that wraps the [Snowplow Ruby Tracker](https://github.com/snowplow/snowplow/wiki/ruby-tracker) for tracking custom events. 
+ +Custom event tracking and instrumentation can be added by directly calling the `Gitlab::Tracking.event` class method, which accepts the following arguments: + +| argument | type | default value | description | +|:-----------|:-------|:--------------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| +| `category` | string | 'application' | Area or aspect of the application. This could be `HealthCheckController` or `Lfs::FileTransformer` for instance. | +| `action` | string | 'generic' | The action being taken, which can be anything from a controller action like `create` to something like an Active Record callback. | +| `data` | object | {} | Additional data such as `label`, `property`, `value`, and `context` as described in [Structured event taxonomy](#structured-event-taxonomy). These are set as empty strings if you don't provide them. | + +Tracking can be viewed as either tracking user behavior, or can be used for instrumentation to monitor and visualize performance over time in an area or aspect of code. + +For example: + +```ruby +class Projects::CreateService < BaseService + def execute + project = Project.create(params) + + Gitlab::Tracking.event('Projects::CreateService', 'create_project', + label: project.errors.full_messages.to_sentence, + value: project.valid? + ) + end +end +``` + +### Unit testing + +Use the `expect_snowplow_event` helper when testing backend Snowplow events. See [testing best practices]( +https://docs.gitlab.com/ee/development/testing_guide/best_practices.html#test-snowplow-events) for details. + +### Performance + +We use the [AsyncEmitter](https://github.com/snowplow/snowplow/wiki/Ruby-Tracker#52-the-asyncemitter-class) when tracking events, which allows for instrumentation calls to be run in a background thread. 
This is still an active area of development. + +## Developing and testing Snowplow + +There are several tools for developing and testing Snowplow events: + +| Testing Tool | Frontend Tracking | Backend Tracking | Local Development Environment | Production Environment | Production Environment | +|----------------------------------------------|--------------------|---------------------|-------------------------------|------------------------|------------------------| +| Snowplow Analytics Debugger Chrome Extension | **{check-circle}** | **{dotted-circle}** | **{check-circle}** | **{check-circle}** | **{check-circle}** | +| Snowplow Inspector Chrome Extension | **{check-circle}** | **{dotted-circle}** | **{check-circle}** | **{check-circle}** | **{check-circle}** | +| Snowplow Micro | **{check-circle}** | **{check-circle}** | **{check-circle}** | **{dotted-circle}** | **{dotted-circle}** | +| Snowplow Mini | **{check-circle}** | **{check-circle}** | **{dotted-circle}** | **{status_preparing}** | **{status_preparing}** | + +**Legend** + +**{check-circle}** Available, **{status_preparing}** In progress, **{dotted-circle}** Not Planned + +### Preparing your MR for Review + +1. For frontend events, in the MR description section, add a screenshot of the event's relevant section using the [Snowplow Analytics Debugger](https://chrome.google.com/webstore/detail/snowplow-analytics-debugg/jbnlcgeengmijcghameodeaenefieedm) Chrome browser extension. +1. For backend events, please use Snowplow Micro and add the output of the Snowplow Micro good events `GET http://localhost:9090/micro/good`. + +### Snowplow Analytics Debugger Chrome Extension + +Snowplow Analytics Debugger is a browser extension for testing frontend events. This works on production, staging and local development environments. + +1. Install the [Snowplow Analytics Debugger](https://chrome.google.com/webstore/detail/snowplow-analytics-debugg/jbnlcgeengmijcghameodeaenefieedm) Chrome browser extension. +1. 
Open Chrome DevTools to the Snowplow Analytics Debugger tab. +1. Learn more at [Igloo Analytics](https://www.iglooanalytics.com/blog/snowplow-analytics-debugger-chrome-extension.html). + +### Snowplow Inspector Chrome Extension + +Snowplow Inspector Chrome Extension is a browser extension for testing frontend events. This works on production, staging and local development environments. + +1. Install [Snowplow Inspector](https://chrome.google.com/webstore/detail/snowplow-inspector/maplkdomeamdlngconidoefjpogkmljm?hl=en). +1. Open the Chrome extension by pressing the Snowplow Inspector icon beside the address bar. +1. Click around on a webpage with Snowplow and you should see JavaScript events firing in the inspector window. + +### Snowplow Micro + +Snowplow Micro is a very small version of a full Snowplow data collection pipeline: small enough that it can be launched by a test suite. Events can be recorded into Snowplow Micro just as they can a full Snowplow pipeline. Micro then exposes an API that can be queried. + +Snowplow Micro is a Docker-based solution for testing frontend and backend events in a local development environment. You need to modify GDK using the instructions below to set this up. + +- Read [Introducing Snowplow Micro](https://snowplowanalytics.com/blog/2019/07/17/introducing-snowplow-micro/) +- Look at the [Snowplow Micro repository](https://github.com/snowplow-incubator/snowplow-micro) +- Watch our [installation guide recording](https://www.youtube.com/watch?v=OX46fo_A0Ag) + +1. Ensure Docker is installed and running. + +1. Install [Snowplow Micro](https://github.com/snowplow-incubator/snowplow-micro) by cloning the settings in [this project](https://gitlab.com/gitlab-org/snowplow-micro-configuration): +1. Navigate to the directory with the cloned project, and start the appropriate Docker + container with the following script: + + ```shell + ./snowplow-micro.sh + ``` + +1. 
Update your instance's settings to enable Snowplow events and point to the Snowplow Micro collector: + + ```shell + gdk psql -d gitlabhq_development + update application_settings set snowplow_collector_hostname='localhost:9090', snowplow_enabled=true, snowplow_cookie_domain='.gitlab.com'; + ``` + +1. Update `DEFAULT_SNOWPLOW_OPTIONS` in `app/assets/javascripts/tracking.js` to remove `forceSecureTracker: true`: + + ```diff + diff --git a/app/assets/javascripts/tracking.js b/app/assets/javascripts/tracking.js + index 0a1211d0a76..3b98c8f28f2 100644 + --- a/app/assets/javascripts/tracking.js + +++ b/app/assets/javascripts/tracking.js + @@ -7,7 +7,6 @@ const DEFAULT_SNOWPLOW_OPTIONS = { + appId: '', + userFingerprint: false, + respectDoNotTrack: true, + - forceSecureTracker: true, + eventMethod: 'post', + contexts: { webPage: true, performanceTiming: true }, + formTracking: false, + + ``` + +1. Update `snowplow_options` in `lib/gitlab/tracking.rb` to add `protocol` and `port`: + + ```diff + diff --git a/lib/gitlab/tracking.rb b/lib/gitlab/tracking.rb + index 618e359211b..e9084623c43 100644 + --- a/lib/gitlab/tracking.rb + +++ b/lib/gitlab/tracking.rb + @@ -41,7 +41,9 @@ def snowplow_options(group) + cookie_domain: Gitlab::CurrentSettings.snowplow_cookie_domain, + app_id: Gitlab::CurrentSettings.snowplow_app_id, + form_tracking: additional_features, + - link_click_tracking: additional_features + + link_click_tracking: additional_features, + + protocol: 'http', + + port: 9090 + }.transform_keys! { |key| key.to_s.camelize(:lower).to_sym } + end + ``` + +1. 
Update `emitter` in `lib/gitlab/tracking/destinations/snowplow.rb` to change `protocol`: + + ```diff + diff --git a/lib/gitlab/tracking/destinations/snowplow.rb b/lib/gitlab/tracking/destinations/snowplow.rb + index 4fa844de325..5dd9d0eacfb 100644 + --- a/lib/gitlab/tracking/destinations/snowplow.rb + +++ b/lib/gitlab/tracking/destinations/snowplow.rb + @@ -40,7 +40,7 @@ def tracker + def emitter + SnowplowTracker::AsyncEmitter.new( + Gitlab::CurrentSettings.snowplow_collector_hostname, + - protocol: 'https' + + protocol: 'http' + ) + end + end + + ``` + +1. Restart GDK: + + ```shell + gdk restart + ``` + +1. Send a test Snowplow event from the Rails console: + + ```ruby + Gitlab::Tracking.self_describing_event('iglu:com.gitlab/pageview_context/jsonschema/1-0-0', data: { page_type: 'MY_TYPE' }, context: nil) + ``` + +1. Navigate to `localhost:9090/micro/good` to see the event. + +### Snowplow Mini + +[Snowplow Mini](https://github.com/snowplow/snowplow-mini) is an easily-deployable, single-instance version of Snowplow. + +Snowplow Mini can be used for testing frontend and backend events on a production, staging and local development environment. + +For GitLab.com, we're setting up a [QA and Testing environment](https://gitlab.com/gitlab-org/telemetry/-/issues/266) using Snowplow Mini. 
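For the Snowplow Micro setup described earlier, the good events endpoint can also be queried from a script instead of a browser. The following Ruby sketch is a minimal illustration under two assumptions: Snowplow Micro is listening on `localhost:9090`, and `/micro/good` responds with a JSON array containing one entry per good event. The method names `fetch_good_events_body` and `count_good_events` are hypothetical.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Assumption: Snowplow Micro listens on localhost:9090, as in the setup steps.
MICRO_GOOD_EVENTS_URI = URI('http://localhost:9090/micro/good')

# Fetches the raw response body from Snowplow Micro.
# Requires the Snowplow Micro Docker container to be running.
def fetch_good_events_body(uri = MICRO_GOOD_EVENTS_URI)
  Net::HTTP.get(uri)
end

# Assumption: the body is a JSON array with one entry per good event.
def count_good_events(body)
  events = JSON.parse(body)
  raise ArgumentError, 'expected a JSON array of events' unless events.is_a?(Array)

  events.size
end

# Example with a running Snowplow Micro container:
#   puts count_good_events(fetch_good_events_body)
```

A count that increases after sending a test event from the Rails console suggests the event reached the collector. The full response body is still what should be attached to an MR for review.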
+ +## Snowplow Schemas + +### [gitlab_standard](https://gitlab.com/gitlab-org/iglu/-/blob/master/public/schemas/com.gitlab/gitlab_standard/jsonschema/1-0-0) Schema + +| Field Name | Required | Type | Description | +|--------------|---------------------|---------|--------------------------------| +| project_id | **{dotted-circle}** | integer | ID of the associated project | +| namespace_id | **{dotted-circle}** | integer | ID of the associated namespace | + +### Default Schema + +| Field Name | Required | Type | Description | +|--------------------------|---------------------|-----------|----------------------------------------------------------------------------------------------------------------------------------| +| app_id | **{check-circle}** | string | Unique identifier for website / application | +| base_currency | **{dotted-circle}** | string | Reporting currency | +| br_colordepth | **{dotted-circle}** | integer | Browser color depth | +| br_cookies | **{dotted-circle}** | boolean | Does the browser permit cookies? | +| br_family | **{dotted-circle}** | string | Browser family | +| br_features_director | **{dotted-circle}** | boolean | Director plugin installed? | +| br_features_flash | **{dotted-circle}** | boolean | Flash plugin installed? | +| br_features_gears | **{dotted-circle}** | boolean | Google gears installed? | +| br_features_java | **{dotted-circle}** | boolean | Java plugin installed? | +| br_features_pdf | **{dotted-circle}** | boolean | Adobe PDF plugin installed? | +| br_features_quicktime | **{dotted-circle}** | boolean | Quicktime plugin installed? | +| br_features_realplayer | **{dotted-circle}** | boolean | Realplayer plugin installed? | +| br_features_silverlight | **{dotted-circle}** | boolean | Silverlight plugin installed? | +| br_features_windowsmedia | **{dotted-circle}** | boolean | Windows media plugin installed? 
| +| br_lang | **{dotted-circle}** | string | Language the browser is set to | +| br_name | **{dotted-circle}** | string | Browser name | +| br_renderengine | **{dotted-circle}** | string | Browser rendering engine | +| br_type | **{dotted-circle}** | string | Browser type | +| br_version | **{dotted-circle}** | string | Browser version | +| br_viewheight | **{dotted-circle}** | string | Browser viewport height | +| br_viewwidth | **{dotted-circle}** | string | Browser viewport width | +| collector_tstamp | **{dotted-circle}** | timestamp | Time stamp for the event recorded by the collector | +| contexts | **{dotted-circle}** | | | +| derived_contexts | **{dotted-circle}** | | Contexts derived in the Enrich process | +| derived_tstamp | **{dotted-circle}** | timestamp | Timestamp making allowance for inaccurate device clock | +| doc_charset | **{dotted-circle}** | string | Web page’s character encoding | +| doc_height | **{dotted-circle}** | string | Web page height | +| doc_width | **{dotted-circle}** | string | Web page width | +| domain_sessionid | **{dotted-circle}** | string | Unique identifier (UUID) for this visit of this user_id to this domain | +| domain_sessionidx | **{dotted-circle}** | integer | Index of number of visits that this user_id has made to this domain (The first visit is `1`) | +| domain_userid | **{dotted-circle}** | string | Unique identifier for a user, based on a first party cookie (so domain specific) | +| dvce_created_tstamp | **{dotted-circle}** | timestamp | Timestamp when event occurred, as recorded by client device | +| dvce_ismobile | **{dotted-circle}** | boolean | Indicates whether device is mobile | +| dvce_screenheight | **{dotted-circle}** | string | Screen / monitor resolution | +| dvce_screenwidth | **{dotted-circle}** | string | Screen / monitor resolution | +| dvce_sent_tstamp | **{dotted-circle}** | timestamp | Timestamp when event was sent by client device to collector | +| dvce_type | **{dotted-circle}** | string | 
Type of device | +| etl_tags | **{dotted-circle}** | string | JSON of tags for this ETL run | +| etl_tstamp | **{dotted-circle}** | timestamp | Timestamp event began ETL | +| event | **{dotted-circle}** | string | Event type | +| event_fingerprint | **{dotted-circle}** | string | Hash client-set event fields | +| event_format | **{dotted-circle}** | string | Format for event | +| event_id | **{dotted-circle}** | string | Event UUID | +| event_name | **{dotted-circle}** | string | Event name | +| event_vendor | **{dotted-circle}** | string | The company who developed the event model | +| event_version | **{dotted-circle}** | string | Version of event schema | +| geo_city | **{dotted-circle}** | string | City of IP origin | +| geo_country | **{dotted-circle}** | string | Country of IP origin | +| geo_latitude | **{dotted-circle}** | string | An approximate latitude | +| geo_longitude | **{dotted-circle}** | string | An approximate longitude | +| geo_region | **{dotted-circle}** | string | Region of IP origin | +| geo_region_name | **{dotted-circle}** | string | Region of IP origin | +| geo_timezone | **{dotted-circle}** | string | Timezone of IP origin | +| geo_zipcode | **{dotted-circle}** | string | Zip (postal) code of IP origin | +| ip_domain | **{dotted-circle}** | string | Second level domain name associated with the visitor’s IP address | +| ip_isp | **{dotted-circle}** | string | Visitor’s ISP | +| ip_netspeed | **{dotted-circle}** | string | Visitor’s connection type | +| ip_organization | **{dotted-circle}** | string | Organization associated with the visitor’s IP address – defaults to ISP name if none is found | +| mkt_campaign | **{dotted-circle}** | string | The campaign ID | +| mkt_clickid | **{dotted-circle}** | string | The click ID | +| mkt_content | **{dotted-circle}** | string | The content or ID of the ad. 
| +| mkt_medium | **{dotted-circle}** | string | Type of traffic source | +| mkt_network | **{dotted-circle}** | string | The ad network to which the click ID belongs | +| mkt_source | **{dotted-circle}** | string | The company / website where the traffic came from | +| mkt_term | **{dotted-circle}** | string | Keywords associated with the referrer | +| name_tracker | **{dotted-circle}** | string | The tracker namespace | +| network_userid | **{dotted-circle}** | string | Unique identifier for a user, based on a cookie from the collector (so set at a network level and shouldn’t be set by a tracker) | +| os_family | **{dotted-circle}** | string | Operating system family | +| os_manufacturer | **{dotted-circle}** | string | Manufacturers of operating system | +| os_name | **{dotted-circle}** | string | Name of operating system | +| os_timezone | **{dotted-circle}** | string | Client operating system timezone | +| page_referrer | **{dotted-circle}** | string | Referrer URL | +| page_title | **{dotted-circle}** | string | Page title | +| page_url | **{dotted-circle}** | string | Page URL | +| page_urlfragment | **{dotted-circle}** | string | Fragment aka anchor | +| page_urlhost | **{dotted-circle}** | string | Host aka domain | +| page_urlpath | **{dotted-circle}** | string | Path to page | +| page_urlport | **{dotted-circle}** | integer | Port if specified, 80 if not | +| page_urlquery | **{dotted-circle}** | string | Query string | +| page_urlscheme | **{dotted-circle}** | string | Scheme (protocol name) | +| platform | **{dotted-circle}** | string | The platform the app runs on | +| pp_xoffset_max | **{dotted-circle}** | integer | Maximum page x offset seen in the last ping period | +| pp_xoffset_min | **{dotted-circle}** | integer | Minimum page x offset seen in the last ping period | +| pp_yoffset_max | **{dotted-circle}** | integer | Maximum page y offset seen in the last ping period | +| pp_yoffset_min | **{dotted-circle}** | integer | Minimum page y offset 
seen in the last ping period | +| refr_domain_userid | **{dotted-circle}** | string | The Snowplow domain_userid of the referring website | +| refr_dvce_tstamp | **{dotted-circle}** | timestamp | The time of attaching the domain_userid to the inbound link | +| refr_medium | **{dotted-circle}** | string | Type of referer | +| refr_source | **{dotted-circle}** | string | Name of referer if recognised | +| refr_term | **{dotted-circle}** | string | Keywords if source is a search engine | +| refr_urlfragment | **{dotted-circle}** | string | Referer URL fragment | +| refr_urlhost | **{dotted-circle}** | string | Referer host | +| refr_urlpath | **{dotted-circle}** | string | Referer page path | +| refr_urlport | **{dotted-circle}** | integer | Referer port | +| refr_urlquery | **{dotted-circle}** | string | Referer URL querystring | +| refr_urlscheme | **{dotted-circle}** | string | Referer scheme | +| se_action | **{dotted-circle}** | string | The action / event itself | +| se_category | **{dotted-circle}** | string | The category of event | +| se_label | **{dotted-circle}** | string | A label often used to refer to the ‘object’ the action is performed on | +| se_property | **{dotted-circle}** | string | A property associated with either the action or the object | +| se_value | **{dotted-circle}** | decimal | A value associated with the user action | +| ti_category | **{dotted-circle}** | string | Item category | +| ti_currency | **{dotted-circle}** | string | Currency | +| ti_name | **{dotted-circle}** | string | Item name | +| ti_orderid | **{dotted-circle}** | string | Order ID | +| ti_price | **{dotted-circle}** | decimal | Item price | +| ti_price_base | **{dotted-circle}** | decimal | Item price in base currency | +| ti_quantity | **{dotted-circle}** | integer | Item quantity | +| ti_sku | **{dotted-circle}** | string | Item SKU | +| tr_affiliation | **{dotted-circle}** | string | Transaction affiliation (such as channel) | +| tr_city | **{dotted-circle}** | 
string | Delivery address: city | +| tr_country | **{dotted-circle}** | string | Delivery address: country | +| tr_currency | **{dotted-circle}** | string | Transaction Currency | +| tr_orderid | **{dotted-circle}** | string | Order ID | +| tr_shipping | **{dotted-circle}** | decimal | Delivery cost charged | +| tr_shipping_base | **{dotted-circle}** | decimal | Shipping cost in base currency | +| tr_state | **{dotted-circle}** | string | Delivery address: state | +| tr_tax | **{dotted-circle}** | decimal | Transaction tax value (such as amount of VAT included) | +| tr_tax_base | **{dotted-circle}** | decimal | Tax applied in base currency | +| tr_total | **{dotted-circle}** | decimal | Transaction total value | +| tr_total_base | **{dotted-circle}** | decimal | Total amount of transaction in base currency | +| true_tstamp | **{dotted-circle}** | timestamp | User-set exact timestamp | +| txn_id | **{dotted-circle}** | string | Transaction ID | +| unstruct_event | **{dotted-circle}** | JSON | The properties of the event | +| uploaded_at | **{dotted-circle}** | | | +| user_fingerprint | **{dotted-circle}** | integer | User identifier based on (hopefully unique) browser features | +| user_id | **{dotted-circle}** | string | Unique identifier for user, set by the business using setUserId | +| user_ipaddress | **{dotted-circle}** | string | IP address | +| useragent | **{dotted-circle}** | string | User agent (expressed as a browser string) | +| v_collector | **{dotted-circle}** | string | Collector version | +| v_etl | **{dotted-circle}** | string | ETL version | +| v_tracker | **{dotted-circle}** | string | Identifier for Snowplow tracker | diff --git a/doc/development/stage_group_dashboards.md b/doc/development/stage_group_dashboards.md new file mode 100644 index 00000000000..453d71411c3 --- /dev/null +++ b/doc/development/stage_group_dashboards.md @@ -0,0 +1,148 @@ +--- +stage: Enablement +group: Infrastructure +info: To determine the technical writer assigned to 
the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Dashboards for stage groups + +## Introduction + +Observability is about bringing visibility into a system to see and understand the state of each component, with context, to support performance tuning and debugging. To run a SaaS platform at scale, a rich and detailed observability platform is a necessity. We have a set of monitoring dashboards designed for [each stage group](https://about.gitlab.com/handbook/product/categories/#devops-stages). + +These dashboards are designed to give an insight, to everyone working in a feature category, into how their code operates at GitLab.com scale. They are grouped per stage group to show the impact of feature/code changes, deployments, and feature-flag toggles. + +Each stage group has a dashboard consisting of metrics at the application level, such as Rails Web Requests, Rails API Requests, Sidekiq Jobs, and so on. The metrics in each dashboard are filtered and accumulated based on the [GitLab product categories](https://about.gitlab.com/handbook/product/categories/) and [feature categories](feature_categorization/index.md). + +The list of dashboards for each stage group is accessible at <https://dashboards.gitlab.net/dashboards/f/stage-groups/stage-groups> (GitLab team members only), or at [the public mirror](https://dashboards.gitlab.com/dashboards?tag=feature_category&tag=stage-groups) (accessible to everyone with a GitLab.com account, with some limitations). + +The dashboards for stage groups are at a very early stage. All contributions are welcome. If you have any questions or suggestions, please submit an issue in the [Scalability Team issues tracker](https://gitlab.com/gitlab-com/gl-infra/scalability/-/issues/new). + +## Usage + +Inside a stage group dashboard, there are some notable components. 
Let's take the [Source Code group's dashboard](https://dashboards.gitlab.net/d/stage-groups-source_code/stage-groups-group-dashboard-create-source-code?orgId=1) as an example. + +### Time range controls + +![Default time filter](img/stage_group_dashboards_time_filter.png) + +- By default, all times are in the UTC timezone. [We use UTC when communicating in Engineering](https://about.gitlab.com/handbook/communication/#writing-style-guidelines). +- All metrics recorded in the GitLab production system have [1-year retention](https://gitlab.com/gitlab-cookbooks/gitlab-prometheus/-/blob/31526b03fef823e2f9b3cda7c75dcd28a12418a3/attributes/prometheus.rb#L40). +- Alternatively, you can zoom in or filter the time range directly on a graph. See the [Grafana Time Range Controls](https://grafana.com/docs/grafana/latest/dashboards/time-range-controls/) documentation for more information. + +### Filters and annotations + +In each dashboard, there are two filters and some annotation switches at the top of the page. [Grafana annotations](https://grafana.com/docs/grafana/latest/dashboards/annotations/) mark special events that are meaningful to development and operational activities directly on the graphs. + +![Filters and annotations](img/stage_group_dashboards_filters.png) + +| Name | Type | Description | +| ---- | ---- | ----------- | +| `PROMETHEUS_DS` | filter | Select which [Prometheus data sources](https://about.gitlab.com/handbook/engineering/monitoring/#prometheus) to use. The default value is `Global`, which aggregates the data from all available data sources. Most of the time, you don't need to change this filter. | +| `environment` | filter | Filter the environment the metrics are fetched from. The default setting is production (`gprd`). Check [Production Environment mapping](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#environments) for other possibilities. 
| +| `deploy` | annotation | Mark a deployment event on the GitLab.com SaaS platform. | +| `canary-deploy` | annotation | Mark a [canary deployment](https://about.gitlab.com/handbook/engineering/#canary-testing) event on the GitLab.com SaaS platform. | +| `feature-flags` | annotation | Mark the point in time when a feature flag is updated. | + +This is an example of a feature flag annotation displayed on a dashboard panel. + +![Annotations](img/stage_group_dashboards_annotation.png) + +### Metrics panels + +![Metrics panels](img/stage_group_dashboards_metrics.png) + +Although most of the metrics displayed in the panels are self-explanatory in their title and nearby description, note the following: + +- The events are counted, measured, accumulated, then collected, and stored as [time series](https://prometheus.io/docs/concepts/data_model/). The data is aggregated using statistical methods to produce metrics. This means that metrics are approximately correct and meaningful over a time period. They give you an overview of the state of a system over time. They are not meant to give you precise counts of discrete events. If you need a higher level of accuracy, use another monitoring tool, such as [logs](https://about.gitlab.com/handbook/engineering/monitoring/#logs). The following examples explain this further. +- All the rate metrics' units are `requests per second`. The default aggregate time frame is 1 minute. For example, suppose a panel shows that the requests per second at `2020-12-25 00:42:00` is `34.13`. This means that during minute 42 (from `2020-12-25 00:42:00` to `2020-12-25 00:42:59`), approximately `34.13 * 60 = ~ 2047` requests were processed by the web servers. +- You may frequently encounter gotchas related to decimal fractions and rounding, especially in low-traffic cases. For example, the error rate of `RepositoryUpdateMirrorWorker` at `2020-12-25 02:04:00` is `0.07`, equivalent to `4.2` jobs per minute. 
The raw result is `0.06666666667`, equivalent to 4 jobs per minute. +- All the rate metrics are more accurate when the data is big enough. The default floating-point precision is 2. In some extremely low-traffic panels, you would see `0.00` although there is still some real traffic. + +To inspect the raw data of the panel for further calculation, click on the Inspect button from the dropdown menu of a panel. Queries, raw data, and panel JSON structure are available. Read more at [Grafana panel inspection](https://grafana.com/docs/grafana/latest/panels/inspect-panel/). + +All the dashboards are powered by [Grafana](https://grafana.com/), a frontend for displaying metrics. Grafana consumes the data returned from queries to the backend Prometheus data source, then presents it in different visualizations. The stage group dashboards are built to serve the most common use cases with a limited set of filters and pre-built queries. Grafana provides a way to explore and visualize the metrics data with [Grafana Explore](https://grafana.com/docs/grafana/latest/explore/). This requires some knowledge of the [Prometheus PromQL query language](https://prometheus.io/docs/prometheus/latest/querying/basics/). + +## How to debug with the dashboards + +- A team member in the Code Review group has merged an MR which got deployed to production. +- To verify the deployment, we can check the [Code Review group's dashboard](https://dashboards.gitlab.net/d/stage-groups-code_review/stage-groups-group-dashboard-create-code-review?orgId=1). +- The Sidekiq Error Rate panel shows an elevated error rate, specifically for `UpdateMergeRequestsWorker`. + + ![Debug 1](img/stage_group_dashboards_debug_1.png) + +- If we click the `Kibana: Kibana Sidekiq failed request logs` link in the Extra links section, we can filter for `UpdateMergeRequestsWorker`, and read through the logs. 
+ + ![Debug 2](img/stage_group_dashboards_debug_2.png) + +- [Sentry](https://sentry.gitlab.net/gitlab/gitlabcom/) gives us a way to find the exception where we can filter by transaction type and correlation_id from a Kibana's result item. + + ![Debug 3](img/stage_group_dashboards_debug_3.png) + +- A precise exception, including a stack trace, job arguments, and other information, should now appear. Happy debugging! + +## How to customize the dashboard + +All Grafana dashboards at GitLab are generated from the [Jsonnet files](https://github.com/grafana/grafonnet-lib) stored in [the runbook project](https://gitlab.com/gitlab-com/runbooks/-/tree/master/dashboards). Particularly, the stage group dashboards definitions are stored in [/dashboards/stage-groups](https://gitlab.com/gitlab-com/runbooks/-/tree/master/dashboards/stage-groups) subfolder in the Runbook. By convention, each group has a corresponding jsonnet file. The dashboards are synced with GitLab [stage group data](https://gitlab.com/gitlab-com/www-gitlab-com/-/raw/master/data/stages.yml) every month. Expansion and customization are one of the key principles used when we designed this system. To customize your group's dashboard, you need to edit the corresponding file and follow the [Runbook workflow](https://gitlab.com/gitlab-com/runbooks/-/tree/master/dashboards#dashboard-source). The dashboard is updated after the MR is merged. Looking at an autogenerated file, for example, [`product_planning.dashboard.jsonnet`](https://gitlab.com/gitlab-com/runbooks/-/blob/master/dashboards/stage-groups/product_planning.dashboard.jsonnet): + +```jsonnet +// This file is autogenerated using scripts/update_stage_groups_dashboards.rb +// Please feel free to customize this file. 
+local stageGroupDashboards = import './stage-group-dashboards.libsonnet'; + +stageGroupDashboards.dashboard('product_planning') +.stageGroupDashboardTrailer() +``` + +We provide basic customization to filter out the components essential to your group's activities. By default, all components `web`, `api`, `git`, and `sidekiq` are available in the dashboard. We can change this to only show `web` and `api`, or only show `sidekiq`: + +```jsonnet +stageGroupDashboards.dashboard('product_planning', components=['web', 'api']).stageGroupDashboardTrailer() +# Or +stageGroupDashboards.dashboard('product_planning', components=['sidekiq']).stageGroupDashboardTrailer() + +``` + +You can also append further information or custom metrics to a dashboard. This is an example that adds some links and a total request rate on the top of the page: + +```jsonnet +local stageGroupDashboards = import './stage-group-dashboards.libsonnet'; +local grafana = import 'github.com/grafana/grafonnet-lib/grafonnet/grafana.libsonnet'; +local basic = import 'grafana/basic.libsonnet'; + +stageGroupDashboards.dashboard('source_code') +.addPanel( + grafana.text.new( + title='Group information', + mode='markdown', + content=||| + Useful link for the Source Code Management group dashboard: + - [Issue list](https://gitlab.com/groups/gitlab-org/-/issues?scope=all&utf8=%E2%9C%93&state=opened&label_name%5B%5D=repository) + - [Epic list](https://gitlab.com/groups/gitlab-org/-/epics?label_name[]=repository) + |||, + ), + gridPos={ x: 0, y: 0, w: 24, h: 4 } +) +.addPanel( + basic.timeseries( + title='Total Request Rate', + yAxisLabel='Requests per Second', + decimals=2, + query=||| + sum ( + rate(gitlab_transaction_duration_seconds_count{ + env='$environment', + environment='$environment', + feature_category=~'source_code_management', + }[$__interval]) + ) + ||| + ), + gridPos={ x: 0, y: 0, w: 24, h: 7 } +) +.stageGroupDashboardTrailer() +``` + +![Stage Group Dashboard 
Customization](img/stage_group_dashboards_time_customization.png) + +For deeper customization and more complicated metrics, visit the [Grafonnet lib](https://github.com/grafana/grafonnet-lib) project and the [GitLab Prometheus Metrics](../administration/monitoring/prometheus/gitlab_metrics.md#gitlab-prometheus-metrics) documentation. diff --git a/doc/development/telemetry/event_dictionary.md b/doc/development/telemetry/event_dictionary.md index bc230a46441..b3b3b0b4fdd 100644 --- a/doc/development/telemetry/event_dictionary.md +++ b/doc/development/telemetry/event_dictionary.md @@ -1,8 +1,8 @@ --- -redirect_to: '../product_analytics/event_dictionary.md' +redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/' --- -This document was moved to [another location](../product_analytics/event_dictionary.md). +This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/). <!-- This redirect file can be deleted after February 1, 2021. --> <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/telemetry/index.md b/doc/development/telemetry/index.md index 24e83ffc524..b3b3b0b4fdd 100644 --- a/doc/development/telemetry/index.md +++ b/doc/development/telemetry/index.md @@ -1,8 +1,8 @@ --- -redirect_to: '../product_analytics/index.md' +redirect_to: 'https://about.gitlab.com/handbook/product/product-intelligence-guide/' --- -This document was moved to [another location](../product_analytics/index.md). +This document was moved to [another location](https://about.gitlab.com/handbook/product/product-intelligence-guide/). <!-- This redirect file can be deleted after February 1, 2021. 
--> <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/telemetry/snowplow.md b/doc/development/telemetry/snowplow.md index 7cd385be681..bb056ffddfe 100644 --- a/doc/development/telemetry/snowplow.md +++ b/doc/development/telemetry/snowplow.md @@ -1,8 +1,8 @@ --- -redirect_to: '../product_analytics/snowplow.md' +redirect_to: '../snowplow.md' --- -This document was moved to [another location](../product_analytics/snowplow.md). +This document was moved to [another location](../snowplow.md). <!-- This redirect file can be deleted after February 1, 2021. --> <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/telemetry/usage_ping.md b/doc/development/telemetry/usage_ping.md index c890353fe3b..5fbdb508bb1 100644 --- a/doc/development/telemetry/usage_ping.md +++ b/doc/development/telemetry/usage_ping.md @@ -1,8 +1,8 @@ --- -redirect_to: '../product_analytics/usage_ping.md' +redirect_to: '../usage_ping.md' --- -This document was moved to [another location](../product_analytics/usage_ping.md). +This document was moved to [another location](../usage_ping.md). <!-- This redirect file can be deleted after February 1, 2021. --> <!-- Before deletion, see: https://docs.gitlab.com/ee/development/documentation/#move-or-rename-a-page --> diff --git a/doc/development/testing_guide/best_practices.md b/doc/development/testing_guide/best_practices.md index d1b7883451f..ac5f1a47f9b 100644 --- a/doc/development/testing_guide/best_practices.md +++ b/doc/development/testing_guide/best_practices.md @@ -842,6 +842,41 @@ Example: expect(response).to have_gitlab_http_status(:ok) ``` +#### `match_schema` and `match_response_schema` + +The `match_schema` matcher allows validating that the subject matches a +[JSON schema](https://json-schema.org/). The item inside `expect` can be +a JSON string or a JSON-compatible data structure. 
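As a rough illustration of the "JSON string or JSON-compatible data structure" wording above, the following sketch normalizes both kinds of subject into the same Ruby structure, which is the form a schema comparison would typically operate on. `normalize_json_subject` is a hypothetical helper written only for this example; it is an assumption about the general idea, not the matcher's actual implementation.

```ruby
require 'json'

# Hypothetical helper for illustration only; not part of GitLab's spec support code.
# Accepts either a JSON string or an already-parsed, JSON-compatible structure
# and returns a plain Ruby structure with string keys.
def normalize_json_subject(subject)
  return JSON.parse(subject) if subject.is_a?(String)

  # Round-trip through JSON so that symbol keys become strings, as in parsed JSON.
  JSON.parse(JSON.generate(subject))
end

from_string = normalize_json_subject('{"id": 1, "labels": ["bug"]}')
from_hash   = normalize_json_subject({ id: 1, labels: ['bug'] })
```

Under this assumption, both calls yield the same string-keyed hash, so a single schema check could be applied regardless of which form the spec passes to `expect`.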
+ +`match_response_schema` is a convenience matcher for use with a +response object from a [request +spec](testing_levels.md#integration-tests). + +Examples: + +```ruby +# Matches against spec/fixtures/api/schemas/prometheus/additional_metrics_query_result.json +expect(data).to match_schema('prometheus/additional_metrics_query_result') + +# Matches against ee/spec/fixtures/api/schemas/board.json +expect(data).to match_schema('board', dir: 'ee') + +# Matches against a schema made up of Ruby data structures +expect(data).to match_schema(Atlassian::Schemata.build_info) +``` + +#### `be_valid_json` + +`be_valid_json` allows validating that a string parses as JSON and gives +a non-empty result. To combine it with the schema matching above, use +`and`: + +```ruby +expect(json_string).to be_valid_json + +expect(json_string).to be_valid_json.and match_schema(schema) +``` + ### Testing query performance Testing query performance allows us to: diff --git a/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md b/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md index 8a49c333f9f..cd429a74a2a 100644 --- a/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md +++ b/doc/development/testing_guide/end_to_end/running_tests_that_require_special_setup.md @@ -403,3 +403,85 @@ Geo requires an EE license. To visit the Geo sites in your browser, you need a r - You can find the full image address from a pipeline by [following these instructions](https://about.gitlab.com/handbook/engineering/quality/guidelines/tips-and-tricks/#running-gitlab-qa-pipeline-against-a-specific-gitlab-release). You might be prompted to set the `GITLAB_QA_ACCESS_TOKEN` variable if you specify the full image address. - You can increase the wait time for replication by setting `GEO_MAX_FILE_REPLICATION_TIME` and `GEO_MAX_DB_REPLICATION_TIME`. The default is 120 seconds. 
- To save time during tests, create a Personal Access Token with API access on the Geo primary node, and pass that value in as `GITLAB_QA_ACCESS_TOKEN` and `GITLAB_QA_ADMIN_ACCESS_TOKEN`. + +## LDAP Tests + +Tests that are tagged with `:ldap_tls` and `:ldap_no_tls` meta are orchestrated tests where the sign-in happens via LDAP. + +These tests spin up a Docker container [(osixia/openldap)](https://hub.docker.com/r/osixia/openldap) running an instance of [OpenLDAP](https://www.openldap.org/). +The container uses fixtures [checked into the GitLab-QA repo](https://gitlab.com/gitlab-org/gitlab-qa/-/tree/9ffb9ad3be847a9054967d792d6772a74220fb42/fixtures/ldap) to create +base data such as users and groups including the admin group. The password for [all users](https://gitlab.com/gitlab-org/gitlab-qa/-/blob/9ffb9ad3be847a9054967d792d6772a74220fb42/fixtures/ldap/2_add_users.ldif) including [the `tanuki` user](https://gitlab.com/gitlab-org/gitlab-qa/-/blob/9ffb9ad3be847a9054967d792d6772a74220fb42/fixtures/ldap/tanuki.ldif) is `password`. + +A GitLab instance is also created in a Docker container based on our [General LDAP setup](../../../administration/auth/ldap/index.md#general-ldap-setup) documentation. + +Tests that are tagged `:ldap_tls` enable TLS on GitLab using the certificate [checked into the GitLab-QA repo](https://gitlab.com/gitlab-org/gitlab-qa/-/tree/9ffb9ad3be847a9054967d792d6772a74220fb42/tls_certificates/gitlab). + +The certificate was generated with openssl using this command: + +```shell +openssl req -x509 -newkey rsa:4096 -keyout gitlab.test.key -out gitlab.test.crt -days 3650 -nodes -subj "/C=US/ST=CA/L=San Francisco/O=GitLab/OU=Org/CN=gitlab.test" +``` + +The OpenLDAP container also uses its [auto-generated TLS certificates](https://github.com/osixia/docker-openldap#use-auto-generated-certificate). + +### Running LDAP tests with TLS enabled + +To run the LDAP tests on your local with TLS enabled, follow these steps: + +1. 
Include the following entry in your `/etc/hosts` file: + + `127.0.0.1 gitlab.test` + + You can then run tests against GitLab in a Docker container on `https://gitlab.test`. Please note that the TLS certificate [checked into the GitLab-QA repo](https://gitlab.com/gitlab-org/gitlab-qa/-/tree/9ffb9ad3be847a9054967d792d6772a74220fb42/tls_certificates/gitlab) is configured for this domain. +1. Run the OpenLDAP container with TLS enabled. Change the path to [`gitlab-qa/fixtures/ldap`](https://gitlab.com/gitlab-org/gitlab-qa/-/tree/9ffb9ad3be847a9054967d792d6772a74220fb42/fixtures/ldap) directory to your local checkout path: + + ```shell + docker network create test && docker run --name ldap-server --net test --hostname ldap-server.test --volume /path/to/gitlab-qa/fixtures/ldap:/container/service/slapd/assets/config/bootstrap/ldif/custom:Z --env LDAP_TLS_CRT_FILENAME="ldap-server.test.crt" --env LDAP_TLS_KEY_FILENAME="ldap-server.test.key" --env LDAP_TLS_ENFORCE="true" --env LDAP_TLS_VERIFY_CLIENT="never" osixia/openldap:latest --copy-service + ``` + +1. Run the GitLab container with TLS enabled. 
Change the path to [`gitlab-qa/tls_certificates/gitlab`](https://gitlab.com/gitlab-org/gitlab-qa/-/tree/9ffb9ad3be847a9054967d792d6772a74220fb42/tls_certificates/gitlab) directory to your local checkout path: + + ```shell + sudo docker run \ + --hostname gitlab.test \ + --net test \ + --publish 443:443 --publish 80:80 --publish 22:22 \ + --name gitlab \ + --volume /path/to/gitlab-qa/tls_certificates/gitlab:/etc/gitlab/ssl \ + --env GITLAB_OMNIBUS_CONFIG="gitlab_rails['ldap_enabled'] = true; gitlab_rails['ldap_servers'] = {\"main\"=>{\"label\"=>\"LDAP\", \"host\"=>\"ldap-server.test\", \"port\"=>636, \"uid\"=>\"uid\", \"bind_dn\"=>\"cn=admin,dc=example,dc=org\", \"password\"=>\"admin\", \"encryption\"=>\"simple_tls\", \"verify_certificates\"=>false, \"base\"=>\"dc=example,dc=org\", \"user_filter\"=>\"\", \"group_base\"=>\"ou=Global Groups,dc=example,dc=org\", \"admin_group\"=>\"AdminGroup\", \"external_groups\"=>\"\", \"sync_ssh_keys\"=>false}}; letsencrypt['enable'] = false; external_url 'https://gitlab.test'; gitlab_rails['ldap_sync_worker_cron'] = '* * * * *'; gitlab_rails['ldap_group_sync_worker_cron'] = '* * * * *'; " \ + gitlab/gitlab-ee:latest + ``` + +1. Run an LDAP test from [`gitlab/qa`](https://gitlab.com/gitlab-org/gitlab/-/tree/d5447ebb5f99d4c72780681ddf4dc25b0738acba/qa) directory: + + ```shell + GITLAB_LDAP_USERNAME="tanuki" GITLAB_LDAP_PASSWORD="password" QA_DEBUG=true CHROME_HEADLESS=false bin/qa Test::Instance::All https://gitlab.test qa/specs/features/browser_ui/1_manage/login/log_into_gitlab_via_ldap_spec.rb + ``` + +### Running LDAP tests with TLS disabled + +To run the LDAP tests on your local with TLS disabled, follow these steps: + +1. Run OpenLDAP container with TLS disabled. 
Change the path to [`gitlab-qa/fixtures/ldap`](https://gitlab.com/gitlab-org/gitlab-qa/-/tree/9ffb9ad3be847a9054967d792d6772a74220fb42/fixtures/ldap) directory to your local checkout path: + + ```shell + docker network create test && docker run --net test --publish 389:389 --publish 636:636 --name ldap-server --hostname ldap-server.test --volume /path/to/gitlab-qa/fixtures/ldap:/container/service/slapd/assets/config/bootstrap/ldif/custom:Z --env LDAP_TLS="false" osixia/openldap:latest --copy-service + ``` + +1. Run the GitLab container: + + ```shell + sudo docker run \ + --hostname localhost \ + --net test \ + --publish 443:443 --publish 80:80 --publish 22:22 \ + --name gitlab \ + --env GITLAB_OMNIBUS_CONFIG="gitlab_rails['ldap_enabled'] = true; gitlab_rails['ldap_servers'] = {\"main\"=>{\"label\"=>\"LDAP\", \"host\"=>\"ldap-server.test\", \"port\"=>389, \"uid\"=>\"uid\", \"bind_dn\"=>\"cn=admin,dc=example,dc=org\", \"password\"=>\"admin\", \"encryption\"=>\"plain\", \"verify_certificates\"=>false, \"base\"=>\"dc=example,dc=org\", \"user_filter\"=>\"\", \"group_base\"=>\"ou=Global Groups,dc=example,dc=org\", \"admin_group\"=>\"AdminGroup\", \"external_groups\"=>\"\", \"sync_ssh_keys\"=>false}}; gitlab_rails['ldap_sync_worker_cron'] = '* * * * *'; gitlab_rails['ldap_group_sync_worker_cron'] = '* * * * *'; " \ + gitlab/gitlab-ee:latest + ``` + +1. 
Run an LDAP test from [`gitlab/qa`](https://gitlab.com/gitlab-org/gitlab/-/tree/d5447ebb5f99d4c72780681ddf4dc25b0738acba/qa) directory: + + ```shell + GITLAB_LDAP_USERNAME="tanuki" GITLAB_LDAP_PASSWORD="password" QA_DEBUG=true CHROME_HEADLESS=false bin/qa Test::Instance::All http://localhost qa/specs/features/browser_ui/1_manage/login/log_into_gitlab_via_ldap_spec.rb + ``` diff --git a/doc/development/testing_guide/frontend_testing.md b/doc/development/testing_guide/frontend_testing.md index d83d58d14dd..94bc80abcdb 100644 --- a/doc/development/testing_guide/frontend_testing.md +++ b/doc/development/testing_guide/frontend_testing.md @@ -89,7 +89,7 @@ If your test exceeds that time, it fails. If you cannot improve the performance of the tests, you can increase the timeout for a specific test using -[`setTestTimeout`](https://gitlab.com/gitlab-org/gitlab/blob/master/spec/frontend/helpers/timeout.js). +[`setTestTimeout`](https://gitlab.com/gitlab-org/gitlab/blob/master/spec/frontend/__helpers__/timeout.js). ```javascript import { setTestTimeout } from 'helpers/timeout'; @@ -834,12 +834,50 @@ The `response` variable gets automatically set if the test is marked as `type: : When creating a new fixture, it often makes sense to take a look at the corresponding tests for the endpoint in `(ee/)spec/controllers/` or `(ee/)spec/requests/`. +##### GraphQL query fixtures + +You can create a fixture that represents the result of a GraphQL query using the `get_graphql_query_as_string` +helper method. 
For example: + +```ruby +# spec/frontend/fixtures/releases.rb + +describe GraphQL::Query, type: :request do + include GraphqlHelpers + + all_releases_query_path = 'releases/queries/all_releases.query.graphql' + fragment_paths = ['releases/queries/release.fragment.graphql'] + + before(:all) do + clean_frontend_fixtures('graphql/releases/') + end + + it "graphql/#{all_releases_query_path}.json" do + query = get_graphql_query_as_string(all_releases_query_path, fragment_paths) + + post_graphql(query, current_user: admin, variables: { fullPath: project.full_path }) + + expect_graphql_errors_to_be_empty + end +end +``` + +This will create a new fixture located at +`tmp/tests/frontend/fixtures-ee/graphql/releases/queries/all_releases.query.graphql.json`. + +Note that you will need to provide the paths to all fragments used by the query. +`get_graphql_query_as_string` reads all of the provided file paths and returns +the result as a single, concatenated string. + +You can import the JSON fixture in a Jest test using the `getJSONFixture` method +[as described below](#use-fixtures). + ### Use fixtures Jest and Karma test suites import fixtures in different ways: - The Karma test suite are served by [jasmine-jquery](https://github.com/velesin/jasmine-jquery). -- Jest use `spec/frontend/helpers/fixtures.js`. +- Jest use `spec/frontend/__helpers__/fixtures.js`. The following are examples of tests that work for both Karma and Jest: @@ -1024,6 +1062,9 @@ See also [Notes on testing Vue components](../fe_guide/vue.md#testing-vue-compon ## Test helpers +Test helpers can be found in [`spec/frontend/__helpers__`](https://gitlab.com/gitlab-org/gitlab/blob/master/spec/frontend/__helpers__). +If you introduce new helpers, please place them in that directory. 
+ ### Vuex Helper: `testAction` We have a helper available to make testing actions easier, as per [official documentation](https://vuex.vuejs.org/guide/testing.html): @@ -1065,7 +1106,7 @@ By doing so, the `wrapper` provides you with the ability to perform a `findByTes which is a shortcut to the more verbose `wrapper.find('[data-testid="my-test-id"]');` ```javascript -import { extendedWrapper } from 'jest/helpers/vue_test_utils_helper'; +import { extendedWrapper } from 'helpers/vue_test_utils_helper'; describe('FooComponent', () => { const wrapper = extendedWrapper(shallowMount({ diff --git a/doc/development/testing_guide/testing_levels.md b/doc/development/testing_guide/testing_levels.md index 14d4ee82f75..abacb9a0c87 100644 --- a/doc/development/testing_guide/testing_levels.md +++ b/doc/development/testing_guide/testing_levels.md @@ -230,7 +230,7 @@ They're useful to test permissions, redirections, what view is rendered etc. | Code path | Tests path | Testing engine | Notes | | --------- | ---------- | -------------- | ----- | -| `app/controllers/` | `spec/controllers/` | RSpec | For N+1 tests, use [request specs](../query_recorder.md#use-request-specs-instead-of-controller-specs) | +| `app/controllers/` | `spec/requests/`, `spec/controllers` | RSpec | Request specs are preferred over legacy controller specs. | | `app/mailers/` | `spec/mailers/` | RSpec | | | `lib/api/` | `spec/requests/api/` | RSpec | | | `app/assets/javascripts/` | `spec/javascripts/`, `spec/frontend/` | Karma & Jest | [More details below](#frontend-integration-tests) | @@ -310,6 +310,8 @@ graph RL ### About controller tests +GitLab is [transitioning from controller specs to request specs](https://gitlab.com/groups/gitlab-org/-/epics/5076). + In an ideal world, controllers should be thin. However, when this is not the case, it's acceptable to write a system or feature test without JavaScript instead of a controller test. 
Testing a fat controller usually involves a lot of stubbing, such as: @@ -318,7 +320,7 @@ of a controller test. Testing a fat controller usually involves a lot of stubbin controller.instance_variable_set(:@user, user) ``` -and use methods which are deprecated in Rails 5 ([#23768](https://gitlab.com/gitlab-org/gitlab/-/issues/16260)). +and use methods [deprecated in Rails 5](https://gitlab.com/gitlab-org/gitlab/-/issues/16260). ### About Karma diff --git a/doc/development/usage_ping.md b/doc/development/usage_ping.md new file mode 100644 index 00000000000..10c3de2f0a1 --- /dev/null +++ b/doc/development/usage_ping.md @@ -0,0 +1,1151 @@ +--- +stage: Growth +group: Product Intelligence +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Usage Ping Guide + +> - Introduced in GitLab Enterprise Edition 8.10. +> - More statistics were added in GitLab Enterprise Edition 8.12. +> - Moved to GitLab Core in 9.1. +> - More statistics were added in GitLab Ultimate 11.2. + +This guide describes Usage Ping's purpose and how it's implemented. + +For more information about Product Intelligence, see: + +- [Product Intelligence Guide](https://about.gitlab.com/handbook/product/product-intelligence-guide/) +- [Snowplow Guide](snowplow.md) + +More useful links: + +- [Product Intelligence Direction](https://about.gitlab.com/direction/product-intelligence/) +- [Data Analysis Process](https://about.gitlab.com/handbook/business-ops/data-team/#data-analysis-process/) +- [Data for Product Managers](https://about.gitlab.com/handbook/business-ops/data-team/programs/data-for-product-managers/) +- [Data Infrastructure](https://about.gitlab.com/handbook/business-ops/data-team/platform/infrastructure/) + +## What is Usage Ping? + +- GitLab sends a weekly payload containing usage data to GitLab Inc. 
Usage Ping provides high-level data to help our product, support, and sales teams. It does not send any project names, usernames, or any other specific data. The information from the usage ping is not anonymous; it is linked to the hostname of the instance. Sending usage ping is optional, and any instance can disable analytics. +- The usage data is primarily composed of row counts for different tables in the instance’s database. By comparing these counts month over month (or week over week), we can get a rough sense for how an instance is using the different features within the product. In addition to counts, other facts + that help us classify and understand GitLab installations are collected. +- Usage ping is important to GitLab as we use it to calculate our Stage Monthly Active Users (SMAU) which helps us measure the success of our stages and features. +- While usage ping is enabled, GitLab gathers data from the other instances and can show usage statistics of your instance to your users. + +### Why should we enable Usage Ping? + +- The main purpose of Usage Ping is to build a better GitLab. Data about how GitLab is used is collected to better understand feature/stage adoption and usage. This helps us understand how GitLab is adding value and why people use GitLab, and with this knowledge we're able to make better product decisions. +- As a benefit of having the usage ping active, GitLab lets you analyze user activity in your GitLab installation over time. +- As a benefit of having the usage ping active, GitLab provides you with the DevOps Report, which gives you an overview of your entire instance’s adoption of Concurrent DevOps from planning to monitoring. +- You get better, more proactive support (assuming that our TAMs and support organization use the data to deliver more value). +- You get insight and advice into how to get the most value out of your investment in GitLab. 
Wouldn't you want to know that a number of features or values are not being adopted in your organization? +- You get a report that illustrates how you compare against other similar organizations (anonymized), with specific advice and recommendations on how to improve your DevOps processes. +- Usage Ping is enabled by default. To disable it, see [Disable Usage Ping](#disable-usage-ping). + +### Limitations + +- Usage Ping does not track frontend events such as page views, link clicks, or user sessions, and only focuses on aggregated backend events. +- Because of these limitations, we recommend instrumenting your products with Snowplow for more detailed analytics on GitLab.com and using Usage Ping to track aggregated backend events on self-managed instances. + +## Usage Ping payload + +You can view the exact JSON payload sent to GitLab Inc. in the administration panel. To view the payload: + +1. Navigate to **Admin Area > Settings > Metrics and profiling**. +1. Expand the **Usage statistics** section. +1. Click the **Preview payload** button. + +For an example payload, see [Example Usage Ping payload](#example-usage-ping-payload). + +## Disable Usage Ping + +To disable Usage Ping in the GitLab UI, go to the **Settings** page of your administration panel and uncheck the **Usage Ping** checkbox. + +To disable Usage Ping and prevent it from being configured in the future through the administration panel, Omnibus installs can set the following in [`gitlab.rb`](https://docs.gitlab.com/omnibus/settings/configuration.html#configuration-options): + +```ruby +gitlab_rails['usage_ping_enabled'] = false +``` + +Source installations can set the following in `gitlab.yml`: + +```yaml +production: &base + # ... + gitlab: + # ... 
+ usage_ping_enabled: false +``` + +## Usage Ping request flow + +The following example shows a basic request/response flow between a GitLab instance, the Versions Application, the License Application, Salesforce, the GitLab S3 Bucket, the GitLab Snowflake Data Warehouse, and Sisense: + +```mermaid +sequenceDiagram + participant GitLab Instance + participant Versions Application + participant Licenses Application + participant Salesforce + participant S3 Bucket + participant Snowflake DW + participant Sisense Dashboards + GitLab Instance->>Versions Application: Send Usage Ping + loop Process usage data + Versions Application->>Versions Application: Parse usage data + Versions Application->>Versions Application: Write to database + Versions Application->>Versions Application: Update license ping time + end + loop Process data for Salesforce + Versions Application-xLicenses Application: Request Zuora subscription id + Licenses Application-xVersions Application: Zuora subscription id + Versions Application-xSalesforce: Request Zuora account id by Zuora subscription id + Salesforce-xVersions Application: Zuora account id + Versions Application-xSalesforce: Usage data for the Zuora account + end + Versions Application->>S3 Bucket: Export Versions database + S3 Bucket->>Snowflake DW: Import data + Snowflake DW->>Snowflake DW: Transform data using dbt + Snowflake DW->>Sisense Dashboards: Data available for querying + Versions Application->>GitLab Instance: DevOps Report (Conversational Development Index) +``` + +## How Usage Ping works + +1. The Usage Ping [cron job](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/workers/gitlab_usage_ping_worker.rb#L30) is set in Sidekiq to run weekly. +1. When the cron job runs, it calls [`Gitlab::UsageData.to_json`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/submit_usage_ping_service.rb#L22). +1. 
`Gitlab::UsageData.to_json` [cascades down](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb#L22) to ~400+ other counter method calls. +1. The responses of all method calls are [merged together](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data.rb#L14) into a single JSON payload in `Gitlab::UsageData.to_json`. +1. The JSON payload is then [posted to the Versions application](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/submit_usage_ping_service.rb#L20). + If a firewall exception is needed, the required URL depends on several things. If + the hostname is `version.gitlab.com`, the protocol is `TCP`, and the port number is `443`, + the required URL is <https://version.gitlab.com/>. + +## Implementing Usage Ping + +Usage Ping consists of two kinds of data: counters and observations. Counters track how often a certain event +happened over time, such as how many CI pipelines have run. They are monotonic and always trend up. +Observations are facts collected from one or more GitLab instances and can carry arbitrary data. There are no +general guidelines around how to collect those, due to the individual nature of that data. + +There are several types of counters which are all found in `usage_data.rb`: + +- **Ordinary Batch Counters:** Simple count of a given ActiveRecord_Relation +- **Distinct Batch Counters:** Distinct count of a given ActiveRecord_Relation in a given column +- **Sum Batch Counters:** Sum the values of a given ActiveRecord_Relation in a given column +- **Alternative Counters:** Used for settings and configurations +- **Redis Counters:** Used for in-memory counts. + +NOTE: +Only use the provided counter methods. Each counter method contains a built-in fail-safe that isolates the counter, so that a failure in one counter does not break the entire Usage Ping. 
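
The fail-safe idea in the note above can be illustrated with a minimal, self-contained sketch. This is not GitLab's implementation: `safe_counter` and `FALLBACK` are hypothetical names chosen for illustration, and the `-1` fallback is assumed here only because later sections of this guide mention `-1` as the value returned on errors.

```ruby
# Hypothetical sketch: the names below are illustrative, not GitLab's API.
FALLBACK = -1

# Runs the given block and returns its value. Any StandardError is rescued
# and replaced by the fallback, so one failing counter cannot abort the
# construction of the whole payload.
def safe_counter(fallback: FALLBACK)
  yield
rescue StandardError
  fallback
end

payload = {
  healthy_counter: safe_counter { [1, 2, 3].size },
  broken_counter: safe_counter { raise ArgumentError, 'query failed' }
}
```

With this shape, a failing counter shows up in the payload as a sentinel value rather than preventing the other counters from being reported.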
+ +### Why batch counting + +For large tables, PostgreSQL can take a long time to count rows due to MVCC [(Multi-version Concurrency Control)](https://en.wikipedia.org/wiki/Multiversion_concurrency_control). Batch counting is a counting method where a single large query is broken into multiple smaller queries. For example, instead of a single query querying 1,000,000 records, with batch counting, you can execute 100 queries of 10,000 records each. Batch counting is useful for avoiding database timeouts as each batch query is significantly shorter than one single long running query. + +For GitLab.com, there are extremely large tables with 15 second query timeouts, so we use batch counting to avoid encountering timeouts. Here are the sizes of some GitLab.com tables: + +| Table | Row counts in millions | +|------------------------------|------------------------| +| `merge_request_diff_commits` | 2280 | +| `ci_build_trace_sections` | 1764 | +| `merge_request_diff_files` | 1082 | +| `events` | 514 | + +We have several batch counting methods available: + +- `Ordinary Batch Counters` +- `Distinct Batch Counters` +- `Sum Batch Counters` +- `Estimated Batch Counters` + +Batch counting requires indexes on columns to calculate max, min, and range queries. In some cases, +you may need to add a specialized index on the columns involved in a counter. + +### Ordinary Batch Counters + +Handles `ActiveRecord::StatementInvalid` error + +Simple count of a given ActiveRecord_Relation, does a non-distinct batch count, smartly reduces batch_size and handles errors. 
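
The batching idea described above can be sketched in plain Ruby, without a database. This is a simplified illustration rather than the code in `Gitlab::Database::BatchCounter`: `batch_count` is a hypothetical helper, and a plain array of ids stands in for a table. Each window of the loop corresponds to one short range query, which is also why the minimum and maximum of the counted column must be cheap to compute.

```ruby
# Hypothetical sketch of batch counting over an id range. A plain array of
# ids stands in for a database table.
def batch_count(ids, batch_size:)
  return 0 if ids.empty?

  start = ids.min   # in a database: SELECT MIN(id)
  finish = ids.max  # in a database: SELECT MAX(id)
  total = 0

  # Walk the id range in fixed-size windows. Each window stands in for one
  # short query such as `WHERE id >= low AND id < high`.
  low = start
  while low <= finish
    high = low + batch_size
    total += ids.count { |id| id >= low && id < high }
    low = high
  end

  total
end

# Ids with gaps, as found in a table after rows have been deleted.
ids = (1..25).to_a - [7, 13]
batch_total = batch_count(ids, batch_size: 10)
```

Because the windows are defined over the id range rather than over row positions, gaps left by deleted rows only make some batches smaller; they do not affect the total.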
+ +Method: `count(relation, column = nil, batch: true, start: nil, finish: nil)` + +Arguments: + +- `relation`: the ActiveRecord_Relation to perform the count on +- `column`: the column to perform the count on; defaults to the primary key +- `batch`: default `true` in order to use batch counting +- `start`: custom start of the batch counting in order to avoid complex min calculations +- `finish`: custom end of the batch counting in order to avoid complex max calculations + +Examples: + +```ruby +count(User.active) +count(::Clusters::Cluster.aws_installed.enabled, :cluster_id) +count(::Clusters::Cluster.aws_installed.enabled, :cluster_id, start: ::Clusters::Cluster.minimum(:id), finish: ::Clusters::Cluster.maximum(:id)) +``` + +### Distinct Batch Counters + +Handles `ActiveRecord::StatementInvalid` error + +Distinct count of a given ActiveRecord_Relation on given column, a distinct batch count, smartly reduces batch_size and handles errors. + +Method: `distinct_count(relation, column = nil, batch: true, batch_size: nil, start: nil, finish: nil)` + +Arguments: + +- `relation`: the ActiveRecord_Relation to perform the count on +- `column`: the column to perform the distinct count on; defaults to the primary key +- `batch`: default `true` in order to use batch counting +- `batch_size`: if none set it uses default value 10000 from `Gitlab::Database::BatchCounter` +- `start`: custom start of the batch counting in order to avoid complex min calculations +- `finish`: custom end of the batch counting in order to avoid complex max calculations + +WARNING: +Counting over non-unique columns can lead to performance issues. Take a look at the [iterating tables in batches](iterating_tables_in_batches.md) guide for more details. 
+ +Examples: + +```ruby +distinct_count(::Project, :creator_id) +distinct_count(::Note.with_suggestions.where(time_period), :author_id, start: ::User.minimum(:id), finish: ::User.maximum(:id)) +distinct_count(::Clusters::Applications::CertManager.where(time_period).available.joins(:cluster), 'clusters.user_id') +``` + +### Sum Batch Counters + +Handles `ActiveRecord::StatementInvalid` error + +Sum the values of a given ActiveRecord_Relation on given column and handles errors. + +Method: `sum(relation, column, batch_size: nil, start: nil, finish: nil)` + +Arguments: + +- `relation`: the ActiveRecord_Relation to perform the operation on +- `column`: the column to sum on +- `batch_size`: if none set it uses default value 1000 from `Gitlab::Database::BatchCounter` +- `start`: custom start of the batch counting in order to avoid complex min calculations +- `finish`: custom end of the batch counting in order to avoid complex max calculations + +Examples: + +```ruby +sum(JiraImportState.finished, :imported_issues_count) +``` + +### Grouping & Batch Operations + +The `count`, `distinct_count`, and `sum` batch counters can accept an `ActiveRecord::Relation` +object that groups by a specified column. With a grouped relation, the methods do batch counting, +handle errors, and return a hash table of key-value pairs. + +Examples: + +```ruby +count(Namespace.group(:type)) +# returns => {nil=>179, "Group"=>54} + +distinct_count(Project.group(:visibility_level), :creator_id) +# returns => {0=>1, 10=>1, 20=>11} + +sum(Issue.group(:state_id), :weight) +# returns => {1=>3542, 2=>6820} +``` + +### Estimated Batch Counters + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/48233) in GitLab 13.7. + +Estimated batch counter functionality handles `ActiveRecord::StatementInvalid` errors +when used through the provided `estimate_batch_distinct_count` method. +Errors return a value of `-1`. 
+ +WARNING: +This functionality estimates a distinct count of a specific ActiveRecord_Relation in a given column, +which uses the [HyperLogLog](http://algo.inria.fr/flajolet/Publications/FlFuGaMe07.pdf) algorithm. +As the HyperLogLog algorithm is probabilistic, the **results always include error**. +The highest encountered error rate is 4.9%. + +When correctly used, the `estimate_batch_distinct_count` method enables efficient counting over +columns that contain non-unique values, which can not be assured by other counters. + +Method: [`estimate_batch_distinct_count(relation, column = nil, batch_size: nil, start: nil, finish: nil)`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/utils/usage_data.rb#L63) + +The method includes the following arguments: + +- `relation`: The ActiveRecord_Relation to perform the count. +- `column`: The column to perform the distinct count. The default is the primary key. +- `batch_size`: The default is 10,000, from `Gitlab::Database::PostgresHll::BatchDistinctCounter::DEFAULT_BATCH_SIZE`. +- `start`: The custom start of the batch count, to avoid complex minimum calculations. +- `finish`: The custom end of the batch count in order to avoid complex maximum calculations. + +The method includes the following prerequisites: + +1. The supplied `relation` must include the primary key defined as the numeric column. + For example: `id bigint NOT NULL`. +1. The `estimate_batch_distinct_count` can handle a joined relation. To use its ability to + count non-unique columns, the joined relation **must NOT** have a one-to-many relationship, + such as `has_many :boards`. +1. Both `start` and `finish` arguments should always represent primary key relationship values, + even if the estimated count refers to another column, for example: + + ```ruby + estimate_batch_distinct_count(::Note, :author_id, start: ::Note.minimum(:id), finish: ::Note.maximum(:id)) + ``` + +Examples: + +1. 
Simple execution of estimated batch counter, with only relation provided, + returned value represents estimated number of unique values in `id` column + (which is the primary key) of `Project` relation: + + ```ruby + estimate_batch_distinct_count(::Project) + ``` + +1. Execution of estimated batch counter, where provided relation has applied + additional filter (`.where(time_period)`), number of unique values estimated + in custom column (`:author_id`), and parameters: `start` and `finish` together + apply boundaries that defines range of provided relation to analyze: + + ```ruby + estimate_batch_distinct_count(::Note.with_suggestions.where(time_period), :author_id, start: ::Note.minimum(:id), finish: ::Note.maximum(:id)) + ``` + +1. Execution of estimated batch counter with joined relation (`joins(:cluster)`), + for a custom column (`'clusters.user_id'`): + + ```ruby + estimate_batch_distinct_count(::Clusters::Applications::CertManager.where(time_period).available.joins(:cluster), 'clusters.user_id') + ``` + +When instrumenting metric with usage of estimated batch counter please add +`_estimated` suffix to its name, for example: + +```ruby + "counts": { + "ci_builds_estimated": estimate_batch_distinct_count(Ci::Build), + ... 
+``` + +### Redis Counters + +Handles `::Redis::CommandError` and `Gitlab::UsageDataCounters::BaseCounter::UnknownEvent` errors. +Returns `-1` when a block is passed, or a hash with all values set to `-1` when a counter from `Gitlab::UsageDataCounters` is passed. +The behavior differs because there are two different implementations of the Redis counter. + +Method: `redis_usage_data(counter, &block)` + +Arguments: + +- `counter`: a counter from `Gitlab::UsageDataCounters` that has the `fallback_totals` method implemented +- or a `block`: which is evaluated + +#### Ordinary Redis Counters + +Examples of implementation: + +- Using Redis methods [`INCR`](https://redis.io/commands/incr), [`GET`](https://redis.io/commands/get), and [`Gitlab::UsageDataCounters::WikiPageCounter`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/wiki_page_counter.rb) +- Using Redis methods [`HINCRBY`](https://redis.io/commands/hincrby), [`HGETALL`](https://redis.io/commands/hgetall), and [`Gitlab::UsageCounters::PodLogs`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_counters/pod_logs.rb) + +##### UsageData API Tracking + +<!-- There's nearly identical content in `##### Adding new events`. If you fix errors here, you may need to fix the same errors in the other location. --> + +1. Track event using `UsageData` API + + Increment event count using ordinary Redis counter, for given event name. + + Tracking events using the `UsageData` API requires the `usage_data_api` feature flag to be enabled, which is enabled by default. + + API requests are protected by checking for a valid CSRF token. + + To increment the values, the related feature `usage_data_<event_name>` should be enabled. 
+ + ```plaintext + POST /usage_data/increment_counter + ``` + + | Attribute | Type | Required | Description | + | :-------- | :--- | :------- | :---------- | + | `event` | string | yes | The name of the event to track | + + Response + + - `200` if event was tracked + - `400 Bad request` if event parameter is missing + - `401 Unauthorized` if user is not authenticated + - `403 Forbidden` if an invalid CSRF token is provided + +1. Track events using JavaScript/Vue API helper which calls the API above + + Note that `usage_data_api` and `usage_data_#{event_name}` should be enabled in order to be able to track events + + ```javascript + import api from '~/api'; + + api.trackRedisCounterEvent('my_already_defined_event_name'); + ``` + +#### Redis HLL Counters + +WARNING: +HyperLogLog (HLL) is a probabilistic algorithm and its **results always include some small error**. According to [Redis documentation](https://redis.io/commands/pfcount), data from +the HLL implementation used is "approximated with a standard error of 0.81%". + +`Gitlab::UsageDataCounters::HLLRedisCounter` provides data structures used to count unique values. + +Implemented using Redis methods [PFADD](https://redis.io/commands/pfadd) and [PFCOUNT](https://redis.io/commands/pfcount). + +##### Adding new events + +1. Define events in [`known_events`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/). + + Example event: + + ```yaml + - name: i_compliance_credential_inventory + category: compliance + redis_slot: compliance + expiry: 42 # 6 weeks + aggregation: weekly + ``` + + Keys: + + - `name`: unique event name. + + Name format `<prefix>_<redis_slot>_name`. + + Use one of the following prefixes for the event's name: + + - `g_` for group, as an event which is tracked for group. + - `p_` for project, as an event which is tracked for project. + - `i_` for instance, as an event which is tracked for instance. 
+ - `a_` for events encompassing all `g_`, `p_`, `i_`. + - `o_` for other. + + Consider including in the event's name the Redis slot in order to be able to count totals for a specific category. + + Example names: `i_compliance_credential_inventory`, `g_analytics_contribution`. + + - `category`: event category. Used for getting total counts for events in a category, for easier + access to a group of events. + - `redis_slot`: optional Redis slot; default value: event name. Used if needed to calculate totals + for a group of metrics. Ensure keys are in the same slot. For example: + `i_compliance_credential_inventory` with `redis_slot: 'compliance'` builds Redis key + `i_{compliance}_credential_inventory-2020-34`. If `redis_slot` is not defined the Redis key will + be `{i_compliance_credential_inventory}-2020-34`. + - `expiry`: expiry time in days. Default: 29 days for daily aggregation and 6 weeks for weekly + aggregation. + - `aggregation`: may be set to a `:daily` or `:weekly` key. Defines how counting data is stored in Redis. + Aggregation on a `daily` basis does not pull more fine grained data. + - `feature_flag`: optional. For details, see our [GitLab internal Feature flags](feature_flags/) documentation. + +1. Track event in controller using `RedisTracking` module with `track_redis_hll_event(*controller_actions, name:, feature:, feature_default_enabled: false)`. + + Arguments: + + - `controller_actions`: controller actions we want to track. + - `name`: event name. + - `feature`: feature name, all metrics we track should be under feature flag. + - `feature_default_enabled`: feature flag is disabled by default, set to `true` for it to be enabled by default. 
+ + Example usage: + + ```ruby + # controller + class ProjectsController < Projects::ApplicationController + include RedisTracking + + skip_before_action :authenticate_user!, only: :show + track_redis_hll_event :index, :show, name: 'g_compliance_example_feature_visitors', feature: :compliance_example_feature, feature_default_enabled: true + + def index + render html: 'index' + end + + def new + render html: 'new' + end + + def show + render html: 'show' + end + end + ``` + +1. Track event in API using `increment_unique_values(event_name, values)` helper method. + + To track the event, Usage Ping must be enabled and the event feature `usage_data_<event_name>` must be enabled. + + Arguments: + + - `event_name`: event name. + - `values`: values counted, one value or array of values. + + Example usage: + + ```ruby + get ':id/registry/repositories' do + repositories = ContainerRepositoriesFinder.new( + user: current_user, subject: user_group + ).execute + + increment_unique_values('i_list_repositories', current_user.id) + + present paginate(repositories), with: Entities::ContainerRegistry::Repository, tags: params[:tags], tags_count: params[:tags_count] + end + ``` + +1. Track event using `track_usage_event(event_name, values)` in services and GraphQL + + Increment unique values count using Redis HLL, for given event name. + + Example: + + [Track usage event for incident created in service](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/services/issues/update_service.rb) + + [Track usage event for incident created in GraphQL](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/graphql/mutations/alert_management/update_alert_status.rb) + + ```ruby + track_usage_event(:incident_management_incident_created, current_user.id) + ``` + +<!-- There's nearly identical content in `##### UsageData API Tracking`. If you find / fix errors here, you may need to fix errors in that section too. --> + +1. 
Track event using `UsageData` API + + Increment unique users count using Redis HLL, for given event name. + + Tracking events using the `UsageData` API requires the `usage_data_api` feature flag to be enabled, which is enabled by default. + + API requests are protected by checking for a valid CSRF token. + + In order to increment the values, the related feature `usage_data_<event_name>` should be + set to `default_enabled: true`. For more information, see + [Feature flags in development of GitLab](feature_flags/index.md). + + ```plaintext + POST /usage_data/increment_unique_users + ``` + + | Attribute | Type | Required | Description | + | :-------- | :--- | :------- | :---------- | + | `event` | string | yes | The name of the event to track | + + Response + + Returns `200` even if tracking failed for any reason. + + - `200` if the event was tracked, or if any error occurred during tracking + - `400 Bad request` if event parameter is missing + - `401 Unauthorized` if user is not authenticated + - `403 Forbidden` if an invalid CSRF token is provided + +1. Track events using JavaScript/Vue API helper which calls the API above + + Example usage for an existing event already defined in [known events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/): + + Usage Data API is behind `usage_data_api` feature flag which, as of GitLab 13.7, is + now set to `default_enabled: true`. + + Each event tracked using Usage Data API is behind a feature flag `usage_data_#{event_name}` which should be `default_enabled: true` + + ```javascript + import api from '~/api'; + + api.trackRedisHllUserEvent('my_already_defined_event_name'); + ``` + +1. Track event using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event(event_name, values:)`. + + Arguments: + + - `event_name`: event name. + - `values`: One value or array of values we count. For example: user_id, visitor_id, user_ids. + +1. 
Track event on context level using base module `Gitlab::UsageDataCounters::HLLRedisCounter.track_event_in_context(event_name, values:, context:)`. + + Arguments: + + - `event_name`: event name. + - `values`: values we count. For example: user_id, visitor_id. + - `context`: context value. Allowed values are `default`, `free`, `bronze`, `silver`, `gold`, `starter`, `premium`, `ultimate` + +1. Get event data using `Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names:, start_date:, end_date:, context: '')`. + + Arguments: + + - `event_names`: the list of event names. + - `start_date`: start date of the period for which we want to get event data. + - `end_date`: end date of the period for which we want to get event data. + - `context`: context of the event. Allowed values are `default`, `free`, `bronze`, `silver`, `gold`, `starter`, `premium`, `ultimate`. + +1. Testing tracking and getting unique events + +Trigger events in rails console by using `track_event` method + + ```ruby + Gitlab::UsageDataCounters::HLLRedisCounter.track_event('g_compliance_audit_events', values: 1) + Gitlab::UsageDataCounters::HLLRedisCounter.track_event('g_compliance_audit_events', values: [2, 3]) + ``` + +Next, get the unique events for the current week. + + ```ruby + # Get unique events for metric for current_week + Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'g_compliance_audit_events', + start_date: Date.current.beginning_of_week, end_date: Date.current.end_of_week) + ``` + +##### Recommendations + +We have the following recommendations for [Adding new events](#adding-new-events): + +- Event aggregation: weekly. +- Key expiry time: + - Daily: 29 days. + - Weekly: 42 days. +- When adding new metrics, use a [feature flag](../operations/feature_flags.md) to control the impact. 
+- For feature flags triggered by another service, set `default_enabled: false`, + - Events can be triggered using the `UsageData` API, which helps when there are > 10 events per change + +##### Enable/Disable Redis HLL tracking + +Events are tracked behind [feature flags](feature_flags/index.md) due to concerns for Redis performance and scalability. + +For a full list of events and corresponding feature flags see, [known_events](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/) files. + +To enable or disable tracking for specific event within <https://gitlab.com> or <https://about.staging.gitlab.com>, run commands such as the following to +[enable or disable the corresponding feature](feature_flags/index.md). + +```shell +/chatops run feature set <feature_name> true +/chatops run feature set <feature_name> false +``` + +##### Known events are added automatically in usage data payload + +All events added in [`known_events/common.yml`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/common.yml) are automatically added to usage data generation under the `redis_hll_counters` key. This column is stored in [version-app as a JSON](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L209). +For each event we add metrics for the weekly and monthly time frames, and totals for each where applicable: + +- `#{event_name}_weekly`: Data for 7 days for daily [aggregation](#adding-new-events) events and data for the last complete week for weekly [aggregation](#adding-new-events) events. +- `#{event_name}_monthly`: Data for 28 days for daily [aggregation](#adding-new-events) events and data for the last 4 complete weeks for weekly [aggregation](#adding-new-events) events. + +Redis HLL implementation calculates automatic total metrics, if there are more than one metric for the same category, aggregation and Redis slot. 
+ +- `#{category}_total_unique_counts_weekly`: Total unique counts for events in the same category for the last 7 days or the last complete week, if events are in the same Redis slot and we have more than one metric. +- `#{category}_total_unique_counts_monthly`: Total unique counts for events in same category for the last 28 days or the last 4 complete weeks, if events are in the same Redis slot and we have more than one metric. + +Example of `redis_hll_counters` data: + +```ruby +{:redis_hll_counters=> + {"compliance"=> + {"g_compliance_dashboard_weekly"=>0, + "g_compliance_dashboard_monthly"=>0, + "g_compliance_audit_events_weekly"=>0, + "g_compliance_audit_events_monthly"=>0, + "compliance_total_unique_counts_weekly"=>0, + "compliance_total_unique_counts_monthly"=>0}, + "analytics"=> + {"g_analytics_contribution_weekly"=>0, + "g_analytics_contribution_monthly"=>0, + "g_analytics_insights_weekly"=>0, + "g_analytics_insights_monthly"=>0, + "analytics_total_unique_counts_weekly"=>0, + "analytics_total_unique_counts_monthly"=>0}, + "ide_edit"=> + {"g_edit_by_web_ide_weekly"=>0, + "g_edit_by_web_ide_monthly"=>0, + "g_edit_by_sfe_weekly"=>0, + "g_edit_by_sfe_monthly"=>0, + "ide_edit_total_unique_counts_weekly"=>0, + "ide_edit_total_unique_counts_monthly"=>0}, + "search"=> + {"i_search_total_weekly"=>0, "i_search_total_monthly"=>0, "i_search_advanced_weekly"=>0, "i_search_advanced_monthly"=>0, "i_search_paid_weekly"=>0, "i_search_paid_monthly"=>0, "search_total_unique_counts_weekly"=>0, "search_total_unique_counts_monthly"=>0}, + "source_code"=>{"wiki_action_weekly"=>0, "wiki_action_monthly"=>0} + } +``` + +Example usage: + +```ruby +# Redis Counters +redis_usage_data(Gitlab::UsageDataCounters::WikiPageCounter) +redis_usage_data { ::Gitlab::UsageCounters::PodLogs.usage_totals[:total] } + +# Define events in common.yml https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/known_events/common.yml + +# Tracking events 
+Gitlab::UsageDataCounters::HLLRedisCounter.track_event('expand_vulnerabilities', values: visitor_id) + +# Get unique events for metric +redis_usage_data { Gitlab::UsageDataCounters::HLLRedisCounter.unique_events(event_names: 'expand_vulnerabilities', start_date: 28.days.ago, end_date: Date.current) } +``` + +### Alternative Counters + +Handles `StandardError` and falls back to -1, so that not all measures fail if we encounter one exception. +Mainly used for settings and configurations. + +Method: `alt_usage_data(value = nil, fallback: -1, &block)` + +Arguments: + +- `value`: a simple static value, in which case the value is simply returned. +- or a `block`: which is evaluated. +- `fallback: -1`: the common value used for any metrics that are failing. + +Example of usage: + +```ruby +alt_usage_data { Gitlab::VERSION } +alt_usage_data { Gitlab::CurrentSettings.uuid } +alt_usage_data(999) +``` + +### Prometheus Queries + +In those cases where operational metrics should be part of Usage Ping, a database or Redis query is unlikely +to provide useful data. Instead, Prometheus might be more appropriate, since most GitLab architectural +components publish metrics to it that can be queried back, aggregated, and included as usage data. + +NOTE: +Prometheus as a data source for Usage Ping is currently only available for single-node Omnibus installations +that are running the [bundled Prometheus](../administration/monitoring/prometheus/index.md) instance. + +To query Prometheus for metrics, a helper method is available to `yield` a fully configured +`PrometheusClient`, given it is available as per the note above: + +```ruby +with_prometheus_client do |client| + response = client.query('<your query>') + ... +end +``` + +Please refer to [the `PrometheusClient` definition](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/prometheus_client.rb) +for how to use its API to query for data. + +## Developing and testing Usage Ping + +### 1. 
Naming and placing the metrics + +Add the metric in one of the top level keys + +- `license`: for license related metrics. +- `settings`: for settings related metrics. +- `counts_weekly`: for counters that have data for the most recent 7 days. +- `counts_monthly`: for counters that have data for the most recent 28 days. +- `counts`: for counters that have data for all time. + +### 2. Use your Rails console to manually test counters + +```ruby +# count +Gitlab::UsageData.count(User.active) +Gitlab::UsageData.count(::Clusters::Cluster.aws_installed.enabled, :cluster_id) + +# count distinct +Gitlab::UsageData.distinct_count(::Project, :creator_id) +Gitlab::UsageData.distinct_count(::Note.with_suggestions.where(time_period), :author_id, start: ::User.minimum(:id), finish: ::User.maximum(:id)) +``` + +### 3. Generate the SQL query + +Your Rails console returns the generated SQL queries. + +Example: + +```ruby +pry(main)> Gitlab::UsageData.count(User.active) + (2.6ms) SELECT "features"."key" FROM "features" + (15.3ms) SELECT MIN("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) + (2.4ms) SELECT MAX("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) + (1.9ms) SELECT COUNT("users"."id") FROM "users" WHERE ("users"."state" IN ('active')) AND ("users"."user_type" IS NULL OR "users"."user_type" IN (6, 4)) AND "users"."id" BETWEEN 1 AND 100000 +``` + +### 4. Optimize queries with #database-lab + +Paste the SQL query into `#database-lab` to see how the query performs at scale. + +- `#database-lab` is a Slack channel which uses a production-sized environment to test your queries. +- GitLab.com’s production database has a 15 second timeout. +- Any single query must stay below [1 second execution time](query_performance.md#timing-guidelines-for-queries) with cold caches. 
+- Add a specialized index on columns involved to reduce the execution time. + +In order to have an understanding of the query's execution we add in the MR description the following information: + +- For counters that have a `time_period` test we add information for both cases: + - `time_period = {}` for all time periods + - `time_period = { created_at: 28.days.ago..Time.current }` for last 28 days period +- Execution plan and query time before and after optimization +- Query generated for the index and time +- Migration output for up and down execution + +We also use `#database-lab` and [explain.depesz.com](https://explain.depesz.com/). For more details, see the [database review guide](database_review.md#preparation-when-adding-or-modifying-queries). + +#### Optimization recommendations and examples + +- Use specialized indexes [example 1](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26871), [example 2](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/26445). +- Use defined `start` and `finish`, and simple queries, because these values can be memoized and reused, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/37155). +- Avoid joins and write the queries as simply as possible, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/36316). +- Set a custom `batch_size` for `distinct_count`, [example](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/38000). + +### 5. Add the metric definition + +When adding, changing, or updating metrics, please update the [Event Dictionary's **Usage Ping** table](https://about.gitlab.com/handbook/product/product-intelligence-guide/#event-dictionary). + +### 6. Add new metric to Versions Application + +Check if new metrics need to be added to the Versions Application. 
See `usage_data` [schema](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/db/schema.rb#L147) and usage data [parameters accepted](https://gitlab.com/gitlab-services/version-gitlab-com/-/blob/master/app/services/usage_ping.rb). Any metrics added under the `counts` key are saved in the `stats` column. + +### 7. Add the feature label + +Add the `feature` label to the Merge Request for new Usage Ping metrics. These are user-facing changes and are part of expanding the Usage Ping feature. + +### 8. Add a changelog file + +Ensure you comply with the [Changelog entries guide](changelog.md). + +### 9. Ask for a Product Intelligence Review + +On GitLab.com, we have DangerBot setup to monitor Product Intelligence related files and DangerBot recommends a Product Intelligence review. Mention `@gitlab-org/growth/product_intelligence/engineers` in your MR for a review. + +### 10. Verify your metric + +On GitLab.com, the Product Intelligence team regularly monitors Usage Ping. They may alert you that your metrics need further optimization to run quicker and with greater success. You may also use the [Usage Ping QA dashboard](https://app.periscopedata.com/app/gitlab/632033/Usage-Ping-QA) to check how well your metric performs. The dashboard allows filtering by GitLab version, by "Self-managed" & "SaaS" and shows you how many failures have occurred for each metric. Whenever you notice a high failure rate, you may re-optimize your metric. + +### Optional: Test Prometheus based Usage Ping + +If the data submitted includes metrics [queried from Prometheus](#prometheus-queries) that you would like to inspect and verify, +then you need to ensure that a Prometheus server is running locally, and that furthermore the respective GitLab components +are exporting metrics to it. If you do not need to test data coming from Prometheus, no further action +is necessary, since Usage Ping should degrade gracefully in the absence of a running Prometheus server. 
+ +There are currently three kinds of components that may export data to Prometheus, and which are included in Usage Ping: + +- [`node_exporter`](https://github.com/prometheus/node_exporter) - Exports node metrics from the host machine +- [`gitlab-exporter`](https://gitlab.com/gitlab-org/gitlab-exporter) - Exports process metrics from various GitLab components +- various GitLab services such as Sidekiq and the Rails server that export their own metrics + +#### Test with an Omnibus container + +This is the recommended approach to test Prometheus based Usage Ping. + +The easiest way to verify your changes is to build a new Omnibus image from your code branch via CI, then download the image +and run a local container instance: + +1. From your merge request, click on the `qa` stage, then trigger the `package-and-qa` job. This job triggers an Omnibus +build in a [downstream pipeline of the `omnibus-gitlab-mirror` project](https://gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/-/pipelines). +1. In the downstream pipeline, wait for the `gitlab-docker` job to finish. +1. Open the job logs and locate the full container name including the version. It takes the following form: `registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:<VERSION>`. +1. On your local machine, make sure you are logged in to the GitLab Docker registry. You can find the instructions for this in +[Authenticate to the GitLab Container Registry](../user/packages/container_registry/index.md#authenticate-with-the-container-registry). +1. Once logged in, download the new image via `docker pull registry.gitlab.com/gitlab-org/build/omnibus-gitlab-mirror/gitlab-ee:<VERSION>` +1. For more information about working with and running Omnibus GitLab containers in Docker, please refer to [GitLab Docker images](https://docs.gitlab.com/omnibus/docker/README.html) in the Omnibus documentation. 
+ +#### Test with GitLab development toolkits + +This is the less recommended approach, since it comes with a number of difficulties when emulating a real GitLab deployment. + +The [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit) is not currently set up to run a Prometheus server or `node_exporter` alongside other GitLab components. If you would +like to do so, [Monitoring the GDK with Prometheus](https://gitlab.com/gitlab-org/gitlab-development-kit/-/blob/master/doc/howto/prometheus/index.md#monitoring-the-gdk-with-prometheus) is a good start. + +The [GCK](https://gitlab.com/gitlab-org/gitlab-compose-kit) has limited support for testing Prometheus based Usage Ping. +By default, it already comes with a fully configured Prometheus service that is set up to scrape a number of components, +but with the following limitations: + +- It does not currently run a `gitlab-exporter` instance, so several `process_*` metrics from services such as Gitaly may be missing. +- While it runs a `node_exporter`, `docker-compose` services emulate hosts, meaning that it would normally report itself to not be associated +with any of the other services that are running. That is not how node metrics are reported in a production setup, where `node_exporter` +always runs as a process alongside other GitLab components on any given node. From Usage Ping's perspective none of the node data would therefore +appear to be associated to any of the services running, since they all appear to be running on different hosts. To alleviate this problem, the `node_exporter` in GCK was arbitrarily "assigned" to the `web` service, meaning only for this service `node_*` metrics appears in Usage Ping. + +## Aggregated metrics + +> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/45979) in GitLab 13.6. + +WARNING: +This feature is intended solely for internal GitLab use. 
+ +To add data for aggregated metrics to the Usage Ping payload, add a corresponding definition in [`aggregated_metrics`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/usage_data_counters/aggregated_metrics/). Each aggregate definition includes the following parts: + +- `name`: unique name under which the aggregate metric is added to the Usage Ping payload. +- `operator`: operator that defines how the aggregated metric data is counted. Available operators are: + - `OR`: removes duplicates and counts all entries that triggered any of the listed events. + - `AND`: removes duplicates and counts all elements that were observed triggering all of the listed events. +- `events`: list of event names (from [`known_events/`](#known-events-are-added-automatically-in-usage-data-payload)) to aggregate into the metric. All events in this list must have the same `redis_slot` and `aggregation` attributes. +- `feature_flag`: name of the [development feature flag](feature_flags/development.md#development-type) that is checked before +metrics aggregation is performed. The corresponding feature flag should have its `default_enabled` attribute set to `false`. +The `feature_flag` attribute is **OPTIONAL** and can be omitted. When `feature_flag` is missing, no feature flag is checked. + +Example aggregated metric entries: + +```yaml +- name: product_analytics_test_metrics_union + operator: OR + events: ['i_search_total', 'i_search_advanced', 'i_search_paid'] +- name: product_analytics_test_metrics_intersection_with_feautre_flag + operator: AND + events: ['i_search_total', 'i_search_advanced', 'i_search_paid'] + feature_flag: example_aggregated_metric +``` + +Aggregated metrics are added under the `aggregated_metrics` key in both the `counts_weekly` and `counts_monthly` top level keys in the Usage Ping payload. 
+ +```ruby +{ + :counts_monthly => { + :deployments => 1003, + :successful_deployments => 78, + :failed_deployments => 275, + :packages => 155, + :personal_snippets => 2106, + :project_snippets => 407, + :promoted_issues => 719, + :aggregated_metrics => { + :product_analytics_test_metrics_union => 7, + :product_analytics_test_metrics_intersection_with_feautre_flag => 2 + }, + :snippets => 2513 + } +} +``` + +## Example Usage Ping payload + +The following is example content of the Usage Ping payload. + +```json +{ + "uuid": "0000000-0000-0000-0000-000000000000", + "hostname": "example.com", + "version": "12.10.0-pre", + "installation_type": "omnibus-gitlab", + "active_user_count": 999, + "recorded_at": "2020-04-17T07:43:54.162+00:00", + "edition": "EEU", + "license_md5": "00000000000000000000000000000000", + "license_id": null, + "historical_max_users": 999, + "licensee": { + "Name": "ABC, Inc.", + "Email": "email@example.com", + "Company": "ABC, Inc." + }, + "license_user_count": 999, + "license_starts_at": "2020-01-01", + "license_expires_at": "2021-01-01", + "license_plan": "ultimate", + "license_add_ons": { + }, + "license_trial": false, + "counts": { + "assignee_lists": 999, + "boards": 999, + "ci_builds": 999, + ... + }, + "container_registry_enabled": true, + "dependency_proxy_enabled": false, + "gitlab_shared_runners_enabled": true, + "gravatar_enabled": true, + "influxdb_metrics_enabled": true, + "ldap_enabled": false, + "mattermost_enabled": false, + "omniauth_enabled": true, + "prometheus_enabled": false, + "prometheus_metrics_enabled": false, + "reply_by_email_enabled": "incoming+%{key}@incoming.gitlab.com", + "signup_enabled": true, + "web_ide_clientside_preview_enabled": true, + "ingress_modsecurity_enabled": true, + "projects_with_expiration_policy_disabled": 999, + "projects_with_expiration_policy_enabled": 999, + ... 
+ "elasticsearch_enabled": true, + "license_trial_ends_on": null, + "geo_enabled": false, + "git": { + "version": { + "major": 2, + "minor": 26, + "patch": 1 + } + }, + "gitaly": { + "version": "12.10.0-rc1-93-g40980d40", + "servers": 56, + "clusters": 14, + "filesystems": [ + "EXT_2_3_4" + ] + }, + "gitlab_pages": { + "enabled": true, + "version": "1.17.0" + }, + "container_registry_server": { + "vendor": "gitlab", + "version": "2.9.1-gitlab" + }, + "database": { + "adapter": "postgresql", + "version": "9.6.15", + "pg_system_id": 6842684531675334351 + }, + "analytics_unique_visits": { + "g_analytics_contribution": 999, + ... + }, + "usage_activity_by_stage": { + "configure": { + "project_clusters_enabled": 999, + ... + }, + "create": { + "merge_requests": 999, + ... + }, + "manage": { + "events": 999, + ... + }, + "monitor": { + "clusters": 999, + ... + }, + "package": { + "projects_with_packages": 999 + }, + "plan": { + "issues": 999, + ... + }, + "release": { + "deployments": 999, + ... + }, + "secure": { + "user_container_scanning_jobs": 999, + ... + }, + "verify": { + "ci_builds": 999, + ... + } + }, + "usage_activity_by_stage_monthly": { + "configure": { + "project_clusters_enabled": 999, + ... + }, + "create": { + "merge_requests": 999, + ... + }, + "manage": { + "events": 999, + ... + }, + "monitor": { + "clusters": 999, + ... + }, + "package": { + "projects_with_packages": 999 + }, + "plan": { + "issues": 999, + ... + }, + "release": { + "deployments": 999, + ... + }, + "secure": { + "user_container_scanning_jobs": 999, + ... + }, + "verify": { + "ci_builds": 999, + ... 
+ } + }, + "topology": { + "duration_s": 0.013836685999194742, + "application_requests_per_hour": 4224, + "query_apdex_weekly_average": 0.996, + "failures": [], + "nodes": [ + { + "node_memory_total_bytes": 33269903360, + "node_memory_utilization": 0.35, + "node_cpus": 16, + "node_cpu_utilization": 0.2, + "node_uname_info": { + "machine": "x86_64", + "sysname": "Linux", + "release": "4.19.76-linuxkit" + }, + "node_services": [ + { + "name": "web", + "process_count": 16, + "process_memory_pss": 233349888, + "process_memory_rss": 788220927, + "process_memory_uss": 195295487, + "server": "puma" + }, + { + "name": "sidekiq", + "process_count": 1, + "process_memory_pss": 734080000, + "process_memory_rss": 750051328, + "process_memory_uss": 731533312 + }, + ... + ], + ... + }, + ... + ] + } +} +``` + +## Notable changes + +In GitLab 13.5, `pg_system_id` was added to send the [PostgreSQL system identifier](https://www.2ndquadrant.com/en/blog/support-for-postgresqls-system-identifier-in-barman/). + +## Exporting Usage Ping SQL queries and definitions + +Two Rake tasks exist to export Usage Ping definitions. + +- The Rake tasks export the raw SQL queries for `count`, `distinct_count`, `sum`. +- The Rake tasks export the Redis counter class or the line of the Redis block for `redis_usage_data`. +- The Rake tasks calculate the `alt_usage_data` metrics. 
+ +In the home directory of your local GitLab installation run the following Rake tasks for the YAML and JSON versions respectively: + +```shell +# for YAML export +bin/rake gitlab:usage_data:dump_sql_in_yaml + +# for JSON export +bin/rake gitlab:usage_data:dump_sql_in_json + +# You may redirect the output into a file +bin/rake gitlab:usage_data:dump_sql_in_yaml > ~/Desktop/usage-metrics-2020-09-02.yaml +``` + +## Generating and troubleshooting usage ping + +To get a usage ping, or to troubleshoot caching issues on your GitLab instance, please follow [instructions to generate usage ping](../administration/troubleshooting/gitlab_rails_cheat_sheet.md#generate-usage-ping). diff --git a/doc/development/usage_ping/metrics_dictionary.md b/doc/development/usage_ping/metrics_dictionary.md new file mode 100644 index 00000000000..bae79689f3b --- /dev/null +++ b/doc/development/usage_ping/metrics_dictionary.md @@ -0,0 +1,73 @@ +--- +stage: Growth +group: Product Intelligence +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments +--- + +# Metrics Dictionary Guide + +This guide describes the Metrics Dictionary and how it's implemented. + +## Metrics Definition and validation + +We are using [JSON Schema](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/metrics/schema.json) to validate the metrics definition. + +This process is meant to ensure consistent and valid metrics defined for Usage Ping. All metrics *must*: + +- Comply with the defined [JSON schema](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/metrics/schema.json). +- Have a unique `full_path`. +- Have an owner. 
+ +All metrics are stored in YAML files: + +- [`config/metrics`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/config/metrics) + +Each metric is defined in a separate YAML file consisting of a number of fields: + +| Field | Required | Additional information | +|---------------------|----------|----------------------------------------------------------------| +| `name` | yes | | +| `description` | yes | | +| `value_type` | yes | | +| `status` | yes | | +| `default_generation`| yes | Default generation path of the metric. One `full_path` value. (1) | +| `full_path` | yes | Full path of the metric for one or multiple generations. Path of the metric in Usage Ping payload. (1) | +| `group` | yes | The [group](https://about.gitlab.com/handbook/product/categories/#devops-stages) that owns the metric. | +| `time_frame` | yes | `string`: may be set to a value like `7d`. | +| `data_source` | yes | `string`: may be set to a value like `database` or `redis_hll`. | +| `distribution` | yes | The [distribution](https://about.gitlab.com/handbook/marketing/strategic-marketing/tiers/#definitions) where the metric applies. | +| `tier` | yes | The [tier](https://about.gitlab.com/handbook/marketing/strategic-marketing/tiers/) where the metric applies. | +| `product_category` | no | The [product category](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/categories.yml) for the metric. | +| `stage` | no | The [stage](https://gitlab.com/gitlab-com/www-gitlab-com/blob/master/data/stages.yml) for the metric. | +| `milestone` | no | The milestone when the metric is introduced. | +| `milestone_removed` | no | The milestone when the metric is removed. | +| `introduced_by_url` | no | The URL to the Merge Request that introduced the metric. | + +1. The default generation path is the location of the metric in the Usage Ping payload. + The `full_path` is the list of locations for multiple Usage Ping generations. 
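The relationship between `default_generation` and `full_path` described in the note can be sketched in plain Ruby. This is only an illustration under stated assumptions: the hash uses the field names from the table with example values, the helper names `default_path` and `read_metric` are hypothetical, and treating a dotted path such as `license.uuid` as nested keys in the payload is an assumption rather than something this guide states.

```ruby
# Illustrative definition: field names follow the table, values are examples.
DEFINITION = {
  'default_generation' => 'generation_1',
  'full_path' => {
    'generation_1' => 'uuid',
    'generation_2' => 'license.uuid'
  }
}.freeze

# Example payload (hypothetical values) with the metric at both locations.
PAYLOAD = { 'uuid' => 'abc', 'license' => { 'uuid' => 'abc' } }.freeze

# Hypothetical helper: the default path is the `full_path` entry
# that `default_generation` points at.
def default_path(definition)
  definition.fetch('full_path').fetch(definition.fetch('default_generation'))
end

# Hypothetical helper: read a metric from a payload, ASSUMING that a
# dotted path denotes nested keys.
def read_metric(payload, path)
  payload.dig(*path.split('.'))
end

puts default_path(DEFINITION)
puts read_metric(PAYLOAD, 'license.uuid')
```

Using `fetch` rather than `[]` makes a definition whose `default_generation` has no matching `full_path` entry fail loudly with a `KeyError` instead of silently returning `nil`.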
+ +### Example metric definition + +The linked [`uuid`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/config/metrics/license/uuid.yml) +YAML file includes an example metric definition, where the `uuid` metric is the GitLab +instance unique identifier. + +```yaml +name: uuid +description: GitLab instance unique identifier +value_type: string +product_category: collection +stage: growth +status: data_available +default_generation: generation_1 +full_path: + generation_1: uuid + generation_2: license.uuid +milestone: 9.1 +introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/1521 +group: group::product intelligence +time_frame: none +data_source: database +distribution: [ee, ce] +tier: ['free', 'starter', 'premium', 'ultimate', 'bronze', 'silver', 'gold'] +``` |
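A definition like the one above can be sanity-checked in plain Ruby. The sketch below is not the JSON Schema validation itself. It is a rough approximation that only checks the fields marked as required in the table and looks for `full_path` values shared by more than one definition. The helper names `missing_fields` and `duplicate_paths` are hypothetical, and mapping "have an owner" to the `group` field is an assumption.

```ruby
require 'yaml'

# Fields marked "yes" in the Required column of the table.
REQUIRED_FIELDS = %w[
  name description value_type status default_generation full_path
  group time_frame data_source distribution tier
].freeze

# The example `uuid` definition, parsed from YAML.
UUID_METRIC = YAML.safe_load(<<~DEFINITION)
  name: uuid
  description: GitLab instance unique identifier
  value_type: string
  product_category: collection
  stage: growth
  status: data_available
  default_generation: generation_1
  full_path:
    generation_1: uuid
    generation_2: license.uuid
  milestone: 9.1
  introduced_by_url: https://gitlab.com/gitlab-org/gitlab/-/merge_requests/1521
  group: group::product intelligence
  time_frame: none
  data_source: database
  distribution: [ee, ce]
  tier: ['free', 'starter', 'premium', 'ultimate', 'bronze', 'silver', 'gold']
DEFINITION

# Hypothetical helper: required fields that are absent from one definition.
def missing_fields(definition)
  REQUIRED_FIELDS.reject { |field| definition.key?(field) }
end

# Hypothetical helper: payload paths claimed by more than one definition.
def duplicate_paths(definitions)
  paths = definitions.flat_map { |d| d.fetch('full_path', {}).values }
  paths.group_by(&:itself).select { |_, uses| uses.size > 1 }.keys
end

p missing_fields(UUID_METRIC)
p duplicate_paths([UUID_METRIC])
```

A check like this only catches missing keys and path collisions. Value types and allowed values remain the job of the JSON schema.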