Diffstat (limited to 'doc/ci/pipelines')
 doc/ci/pipelines/img/ci_efficiency_pipeline_dag_critical_path.png        | Bin 0 -> 124100 bytes
 doc/ci/pipelines/img/ci_efficiency_pipeline_health_grafana_dashboard.png | Bin 0 -> 241128 bytes
 doc/ci/pipelines/index.md                                                |  37 +-
 doc/ci/pipelines/job_artifacts.md                                        |  14 +-
 doc/ci/pipelines/pipeline_architectures.md                               |   4 +-
 doc/ci/pipelines/pipeline_efficiency.md                                  | 252 ++++
 doc/ci/pipelines/settings.md                                             |  22 +-
7 files changed, 280 insertions(+), 49 deletions(-)
diff --git a/doc/ci/pipelines/img/ci_efficiency_pipeline_dag_critical_path.png b/doc/ci/pipelines/img/ci_efficiency_pipeline_dag_critical_path.png
new file mode 100644
index 00000000000..1715e8224ab
--- /dev/null
+++ b/doc/ci/pipelines/img/ci_efficiency_pipeline_dag_critical_path.png
Binary files differ
diff --git a/doc/ci/pipelines/img/ci_efficiency_pipeline_health_grafana_dashboard.png b/doc/ci/pipelines/img/ci_efficiency_pipeline_health_grafana_dashboard.png
new file mode 100644
index 00000000000..0956e76804e
--- /dev/null
+++ b/doc/ci/pipelines/img/ci_efficiency_pipeline_health_grafana_dashboard.png
Binary files differ
diff --git a/doc/ci/pipelines/index.md b/doc/ci/pipelines/index.md
index 8419b474d54..1b9048089bd 100644
--- a/doc/ci/pipelines/index.md
+++ b/doc/ci/pipelines/index.md
@@ -22,7 +22,7 @@ Pipelines comprise:
 - Jobs, which define *what* to do. For example, jobs that compile or test code.
 - Stages, which define *when* to run the jobs. For example, stages that run tests after stages that compile the code.
 
-Jobs are executed by [Runners](../runners/README.md). Multiple jobs in the same stage are executed in parallel,
+Jobs are executed by [runners](../runners/README.md). Multiple jobs in the same stage are executed in parallel,
 if there are enough concurrent runners.
 
 If *all* jobs in a stage succeed, the pipeline moves on to the next stage.
@@ -40,7 +40,7 @@ A typical pipeline might consist of four stages, executed in the following order:
 - A `production` stage, with a job called `deploy-to-prod`.
NOTE: **Note:** -If you have a [mirrored repository that GitLab pulls from](../../user/project/repository/repository_mirroring.md#pulling-from-a-remote-repository-starter), +If you have a [mirrored repository that GitLab pulls from](../../user/project/repository/repository_mirroring.md#pulling-from-a-remote-repository), you may need to enable pipeline triggering in your project's **Settings > Repository > Pull from a remote repository > Trigger pipelines for mirror updates**. @@ -199,7 +199,7 @@ such as builds, logs, artifacts, and triggers. **This action cannot be undone.** ### Pipeline quotas Each user has a personal pipeline quota that tracks the usage of shared runners in all personal projects. -Each group has a [usage quota](../../subscriptions/index.md#ci-pipeline-minutes) that tracks the usage of shared runners for all projects created within the group. +Each group has a [usage quota](../../subscriptions/gitlab_com/index.md#ci-pipeline-minutes) that tracks the usage of shared runners for all projects created within the group. When a pipeline is triggered, regardless of who triggered it, the pipeline quota for the project owner's [namespace](../../user/group/index.md#namespaces) is used. In this case, the namespace can be the user or group that owns the project. @@ -483,7 +483,7 @@ be found when you are on a [single pipeline page](#view-pipelines). For example: ![Pipelines example](img/pipelines.png) -[Multi-project pipeline graphs](../multi_project_pipelines.md#multi-project-pipeline-visualization-premium) help +[Multi-project pipeline graphs](../multi_project_pipelines.md#multi-project-pipeline-visualization) help you visualize the entire pipeline, including all cross-project inter-dependencies. **(PREMIUM)** ### Pipeline mini graphs @@ -535,32 +535,3 @@ GitLab provides API endpoints to: - Trigger pipeline runs. For more information, see: - [Triggering pipelines through the API](../triggers/README.md). 
- [Pipeline triggers API](../../api/pipeline_triggers.md). - -## Troubleshooting `fatal: reference is not a tree:` - -> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/17043) in GitLab 12.4. - -Previously, you'd have encountered unexpected pipeline failures when you force-pushed -a branch to its remote repository. To illustrate the problem, suppose you've had the current workflow: - -1. A user creates a feature branch named `example` and pushes it to a remote repository. -1. A new pipeline starts running on the `example` branch. -1. A user rebases the `example` branch on the latest `master` branch and force-pushes it to its remote repository. -1. A new pipeline starts running on the `example` branch again, however, - the previous pipeline (2) fails because of `fatal: reference is not a tree:` error. - -This is because the previous pipeline cannot find a checkout-SHA (which associated with the pipeline record) -from the `example` branch that the commit history has already been overwritten by the force-push. -Similarly, [Pipelines for merged results](../merge_request_pipelines/pipelines_for_merged_results/index.md) -might have failed intermittently due to [the same reason](../merge_request_pipelines/pipelines_for_merged_results/index.md#intermittently-pipelines-fail-by-fatal-reference-is-not-a-tree-error). - -As of GitLab 12.4, we've improved this behavior by persisting pipeline refs exclusively. -To illustrate its life cycle: - -1. A pipeline is created on a feature branch named `example`. -1. A persistent pipeline ref is created at `refs/pipelines/<pipeline-id>`, - which retains the checkout-SHA of the associated pipeline record. - This persistent ref stays intact during the pipeline execution, - even if the commit history of the `example` branch has been overwritten by force-push. -1. GitLab Runner fetches the persistent pipeline ref and gets source code from the checkout-SHA. -1. 
When the pipeline finished, its persistent ref is cleaned up in a background process. diff --git a/doc/ci/pipelines/job_artifacts.md b/doc/ci/pipelines/job_artifacts.md index be6886fe6b2..750a76bfaa0 100644 --- a/doc/ci/pipelines/job_artifacts.md +++ b/doc/ci/pipelines/job_artifacts.md @@ -47,7 +47,7 @@ when the job fails, or always, by using [`artifacts:when`](../yaml/README.md#art parameter. GitLab keeps these uploaded artifacts for 1 week, as defined by the `expire_in` definition. You can keep the artifacts from expiring via the [web interface](#browsing-artifacts). If the expiry time is not defined, it defaults -to the [instance wide setting](../../user/admin_area/settings/continuous_integration.md#default-artifacts-expiration-core-only). +to the [instance wide setting](../../user/admin_area/settings/continuous_integration.md#default-artifacts-expiration). For more examples on artifacts, follow the [artifacts reference in `.gitlab-ci.yml`](../yaml/README.md#artifacts). @@ -75,13 +75,13 @@ If you also want the ability to browse the report output files, include the > - [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/20390) in GitLab 11.2. > - Requires GitLab Runner 11.2 and above. -The `junit` report collects [JUnit XML files](https://www.ibm.com/support/knowledgecenter/en/SSQ2R2_14.1.0/com.ibm.rsar.analysis.codereview.cobol.doc/topics/cac_useresults_junit.html) +The `junit` report collects [JUnit report format XML files](https://www.ibm.com/support/knowledgecenter/en/SSQ2R2_14.1.0/com.ibm.rsar.analysis.codereview.cobol.doc/topics/cac_useresults_junit.html) as artifacts. Although JUnit was originally developed in Java, there are many -[third party ports](https://en.wikipedia.org/wiki/JUnit#Ports) for other +third party ports for other languages like JavaScript, Python, Ruby, and so on. -See [JUnit test reports](../junit_test_reports.md) for more details and examples. 
-Below is an example of collecting a JUnit XML file from Ruby's RSpec test tool: +See [Unit test reports](../unit_test_reports.md) for more details and examples. +Below is an example of collecting a JUnit report format XML file from Ruby's RSpec test tool: ```yaml rspec: @@ -94,7 +94,7 @@ rspec: junit: rspec.xml ``` -The collected JUnit reports upload to GitLab as an artifact and display in merge requests. +The collected Unit test reports upload to GitLab as an artifact and display in merge requests. NOTE: **Note:** If the JUnit tool you use exports to multiple XML files, specify @@ -221,7 +221,7 @@ dashboards. CAUTION: **Warning:** This artifact is still valid but is **deprecated** in favor of the -[artifacts:reports:license_scanning](../pipelines/job_artifacts.md#artifactsreportslicense_scanning-ultimate) +[artifacts:reports:license_scanning](../pipelines/job_artifacts.md#artifactsreportslicense_scanning) introduced in GitLab 12.8. The `license_management` report collects [Licenses](../../user/compliance/license_compliance/index.md) diff --git a/doc/ci/pipelines/pipeline_architectures.md b/doc/ci/pipelines/pipeline_architectures.md index ace765ddb41..77614424b33 100644 --- a/doc/ci/pipelines/pipeline_architectures.md +++ b/doc/ci/pipelines/pipeline_architectures.md @@ -199,7 +199,7 @@ trigger_a: include: a/.gitlab-ci.yml rules: - changes: - - a/* + - a/* trigger_b: stage: triggers @@ -207,7 +207,7 @@ trigger_b: include: b/.gitlab-ci.yml rules: - changes: - - b/* + - b/* ``` Example child `a` pipeline configuration, located in `/a/.gitlab-ci.yml`, making diff --git a/doc/ci/pipelines/pipeline_efficiency.md b/doc/ci/pipelines/pipeline_efficiency.md new file mode 100644 index 00000000000..c4febba8f44 --- /dev/null +++ b/doc/ci/pipelines/pipeline_efficiency.md @@ -0,0 +1,252 @@ +--- +stage: Verify +group: Continuous Integration +info: To determine the technical writer assigned to the Stage/Group associated with this page, see 
https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
type: reference
---

# Pipeline Efficiency

[CI/CD Pipelines](index.md) are the fundamental building blocks for [GitLab CI/CD](../README.md).
Making pipelines more efficient helps you save developer time, which:

- Speeds up your DevOps processes.
- Reduces costs.
- Shortens the development feedback loop.

It's common for new teams or projects to start with slow and inefficient pipelines,
and improve their configuration over time through trial and error. A better process is
to use pipeline features that improve efficiency right away, and get a faster software
development lifecycle earlier.

First ensure you are familiar with [GitLab CI/CD fundamentals](../introduction/index.md)
and understand the [quick start guide](../quick_start/README.md).

## Identify bottlenecks and common failures

The easiest indicators to check for inefficient pipelines are the runtimes of the jobs,
stages, and the total runtime of the pipeline itself. The total pipeline duration is
heavily influenced by the:

- Total number of stages and jobs.
- Dependencies between jobs.
- The ["critical path"](#directed-acyclic-graphs-dag-visualization), which represents
  the minimum and maximum pipeline duration.

Additional points to pay attention to relate to [GitLab Runners](../runners/README.md):

- Availability of the runners, and the resources they are provisioned with.
- Build dependencies and their installation time.
- [Container image size](#docker-images).
- Network latency and slow connections.

Pipelines that frequently fail unnecessarily also cause slowdowns in the development
lifecycle. Look for problematic patterns in failed jobs:

- Flaky unit tests which fail randomly, or produce unreliable test results.
- Test coverage drops and code quality correlated to that behavior.
- Failures that can be safely ignored, but that halt the pipeline instead.
- Tests that fail at the end of a long pipeline, but could run in an earlier stage,
  causing delayed feedback.

## Pipeline analysis

Analyze the performance of your pipeline to find ways to improve efficiency. Analysis
can help identify possible blockers in the CI/CD infrastructure. This includes analyzing:

- Job workloads.
- Bottlenecks in the execution times.
- The overall pipeline architecture.

It's important to understand and document the pipeline workflows, and discuss possible
actions and changes. Refactoring pipelines may need careful interaction between teams
in the DevSecOps lifecycle.

Pipeline analysis can also help identify issues with cost efficiency. For example, [runners](../runners/README.md)
hosted with a paid cloud service may be provisioned with:

- More resources than needed for CI/CD pipelines, wasting money.
- Not enough resources, causing slow runtimes and wasting time.

### Pipeline Insights

The [Pipeline success and duration charts](index.md#pipeline-success-and-duration-charts)
give information about pipeline runtime and failed job counts.

Tests like [unit tests](../unit_test_reports.md), integration tests, end-to-end tests,
[code quality](../../user/project/merge_requests/code_quality.md) tests, and others
ensure that problems are automatically found by the CI/CD pipeline. Many pipeline
stages can be involved, causing long runtimes.

You can improve runtimes by running jobs that test different things in parallel, in
the same stage, reducing overall runtime. The downside is that you need more runners
running simultaneously to support the parallel jobs.

The [testing levels for GitLab](../../development/testing_guide/testing_levels.md)
provide an example of a complex testing strategy with many components involved.
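The parallel testing approach above can be sketched as a minimal `.gitlab-ci.yml` fragment. The job names, test commands, and directory layout are illustrative assumptions, not part of this document:

```yaml
stages:
  - test

# All three jobs share the `test` stage, so runners can execute
# them in parallel instead of sequentially.
unit-tests:
  stage: test
  script: bundle exec rspec spec/unit

integration-tests:
  stage: test
  script: bundle exec rspec spec/integration

lint:
  stage: test
  script: bundle exec rubocop
```

The trade-off mentioned above applies here: all three jobs need an available runner at the same time to actually run concurrently.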
### Directed Acyclic Graphs (DAG) visualization

The [Directed Acyclic Graph](../directed_acyclic_graph/index.md) (DAG) visualization can help analyze the critical path in
the pipeline and understand possible blockers.

![CI Pipeline Critical Path with DAG](img/ci_efficiency_pipeline_dag_critical_path.png)

### Pipeline Monitoring

Global pipeline health is a key indicator to monitor along with job and pipeline duration.
[CI/CD analytics](index.md#pipeline-success-and-duration-charts) give a visual
representation of pipeline health.

Instance administrators have access to additional [performance metrics and self-monitoring](../../administration/monitoring/index.md).

You can fetch specific pipeline health metrics from the [API](../../api/README.md).
External monitoring tools can poll the API and verify pipeline health, or collect
metrics for long-term SLA analytics.

For example, the [GitLab CI Pipelines Exporter](https://github.com/mvisonneau/gitlab-ci-pipelines-exporter)
for Prometheus fetches metrics from the API. It can check branches in projects automatically
and get the pipeline status and duration. In combination with a Grafana dashboard,
this helps build an actionable view for your operations team. Metric graphs can also
be embedded into incidents, making problem resolution easier.

![Grafana Dashboard for GitLab CI Pipelines Prometheus Exporter](img/ci_efficiency_pipeline_health_grafana_dashboard.png)

Alternatively, you can use a monitoring tool that can execute scripts, for example
[`check_gitlab`](https://gitlab.com/6uellerBpanda/check_gitlab).

#### Runner monitoring

You can also [monitor CI runners](https://docs.gitlab.com/runner/monitoring/) on
their host systems, or in clusters like Kubernetes. This includes checking:

- Disk and disk IO.
- CPU usage.
- Memory.
- Runner process resources.

The [Prometheus Node Exporter](https://prometheus.io/docs/guides/node-exporter/)
can monitor runners on Linux hosts, and [`kube-state-metrics`](https://github.com/kubernetes/kube-state-metrics)
runs in a Kubernetes cluster.

You can also test [GitLab Runner auto-scaling](https://docs.gitlab.com/runner/configuration/autoscale.html)
with cloud providers, and define offline times to reduce costs.

#### Dashboards and incident management

Use your existing monitoring tools and dashboards to integrate CI/CD pipeline monitoring,
or build them from scratch. Ensure that the runtime data is actionable and useful
to teams, and that operations/SREs are able to identify problems early enough.
[Incident management](../../operations/incident_management/index.md) can help here too,
with embedded metric charts and all the details needed to analyze the problem.

### Storage usage

Review the storage use of the following to help analyze costs and efficiency:

- [Job artifacts](job_artifacts.md) and their [`expire_in`](../yaml/README.md#artifactsexpire_in)
  configuration. If kept for too long, storage usage grows and could slow pipelines down.
- [Container registry](../../user/packages/container_registry/index.md) usage.
- [Package registry](../../user/packages/package_registry/index.md) usage.

## Pipeline configuration

Make careful choices when configuring pipelines to speed them up and reduce
resource usage. This includes making use of GitLab CI/CD's built-in features that
make pipelines run faster and more efficiently.

### Reduce how often jobs run

Try to find which jobs don't need to run in all situations, and use pipeline configuration
to stop them from running:

- Use the [`interruptible`](../yaml/README.md#interruptible) keyword to stop old pipelines
  when they are superseded by a newer pipeline.
- Use [`rules`](../yaml/README.md#rules) to skip tests that aren't needed. For example,
  skip backend tests when only the frontend code is changed.
- Run non-essential [scheduled pipelines](schedules.md) less frequently.

### Fail fast

Ensure that errors are detected early in the CI/CD pipeline. A job that takes a very long
time to complete keeps a pipeline from returning a failed status until the job completes.

Design pipelines so that jobs that can [fail fast](../../user/project/merge_requests/fail_fast_testing.md)
run earlier. For example, add an early stage and move the syntax checks, style linting,
Git commit message verification, and similar jobs into it.

Decide if it's important for long jobs to run early, before fast feedback from
faster jobs. The initial failures may make it clear that the rest of the pipeline
shouldn't run, saving pipeline resources.

### Directed Acyclic Graphs (DAG)

In a basic configuration, jobs always wait for all other jobs in earlier stages to complete
before running. This is the simplest configuration, but it's also the slowest in most
cases. [Directed Acyclic Graphs](../directed_acyclic_graph/index.md) and
[parent/child pipelines](../parent_child_pipelines.md) are more flexible and can
be more efficient, but can also make pipelines harder to understand and analyze.

### Caching

Another optimization method is to use [caching](../caching/index.md) between jobs and stages,
for example [`/node_modules` for NodeJS](../caching/index.md#caching-nodejs-dependencies).

### Docker Images

Downloading and initializing Docker images can be a large part of the overall runtime
of jobs.

If a Docker image is slowing down job execution, analyze the base image size and the network
connection to the registry. If GitLab is running in the cloud, look for a cloud container
registry offered by the vendor. In addition, you can make use of the
[GitLab container registry](../../user/packages/container_registry/index.md), which can be accessed
by the GitLab instance faster than other registries.

#### Optimize Docker images

Build optimized Docker images, because large Docker images use up a lot of space and
take a long time to download over slower connections. If possible, avoid using
one large image for all jobs. Use multiple smaller images, each for a specific task,
that download and run faster.

Try to use custom Docker images with the software pre-installed. It's usually much
faster to download a larger pre-configured image than to use a common image and install
software on it each time. Docker's [Best practices for writing Dockerfiles](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
has more information about building efficient Docker images.

Methods to reduce Docker image size:

- Use a small base image, for example `debian-slim`.
- Do not install convenience tools like `vim` or `curl` if they aren't strictly needed.
- Create a dedicated development image.
- Disable man pages and docs installed by packages to save space.
- Reduce the `RUN` layers and combine software installation steps.
- If using `apt`, add `--no-install-recommends` to avoid unnecessary packages.
- Clean up caches and files that are no longer needed at the end. For example,
  `rm -rf /var/lib/apt/lists/*` for Debian and Ubuntu, or `yum clean all` for RHEL and CentOS.
- Use tools like [dive](https://github.com/wagoodman/dive) or [DockerSlim](https://github.com/docker-slim/docker-slim)
  to analyze and shrink images.

To simplify Docker image management, you can create a dedicated group for managing
[Docker images](../docker/README.md), and test, build, and publish them with CI/CD pipelines.

## Test, document, and learn

Improving pipelines is an iterative process.
Make small changes, monitor the effect,
then iterate again. Many small improvements can add up to a large increase in pipeline
efficiency.

It can help to document the pipeline design and architecture. You can do this with
[Mermaid charts in Markdown](../../user/markdown.md#mermaid) directly in the GitLab
repository.

Document CI/CD pipeline problems and incidents in issues, including research done
and solutions found. This helps onboard new team members, and also helps
identify recurring problems with CI pipeline efficiency.

### Learn more

- [CI Monitoring Webcast Slides](https://docs.google.com/presentation/d/1ONwIIzRB7GWX-WOSziIIv8fz1ngqv77HO1yVfRooOHM/edit?usp=sharing)
- [GitLab.com Monitoring Handbook](https://about.gitlab.com/handbook/engineering/monitoring/)
- [Building dashboards for operational visibility](https://aws.amazon.com/builders-library/building-dashboards-for-operational-visibility/)

diff --git a/doc/ci/pipelines/settings.md b/doc/ci/pipelines/settings.md
index 40093167213..849eb66d07f 100644
--- a/doc/ci/pipelines/settings.md
+++ b/doc/ci/pipelines/settings.md
@@ -57,17 +57,17 @@ The default value is 60 minutes. Decrease the time limit if you want to impose
 a hard limit on your jobs' running time or increase it otherwise. In any case,
 if the job surpasses the threshold, it is marked as failed.
 
-### Timeout overriding on Runner level
+### Timeout overriding for runners
 
 > [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/merge_requests/17221) in GitLab 10.7.
 
 Project defined timeout (either specific timeout set by user or the default
-60 minutes timeout) may be [overridden on Runner level](../runners/README.md#set-maximum-job-timeout-for-a-runner).
+60 minutes timeout) may be [overridden for runners](../runners/README.md#set-maximum-job-timeout-for-a-runner).
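As a sketch of how these timeouts interact, a job can also declare its own `timeout` in `.gitlab-ci.yml`; the effective limit is still capped by the runner's maximum job timeout, if one is set. The job name and build command below are illustrative assumptions:

```yaml
# Illustrative job: requests a longer limit than the project default,
# but a runner's maximum job timeout still takes precedence if lower.
build:
  stage: build
  timeout: 3h 30m
  script: make build
```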
## Maximum artifacts size **(CORE ONLY)** For information about setting a maximum artifact size for a project, see -[Maximum artifacts size](../../user/admin_area/settings/continuous_integration.md#maximum-artifacts-size-core-only). +[Maximum artifacts size](../../user/admin_area/settings/continuous_integration.md#maximum-artifacts-size). ## Custom CI configuration path @@ -263,7 +263,15 @@ Depending on the status of your job, a badge can have the following values: You can access a pipeline status badge image using the following link: ```plaintext -https://example.gitlab.com/<namespace>/<project>/badges/<branch>/pipeline.svg +https://gitlab.example.com/<namespace>/<project>/badges/<branch>/pipeline.svg +``` + +#### Display only non-skipped status + +If you want the pipeline status badge to only display the last non-skipped status, you can use the `?ignore_skipped=true` query parameter: + +```plaintext +https://gitlab.example.com/<namespace>/<project>/badges/<branch>/pipeline.svg?ignore_skipped=true ``` ### Test coverage report badge @@ -275,7 +283,7 @@ pipeline can have the test coverage percentage value defined. 
The test coverage badge can be accessed using following link: ```plaintext -https://example.gitlab.com/<namespace>/<project>/badges/<branch>/coverage.svg +https://gitlab.example.com/<namespace>/<project>/badges/<branch>/coverage.svg ``` If you would like to get the coverage report from a specific job, you can add @@ -294,7 +302,7 @@ Pipeline badges can be rendered in different styles by adding the `style=style_n #### Flat (default) ```plaintext -https://example.gitlab.com/<namespace>/<project>/badges/<branch>/coverage.svg?style=flat +https://gitlab.example.com/<namespace>/<project>/badges/<branch>/coverage.svg?style=flat ``` ![Badge flat style](https://gitlab.com/gitlab-org/gitlab/badges/master/coverage.svg?job=coverage&style=flat) @@ -304,7 +312,7 @@ https://example.gitlab.com/<namespace>/<project>/badges/<branch>/coverage.svg?st > [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/30120) in GitLab 11.8. ```plaintext -https://example.gitlab.com/<namespace>/<project>/badges/<branch>/coverage.svg?style=flat-square +https://gitlab.example.com/<namespace>/<project>/badges/<branch>/coverage.svg?style=flat-square ``` ![Badge flat square style](https://gitlab.com/gitlab-org/gitlab/badges/master/coverage.svg?job=coverage&style=flat-square) |