From a59c9590f5171f3638a1b2abeff55157aedc577b Mon Sep 17 00:00:00 2001
From: GitLab Bot
Date: Wed, 11 Dec 2019 00:08:25 +0000
Subject: Add latest changes from gitlab-org/gitlab@master

---
 doc/ci/caching/index.md | 102 +++++++++++++++++++++++-------------------------
 1 file changed, 48 insertions(+), 54 deletions(-)

(limited to 'doc/ci/caching/index.md')

diff --git a/doc/ci/caching/index.md b/doc/ci/caching/index.md
index 6b8e7fa2ad5..b6518c87e13 100644
--- a/doc/ci/caching/index.md
+++ b/doc/ci/caching/index.md
@@ -23,61 +23,55 @@ how it is defined in `.gitlab-ci.yml`.
 
 NOTE: **Note:**
 Be careful if you use cache and artifacts to store the same path in your jobs
-as **caches are restored before artifacts** and the content would be overwritten.
-
-Don't mix the caching with passing artifacts between stages. Caching is not
-designed to pass artifacts between stages. Cache is for runtime dependencies
-needed to compile the project:
-
-- `cache`: **Use for temporary storage for project dependencies.** Not useful
-  for keeping intermediate build results, like `jar` or `apk` files.
-  Cache was designed to be used to speed up invocations of subsequent runs of a
-  given job, by keeping things like dependencies (e.g., npm packages, Go vendor
-  packages, etc.) so they don't have to be re-fetched from the public internet.
-  While the cache can be abused to pass intermediate build results between
-  stages, there may be cases where artifacts are a better fit.
+as **caches are restored before artifacts** and the content could be overwritten.
+
+Don't use caching for passing artifacts between stages, as it is designed to store
+runtime dependencies needed to compile the project:
+
+- `cache`: **For storing project dependencies**
+
+  Caches are used to speed up runs of a given job in **subsequent pipelines**, by
+  storing downloaded dependencies so that they don't have to be fetched from the
+  internet again (like npm packages, Go vendor packages, etc.) While the cache could
+  be configured to pass intermediate build results between stages, this should be
+  done with artifacts instead.
+
 - `artifacts`: **Use for stage results that will be passed between stages.**
-  Artifacts were designed to upload some compiled/generated bits of the build,
-  and they can be fetched by any number of concurrent Runners. They are
-  guaranteed to be available and are there to pass data between jobs. They are
-  also exposed to be downloaded from the UI. **Artifacts can only exist in
-  directories relative to the build directory** and specifying paths which don't
-  comply to this rule trigger an unintuitive and illogical error message (an
-  enhancement is discussed at
-  [https://gitlab.com/gitlab-org/gitlab-foss/issues/15530](https://gitlab.com/gitlab-org/gitlab-foss/issues/15530)
-  ). Artifacts need to be uploaded to the GitLab instance (not only the GitLab
-  runner) before the next stage job(s) can start, so you need to evaluate
-  carefully whether your bandwidth allows you to profit from parallelization
-  with stages and shared artifacts before investing time in changes to the
-  setup.
-
-It's sometimes confusing because the name artifact sounds like something that
-is only useful outside of the job, like for downloading a final image. But
-artifacts are also available in between stages within a pipeline. So if you
-build your application by downloading all the required modules, you might want
-to declare them as artifacts so that each subsequent stage can depend on them
-being there. There are some optimizations like declaring an
-[expiry time](../yaml/README.md#artifactsexpire_in) so you don't keep artifacts
-around too long, and using [dependencies](../yaml/README.md#dependencies) to
-control exactly where artifacts are passed around.
-
-In summary:
-
-- Caches are disabled if not defined globally or per job (using `cache:`).
-- Caches are available for all jobs in your `.gitlab-ci.yml` if enabled globally.
-- Caches can be used by subsequent pipelines of that same job (a script in
-  a stage) in which the cache was created (if not defined globally).
-- Caches are stored where the Runner is installed **and** uploaded to S3 if
-  [distributed cache is enabled](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching).
-- Caches defined per job are only used, either:
-  - For the next pipeline of that job.
-  - If that same cache is also defined in a subsequent job of the same pipeline.
-- Artifacts are disabled if not defined per job (using `artifacts:`).
-- Artifacts can only be enabled per job, not globally.
-- Artifacts are created during a pipeline and can be used by the subsequent
-  jobs of that currently active pipeline.
-- Artifacts are always uploaded to GitLab (known as coordinator).
-- Artifacts can have an expiration value for controlling disk usage (30 days by default).
+
+  Artifacts are files generated by a job which are stored and uploaded, and can then
+  be fetched and used by jobs in later stages of the **same pipeline**. This data
+  will not be available in different pipelines, but is available to be downloaded
+  from the UI.
+
+The name `artifacts` sounds like it's only useful outside of the job, like for downloading
+a final image, but artifacts are also available in later stages within a pipeline.
+So if you build your application by downloading all the required modules, you might
+want to declare them as artifacts so that subsequent stages can use them. There are
+some optimizations like declaring an [expiry time](../yaml/README.md#artifactsexpire_in)
+so you don't keep artifacts around too long, or using [dependencies](../yaml/README.md#dependencies)
+to control which jobs fetch the artifacts.
+
+Caches:
+
+- Are disabled if not defined globally or per job (using `cache:`).
+- Are available for all jobs in your `.gitlab-ci.yml` if enabled globally.
+- Can be used in subsequent pipelines by the same job in which the cache was created (if not defined globally).
+- Are stored where the Runner is installed **and** uploaded to S3 if [distributed cache is enabled](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching).
+- If defined per job, are used:
+  - By the same job in a subsequent pipeline.
+  - By subsequent jobs in the same pipeline, if the they have identical dependencies.
+
+Artifacts:
+
+- Are disabled if not defined per job (using `artifacts:`).
+- Can only be enabled per job, not globally.
+- Are created during a pipeline and can be used by the subsequent jobs of that currently active pipeline.
+- Are always uploaded to GitLab (known as coordinator).
+- Can have an expiration value for controlling disk usage (30 days by default).
+
+NOTE: **Note:**
+Both artifacts and caches define their paths relative to the project directory, and
+can't link to files outside it.
 
 ## Good caching practices
 
-- 
cgit v1.2.1