author | GitLab Bot <gitlab-bot@gitlab.com> | 2020-04-09 15:09:29 +0000 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2020-04-09 15:09:29 +0000 |
commit | 209bd8cf1f542f6ba2a069b368a9187faa871e96 (patch) | |
tree | 6b77dc8183135b8316cc70c8dbc9c4e7c18cf05a /doc | |
parent | a9ced7da447785c57477b3d8dbccc73a78cface1 (diff) | |
download | gitlab-ce-209bd8cf1f542f6ba2a069b368a9187faa871e96.tar.gz |
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc')
23 files changed, 231 insertions, 149 deletions
diff --git a/doc/administration/geo/replication/datatypes.md b/doc/administration/geo/replication/datatypes.md index 7e697e8dd81..a1f511fe2a5 100644 --- a/doc/administration/geo/replication/datatypes.md +++ b/doc/administration/geo/replication/datatypes.md @@ -73,7 +73,7 @@ for Wiki and Design Repository cases. GitLab stores files and blobs such as Issue attachments or LFS objects into either: - The filesystem in a specific location. -- An Object Storage solution. Object Storage solutions can be: +- An [Object Storage](../../object_storage.md) solution. Object Storage solutions can be: - Cloud based like Amazon S3 Google Cloud Storage. - Hosted by you (like MinIO). - A Storage Appliance that exposes an Object Storage-compatible API. diff --git a/doc/administration/geo/replication/object_storage.md b/doc/administration/geo/replication/object_storage.md index db8d26b3865..ffd44282b23 100644 --- a/doc/administration/geo/replication/object_storage.md +++ b/doc/administration/geo/replication/object_storage.md @@ -12,6 +12,8 @@ To have: - GitLab manage replication, follow [Enabling GitLab replication](#enabling-gitlab-managed-object-storage-replication). - Third-party services manage replication, follow [Third-party replication services](#third-party-replication-services). +[Read more about using object storage with GitLab](../../object_storage.md). + ## Enabling GitLab managed object storage replication > [Introduced](https://gitlab.com/gitlab-org/gitlab/issues/10586) in GitLab 12.4. diff --git a/doc/administration/job_artifacts.md b/doc/administration/job_artifacts.md index c45388087ab..a9a13062a25 100644 --- a/doc/administration/job_artifacts.md +++ b/doc/administration/job_artifacts.md @@ -92,6 +92,8 @@ Use an object storage option like AWS S3 to store job artifacts. DANGER: **Danger:** If you configure GitLab to store CI logs and artifacts on object storage, you must also enable [incremental logging](job_logs.md#new-incremental-logging-architecture). 
Otherwise, job logs will disappear or not be saved. +[Read more about using object storage with GitLab](object_storage.md). + #### Object Storage Settings For source installations the following settings are nested under `artifacts:` and then `object_store:`. On Omnibus GitLab installs they are prefixed by `artifacts_object_store_`. diff --git a/doc/administration/lfs/index.md b/doc/administration/lfs/index.md index 10ff15b1ff4..71c1ae22305 100644 --- a/doc/administration/lfs/index.md +++ b/doc/administration/lfs/index.md @@ -61,6 +61,8 @@ You can also use external object storage in a private local network. For example GitLab provides two different options for the uploading mechanism: "Direct upload" and "Background upload". +[Read more about using object storage with GitLab](../object_storage.md). + **Option 1. Direct upload** 1. User pushes an `lfs` file to the GitLab instance diff --git a/doc/administration/merge_request_diffs.md b/doc/administration/merge_request_diffs.md index fd1a425d6b1..795933e2772 100644 --- a/doc/administration/merge_request_diffs.md +++ b/doc/administration/merge_request_diffs.md @@ -68,6 +68,8 @@ Instead of storing the external diffs on disk, we recommended the use of an obje store like AWS S3 instead. This configuration relies on valid AWS credentials to be configured already. +[Read more about using object storage with GitLab](object_storage.md). + ## Object Storage Settings For source installations, these settings are nested under `external_diffs:` and diff --git a/doc/administration/object_storage.md b/doc/administration/object_storage.md index 55ec66112d2..80305c89e81 100644 --- a/doc/administration/object_storage.md +++ b/doc/administration/object_storage.md @@ -21,9 +21,6 @@ Object storage options that GitLab has tested, or is aware of customers using in For configuring GitLab to use Object Storage refer to the following guides: -1. 
Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk. -1. Configure [database lookup of SSH keys](operations/fast_ssh_key_lookup.md) - to eliminate the need for a shared `authorized_keys` file. 1. Configure [object storage for backups](../raketasks/backup_restore.md#uploading-backups-to-a-remote-cloud-storage). 1. Configure [object storage for job artifacts](job_artifacts.md#using-object-storage) including [incremental logging](job_logs.md#new-incremental-logging-architecture). @@ -36,6 +33,19 @@ For configuring GitLab to use Object Storage refer to the following guides: 1. Configure [object storage for Dependency Proxy](packages/dependency_proxy.md#using-object-storage) (optional feature). **(PREMIUM ONLY)** 1. Configure [object storage for Pseudonymizer](pseudonymizer.md#configuration) (optional feature). **(ULTIMATE ONLY)** 1. Configure [object storage for autoscale Runner caching](https://docs.gitlab.com/runner/configuration/autoscale.html#distributed-runners-caching) (optional - for improved performance). +1. Configure [object storage for Terraform state files](terraform_state.md#using-object-storage-core-only) + +### Other alternatives to filesystem storage + +If you're working to [scale out](scaling/index.md) your GitLab implementation, +or add [fault tolerance and redundancy](high_availability/README.md) you may be +looking at removing dependencies on block or network filesystems. +See the following guides and +[note that Pages requires disk storage](#gitlab-pages-requires-nfs): + +1. Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk. +1. Configure [database lookup of SSH keys](operations/fast_ssh_key_lookup.md) + to eliminate the need for a shared `authorized_keys` file. 
## Warnings, limitations, and known issues @@ -67,8 +77,9 @@ with the Fog library that GitLab uses. Symptoms include: ### GitLab Pages requires NFS -If you're working to [scale out](high_availability/README.md) your GitLab implementation and -one of your requirements is [GitLab Pages](../user/project/pages/index.md) this currently requires +If you're working to add more GitLab servers for [scaling](scaling/index.md) or +[fault tolerance](high_availability/README.md) and one of your requirements +is [GitLab Pages](../user/project/pages/index.md) this currently requires NFS. There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196) to remove this dependency. In the future, GitLab Pages may use [object storage](https://gitlab.com/gitlab-org/gitlab/-/issues/208135). diff --git a/doc/administration/packages/container_registry.md b/doc/administration/packages/container_registry.md index b940cb6933b..aaf1ca29084 100644 --- a/doc/administration/packages/container_registry.md +++ b/doc/administration/packages/container_registry.md @@ -367,6 +367,8 @@ The different supported drivers are: Read more about the individual driver's config options in the [Docker Registry docs](https://docs.docker.com/registry/configuration/#storage). +[Read more about using object storage with GitLab](../object_storage.md). + CAUTION: **Warning:** GitLab will not backup Docker images that are not stored on the filesystem. Remember to enable backups with your object storage provider if desired. 
diff --git a/doc/administration/packages/dependency_proxy.md b/doc/administration/packages/dependency_proxy.md index ff3c24d6162..ec2020c26de 100644 --- a/doc/administration/packages/dependency_proxy.md +++ b/doc/administration/packages/dependency_proxy.md @@ -77,7 +77,9 @@ To change the local storage path: ### Using object storage Instead of relying on the local storage, you can use an object storage to -upload the blobs of the dependency proxy: +store the blobs of the dependency proxy. + +[Read more about using object storage with GitLab](../object_storage.md). **Omnibus GitLab installations** diff --git a/doc/administration/packages/index.md b/doc/administration/packages/index.md index 536b6a5f246..d14726d33de 100644 --- a/doc/administration/packages/index.md +++ b/doc/administration/packages/index.md @@ -86,7 +86,9 @@ To change the local storage path: ### Using object storage Instead of relying on the local storage, you can use an object storage to -upload packages: +store packages. + +[Read more about using object storage with GitLab](../object_storage.md). **Omnibus GitLab installations** diff --git a/doc/administration/pseudonymizer.md b/doc/administration/pseudonymizer.md index 3cf0e96d18f..36bb446da78 100644 --- a/doc/administration/pseudonymizer.md +++ b/doc/administration/pseudonymizer.md @@ -26,6 +26,8 @@ To configure the pseudonymizer, you need to: Alternatively, you can use an absolute file path. - Use an object storage and specify the connection parameters in the `pseudonymizer.upload.connection` configuration option. +[Read more about using object storage with GitLab](object_storage.md). + **For Omnibus installations:** 1. 
Edit `/etc/gitlab/gitlab.rb` and add the following lines by replacing with diff --git a/doc/administration/raketasks/uploads/migrate.md b/doc/administration/raketasks/uploads/migrate.md index 6dae9b71e1f..d2823847a89 100644 --- a/doc/administration/raketasks/uploads/migrate.md +++ b/doc/administration/raketasks/uploads/migrate.md @@ -7,6 +7,8 @@ After [configuring the object storage](../../uploads.md#using-object-storage-cor >**Note:** All of the processing will be done in a background worker and requires **no downtime**. +[Read more about using object storage with GitLab](../../object_storage.md). + ### All-in-one Rake task GitLab provides a wrapper Rake task that migrates all uploaded files - avatars, diff --git a/doc/administration/terraform_state.md b/doc/administration/terraform_state.md index c684178f13e..0956edaf252 100644 --- a/doc/administration/terraform_state.md +++ b/doc/administration/terraform_state.md @@ -51,6 +51,8 @@ Instead of storing Terraform state files on disk, we recommend the use of an obj store that is S3-compatible instead. This configuration relies on valid credentials to be configured already. +[Read more about using object storage with GitLab](object_storage.md). + ### Object storage settings The following settings are: diff --git a/doc/administration/uploads.md b/doc/administration/uploads.md index 45cffb64671..f29deba3d40 100644 --- a/doc/administration/uploads.md +++ b/doc/administration/uploads.md @@ -55,6 +55,8 @@ If you don't want to use the local disk where GitLab is installed to store the uploads, you can use an object storage provider like AWS S3 instead. This configuration relies on valid AWS credentials to be configured already. +[Read more about using object storage with GitLab](object_storage.md). + ## Object Storage Settings For source installations the following settings are nested under `uploads:` and then `object_store:`. On Omnibus GitLab installs they are prefixed by `uploads_object_store_`. 
diff --git a/doc/api/dependency_proxy.md b/doc/api/dependency_proxy.md new file mode 100644 index 00000000000..a379f1481c1 --- /dev/null +++ b/doc/api/dependency_proxy.md @@ -0,0 +1,21 @@ +# Dependency Proxy API **(PREMIUM)** + +## Purge the dependency proxy for a group + +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/11631) in GitLab 12.10. + +Deletes the cached blobs for a group. This endpoint requires group admin access. + +```plaintext +DELETE /groups/:id/dependency_proxy/cache +``` + +| Attribute | Type | Required | Description | +| --------- | ---- | -------- | ----------- | +| `id` | integer/string | yes | The ID or [URL-encoded path of the group](README.md#namespaced-path-encoding) owned by the authenticated user | + +Example request: + +```shell +curl --request DELETE --header "PRIVATE-TOKEN: <your_access_token>" "https://gitlab.example.com/api/v4/groups/5/dependency_proxy/cache" +``` diff --git a/doc/api/projects.md b/doc/api/projects.md index 959b263c301..f0b65b9ac6a 100644 --- a/doc/api/projects.md +++ b/doc/api/projects.md @@ -162,7 +162,7 @@ When the user is authenticated and `simple` is not set this returns something li "merge_method": "merge", "autoclose_referenced_issues": true, "suggestion_commit_message": null, - "marked_for_deletion_at": "2020-04-03", + "marked_for_deletion_at": "2020-04-03", // to be deprecated in GitLab 13.0 in favor of marked_for_deletion_on "marked_for_deletion_on": "2020-04-03", "statistics": { "commit_count": 37, @@ -287,6 +287,9 @@ When the user is authenticated and `simple` is not set this returns something li ] ``` +NOTE: **Note:** +For users on GitLab [Silver, Premium, or higher](https://about.gitlab.com/pricing/) the `marked_for_deletion_at` attribute will be deprecated in GitLab 13.0 in favor of the `marked_for_deletion_on` attribute. 
+ Users on GitLab [Starter, Bronze, or higher](https://about.gitlab.com/pricing/) will also see the `approvals_before_merge` parameter: @@ -408,7 +411,7 @@ This endpoint supports [keyset pagination](README.md#keyset-based-pagination) fo "merge_method": "merge", "autoclose_referenced_issues": true, "suggestion_commit_message": null, - "marked_for_deletion_at": "2020-04-03", + "marked_for_deletion_at": "2020-04-03", // to be deprecated in GitLab 13.0 in favor of marked_for_deletion_on "marked_for_deletion_on": "2020-04-03", "statistics": { "commit_count": 37, @@ -874,7 +877,7 @@ GET /projects/:id "service_desk_address": null, "autoclose_referenced_issues": true, "suggestion_commit_message": null, - "marked_for_deletion_at": "2020-04-03", + "marked_for_deletion_at": "2020-04-03", // to be deprecated in GitLab 13.0 in favor of marked_for_deletion_on "marked_for_deletion_on": "2020-04-03", "statistics": { "commit_count": 37, diff --git a/doc/ci/directed_acyclic_graph/index.md b/doc/ci/directed_acyclic_graph/index.md index b6b53737dde..d4b87648f49 100644 --- a/doc/ci/directed_acyclic_graph/index.md +++ b/doc/ci/directed_acyclic_graph/index.md @@ -4,7 +4,7 @@ type: reference # Directed Acyclic Graph -> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/issues/47063) in GitLab 12.2 (enabled by `ci_dag_support` feature flag). +> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/issues/47063) in GitLab 12.2. 
A [directed acyclic graph](https://www.techopedia.com/definition/5739/directed-acyclic-graph-dag) can be used in the context of a CI/CD pipeline to build relationships between jobs such that diff --git a/doc/ci/parent_child_pipelines.md b/doc/ci/parent_child_pipelines.md index b39e0b6e540..2bc897901fa 100644 --- a/doc/ci/parent_child_pipelines.md +++ b/doc/ci/parent_child_pipelines.md @@ -136,12 +136,11 @@ your own script to generate a YAML file, which is then [used to trigger a child This technique can be very powerful in generating pipelines targeting content that changed or to build a matrix of targets and architectures. +In GitLab 12.9, the child pipeline could fail to be created in certain cases, causing the parent pipeline to fail. +This is [resolved in GitLab 12.10](https://gitlab.com/gitlab-org/gitlab/-/issues/209070). + ## Limitations A parent pipeline can trigger many child pipelines, but a child pipeline cannot trigger further child pipelines. See the [related issue](https://gitlab.com/gitlab-org/gitlab/issues/29651) for discussion on possible future improvements. - -When triggering dynamic child pipelines, if the job containing the CI config artifact is not a predecessor of the -trigger job, the child pipeline will fail to be created, causing also the parent pipeline to fail. -In the future we want to validate the trigger job's dependencies [at the time the parent pipeline is created](https://gitlab.com/gitlab-org/gitlab/-/issues/209070) rather than when the child pipeline is created. 
diff --git a/doc/install/aws/img/choose_ami.png b/doc/install/aws/img/choose_ami.png Binary files differ deleted file mode 100644 index a07d42dd6fb..00000000000 --- a/doc/install/aws/img/choose_ami.png +++ /dev/null diff --git a/doc/install/aws/index.md b/doc/install/aws/index.md index a88e8b0c310..6708e40abb4 100644 --- a/doc/install/aws/index.md +++ b/doc/install/aws/index.md @@ -555,7 +555,7 @@ In `/etc/ssh/sshd_config` update the following: #### Amazon S3 object storage -Since we're not using NFS for shared storage, we will use [Amazon S3](https://aws.amazon.com/s3/) buckets to store backups, artifacts, LFS objects, uploads, merge request diffs, container registry images, and more. For instructions on how to configure each of these, please see [Cloud Object Storage](../../administration/high_availability/object_storage.md). +Since we're not using NFS for shared storage, we will use [Amazon S3](https://aws.amazon.com/s3/) buckets to store backups, artifacts, LFS objects, uploads, merge request diffs, container registry images, and more. Our [documentation includes configuration instructions](../../administration/object_storage.md) for each of these, and other information about using object storage with GitLab. Remember to run `sudo gitlab-ctl reconfigure` after saving the changes to the `gitlab.rb` file. @@ -580,90 +580,55 @@ On the EC2 dashboard: Now we have a custom AMI that we'll use to create our launch configuration the next step. -## Deploying GitLab inside an auto scaling group +## Deploy GitLab inside an auto scaling group -We'll use AWS's wizard to deploy GitLab and then SSH into the instance to -configure the PostgreSQL and Redis connections. +### Create a launch configuration -The Auto Scaling Group option is available through the EC2 dashboard on the left -sidebar. - -1. Click **Create Auto Scaling group**. -1. Create a new launch configuration. - -### Choose the AMI - -Choose the AMI: - -1. 
Go to the Community AMIs and search for `GitLab EE <version>` - where `<version>` the latest version as seen on the - [releases page](https://about.gitlab.com/releases/). - - ![Choose AMI](img/choose_ami.png) - -### Choose an instance type - -You should choose an instance type based on your workload. Consult -[the hardware requirements](../requirements.md#hardware-requirements) to choose -one that fits your needs (at least `c5.xlarge`, which is enough to accommodate 100 users): - -1. Choose the your instance type. -1. Click **Next: Configure Instance Details**. - -### Configure details - -In this step we'll configure some details: - -1. Enter a name (`gitlab-autoscaling`). -1. Select the IAM role we created. -1. Optionally, enable CloudWatch and the EBS-optimized instance settings. -1. In the "Advanced Details" section, set the IP address type to - "Do not assign a public IP address to any instances." -1. Click **Next: Add Storage**. - -### Add storage - -The root volume is 8GB by default and should be enough given that we won't store any data there. - -### Configure security group - -As a last step, configure the security group: - -1. Select the existing load balancer security group we have [created](#load-balancer). -1. Select **Review**. - -### Review and launch - -Now is a good time to review all the previous settings. When ready, click -**Create launch configuration** and select the SSH key pair with which you will -connect to the instance. - -### Create Auto Scaling Group - -We are now able to start creating our Auto Scaling Group: - -1. Give it a group name. -1. Set the group size to 2 as we want to always start with two instances. -1. Assign it our network VPC and add the **private subnets**. -1. In the "Advanced Details" section, choose to receive traffic from ELBs - and select our ELB. -1. Choose the ELB health check. -1. Click **Next: Configure scaling policies**. 
+From the EC2 dashboard: -This is the really great part of Auto Scaling; we get to choose when AWS -launches new instances and when it removes them. For this group we'll -scale between 2 and 4 instances where one instance will be added if CPU +1. Select **Launch Configurations** from the left menu and click **Create launch configuration**. +1. Select **My AMIs** from the left menu and select the `GitLab` custom AMI we created above. +1. Select an instance type best suited for your needs (at least a `c5.xlarge`) and click **Configure details**. +1. Enter a name for your launch configuration (we'll use `gitlab-ha-launch-config`). +1. **Do not** check **Request Spot Instance**. +1. From the **IAM Role** dropdown, pick the `GitLabAdmin` instance role we [created earlier](#creating-an-iam-ec2-instance-role-and-profile). +1. Leave the rest as defaults and click **Add Storage**. +1. The root volume is 8GiB by default and should be enough given that we won’t store any data there. Click **Configure Security Group**. +1. Check **Select and existing security group** and select the `gitlab-loadbalancer-sec-group` we created earlier. +1. Click **Review**, review your changes, and click **Create launch configuration**. +1. Acknowledge that you have access to the private key or create a new one. Click **Create launch configuration**. + +### Create an auto scaling group + +1. As soon as the launch configuration is created, you'll see an option to **Create an Auto Scaling group using this launch configuration**. Click that to start creating the auto scaling group. +1. Enter a **Group name** (we'll use `gitlab-auto-scaling-group`). +1. For **Group size**, enter the number of instances you want to start with (we'll enter `2`). +1. Select the `gitlab-vpc` from the **Network** dropdown. +1. Add both the private [subnets we created earlier](#subnets). +1. Expand the **Advanced Details** section and check the **Receive traffic from one or more load balancers** option. +1. 
From the **Classic Load Balancers** dropdown, Select the load balancer we created earlier. +1. For **Health Check Type**, select **ELB**. +1. We'll leave our **Health Check Grace Period** as the default `300` seconds. Click **Configure scaling policies**. +1. Check **Use scaling policies to adjust the capacity of this group**. +1. For this group we'll scale between 2 and 4 instances where one instance will be added if CPU utilization is greater than 60% and one instance is removed if it falls to less than 45%. ![Auto scaling group policies](img/policies.png) -Finally, configure notifications and tags as you see fit, and create the +1. Finally, configure notifications and tags as you see fit, review your changes, and create the auto scaling group. -You'll notice that after we save the configuration, AWS starts launching our two -instances in different AZs and without a public IP which is exactly what -we intended. +As the auto scaling group is created, you'll see your new instances spinning up in your EC2 dashboard. You'll also see the new instances added to your load balancer. Once the instances pass the heath check, they are ready to start receiving traffic from the load balancer. + +Since our instances are created by the auto scaling group, go back to your instances and terminate the [instance we created manually above](#install-gitlab). We only needed this instance to create our custom AMI. + +### Log in for the first time + +Using the domain name you used when setting up [DNS for the load balancer](#configure-dns-for-load-balancer), you should now be able to visit GitLab in your browser. The very first time you will be asked to set up a password +for the `root` user which has admin privileges on the GitLab instance. + +After you set it up, login with username `root` and the newly created password. 
## Health check and monitoring with Prometheus diff --git a/doc/raketasks/backup_restore.md b/doc/raketasks/backup_restore.md index 19065b27275..b0d90ea0345 100644 --- a/doc/raketasks/backup_restore.md +++ b/doc/raketasks/backup_restore.md @@ -309,6 +309,8 @@ In the example below we use Amazon S3 for storage, but Fog also lets you use for AWS, Google, OpenStack Swift, Rackspace and Aliyun as well. A local driver is [also available](#uploading-to-locally-mounted-shares). +[Read more about using object storage with GitLab](../administration/object_storage.md). + #### Using Amazon S3 For Omnibus GitLab packages: diff --git a/doc/topics/git/partial_clone.md b/doc/topics/git/partial_clone.md index 83f1d0f0de5..fcb7d8630f5 100644 --- a/doc/topics/git/partial_clone.md +++ b/doc/topics/git/partial_clone.md @@ -1,82 +1,115 @@ -# Partial Clone for Large Repositories - -CAUTION: **Alpha:** -Partial Clone is an experimental feature, and will significantly increase -Gitaly resource utilization when performing a partial clone, and decrease -performance of subsequent fetch operations. - -As Git repositories become very large, usability decreases as performance -decreases. One major challenge is cloning the repository, because Git will -download the entire repository including every commit and every version of -every object. This can be slow to transfer, and require large amounts of disk -space. - -Historically, performing a **shallow clone** -([`--depth`](https://www.git-scm.com/docs/git-clone#Documentation/git-clone.txt---depthltdepthgt)) -has been the only way to reduce the amount of data transferred when cloning -a Git repository. This does not, however, allow filtering by sub-tree which is -important for monolithic repositories containing many projects, or by object -size preventing unnecessary large objects being downloaded. 
+# Partial Clone + +As Git repositories grow in size, they can become cumbersome to work with +because of the large amount of history that must be downloaded, and the large +amount of disk space they require. [Partial clone](https://github.com/git/git/blob/master/Documentation/technical/partial-clone.txt) is a performance optimization that "allows Git to function without having a complete copy of the repository. The goal of this work is to allow Git better handle extremely large repositories." -Specifically, using partial clone, it should be possible for Git to natively -support: - -- large objects, instead of using [Git LFS](https://git-lfs.github.com/) -- enormous repositories - -Briefly, partial clone works by: +## Filter by file size -- excluding objects from being transferred when cloning or fetching a - repository using a new `--filter` flag -- downloading missing objects on demand +> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/2553) in GitLab 12.10. -Follow [Git for enormous repositories](https://gitlab.com/groups/gitlab-org/-/epics/773) for roadmap and updates. +Storing large binary files in Git is normally discouraged, because every large +file added will be downloaded by everyone who clones or fetches changes +thereafter. This is slow, if not a complete obstruction when working from a slow +or unreliable internet connection. -## Enabling partial clone +Using partial clone with a file size filter solves this problem, by excluding +troublesome large files from clones and fetches. When Git encounters a missing +file, it will be downloaded on demand. -> [Introduced](https://gitlab.com/gitlab-org/gitaly/issues/1553) in GitLab 12.4. - -To enable partial clone, use the [feature flags API](../../api/features.md). -For example: +When cloning a repository, use the `--filter=blob:limit=<size>` argument. 
For example, +to clone the repository excluding files larger than 1 megabyte: ```shell -curl --data "value=true" --header "PRIVATE-TOKEN: <your_access_token>" https://gitlab.example.com/api/v4/features/gitaly_upload_pack_filter +git clone --filter=blob:limit=1m git@gitlab.com:gitlab-com/www-gitlab-com.git ``` -Alternatively, flip the switch and enable the feature flag: - -```ruby -Feature.enable(:gitaly_upload_pack_filter) +This would produce the following output: + +```plaintext +Cloning into 'www-gitlab-com'... +remote: Enumerating objects: 832467, done. +remote: Counting objects: 100% (832467/832467), done. +remote: Compressing objects: 100% (207226/207226), done. +remote: Total 832467 (delta 585563), reused 826624 (delta 580099), pack-reused 0 +Receiving objects: 100% (832467/832467), 2.34 GiB | 5.05 MiB/s, done. +Resolving deltas: 100% (585563/585563), done. +remote: Enumerating objects: 146, done. +remote: Counting objects: 100% (146/146), done. +remote: Compressing objects: 100% (138/138), done. +remote: Total 146 (delta 8), reused 144 (delta 8), pack-reused 0 +Receiving objects: 100% (146/146), 471.45 MiB | 4.60 MiB/s, done. +Resolving deltas: 100% (8/8), done. +Updating files: 100% (13008/13008), done. +Filtering content: 100% (3/3), 131.24 MiB | 4.65 MiB/s, done. ``` -## Excluding objects by size - -Partial Clone allows large objects to be stored directly in the Git repository, -and be excluded from clones as desired by the user. This eliminates the error -prone process of deciding which objects should be stored in LFS or not. Using -partial clone, all files – large or small – may be treated the same. +The output will be longer because Git will first clone the repository excluding +files larger than 1 megabyte, and second download any missing large files needed +to checkout the `master` branch. + +When changing branches, Git may need to download more missing files. 
+ +## Filter by object type + +> [Introduced](https://gitlab.com/gitlab-org/gitaly/-/issues/2553) in GitLab 12.10. + +For enormous repositories with millions of files, and long history, it may be +helpful to exclude all files and use in combination with `sparse-checkout` to +reduce the size of your working copy. + +```plaintext +# Clone the repo excluding all files +$ git clone --filter=blob:none --sparse git@gitlab.com:gitlab-com/www-gitlab-com/git +Cloning into 'www-gitlab-com'... +remote: Enumerating objects: 678296, done. +remote: Counting objects: 100% (678296/678296), done. +remote: Compressing objects: 100% (165915/165915), done. +remote: Total 678296 (delta 472342), reused 673292 (delta 467476), pack-reused 0 +Receiving objects: 100% (678296/678296), 81.06 MiB | 5.74 MiB/s, done. +Resolving deltas: 100% (472342/472342), done. +remote: Enumerating objects: 28, done. +remote: Counting objects: 100% (28/28), done. +remote: Compressing objects: 100% (25/25), done. +remote: Total 28 (delta 0), reused 12 (delta 0), pack-reused 0 +Receiving objects: 100% (28/28), 140.29 KiB | 341.00 KiB/s, done. +Updating files: 100% (28/28), done. + +$ cd www-gitlab-com + +$ git sparse-checkout init --cone + +$ git sparse-checkout add data +remote: Enumerating objects: 301, done. +remote: Counting objects: 100% (301/301), done. +remote: Compressing objects: 100% (292/292), done. +remote: Total 301 (delta 16), reused 102 (delta 9), pack-reused 0 +Receiving objects: 100% (301/301), 1.15 MiB | 608.00 KiB/s, done. +Resolving deltas: 100% (16/16), done. +Updating files: 100% (302/302), done. +``` -With the `uploadpack.allowFilter` and `uploadpack.allowAnySHA1InWant` options -enabled on the Git server: +For more details, see the Git documentation for +[`sparse-checkout`](https://git-scm.com/docs/git-sparse-checkout). 
-```shell -# clone the repo, excluding blobs larger than 1 megabyte -git clone --filter=blob:limit=1m <url> +## Filter by file path -# in the checkout step of the clone, and any subsequent operations -# any blobs that are needed will be downloaded on demand -git checkout feature-branch -``` +CAUTION: **Experimental:** +Partial Clone using `sparse` filters is experimental, slow, and will +significantly increase Gitaly resource utilization when cloning and fetching. -## Excluding objects by path +Deeper integration between Partial Clone and Sparse Checkout is being explored +through the `--filter=sparse:oid=<blob-ish>` filter spec, but this is highly +experimental. This mode of filtering uses a format similar to a `.gitignore` +file to specify which files should be included when cloning and fetching. -Partial Clone allows clones to be filtered by path using a format similar to a -`.gitignore` file stored inside the repository. +For more details, see the Git documentation for +[`rev-list-options`](https://gitlab.com/gitlab-org/git/-/blob/9fadedd637b312089337d73c3ed8447e9f0aa775/Documentation/rev-list-options.txt#L735-780). With the `uploadpack.allowFilter` and `uploadpack.allowAnySHA1InWant` options enabled on the Git server: diff --git a/doc/user/clusters/applications.md b/doc/user/clusters/applications.md index ab2aad3b043..1adbbf51397 100644 --- a/doc/user/clusters/applications.md +++ b/doc/user/clusters/applications.md @@ -1116,3 +1116,22 @@ To avoid installation errors: kubectl get secrets/tiller-secret -n gitlab-managed-apps -o "jsonpath={.data['ca\.crt']}" | base64 -d > b.pem diff a.pem b.pem ``` + +### Error installing managed apps on EKS cluster + +If you're using a managed cluster on AWS EKS, and you are not able to install some of the managed +apps, consider checking the logs. 
+ +You can check the logs by running following commands: + +```shell +kubectl get pods --all-namespaces +kubectl get services --all-namespaces +``` + +If you are getting the `Failed to assign an IP address to container` error, it's probably due to the +instance type you've specified in the AWS configuration. +The number and size of nodes might not have enough IP addresses to run or install those pods. + +For reference, all the AWS instance IP limits are found +[in this AWS repository on GitHub](https://github.com/aws/amazon-vpc-cni-k8s/blob/master/pkg/awsutils/vpc_ip_resource_limit.go) (search for `InstanceENIsAvailable`). diff --git a/doc/user/packages/dependency_proxy/index.md b/doc/user/packages/dependency_proxy/index.md index 26a7936f8fa..cfdcd9821fb 100644 --- a/doc/user/packages/dependency_proxy/index.md +++ b/doc/user/packages/dependency_proxy/index.md @@ -65,6 +65,13 @@ from GitLab. The blobs are kept forever, and there is no hard limit on how much data can be stored. +## Clearing the cache + +It is possible to use the GitLab API to purge the dependency proxy cache for a +given group to gain back disk space that may be taken up by image blobs that +are no longer needed. See the [dependency proxy API documentation](../../../api/dependency_proxy.md) +for more details. + ## Limitations The following limitations apply: |
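
The purge endpoint documented in this commit (`DELETE /groups/:id/dependency_proxy/cache`) accepts either a numeric group ID or a URL-encoded group path. As a minimal sketch, assuming a POSIX shell with `sed` available, the helper below builds the request URL from a base URL and a group identifier. The function name `dependency_proxy_purge_url` is hypothetical and not part of GitLab. It only encodes `/` in the group path, which covers nested groups but is not a general URL encoder.

```shell
# Hypothetical helper (not part of GitLab): print the dependency proxy
# purge URL for a group, given a base URL and a group ID or path.
dependency_proxy_purge_url() (
  base_url="${1%/}"   # drop a single trailing slash, if present
  group="$2"
  # Encode "/" as %2F so a nested group path is a valid :id value.
  encoded=$(printf '%s' "$group" | sed 's|/|%2F|g')
  printf '%s/api/v4/groups/%s/dependency_proxy/cache\n' "$base_url" "$encoded"
)

# Usage (sends a real request, so it is left commented out here):
# curl --request DELETE --header "PRIVATE-TOKEN: <your_access_token>" \
#   "$(dependency_proxy_purge_url https://gitlab.example.com my-group/my-subgroup)"
```

The function body runs in a subshell, so its variables do not leak into the calling shell. Per the API documentation added in this commit, the request requires group admin access.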