Diffstat (limited to 'doc/user/project/repository')
-rw-r--r-- | doc/user/project/repository/forking_workflow.md | 2
-rw-r--r-- | doc/user/project/repository/git_blame.md | 2
-rw-r--r-- | doc/user/project/repository/img/repository_cleanup.png | bin | 8114 -> 0 bytes
-rw-r--r-- | doc/user/project/repository/index.md | 16
-rw-r--r-- | doc/user/project/repository/jupyter_notebooks/index.md | 2
-rw-r--r-- | doc/user/project/repository/reducing_the_repo_size_using_git.md | 274
-rw-r--r-- | doc/user/project/repository/repository_mirroring.md | 28
-rw-r--r-- | doc/user/project/repository/x509_signed_commits/index.md | 4
8 files changed, 221 insertions, 107 deletions
diff --git a/doc/user/project/repository/forking_workflow.md b/doc/user/project/repository/forking_workflow.md index a49701017f3..75a84e36169 100644 --- a/doc/user/project/repository/forking_workflow.md +++ b/doc/user/project/repository/forking_workflow.md @@ -46,7 +46,7 @@ You can use [repository mirroring](repository_mirroring.md) to keep your fork sy The main difference is that with repository mirroring your remote fork will be automatically kept up-to-date. -Without mirroring, to work locally you'll have to use `git pull` to update your local repo +Without mirroring, to work locally you'll have to use `git pull` to update your local repository with the upstream project, then push the changes back to your fork to update it. CAUTION: **Caution:** diff --git a/doc/user/project/repository/git_blame.md b/doc/user/project/repository/git_blame.md index 2deb53b313c..e63b57747ef 100644 --- a/doc/user/project/repository/git_blame.md +++ b/doc/user/project/repository/git_blame.md @@ -25,7 +25,7 @@ for that commit. ## Blame previous commit -> [Introduced](https://gitlab.com/gitlab-org/gitlab/issues/19299) in GitLab 12.7. +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/19299) in GitLab 12.7. To see earlier revisions of a specific line, click **View blame prior to this change** until you've found the changes you're interested in viewing: diff --git a/doc/user/project/repository/img/repository_cleanup.png b/doc/user/project/repository/img/repository_cleanup.png Binary files differ deleted file mode 100644 index e343f23ac27..00000000000 --- a/doc/user/project/repository/img/repository_cleanup.png +++ /dev/null diff --git a/doc/user/project/repository/index.md b/doc/user/project/repository/index.md index 055443daa1f..48975b7864e 100644 --- a/doc/user/project/repository/index.md +++ b/doc/user/project/repository/index.md @@ -27,7 +27,7 @@ that you [connect with GitLab via SSH](../../../ssh/README.md). ## Files -Use a repository to store your files in GitLab. 
From [GitLab 12.10 onwards](https://gitlab.com/gitlab-org/gitlab/issues/33806), +Use a repository to store your files in GitLab. In [GitLab 12.10 and later](https://gitlab.com/gitlab-org/gitlab/-/issues/33806), you'll see on the repository's file tree an icon next to the file name according to its extension: @@ -84,9 +84,9 @@ according to the markup language. | [AsciiDoc](../../asciidoc.md) | `adoc`, `ad`, `asciidoc` | | [Textile](https://textile-lang.com/) | `textile` | | [rdoc](http://rdoc.sourceforge.net/doc/index.html) | `rdoc` | -| [Orgmode](https://orgmode.org/) | `org` | +| [Org mode](https://orgmode.org/) | `org` | | [creole](http://www.wikicreole.org/) | `creole` | -| [Mediawiki](https://www.mediawiki.org/wiki/MediaWiki) | `wiki`, `mediawiki` | +| [MediaWiki](https://www.mediawiki.org/wiki/MediaWiki) | `wiki`, `mediawiki` | ### Repository README and index files @@ -116,7 +116,7 @@ user's sessions and include code, narrative text, equations, and rich output. ### OpenAPI viewer -> [Introduced](https://gitlab.com/gitlab-org/gitlab/issues/19515) in GitLab 12.6. +> [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/19515) in GitLab 12.6. GitLab can render OpenAPI specification files with its file viewer, provided their filenames include `openapi` or `swagger` and their extension is `yaml`, @@ -219,7 +219,9 @@ vendored code, and most markup languages are excluded. This behavior can be adjusted by overriding the default. For example, to enable `.proto` files to be detected, add the following to `.gitattributes` in the root of your repository. -> *.proto linguist-detectable=true +```plaintext +*.proto linguist-detectable=true +``` ## Locked files **(PREMIUM)** @@ -232,7 +234,7 @@ You can access your repos via [repository API](../../../api/repositories.md). 
## Clone in Apple Xcode -> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/issues/45820) in GitLab 11.0 +> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/45820) in GitLab 11.0 Projects that contain a `.xcodeproj` or `.xcworkspace` directory can now be cloned in Xcode using the new **Open in Xcode** button, located next to the Git URL @@ -240,7 +242,7 @@ used for cloning your project. The button is only shown on macOS. ## Download Source Code -> Support for directory download was [introduced](https://gitlab.com/gitlab-org/gitlab-foss/issues/24704) in GitLab 11.11. +> Support for directory download was [introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/24704) in GitLab 11.11. The source code stored in a repository can be downloaded from the UI. By clicking the download icon, a dropdown will open with links to download the following: diff --git a/doc/user/project/repository/jupyter_notebooks/index.md b/doc/user/project/repository/jupyter_notebooks/index.md index ca82be280d9..1948b12aacd 100644 --- a/doc/user/project/repository/jupyter_notebooks/index.md +++ b/doc/user/project/repository/jupyter_notebooks/index.md @@ -1,6 +1,6 @@ # Jupyter Notebook Files -> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/issues/2508/) in GitLab 9.1. +> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/2508/) in GitLab 9.1. 
[Jupyter](https://jupyter.org/) Notebook (previously IPython Notebook) files are used for interactive computing in many fields and contain a complete record of the diff --git a/doc/user/project/repository/reducing_the_repo_size_using_git.md b/doc/user/project/repository/reducing_the_repo_size_using_git.md index 16bffe5417d..124150c441a 100644 --- a/doc/user/project/repository/reducing_the_repo_size_using_git.md +++ b/doc/user/project/repository/reducing_the_repo_size_using_git.md @@ -1,150 +1,244 @@ --- +stage: Create +group: Gitaly +info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers type: howto --- -# Reducing the repository size using Git - -A GitLab Enterprise Edition administrator can set a [repository size limit](../../admin_area/settings/account_and_limit_settings.md) -which will prevent you from exceeding it. - -When a project has reached its size limit, you will not be able to push to it, -create a new merge request, or merge existing ones. You will still be able to -create new issues, and clone the project though. Uploading LFS objects will -also be denied. - -If you exceed the repository size limit, your first thought might be to remove -some data, make a new commit and push back to the repository. Perhaps you can -move some blobs to LFS, or remove some old dependency updates from history. -Unfortunately, it's not so easy and that workflow won't work. Deleting files in -a commit doesn't actually reduce the size of the repo since the earlier commits -and blobs are still around. What you need to do is rewrite history with Git's -[`filter-branch` option](https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History#The-Nuclear-Option:-filter-branch), -or an open source community-maintained tool like the -[BFG](https://rtyley.github.io/bfg-repo-cleaner/). 
- -Note that even with that method, until `git gc` runs on the GitLab side, the -"removed" commits and blobs will still be around. You also need to be able to -push the rewritten history to GitLab, which may be impossible if you've already -exceeded the maximum size limit. +# Reduce repository size -In order to lift these restrictions, the administrator of the GitLab instance -needs to increase the limit on the particular project that exceeded it, so it's -always better to spot that you're approaching the limit and act proactively to -stay underneath it. If you hit the limit, and your admin can't - or won't - -temporarily increase it for you, your only option is to prune all the unneeded -stuff locally, and then create a new project on GitLab and start using that -instead. +Git repositories become larger over time. When large files are added to a Git repository: -If you can continue to use the original project, we recommend [using -BFG](#using-the-bfg-repo-cleaner), a tool that's built and -maintained by the open source community. It's faster and simpler than -`git filter-branch`, and GitLab can use its account of what has changed to clean -up its own internal state, maximizing the space saved. +- Fetching the repository becomes slower because everyone must download the files. +- They take up a large amount of storage space on the server. +- Git repository storage limits [can be reached](#storage-limits). -CAUTION: **Caution:** -Make sure to first make a copy of your repository since rewriting history will -purge the files and information you are about to delete. Also make sure to -inform any collaborators to not use `pull` after your changes, but use `rebase`. +Rewriting a repository can remove unwanted history to make the repository smaller. 
+[`git filter-repo`](https://github.com/newren/git-filter-repo) is a tool for quickly rewriting Git +repository history, and is recommended over both: -CAUTION: **Caution:** -This process is not suitable for removing sensitive data like password or keys -from your repository. Information about commits, including file content, is -cached in the database, and will remain visible even after they have been -removed from the repository. +- [`git filter-branch`](https://git-scm.com/docs/git-filter-branch). +- [BFG](https://rtyley.github.io/bfg-repo-cleaner/). + +DANGER: **Danger:** +Rewriting repository history is a destructive operation. Make sure to backup your repository before +you begin. The best way back up a repository is to +[export the project](../settings/import_export.md#exporting-a-project-and-its-data). -## Using the BFG Repo-Cleaner +## Purge files from repository history -> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/issues/19376) in GitLab 11.6. +To make cloning your project faster, rewrite branches and tags to remove unwanted files. -1. [Install BFG](https://rtyley.github.io/bfg-repo-cleaner/) from its open source community repository. +1. [Install `git filter-repo`](https://github.com/newren/git-filter-repo/blob/master/INSTALL.md) + using a supported package manager or from source. -1. Navigate to your repository: +1. Clone a fresh copy of the repository using `--bare`: ```shell - cd my_repository/ + git clone --bare https://example.gitlab.com/my/project.git ``` -1. Change to the branch you want to remove the big file from: +1. Using `git filter-repo`, purge any files from the history of your repository. + + To purge all large files, the `--strip-blobs-bigger-than` option can be used: ```shell - git checkout master + git filter-repo --strip-blobs-bigger-than 10M ``` -1. 
Create a commit removing the large file from the branch, if it still exists: + To purge specific large files by path, the `--path` and `--invert-paths` options can be combined: ```shell - git rm path/to/big_file.mpg - git commit -m 'Remove unneeded large file' + git filter-repo --path path/to/big/file.m4v --invert-paths ``` -1. Rewrite history: + See the + [`git filter-repo` documentation](https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html#EXAMPLES) + for more examples and the complete documentation. + +1. Running `git filter-repo` removes all remotes. To restore the remote for your project, run: ```shell - bfg --delete-files path/to/big_file.mpg + git remote add origin https://example.gitlab.com/<namespace>/<project_name>.git ``` - An object map file will be written to `object-id-map.old-new.txt`. Keep it - around - you'll need it for the final step! +1. Force push your changes to overwrite all branches on GitLab: -1. Force-push the changes to GitLab: + ```shell + git push origin --force --all + ``` + + [Protected branches](../protected_branches.md) will cause this to fail. To proceed, you must + remove branch protection, push, and then re-enable protected branches. + +1. To remove large files from tagged releases, force push your changes to all tags on GitLab: ```shell - git push --force-with-lease origin master + git push origin --force --tags ``` - If this step fails, someone has changed the `master` branch while you were - rewriting history. You could restore the branch and re-run BFG to preserve - their changes, or use `git push --force` to overwrite their changes. + [Protected tags](../protected_tags.md) will cause this to fail. To proceed, you must remove tag + protection, push, and then re-enable protected tags. -1. 
Navigate to **Project > Settings > Repository > Repository Cleanup**: +## Purge files from GitLab storage - ![Repository settings cleanup form](img/repository_cleanup.png) +To reduce the size of your repository in GitLab, you must remove GitLab internal references to +commits that contain large files. Before completing these steps, +[purge files from your repository history](#purge-files-from-repository-history). - Upload the `object-id-map.old-new.txt` file and press **Start cleanup**. - This will remove any internal Git references to the old commits, and run - `git gc` against the repository. You will receive an email once it has - completed. +As well as [branches](branches/index.md) and tags, which are a type of Git ref, GitLab automatically +creates other refs. These refs prevent dead links to commits, or missing diffs when viewing merge +requests. [Repository cleanup](#repository-cleanup) can be used to remove these from GitLab. -NOTE: **Note:** -This process will remove some copies of the rewritten commits from GitLab's -cache and database, but there are still numerous gaps in coverage - at present, -some of the copies may persist indefinitely. [Clearing the instance cache](../../../administration/raketasks/maintenance.md#clear-redis-cache) -may help to remove some of them, but it should not be depended on for security -purposes! +The following internal refs are not advertised: -## Using `git filter-branch` +- `refs/merge-requests/*` for merge requests. +- `refs/pipelines/*` for + [pipelines](../../../ci/pipelines/index.md#troubleshooting-fatal-reference-is-not-a-tree). +- `refs/environments/*` for environments. -1. Navigate to your repository: +This means they are not usually included when fetching, which makes fetching faster. In addition, +`refs/keep-around/*` are hidden refs to prevent commits with discussion from being deleted and +cannot be fetched at all. 
- ```shell - cd my_repository/ - ``` +However, these refs can be accessed from the Git bundle inside a project export. -1. Change to the branch you want to remove the big file from: +1. [Install `git filter-repo`](https://github.com/newren/git-filter-repo/blob/master/INSTALL.md) + using a supported package manager or from source. + +1. Generate a fresh [export from the + project](../settings/import_export.html#exporting-a-project-and-its-data) and download it. + +1. Decompress the backup using `tar`: ```shell - git checkout master + tar xzf project-backup.tar.gz ``` -1. Use `filter-branch` to remove the big file: + This will contain a `project.bundle` file, which was created by + [`git bundle`](https://git-scm.com/docs/git-bundle). + +1. Clone a fresh copy of the repository from the bundle: ```shell - git filter-branch --force --tree-filter 'rm -f path/to/big_file.mpg' HEAD + git clone --bare --mirror /path/to/project.bundle ``` -1. Instruct Git to purge the unwanted data: +1. Using `git filter-repo`, purge any files from the history of your repository. Because we are + trying to remove internal refs, we will rely on the `commit-map` produced by each run to tell us + which internal refs to remove. + + NOTE:**Note:** + `git filter-repo` creates a new `commit-map` file every run, and overwrite the `commit-map` from + the previous run. You will need this file from **every** run. Do the next step every time you run + `git filter-repo`. + + To purge all large files, the `--strip-blobs-bigger-than` option can be used: ```shell - git reflog expire --expire=now --all && git gc --prune=now --aggressive + git filter-repo --strip-blobs-bigger-than 10M ``` -1. Lastly, force push to the repository: + To purge specific large files by path, the `--path` and `--invert-paths` options can be combined. ```shell - git push --force origin master + git filter-repo --path path/to/big/file.m4v --invert-paths ``` -Your repository should now be below the size limit. 
+ See the + [`git filter-repo` documentation](https://htmlpreview.github.io/?https://github.com/newren/git-filter-repo/blob/docs/html/git-filter-repo.html#EXAMPLES) + for more examples and the complete documentation. + +1. Run a [repository cleanup](#repository-cleanup). + +## Repository cleanup + +> [Introduced](https://gitlab.com/gitlab-org/gitlab-foss/-/issues/19376) in GitLab 11.6. + +Repository cleanup allows you to upload a text file of objects and GitLab will remove internal Git +references to these objects. You can use +[`git filter-repo`](https://github.com/newren/git-filter-repo) to produce a list of objects (in a +`commit-map` file) that can be used with repository cleanup. + +To clean up a repository: + +1. Go to the project for the repository. +1. Navigate to **{settings}** **Settings > Repository**. +1. Upload a list of objects. For example, a `commit-map` file. +1. Click **Start cleanup**. + +This will: + +- Remove any internal Git references to old commits. +- Run `git gc` against the repository. + +You will receive an email once it has completed. + +When using repository cleanup, note: + +- Housekeeping prunes loose objects older than 2 weeks. This means objects added in the last 2 weeks + will not be removed immediately. If you have access to the + [Gitaly](../../../administration/gitaly/index.md) server, you may run `git gc --prune=now` to + prune all loose objects immediately. +- This process will remove some copies of the rewritten commits from GitLab's cache and database, + but there are still numerous gaps in coverage and some of the copies may persist indefinitely. + [Clearing the instance cache](../../../administration/raketasks/maintenance.md#clear-redis-cache) + may help to remove some of them, but it should not be depended on for security purposes! 
+ +## Storage limits + +Repository size limits: + +- Can [be set by an administrator](../../admin_area/settings/account_and_limit_settings.md#repository-size-limit-starter-only) + on self-managed instances. **(STARTER ONLY)** +- Are [set for GitLab.com](../../gitlab_com/index.md#repository-size-limit). + +When a project has reached its size limit, you cannot: + +- Push to the project. +- Create a new merge request. +- Merge existing merge requests. +- Upload LFS objects. + +You can still: + +- Create new issues. +- Clone the project. + +If you exceed the repository size limit, you might try to: + +1. Remove some data. +1. Make a new commit. +1. Push back to the repository. + +Perhaps you might also: + +- Move some blobs to LFS. +- Remove some old dependency updates from history. + +Unfortunately, this workflow won't work. Deleting files in a commit doesn't actually reduce the size +of the repository because the earlier commits and blobs still exist. + +What you need to do is rewrite history. We recommend the open-source community-maintained tool +[`git filter-repo`](https://github.com/newren/git-filter-repo). + +NOTE: **Note:** +Until `git gc` runs on the GitLab side, the "removed" commits and blobs will still exist. You also +must be able to push the rewritten history to GitLab, which may be impossible if you've already +exceeded the maximum size limit. + +In order to lift these restrictions, the administrator of the self-managed GitLab instance must +increase the limit on the particular project that exceeded it. Therefore, it's always better to +proactively stay underneath the limit. If you hit the limit, and can't have it temporarily +increased, your only option is to: + +1. Prune all the unneeded stuff locally. +1. Create a new project on GitLab and start using that instead. + +CAUTION: **Caution:** +This process is not suitable for removing sensitive data like password or keys from your repository. 
+Information about commits, including file content, is cached in the database, and will remain +visible even after they have been removed from the repository. <!-- ## Troubleshooting diff --git a/doc/user/project/repository/repository_mirroring.md b/doc/user/project/repository/repository_mirroring.md index fdbea385998..f75b083e6dc 100644 --- a/doc/user/project/repository/repository_mirroring.md +++ b/doc/user/project/repository/repository_mirroring.md @@ -28,7 +28,7 @@ immediate update, unless: - The mirror is already being updated. - 5 minutes haven't elapsed since its last update. -For security reasons, from [GitLab 12.10 onwards](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/27166), +For security reasons, in [GitLab 12.10 and later](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/27166), the URL to the original repository is only displayed to users with Maintainer or Owner permissions to the mirrored project. @@ -134,7 +134,7 @@ The repository will push soon. To force a push, click the appropriate button. ## Pulling from a remote repository **(STARTER)** > - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/merge_requests/51) in GitLab Enterprise Edition 8.2. -> - [Added Git LFS support](https://gitlab.com/gitlab-org/gitlab/issues/10871) in [GitLab Starter](https://about.gitlab.com/pricing/) 11.11. +> - [Added Git LFS support](https://gitlab.com/gitlab-org/gitlab/-/issues/10871) in [GitLab Starter](https://about.gitlab.com/pricing/) 11.11. You can set up a repository to automatically have its branches, tags, and commits updated from an upstream repository. @@ -356,6 +356,24 @@ a [Push event webhook](../integrations/webhooks.md#push-events) to trigger an im pull to GitLab. Push mirroring from GitLab is rate limited to once per minute when only push mirroring protected branches. 
+### Configure a webhook to trigger an immediate pull to GitLab + +Assuming you have already configured the [push](#setting-up-a-push-mirror-to-another-gitlab-instance-with-2fa-activated) and [pull](#pulling-from-a-remote-repository-starter) mirrors in the upstream GitLab instance, to trigger an immediate pull as suggested above, you will need to configure a [Push Event Web Hook](../integrations/webhooks.md#push-events) in the downstream instance. + +To do this: + +- Create a [personal access token](../../profile/personal_access_tokens.md) with `API` scope. +- Navigate to **Settings > Webhooks** +- Add the webhook URL which in this case will use the [Pull Mirror API](../../../api/projects.md#start-the-pull-mirroring-process-for-a-project-starter) request to trigger an immediate pull after updates to the repository. + + ```plaintext + https://gitlab.example.com/api/v4/projects/:id/mirror/pull?private_token=<your_access_token> + ``` + +- Ensure that the **Push Events** checkbox is selected. +- Click on **Add Webhook** button to save the webhook. +- To test the integration click on the **Test** button and confirm GitLab does not return any error. + ### Preventing conflicts using a `pre-receive` hook CAUTION: **Warning:** @@ -388,13 +406,13 @@ proxy_push() REFNAME="$3" # --- Pattern of branches to proxy pushes - whitelisted=$(expr "$branch" : "\(master\)") + allowlist=$(expr "$branch" : "\(master\)") case "$refname" in refs/heads/*) branch=$(expr "$refname" : "refs/heads/\(.*\)") - if [ "$whitelisted" = "$branch" ]; then + if [ "$allowlist" = "$branch" ]; then unset GIT_QUARANTINE_PATH # handle https://git-scm.com/docs/git-receive-pack#_quarantine_environment error="$(git push --quiet $TARGET_REPO $NEWREV:$REFNAME 2>&1)" fail=$? @@ -435,7 +453,7 @@ Note that this sample has a few limitations: - This example may not work verbatim for your use case and might need modification. - It does not regard different types of authentication mechanisms for the mirror. 
 - It does not work with forced updates (rewriting history). - - Only branches that match the `whitelisted` patterns will be proxy pushed. + - Only branches that match the `allowlist` patterns will be proxy pushed. - The script circumvents the Git hook quarantine environment because the update of `$TARGET_REPO` is seen as a ref update and Git will complain about it. diff --git a/doc/user/project/repository/x509_signed_commits/index.md b/doc/user/project/repository/x509_signed_commits/index.md index 20143af0b33..d55d5c5c2d8 100644 --- a/doc/user/project/repository/x509_signed_commits/index.md +++ b/doc/user/project/repository/x509_signed_commits/index.md @@ -65,11 +65,11 @@ git config --global gpg.format x509 ### Windows and MacOS -Install [smimesign](https://github.com/github/smimesign) by downloading the +Install [S/MIME Sign](https://github.com/github/smimesign) by downloading the installer or via `brew install smimesign` on MacOS. Get the ID of your certificate with `smimesign --list-keys` and set your -signingkey `git config --global user.signingkey ID`, then configure X.509: +signing key `git config --global user.signingkey ID`, then configure X.509: ```shell git config --global gpg.x509.program smimesign