Diffstat (limited to 'doc/user/project/repository/reducing_the_repo_size_using_git.md')
-rw-r--r-- doc/user/project/repository/reducing_the_repo_size_using_git.md | 33
1 files changed, 29 insertions, 4 deletions
diff --git a/doc/user/project/repository/reducing_the_repo_size_using_git.md b/doc/user/project/repository/reducing_the_repo_size_using_git.md
index 124150c441a..baad5027703 100644
--- a/doc/user/project/repository/reducing_the_repo_size_using_git.md
+++ b/doc/user/project/repository/reducing_the_repo_size_using_git.md
@@ -25,11 +25,16 @@ Rewriting repository history is a destructive operation. Make sure to backup you
you begin. The best way to back up a repository is to
[export the project](../settings/import_export.md#exporting-a-project-and-its-data).
+NOTE: **Note:**
+Git LFS files can only be removed by an Administrator using a
+[Rake task](../../../raketasks/cleanup.md). Removal of this limitation
+[is planned](https://gitlab.com/gitlab-org/gitlab/-/issues/223621).
+
## Purge files from repository history
To make cloning your project faster, rewrite branches and tags to remove unwanted files.
-1. [Install `git filter-repo`](https://github.com/newren/git-filter-repo/blob/master/INSTALL.md)
+1. [Install `git filter-repo`](https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
using a supported package manager or from source.
1. Clone a fresh copy of the repository using `--bare`:
@@ -40,12 +45,25 @@ To make cloning your project faster, rewrite branches and tags to remove unwante
1. Using `git filter-repo`, purge any files from the history of your repository.
- To purge all large files, the `--strip-blobs-bigger-than` option can be used:
+ To purge large files, the `--strip-blobs-bigger-than` option can be used:
```shell
git filter-repo --strip-blobs-bigger-than 10M
```
+ To purge large files stored using Git LFS, the `--blob-callback` option can
+ be used. The example below uses the callback to read the file size from the
+ Git LFS pointer, and removes files larger than 10 MB.
+
+ ```shell
+ git filter-repo --blob-callback '
+   if blob.data.startswith(b"version https://git-lfs.github.com/spec/v1"):
+     # The pointer is plain text; the object size is on a line such as "size 12345".
+     size_in_bytes = int(blob.data.split(b"\nsize ")[1].split()[0])
+     if size_in_bytes > 10*1000*1000:
+       blob.skip()
+   '
+ ```
+
To purge specific large files by path, the `--path` and `--invert-paths` options can be combined:
```shell
@@ -80,6 +98,12 @@ To make cloning your project faster, rewrite branches and tags to remove unwante
[Protected tags](../protected_tags.md) will cause this to fail. To proceed, you must remove tag
protection, push, and then re-enable protected tags.
+1. Manually run [project housekeeping](../../../administration/housekeeping.md#manual-housekeeping).
+
+NOTE: **Note:**
+Project statistics are cached for performance. You may need to wait 5-10 minutes
+to see a reduction in storage utilization.
+
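The Git LFS size check described in the purge step above can be tried outside `git filter-repo` first. The following is a minimal sketch, assuming the standard pointer layout of a `version` line, an `oid` line, and a `size` line holding a decimal byte count. The names `lfs_pointer_size` and `should_skip` are hypothetical helpers for illustration and are not part of `git filter-repo`.

```python
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"
SIZE_LIMIT_BYTES = 10 * 1000 * 1000  # 10 MB threshold (assumed, matching the prose above)


def lfs_pointer_size(data: bytes):
    """Return the size recorded in a Git LFS pointer, or None if data is not a pointer."""
    if not data.startswith(LFS_HEADER):
        return None
    # Each pointer line is "key value"; look for the line whose key is "size".
    for line in data.splitlines():
        key, _, value = line.partition(b" ")
        if key == b"size" and value.isdigit():
            return int(value)
    return None


def should_skip(data: bytes, limit: int = SIZE_LIMIT_BYTES) -> bool:
    """Decide whether a blob is an LFS pointer to an object larger than limit bytes."""
    size = lfs_pointer_size(data)
    return size is not None and size > limit
```

Inside a blob callback, the equivalent decision would be to call `blob.skip()` when `should_skip(blob.data)` is true. Blobs that are not LFS pointers are left untouched by this check.
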
## Purge files from GitLab storage
To reduce the size of your repository in GitLab, you must remove GitLab internal references to
@@ -103,7 +127,7 @@ cannot be fetched at all.
However, these refs can be accessed from the Git bundle inside a project export.
-1. [Install `git filter-repo`](https://github.com/newren/git-filter-repo/blob/master/INSTALL.md)
+1. [Install `git filter-repo`](https://github.com/newren/git-filter-repo/blob/main/INSTALL.md)
using a supported package manager or from source.
1. Generate a fresh [export from the
@@ -128,7 +152,7 @@ However, these refs can be accessed from the Git bundle inside a project export.
trying to remove internal refs, we will rely on the `commit-map` produced by each run to tell us
which internal refs to remove.
- NOTE:**Note:**
+ NOTE: **Note:**
`git filter-repo` creates a new `commit-map` file every run, and overwrites the `commit-map` from
the previous run. You will need this file from **every** run. Do the next step every time you run
`git filter-repo`.
@@ -176,6 +200,7 @@ You will receive an email once it has completed.
When using repository cleanup, note:
+- Project statistics are cached. You may need to wait 5-10 minutes to see a reduction in storage utilization.
- Housekeeping prunes loose objects older than 2 weeks. This means objects added in the last 2 weeks
will not be removed immediately. If you have access to the
[Gitaly](../../../administration/gitaly/index.md) server, you may run `git gc --prune=now` to