summaryrefslogtreecommitdiff
path: root/doc/operations/moving_repositories.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/operations/moving_repositories.md')
-rw-r--r--doc/operations/moving_repositories.md181
1 files changed, 1 insertions, 180 deletions
diff --git a/doc/operations/moving_repositories.md b/doc/operations/moving_repositories.md
index 54adb99386a..c54bca324a5 100644
--- a/doc/operations/moving_repositories.md
+++ b/doc/operations/moving_repositories.md
@@ -1,180 +1 @@
-# Moving repositories managed by GitLab
-
-Sometimes you need to move all repositories managed by GitLab to
-another filesystem or another server. In this document we will look
-at some of the ways you can copy all your repositories from
-`/var/opt/gitlab/git-data/repositories` to `/mnt/gitlab/repositories`.
-
-We will look at three scenarios: the target directory is empty, the
-target directory contains an outdated copy of the repositories, and
-how to deal with thousands of repositories.
-
-**Each of the approaches we list can/will overwrite data in the
-target directory `/mnt/gitlab/repositories`. Do not mix up the
-source and the target.**
-
-## Target directory is empty: use a tar pipe
-
-If the target directory `/mnt/gitlab/repositories` is empty the
-simplest thing to do is to use a tar pipe. This method has low
-overhead and tar is almost always already installed on your system.
-However, it is not possible to resume an interrupted tar pipe: if
-that happens then all data must be copied again.
-
-```
-# As the git user
-tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
- tar -C /mnt/gitlab/repositories -xf -
-```
-
-If you want to see progress, replace `-xf` with `-xvf`.
-
-### Tar pipe to another server
-
-You can also use a tar pipe to copy data to another server. If your
-'git' user has SSH access to the newserver as 'git@newserver', you
-can pipe the data through SSH.
-
-```
-# As the git user
-tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
- ssh git@newserver tar -C /mnt/gitlab/repositories -xf -
-```
-
-If you want to compress the data before it goes over the network
-(which will cost you CPU cycles) you can replace `ssh` with `ssh -C`.
-
-## The target directory contains an outdated copy of the repositories: use rsync
-
-If the target directory already contains a partial / outdated copy
-of the repositories it may be wasteful to copy all the data again
-with tar. In this scenario it is better to use rsync. This utility
-is either already installed on your system or easily installable
-via apt, yum etc.
-
-```
-# As the 'git' user
-rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
- /mnt/gitlab/repositories
-```
-
-The `/.` in the command above is very important, without it you can
-easily get the wrong directory structure in the target directory.
-If you want to see progress, replace `-a` with `-av`.
-
-### Single rsync to another server
-
-If the 'git' user on your source system has SSH access to the target
-server you can send the repositories over the network with rsync.
-
-```
-# As the 'git' user
-rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
- git@newserver:/mnt/gitlab/repositories
-```
-
-## Thousands of Git repositories: use one rsync per repository
-
-Every time you start an rsync job it has to inspect all files in
-the source directory, all files in the target directory, and then
-decide what files to copy or not. If the source or target directory
-has many contents this startup phase of rsync can become a burden
-for your GitLab server. In cases like this you can make rsync's
-life easier by dividing its work in smaller pieces, and sync one
-repository at a time.
-
-In addition to rsync we will use [GNU
-Parallel](http://www.gnu.org/software/parallel/). This utility is
-not included in GitLab so you need to install it yourself with apt
-or yum. Also note that the GitLab scripts we used below were added
-in GitLab 8.1.
-
-** This process does not clean up repositories at the target location that no
-longer exist at the source. ** If you start using your GitLab instance with
-`/mnt/gitlab/repositories`, you need to run `gitlab-rake gitlab:cleanup:repos`
-after switching to the new repository storage directory.
-
-### Parallel rsync for all repositories known to GitLab
-
-This will sync repositories with 10 rsync processes at a time. We keep
-track of progress so that the transfer can be restarted if necessary.
-
-First we create a new directory, owned by 'git', to hold transfer
-logs. We assume the directory is empty before we start the transfer
-procedure, and that we are the only ones writing files in it.
-
-```
-# Omnibus
-sudo mkdir /var/opt/gitlab/transfer-logs
-sudo chown git:git /var/opt/gitlab/transfer-logs
-
-# Source
-sudo -u git -H mkdir /home/git/transfer-logs
-```
-
-We seed the process with a list of the directories we want to copy.
-
-```
-# Omnibus
-sudo -u git sh -c 'gitlab-rake gitlab:list_repos > /var/opt/gitlab/transfer-logs/all-repos-$(date +%s).txt'
-
-# Source
-cd /home/git/gitlab
-sudo -u git -H sh -c 'bundle exec rake gitlab:list_repos > /home/git/transfer-logs/all-repos-$(date +%s).txt'
-```
-
-Now we can start the transfer. The command below is idempotent, and
-the number of jobs done by GNU Parallel should converge to zero. If it
-does not some repositories listed in all-repos-1234.txt may have been
-deleted/renamed before they could be copied.
-
-```
-# Omnibus
-sudo -u git sh -c '
-cat /var/opt/gitlab/transfer-logs/* | sort | uniq -u |\
- /usr/bin/env JOBS=10 \
- /opt/gitlab/embedded/service/gitlab-rails/bin/parallel-rsync-repos \
- /var/opt/gitlab/transfer-logs/success-$(date +%s).log \
- /var/opt/gitlab/git-data/repositories \
- /mnt/gitlab/repositories
-'
-
-# Source
-cd /home/git/gitlab
-sudo -u git -H sh -c '
-cat /home/git/transfer-logs/* | sort | uniq -u |\
- /usr/bin/env JOBS=10 \
- bin/parallel-rsync-repos \
- /home/git/transfer-logs/success-$(date +%s).log \
- /home/git/repositories \
- /mnt/gitlab/repositories
-`
-```
-
-### Parallel rsync only for repositories with recent activity
-
-Suppose you have already done one sync that started after 2015-10-1 12:00 UTC.
-Then you might only want to sync repositories that were changed via GitLab
-_after_ that time. You can use the 'SINCE' variable to tell 'rake
-gitlab:list_repos' to only print repositories with recent activity.
-
-```
-# Omnibus
-sudo gitlab-rake gitlab:list_repos SINCE='2015-10-1 12:00 UTC' |\
- sudo -u git \
- /usr/bin/env JOBS=10 \
- /opt/gitlab/embedded/service/gitlab-rails/bin/parallel-rsync-repos \
- success-$(date +%s).log \
- /var/opt/gitlab/git-data/repositories \
- /mnt/gitlab/repositories
-
-# Source
-cd /home/git/gitlab
-sudo -u git -H bundle exec rake gitlab:list_repos SINCE='2015-10-1 12:00 UTC' |\
- sudo -u git -H \
- /usr/bin/env JOBS=10 \
- bin/parallel-rsync-repos \
- success-$(date +%s).log \
- /home/git/repositories \
- /mnt/gitlab/repositories
-```
+This document was moved to [administration/operations/moving_repositories](../administration/operations/moving_repositories.md).