summaryrefslogtreecommitdiff
path: root/doc/raketasks/backup_restore.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/raketasks/backup_restore.md')
-rw-r--r--doc/raketasks/backup_restore.md238
1 files changed, 174 insertions, 64 deletions
diff --git a/doc/raketasks/backup_restore.md b/doc/raketasks/backup_restore.md
index 73360c3ee4b..e9c9659d4f5 100644
--- a/doc/raketasks/backup_restore.md
+++ b/doc/raketasks/backup_restore.md
@@ -119,7 +119,7 @@ script on the GitLab task runner pod. For more details, see
[backing up a GitLab installation](https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/backup-restore/backup.md#backing-up-a-gitlab-installation).
```shell
-kubectl exec -it <gitlab task-runner pod> backup-utility
+kubectl exec -it <gitlab task-runner pod> -- backup-utility
```
Similar to the Kubernetes case, if you have scaled out your GitLab cluster to
@@ -850,66 +850,6 @@ restoring the new data, which causes an error.
Read more about [configuring NFS mounts](../administration/nfs.md)
-### Restore for installation from source
-
-First, ensure your backup tar file is in the backup directory described in the
-`gitlab.yml` configuration:
-
-```yaml
-## Backup settings
-backup:
- path: "tmp/backups" # Relative paths are relative to Rails.root (default: tmp/backups/)
-```
-
-The default is `/home/git/gitlab/tmp/backups`, and it needs to be owned by the `git` user. Now, you can begin the backup procedure:
-
-```shell
-# Stop processes that are connected to the database
-sudo service gitlab stop
-
-sudo -u git -H bundle exec rake gitlab:backup:restore RAILS_ENV=production
-```
-
-Example output:
-
-```plaintext
-Unpacking backup... [DONE]
-Restoring database tables:
--- create_table("events", {:force=>true})
- -> 0.2231s
-[...]
-- Loading fixture events...[DONE]
-- Loading fixture issues...[DONE]
-- Loading fixture keys...[SKIPPING]
-- Loading fixture merge_requests...[DONE]
-- Loading fixture milestones...[DONE]
-- Loading fixture namespaces...[DONE]
-- Loading fixture notes...[DONE]
-- Loading fixture projects...[DONE]
-- Loading fixture protected_branches...[SKIPPING]
-- Loading fixture schema_migrations...[DONE]
-- Loading fixture services...[SKIPPING]
-- Loading fixture snippets...[SKIPPING]
-- Loading fixture taggings...[SKIPPING]
-- Loading fixture tags...[SKIPPING]
-- Loading fixture users...[DONE]
-- Loading fixture users_projects...[DONE]
-- Loading fixture web_hooks...[SKIPPING]
-- Loading fixture wikis...[SKIPPING]
-Restoring repositories:
-- Restoring repository abcd... [DONE]
-- Object pool 1 ...
-Deleting tmp directories...[DONE]
-```
-
-Next, restore `/home/git/gitlab/.secret` if necessary, [as previously mentioned](#restore-prerequisites).
-
-Restart GitLab:
-
-```shell
-sudo service gitlab restart
-```
-
### Restore for Omnibus GitLab installations
This procedure assumes that:
@@ -1027,6 +967,66 @@ issue.
The GitLab Helm chart uses a different process, documented in
[restoring a GitLab Helm chart installation](https://gitlab.com/gitlab-org/charts/gitlab/blob/master/doc/backup-restore/restore.md).
+### Restore for installation from source
+
+First, ensure your backup tar file is in the backup directory described in the
+`gitlab.yml` configuration:
+
+```yaml
+## Backup settings
+backup:
+ path: "tmp/backups" # Relative paths are relative to Rails.root (default: tmp/backups/)
+```
+
+The default is `/home/git/gitlab/tmp/backups`, and it needs to be owned by the `git` user. Now, you can begin the backup procedure:
+
+```shell
+# Stop processes that are connected to the database
+sudo service gitlab stop
+
+sudo -u git -H bundle exec rake gitlab:backup:restore RAILS_ENV=production
+```
+
+Example output:
+
+```plaintext
+Unpacking backup... [DONE]
+Restoring database tables:
+-- create_table("events", {:force=>true})
+ -> 0.2231s
+[...]
+- Loading fixture events...[DONE]
+- Loading fixture issues...[DONE]
+- Loading fixture keys...[SKIPPING]
+- Loading fixture merge_requests...[DONE]
+- Loading fixture milestones...[DONE]
+- Loading fixture namespaces...[DONE]
+- Loading fixture notes...[DONE]
+- Loading fixture projects...[DONE]
+- Loading fixture protected_branches...[SKIPPING]
+- Loading fixture schema_migrations...[DONE]
+- Loading fixture services...[SKIPPING]
+- Loading fixture snippets...[SKIPPING]
+- Loading fixture taggings...[SKIPPING]
+- Loading fixture tags...[SKIPPING]
+- Loading fixture users...[DONE]
+- Loading fixture users_projects...[DONE]
+- Loading fixture web_hooks...[SKIPPING]
+- Loading fixture wikis...[SKIPPING]
+Restoring repositories:
+- Restoring repository abcd... [DONE]
+- Object pool 1 ...
+Deleting tmp directories...[DONE]
+```
+
+Next, restore `/home/git/gitlab/.secret` if necessary, [as previously mentioned](#restore-prerequisites).
+
+Restart GitLab:
+
+```shell
+sudo service gitlab restart
+```
+
### Restoring only one or a few projects or groups from a backup
Although the Rake task used to restore a GitLab instance doesn't support
@@ -1050,9 +1050,10 @@ is being discussed in [issue #17517](https://gitlab.com/gitlab-org/gitlab/-/issu
## Alternative backup strategies
-If your GitLab server contains a lot of Git repository data, you may find the
-GitLab backup script to be too slow. In this case you can consider using
-file system snapshots as part of your backup strategy.
+If your GitLab instance contains a lot of Git repository data, you may find the
+GitLab backup script to be too slow. If your GitLab instance has a lot of forked
+projects, the regular backup task also duplicates the Git data for all of them.
+In these cases, consider using file system snapshots as part of your backup strategy.
Example: Amazon EBS
@@ -1074,6 +1075,71 @@ VM snapshots of the entire GitLab server. It's not uncommon however for a VM
snapshot to require you to power down the server, which limits this solution's
practical use.
+### Back up repository data separately
+
+First, ensure you back up existing GitLab data while [skipping repositories](#excluding-specific-directories-from-the-backup):
+
+```shell
+# for Omnibus GitLab package installations
+sudo gitlab-backup create SKIP=repositories
+
+# for installations from source:
+sudo -u git -H bundle exec rake gitlab:backup:create SKIP=repositories RAILS_ENV=production
+```
+
+For manually backing up the Git repository data on disk, there are multiple possible strategies:
+
+- Use snapshots, such as the previous examples of Amazon EBS drive snapshots, or LVM snapshots + rsync.
+- Use [GitLab Geo](../administration/geo/index.md) and rely on the repository data on a Geo secondary site.
+- [Prevent writes and copy the Git repository data](#prevent-writes-and-copy-the-git-repository-data).
+- [Create an online backup by marking repositories as read-only (experimental)](#online-backup-through-marking-repositories-as-read-only-experimental).
+
+#### Prevent writes and copy the Git repository data
+
+Git repositories must be copied in a consistent way. They should not be copied during concurrent write
+operations, as this can lead to inconsistencies or corruption issues. For more details,
+[issue 270422](https://gitlab.com/gitlab-org/gitlab/-/issues/270422 "Provide documentation on preferred method of migrating Gitaly servers")
+has a longer discussion explaining the potential problems.
+
+To prevent writes to the Git repository data, there are two possible approaches:
+
+- Use [maintenance mode](../administration/maintenance_mode/index.md) to place GitLab in a read-only state.
+- Create explicit downtime by stopping all Gitaly services before backing up the repositories:
+
+ ```shell
+ sudo gitlab-ctl stop gitaly
+ # execute git data copy step
+ sudo gitlab-ctl start gitaly
+ ```
+
+You can copy Git repository data using any method, as long as writes are prevented on the data being copied
+(to prevent inconsistencies and corruption issues). In order of preference and safety, the recommended methods are:
+
+1. Use `rsync` with archive-mode, delete, and checksum options, for example:
+
+ ```shell
+ rsync -aR --delete --checksum source destination # be extra safe with the order as it will delete existing data if inverted
+ ```
+
+1. Use a [`tar` pipe to copy the entire repository's directory to another server or location](../administration/operations/moving_repositories.md#tar-pipe-to-another-server).
+
+1. Use `sftp`, `scp`, `cp`, or any other copying method.
+
+#### Online backup through marking repositories as read-only (experimental)
+
+One way of backing up repositories without requiring instance-wide downtime
+is to programmatically mark projects as read-only while copying the underlying data.
+
+There are a few possible downsides to this:
+
+- Repositories are read-only for a period of time that scales with the size of the repository.
+- Backups take a longer time to complete due to marking each project as read-only, potentially leading to inconsistencies. For example,
+ a possible date discrepancy between the last data available for the first project that gets backed up compared to
+ the last project that gets backed up.
+- Fork networks should be entirely read-only while the projects inside get backed up to prevent potential changes to the pool repository.
+
+There is an **experimental** script that attempts to automate this process in [Snippet 2149205](https://gitlab.com/-/snippets/2149205).
+
## Backup and restore for installations using PgBouncer
Do NOT backup or restore GitLab through a PgBouncer connection. These
@@ -1406,3 +1472,47 @@ If this happens, examine the following:
- Confirm there is sufficient disk space for the Gzip operation.
- If NFS is being used, check if the mount option `timeout` is set. The
default is `600`, and changing this to smaller values results in this error.
+
+### `gitaly-backup` for repository backup and restore **(FREE SELF)**
+
+> - [Introduced](https://gitlab.com/gitlab-org/gitlab/-/issues/333034) in GitLab 14.2.
+> - [Deployed behind a feature flag](../user/feature_flags.md), enabled by default.
+> - Recommended for production use.
+> - For GitLab self-managed instances, GitLab administrators can opt to [disable it](#disable-or-enable-gitaly-backup).
+
+There can be
+[risks when disabling released features](../administration/feature_flags.md#risks-when-disabling-released-features).
+Refer to this feature's version history for more details.
+
+`gitaly-backup` is used by the backup Rake task to create and restore repository backups from Gitaly.
+`gitaly-backup` replaces the previous backup method that directly calls RPCs on Gitaly from GitLab.
+
+The backup Rake task must be able to find this executable. It can be configured in Omnibus GitLab packages:
+
+1. Add the following to `/etc/gitlab/gitlab.rb`:
+
+ ```ruby
+ gitlab_rails['backup_gitaly_backup_path'] = '/path/to/gitaly-backup'
+ ```
+
+1. [Reconfigure GitLab](../administration/restart_gitlab.md#omnibus-gitlab-reconfigure)
+ for the changes to take effect
+
+#### Disable or enable `gitaly-backup`
+
+`gitaly-backup` is under development but ready for production use.
+It is deployed behind a feature flag that is **enabled by default**.
+[GitLab administrators with access to the GitLab Rails console](../administration/feature_flags.md)
+can opt to disable it.
+
+To disable it:
+
+```ruby
+Feature.disable(:gitaly_backup)
+```
+
+To enable it:
+
+```ruby
+Feature.enable(:gitaly_backup)
+```