summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorEvan Read <eread@gitlab.com>2019-03-15 06:41:12 +0000
committerEvan Read <eread@gitlab.com>2019-03-15 06:41:12 +0000
commit09669e2c3c60e7fbb0369d50ec30cd4490ce4846 (patch)
treeaf7ddca5a3280f9fd739aa028d8722c14061cbdf
parent82c0816d3fdd96412605845fa337b74b6a3534c4 (diff)
parent823695ee37113e84dabffee7d8f310bca294b234 (diff)
downloadgitlab-ce-09669e2c3c60e7fbb0369d50ec30cd4490ce4846.tar.gz
Merge branch '58738-hashed-storage-document-rollback-mechanism' into 'master'
Hashed Storage: Document Rollback mechanism Closes #58738 See merge request gitlab-org/gitlab-ce!25960
-rw-r--r--doc/administration/raketasks/storage.md48
-rw-r--r--doc/administration/repository_storage_types.md111
2 files changed, 112 insertions, 47 deletions
diff --git a/doc/administration/raketasks/storage.md b/doc/administration/raketasks/storage.md
index 7ad38abe4f5..c39fef907db 100644
--- a/doc/administration/raketasks/storage.md
+++ b/doc/administration/raketasks/storage.md
@@ -34,17 +34,59 @@ export ID_FROM=20
export ID_TO=50
```
-You can monitor the progress in the _Admin > Monitoring > Background jobs_ screen.
-There is a specific Queue you can watch to see how long it will take to finish: **project_migrate_hashed_storage**
+You can monitor the progress in the **Admin Area > Monitoring > Background Jobs** page.
+There is a specific Queue you can watch to see how long it will take to finish:
+`hashed_storage:hashed_storage_project_migrate`
After it reaches zero, you can confirm every project has been migrated by running the commands bellow.
If you find it necessary, you can run this migration script again to schedule missing projects.
-Any error or warning will be logged in the sidekiq's log file.
+Any error or warning will be logged in Sidekiq's log file.
You only need the `gitlab:storage:migrate_to_hashed` rake task to migrate your repositories, but we have additional
commands below that helps you inspect projects and attachments in both legacy and hashed storage.
+## Rollback from Hashed storage to Legacy storage
+
+If you need to rollback the storage migration for any reason, you can follow the steps described here.
+
+NOTE: **Note:** Hashed Storage will be required in future version of GitLab.
+
+To prevent new projects from being created in the Hashed storage,
+you need to undo the [enable hashed storage][storage-migration] changes.
+
+This task will schedule all your existing projects and associated attachments to be rolled back to the
+Legacy storage type.
+
+For Omnibus installations, run the following:
+
+```bash
+sudo gitlab-rake gitlab:storage:rollback_to_legacy
+```
+
+For source installations, run the following:
+
+```bash
+sudo -u git -H bundle exec rake gitlab:storage:rollback_to_legacy RAILS_ENV=production
+```
+
+Both commands accept a range as environment variable:
+
+```bash
+# to rollback any migrated project from ID 20 to 50.
+export ID_FROM=20
+export ID_TO=50
+```
+
+You can monitor the progress in the **Admin Area > Monitoring > Background Jobs** page.
+On the **Queues** tab, you can watch the `hashed_storage:hashed_storage_project_rollback` queue to see how long the process will take to finish.
+
+
+After it reaches zero, you can confirm every project has been rolled back by running the commands bellow.
+If some projects weren't rolled back, you can run this rollback script again to schedule further rollbacks.
+
+Any error or warning will be logged in Sidekiq's log file.
+
## List projects on Legacy storage
To have a simple summary of projects using **Legacy** storage:
diff --git a/doc/administration/repository_storage_types.md b/doc/administration/repository_storage_types.md
index 4934aaf39f7..40f7c5566ac 100644
--- a/doc/administration/repository_storage_types.md
+++ b/doc/administration/repository_storage_types.md
@@ -2,6 +2,24 @@
> [Introduced][ce-28283] in GitLab 10.0.
+Two different storage layouts can be used
+to store the repositories on disk and their characteristics.
+
+GitLab can be configured to use one or multiple repository shard locations
+that can be:
+
+- Mounted to the local disk
+- Exposed as an NFS shared volume
+- Acessed via [gitaly] on its own machine.
+
+In GitLab, this is configured in `/etc/gitlab/gitlab.rb` by the `git_data_dirs({})`
+configuration hash. The storage layouts discussed here will apply to any shard
+defined in it.
+
+The `default` repository shard that is available in any installations
+that haven't customized it, points to the local folder: `/var/opt/gitlab/git-data`.
+Anything discussed below is expected to be part of that folder.
+
## Legacy Storage
Legacy Storage is the storage behavior prior to version 10.0. For historical
@@ -66,34 +84,7 @@ by another folder with the next 2 characters. They are both stored in a special
"@hashed/#{hash[0..1]}/#{hash[2..3]}/#{hash}.wiki.git"
```
-### How to migrate to Hashed Storage
-
-In GitLab, go to **Admin > Settings**, find the **Repository Storage** section
-and select "_Use hashed storage paths for newly created and renamed projects_".
-
-To migrate your existing projects to the new storage type, check the specific
-[rake tasks].
-
-[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
-[rake tasks]: raketasks/storage.md#migrate-existing-projects-to-hashed-storage
-[storage-paths]: repository_storage_types.md
-
-#### Rollback
-
-There is no automated rollback implemented. Below are the steps required to rollback
-from each storage migration.
-
-The rollback has to be performed in the reverse order. To get into "Legacy" state,
-you need to rollback Attachments first, then Project.
-
-Also note that if Geo is enabled, after the migration was triggered, an event is generated
-to replicate the operation on any Secondary node. That means the on disk changes will also
-need to be performed on these nodes as well. Database changes will propagate without issues.
-
-You must make sure the migration event was already processed or otherwise it may migrate
-the files back to Hashed state again.
-
-#### Hashed object pools
+### Hashed object pools
For deduplication of public forks and their parent repository, objects are pooled
in an object pool. These object pools are a third repository where shared objects
@@ -110,36 +101,60 @@ enabled for individual projects by executing
be on hashed storage, should not be a fork itself, and hashed storage should be
enabled for all new projects.
-##### Attachments
+### How to migrate to Hashed Storage
-To rollback single Attachment migration, rename `aa/bb/abcdef1234567890...` folder back to `namespace/project`.
+To start a migration, enable Hashed Storage for new projects:
+
+1. Go to **Admin > Settings** and expand the **Repository Storage** section.
+2. Select the **Use hashed storage paths for newly created and renamed projects** checkbox.
-Both folder names can be generated by the `FileUploader.absolute_base_dir(project)`, you
-just need to switch the version from the `project` back to the previous one.
+Check if the change breaks any existing integration you may have that
+either runs on the same machine as your repositories are located, or may login to that machine
+to access data (for example, a remote backup solution).
-```ruby
-project.storage_version
-# => 2
+To schedule a complete rollout, see the
+[rake task documentation for storage migration][rake/migrate-to-hashed] for instructions.
-FileUploader.absolute_base_dir(project)
-# => "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/@hashed/d4/73/d4735e3a265e16eee03f59718b9b5d03019c07d8b6c51f90da3a666eec13ab35"
+If you do have any existing integration, you may want to do a small rollout first,
+to validate. You can do so by specifying a range with the operation.
-project.storage_version = 1
+This is an example of how to limit the rollout to Project IDs 50 to 100, running in
+an Omnibus Gitlab installation:
-FileUploader.absolute_base_dir(project)
-# => "/opt/gitlab/embedded/service/gitlab-rails/public/uploads/gitlab/gitlab-shell-renamed"
+```bash
+sudo gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=50 ID_TO=100
```
-##### Project
+Check the [documentation][rake/migrate-to-hashed] for additional information and instructions for
+source-based installation.
+
+#### Rollback
+
+Similar to the migration, to disable Hashed Storage for new
+projects:
-To rollback single Project migration, move `@hashed/aa/bb/aabbcdef1234567890abcdef.git` and `@hashed/aa/bb/aabbcdef1234567890abcdef.wiki.git`
-back to `namespace/project.git` and `namespace/project.wiki.git` respectively and switch the version from the `project` back to `null`.
+1. Go to **Admin > Settings** and expand the **Repository Storage** section.
+2. Uncheck the **Use hashed storage paths for newly created and renamed projects** checkbox.
+
+To schedule a complete rollback, see the
+[rake task documentation for storage rollback][rake/rollback-to-legacy] for instructions.
+
+The rollback task also supports specifying a range of Project IDs. Here is an example
+of limiting the rollout to Project IDs 50 to 100, in an Omnibus Gitlab installation:
+
+```bash
+sudo gitlab-rake gitlab:storage:rollback_to_legacy ID_FROM=50 ID_TO=100
+```
+
+If you have a Geo setup, please note that the rollback will not be reflected automatically
+on the **secondary** node. You may need to wait for a backfill operation to kick-in and remove
+the remaining repositories from the special `@hashed/` folder manually.
### Hashed Storage coverage
We are incrementally moving every storable object in GitLab to the Hashed
Storage pattern. You can check the current coverage status below (and also see
-the [issue](https://gitlab.com/gitlab-com/infrastructure/issues/2821)).
+the [issue][ce-2821]).
Note that things stored in an S3 compatible endpoint will not have the downsides
mentioned earlier, if they are not prefixed with `#{namespace}/#{project_name}`,
@@ -156,6 +171,7 @@ which is true for CI Cache and LFS Objects.
| CI Artifacts | No | No | Yes | 9.4 / 10.6 |
| CI Cache | No | No | Yes | - |
| LFS Objects | Yes | Similar | Yes | 10.0 / 10.7 |
+| Repository pools| No | Yes | - | 11.6 |
#### Implementation Details
@@ -180,3 +196,10 @@ LFS Objects implements a similar storage pattern using 2 chars, 2 level folders,
```
They are also S3 compatible since **10.0** (GitLab Premium), and available in GitLab Core since **10.7**.
+
+[ce-2821]: https://gitlab.com/gitlab-com/infrastructure/issues/2821
+[ce-28283]: https://gitlab.com/gitlab-org/gitlab-ce/issues/28283
+[rake/migrate-to-hashed]: raketasks/storage.md#migrate-existing-projects-to-hashed-storage
+[rake/rollback-to-legacy]: raketasks/storage.md#rollback
+[storage-paths]: repository_storage_types.md
+[gitaly]: gitaly/index.md