---
stage: Systems
group: Distribution
info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#assignments
---

# Repository storage Rake tasks **(FREE SELF)**

These Rake tasks help you list existing projects and their attachments,
and migrate them to the
[hashed storage](../repository_storage_types.md) that GitLab
uses to organize Git data.

## List projects and attachments

The following Rake tasks list the projects and attachments that are
available on legacy and hashed storage.

### On legacy storage

To display a summary, followed by a list, of the projects and attachments that use legacy storage:

- **Omnibus installation**

  ```shell
  # Projects
  sudo gitlab-rake gitlab:storage:legacy_projects
  sudo gitlab-rake gitlab:storage:list_legacy_projects

  # Attachments
  sudo gitlab-rake gitlab:storage:legacy_attachments
  sudo gitlab-rake gitlab:storage:list_legacy_attachments
  ```

- **Source installation**

  ```shell
  # Projects
  sudo -u git -H bundle exec rake gitlab:storage:legacy_projects RAILS_ENV=production
  sudo -u git -H bundle exec rake gitlab:storage:list_legacy_projects RAILS_ENV=production

  # Attachments
  sudo -u git -H bundle exec rake gitlab:storage:legacy_attachments RAILS_ENV=production
  sudo -u git -H bundle exec rake gitlab:storage:list_legacy_attachments RAILS_ENV=production
  ```

### On hashed storage

To display a summary, followed by a list, of the projects and attachments that use hashed storage:

- **Omnibus installation**

  ```shell
  # Projects
  sudo gitlab-rake gitlab:storage:hashed_projects
  sudo gitlab-rake gitlab:storage:list_hashed_projects

  # Attachments
  sudo gitlab-rake gitlab:storage:hashed_attachments
  sudo gitlab-rake gitlab:storage:list_hashed_attachments
  ```

- **Source installation**

  ```shell
  # Projects
  sudo -u git -H bundle exec rake gitlab:storage:hashed_projects RAILS_ENV=production
  sudo -u git -H bundle exec rake gitlab:storage:list_hashed_projects RAILS_ENV=production

  # Attachments
  sudo -u git -H bundle exec rake gitlab:storage:hashed_attachments RAILS_ENV=production
  sudo -u git -H bundle exec rake gitlab:storage:list_hashed_attachments RAILS_ENV=production
  ```

## Migrate to hashed storage

WARNING:
In GitLab 13.0, [hashed storage](../repository_storage_types.md#hashed-storage)
is enabled by default and the legacy storage is deprecated.
GitLab 14.0 eliminates support for legacy storage. If you're on GitLab
13.0 and later, switching new projects to legacy storage is not possible.
The option to choose between hashed and legacy storage in the Admin Area has
been disabled.

Run this task on any machine that has Rails/Sidekiq configured. The task
schedules all your existing projects, and the attachments associated with them, to be
migrated to the **Hashed** storage type:

- **Omnibus installation**

  ```shell
  sudo gitlab-rake gitlab:storage:migrate_to_hashed
  ```

- **Source installation**

  ```shell
  sudo -u git -H bundle exec rake gitlab:storage:migrate_to_hashed RAILS_ENV=production
  ```

If you have any existing integrations, consider doing a small rollout first
to validate the migration. To do so, specify a range of project IDs with
the environment variables `ID_FROM` and `ID_TO`. For example, to limit the rollout
to project IDs 50 to 100 in an Omnibus GitLab installation:

```shell
sudo gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=50 ID_TO=100
```
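
A larger instance can be migrated in several such ranges. The following is a minimal sketch, not part of GitLab: `batch_commands` is a hypothetical helper that only prints one command per batch, so you can review the list before running anything. It assumes that `ID_FROM` and `ID_TO` are both inclusive, as the example above suggests.

```shell
# Hypothetical helper: print one migrate command per batch of project IDs.
# Arguments: first ID, last ID, batch size. Nothing is executed (dry run).
batch_commands() {
  first=$1
  last=$2
  size=$3
  from=$first
  while [ "$from" -le "$last" ]; do
    to=$((from + size - 1))
    # Do not run past the last requested ID.
    if [ "$to" -gt "$last" ]; then
      to=$last
    fi
    echo "sudo gitlab-rake gitlab:storage:migrate_to_hashed ID_FROM=$from ID_TO=$to"
    from=$((to + 1))
  done
}

# Split the range 50 to 100 into batches of 20 project IDs.
batch_commands 50 100 20
```

This call prints three commands, covering IDs 50 to 69, 70 to 89, and 90 to 100. Run each printed command only after the previous batch has drained from the Sidekiq queue.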

To monitor the progress in GitLab:

1. On the top bar, select **Menu > Admin**.
1. On the left sidebar, select **Monitoring > Background Jobs**.
1. Watch how long the `hashed_storage:hashed_storage_project_migrate` queue
   takes to finish. After the queue size reaches zero, you can confirm that every project
   has been migrated by running the
   [commands that list projects and attachments](#list-projects-and-attachments).

If some projects were not migrated, you can run the migration task again to schedule them.
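
To script the check for remaining legacy projects, you can parse the summary line of the listing output. The sketch below is illustrative only: `legacy_count` is a hypothetical helper, and the line format it matches is assumed from the sample output in the [Troubleshooting](#troubleshooting) section, so verify it against the output of your GitLab version.

```shell
# Hypothetical helper: read task output on standard input and print the number
# from a summary line such as "* Found 1 projects using Legacy Storage".
# Returns a non-zero status if no such line is found, so that a failed or
# changed output is not mistaken for "zero projects left".
legacy_count() {
  count=$(sed -n 's/^\* Found \([0-9][0-9]*\) projects using Legacy Storage.*/\1/p' | head -n 1)
  [ -n "$count" ] || return 1
  echo "$count"
}

# Example with sample output. On a real Omnibus instance, pipe in the output of
# `sudo gitlab-rake gitlab:storage:legacy_projects` instead.
printf '%s\n' '* Found 1 projects using Legacy Storage' '- janedoe/testproject (id: 1234)' | legacy_count
```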

Any error or warning is logged in Sidekiq's log file.

If [Geo](../geo/index.md) is enabled, each project that is successfully migrated
generates an event to replicate the changes on any **secondary** nodes.

You only need the `gitlab:storage:migrate_to_hashed` Rake task to migrate your repositories, but there are
[additional commands](#list-projects-and-attachments) to help you inspect projects and attachments in both legacy and hashed storage.

## Rollback from hashed storage to legacy storage

WARNING:
In GitLab 13.0, [hashed storage](../repository_storage_types.md#hashed-storage)
is enabled by default and the legacy storage is deprecated.
GitLab 14.0 eliminates support for legacy storage. If you're on GitLab
13.0 and later, switching new projects to legacy storage is not possible.
The option to choose between hashed and legacy storage in the Admin Area has
been disabled.

This task schedules all your existing projects and associated attachments to be rolled back to the
legacy storage type.

- **Omnibus installation**

  ```shell
  sudo gitlab-rake gitlab:storage:rollback_to_legacy
  ```

- **Source installation**

  ```shell
  sudo -u git -H bundle exec rake gitlab:storage:rollback_to_legacy RAILS_ENV=production
  ```

If you have any existing integrations, consider doing a small rollback first
to validate it. To do so, specify a range of project IDs with
the environment variables `ID_FROM` and `ID_TO`. For example, to limit the rollback
to project IDs 50 to 100 in an Omnibus GitLab installation:

```shell
sudo gitlab-rake gitlab:storage:rollback_to_legacy ID_FROM=50 ID_TO=100
```

You can monitor the progress in the **Admin Area > Monitoring > Background Jobs** page.
On the **Queues** tab, you can watch the `hashed_storage:hashed_storage_project_rollback` queue to see how long the process takes to finish.

After the queue size reaches zero, you can confirm that every project has been rolled back by running the
[commands that list projects and attachments](#list-projects-and-attachments).
If some projects weren't rolled back, you can run the rollback task again to schedule further rollbacks.
Any error or warning is logged in Sidekiq's log file.

If you have a Geo setup, the rollback is not reflected automatically
on the **secondary** node. You may need to wait for a backfill operation to kick in, and then
manually remove the remaining repositories from the special `@hashed/` folder.

## Troubleshooting

The Rake task might not be able to complete the migration to hashed storage.
In that case, checks on the instance continue to report that legacy data exists:

```plaintext
* Found 1 projects using Legacy Storage
- janedoe/testproject (id: 1234)
```

If you have a subscription, [raise a ticket with GitLab support](https://support.gitlab.com),
because most of the fixes are relatively high risk and involve running code on the Rails console.

### Read only projects

Projects that you have [set to read-only](../troubleshooting/gitlab_rails_cheat_sheet.md#make-a-project-read-only-can-only-be-done-in-the-console)
might fail to migrate.

1. [Start a Rails console](../operations/rails_console.md#starting-a-rails-console-session).

1. Check if the project is read only:

   ```ruby
   project = Project.find_by_full_path('janedoe/testproject')
   project.repository_read_only
   ```

1. If it returns `true` (not `nil` or `false`), set it writable:

   ```ruby
   project.update!(repository_read_only: false)
   ```

1. [Re-run the migration Rake task](#migrate-to-hashed-storage).

1. Set the project read-only again:

   ```ruby
   project.update!(repository_read_only: true)
   ```

### Projects pending deletion

Check the project details in the Admin Area. If deleting the project failed,
it shows as `Marked For Deletion At ..`, `Scheduled Deletion At ..` and
`pending removal`, but the dates are not recent.

Delete the project using the Rails console:

1. [Start a Rails console](../operations/rails_console.md#starting-a-rails-console-session).

1. With the following code, select the project to delete and the account that performs the deletion:

   ```ruby
   project = Project.find_by_full_path('janedoe/testproject')
   user = User.find_by_username('admin_handle')
   puts "\nproject selected for deletion is:\nID: #{project.id}\nPATH: #{project.full_path}\nNAME: #{project.name}\n\n"
   ```

   - Replace `janedoe/testproject` with your project path from the Rake task output or from the Admin Area.
   - Replace `admin_handle` with the handle of an instance administrator or with `root`.
   - Verify the output before proceeding. **There are no other checks performed**.

1. [Destroy the project](../troubleshooting/gitlab_rails_cheat_sheet.md#destroy-a-project) **immediately**:

   ```ruby
   Projects::DestroyService.new(project, user).execute
   ```

If destroying the project generates a stack trace relating to encryption or the error `OpenSSL::Cipher::CipherError`:

1. [Verify your GitLab secrets](check.md#verify-database-values-can-be-decrypted-using-the-current-secrets).

1. If the affected projects have secrets that cannot be decrypted, you must remove those specific secrets.
   [Our documentation for dealing with lost secrets](../../raketasks/backup_restore.md#when-the-secrets-file-is-lost)
   covers the loss of all secrets, but it's possible for only specific projects to be affected. For example,
   to [reset specific runner registration tokens](../../raketasks/backup_restore.md#reset-runner-registration-tokens)
   for a specific project ID:

   ```sql
   UPDATE projects SET runners_token = NULL, runners_token_encrypted = NULL WHERE id = 1234;
   ```

### `Repository cannot be moved from` errors in Sidekiq log

Projects might fail to migrate with errors in the Sidekiq log:

```shell
# grep 'Repository cannot be moved' /var/log/gitlab/sidekiq/current
{"severity":"ERROR","time":"2021-02-29T02:29:02.021Z","message":"Repository cannot be moved from 'janedoe/testproject' to '@hashed<value>' (PROJECT_ID=1234)"}
```

This might be caused by [a bug](https://gitlab.com/gitlab-org/gitlab/-/issues/259605) in the original code for hashed storage migration.

[There is a workaround for projects still affected by this issue](https://gitlab.com/-/snippets/2039252).
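
To see which projects are affected, you can extract the project IDs from the matching log lines. This is a minimal sketch, assuming the log lines have the format shown above; `failed_project_ids` is a hypothetical helper, not a GitLab command.

```shell
# Hypothetical helper: read Sidekiq log lines on standard input and print the
# unique project IDs from "Repository cannot be moved from" errors.
failed_project_ids() {
  sed -n 's/.*Repository cannot be moved from.*(PROJECT_ID=\([0-9][0-9]*\)).*/\1/p' | sort -n -u
}

# Example with the sample log line from this section. On an Omnibus instance,
# read /var/log/gitlab/sidekiq/current instead.
failed_project_ids <<'EOF'
{"severity":"ERROR","time":"2021-02-29T02:29:02.021Z","message":"Repository cannot be moved from 'janedoe/testproject' to '@hashed<value>' (PROJECT_ID=1234)"}
EOF
```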