# GitLab Git LFS Administration

Documentation on how to use Git LFS is available in [Managing large binary files with Git LFS](manage_large_binaries_with_git_lfs.md).

## Requirements

* Git LFS is supported in GitLab starting with version 8.2.
* Support for object storage, such as AWS S3, was introduced in 10.0.
* Users need to install the [Git LFS client](https://git-lfs.github.com), version 1.0.1 or later.

## Configuration

Git LFS objects can be large in size. By default, they are stored on the server
GitLab is installed on.

There are various configuration options to help GitLab server administrators:

* Enabling/disabling Git LFS support
* Changing the location of LFS object storage
* Setting up an object storage supported by [Fog](http://fog.io/about/provider_documentation.html)

### Configuration for Omnibus installations

In `/etc/gitlab/gitlab.rb`:

```ruby
# Change to true to enable lfs
gitlab_rails['lfs_enabled'] = false

# Optionally, change the storage path location. Defaults to
# `#{gitlab_rails['shared_path']}/lfs-objects`. Which evaluates to
# `/var/opt/gitlab/gitlab-rails/shared/lfs-objects` by default.
gitlab_rails['lfs_storage_path'] = "/mnt/storage/lfs-objects"
```

### Configuration for installations from source

In `config/gitlab.yml`:

```yaml
# Change to true to enable lfs
  lfs:
    enabled: false
    storage_path: /mnt/storage/lfs-objects
```

## Storing LFS objects to an object storage

> [Introduced][ee-2760] in [GitLab Premium][eep] 10.0. Brought to GitLab Core
in 10.7.

It is possible to store LFS objects in remote object storage, which lets you
offload read/write operations from the local hard disk and free up a significant
amount of disk space. GitLab is tightly integrated with `Fog`, so you can refer to
the [Fog provider documentation](http://fog.io/about/provider_documentation.html)
to check which object storage services can be integrated with GitLab.
You can also use object storage in a private local network. For example,
[Minio](https://www.minio.io/) is a standalone object storage service that is easy to set up and works well with GitLab instances.

GitLab provides two different upload mechanisms: "Direct upload" and "Background upload".

**Option 1. Direct upload**

1. User pushes an LFS file to the GitLab instance
1. GitLab-workhorse uploads the file to the object storage
1. GitLab-workhorse notifies GitLab-rails that the upload is done

**Option 2. Background upload**

1. User pushes an LFS file to the GitLab instance
1. GitLab-rails stores the file in the local file storage
1. GitLab-rails uploads the file to object storage asynchronously

The following general settings are supported.

| Setting | Description | Default |
|---------|-------------|---------|
| `enabled` | Enable/disable object storage | `false` |
| `remote_directory` | The bucket name where LFS objects will be stored| |
| `direct_upload` | Set to true to enable direct upload of LFS objects without the need for local shared storage. This option may be removed once only a single storage is supported for all files. | `false` |
| `background_upload` | Set to false to disable automatic upload. This option may be removed once uploads go directly to S3 | `true` |
| `proxy_download` | Set to true to proxy all files served through GitLab. Leaving it set to false reduces egress traffic, because clients download directly from remote storage instead of having all data proxied | `false` |
| `connection` | Various connection options described below | |

The `connection` settings match those provided by [Fog](https://github.com/fog).

Here is a configuration example with S3.

| Setting | Description | Example |
|---------|-------------|---------|
| `provider` | The provider name | AWS |
| `aws_access_key_id` | AWS credentials, or compatible | `ABC123DEF456` |
| `aws_secret_access_key` | AWS credentials, or compatible | `ABC123DEF456ABC123DEF456ABC123DEF456` |
| `region` | AWS region | us-east-1 |
| `host` | S3 compatible host for when not using AWS, e.g. `localhost` or `storage.example.com` | s3.amazonaws.com |
| `endpoint` | Can be used when configuring an S3 compatible service such as [Minio](https://www.minio.io), by entering a URL such as `http://127.0.0.1:9000` | (optional) |
| `path_style` | Set to true to use `host/bucket_name/object` style paths instead of `bucket_name.host/object`. Leave as false for AWS S3 | false |
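
The difference between the two addressing styles selected by `path_style` can be illustrated with a short Ruby sketch. The `object_url` helper and the object key below are hypothetical and exist only for illustration; they are not part of GitLab or Fog.

```ruby
# Hypothetical helper, for illustration only: builds the URL that an
# S3-style client would address for an object under each style.
def object_url(host:, bucket:, key:, path_style:, scheme: 'https')
  if path_style
    # host/bucket_name/object
    "#{scheme}://#{host}/#{bucket}/#{key}"
  else
    # bucket_name.host/object
    "#{scheme}://#{bucket}.#{host}/#{key}"
  end
end

# Path style, as typically needed for a local S3-compatible service.
puts object_url(host: '127.0.0.1:9000', bucket: 'lfs-objects',
                key: 'ab/cd/ef', path_style: true, scheme: 'http')

# Virtual-hosted style, as used with AWS S3.
puts object_url(host: 's3.amazonaws.com', bucket: 'lfs-objects',
                key: 'ab/cd/ef', path_style: false)
```

With `path_style` set to true the bucket name stays in the path, so no per-bucket DNS name is required, which is why it suits hosts addressed by IP such as `127.0.0.1:9000`.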

Here is a configuration example with GCS.

| Setting | Description | Example |
|---------|-------------|---------|
| `provider` | The provider name | `Google` |
| `google_project` | GCP project name | `gcp-project-12345` |
| `google_client_email` | The email address of a service account | `foo@gcp-project-12345.iam.gserviceaccount.com` |
| `google_json_key_location` | The path to the JSON key file | `/path/to/gcp-project-12345-abcde.json` |

_NOTE: The service account must have permission to access the bucket. For details, see https://cloud.google.com/storage/docs/authentication_
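
As a sketch of how the GCS settings above might look on an Omnibus installation, the following `/etc/gitlab/gitlab.rb` fragment reuses the example values from the table. It assumes the same `lfs_object_store_` prefix used for the S3 configuration in this document; the bucket name, project, service account, and key path are placeholders.

```ruby
# Sketch only: all values are placeholders taken from the table above.
gitlab_rails['lfs_object_store_enabled'] = true
gitlab_rails['lfs_object_store_remote_directory'] = "lfs-objects"
gitlab_rails['lfs_object_store_connection'] = {
  'provider' => 'Google',
  'google_project' => 'gcp-project-12345',
  'google_client_email' => 'foo@gcp-project-12345.iam.gserviceaccount.com',
  'google_json_key_location' => '/path/to/gcp-project-12345-abcde.json'
}
```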

### Manual uploading to an object storage

There are two ways to manually perform the same upload that the automatic mechanisms described above carry out.

**Option 1: rake task**

```
$ rake gitlab:lfs:migrate
```

**Option 2: rails console**

```
$ sudo gitlab-rails console            # Login to rails console

> # Upload LFS files manually
> LfsObject.where(file_store: [nil, 1]).find_each do |lfs_object|
>   lfs_object.file.migrate!(ObjectStorage::Store::REMOTE) if lfs_object.file.file.exists?
> end
```
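
The selection logic in the console snippet can be sketched in plain Ruby without a GitLab instance. The example assumes that `nil` and `1` in `file_store` mean "stored locally" and that `ObjectStorage::Store::REMOTE` is `2`; the `LfsRecord` struct and `migration_candidates` method are hypothetical stand-ins, not GitLab classes.

```ruby
# Values assumed to mirror GitLab's ObjectStorage::Store constants.
LOCAL = 1
REMOTE = 2

# Stand-in for an LfsObject row; not the real model.
LfsRecord = Struct.new(:oid, :file_store, :on_disk)

# Select records that are still local (nil or LOCAL) and whose file
# actually exists, mirroring the `where` and `exists?` checks above.
def migration_candidates(records)
  records.select { |r| [nil, LOCAL].include?(r.file_store) && r.on_disk }
end

records = [
  LfsRecord.new('a1', nil, true),     # legacy row without a store value
  LfsRecord.new('b2', LOCAL, true),   # local and present on disk
  LfsRecord.new('c3', LOCAL, false),  # local but file is missing: skipped
  LfsRecord.new('d4', REMOTE, true)   # already in object storage: skipped
]

puts migration_candidates(records).map(&:oid).inspect
# prints ["a1", "b2"]
```

Skipping records whose file is missing keeps the migration from failing on rows that have no local file to upload.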

### S3 for Omnibus installations

On Omnibus installations, the settings are prefixed by `lfs_object_store_`:

1. Edit `/etc/gitlab/gitlab.rb` and add the following lines, replacing the
   values with your own:

	```ruby
	gitlab_rails['lfs_object_store_enabled'] = true
	gitlab_rails['lfs_object_store_remote_directory'] = "lfs-objects"
	gitlab_rails['lfs_object_store_connection'] = {
	  'provider' => 'AWS',
	  'region' => 'eu-central-1',
	  'aws_access_key_id' => '1ABCD2EFGHI34JKLM567N',
	  'aws_secret_access_key' => 'abcdefhijklmnopQRSTUVwxyz0123456789ABCDE',
	  # The below options configure an S3 compatible host instead of AWS
	  'host' => 'localhost',
	  'endpoint' => 'http://127.0.0.1:9000',
	  'path_style' => true
	}
	```

1. Save the file and [reconfigure GitLab][] for the changes to take effect.
1. Migrate any existing local LFS objects to the object storage:

    ```bash
    gitlab-rake gitlab:lfs:migrate
    ```

    This will migrate existing LFS objects to object storage. New LFS objects
    will be forwarded to object storage unless
    `gitlab_rails['lfs_object_store_background_upload']` is set to false.
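
The general settings listed earlier can be set the same way. The fragment below is a sketch for switching to direct upload. It assumes the `direct_upload` and `proxy_download` keys follow the same `lfs_object_store_` prefix rule as `lfs_object_store_background_upload`, and that background upload is turned off when direct upload is used; verify both assumptions against your GitLab version.

```ruby
# Sketch only: key names are derived from the `lfs_object_store_` prefix rule.
gitlab_rails['lfs_object_store_enabled'] = true
gitlab_rails['lfs_object_store_direct_upload'] = true
gitlab_rails['lfs_object_store_background_upload'] = false
# Keep the default so clients download directly from remote storage.
gitlab_rails['lfs_object_store_proxy_download'] = false
```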

### S3 for installations from source

For source installations the settings are nested under `lfs:` and then
`object_store:`:

1. Edit `/home/git/gitlab/config/gitlab.yml` and add or amend the following
   lines:

	```yaml
	lfs:
	  enabled: true
	  object_store:
	    enabled: true
	    remote_directory: lfs-objects # Bucket name
	    connection:
	      provider: AWS
	      aws_access_key_id: 1ABCD2EFGHI34JKLM567N
	      aws_secret_access_key: abcdefhijklmnopQRSTUVwxyz0123456789ABCDE
	      region: eu-central-1
	      # Use the following options to configure an S3-compatible host such as Minio
	      host: 'localhost'
	      endpoint: 'http://127.0.0.1:9000'
	      path_style: true
	```

1. Save the file and [restart GitLab][] for the changes to take effect.
1. Migrate any existing local LFS objects to the object storage:

    ```bash
    sudo -u git -H bundle exec rake gitlab:lfs:migrate RAILS_ENV=production
    ```

    This will migrate existing LFS objects to object storage. New LFS objects
    will be forwarded to object storage unless `background_upload` is set to
    false.

## Storage statistics

You can see the total storage used for LFS objects on groups and projects
in the administration area, as well as through the [groups](../../api/groups.md)
and [projects APIs](../../api/projects.md).
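
As a minimal sketch of reading the LFS figure through the projects API, the Ruby below requests a project with `statistics=true` and reads `statistics.lfs_objects_size` from the response. The host, project ID, and token are placeholders, the method names are hypothetical, and the field name should be checked against the API documentation for your GitLab version.

```ruby
require 'json'
require 'net/http'
require 'uri'

# Build the project endpoint URL with statistics enabled.
def project_statistics_uri(base_url, project_id)
  uri = URI.join(base_url, "/api/v4/projects/#{project_id}")
  uri.query = URI.encode_www_form(statistics: true)
  uri
end

# Extract the LFS size in bytes from a decoded project response.
# Returns nil when the statistics block is absent.
def lfs_objects_size(project)
  project.dig('statistics', 'lfs_objects_size')
end

# Fetch the project and return its LFS storage size in bytes.
# Performs a network call; the token is sent as a PRIVATE-TOKEN header.
def fetch_lfs_objects_size(base_url, project_id, token)
  uri = project_statistics_uri(base_url, project_id)
  request = Net::HTTP::Get.new(uri)
  request['PRIVATE-TOKEN'] = token
  response = Net::HTTP.start(uri.host, uri.port, use_ssl: uri.scheme == 'https') do |http|
    http.request(request)
  end
  lfs_objects_size(JSON.parse(response.body))
end

# Example call (placeholders, not run here):
# fetch_lfs_objects_size('https://gitlab.example.com', 42, 'YOUR_TOKEN')
```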

## Troubleshooting: `Google::Apis::TransmissionError: execution expired`

If LFS integration is configured with Google Cloud Storage and background upload (`background_upload: true` and `direct_upload: false`),
Sidekiq workers may encounter this error. It occurs when the upload of a very large file times out.
For the record, LFS files of up to 6GB can be uploaded without any extra steps; for larger files you need the following workaround.

```shell
$ sudo gitlab-rails console            # Login to rails console

> # Set up timeouts. 20 minutes is enough to upload 30GB LFS files.
> # These settings are only effective in the current session, i.e. they do not apply to Sidekiq workers.
> ::Google::Apis::ClientOptions.default.open_timeout_sec = 1200
> ::Google::Apis::ClientOptions.default.read_timeout_sec = 1200
> ::Google::Apis::ClientOptions.default.send_timeout_sec = 1200

> # Upload LFS files manually. This process does not use sidekiq at all.
> LfsObject.where(file_store: [nil, 1]).find_each do |lfs_object|
>   lfs_object.file.migrate!(ObjectStorage::Store::REMOTE) if lfs_object.file.file.exists?
> end
```

For more information, see https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/19581
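
The figure of 20 minutes for 30GB in the workaround implies a sustained throughput of roughly 25 MB/s (30,000 MB in 1,200 seconds, using decimal units). The sketch below shows how a timeout for other file sizes could be estimated on that basis. The `required_timeout_sec` helper is hypothetical, and the throughput you actually get depends on your network, so treat the result as a rough lower bound.

```ruby
# Hypothetical helper: estimate the timeout (in seconds) needed to upload
# a file of size_gb gigabytes at a sustained throughput in MB/s.
# Uses decimal units (1 GB = 1000 MB) and rounds up.
def required_timeout_sec(size_gb, throughput_mb_per_sec)
  ((size_gb * 1000) / throughput_mb_per_sec.to_f).ceil
end

puts required_timeout_sec(30, 25)  # 30GB at 25 MB/s
# prints 1200
```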

## Known limitations

* Support for removing unreferenced LFS objects was added in 8.14.
* LFS authentication via SSH was added in GitLab 8.12.
* Only compatible with Git LFS client versions 1.1.0 and up, or 1.0.2.
* The storage statistics currently count each LFS object once for
  every project linking to it, so shared objects are counted multiple times.

[reconfigure gitlab]: ../../administration/restart_gitlab.md#omnibus-gitlab-reconfigure "How to reconfigure Omnibus GitLab"
[restart gitlab]: ../../administration/restart_gitlab.md#installations-from-source "How to restart GitLab"
[eep]: https://about.gitlab.com/products/ "GitLab Premium"
[ee-2760]: https://gitlab.com/gitlab-org/gitlab-ee/merge_requests/2760