# Gitaly

[Gitaly](https://gitlab.com/gitlab-org/gitaly) is the service that
provides high-level RPC access to Git repositories. Without it, no other
components can read or write Git data.

GitLab components that access Git repositories (gitlab-rails,
gitlab-shell, gitlab-workhorse) act as clients to Gitaly. End users do
not have direct access to Gitaly.

## Configuring Gitaly

The Gitaly service itself is configured via a TOML configuration file.
This file is documented [in the gitaly
repository](https://gitlab.com/gitlab-org/gitaly/blob/master/doc/configuration/README.md).

To change a Gitaly setting in Omnibus you can use
`gitaly['my_setting']` in `/etc/gitlab/gitlab.rb`. Changes will be applied
when you run `gitlab-ctl reconfigure`.

```ruby
gitaly['prometheus_listen_addr'] = 'localhost:9236'
```

To change a Gitaly setting in installations from source you can edit
`/home/git/gitaly/config.toml`. Changes will be applied when you run
`service gitlab restart`.

```toml
prometheus_listen_addr = "localhost:9236"
```

## Client-side gRPC logs

Gitaly uses the [gRPC](https://grpc.io/) RPC framework. The Ruby gRPC
client has its own log file which may contain useful information when
you are seeing Gitaly errors. You can control the log level of the
gRPC client with the `GRPC_LOG_LEVEL` environment variable. The
default level is `WARN`.

## Running Gitaly on its own server

> This is an optional way to deploy Gitaly which can benefit GitLab
installations that are larger than a single machine. Most
installations will be better served with the default configuration
used by Omnibus and the GitLab source installation guide.

Starting with GitLab 11.4, Gitaly is able to serve all Git requests without
needing a shared NFS mount for Git repository data.
Between 11.4 and 11.8 the exception was the
[Elasticsearch indexer](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer).
Since 11.8 the indexer uses Gitaly for data access as well. NFS can still
be used for block-level redundancy of the Git data, but it only has to
be mounted on the Gitaly server.

NOTE: **Note:** While Gitaly can be used as a replacement for NFS, we do not recommend
using EFS as it may impact GitLab's performance. Please review the [relevant documentation](../high_availability/nfs.md#avoid-using-awss-elastic-file-system-efs)
for more details.

### Network architecture

-   gitlab-rails shards repositories into "repository storages"
-   `gitlab-rails/config/gitlab.yml` contains a map from storage names to
    (Gitaly address, Gitaly token) pairs
-   the `storage name` -\> `(Gitaly address, Gitaly token)` map in
    `gitlab.yml` is the single source of truth for the Gitaly network
    topology
-   a (Gitaly address, Gitaly token) corresponds to a Gitaly server
-   a Gitaly server hosts one or more storages
-   Gitaly addresses must be specified in such a way that they resolve
    correctly for ALL Gitaly clients
-   Gitaly clients are: unicorn, sidekiq, gitlab-workhorse,
    gitlab-shell, Elasticsearch Indexer, and Gitaly itself
-   special case: a Gitaly server must be able to make RPC calls **to
    itself** via its own (Gitaly address, Gitaly token) pair as
    specified in `gitlab-rails/config/gitlab.yml`
-   Gitaly servers must not be exposed to the public internet

Gitaly network traffic is unencrypted by default, but supports
[TLS](#tls-support). Authentication is done through a static token.

NOTE: **Note:** Gitaly network traffic is unencrypted by default, so we recommend a firewall to
restrict access to your Gitaly server.

Below we describe how to configure a Gitaly server at address
`gitaly.internal:8075` with secret token `abc123secret`. We assume
your GitLab installation has two repository storages, `default` and
`storage1`.
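
To make the storage map from the list above concrete, here is a minimal Ruby sketch. It is not GitLab code: the `STORAGES` constant and the `servers_by_pair` helper are hypothetical names chosen for illustration, and the values are the example address, token, and storage names used on this page. It groups storage names by their (Gitaly address, Gitaly token) pair, which shows how one Gitaly server can host several storages.

```ruby
require 'uri'

# Hypothetical in-memory copy of the storage map kept in gitlab.yml.
STORAGES = {
  'default'  => { address: 'tcp://gitaly.internal:8075', token: 'abc123secret' },
  'storage1' => { address: 'tcp://gitaly.internal:8075', token: 'abc123secret' }
}.freeze

# Group storage names by the (address, token) pair that serves them.
# Each distinct pair corresponds to one Gitaly server.
def servers_by_pair(storages)
  storages
    .group_by { |_name, cfg| [cfg[:address], cfg[:token]] }
    .transform_values { |entries| entries.map(&:first) }
end

servers_by_pair(STORAGES).each do |(address, _token), names|
  uri = URI.parse(address)
  puts "#{uri.host}:#{uri.port} hosts #{names.join(', ')}"
end
```

With the example values, both storages map to the same pair, so the sketch reports a single Gitaly server.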

### Installation

First install Gitaly using either Omnibus or from source.

Omnibus: [Download/install](https://about.gitlab.com/install/) the Omnibus GitLab
package you want using **steps 1 and 2** from the GitLab downloads page but
**_do not_** provide the `EXTERNAL_URL=` value.

Source: [Install Gitaly](../../install/installation.md#install-gitaly)

### Client side token configuration

Configure a token on the client side.

Omnibus installations:

```ruby
# /etc/gitlab/gitlab.rb
gitlab_rails['gitaly_token'] = 'abc123secret'
```

Source installations:

```yaml
# /home/git/gitlab/config/gitlab.yml
gitlab:
  gitaly:
    token: 'abc123secret'
```

You need to reconfigure (Omnibus) or restart (source) for these
changes to be picked up.

### Gitaly server configuration

Next, on the Gitaly server, we need to configure storage paths, enable
the network listener and configure the token.

NOTE: **Note:** If you want to reduce the risk of downtime when you enable
authentication, you can temporarily disable enforcement. See [the
documentation on configuring Gitaly
authentication](https://gitlab.com/gitlab-org/gitaly/blob/master/doc/configuration/README.md#authentication).

Gitaly must trigger some callbacks to GitLab via GitLab Shell. As a result,
the GitLab Shell secret must be the same between the other GitLab servers and
the Gitaly server. The easiest way to accomplish this is to copy `/etc/gitlab/gitlab-secrets.json`
from an existing GitLab server to the Gitaly server. Without this shared secret,
Git operations in GitLab will result in an API error.
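
After copying the file, one way to confirm that both machines hold the same secrets is to compare a checksum on each side. The sketch below is only an illustration and assumes the file was copied whole, as described above. The `secrets_fingerprint` helper is a hypothetical name, not part of GitLab.

```ruby
require 'digest'

DEFAULT_SECRETS_PATH = '/etc/gitlab/gitlab-secrets.json'

# Return the SHA-256 hex digest of the file at `path`.
def secrets_fingerprint(path)
  Digest::SHA256.file(path).hexdigest
end

# Print the fingerprint of the given file, or of the default path,
# but only when that file is readable on this machine.
path = ARGV.fetch(0, DEFAULT_SECRETS_PATH)
puts secrets_fingerprint(path) if File.readable?(path)
```

Run it on an existing GitLab server and on the Gitaly server. If the two digests differ, the files are not identical.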

NOTE: **Note:** In most or all cases the storage paths below end in `/repositories` which is
different than `path` in `git_data_dirs` of Omnibus installations. Check the
directory layout on your Gitaly server to be sure.

Omnibus installations:

<!--
updates to following example must also be made at
https://gitlab.com/charts/gitlab/blob/master/doc/advanced/external-gitaly/external-omnibus-gitaly.md#configure-omnibus-gitlab
-->

```ruby
# /etc/gitlab/gitlab.rb

# Avoid running unnecessary services on the Gitaly server
postgresql['enable'] = false
redis['enable'] = false
nginx['enable'] = false
prometheus['enable'] = false
unicorn['enable'] = false
sidekiq['enable'] = false
gitlab_workhorse['enable'] = false

# Prevent database connections during 'gitlab-ctl reconfigure'
gitlab_rails['rake_cache_clear'] = false
gitlab_rails['auto_migrate'] = false

# Configure the gitlab-shell API callback URL. Without this, `git push` will
# fail. This can be your 'front door' GitLab URL or an internal load
# balancer.
# Don't forget to copy `/etc/gitlab/gitlab-secrets.json` from web server to Gitaly server.
gitlab_rails['internal_api_url'] = 'https://gitlab.example.com'

# Make Gitaly accept connections on all network interfaces. You must use
# firewalls to restrict access to this address/port.
gitaly['listen_addr'] = "0.0.0.0:8075"
gitaly['auth_token'] = 'abc123secret'

gitaly['storage'] = [
  { 'name' => 'default', 'path' => '/mnt/gitlab/default/repositories' },
  { 'name' => 'storage1', 'path' => '/mnt/gitlab/storage1/repositories' },
]

# To use TLS for Gitaly you need to add
gitaly['tls_listen_addr'] = "0.0.0.0:9999"
gitaly['certificate_path'] = "path/to/cert.pem"
gitaly['key_path'] = "path/to/key.pem"
```

Source installations:

```toml
# /home/git/gitaly/config.toml
listen_addr = '0.0.0.0:8075'
tls_listen_addr = '0.0.0.0:9999'

[tls]
certificate_path = '/path/to/cert.pem'
key_path = '/path/to/key.pem'

[auth]
token = 'abc123secret'

[[storage]]
name = 'default'
path = '/mnt/gitlab/default/repositories'

[[storage]]
name = 'storage1'
path = '/mnt/gitlab/storage1/repositories'
```

Again, reconfigure (Omnibus) or restart (source).

### Converting clients to use the Gitaly server

As the final step, update the client machines to switch from
their local Gitaly service to the new Gitaly server you just
configured. This is a risky step: if any network, firewall, or name
resolution problem prevents your GitLab server from reaching the
Gitaly server, all Gitaly requests will fail.

Additionally, you need to
[disable Rugged if previously manually enabled](../high_availability/nfs.md#improving-nfs-performance-with-gitlab).

We assume that your Gitaly server can be reached at
`gitaly.internal:8075` from your GitLab server, and that Gitaly can read and
write to `/mnt/gitlab/default` and `/mnt/gitlab/storage1` respectively.
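
Because a network, firewall, or name resolution problem makes all Gitaly requests fail, it can be worth testing plain TCP reachability from each client machine before changing its configuration. The following is a minimal sketch using Ruby's standard `socket` library. The `gitaly_reachable?` helper is a hypothetical name, and a successful connection only proves that the port is open, not that authentication or the storage paths are correct.

```ruby
require 'socket'

# Return true if a TCP connection to host:port can be opened within
# `timeout` seconds. Connection refusal, timeouts, and name resolution
# failures all return false.
def gitaly_reachable?(host, port, timeout: 5)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

# Run the check only when a host and port are given on the command line.
if ARGV.length == 2
  host, port = ARGV
  puts gitaly_reachable?(host, Integer(port)) ? 'reachable' : 'unreachable'
end
```

For the example setup on this page you would pass `gitaly.internal` and `8075` as the two arguments.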

Omnibus installations:

```ruby
# /etc/gitlab/gitlab.rb
git_data_dirs({
  'default' => { 'gitaly_address' => 'tcp://gitaly.internal:8075' },
  'storage1' => { 'gitaly_address' => 'tcp://gitaly.internal:8075' },
})

gitlab_rails['gitaly_token'] = 'abc123secret'
```

Source installations:

```yaml
# /home/git/gitlab/config/gitlab.yml
gitlab:
  repositories:
    storages:
      default:
        path: /mnt/gitlab/default/repositories
        gitaly_address: tcp://gitaly.internal:8075
      storage1:
        path: /mnt/gitlab/storage1/repositories
        gitaly_address: tcp://gitaly.internal:8075

  gitaly:
    token: 'abc123secret'
```

Now reconfigure (Omnibus) or restart (source). When you tail the
Gitaly logs on your Gitaly server (`sudo gitlab-ctl tail gitaly` or
`tail -f /home/git/gitlab/log/gitaly.log`) you should see requests
coming in. One sure way to trigger a Gitaly request is to clone a
repository from your GitLab server over HTTP.

## TLS support

> [Introduced](https://gitlab.com/gitlab-org/gitlab-ce/merge_requests/22602) in GitLab 11.8.

Gitaly supports TLS encryption. To communicate with a Gitaly instance that
listens for secure connections, use the `tls://` URL scheme in the
`gitaly_address` of the corresponding storage entry in the GitLab configuration.

You must supply your own certificate, because GitLab does not provide one automatically.
Install the certificate on all Gitaly nodes and on all client nodes that communicate with them, following the procedure described in [GitLab custom certificate configuration](https://docs.gitlab.com/omnibus/settings/ssl.html#install-custom-public-certificates).

Note that it is possible to configure Gitaly servers with both an
unencrypted listening address `listen_addr` and an encrypted listening
address `tls_listen_addr` at the same time. This allows you to do a
gradual transition from unencrypted to encrypted traffic, if necessary.
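
During such a gradual transition it helps to know which storages the clients still reach over an unencrypted address. The sketch below is an illustration only: the `ADDRESSES` hash is a hypothetical mid-migration snapshot built from the example addresses on this page, and `storages_by_scheme` is a made-up helper name. It groups storage names by the URL scheme of their `gitaly_address`.

```ruby
require 'uri'

# Hypothetical snapshot of storage name => gitaly_address while one
# storage has been switched to TLS and the other has not.
ADDRESSES = {
  'default'  => 'tls://gitaly.internal:9999',
  'storage1' => 'tcp://gitaly.internal:8075'
}.freeze

# Group storage names by the scheme ("tcp" or "tls") of their address.
def storages_by_scheme(addresses)
  addresses.keys.group_by { |name| URI.parse(addresses[name]).scheme }
end

storages_by_scheme(ADDRESSES).each do |scheme, names|
  puts "#{scheme}: #{names.join(', ')}"
end
```

Once no storage is listed under `tcp`, the clients are configured to use only the encrypted listener.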

To observe what type of connections are actually being used in a
production environment you can use the following Prometheus query:

```
sum(rate(gitaly_connections_total[5m])) by (type)
```

### Example TLS configuration

### Omnibus installations:

#### On client nodes:

```ruby
# /etc/gitlab/gitlab.rb
git_data_dirs({
  'default' => { 'gitaly_address' => 'tls://gitaly.internal:9999' },
  'storage1' => { 'gitaly_address' => 'tls://gitaly.internal:9999' },
})

gitlab_rails['gitaly_token'] = 'abc123secret'
```

#### On Gitaly server nodes:

```ruby
gitaly['tls_listen_addr'] = "0.0.0.0:9999"
gitaly['certificate_path'] = "path/to/cert.pem"
gitaly['key_path'] = "path/to/key.pem"
```

### Source installations:

#### On client nodes:

```yaml
# /home/git/gitlab/config/gitlab.yml
gitlab:
  repositories:
    storages:
      default:
        path: /mnt/gitlab/default/repositories
        gitaly_address: tls://gitaly.internal:9999
      storage1:
        path: /mnt/gitlab/storage1/repositories
        gitaly_address: tls://gitaly.internal:9999

  gitaly:
    token: 'abc123secret'
```

#### On Gitaly server nodes:

```toml
# /home/git/gitaly/config.toml
tls_listen_addr = '0.0.0.0:9999'

[tls]
certificate_path = '/path/to/cert.pem'
key_path = '/path/to/key.pem'
```
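
Before pointing clients at `tls_listen_addr`, it can be useful to confirm that the certificate and key files you configured belong together and that the certificate has not expired. The following sketch uses Ruby's standard `openssl` library. The `tls_material_summary` helper is a hypothetical name, and the check is a sanity test under those assumptions, not something Gitaly runs itself.

```ruby
require 'openssl'

# Summarize a PEM certificate and report whether it pairs with the
# given PEM private key.
def tls_material_summary(cert_pem, key_pem)
  cert = OpenSSL::X509::Certificate.new(cert_pem)
  key = OpenSSL::PKey.read(key_pem)
  {
    subject: cert.subject.to_s,
    not_after: cert.not_after,
    key_matches: cert.check_private_key(key)
  }
end

# Run only when a certificate path and a key path are given.
if ARGV.length == 2
  cert_path, key_path = ARGV
  puts tls_material_summary(File.read(cert_path), File.read(key_path))
end
```

Pass the same two paths you set in `certificate_path` and `key_path`. A `key_matches` value of `false` means the key does not belong to the certificate.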

## Gitaly-ruby

Gitaly was developed to replace Ruby application code in gitlab-ce/ee.
In order to save time and/or avoid the risk of rewriting existing
application logic, in some cases we chose to copy some application code
from gitlab-ce into Gitaly almost as-is. To be able to run that code, we
made gitaly-ruby, which is a sidecar process for the main Gitaly Go
process. Some examples of things that are implemented in gitaly-ruby are
RPCs that deal with wikis, and RPCs that create commits on behalf of
a user, such as merge commits.

### Number of gitaly-ruby workers

Gitaly-ruby has much less capacity than Gitaly itself. If your Gitaly
server has to handle a lot of requests, the default setting of having
just 1 active gitaly-ruby sidecar might not be enough. If you see
`ResourceExhausted` errors from Gitaly, it is very likely that you do not
have enough gitaly-ruby capacity.

You can increase the number of gitaly-ruby processes on your Gitaly
server with the following settings.

Omnibus:

```ruby
# /etc/gitlab/gitlab.rb
# Default is 2 workers. The minimum is 2; 1 worker is always reserved as
# a passive stand-by.
gitaly['ruby_num_workers'] = 4
```

Source:

```toml
# /home/git/gitaly/config.toml
[gitaly-ruby]
num_workers = 4
```

### Observing gitaly-ruby traffic

Gitaly-ruby is a somewhat hidden, internal implementation detail of
Gitaly. There is not that much visibility into what goes on inside
gitaly-ruby processes.

If you have Prometheus set up to scrape your Gitaly process, you can see
request rates and error codes for individual RPCs in gitaly-ruby by
querying `grpc_client_handled_total`. Strictly speaking, this metric does
not differentiate between gitaly-ruby and other RPCs, but in practice
(as of GitLab 11.9), all gRPC calls made by Gitaly itself are internal
calls from the main Gitaly process to one of its gitaly-ruby sidecars.

Assuming your `grpc_client_handled_total` counter only observes Gitaly,
the following query shows you which RPCs are (most likely) internally
implemented as calls to gitaly-ruby.

```
sum(rate(grpc_client_handled_total[5m])) by (grpc_method) > 0
```
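
If you prefer to run this query from a script instead of the Prometheus UI, it can be sent to the instant-query endpoint of the Prometheus HTTP API (`/api/v1/query`). The sketch below is a hedged example: the `prometheus_query_uri` helper is a hypothetical name, the base URL of your Prometheus server is an assumption you must supply, and the code expects the standard vector response format.

```ruby
require 'json'
require 'net/http'
require 'uri'

QUERY = 'sum(rate(grpc_client_handled_total[5m])) by (grpc_method) > 0'

# Build the instant-query URL for a Prometheus server's HTTP API.
def prometheus_query_uri(base_url, promql)
  uri = URI.join(base_url, '/api/v1/query')
  uri.query = URI.encode_www_form(query: promql)
  uri
end

# Run only when a Prometheus base URL is passed on the command line.
if ARGV[0]
  body = JSON.parse(Net::HTTP.get(prometheus_query_uri(ARGV[0], QUERY)))
  # Each series carries its labels in "metric" and [timestamp, value] in "value".
  Array(body.dig('data', 'result')).each do |series|
    puts "#{series.dig('metric', 'grpc_method')}: #{series['value'][1]} req/s"
  end
end
```

Pass the base URL of your Prometheus server as the only argument, for example `http://localhost:9090` if Prometheus runs locally on its default port.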

## Disabling or enabling the Gitaly service in a cluster environment

If you are running Gitaly [as a remote
service](#running-gitaly-on-its-own-server) you may want to disable
the local Gitaly service that runs on your GitLab server by default.

> 'Disabling Gitaly' only makes sense when you run GitLab in a custom
cluster configuration, where different services run on different
machines. Disabling Gitaly on all machines in the cluster is not a
valid configuration.

If you are setting up a GitLab cluster where Gitaly does not need to
run on all machines, you can disable the Gitaly service in your
Omnibus installation by adding the following line to `/etc/gitlab/gitlab.rb`:

```ruby
gitaly['enable'] = false
```

When you run `gitlab-ctl reconfigure` the Gitaly service will be
disabled.

To disable the Gitaly service in a GitLab cluster where you installed
GitLab from source, add the following to `/etc/default/gitlab` on the
machine where you want to disable Gitaly.

```shell
gitaly_enabled=false
```

When you run `service gitlab restart` Gitaly will be disabled on this
particular machine.

## Eliminating NFS altogether

If you are planning to use Gitaly without NFS for your storage needs
and want to eliminate NFS from your environment altogether, there are
a few things that you need to do:

 1. Make sure the [`git` user home directory](https://docs.gitlab.com/omnibus/settings/configuration.html#moving-the-home-directory-for-a-user) is on local disk.
 1. Configure [database lookup of SSH keys](../operations/fast_ssh_key_lookup.md)
 to eliminate the need for a shared authorized_keys file.
 1. Configure [object storage for job artifacts](../job_artifacts.md#using-object-storage)
 including [live tracing](../job_traces.md#new-live-trace-architecture).
 1. Configure [object storage for LFS objects](../../workflow/lfs/lfs_administration.md#storing-lfs-objects-in-remote-object-storage).
 1. Configure [object storage for uploads](../uploads.md#using-object-storage-core-only).

NOTE: **Note:** One current feature of GitLab still requires a shared directory (NFS): [GitLab Pages](../../user/project/pages/index.md).
There is [work in progress](https://gitlab.com/gitlab-org/gitlab-pages/issues/196)
to eliminate the need for NFS to support GitLab Pages.

## Troubleshooting Gitaly in production

Since GitLab 11.6, Gitaly comes with a command-line tool called
`gitaly-debug` that can be run on a Gitaly server to aid in
troubleshooting. In GitLab 11.6 its only sub-command is
`simulate-http-clone` which allows you to measure the maximum possible
Git clone speed for a specific repository on the server.

For an up-to-date list of sub-commands, see [the gitaly-debug
README](https://gitlab.com/gitlab-org/gitaly/blob/master/cmd/gitaly-debug/README.md).