summaryrefslogtreecommitdiff
path: root/doc/development/architecture.md
diff options
context:
space:
mode:
Diffstat (limited to 'doc/development/architecture.md')
-rw-r--r--doc/development/architecture.md123
1 files changed, 109 insertions, 14 deletions
diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index 8d5368bdd28..d12d4b2029c 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -125,18 +125,18 @@ Component statuses are linked to configuration documentation for each component.
| Component | Description | [Omnibus GitLab](https://docs.gitlab.com/omnibus/) | [GitLab chart](https://docs.gitlab.com/charts/) | [Minikube Minimal](https://docs.gitlab.com/charts/development/minikube/#deploying-gitlab-with-minimal-settings) | [GitLab.com](https://gitlab.com) | [Source](../install/installation.md) | [GDK](https://gitlab.com/gitlab-org/gitlab-development-kit) | CE/EE |
| --------- | ----------- |:--------------------:|:------------------:|:-----:|:--------:|:--------:|:-------:|:-------:|
-| [NGINX](#nginx) | Routes requests to appropriate components, terminates SSL | [✅][nginx-omnibus] | [✅][nginx-charts] | [⚙][nginx-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#service-architecture) | [⤓][nginx-source] | ❌ | CE & EE |
+| [NGINX](#nginx) | Routes requests to appropriate components, terminates SSL | [✅][nginx-omnibus] | [✅][nginx-charts] | [⚙][nginx-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#service-architecture) | [⤓][nginx-source] | ❌ | CE & EE |
| [Unicorn (GitLab Rails)](#unicorn) | Handles requests for the web interface and API | [✅][unicorn-omnibus] | [✅][unicorn-charts] | [✅][unicorn-charts] | [✅](../user/gitlab_com/index.md#unicorn) | [⚙][unicorn-source] | [✅][gitlab-yml] | CE & EE |
| [Sidekiq](#sidekiq) | Background jobs processor | [✅][sidekiq-omnibus] | [✅][sidekiq-charts] | [✅](https://docs.gitlab.com/charts/charts/gitlab/sidekiq/index.html) | [✅](../user/gitlab_com/index.md#sidekiq) | [✅][gitlab-yml] | [✅][gitlab-yml] | CE & EE |
-| [Gitaly](#gitaly) | Git RPC service for handling all Git calls made by GitLab | [✅][gitaly-omnibus] | [✅][gitaly-charts] | [✅][gitaly-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#service-architecture) | [⚙][gitaly-source] | ✅ | CE & EE |
-| [Praefect](#praefect) | A transparent proxy between any Git client and Gitaly storage nodes. | [✅][gitaly-omnibus] | [❌][gitaly-charts] | [❌][gitaly-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#service-architecture) | [⚙][praefect-source] | ✅ | CE & EE |
-| [GitLab Workhorse](#gitlab-workhorse) | Smart reverse proxy, handles large HTTP requests | [✅][workhorse-omnibus] | [✅][workhorse-charts] | [✅][workhorse-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#service-architecture) | [⚙][workhorse-source] | ✅ | CE & EE |
-| [GitLab Shell](#gitlab-shell) | Handles `git` over SSH sessions | [✅][shell-omnibus] | [✅][shell-charts] | [✅][shell-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#service-architecture) | [⚙][shell-source] | [✅][gitlab-yml] | CE & EE |
+| [Gitaly](#gitaly) | Git RPC service for handling all Git calls made by GitLab | [✅][gitaly-omnibus] | [✅][gitaly-charts] | [✅][gitaly-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#service-architecture) | [⚙][gitaly-source] | ✅ | CE & EE |
+| [Praefect](#praefect) | A transparent proxy between any Git client and Gitaly storage nodes. | [✅][gitaly-omnibus] | [❌][gitaly-charts] | [❌][gitaly-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#service-architecture) | [⚙][praefect-source] | ✅ | CE & EE |
+| [GitLab Workhorse](#gitlab-workhorse) | Smart reverse proxy, handles large HTTP requests | [✅][workhorse-omnibus] | [✅][workhorse-charts] | [✅][workhorse-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#service-architecture) | [⚙][workhorse-source] | ✅ | CE & EE |
+| [GitLab Shell](#gitlab-shell) | Handles `git` over SSH sessions | [✅][shell-omnibus] | [✅][shell-charts] | [✅][shell-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#service-architecture) | [⚙][shell-source] | [✅][gitlab-yml] | CE & EE |
| [GitLab Pages](#gitlab-pages) | Hosts static websites | [⚙][pages-omnibus] | [❌][pages-charts] | [❌][pages-charts] | [✅](../user/gitlab_com/index.md#gitlab-pages) | [⚙][pages-source] | [⚙][pages-gdk] | CE & EE |
| [Registry](#registry) | Container registry, allows pushing and pulling of images | [⚙][registry-omnibus] | [✅][registry-charts] | [✅][registry-charts] | [✅](../user/packages/container_registry/index.md#build-and-push-images-using-gitlab-cicd) | [⤓][registry-source] | [⚙][registry-gdk] | CE & EE |
-| [Redis](#redis) | Caching service | [✅][redis-omnibus] | [✅][redis-omnibus] | [✅][redis-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#service-architecture) | [⤓][redis-source] | ✅ | CE & EE |
+| [Redis](#redis) | Caching service | [✅][redis-omnibus] | [✅][redis-omnibus] | [✅][redis-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#service-architecture) | [⤓][redis-source] | ✅ | CE & EE |
| [PostgreSQL](#postgresql) | Database | [✅][postgres-omnibus] | [✅][postgres-charts] | [✅][postgres-charts] | [✅](../user/gitlab_com/index.md#postgresql) | [⤓][postgres-source] | ✅ | CE & EE |
-| [PgBouncer](#pgbouncer) | Database connection pooling, failover | [⚙][pgbouncer-omnibus] | [❌][pgbouncer-charts] | [❌][pgbouncer-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#database-architecture) | ❌ | ❌ | EE Only |
+| [PgBouncer](#pgbouncer) | Database connection pooling, failover | [⚙][pgbouncer-omnibus] | [❌][pgbouncer-charts] | [❌][pgbouncer-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#database-architecture) | ❌ | ❌ | EE Only |
| [Consul](#consul) | Database node discovery, failover | [⚙][consul-omnibus] | [❌][consul-charts] | [❌][consul-charts] | [✅](../user/gitlab_com/index.md#consul) | ❌ | ❌ | EE Only |
| [GitLab self-monitoring: Prometheus](#prometheus) | Time-series database, metrics collection, and query service | [✅][prometheus-omnibus] | [✅][prometheus-charts] | [⚙][prometheus-charts] | [✅](../user/gitlab_com/index.md#prometheus) | ❌ | ❌ | CE & EE |
| [GitLab self-monitoring: Alertmanager](#alertmanager) | Deduplicates, groups, and routes alerts from Prometheus | [⚙][alertmanager-omnibus] | [✅][alertmanager-charts] | [⚙][alertmanager-charts] | [✅](https://about.gitlab.com/handbook/engineering/monitoring/) | ❌ | ❌ | CE & EE |
@@ -149,10 +149,10 @@ Component statuses are linked to configuration documentation for each component.
| [GitLab Exporter](#gitlab-exporter) | Generates a variety of GitLab metrics | [✅][gitlab-exporter-omnibus] | [✅][gitlab-exporter-charts] | [✅][gitlab-exporter-charts] | [✅](https://about.gitlab.com/handbook/engineering/monitoring/) | ❌ | ❌ | CE & EE |
| [Node Exporter](#node-exporter) | Prometheus endpoint with system metrics | [✅][node-exporter-omnibus] | [N/A][node-exporter-charts] | [N/A][node-exporter-charts] | [✅](https://about.gitlab.com/handbook/engineering/monitoring/) | ❌ | ❌ | CE & EE |
| [Mattermost](#mattermost) | Open-source Slack alternative | [⚙][mattermost-omnibus] | [⤓][mattermost-charts] | [⤓][mattermost-charts] | [⤓](../user/project/integrations/mattermost.md) | ❌ | ❌ | CE & EE |
-| [MinIO](#minio) | Object storage service | [⤓][minio-omnibus] | [✅][minio-charts] | [✅][minio-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#storage-architecture) | ❌ | [⚙][minio-gdk] | CE & EE |
+| [MinIO](#minio) | Object storage service | [⤓][minio-omnibus] | [✅][minio-charts] | [✅][minio-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#storage-architecture) | ❌ | [⚙][minio-gdk] | CE & EE |
| [Runner](#gitlab-runner) | Executes GitLab CI jobs | [⤓][runner-omnibus] | [✅][runner-charts] | [⚙][runner-charts] | [✅](../user/gitlab_com/index.md#shared-runners) | [⚙][runner-source] | [⚙][runner-gdk] | CE & EE |
| [Database Migrations](#database-migrations) | Database migrations | [✅][database-migrations-omnibus] | [✅][database-migrations-charts] | [✅][database-migrations-charts] | ✅ | [⚙][database-migrations-source] | ✅ | CE & EE |
-| [Certificate Management](#certificate-management) | TLS Settings, Let's Encrypt | [✅][certificate-management-omnibus] | [✅][certificate-management-charts] | [⚙][certificate-management-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production-architecture/#secrets-management) | [⚙][certificate-management-source] | [⚙][certificate-management-gdk] | CE & EE |
+| [Certificate Management](#certificate-management) | TLS Settings, Let's Encrypt | [✅][certificate-management-omnibus] | [✅][certificate-management-charts] | [⚙][certificate-management-charts] | [✅](https://about.gitlab.com/handbook/engineering/infrastructure/production/architecture/#secrets-management) | [⚙][certificate-management-source] | [⚙][certificate-management-gdk] | CE & EE |
| [GitLab Geo Node](#gitlab-geo) | Geographically distributed GitLab nodes | [⚙][geo-omnibus] | [❌][geo-charts] | [❌][geo-charts] | ✅ | ❌ | [⚙][geo-gdk] | EE Only |
| [LDAP Authentication](#ldap-authentication) | Authenticate users against centralized LDAP directory | [⤓][ldap-omnibus] | [⤓][ldap-charts] | [⤓][ldap-charts] | [❌](https://about.gitlab.com/pricing/#gitlab-com) | [⤓][gitlab-yml] | [⤓][ldap-gdk] | CE & EE |
| [Outbound email (SMTP)](#outbound-email) | Send email messages to users | [⤓][outbound-email-omnibus] | [⤓][outbound-email-charts] | [⤓][outbound-email-charts] | [✅](../user/gitlab_com/index.md#mail-configuration) | [⤓][gitlab-yml] | [⤓][gitlab-yml] | CE & EE |
@@ -446,7 +446,7 @@ Sidekiq is a Ruby background job processor that pulls jobs from the Redis queue
- Layer: Core Service (Processor)
- Process: `unicorn`
-[Unicorn](https://bogomips.org/unicorn/) is a Ruby application server that is used to run the core Rails Application that provides the user facing features in GitLab. Often process output you will see this as `bundle` or `config.ru` depending on the GitLab version.
+[Unicorn](https://yhbt.net/unicorn/) is a Ruby application server that is used to run the core Rails Application that provides the user facing features in GitLab. Often process output you will see this as `bundle` or `config.ru` depending on the GitLab version.
#### LDAP Authentication
@@ -494,13 +494,108 @@ Below we describe the different pathing that HTTP vs. SSH Git requests will take
### Web Request (80/443)
-When you make a Git request over HTTP, the request first takes the same steps as a web HTTP request
-through NGINX and GitLab Workhorse. However, the GitLab Workhorse then diverts the request towards
-Gitaly, which processes it directly.
+Git operations over HTTP use the stateless "smart" protocol described in the
+[Git documentation](https://git-scm.com/docs/http-protocol), but responsibility
+for handling these operations is split across several GitLab components.
+
+Here is a sequence diagram for `git fetch`. Note that all requests pass through
+NGINX as well as any other HTTP load balancers, but are not transformed in any
+way by them. All paths are presented relative to a `/namespace/project.git` URL.
+
+```mermaid
+sequenceDiagram
+ participant Git on client
+ participant NGINX
+ participant Workhorse
+ participant Rails
+ participant Gitaly
+ participant Git on server
+
+ Note left of Git on client: git fetch<br/>info-refs
+ Git on client->>+Workhorse: GET /info/refs?service=git-upload-pack
+ Workhorse->>+Rails: GET /info/refs?service=git-upload-pack
+ Note right of Rails: Auth check
+ Rails-->>-Workhorse: Gitlab::Workhorse.git_http_ok
+ Workhorse->>+Gitaly: SmartHTTPService.InfoRefsUploadPack request
+ Gitaly->>+Git on server: git upload-pack --stateless-rpc --advertise-refs
+ Git on server-->>-Gitaly: git upload-pack response
+ Gitaly-->>-Workhorse: SmartHTTPService.InfoRefsUploadPack response
+ Workhorse-->>-Git on client: 200 OK
+
+ Note left of Git on client: git fetch<br/>fetch-pack
+ Git on client->>+Workhorse: POST /git-upload-pack
+ Workhorse->>+Rails: POST /git-upload-pack
+ Note right of Rails: Auth check
+ Rails-->>-Workhorse: Gitlab::Workhorse.git_http_ok
+ Workhorse->>+Gitaly: SmartHTTPService.PostUploadPack request
+ Gitaly->>+Git on server: git upload-pack --stateless-rpc
+ Git on server-->>-Gitaly: git upload-pack response
+ Gitaly-->>-Workhorse: SmartHTTPService.PostUploadPack response
+ Workhorse-->>-Git on client: 200 OK
+```
+
+The sequence is similar for `git push`, except `git-receive-pack` is used
+instead of `git-upload-pack`.
### SSH Request (22)
-TODO
+Git operations over SSH can use the stateful protocol described in the
+[Git documentation](https://git-scm.com/docs/pack-protocol#_ssh_transport), but
+responsibility for handling them is split across several GitLab components.
+
+No GitLab components speak SSH directly - all SSH connections are made between
+Git on the client machine and the SSH server, which terminates the connection.
+To the SSH server, all connections are authenticated as the `git` user; GitLab
+users are differentiated by the SSH key presented by the client.
+
+Here is a sequence diagram for `git fetch`, assuming [Fast SSH key lookup](../administration/operations/fast_ssh_key_lookup.md)
+is enabled. Note that `AuthorizedKeysCommand` is an executable provided by
+[GitLab Shell](#gitlab-shell):
+
+```mermaid
+sequenceDiagram
+ participant Git on client
+ participant SSH server
+ participant AuthorizedKeysCommand
+ participant GitLab Shell
+ participant Rails
+ participant Gitaly
+ participant Git on server
+
+ Note left of Git on client: git fetch
+ Git on client->>SSH server: git fetch-pack
+ SSH server-->>AuthorizedKeysCommand: gitlab-shell-authorized-keys-check git AAAA...
+ AuthorizedKeysCommand-->>Rails: GET /internal/api/authorized_keys?key=AAAA...
+ Note right of Rails: Lookup key ID
+ Rails-->>SSH server: 200 OK, command="gitlab-shell upload-pack key_id=1"
+ SSH server-->>GitLab Shell: gitlab-shell upload-pack key_id=1
+ GitLab Shell-->>Rails: GET /internal/api/allowed?action=upload_pack&key_id=1
+ Note right of Rails: Auth check
+ Rails-->>GitLab Shell: 200 OK, { gitaly: ... }
+ GitLab Shell-->>Gitaly: SSHService.SSHUploadPack bidirectional request
+ Gitaly-->>Git on server: git upload-pack
+ Git on server->>Git on client: SSHService.SSHUploadPack bidirectional response
+```
+
+The `git push` operation is very similar, except `git receive-pack` is used
+instead of `git upload-pack`.
+
+If fast SSH key lookups are not enabled, the SSH server reads from the
+`~git/.ssh/authorized_keys` file to determine what command to run for a given
+SSH session. This is kept up to date by an [`AuthorizedKeysWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/app/workers/authorized_keys_worker.rb)
+in Rails, scheduled to run whenever an SSH key is modified by a user.
+
+[SSH certificates](../administration/operations/ssh_certificates.md) may be used
+instead of keys. In this case, `AuthorizedKeysCommand` is replaced with an
+`AuthorizedPrincipalsCommand`. This extracts a username from the certificate
+without using the Rails internal API, which is used instead of `key_id` in the
+`/api/internal/allowed` call later.
+
+GitLab Shell also has a few operations that do not involve Gitaly, such as
+resetting two-factor authentication codes. These are handled in the same way,
+except there is no round-trip into Gitaly - Rails performs the action as part
+of the [internal API](internal_api.md) call, and GitLab Shell streams the
+response back to the user directly.
## System Layout