Diffstat (limited to 'doc/development')
-rw-r--r-- | doc/development/architecture.md | 194
-rw-r--r-- | doc/development/resources/gitlab_diagram_overview.odg | bin | 0 -> 26292 bytes
-rw-r--r-- | doc/development/resources/gitlab_diagram_overview.png | bin | 0 -> 130285 bytes
-rw-r--r-- | doc/development/shell_commands.md | 111
4 files changed, 288 insertions, 17 deletions
diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index db22f0bda85..8a772344b79 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -1,23 +1,183 @@
-# GitLab project architecture
+# GitLab Architecture Overview
+---
 
-GitLab project consists of two parts: GitLab and GitLab shell.
+# Software delivery
 
-## GitLab
+There are two editions of GitLab: [Enterprise Edition](https://www.gitlab.com/features/) (EE) and [Community Edition](http://gitlab.org/gitlab-ce/) (CE).
+GitLab CE is delivered via git from the [gitlabhq repository](https://gitlab.com/gitlab-org/gitlab-ce/tree/master).
+New versions of GitLab are released in stable branches and the master branch is for bleeding edge development.
 
-Web application with background jobs workers.
-Provides you with UI and most of functionality.
-For some operations like repo creation - uses GitLab shell.
+EE releases are available not long after CE releases.
+To obtain the GitLab EE there is a [repository at gitlab.com](https://gitlab.com/subscribers/gitlab-ee).
+For more information about the release process see the section 'New versions and upgrading' in the readme.
 
-Uses:
- * Ruby as main language for application code and most libraries.
- * [Rails](http://rubyonrails.org/) web framework as main framework for application.
- * Mysql or postgres as main databases. Used for persistent data storage(users, project, issues etc).
- * Redis database. Used for cache and exchange data between some components.
- * Python2 because of [pygments](http://pygments.org/) as code syntax highlighter.
+Both EE and CE require an add-on component called gitlab-shell.
+It is obtained from the [gitlab-shell repository](https://gitlab.com/gitlab-org/gitlab-shell/tree/master).
+New versions are usually tags but staying on the master branch will give you the latest stable version.
+New releases are generally around the same time as GitLab CE releases with exception for informal security updates deemed critical.
 
-## GitLab shell
+# System Layout
 
-Command line ruby application. Used by GitLab through shell commands.
-It provides interface to all kind of manipulations with repositories and ssh keys.
-Full list of commands you can find in README of GitLab shell repo.
-Works on pure ruby and do not require any additional software.
+When referring to ~git in the pictures it means the home directory of the git user which is typically /home/git.
+
+GitLab is primarily installed within the `/home/git` user home directory as `git` user.
+Within the home directory is where the gitlabhq server software resides as well as the repositories (though the repository location is configurable).
+The bare repositories are located in `/home/git/repositories`.
+GitLab is a ruby on rails application so the particulars of the inner workings can be learned by studying how a ruby on rails application works.
+To serve repositories over SSH there's an add-on application called gitlab-shell which is installed in `/home/git/gitlab-shell`.
+
+## Components
+
+
+
+A typical install of GitLab will be on Ubuntu Linux or RHEL/CentOS.
+It uses Nginx or Apache as a web front end to proxypass the Unicorn web server.
+By default, communication between Unicorn and the front end is via a Unix domain socket but forwarding requests via TCP is also supported.
+The web front end accesses `/home/git/gitlab/public` bypassing the Unicorn server to serve static pages, uploads (e.g. avatar images or attachments), and precompiled assets.
+GitLab serves web pages and a [GitLab API](https://gitlab.com/gitlab-org/gitlab-ce/tree/master/doc/api) using the Unicorn web server.
+It uses Sidekiq as a job queue which, in turn, uses redis as a non-persistent database backend for job information, meta data, and incoming jobs.
+The GitLab web app uses MySQL or PostgreSQL for persistent database information (e.g. users, permissions, issues, other meta data).
+GitLab stores the bare git repositories it serves in `/home/git/repositories` by default.
+It also keeps default branch and hook information with the bare repository.
+`/home/git/gitlab-satellites` keeps checked out repositories when performing actions such as a merge request, editing files in the web interface, etc.
+The satellite repository is used by the web interface for editing repositories and the wiki which is also a git repository.
+When serving repositories over HTTP/HTTPS GitLab utilizes the GitLab API to resolve authorization and access as well as serving git objects.
+
+The add-on component gitlab-shell serves repositories over SSH.
+It manages the SSH keys within `/home/git/.ssh/authorized_keys` which should not be manually edited.
+gitlab-shell accesses the bare repositories directly to serve git objects and communicates with redis to submit jobs to Sidekiq for GitLab to process.
+ gitlab-shell queries the GitLab API to determine authorization and access.
+
+## Installation Folder Summary
+
+To summarize here's the [directory structure of the `git` user home directory](../install/structure.md).
+
+
+## Processes
+
+    ps aux | grep '^git'
+
+GitLab has several components to operate.
+As a system user (i.e. any user that is not the `git` user) it requires a persistent database (MySQL/PostgreSQL) and redis database.
+It also uses Apache httpd or nginx to proxypass Unicorn.
+As the `git` user it starts Sidekiq and Unicorn (a simple ruby HTTP server running on port `8080` by default).
+Under the gitlab user there are normally 4 processes: `unicorn_rails master` (1 process), `unicorn_rails worker` (2 processes), `sidekiq` (1 process).
+
+## Repository access
+
+Repositories get accessed via HTTP or SSH.
+HTTP cloning/push/pull utilizes the GitLab API and SSH cloning is handled by gitlab-shell (previously explained).
+
+# Troubleshooting
+
+See the README for more information.
+
+## Init scripts of the services
+
+The GitLab init script starts and stops Unicorn and Sidekiq.
+
+```
+/etc/init.d/gitlab
+Usage: service gitlab {start|stop|restart|reload|status}
+```
+
+Redis (key-value store/non-persistent database)
+
+```
+/etc/init.d/redis
+Usage: /etc/init.d/redis {start|stop|status|restart|condrestart|try-restart}
+```
+
+SSH daemon
+
+```
+/etc/init.d/sshd
+Usage: /etc/init.d/sshd {start|stop|restart|reload|force-reload|condrestart|try-restart|status}
+```
+
+Web server (one of the following)
+
+```
+/etc/init.d/httpd
+Usage: httpd {start|stop|restart|condrestart|try-restart|force-reload|reload|status|fullstatus|graceful|help|configtest}
+
+$ /etc/init.d/nginx
+Usage: nginx {start|stop|restart|reload|force-reload|status|configtest}
+```
+
+Persistent database (one of the following)
+
+```
+/etc/init.d/mysqld
+Usage: /etc/init.d/mysqld {start|stop|status|restart|condrestart|try-restart|reload|force-reload}
+
+$ /etc/init.d/postgresql
+Usage: /etc/init.d/postgresql {start|stop|restart|reload|force-reload|status} [version ..]
+```
+
+## Log locations of the services
+
+Note: `/home/git/` is shorthand for `/home/git`.
+
+gitlabhq (includes Unicorn and Sidekiq logs)
+
+* `/home/git/gitlab/log/` contains `application.log`, `production.log`, `sidekiq.log`, `unicorn.stdout.log`, `githost.log`, `satellites.log`, and `unicorn.stderr.log` normally.
+
+gitlab-shell
+
+* `/home/git/gitlab-shell/gitlab-shell.log`
+
+ssh
+
+* `/var/log/auth.log` auth log (on Ubuntu).
+* `/var/log/secure` auth log (on RHEL).
+
+nginx
+
+* `/var/log/nginx/` contains error and access logs.
+
+Apache httpd
+
+* [Explanation of apache logs](http://httpd.apache.org/docs/2.2/logs.html).
+* `/var/log/apache2/` contains error and output logs (on Ubuntu).
+* `/var/log/httpd/` contains error and output logs (on RHEL).
+
+redis
+
+* `/var/log/redis/redis.log` there are also logrotated logs there.
+
+PostgreSQL
+
+* `/var/log/postgresql/*`
+
+MySQL
+
+* `/var/log/mysql/*`
+* `/var/log/mysql.*`
+
+## GitLab specific config files
+
+GitLab has configuration files located in `/home/git/gitlab/config/*`.
+Commonly referenced config files include:
+
+* `gitlab.yml` - GitLab configuration.
+* `unicorn.rb` - Unicorn web server settings.
+* `database.yml` - Database connection settings.
+
+gitlab-shell has a configuration file at `/home/git/gitlab-shell/config.yml`.
+
+## Maintenance Tasks
+
+[GitLab](https://gitlab.com/gitlab-org/gitlab-ce/tree/master) provides rake tasks with which you see version information and run a quick check on your configuration to ensure it is configured properly within the application.
+See [maintenance rake tasks](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/raketasks/maintenance.md).
+In a nutshell, do the following:
+
+```
+sudo -i -u git
+cd gitlab
+bundle exec rake gitlab:env:info RAILS_ENV=production
+bundle exec rake gitlab:check RAILS_ENV=production
+```
+
+Note: It is recommended to log into the `git` user using `sudo -i -u git` or `sudo su - git`.
+While the sudo commands provided by gitlabhq work in Ubuntu they do not always work in RHEL.
diff --git a/doc/development/resources/gitlab_diagram_overview.odg b/doc/development/resources/gitlab_diagram_overview.odg
Binary files differ
new file mode 100644
index 00000000000..b7e02f8fa78
--- /dev/null
+++ b/doc/development/resources/gitlab_diagram_overview.odg
diff --git a/doc/development/resources/gitlab_diagram_overview.png b/doc/development/resources/gitlab_diagram_overview.png
Binary files differ
new file mode 100644
index 00000000000..b5831cf0a4c
--- /dev/null
+++ b/doc/development/resources/gitlab_diagram_overview.png
diff --git a/doc/development/shell_commands.md b/doc/development/shell_commands.md
new file mode 100644
index 00000000000..57b1172d5e6
--- /dev/null
+++ b/doc/development/shell_commands.md
@@ -0,0 +1,111 @@
+# Guidelines for shell commands in the GitLab codebase
+
+## Use File and FileUtils instead of shell commands
+
+Sometimes we invoke basic Unix commands via the shell when there is also a Ruby API for doing it.
+Use the Ruby API if it exists.
+http://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions
+
+```ruby
+# Wrong
+system "mkdir -p tmp/special/directory"
+# Better (separate tokens)
+system *%W(mkdir -p tmp/special/directory)
+# Best (do not use a shell command)
+FileUtils.mkdir_p "tmp/special/directory"
+
+# Wrong
+contents = `cat #{filename}`
+# Correct
+contents = File.read(filename)
+```
+
+This coding style could have prevented CVE-2013-4490.
+
+## Bypass the shell by splitting commands into separate tokens
+
+When we pass shell commands as a single string to Ruby, Ruby will let `/bin/sh` evaluate the entire string.
+Essentially, we are asking the shell to evaluate a one-line script.
+This creates a risk for shell injection attacks.
+It is better to split the shell command into tokens ourselves.
+Sometimes we use the scripting capabilities of the shell to change the working directory or set environment variables.
+All of this can also be achieved securely straight from Ruby
+
+```ruby
+# Wrong
+system "cd /home/git/gitlab && bundle exec rake db:#{something} RAILS_ENV=production"
+# Correct
+system({'RAILS_ENV' => 'production'}, *%W(bundle exec rake db:#{something}), chdir: '/home/git/gitlab')
+
+# Wrong
+system "touch #{myfile}"
+# Better
+system "touch", myfile
+# Best (do not run a shell command at all)
+FileUtils.touch myfile
+```
+
+This coding style could have prevented CVE-2013-4546.
+
+## Separate options from arguments with --
+
+Make the difference between options and arguments clear to the argument parsers of system commands with `--`.
+This is supported by many but not all Unix commands.
+
+To understand what `--` does, consider the problem below.
+
+```
+# Example
+$ echo hello > -l
+$ cat -l
+cat: illegal option -- l
+usage: cat [-benstuv] [file ...]
+```
+
+In the example above, the argument parser of `cat` assumes that `-l` is an option.
+The solution in the example above is to make it clear to `cat` that `-l` is really an argument, not an option.
+Many Unix command line tools follow the convention of separating options from arguments with `--`.
+
+```
+# Example (continued)
+$ cat -- -l
+hello
+```
+
+In the GitLab codebase, we avoid the option/argument ambiguity by _always_ using `--`.
+
+```ruby
+# Wrong
+system(*%W(git branch -d #{branch_name}))
+# Correct
+system(*%W(git branch -d -- #{branch_name}))
+```
+
+This coding style could have prevented CVE-2013-4582.
+
+## Do not use the backticks
+
+Capturing the output of shell commands with backticks reads nicely, but you are forced to pass the command as one string to the shell.
+We explained above that this is unsafe.
+In the main GitLab codebase, the solution is to use `Gitlab::Popen.popen` instead.
+
+```ruby
+# Wrong
+logs = `cd #{repo_dir} && git log`
+# Correct
+logs, exit_status = Gitlab::Popen.popen(%W(git log), repo_dir)
+
+# Wrong
+user = `whoami`
+# Correct
+user, exit_status = Gitlab::Popen.popen(%W(whoami))
+```
+
+In other repositories, such as gitlab-shell you can also use `IO.popen`.
+
+```ruby
+# Safe IO.popen example
+logs = IO.popen(%W(git log), chdir: repo_dir).read
+```
+
+Note that unlike `Gitlab::Popen.popen`, `IO.popen` does not capture standard error.
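Reviewer's note on `shell_commands.md`: the three guidelines above (separate tokens, `--` before arguments, no backticks) can be tried out together in a standalone script. The following is only a minimal sketch under stated assumptions, not code from the GitLab codebase: the helper name `run_tokens` is hypothetical, and it uses `Open3.capture2e` from the Ruby standard library instead of `Gitlab::Popen.popen`, so that standard error is captured together with standard output. It also assumes a Unix-like system with a `cat` that honours `--`.

```ruby
require "open3"
require "tmpdir"

# Hypothetical helper, not part of the GitLab codebase.
# Runs a command given as separate tokens inside `dir` and returns
# [combined stdout and stderr, numeric exit status].
def run_tokens(tokens, dir)
  command, *args = tokens
  # The [command, argv0] form makes Ruby bypass the shell even when
  # the command has no further arguments.
  output, status = Open3.capture2e([command, command], *args, chdir: dir)
  [output, status.exitstatus]
end

Dir.mktmpdir do |dir|
  # A file name that an argument parser could mistake for an option.
  File.write(File.join(dir, "-l"), "hello\n")

  # `--` marks the end of options, so `-l` is read as a file name.
  output, exit_status = run_tokens(%W(cat -- -l), dir)
  puts output
  puts "exit status: #{exit_status}"
end
```

Because every token reaches the program as a single argument, a value such as `x; echo injected` is treated as a literal file name rather than as shell syntax.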