Diffstat (limited to 'doc/development')
-rw-r--r-- doc/development/architecture.md 194
-rw-r--r-- doc/development/resources/gitlab_diagram_overview.odg bin 0 -> 26292 bytes
-rw-r--r-- doc/development/resources/gitlab_diagram_overview.png bin 0 -> 130285 bytes
-rw-r--r-- doc/development/shell_commands.md 111
4 files changed, 288 insertions, 17 deletions
diff --git a/doc/development/architecture.md b/doc/development/architecture.md
index db22f0bda85..8a772344b79 100644
--- a/doc/development/architecture.md
+++ b/doc/development/architecture.md
@@ -1,23 +1,183 @@
-# GitLab project architecture
+# GitLab Architecture Overview
+---
-GitLab project consists of two parts: GitLab and GitLab shell.
+# Software delivery
-## GitLab
+There are two editions of GitLab: [Enterprise Edition](https://www.gitlab.com/features/) (EE) and [Community Edition](http://gitlab.org/gitlab-ce/) (CE).
+GitLab CE is delivered via git from the [gitlabhq repository](https://gitlab.com/gitlab-org/gitlab-ce/tree/master).
+New versions of GitLab are released in stable branches and the master branch is for bleeding edge development.
-Web application with background jobs workers.
-Provides you with UI and most of functionality.
-For some operations like repo creation - uses GitLab shell.
+EE releases are available not long after CE releases.
+GitLab EE can be obtained from a [repository at gitlab.com](https://gitlab.com/subscribers/gitlab-ee).
+For more information about the release process see the section 'New versions and upgrading' in the readme.
-Uses:
- * Ruby as main language for application code and most libraries.
- * [Rails](http://rubyonrails.org/) web framework as main framework for application.
- * Mysql or postgres as main databases. Used for persistent data storage(users, project, issues etc).
- * Redis database. Used for cache and exchange data between some components.
- * Python2 because of [pygments](http://pygments.org/) as code syntax highlighter.
+Both EE and CE require an add-on component called gitlab-shell.
+It is obtained from the [gitlab-shell repository](https://gitlab.com/gitlab-org/gitlab-shell/tree/master).
+New versions are usually tags but staying on the master branch will give you the latest stable version.
+New releases generally appear around the same time as GitLab CE releases, with the exception of informal security updates deemed critical.
-## GitLab shell
+# System Layout
-Command line ruby application. Used by GitLab through shell commands.
-It provides interface to all kind of manipulations with repositories and ssh keys.
-Full list of commands you can find in README of GitLab shell repo.
-Works on pure ruby and do not require any additional software.
+In the pictures, `~git` refers to the home directory of the git user, which is typically `/home/git`.
+
+GitLab is primarily installed within the `/home/git` home directory, as the `git` user.
+The gitlabhq server software resides within this home directory, as do the repositories (though the repository location is configurable).
+The bare repositories are located in `/home/git/repositories`.
+GitLab is a Ruby on Rails application, so the particulars of its inner workings can be learned by studying how a Ruby on Rails application works.
+To serve repositories over SSH there's an add-on application called gitlab-shell which is installed in `/home/git/gitlab-shell`.
+
+## Components
+
+![GitLab Diagram Overview](resources/gitlab_diagram_overview.png "GitLab Diagram Overview")
+
+A typical install of GitLab will be on Ubuntu Linux or RHEL/CentOS.
+It uses Nginx or Apache as a web front end to proxypass the Unicorn web server.
+By default, communication between Unicorn and the front end is via a Unix domain socket but forwarding requests via TCP is also supported.
+The web front end accesses `/home/git/gitlab/public` bypassing the Unicorn server to serve static pages, uploads (e.g. avatar images or attachments), and precompiled assets.
+GitLab serves web pages and a [GitLab API](https://gitlab.com/gitlab-org/gitlab-ce/tree/master/doc/api) using the Unicorn web server.
+It uses Sidekiq as a job queue which, in turn, uses redis as a non-persistent database backend for job information, metadata, and incoming jobs.
+The GitLab web app uses MySQL or PostgreSQL for persistent database information (e.g. users, permissions, issues, other metadata).
+GitLab stores the bare git repositories it serves in `/home/git/repositories` by default.
+It also keeps default branch and hook information with the bare repository.
+`/home/git/gitlab-satellites` keeps checked out repositories when performing actions such as a merge request, editing files in the web interface, etc.
+The satellite repository is used by the web interface for editing repositories and the wiki which is also a git repository.
+When serving repositories over HTTP/HTTPS GitLab utilizes the GitLab API to resolve authorization and access as well as serving git objects.
+
+The add-on component gitlab-shell serves repositories over SSH.
+It manages the SSH keys within `/home/git/.ssh/authorized_keys` which should not be manually edited.
+gitlab-shell accesses the bare repositories directly to serve git objects and communicates with redis to submit jobs to Sidekiq for GitLab to process.
+gitlab-shell queries the GitLab API to determine authorization and access.
+
+## Installation Folder Summary
+
+To summarize, here's the [directory structure of the `git` user home directory](../install/structure.md).
+
+
+## Processes
+
+    ps aux | grep '^git'
+
+GitLab has several components to operate.
+As a system user (i.e. any user that is not the `git` user) it requires a persistent database (MySQL/PostgreSQL) and a redis database.
+It also uses Apache httpd or nginx to proxypass Unicorn.
+As the `git` user it starts Sidekiq and Unicorn (a simple ruby HTTP server running on port `8080` by default).
+Under the `git` user there are normally 4 processes: `unicorn_rails master` (1 process), `unicorn_rails worker` (2 processes), `sidekiq` (1 process).
+
+## Repository access
+
+Repositories get accessed via HTTP or SSH.
+HTTP cloning/push/pull utilizes the GitLab API and SSH cloning is handled by gitlab-shell (previously explained).
+
+# Troubleshooting
+
+See the README for more information.
+
+## Init scripts of the services
+
+The GitLab init script starts and stops Unicorn and Sidekiq.
+
+```
+/etc/init.d/gitlab
+Usage: service gitlab {start|stop|restart|reload|status}
+```
+
+Redis (key-value store/non-persistent database)
+
+```
+/etc/init.d/redis
+Usage: /etc/init.d/redis {start|stop|status|restart|condrestart|try-restart}
+```
+
+SSH daemon
+
+```
+/etc/init.d/sshd
+Usage: /etc/init.d/sshd {start|stop|restart|reload|force-reload|condrestart|try-restart|status}
+```
+
+Web server (one of the following)
+
+```
+/etc/init.d/httpd
+Usage: httpd {start|stop|restart|condrestart|try-restart|force-reload|reload|status|fullstatus|graceful|help|configtest}
+
+/etc/init.d/nginx
+Usage: nginx {start|stop|restart|reload|force-reload|status|configtest}
+```
+
+Persistent database (one of the following)
+
+```
+/etc/init.d/mysqld
+Usage: /etc/init.d/mysqld {start|stop|status|restart|condrestart|try-restart|reload|force-reload}
+
+/etc/init.d/postgresql
+Usage: /etc/init.d/postgresql {start|stop|restart|reload|force-reload|status} [version ..]
+```
+
+## Log locations of the services
+
+Note: `~git/` is shorthand for `/home/git/`.
+
+gitlabhq (includes Unicorn and Sidekiq logs)
+
+* `/home/git/gitlab/log/` contains `application.log`, `production.log`, `sidekiq.log`, `unicorn.stdout.log`, `githost.log`, `satellites.log`, and `unicorn.stderr.log` normally.
+
+gitlab-shell
+
+* `/home/git/gitlab-shell/gitlab-shell.log`
+
+ssh
+
+* `/var/log/auth.log` auth log (on Ubuntu).
+* `/var/log/secure` auth log (on RHEL).
+
+nginx
+
+* `/var/log/nginx/` contains error and access logs.
+
+Apache httpd
+
+* [Explanation of apache logs](http://httpd.apache.org/docs/2.2/logs.html).
+* `/var/log/apache2/` contains error and output logs (on Ubuntu).
+* `/var/log/httpd/` contains error and output logs (on RHEL).
+
+redis
+
+* `/var/log/redis/redis.log`; logrotated logs are also kept in that directory.
+
+PostgreSQL
+
+* `/var/log/postgresql/*`
+
+MySQL
+
+* `/var/log/mysql/*`
+* `/var/log/mysql.*`
+
+## GitLab specific config files
+
+GitLab has configuration files located in `/home/git/gitlab/config/*`.
+Commonly referenced config files include:
+
+* `gitlab.yml` - GitLab configuration.
+* `unicorn.rb` - Unicorn web server settings.
+* `database.yml` - Database connection settings.
+
+gitlab-shell has a configuration file at `/home/git/gitlab-shell/config.yml`.
+
+## Maintenance Tasks
+
+[GitLab](https://gitlab.com/gitlab-org/gitlab-ce/tree/master) provides rake tasks with which you can see version information and run a quick check to ensure your installation is configured properly within the application.
+See [maintenance rake tasks](https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/raketasks/maintenance.md).
+In a nutshell, do the following:
+
+```
+sudo -i -u git
+cd gitlab
+bundle exec rake gitlab:env:info RAILS_ENV=production
+bundle exec rake gitlab:check RAILS_ENV=production
+```
+
+Note: It is recommended to log into the `git` user using `sudo -i -u git` or `sudo su - git`.
+While the sudo commands provided by gitlabhq work in Ubuntu they do not always work in RHEL.
diff --git a/doc/development/resources/gitlab_diagram_overview.odg b/doc/development/resources/gitlab_diagram_overview.odg
new file mode 100644
index 00000000000..b7e02f8fa78
--- /dev/null
+++ b/doc/development/resources/gitlab_diagram_overview.odg
Binary files differ
diff --git a/doc/development/resources/gitlab_diagram_overview.png b/doc/development/resources/gitlab_diagram_overview.png
new file mode 100644
index 00000000000..b5831cf0a4c
--- /dev/null
+++ b/doc/development/resources/gitlab_diagram_overview.png
Binary files differ
diff --git a/doc/development/shell_commands.md b/doc/development/shell_commands.md
new file mode 100644
index 00000000000..57b1172d5e6
--- /dev/null
+++ b/doc/development/shell_commands.md
@@ -0,0 +1,111 @@
+# Guidelines for shell commands in the GitLab codebase
+
+## Use File and FileUtils instead of shell commands
+
+Sometimes we invoke basic Unix commands via the shell when there is also a Ruby API for doing it.
+Use the Ruby API if it exists.
+http://www.ruby-doc.org/stdlib-2.0.0/libdoc/fileutils/rdoc/FileUtils.html#module-FileUtils-label-Module+Functions
+
+```ruby
+# Wrong
+system "mkdir -p tmp/special/directory"
+# Better (separate tokens)
+system *%W(mkdir -p tmp/special/directory)
+# Best (do not use a shell command)
+FileUtils.mkdir_p "tmp/special/directory"
+
+# Wrong
+contents = `cat #{filename}`
+# Correct
+contents = File.read(filename)
+```
+
+This coding style could have prevented CVE-2013-4490.
+
+## Bypass the shell by splitting commands into separate tokens
+
+When we pass shell commands as a single string to Ruby, Ruby will let `/bin/sh` evaluate the entire string.
+Essentially, we are asking the shell to evaluate a one-line script.
+This creates a risk for shell injection attacks.
+It is better to split the shell command into tokens ourselves.
+Sometimes we use the scripting capabilities of the shell to change the working directory or set environment variables.
+All of this can also be achieved securely straight from Ruby.
+
+```ruby
+# Wrong
+system "cd /home/git/gitlab && bundle exec rake db:#{something} RAILS_ENV=production"
+# Correct
+system({'RAILS_ENV' => 'production'}, *%W(bundle exec rake db:#{something}), chdir: '/home/git/gitlab')
+
+# Wrong
+system "touch #{myfile}"
+# Better
+system "touch", myfile
+# Best (do not run a shell command at all)
+FileUtils.touch myfile
+```
+
+This coding style could have prevented CVE-2013-4546.
+
+## Separate options from arguments with --
+
+Make the difference between options and arguments clear to the argument parsers of system commands with `--`.
+This is supported by many but not all Unix commands.
+
+To understand what `--` does, consider the problem below.
+
+```
+# Example
+$ echo hello > -l
+$ cat -l
+cat: illegal option -- l
+usage: cat [-benstuv] [file ...]
+```
+
+In the example above, the argument parser of `cat` assumes that `-l` is an option.
+The solution in the example above is to make it clear to `cat` that `-l` is really an argument, not an option.
+Many Unix command line tools follow the convention of separating options from arguments with `--`.
+
+```
+# Example (continued)
+$ cat -- -l
+hello
+```
+
+In the GitLab codebase, we avoid the option/argument ambiguity by _always_ using `--`.
+
+```ruby
+# Wrong
+system(*%W(git branch -d #{branch_name}))
+# Correct
+system(*%W(git branch -d -- #{branch_name}))
+```
+
+This coding style could have prevented CVE-2013-4582.
+
+## Do not use backticks
+
+Capturing the output of shell commands with backticks reads nicely, but you are forced to pass the command as one string to the shell.
+We explained above that this is unsafe.
+In the main GitLab codebase, the solution is to use `Gitlab::Popen.popen` instead.
+
+```ruby
+# Wrong
+logs = `cd #{repo_dir} && git log`
+# Correct
+logs, exit_status = Gitlab::Popen.popen(%W(git log), repo_dir)
+
+# Wrong
+user = `whoami`
+# Correct
+user, exit_status = Gitlab::Popen.popen(%W(whoami))
+```
+
+In other repositories, such as gitlab-shell, you can also use `IO.popen`.
+
+```ruby
+# Safe IO.popen example
+logs = IO.popen(%W(git log), chdir: repo_dir).read
+```
+
+Note that unlike `Gitlab::Popen.popen`, `IO.popen` does not capture standard error.
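
When standard error is needed outside the main GitLab codebase, one option is Ruby's standard `Open3` library. The following is a minimal sketch rather than anything taken from GitLab or gitlab-shell: the helper name `popen_with_stderr` is hypothetical, and it assumes the command arrives already split into tokens, as the guidelines above recommend.

```ruby
require "open3"

# Hypothetical helper (not part of GitLab or gitlab-shell).
# Runs a tokenized command without a shell and returns
# [combined stdout and stderr, numeric exit status].
def popen_with_stderr(cmd, dir = Dir.pwd)
  # Passing [name, argv0] as the first argument guarantees that no shell
  # is involved, even when cmd has only a single token.
  program = [cmd.first, cmd.first]
  output, status = Open3.capture2e(program, *cmd.drop(1), chdir: dir)
  [output, status.exitstatus]
end

# Usage, mirroring the `git log` example above (repo_dir is assumed to exist):
#   logs, exit_status = popen_with_stderr(%W(git log), repo_dir)
```

`Open3.capture2e` merges standard error into the returned string; `Open3.capture3` would keep the two streams separate if that is preferable.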