From d8f33c0a51d9106ece6cd4bae469e40734e05f85 Mon Sep 17 00:00:00 2001
From: Achilleas Pipinellis
Date: Sun, 25 Sep 2016 12:44:09 +0200
Subject: Move operations/ to new location

[ci skip]
---
 doc/operations/README.md                     |   6 +-
 doc/operations/cleaning_up_redis_sessions.md |  53 +-------
 doc/operations/moving_repositories.md        | 181 +--------------------------
 doc/operations/sidekiq_memory_killer.md      |  41 +-----
 doc/operations/unicorn.md                    |  87 +------------
 5 files changed, 5 insertions(+), 363 deletions(-)

(limited to 'doc/operations')

diff --git a/doc/operations/README.md b/doc/operations/README.md
index 6a35dab7b6c..58f16aff7bd 100644
--- a/doc/operations/README.md
+++ b/doc/operations/README.md
@@ -1,5 +1 @@
-# GitLab operations
-
-- [Sidekiq MemoryKiller](sidekiq_memory_killer.md)
-- [Cleaning up Redis sessions](cleaning_up_redis_sessions.md)
-- [Understanding Unicorn and unicorn-worker-killer](unicorn.md)
+This document was moved to [administration/operations](../administration/operations.md).
diff --git a/doc/operations/cleaning_up_redis_sessions.md b/doc/operations/cleaning_up_redis_sessions.md
index 93521e976d5..2a1d0a8c8eb 100644
--- a/doc/operations/cleaning_up_redis_sessions.md
+++ b/doc/operations/cleaning_up_redis_sessions.md
@@ -1,52 +1 @@
-# Cleaning up stale Redis sessions
-
-Since version 6.2, GitLab stores web user sessions as key-value pairs in Redis.
-Prior to GitLab 7.3, user sessions did not automatically expire from Redis. If
-you have been running a large GitLab server (thousands of users) since before
-GitLab 7.3 we recommend cleaning up stale sessions to compact the Redis
-database after you upgrade to GitLab 7.3. You can also perform a cleanup while
-still running GitLab 7.2 or older, but in that case new stale sessions will
-start building up again after you clean up.
-
-In GitLab versions prior to 7.3.0, the session keys in Redis are 16-byte
-hexadecimal values such as '976aa289e2189b17d7ef525a6702ace9'. Starting with
-GitLab 7.3.0, the keys are
-prefixed with 'session:gitlab:', so they would look like
-'session:gitlab:976aa289e2189b17d7ef525a6702ace9'. Below we describe how to
-remove the keys in the old format.
-
-First we define a shell function with the proper Redis connection details.
-
-```
-rcli() {
-  # This example works for Omnibus installations of GitLab 7.3 or newer. For an
-  # installation from source you will have to change the socket path and the
-  # path to redis-cli.
-  sudo /opt/gitlab/embedded/bin/redis-cli -s /var/opt/gitlab/redis/redis.socket "$@"
-}
-
-# test the new shell function; the response should be PONG
-rcli ping
-```
-
-Now we do a search to see if there are any session keys in the old format for
-us to clean up.
-
-```
-# returns the number of old-format session keys in Redis
-rcli keys '*' | grep '^[a-f0-9]\{32\}$' | wc -l
-```
-
-If the number is larger than zero, you can proceed to expire the keys from
-Redis. If the number is zero there is nothing to clean up.
-
-```
-# Tell Redis to expire each matched key after 600 seconds.
-rcli keys '*' | grep '^[a-f0-9]\{32\}$' | awk '{ print "expire", $0, 600 }' | rcli
-# This will print '(integer) 1' for each key that gets expired.
-```
-
-Over the next 15 minutes (10 minutes expiry time plus 5 minutes Redis
-background save interval) your Redis database will be compacted. If you are
-still using GitLab 7.2, users who are not clicking around in GitLab during the
-10 minute expiry window will be signed out of GitLab.
+This document was moved to [administration/operations/cleaning_up_redis_sessions](../administration/operations/cleaning_up_redis_sessions.md).
diff --git a/doc/operations/moving_repositories.md b/doc/operations/moving_repositories.md
index 54adb99386a..c54bca324a5 100644
--- a/doc/operations/moving_repositories.md
+++ b/doc/operations/moving_repositories.md
@@ -1,180 +1 @@
-# Moving repositories managed by GitLab
-
-Sometimes you need to move all repositories managed by GitLab to
-another filesystem or another server. In this document we will look
-at some of the ways you can copy all your repositories from
-`/var/opt/gitlab/git-data/repositories` to `/mnt/gitlab/repositories`.
-
-We will look at three scenarios: the target directory is empty, the
-target directory contains an outdated copy of the repositories, and
-how to deal with thousands of repositories.
-
-**Each of the approaches we list can/will overwrite data in the
-target directory `/mnt/gitlab/repositories`. Do not mix up the
-source and the target.**
-
-## Target directory is empty: use a tar pipe
-
-If the target directory `/mnt/gitlab/repositories` is empty the
-simplest thing to do is to use a tar pipe. This method has low
-overhead and tar is almost always already installed on your system.
-However, it is not possible to resume an interrupted tar pipe: if
-that happens then all data must be copied again.
-
-```
-# As the git user
-tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
-  tar -C /mnt/gitlab/repositories -xf -
-```
-
-If you want to see progress, replace `-xf` with `-xvf`.
-
-### Tar pipe to another server
-
-You can also use a tar pipe to copy data to another server. If your
-'git' user has SSH access to the newserver as 'git@newserver', you
-can pipe the data through SSH.
-
-```
-# As the git user
-tar -C /var/opt/gitlab/git-data/repositories -cf - -- . |\
-  ssh git@newserver tar -C /mnt/gitlab/repositories -xf -
-```
-
-If you want to compress the data before it goes over the network
-(which will cost you CPU cycles) you can replace `ssh` with `ssh -C`.
-
-## The target directory contains an outdated copy of the repositories: use rsync
-
-If the target directory already contains a partial / outdated copy
-of the repositories it may be wasteful to copy all the data again
-with tar. In this scenario it is better to use rsync. This utility
-is either already installed on your system or easily installable
-via apt, yum etc.
-
-```
-# As the 'git' user
-rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
-  /mnt/gitlab/repositories
-```
-
-The `/.` in the command above is very important, without it you can
-easily get the wrong directory structure in the target directory.
-If you want to see progress, replace `-a` with `-av`.
-
-### Single rsync to another server
-
-If the 'git' user on your source system has SSH access to the target
-server you can send the repositories over the network with rsync.
-
-```
-# As the 'git' user
-rsync -a --delete /var/opt/gitlab/git-data/repositories/. \
-  git@newserver:/mnt/gitlab/repositories
-```
-
-## Thousands of Git repositories: use one rsync per repository
-
-Every time you start an rsync job it has to inspect all files in
-the source directory, all files in the target directory, and then
-decide what files to copy or not. If the source or target directory
-has many contents this startup phase of rsync can become a burden
-for your GitLab server. In cases like this you can make rsync's
-life easier by dividing its work in smaller pieces, and sync one
-repository at a time.
-
-In addition to rsync we will use [GNU
-Parallel](http://www.gnu.org/software/parallel/). This utility is
-not included in GitLab so you need to install it yourself with apt
-or yum. Also note that the GitLab scripts we used below were added
-in GitLab 8.1.
-
-** This process does not clean up repositories at the target location that no
-longer exist at the source. ** If you start using your GitLab instance with
-`/mnt/gitlab/repositories`, you need to run `gitlab-rake gitlab:cleanup:repos`
-after switching to the new repository storage directory.
-
-### Parallel rsync for all repositories known to GitLab
-
-This will sync repositories with 10 rsync processes at a time. We keep
-track of progress so that the transfer can be restarted if necessary.
-
-First we create a new directory, owned by 'git', to hold transfer
-logs. We assume the directory is empty before we start the transfer
-procedure, and that we are the only ones writing files in it.
-
-```
-# Omnibus
-sudo mkdir /var/opt/gitlab/transfer-logs
-sudo chown git:git /var/opt/gitlab/transfer-logs
-
-# Source
-sudo -u git -H mkdir /home/git/transfer-logs
-```
-
-We seed the process with a list of the directories we want to copy.
-
-```
-# Omnibus
-sudo -u git sh -c 'gitlab-rake gitlab:list_repos > /var/opt/gitlab/transfer-logs/all-repos-$(date +%s).txt'
-
-# Source
-cd /home/git/gitlab
-sudo -u git -H sh -c 'bundle exec rake gitlab:list_repos > /home/git/transfer-logs/all-repos-$(date +%s).txt'
-```
-
-Now we can start the transfer. The command below is idempotent, and
-the number of jobs done by GNU Parallel should converge to zero. If it
-does not some repositories listed in all-repos-1234.txt may have been
-deleted/renamed before they could be copied.
-
-```
-# Omnibus
-sudo -u git sh -c '
-cat /var/opt/gitlab/transfer-logs/* | sort | uniq -u |\
-  /usr/bin/env JOBS=10 \
-  /opt/gitlab/embedded/service/gitlab-rails/bin/parallel-rsync-repos \
-    /var/opt/gitlab/transfer-logs/success-$(date +%s).log \
-    /var/opt/gitlab/git-data/repositories \
-    /mnt/gitlab/repositories
-'
-
-# Source
-cd /home/git/gitlab
-sudo -u git -H sh -c '
-cat /home/git/transfer-logs/* | sort | uniq -u |\
-  /usr/bin/env JOBS=10 \
-  bin/parallel-rsync-repos \
-    /home/git/transfer-logs/success-$(date +%s).log \
-    /home/git/repositories \
-    /mnt/gitlab/repositories
-`
-```
-
-### Parallel rsync only for repositories with recent activity
-
-Suppose you have already done one sync that started after 2015-10-1 12:00 UTC.
-Then you might only want to sync repositories that were changed via GitLab
-_after_ that time. You can use the 'SINCE' variable to tell 'rake
-gitlab:list_repos' to only print repositories with recent activity.
-
-```
-# Omnibus
-sudo gitlab-rake gitlab:list_repos SINCE='2015-10-1 12:00 UTC' |\
-  sudo -u git \
-  /usr/bin/env JOBS=10 \
-  /opt/gitlab/embedded/service/gitlab-rails/bin/parallel-rsync-repos \
-    success-$(date +%s).log \
-    /var/opt/gitlab/git-data/repositories \
-    /mnt/gitlab/repositories
-
-# Source
-cd /home/git/gitlab
-sudo -u git -H bundle exec rake gitlab:list_repos SINCE='2015-10-1 12:00 UTC' |\
-  sudo -u git -H \
-  /usr/bin/env JOBS=10 \
-  bin/parallel-rsync-repos \
-    success-$(date +%s).log \
-    /home/git/repositories \
-    /mnt/gitlab/repositories
-```
+This document was moved to [administration/operations/moving_repositories](../administration/operations/moving_repositories.md).
diff --git a/doc/operations/sidekiq_memory_killer.md b/doc/operations/sidekiq_memory_killer.md
index b5e78348989..cf7c3b2e2ed 100644
--- a/doc/operations/sidekiq_memory_killer.md
+++ b/doc/operations/sidekiq_memory_killer.md
@@ -1,40 +1 @@
-# Sidekiq MemoryKiller
-
-The GitLab Rails application code suffers from memory leaks. For web requests
-this problem is made manageable using
-[unicorn-worker-killer](https://github.com/kzk/unicorn-worker-killer) which
-restarts Unicorn worker processes in between requests when needed. The Sidekiq
-MemoryKiller applies the same approach to the Sidekiq processes used by GitLab
-to process background jobs.
-
-Unlike unicorn-worker-killer, which is enabled by default for all GitLab
-installations since GitLab 6.4, the Sidekiq MemoryKiller is enabled by default
-_only_ for Omnibus packages. The reason for this is that the MemoryKiller
-relies on Runit to restart Sidekiq after a memory-induced shutdown and GitLab
-installations from source do not all use Runit or an equivalent.
-
-With the default settings, the MemoryKiller will cause a Sidekiq restart no
-more often than once every 15 minutes, with the restart causing about one
-minute of delay for incoming background jobs.
-
-## Configuring the MemoryKiller
-
-The MemoryKiller is controlled using environment variables.
-
-- `SIDEKIQ_MEMORY_KILLER_MAX_RSS`: if this variable is set, and its value is
-  greater than 0, then after each Sidekiq job, the MemoryKiller will check the
-  RSS of the Sidekiq process that executed the job. If the RSS of the Sidekiq
-  process (expressed in kilobytes) exceeds SIDEKIQ_MEMORY_KILLER_MAX_RSS, a
-  delayed shutdown is triggered. The default value for Omnibus packages is set
-  [in the omnibus-gitlab
-  repository](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/files/gitlab-cookbooks/gitlab/attributes/default.rb).
-- `SIDEKIQ_MEMORY_KILLER_GRACE_TIME`: defaults 900 seconds (15 minutes). When
-  a shutdown is triggered, the Sidekiq process will keep working normally for
-  another 15 minutes.
-- `SIDEKIQ_MEMORY_KILLER_SHUTDOWN_WAIT`: defaults to 30 seconds. When the grace
-  time has expired, the MemoryKiller tells Sidekiq to stop accepting new jobs.
-  Existing jobs get 30 seconds to finish. After that, the MemoryKiller tells
-  Sidekiq to shut down, and an external supervision mechanism (e.g. Runit) must
-  restart Sidekiq.
-- `SIDEKIQ_MEMORY_KILLER_SHUTDOWN_SIGNAL`: defaults to `SIGKILL`. The name of
-  the final signal sent to the Sidekiq process when we want it to shut down.
+This document was moved to [administration/operations/sidekiq_memory_killer](../administration/operations/sidekiq_memory_killer.md).
diff --git a/doc/operations/unicorn.md b/doc/operations/unicorn.md
index bad61151bda..fbc9697b755 100644
--- a/doc/operations/unicorn.md
+++ b/doc/operations/unicorn.md
@@ -1,86 +1 @@
-# Understanding Unicorn and unicorn-worker-killer
-
-## Unicorn
-
-GitLab uses [Unicorn](http://unicorn.bogomips.org/), a pre-forking Ruby web
-server, to handle web requests (web browsers and Git HTTP clients). Unicorn is
-a daemon written in Ruby and C that can load and run a Ruby on Rails
-application; in our case the Rails application is GitLab Community Edition or
-GitLab Enterprise Edition.
-
-Unicorn has a multi-process architecture to make better use of available CPU
-cores (processes can run on different cores) and to have stronger fault
-tolerance (most failures stay isolated in only one process and cannot take down
-GitLab entirely). On startup, the Unicorn 'master' process loads a clean Ruby
-environment with the GitLab application code, and then spawns 'workers' which
-inherit this clean initial environment. The 'master' never handles any
-requests, that is left to the workers. The operating system network stack
-queues incoming requests and distributes them among the workers.
-
-In a perfect world, the master would spawn its pool of workers once, and then
-the workers handle incoming web requests one after another until the end of
-time. In reality, worker processes can crash or time out: if the master notices
-that a worker takes too long to handle a request it will terminate the worker
-process with SIGKILL ('kill -9'). No matter how the worker process ended, the
-master process will replace it with a new 'clean' process again. Unicorn is
-designed to be able to replace 'crashed' workers without dropping user
-requests.
-
-This is what a Unicorn worker timeout looks like in `unicorn_stderr.log`. The
-master process has PID 56227 below.
-
-```
-[2015-06-05T10:58:08.660325 #56227] ERROR -- : worker=10 PID:53009 timeout (61s > 60s), killing
-[2015-06-05T10:58:08.699360 #56227] ERROR -- : reaped # worker=10
-[2015-06-05T10:58:08.708141 #62538] INFO -- : worker=10 spawned pid=62538
-[2015-06-05T10:58:08.708824 #62538] INFO -- : worker=10 ready
-```
-
-### Tunables
-
-The main tunables for Unicorn are the number of worker processes and the
-request timeout after which the Unicorn master terminates a worker process.
-See the [omnibus-gitlab Unicorn settings
-documentation](https://gitlab.com/gitlab-org/omnibus-gitlab/blob/master/doc/settings/unicorn.md)
-if you want to adjust these settings.
-
-## unicorn-worker-killer
-
-GitLab has memory leaks. These memory leaks manifest themselves in long-running
-processes, such as Unicorn workers. (The Unicorn master process is not known to
-leak memory, probably because it does not handle user requests.)
-
-To make these memory leaks manageable, GitLab comes with the
-[unicorn-worker-killer gem](https://github.com/kzk/unicorn-worker-killer). This
-gem [monkey-patches](https://en.wikipedia.org/wiki/Monkey_patch) the Unicorn
-workers to do a memory self-check after every 16 requests. If the memory of the
-Unicorn worker exceeds a pre-set limit then the worker process exits. The
-Unicorn master then automatically replaces the worker process.
-
-This is a robust way to handle memory leaks: Unicorn is designed to handle
-workers that 'crash' so no user requests will be dropped. The
-unicorn-worker-killer gem is designed to only terminate a worker process _in
-between requests_, so no user requests are affected.
-
-This is what a Unicorn worker memory restart looks like in unicorn_stderr.log.
-You see that worker 4 (PID 125918) is inspecting itself and decides to exit.
-The threshold memory value was 254802235 bytes, about 250MB. With GitLab this
-threshold is a random value between 200 and 250 MB. The master process (PID
-117565) then reaps the worker process and spawns a new 'worker 4' with PID
-127549.
-
-```
-[2015-06-05T12:07:41.828374 #125918] WARN -- : #: worker (pid: 125918) exceeds memory limit (256413696 bytes > 254802235 bytes)
-[2015-06-05T12:07:41.828472 #125918] WARN -- : Unicorn::WorkerKiller send SIGQUIT (pid: 125918) alive: 23 sec (trial 1)
-[2015-06-05T12:07:42.025916 #117565] INFO -- : reaped # worker=4
-[2015-06-05T12:07:42.034527 #127549] INFO -- : worker=4 spawned pid=127549
-[2015-06-05T12:07:42.035217 #127549] INFO -- : worker=4 ready
-```
-
-One other thing that stands out in the log snippet above, taken from
-GitLab.com, is that 'worker 4' was serving requests for only 23 seconds. This
-is a normal value for our current GitLab.com setup and traffic.
-
-The high frequency of Unicorn memory restarts on some GitLab sites can be a
-source of confusion for administrators. Usually they are a [red
-herring](https://en.wikipedia.org/wiki/Red_herring).
+This document was moved to [administration/operations/unicorn](../administration/operations/unicorn.md).
-- 
cgit v1.2.1