Diffstat (limited to 'doc/administration/troubleshooting')
-rw-r--r--  doc/administration/troubleshooting/debug.md                   | 110
-rw-r--r--  doc/administration/troubleshooting/diagnostics_tools.md       |  27
-rw-r--r--  doc/administration/troubleshooting/elasticsearch.md           | 345
-rw-r--r--  doc/administration/troubleshooting/kubernetes_cheat_sheet.md  | 261
-rw-r--r--  doc/administration/troubleshooting/sidekiq.md                 | 118
5 files changed, 806 insertions, 55 deletions
diff --git a/doc/administration/troubleshooting/debug.md b/doc/administration/troubleshooting/debug.md
index 8f7280d5128..604dff5983d 100644
--- a/doc/administration/troubleshooting/debug.md
+++ b/doc/administration/troubleshooting/debug.md
@@ -10,43 +10,43 @@ an SMTP server, but you're not seeing mail delivered. Here's how to check the se
1. Run a Rails console:
- ```sh
- sudo gitlab-rails console production
- ```
+ ```sh
+ sudo gitlab-rails console production
+ ```
- or for source installs:
+ or for source installs:
- ```sh
- bundle exec rails console production
- ```
+ ```sh
+ bundle exec rails console production
+ ```
1. Look at the ActionMailer `delivery_method` to make sure it matches what you
intended. If you configured SMTP, it should say `:smtp`. If you're using
Sendmail, it should say `:sendmail`:
- ```ruby
- irb(main):001:0> ActionMailer::Base.delivery_method
- => :smtp
- ```
+ ```ruby
+ irb(main):001:0> ActionMailer::Base.delivery_method
+ => :smtp
+ ```
1. If you're using SMTP, check the mail settings:
- ```ruby
- irb(main):002:0> ActionMailer::Base.smtp_settings
- => {:address=>"localhost", :port=>25, :domain=>"localhost.localdomain", :user_name=>nil, :password=>nil, :authentication=>nil, :enable_starttls_auto=>true}```
- ```
+ ```ruby
+ irb(main):002:0> ActionMailer::Base.smtp_settings
+   => {:address=>"localhost", :port=>25, :domain=>"localhost.localdomain", :user_name=>nil, :password=>nil, :authentication=>nil, :enable_starttls_auto=>true}
+ ```
- In the example above, the SMTP server is configured for the local machine. If this is intended, you may need to check your local mail
- logs (e.g. `/var/log/mail.log`) for more details.
+ In the example above, the SMTP server is configured for the local machine. If this is intended, you may need to check your local mail
+ logs (e.g. `/var/log/mail.log`) for more details.
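+
+   For example, to watch delivery attempts as they happen (a hedged sketch;
+   the log path varies by distribution and mail daemon):
+
+   ```sh
+   sudo tail -f /var/log/mail.log
+   ```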
-1. Send a test message via the console.
+1. Send a test message via the console.
- ```ruby
- irb(main):003:0> Notify.test_email('youremail@email.com', 'Hello World', 'This is a test message').deliver_now
- ```
+ ```ruby
+ irb(main):003:0> Notify.test_email('youremail@email.com', 'Hello World', 'This is a test message').deliver_now
+ ```
- If you do not receive an e-mail and/or see an error message, then check
- your mail server settings.
+ If you do not receive an e-mail and/or see an error message, then check
+ your mail server settings.
## Advanced Issues
@@ -103,37 +103,37 @@ downtime. Otherwise skip to the next section.
1. Run `sudo gdb -p <PID>` to attach to the unicorn process.
1. In the gdb window, type:
- ```
- call (void) rb_backtrace()
- ```
+ ```
+ call (void) rb_backtrace()
+ ```
1. This forces the process to generate a Ruby backtrace. Check
   `/var/log/gitlab/unicorn/unicorn_stderr.log` for the backtrace. For example, you may see:
- ```ruby
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:33:in `block in start'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:33:in `loop'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:36:in `block (2 levels) in start'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:44:in `sample'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:68:in `sample_objects'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:68:in `each_with_object'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:68:in `each'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:69:in `block in sample_objects'
- from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:69:in `name'
- ```
+ ```ruby
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:33:in `block in start'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:33:in `loop'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:36:in `block (2 levels) in start'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:44:in `sample'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:68:in `sample_objects'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:68:in `each_with_object'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:68:in `each'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:69:in `block in sample_objects'
+ from /opt/gitlab/embedded/service/gitlab-rails/lib/gitlab/metrics/sampler.rb:69:in `name'
+ ```
1. To see the current threads, run:
- ```
- thread apply all bt
- ```
+ ```
+ thread apply all bt
+ ```
1. Once you're done debugging with `gdb`, be sure to detach from the process and exit:
- ```
- detach
- exit
- ```
+ ```
+ detach
+ exit
+ ```
Note that if the unicorn process terminates before you are able to run these
commands, gdb will report an error. To buy more time, you can always raise the
@@ -162,21 +162,21 @@ separate Rails process to debug the issue:
1. Create a Personal Access Token for your user (Profile Settings -> Access Tokens).
1. Bring up the GitLab Rails console. For omnibus users, run:
- ```
- sudo gitlab-rails console
- ```
+ ```
+ sudo gitlab-rails console
+ ```
1. At the Rails console, run:
- ```ruby
- [1] pry(main)> app.get '<URL FROM STEP 2>/?private_token=<TOKEN FROM STEP 3>'
- ```
+ ```ruby
+ [1] pry(main)> app.get '<URL FROM STEP 2>/?private_token=<TOKEN FROM STEP 3>'
+ ```
- For example:
+ For example:
- ```ruby
- [1] pry(main)> app.get 'https://gitlab.com/gitlab-org/gitlab-ce/issues/1?private_token=123456'
- ```
+ ```ruby
+ [1] pry(main)> app.get 'https://gitlab.com/gitlab-org/gitlab-ce/issues/1?private_token=123456'
+ ```
1. In a new window, run `top`. It should show this ruby process using 100% CPU. Write down the PID.
1. Follow step 2 from the previous section on using gdb.
@@ -209,7 +209,7 @@ ps auwx | grep unicorn | awk '{ print " -p " $2}' | xargs strace -tt -T -f -s 10
The output in `/tmp/unicorn.txt` may help diagnose the root cause.
-# More information
+## More information
- [Debugging Stuck Ruby Processes](https://blog.newrelic.com/2013/04/29/debugging-stuck-ruby-processes-what-to-do-before-you-kill-9/)
- [Cheatsheet of using gdb and ruby processes](gdb-stuck-ruby.txt)
diff --git a/doc/administration/troubleshooting/diagnostics_tools.md b/doc/administration/troubleshooting/diagnostics_tools.md
new file mode 100644
index 00000000000..ab3b25f0e97
--- /dev/null
+++ b/doc/administration/troubleshooting/diagnostics_tools.md
@@ -0,0 +1,27 @@
+---
+type: reference
+---
+
+# Diagnostics tools
+
+These are some of the diagnostics tools the GitLab Support team uses during troubleshooting.
+They are listed here for transparency, and they may be useful for users with
+experience troubleshooting GitLab. If you are currently having an issue with GitLab,
+you may want to check your [support options](https://about.gitlab.com/support/)
+first before attempting to use these tools.
+
+## gitlabsos
+
+The [gitlabsos](https://gitlab.com/gitlab-com/support/toolbox/gitlabsos/) utility
+provides a unified method of gathering info and logs from GitLab and the system it's
+running on.
+
+## strace-parser
+
+[strace-parser](https://gitlab.com/wchandler/strace-parser) is a small tool to analyze
+and summarize raw strace data.
+
+## Pritaly
+
+[Pritaly](https://gitlab.com/wchandler/pritaly) takes Gitaly logs and colorizes output
+or converts the logs to JSON.
diff --git a/doc/administration/troubleshooting/elasticsearch.md b/doc/administration/troubleshooting/elasticsearch.md
new file mode 100644
index 00000000000..c4a7ba01fae
--- /dev/null
+++ b/doc/administration/troubleshooting/elasticsearch.md
@@ -0,0 +1,345 @@
+# Troubleshooting ElasticSearch
+
+Troubleshooting ElasticSearch requires:
+
+- Knowledge of common terms.
+- Establishing within which category the problem fits.
+
+## Common terminology
+
+- **Lucene**: A full-text search library written in Java.
+- **Near Realtime (NRT)**: Refers to the slight latency from the time to index a
+ document to the time when it becomes searchable.
+- **Cluster**: A collection of one or more nodes that work together to hold all
+ the data, providing indexing and search capabilities.
+- **Node**: A single server that works as part of a cluster.
+- **Index**: A collection of documents that have somewhat similar characteristics.
+- **Document**: A basic unit of information that can be indexed.
+- **Shards**: Fully-functional and independent subdivisions of indices. Each shard is actually
+ a Lucene index.
+- **Replicas**: Failover mechanisms that duplicate indices.
+
+## Troubleshooting workflows
+
+The type of problem will determine what steps to take. The possible troubleshooting workflows are for:
+
+- Search results.
+- Indexing.
+- Integration.
+- Performance.
+
+### Search Results workflow
+
+The following workflow is for ElasticSearch search results issues:
+
+```mermaid
+graph TD;
+ B --> |No| B1
+ B --> |Yes| B4
+ B1 --> B2
+ B2 --> B3
+ B4 --> B5
+ B5 --> |Yes| B6
+ B5 --> |No| B7
+ B7 --> B8
+ B{Is GitLab using<br>ElasticSearch for<br>searching?}
+ B1[Check Admin Area > Integrations<br>to ensure the settings are correct]
+ B2[Perform a search via<br>the rails console]
+ B3[If all settings are correct<br>and it still doesn't show ElasticSearch<br>doing the searches, escalate<br>to GitLab support.]
+ B4[Perform<br>the same search via the<br>ElasticSearch API]
+ B5{Are the results<br>the same?}
+ B6[This means it is working as intended.<br>Speak with GitLab support<br>to confirm if the issue lies with<br>the filters.]
+ B7[Check the index status of the project<br>containing the missing search<br>results.]
+ B8(Indexing Troubleshooting)
+```
+
+### Indexing workflow
+
+The following workflow is for ElasticSearch indexing issues:
+
+```mermaid
+graph TD;
+ C --> |Yes| C1
+ C1 --> |Yes| C2
+ C1 --> |No| C3
+ C3 --> |Yes| C4
+ C3 --> |No| C5
+ C --> |No| C6
+ C6 --> |No| C10
+ C7 --> |GitLab| C8
+ C7 --> |ElasticSearch| C9
+ C6 --> |Yes| C7
+ C10 --> |No| C12
+ C10 --> |Yes| C11
+ C12 --> |Yes| C13
+ C12 --> |No| C14
+ C14 --> |Yes| C15
+ C14 --> |No| C16
+ C{Is the problem with<br>creating an empty<br>index?}
+ C1{Does the gitlab-production<br>index exist on the<br>ElasticSearch instance?}
+ C2(Try to manually<br>delete the index on the<br>ElasticSearch instance and<br>retry creating an empty index.)
+ C3{Can indices be made<br>manually on the ElasticSearch<br>instance?}
+ C4(Retry the creation of an empty index)
+ C5(It is best to speak with an<br>ElasticSearch admin concerning the<br>instance's inability to create indices.)
+ C6{Is the indexer presenting<br>errors during indexing?}
+ C7{Is the error a GitLab<br>error or an ElasticSearch<br>error?}
+ C8[Escalate to<br>GitLab support]
+ C9[You will want<br>to speak with an<br>ElasticSearch admin.]
+ C10{Does the index status<br>show 100%?}
+ C11[Escalate to<br>GitLab support]
+ C12{Does re-indexing the project<br> present any GitLab errors?}
+ C13[Rectify the GitLab errors and<br>restart troubleshooting, or<br>escalate to GitLab support.]
+ C14{Does re-indexing the project<br>present errors on the <br>ElasticSearch instance?}
+ C15[It would be best<br>to speak with an<br>ElasticSearch admin.]
+ C16[This is likely a bug/issue<br>in GitLab and will require<br>deeper investigation. Escalate<br>to GitLab support.]
+```
+
+### Integration workflow
+
+The following workflow is for ElasticSearch integration issues:
+
+```mermaid
+graph TD;
+ D --> |No| D1
+ D --> |Yes| D2
+ D2 --> |No| D3
+ D2 --> |Yes| D4
+ D4 --> |No| D5
+ D4 --> |Yes| D6
+ D{Is the error concerning<br>the beta indexer?}
+ D1[It would be best<br>to speak with an<br>ElasticSearch admin.]
+ D2{Is the ICU development<br>package installed?}
+ D3>This package is required.<br>Install the package<br>and retry.]
+ D4{Is the error stemming<br>from the indexer?}
+ D5[This would indicate an OS level<br> issue. It would be best to<br>contact your sysadmin.]
+ D6[This is likely a bug/issue<br>in GitLab and will require<br>deeper investigation. Escalate<br>to GitLab support.]
+```
+
+### Performance workflow
+
+The following workflow is for ElasticSearch performance issues:
+
+```mermaid
+graph TD;
+ F --> |Yes| F1
+ F --> |No| F2
+ F2 --> |No| F3
+ F2 --> |Yes| F4
+ F4 --> F5
+ F5 --> |No| F6
+ F5 --> |Yes| F7
+ F{Is the ElasticSearch instance<br>running on the same server<br>as the GitLab instance?}
+ F1(This is not advised and will cause issues.<br>We recommend moving the ElasticSearch<br>instance to a different server.)
+ F2{Does the ElasticSearch<br>server have at least 8<br>GB of RAM and 2 CPU<br>cores?}
+ F3(According to ElasticSearch, a non-prod<br>server needs these as a base requirement.<br>Production often requires more. We recommend<br>you increase the server specifications.)
+ F4(Obtain the <br>cluster health information)
+ F5(Does it show the<br>status as green?)
+ F6(We recommend you speak with<br>an ElasticSearch admin<br>about implementing sharding.)
+ F7(Escalate to<br>GitLab support.)
+```
+
+## Troubleshooting walkthrough
+
+Most ElasticSearch troubleshooting can be broken down into four categories:
+
+- [Troubleshooting search results](#troubleshooting-search-results)
+- [Troubleshooting indexing](#troubleshooting-indexing)
+- [Troubleshooting integration](#troubleshooting-integration)
+- [Troubleshooting performance](#troubleshooting-performance)
+
+Generally speaking, if it does not fall into those four categories, it is either:
+
+- Something GitLab support needs to look into.
+- Not a true ElasticSearch issue.
+
+Exercise caution. Issues that appear to be ElasticSearch problems can be OS-level issues.
+
+### Troubleshooting search results
+
+Troubleshooting search result issues is rather straightforward on ElasticSearch.
+
+The first step is to confirm GitLab is using ElasticSearch for the search function.
+To do this:
+
+1. Confirm the integration is enabled in **Admin Area > Settings > Integrations**.
+1. Confirm searches utilize ElasticSearch by accessing the rails console
+ (`sudo gitlab-rails console`) and running the following commands:
+
+   ```ruby
+ u = User.find_by_email('email_of_user_doing_search')
+ s = SearchService.new(u, {:search => 'search_term'})
+ pp s.search_objects.class.name
+ ```
+
+The output from the last command is the key here. If it shows:
+
+- `ActiveRecord::Relation`, **it is not** using ElasticSearch.
+- `Kaminari::PaginatableArray`, **it is** using ElasticSearch.
+
+If all the settings look correct and it is still not using ElasticSearch for the search function, it is best to escalate to GitLab support. This could be a bug/issue.
+
+Moving past that, it is best to attempt the same search using the [ElasticSearch Search API](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-search.html) and compare the results with what you see in GitLab.
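+
+For example, a hedged sketch of querying the API directly with `curl` (substitute
+your ElasticSearch host and search term; `gitlab-production` is the index name
+GitLab uses):
+
+```bash
+curl "http://<elasticsearch-host>:9200/gitlab-production/_search?q=<search_term>&pretty"
+```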
+
+If the results:
+
+- Sync up, then there is not a technical "issue" per se. Instead, it might be a problem
+  with the ElasticSearch filters we are using. This can be complicated, so it is best to
+  escalate to GitLab support to check these and to advise on whether a feature request
+  is needed.
+- Do not match up, this indicates a problem with the documents generated from the
+ project. It is best to re-index that project and proceed with
+ [Troubleshooting indexing](#troubleshooting-indexing).
+
+### Troubleshooting indexing
+
+Troubleshooting indexing issues can be tricky, and can quickly escalate to either
+GitLab support or your ElasticSearch admin.
+
+The best place to start is to determine if the issue is with creating an empty index.
+If it is, check on the ElasticSearch side to determine if the `gitlab-production` index
+(the name of the GitLab index) exists. If it exists, manually delete it on the ElasticSearch
+side and attempt to recreate it from the
+[`create_empty_index`](../../integration/elasticsearch.md#gitlab-elasticsearch-rake-tasks)
+rake task.
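+
+For example, on Omnibus installations:
+
+```bash
+# recreate the empty gitlab-production index
+sudo gitlab-rake gitlab:elastic:create_empty_index
+```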
+
+If you still encounter issues, try creating an index manually on the ElasticSearch
+instance (a sketch follows this list). The details of the index aren't important here,
+as we want to test if indices can be made. If the indices:
+
+- Cannot be made, speak with your ElasticSearch admin.
+- Can be made, escalate this to GitLab support.
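+
+A hedged sketch of such a manual test (any throwaway index name works; delete it
+again afterwards):
+
+```bash
+curl --request PUT "http://<elasticsearch-host>:9200/test-index"
+curl --request DELETE "http://<elasticsearch-host>:9200/test-index"
+```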
+
+If the issue is not with creating an empty index, the next step is to check for errors
+during the indexing of projects. If errors do occur, they will stem from the indexing either:
+
+- On the GitLab side. You need to rectify those. If they are not
+ something you are familiar with, contact GitLab support for guidance.
+- Within the ElasticSearch instance itself. See if the error is [documented and has a fix](../../integration/elasticsearch.md#troubleshooting). If not, speak with your ElasticSearch admin.
+
+If the indexing process does not present errors, you will want to check the status of the indexed projects. You can do this via the following rake tasks:
+
+- [`sudo gitlab-rake gitlab:elastic:index_projects_status`](../../integration/elasticsearch.md#gitlab-elasticsearch-rake-tasks) (shows the overall status)
+- [`sudo gitlab-rake gitlab:elastic:projects_not_indexed`](../../integration/elasticsearch.md#gitlab-elasticsearch-rake-tasks) (shows specific projects that are not indexed)
+
+If:
+
+- Everything is showing at 100%, escalate to GitLab support. This could be a potential
+ bug/issue.
+- Something is not at 100%, attempt to reindex that project. To do this,
+  run `sudo gitlab-rake gitlab:elastic:index_projects ID_FROM=<project ID> ID_TO=<project ID>`.
+
+If reindexing the project shows:
+
+- Errors on the GitLab side, escalate those to GitLab support.
+- ElasticSearch errors or doesn't present any errors at all, reach out to your
+ ElasticSearch admin to check the instance.
+
+### Troubleshooting integration
+
+Troubleshooting integration tends to be pretty straightforward, as there really isn't
+much to "integrate" here.
+
+If the issue is:
+
+- Not concerning the beta indexer, it is almost always an
+ ElasticSearch-side issue. This means you should reach out to your ElasticSearch admin
+ regarding the error(s) you are seeing. If you are unsure here, it never hurts to reach
+ out to GitLab support.
+- With the beta indexer, check if the ICU development package is installed.
+  This is a required package, so make sure you install it (an example follows below).
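+
+A hedged example for Debian or Ubuntu hosts (the package name is an assumption for
+those distributions; check your distribution's package index):
+
+```bash
+sudo apt-get install libicu-dev
+```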
+
+Beyond that, you will want to review the error. If it is:
+
+- Specifically from the indexer, this could be a bug/issue and should be escalated to
+ GitLab support.
+- An OS issue, you will want to reach out to your systems administrator.
+
+### Troubleshooting performance
+
+Troubleshooting performance can be difficult on ElasticSearch. There is a ton of tuning
+that *can* be done, but the majority of this falls on the shoulders of a skilled
+ElasticSearch administrator.
+
+Generally speaking, ensure:
+
+- The ElasticSearch server **is not** running on the same node as GitLab.
+- The ElasticSearch server has enough RAM and CPU cores.
+- That sharding **is** being used.
+
+Going into some more detail here, if ElasticSearch is running on the same server as GitLab, resource contention is **very** likely to occur. Ideally, ElasticSearch, which requires ample resources, should be running on its own server (perhaps coupled with Logstash and Kibana).
+
+When it comes to ElasticSearch, RAM is the key resource. ElasticSearch themselves recommend:
+
+- **At least** 8 GB of RAM for a non-production instance.
+- **At least** 16 GB of RAM for a production instance.
+- Ideally, 64 GB of RAM.
+
+For CPU, ElasticSearch recommends at least 2 CPU cores, but ElasticSearch states common
+setups use up to 8 cores. For more details on server specs, check out
+[ElasticSearch's hardware guide](https://www.elastic.co/guide/en/elasticsearch/guide/current/hardware.html).
+
+Beyond the obvious, sharding comes into play. Sharding is a core part of ElasticSearch.
+It allows for horizontal scaling of indices, which is helpful when you are dealing with
+a large amount of data.
+
+With the way GitLab does indexing, there is a **huge** amount of documents being
+indexed. By utilizing sharding, you can speed up ElasticSearch's ability to locate
+data, since each shard is a Lucene index.
+
+If you are not using sharding, you are likely to hit issues when you start using
+ElasticSearch in a production environment.
+
+Keep in mind that an index with only one shard has **no scale factor** and will
+likely encounter issues when called upon with some frequency.
+
+If you need to know how many shards to use, read
+[ElasticSearch's documentation on capacity planning](https://www.elastic.co/guide/en/elasticsearch/guide/2.x/capacity-planning.html),
+as the answer is not straightforward.
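+
+To see how many shards an index currently has, one option is the cat shards API
+(a sketch; substitute your host):
+
+```bash
+curl "http://<elasticsearch-host>:9200/_cat/shards/gitlab-production?v"
+```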
+
+The easiest way to determine if sharding is in use is to check the output of the
+[ElasticSearch Health API](https://www.elastic.co/guide/en/elasticsearch/reference/current/cluster-health.html):
+
+- Red means the cluster is down.
+- Yellow means it is up with no sharding/replication.
+- Green means it is healthy (up, sharding, replicating).
+
+For production use, it should always be green.
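+
+A hedged example of checking the cluster health from the command line:
+
+```bash
+curl "http://<elasticsearch-host>:9200/_cluster/health?pretty"
+```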
+
+Beyond these steps, you get into some of the more complicated things to check,
+such as merges and caching. These can get complicated and it takes some time to
+learn them, so it is best to escalate/pair with an ElasticSearch expert if you need to
+dig further into these.
+
+Feel free to reach out to GitLab support, but this is likely to be something a skilled
+ElasticSearch admin has more experience with.
+
+## Common issues
+
+All common issues [should be documented](../../integration/elasticsearch.md#troubleshooting). If not,
+feel free to update that page with issues you encounter and solutions.
+
+## Replication
+
+Setting up ElasticSearch isn't too bad, but it can be a bit finicky and time consuming.
+
+The easiest method is to spin up a Docker container with the required version and
+bind ports 9200/9300 so it can be used.
+
+The following is an example of running a Docker container of ElasticSearch v7.2.0:
+
+```bash
+docker pull docker.elastic.co/elasticsearch/elasticsearch:7.2.0
+docker run -p 9200:9200 -p 9300:9300 -e "discovery.type=single-node" docker.elastic.co/elasticsearch/elasticsearch:7.2.0
+```
+
+From here, you can:
+
+- Grab the IP of the Docker container (use `docker inspect <container_id>`).
+- Use `<IP.add.re.ss:9200>` to communicate with it, as in the sketch below.
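+
+For example (a sketch; the `--format` template extracts the container's IP address):
+
+```bash
+docker inspect --format '{{range .NetworkSettings.Networks}}{{.IPAddress}}{{end}}' <container_id>
+curl "http://<IP.add.re.ss>:9200/"
+```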
+
+This is a quick method to test out ElasticSearch, but by no means is this a
+production solution.
diff --git a/doc/administration/troubleshooting/kubernetes_cheat_sheet.md b/doc/administration/troubleshooting/kubernetes_cheat_sheet.md
new file mode 100644
index 00000000000..260af333e8e
--- /dev/null
+++ b/doc/administration/troubleshooting/kubernetes_cheat_sheet.md
@@ -0,0 +1,261 @@
+---
+type: reference
+---
+
+# Kubernetes, GitLab and You
+
+This is a list of useful information regarding Kubernetes that the GitLab Support
+Team sometimes uses while troubleshooting. GitLab is making this public, so that anyone
+can make use of the Support team's collected knowledge.
+
+CAUTION: **Caution:**
+These commands **can alter or break** your Kubernetes components so use these at your own risk.
+
+If you are on a [paid tier](https://about.gitlab.com/pricing/) and are not sure how
+to use these commands, it is best to [contact Support](https://about.gitlab.com/support/)
+and they will assist you with any issues you are having.
+
+## Generic Kubernetes commands
+
+- How to authorize access to your GCP project (can be especially useful if you have projects
+ under different GCP accounts):
+
+ ```bash
+ gcloud auth login
+ ```
+
+- How to access the Kubernetes dashboard:
+
+ ```bash
+ # for minikube:
+  minikube dashboard --url
+ # for non-local installations if access via Kubectl is configured:
+ kubectl proxy
+ ```
+
+- How to ssh to a Kubernetes node and enter the container as root
+ <https://github.com/kubernetes/kubernetes/issues/30656>:
+
+  - For GCP, you may find the node name and run `gcloud compute ssh node-name`.
+  - List containers using `docker ps`.
+  - Enter the container using `docker exec --user root -ti container-id bash`
+    (a combined sketch follows below).
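+
+  Putting those steps together for a GCP node (a hedged sketch; substitute names
+  from your own cluster):
+
+  ```bash
+  gcloud compute ssh <node-name>
+  docker ps
+  docker exec --user root -ti <container-id> bash
+  ```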
+
+- How to copy a file from local machine to a pod:
+
+ ```bash
+ kubectl cp file-name pod-name:./destination-path
+ ```
+
+- What to do with pods in `CrashLoopBackOff` status:
+
+ - Check logs via Kubernetes dashboard.
+ - Check logs via Kubectl:
+
+ ```bash
+ kubectl logs <unicorn pod> -c dependencies
+ ```
+
+- How to tail all Kubernetes cluster events in real time:
+
+ ```bash
+ kubectl get events -w --all-namespaces
+ ```
+
+- How to get logs of the previously terminated pod instance:
+
+ ```bash
+ kubectl logs <pod-name> --previous
+ ```
+
+ NOTE: **Note:**
+  No logs are kept in the containers/pods themselves. Everything is written to stdout.
+  This is a core principle of Kubernetes; read [Twelve-factor app](https://12factor.net/)
+  for details.
+
+## GitLab-specific Kubernetes information
+
+- Minimal config that can be used to test a Kubernetes helm chart can be found
+ [here](https://gitlab.com/charts/gitlab/issues/620).
+
+- Tailing the logs of a single pod. An example for a unicorn pod:
+
+ ```bash
+ kubectl logs gitlab-unicorn-7656fdd6bf-jqzfs -c unicorn
+ ```
+
+- Tail and follow all pods that share a label (in this case, `unicorn`):
+
+ ```bash
+ # all containers in the unicorn pods
+ kubectl logs -f -l app=unicorn --all-containers=true --max-log-requests=50
+
+ # only the unicorn containers in all unicorn pods
+ kubectl logs -f -l app=unicorn -c unicorn --max-log-requests=50
+ ```
+
+- One can stream logs from all containers at once, similar to the Omnibus
+ command `gitlab-ctl tail`:
+
+ ```bash
+ kubectl logs -f -l release=gitlab --all-containers=true --max-log-requests=100
+ ```
+
+- Check all events in the `gitlab` namespace (the namespace name can be different if you
+ specified a different one when deploying the helm chart):
+
+ ```bash
+ kubectl get events -w --namespace=gitlab
+ ```
+
+- Most of the useful GitLab tools (console, rake tasks, etc.) are found in the task-runner
+ pod. You may enter it and run commands inside or run them from the outside:
+
+ ```bash
+ # find the pod
+ kubectl get pods | grep task-runner
+
+ # enter it
+ kubectl exec -it <task-runner-pod-name> bash
+
+ # open rails console
+ # rails console can be also called from other GitLab pods
+ /srv/gitlab/bin/rails console
+
+ # source-style commands should also work
+  cd /srv/gitlab && bundle exec rake gitlab:check RAILS_ENV=production
+
+ # run GitLab check. Note that the output can be confusing and invalid because of the specific structure of GitLab installed via helm chart
+ /usr/local/bin/gitlab-rake gitlab:check
+
+ # open console without entering pod
+ kubectl exec -it <task-runner-pod-name> /srv/gitlab/bin/rails console
+
+ # check the status of DB migrations
+ kubectl exec -it <task-runner-pod-name> /usr/local/bin/gitlab-rake db:migrate:status
+ ```
+
+ You can also use `gitlab-rake`, instead of `/usr/local/bin/gitlab-rake`.
+
+- Troubleshooting **Operations > Kubernetes** integration:
+
+ - Check the output of `kubectl get events -w --all-namespaces`.
+ - Check the logs of pods within `gitlab-managed-apps` namespace.
+  - On the GitLab side, check the Sidekiq log and the Kubernetes log. When GitLab is installed
+    via the Helm Chart, `kubernetes.log` can be found inside the Sidekiq pod.
+
+- How to get your initial admin password <https://docs.gitlab.com/charts/installation/deployment.html#initial-login>:
+
+ ```bash
+ # find the name of the secret containing the password
+ kubectl get secrets | grep initial-root
+ # decode it
+ kubectl get secret <secret-name> -ojsonpath={.data.password} | base64 --decode ; echo
+ ```
+
+- How to connect to a GitLab PostgreSQL database:
+
+ ```bash
+ kubectl exec -it <task-runner-pod-name> -- /srv/gitlab/bin/rails dbconsole -p
+ ```
+
+- How to get info about Helm installation status:
+
+ ```bash
+ helm status name-of-installation
+ ```
+
+- How to update GitLab installed using the Helm Chart:
+
+ ```bash
+  helm repo update
+
+ # get current values and redirect them to yaml file (analogue of gitlab.rb values)
+ helm get values <release name> > gitlab.yaml
+
+ # run upgrade itself
+ helm upgrade <release name> <chart path> -f gitlab.yaml
+ ```
+
+ After <https://canary.gitlab.com/charts/gitlab/issues/780> is fixed, it should
+ be possible to use [Updating GitLab using the Helm Chart](https://docs.gitlab.com/ee/install/kubernetes/gitlab_chart.html#updating-gitlab-using-the-helm-chart)
+ for upgrades.
+
+- How to apply changes to GitLab config:
+
+ - Modify the `gitlab.yaml` file.
+ - Run the following command to apply changes:
+
+ ```bash
+ helm upgrade <release name> <chart path> -f gitlab.yaml
+ ```
+
+## Installation of minimal GitLab config via Minikube on macOS
+
+This section is based on [Developing for Kubernetes with Minikube](https://gitlab.com/charts/gitlab/blob/master/doc/minikube/index.md)
+and [Helm](https://gitlab.com/charts/gitlab/blob/master/doc/helm/index.md). Refer
+to those documents for details.
+
+- Install Kubectl via Homebrew:
+
+ ```bash
+ brew install kubernetes-cli
+ ```
+
+- Install Minikube via Homebrew:
+
+ ```bash
+ brew cask install minikube
+ ```
+
+- Start Minikube and configure it. If Minikube cannot start, try running `minikube delete && minikube start`
+ and repeat the steps:
+
+ ```bash
+ minikube start --cpus 3 --memory 8192 # minimum amount for GitLab to work
+ minikube addons enable ingress
+ minikube addons enable kube-dns
+ ```
+
+- Install Helm via Homebrew and initialize it:
+
+ ```bash
+ brew install kubernetes-helm
+ helm init --service-account tiller
+ ```
+
+- Copy the file <https://gitlab.com/charts/gitlab/raw/master/examples/values-minikube-minimum.yaml>
+ to your workstation.
+
+- Find the IP address in the output of `minikube ip` and update the yaml file with
+ this IP address.
+
+- Install the GitLab Helm Chart:
+
+ ```bash
+ helm repo add gitlab https://charts.gitlab.io
+ helm install --name gitlab -f <path-to-yaml-file> gitlab/gitlab
+ ```
+
+ If you want to modify some GitLab settings, you can use the above-mentioned config
+ as a base and create your own yaml file.
+
+- Monitor the installation progress via `helm status gitlab` and `minikube dashboard`.
+ The installation could take up to 20-30 minutes depending on the amount of resources
+ on your workstation.
+
+- When all the pods show either a `Running` or `Completed` status, get the GitLab password as
+ described in [Initial login](https://docs.gitlab.com/ee/install/kubernetes/gitlab_chart.html#initial-login),
+ and log in to GitLab via the UI. It will be accessible via `https://gitlab.domain`
+ where `domain` is the value provided in the yaml file.
+
+<!-- ## Troubleshooting
+
+Include any troubleshooting steps that you can foresee. If you know beforehand what issues
+one might have when setting this up, or when something is changed, or on upgrading, it's
+important to describe those, too. Think of things that may go wrong and include them here.
+This is important to minimize requests for support, and to avoid doc comments with
+questions that you know someone might ask.
+
+Each scenario can be a third-level heading, e.g. `### Getting error message X`.
+If you have none to add when creating a doc, leave this section in place
+but commented out to help encourage others to add to it in the future. -->
diff --git a/doc/administration/troubleshooting/sidekiq.md b/doc/administration/troubleshooting/sidekiq.md
index 7067958ecb4..9b016c64e29 100644
--- a/doc/administration/troubleshooting/sidekiq.md
+++ b/doc/administration/troubleshooting/sidekiq.md
@@ -169,3 +169,121 @@ The PostgreSQL wiki has details on the query you can run to see blocking
queries. The query is different based on PostgreSQL version. See
[Lock Monitoring](https://wiki.postgresql.org/wiki/Lock_Monitoring) for
the query details.
+
+## Managing Sidekiq queues
+
+It is possible to use the [Sidekiq API](https://github.com/mperham/sidekiq/wiki/API)
+to perform a number of troubleshooting steps on Sidekiq.
+
+These are administrative commands and they should only be used if the current
+admin interface is not suitable due to the scale of the installation.
+
+All these commands should be run using `gitlab-rails console`.
+
+### View the queue size
+
+```ruby
+Sidekiq::Queue.new("pipeline_processing:build_queue").size
+```
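+
+To list the sizes of all queues at once (same Sidekiq API):
+
+```ruby
+# print every queue name with its current size
+Sidekiq::Queue.all.each do |queue|
+  puts "#{queue.name}: #{queue.size}"
+end
+```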
+
+### Enumerate all enqueued jobs
+
+```ruby
+queue = Sidekiq::Queue.new("chaos:chaos_sleep")
+queue.each do |job|
+ # job.klass # => 'MyWorker'
+ # job.args # => [1, 2, 3]
+ # job.jid # => jid
+ # job.queue # => chaos:chaos_sleep
+ # job["retry"] # => 3
+ # job.item # => {
+ # "class"=>"Chaos::SleepWorker",
+ # "args"=>[1000],
+ # "retry"=>3,
+ # "queue"=>"chaos:chaos_sleep",
+ # "backtrace"=>true,
+ # "queue_namespace"=>"chaos",
+ # "jid"=>"39bc482b823cceaf07213523",
+ # "created_at"=>1566317076.266069,
+ # "correlation_id"=>"c323b832-a857-4858-b695-672de6f0e1af",
+ # "enqueued_at"=>1566317076.26761},
+ # }
+
+ # job.delete if job.jid == 'abcdef1234567890'
+end
+```
+
+### Enumerate currently running jobs
+
+```ruby
+workers = Sidekiq::Workers.new
+workers.each do |process_id, thread_id, work|
+ # process_id is a unique identifier per Sidekiq process
+ # thread_id is a unique identifier per thread
+ # work is a Hash which looks like:
+ # {"queue"=>"chaos:chaos_sleep",
+ # "payload"=>
+ # { "class"=>"Chaos::SleepWorker",
+ # "args"=>[1000],
+ # "retry"=>3,
+ # "queue"=>"chaos:chaos_sleep",
+ # "backtrace"=>true,
+ # "queue_namespace"=>"chaos",
+ # "jid"=>"b2a31e3eac7b1a99ff235869",
+ # "created_at"=>1566316974.9215662,
+ # "correlation_id"=>"e484fb26-7576-45f9-bf21-b99389e1c53c",
+ # "enqueued_at"=>1566316974.9229589},
+  #  "run_at"=>1566316974}
+end
+```
+
+### Remove Sidekiq jobs for given parameters (destructive)
+
+```ruby
+# for jobs like this:
+# RepositoryImportWorker.new.perform_async(100)
+id_list = [100]
+
+queue = Sidekiq::Queue.new('repository_import')
+queue.each do |job|
+ job.delete if id_list.include?(job.args[0])
+end
+```
+
+### Remove specific job ID (destructive)
+
+```ruby
+queue = Sidekiq::Queue.new('repository_import')
+queue.each do |job|
+ job.delete if job.jid == 'my-job-id'
+end
+```
+
+## Canceling running jobs (destructive)
+
+> Introduced in GitLab 12.3.
+
+This is a highly risky operation and should only be used as a last resort.
+Doing so might result in data corruption, as the job
+is interrupted mid-execution and it is not guaranteed
+that proper rollback of transactions is implemented.
+
+```ruby
+Gitlab::SidekiqMonitor.cancel_job('job-id')
+```
+
+> This requires Sidekiq to be run with the `SIDEKIQ_MONITOR_WORKER=1`
+> environment variable.
+
+To perform the interrupt, we use `Thread.raise`, which
+has a number of drawbacks, as mentioned in [Why Ruby’s Timeout is dangerous (and Thread.raise is terrifying)](https://jvns.ca/blog/2015/11/27/why-rubys-timeout-is-dangerous-and-thread-dot-raise-is-terrifying/):
+
+> This is where the implications get interesting, and terrifying. This means that an exception can get raised:
+>
+> * during a network request (ok, as long as the surrounding code is prepared to catch Timeout::Error)
+> * during the cleanup for the network request
+> * during a rescue block
+> * while creating an object to save to the database afterwards
+> * in any of your code, regardless of whether it could have possibly raised an exception before
+>
+> Nobody writes code to defend against an exception being raised on literally any line. That’s not even possible. So Thread.raise is basically like a sneak attack on your code that could result in almost anything. It would probably be okay if it were pure-functional code that did not modify any state. But this is Ruby, so that’s unlikely :)