Merge branch 'master' into update-todo-in-ui

* master: (435 commits)
  Change occurrence of Sidekiq::Testing.inline!
  Fix order-dependent spec failure in appearance_spec.rb
  Put a failed example from appearance_spec in quarantine
  Cache PerformanceBar.allowed_user_ids list locally and in Redis
  Add Grafana to Admin > Monitoring menu when enabled
  Add changelog entry
  Add salesforce logo
  Move error_tracking_frontend specs to Jest
  Only save Peek session in Redis when Peek is enabled
  Migrate markdown header_spec.js to Jest
  Fix golint command in Go guide doc to be recursive
  Move images to their own dirs
  Gitlab -> GitLab
  Re-align CE and EE API docs
  Rename Release groups in issue_workflow.md
  Update api docs to finish aligning EE and CE docs
  Update locale.pot
  Update TODO: allow_collaboration column renaming
  Show upcoming status for releases
  Rebased and squashed commits
  ...
author: Filipa Lacerda <filipa@gitlab.com> 2019-07-03 22:39:10 +0100
committer: Filipa Lacerda <filipa@gitlab.com> 2019-07-03 22:39:10 +0100
commit: 50be7237f41b0ac44b9aaf8b73c57993548d4c35 (patch)
tree: ecfeeae58829dadbd90de4f834c730d1d8c55e74 /doc/development/elasticsearch.md
parent: 35331c435196ea1155eb15161f3f9a481a01501d (diff)
parent: 2ad75a4f96c4d377e18788966e7eefee4d78b6d2 (diff)
download: gitlab-ce-update-todo-in-ui.tar.gz
1 files changed, 31 insertions, 16 deletions
diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md
index c8c70fa7216..603a756ff56 100644
--- a/doc/development/elasticsearch.md
+++ b/doc/development/elasticsearch.md
@@ -2,14 +2,14 @@
 
 This area is to maintain a compendium of useful information when working with elasticsearch.
 
-Information on how to enable ElasticSearch and perform the initial indexing is kept in ../integration/elasticsearch.md#enabling-elasticsearch
+Information on how to enable Elasticsearch and perform the initial indexing is kept in ../integration/elasticsearch.md#enabling-elasticsearch
 
 ## Deep Dive
 
-In June 2019, Mario de la Ossa hosted a [Deep Dive] on GitLab's [ElasticSearch integration] to share his domain specific knowledge with anyone who may work in this part of the code base in the future. You can find the [recording on YouTube], and the slides on [Google Slides] and in [PDF]. Everything covered in this deep dive was accurate as of GitLab 12.0, and while specific details may have changed since then, it should still serve as a good introduction.
+In June 2019, Mario de la Ossa hosted a [Deep Dive] on GitLab's [Elasticsearch integration] to share his domain specific knowledge with anyone who may work in this part of the code base in the future. You can find the [recording on YouTube], and the slides on [Google Slides] and in [PDF]. Everything covered in this deep dive was accurate as of GitLab 12.0, and while specific details may have changed since then, it should still serve as a good introduction.
 
 [Deep Dive]: https://gitlab.com/gitlab-org/create-stage/issues/1
-[ElasticSearch integration]: ../integration/elasticsearch.md
+[Elasticsearch integration]: ../integration/elasticsearch.md
 [recording on YouTube]: https://www.youtube.com/watch?v=vrvl-tN2EaA
 [Google Slides]: https://docs.google.com/presentation/d/1H-pCzI_LNrgrL5pJAIQgvLX8Ji0-jIKOg1QeJQzChug/edit
 [PDF]: https://gitlab.com/gitlab-org/create-stage/uploads/c5aa32b6b07476fa8b597004899ec538/Elasticsearch_Deep_Dive.pdf
@@ -57,27 +57,32 @@ Additionally, if you need large repos or multiple forks for testing, please cons
 
 ## How does it work?
 
-The ElasticSearch integration depends on an external indexer. We ship a [ruby indexer](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/bin/elastic_repo_indexer) by default but are also working on an [indexer written in Go](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer). The user must trigger the initial indexing via a rake task, but after this is done GitLab itself will trigger reindexing when required via `after_` callbacks on create, update, and destroy that are inherited from [/ee/app/models/concerns/elastic/application_search.rb](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/app/models/concerns/elastic/application_search.rb).
+The Elasticsearch integration depends on an external indexer. We ship a [ruby indexer](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/bin/elastic_repo_indexer) by default but are also working on an [indexer written in Go](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer). The user must trigger the initial indexing via a rake task, but after this is done GitLab itself will trigger reindexing when required via `after_` callbacks on create, update, and destroy that are inherited from [/ee/app/models/concerns/elastic/application_search.rb](https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/app/models/concerns/elastic/application_search.rb).
 
 All indexing after the initial one is done via `ElasticIndexerWorker` (sidekiq jobs).
 
 Search queries are generated by the concerns found in [ee/app/models/concerns/elastic](https://gitlab.com/gitlab-org/gitlab-ee/tree/master/ee/app/models/concerns/elastic). These concerns are also in charge of access control, and have been a historic source of security bugs so please pay close attention to them!
 
 ## Existing Analyzers/Tokenizers/Filters
-These are all defined in https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/lib/elasticsearch/git/model.rb
+
+These are all defined in <https://gitlab.com/gitlab-org/gitlab-ee/blob/master/ee/lib/elasticsearch/git/model.rb>
 
 ### Analyzers
+
 #### `path_analyzer`
+
 Used when indexing blobs' paths. Uses the `path_tokenizer` and the `lowercase` and `asciifolding` filters.
 
 Please see the `path_tokenizer` explanation below for an example.
 
 #### `sha_analyzer`
+
 Used in blobs and commits. Uses the `sha_tokenizer` and the `lowercase` and `asciifolding` filters.
 
 Please see the `sha_tokenizer` explanation later below for an example.
 
 #### `code_analyzer`
+
 Used when indexing a blob's filename and content. Uses the `whitespace` tokenizer and the filters: `code`, `edgeNGram_filter`, `lowercase`, and `asciifolding`
 
 The `whitespace` tokenizer was selected in order to have more control over how tokens are split. For example the string `Foo::bar(4)` needs to generate tokens like `Foo` and `bar(4)` in order to be properly searched.
@@ -85,15 +90,19 @@ The `whitespace` tokenizer was selected in order to have more control over how t
 Please see the `code` filter for an explanation on how tokens are split.
 
 #### `code_search_analyzer`
+
 Not directly used for indexing, but rather used to transform a search input. Uses the `whitespace` tokenizer and the `lowercase` and `asciifolding` filters.
 
 ### Tokenizers
+
 #### `sha_tokenizer`
+
 This is a custom tokenizer that uses the [`edgeNGram` tokenizer](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-edgengram-tokenizer.html) to allow SHAs to be searchable by any subset of the SHA (minimum of 5 chars).
 
-example:
+Example:
 
 `240c29dc7e` becomes:
+
 - `240c2`
 - `240c29`
 - `240c29d`
@@ -102,21 +111,26 @@ example:
 - `240c29dc7e`
 
 #### `path_tokenizer`
+
 This is a custom tokenizer that uses the [`path_hierarchy` tokenizer](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-pathhierarchy-tokenizer.html) with `reverse: true` in order to allow searches to find paths no matter how much or how little of the path is given as input.
 
-example:
+Example:
 
 `'/some/path/application.js'` becomes:
+
 - `'/some/path/application.js'`
 - `'some/path/application.js'`
 - `'path/application.js'`
 - `'application.js'`
 
 ### Filters
+
 #### `code`
-Uses a [Pattern Capture token filter](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-pattern-capture-tokenfilter.html) to split tokens into more easily searched versions of themselves. 
+
+Uses a [Pattern Capture token filter](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-pattern-capture-tokenfilter.html) to split tokens into more easily searched versions of themselves.
 
 Patterns:
+
 - `"(\\p{Ll}+|\\p{Lu}\\p{Ll}+|\\p{Lu}+)"`: captures CamelCased and lowerCamelCased strings as separate tokens
 - `"(\\d+)"`: extracts digits
 - `"(?=([\\p{Lu}]+[\\p{L}]+))"`: captures CamelCased strings recursively. Ex: `ThisIsATest` => `[ThisIsATest, IsATest, ATest, Test]`
@@ -126,6 +140,7 @@ Patterns:
 - `'\/?([^\/]+)(?=\/|\b)'`: separate path terms `like/this/one`
 
 #### `edgeNGram_filter`
+
 Uses an [Edge NGram token filter](https://www.elastic.co/guide/en/elasticsearch/reference/5.5/analysis-edgengram-tokenfilter.html) to allow inputs with only parts of a token to find the token. For example it would turn `glasses` into permutations starting with `gl` and ending with `glasses`, which would allow a search for "`glass`" to find the original token `glasses`
 
 ## Gotchas
@@ -140,13 +155,13 @@ Uses an [Edge NGram token filter](https://www.elastic.co/guide/en/elasticsearch/
 You might get an error such as
 
 ```
-[2018-10-31T15:54:19,762][WARN ][o.e.c.r.a.DiskThresholdMonitor] [pval5Ct] 
-   flood stage disk watermark [95%] exceeded on 
-   [pval5Ct7SieH90t5MykM5w][pval5Ct][/usr/local/var/lib/elasticsearch/nodes/0] free: 56.2gb[3%], 
+[2018-10-31T15:54:19,762][WARN ][o.e.c.r.a.DiskThresholdMonitor] [pval5Ct]
+   flood stage disk watermark [95%] exceeded on
+   [pval5Ct7SieH90t5MykM5w][pval5Ct][/usr/local/var/lib/elasticsearch/nodes/0] free: 56.2gb[3%],
    all indices on this node will be marked read-only
 ```
 
-This is because you've exceeded the disk space threshold - it thinks you don't have enough disk space left, based on the default 95% threshold.  
+This is because you've exceeded the disk space threshold - it thinks you don't have enough disk space left, based on the default 95% threshold.
 
 In addition, the `read_only_allow_delete` setting will be set to `true`.  It will block indexing, `forcemerge`, etc
 
@@ -158,19 +173,19 @@ Add this to your `elasticsearch.yml` file:
 
 ```
 # turn off the disk allocator
-cluster.routing.allocation.disk.threshold_enabled: false 
+cluster.routing.allocation.disk.threshold_enabled: false
 ```
 
 _or_
 
 ```
 # set your own limits
-cluster.routing.allocation.disk.threshold_enabled: true 
+cluster.routing.allocation.disk.threshold_enabled: true
 cluster.routing.allocation.disk.watermark.flood_stage: 5gb   # ES 6.x only
-cluster.routing.allocation.disk.watermark.low: 15gb 
+cluster.routing.allocation.disk.watermark.low: 15gb
 cluster.routing.allocation.disk.watermark.high: 10gb
 ```
 
-Restart ElasticSearch, and the `read_only_allow_delete` will clear on it's own.
+Restart Elasticsearch, and the `read_only_allow_delete` will clear on it's own.
 
 _from "Disk-based Shard Allocation | Elasticsearch Reference" [5.6](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/disk-allocator.html#disk-allocator) and [6.x](https://www.elastic.co/guide/en/elasticsearch/reference/6.x/disk-allocator.html)_
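
The tokenizer behaviour described in the diff above can be illustrated with a minimal Python sketch. This is not GitLab's or Elasticsearch's implementation; it only reproduces the expansions shown in the documented examples. The function names `edge_ngrams` and `reverse_path_hierarchy` and the `min_len` parameter are hypothetical. The 5-character minimum for SHAs and the sample inputs come from the documentation itself.

```python
def edge_ngrams(token, min_len=5):
    """Return every prefix of `token` that is at least `min_len` characters long.

    Mimics the documented effect of an edge n-gram expansion: a search for
    any sufficiently long prefix can match the original token.
    """
    return [token[:n] for n in range(min_len, len(token) + 1)]


def reverse_path_hierarchy(path, delimiter="/"):
    """Return progressively shorter suffixes of `path`, split on `delimiter`.

    Mimics the documented effect of a path hierarchy expansion with
    `reverse: true`: any trailing portion of the path can match.
    """
    parts = path.split(delimiter)
    suffixes = []
    for i in range(len(parts)):
        suffix = delimiter.join(parts[i:])
        if suffix:  # skip an empty suffix produced by a trailing delimiter
            suffixes.append(suffix)
    return suffixes


if __name__ == "__main__":
    print(edge_ngrams("240c29dc7e"))
    print(reverse_path_hierarchy("/some/path/application.js"))
    # The `glasses` example starts at `gl`, so a minimum length of 2 is assumed.
    print(edge_ngrams("glasses", min_len=2))
```

With a minimum length of 2, the expansion of `glasses` includes `glass`, which is why a search for `glass` can find the original token.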