author     Robert Speicher <rspeicher@gmail.com>  2021-01-20 13:34:23 -0600
committer  Robert Speicher <rspeicher@gmail.com>  2021-01-20 13:34:23 -0600
commit     6438df3a1e0fb944485cebf07976160184697d72 (patch)
tree       00b09bfd170e77ae9391b1a2f5a93ef6839f2597 /doc/development/elasticsearch.md
parent     42bcd54d971da7ef2854b896a7b34f4ef8601067 (diff)
Add latest changes from gitlab-org/gitlab@13-8-stable-ee (v13.8.0-rc42)
Diffstat (limited to 'doc/development/elasticsearch.md')
-rw-r--r--  doc/development/elasticsearch.md | 48
1 file changed, 48 insertions(+), 0 deletions(-)
diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md
index 1c92601dde9..8bf8a5fccb8 100644
--- a/doc/development/elasticsearch.md
+++ b/doc/development/elasticsearch.md
@@ -216,6 +216,9 @@ cron worker sequentially.
 Any update to the Elastic index mappings should be replicated in
 [`Elastic::Latest::Config`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/elastic/latest/config.rb).
 
+Migrations can be built with a retry limit and can be [failed and marked as halted](https://gitlab.com/gitlab-org/gitlab/-/blob/66e899b6637372a4faf61cfd2f254cbdd2fb9f6d/ee/lib/elastic/migration.rb#L40).
+Any data or index cleanup needed to support migration retries should be handled within the migration.
+
 ### Migration options supported by the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb)
 
 - `batched!` - Allow the migration to run in batches. If set, the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb)
@@ -337,3 +340,48 @@ cluster.routing.allocation.disk.watermark.high: 10gb
 
 Restart Elasticsearch, and the `read_only_allow_delete` setting will clear on its own.
 
 _from "Disk-based Shard Allocation | Elasticsearch Reference" [5.6](https://www.elastic.co/guide/en/elasticsearch/reference/5.6/disk-allocator.html#disk-allocator) and [6.x](https://www.elastic.co/guide/en/elasticsearch/reference/6.7/disk-allocator.html)_
+
+### Disaster recovery/data loss/backups
+
+The use of Elasticsearch in GitLab is only ever as a secondary data store.
+This means that all of the data stored in Elasticsearch can always be derived
+again from other data sources, specifically PostgreSQL and Gitaly. Therefore,
+if the Elasticsearch data store is ever corrupted for whatever reason, you can
+reindex everything from scratch.
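The retry-and-halt behaviour mentioned in the migration guidance above can be illustrated in isolation. This is an editor's sketch only, not GitLab's actual `Elastic::Migration` API: the `HaltingMigrationRunner` class and its methods are hypothetical names chosen for the example.

```ruby
# Hypothetical sketch of retry-then-halt semantics for an index migration.
# Not GitLab's Elastic::Migration API; all names here are invented.
class HaltingMigrationRunner
  attr_reader :attempts

  def initialize(max_attempts:)
    @max_attempts = max_attempts
    @attempts = 0
    @halted = false
  end

  def halted?
    @halted
  end

  # Run one attempt of the migration body. A failing attempt returns :failed
  # until the retry limit is exhausted, after which the migration is halted
  # and no further attempts are made.
  def run
    return :halted if @halted

    @attempts += 1
    yield
    :completed
  rescue StandardError
    @halted = true if @attempts >= @max_attempts
    halted? ? :halted : :failed
  end
end
```

Per the guidance above, any cleanup needed to make a retried attempt safe (for example, deleting partially written documents) would belong inside the migration body itself.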
+
+If your Elasticsearch index is extremely large, it may be too time-consuming,
+or cause too much downtime, to reindex from scratch. There aren't any built-in
+mechanisms for automatically finding discrepancies and resyncing an
+Elasticsearch index if it gets out of sync, but one useful technique is
+looking at the logs for all the updates that occurred in a time range you
+believe may have been missed. This information is very low level and only
+useful for operators that are familiar with the GitLab codebase. It is
+documented here in case it is useful for others. The relevant logs that could
+theoretically be used to figure out what needs to be replayed are:
+
+1. All non-repository updates that were synced can be found in
+   [`elasticsearch.log`](../administration/logs.md#elasticsearchlog) by
+   searching for
+   [`track_items`](https://gitlab.com/gitlab-org/gitlab/-/blob/1e60ea99bd8110a97d8fc481e2f41cab14e63d31/ee/app/services/elastic/process_bookkeeping_service.rb#L25).
+   These can be replayed by sending the items through
+   `::Elastic::ProcessBookkeepingService.track!` again.
+1. All repository updates that occurred can be found in
+   [`elasticsearch.log`](../administration/logs.md#elasticsearchlog) by
+   searching for
+   [`indexing_commit_range`](https://gitlab.com/gitlab-org/gitlab/-/blob/6f9d75dd3898536b9ec2fb206e0bd677ab59bd6d/ee/lib/gitlab/elastic/indexer.rb#L41).
+   Replaying these requires resetting
+   [`IndexStatus#last_commit/last_wiki_commit`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/models/index_status.rb)
+   to the oldest `from_sha` in the logs and then triggering another index of
+   the project using
+   [`ElasticCommitIndexerWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic_commit_indexer_worker.rb).
+1. All project deletes that occurred can be found in
+   [`sidekiq.log`](../administration/logs.md#sidekiqlog) by searching for
+   [`ElasticDeleteProjectWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic_delete_project_worker.rb).
+   These deletes can be replayed by triggering another
+   `ElasticDeleteProjectWorker`.
+
+With the methods above, plus regular
+[Elasticsearch snapshots](https://www.elastic.co/guide/en/elasticsearch/reference/current/snapshot-restore.html),
+you should be able to recover from different kinds of data loss in a
+relatively short time compared to indexing everything from scratch.
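The first replay step above (finding `track_items` entries in a suspect time window) can be sketched as a small log scan. This is an editor's illustration under assumptions: it treats `elasticsearch.log` as JSON lines and assumes `time`, `message`, and `items` field names, which are not a documented GitLab contract; the `missed_track_items` helper is invented for the example. In a real recovery, the collected items would then be fed back through `::Elastic::ProcessBookkeepingService.track!`.

```ruby
require 'json'
require 'time'

# Sketch: collect the items recorded by `track_items` log entries inside a
# suspect time window, so they can be replayed. The field names used here
# ("time", "message", "items") are assumptions for illustration.
def missed_track_items(log_path, from:, to:)
  File.foreach(log_path).filter_map do |line|
    entry = JSON.parse(line) rescue next  # skip non-JSON lines
    next unless entry['message'] == 'track_items'

    at = Time.parse(entry['time'])
    entry['items'] if at >= from && at <= to
  end.flatten
end
```

Because the scan is append-only and idempotent on the Elasticsearch side (re-tracking an item just re-indexes it), replaying a slightly wider window than strictly necessary is safe.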