diff options
Diffstat (limited to 'doc/development/elasticsearch.md')
-rw-r--r-- | doc/development/elasticsearch.md | 31 |
1 files changed, 13 insertions, 18 deletions
diff --git a/doc/development/elasticsearch.md b/doc/development/elasticsearch.md index 6b829faf74d..d5f6e95033f 100644 --- a/doc/development/elasticsearch.md +++ b/doc/development/elasticsearch.md @@ -36,15 +36,15 @@ Additionally, if you need large repositories or multiple forks for testing, plea ## How does it work? -The Elasticsearch integration depends on an external indexer. We ship an [indexer written in Go](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer). The user must trigger the initial indexing via a Rake task but, after this is done, GitLab itself will trigger reindexing when required via `after_` callbacks on create, update, and destroy that are inherited from [`/ee/app/models/concerns/elastic/application_versioned_search.rb`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/models/concerns/elastic/application_versioned_search.rb). +The Elasticsearch integration depends on an external indexer. We ship an [indexer written in Go](https://gitlab.com/gitlab-org/gitlab-elasticsearch-indexer). The user must trigger the initial indexing via a Rake task but, after this is done, GitLab itself will trigger reindexing when required via `after_` callbacks on create, update, and destroy that are inherited from [`/ee/app/models/concerns/elastic/application_versioned_search.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/models/concerns/elastic/application_versioned_search.rb). After initial indexing is complete, create, update, and delete operations for all models except projects (see [#207494](https://gitlab.com/gitlab-org/gitlab/-/issues/207494)) are tracked in a Redis [`ZSET`](https://redis.io/topics/data-types#sorted-sets). A regular `sidekiq-cron` `ElasticIndexBulkCronWorker` processes this queue, updating many Elasticsearch documents at a time with the [Bulk Request API](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html). -Search queries are generated by the concerns found in [`ee/app/models/concerns/elastic`](https://gitlab.com/gitlab-org/gitlab/tree/master/ee/app/models/concerns/elastic). These concerns are also in charge of access control, and have been a historic source of security bugs so please pay close attention to them! +Search queries are generated by the concerns found in [`ee/app/models/concerns/elastic`](https://gitlab.com/gitlab-org/gitlab/-/tree/master/ee/app/models/concerns/elastic). These concerns are also in charge of access control, and have been a historic source of security bugs so please pay close attention to them! ## Existing Analyzers/Tokenizers/Filters -These are all defined in [`ee/lib/elastic/latest/config.rb`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/lib/elastic/latest/config.rb) +These are all defined in [`ee/lib/elastic/latest/config.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/lib/elastic/latest/config.rb) ### Analyzers @@ -214,7 +214,7 @@ end ``` Applied migrations are stored in `gitlab-#{RAILS_ENV}-migrations` index. All migrations not executed -are applied by the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb) +are applied by the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic/migration_worker.rb) cron worker sequentially. To update Elastic index mappings, apply the configuration to the respective files: @@ -227,15 +227,15 @@ Any data or index cleanup needed to support migration retries should be handled ### Migration options supported by the `Elastic::MigrationWorker` -[`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb) supports the following migration options: +[`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic/migration_worker.rb) supports the following migration options: -- `batched!` - Allow the migration to run in batches. If set, the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb) +- `batched!` - Allow the migration to run in batches. If set, the [`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic/migration_worker.rb) will re-enqueue itself with a delay which is set using the `throttle_delay` option described below. The batching must be handled within the `migrate` method, this setting controls the re-enqueuing only. - `throttle_delay` - Sets the wait time in between batch runs. This time should be set high enough to allow each migration batch enough time to finish. Additionally, the time should be less than 30 minutes since that is how often the -[`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/blob/master/ee/app/workers/elastic/migration_worker.rb) +[`Elastic::MigrationWorker`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/ee/app/workers/elastic/migration_worker.rb) cron worker runs. Default value is 5 minutes. - `pause_indexing!` - Pause indexing while the migration runs. This setting will record the indexing setting before @@ -308,12 +308,15 @@ We choose to use GitLab major version upgrades as a safe time to remove backwards compatibility for indices that have not been fully migrated. We [document this in our upgrade documentation](../update/index.md#upgrading-to-a-new-major-version). We also -choose to remove the migration code and tests so that: +choose to replace the migration code with the halted migration +and remove tests so that: - We don't need to maintain any code that is called from our Advanced Search migrations. - We don't waste CI time running tests for migrations that we don't support anymore. +- Operators who have not run this migration and who upgrade directly to the + target version will see a message prompting them to reindex from scratch. To be extra safe, we will not delete migrations that were created in the last minor version before the major upgrade. So, if we are upgrading to `%14.0`, @@ -334,18 +337,10 @@ For every migration that was created 2 minor versions before the major version being upgraded to, we do the following: 1. Confirm the migration has actually completed successfully for GitLab.com. -1. Replace the content of `migrate` and `completed?` methods as follows: +1. Replace the content of the migration with: ```ruby - def migrate - log_raise "Migration has been deleted in the last major version upgrade." \ - "Migrations are supposed to be finished before upgrading major version https://docs.gitlab.com/ee/update/#upgrading-to-a-new-major-version ." \ - "To correct this issue, recreate your index from scratch: https://docs.gitlab.com/ee/integration/elasticsearch.html#last-resort-to-recreate-an-index." - end - - def completed? - false - end + include Elastic::MigrationObsolete ``` 1. Delete any spec files to support this migration. |