Diffstat (limited to 'doc/development/geo.md')
-rw-r--r-- | doc/development/geo.md | 29 |
1 files changed, 24 insertions, 5 deletions
diff --git a/doc/development/geo.md b/doc/development/geo.md
index 446d85fceed..5010e44e826 100644
--- a/doc/development/geo.md
+++ b/doc/development/geo.md
@@ -101,15 +101,16 @@ it's successful, we replace the main repo with the newly cloned one.
 ### Uploads replication
 
 File uploads are also being replicated to the **secondary** node. To
-track the state of syncing, the `Geo::FileRegistry` model is used.
+track the state of syncing, the `Geo::UploadRegistry` model is used.
 
-#### File Registry
+#### Upload Registry
 
 Similar to the [Project Registry](#project-registry), there is a
-`Geo::FileRegistry` model that tracks the synced uploads.
+`Geo::UploadRegistry` model that tracks the synced uploads.
 
-CI Job Artifacts are synced in a similar way as uploads or LFS
-objects, but they are tracked by `Geo::JobArtifactRegistry` model.
+CI Job Artifacts and LFS objects are synced in a similar way as uploads,
+but they are tracked by `Geo::JobArtifactRegistry`, and `Geo::LfsObjectRegistry`
+models respectively.
 
 #### File Download Dispatch worker
 
@@ -490,6 +491,24 @@ When some write actions are not allowed because the node is a
 The database itself will already be read-only in a replicated setup,
 so we don't need to take any extra step for that.
 
+## Steps needed to replicate a new data type
+
+As GitLab evolves, we constantly need to add new resources to the Geo replication system.
+The implementation depends on resource specifics, but there are several things
+that need to be taken care of:
+
+- Event generation on the primary site. Whenever a new resource is changed/updated, we need to
+  create a task for the Log Cursor.
+- Event handling. The Log Cursor needs to have a handler for every event type generated by the primary site.
+- Dispatch worker (cron job). Make sure the backfill condition works well.
+- Sync worker.
+- Registry with all possible states.
+- Verification.
+- Cleaner. When sync settings are changed for the secondary site, some resources need to be cleaned up.
+- Geo Node Status. We need to provide API endpoints as well as some presentation in the GitLab Admin Area.
+- Health Check. If we can perform some pre-checks and make node unhealthy if something is wrong, we should do that.
+  The `rake gitlab:geo:check` command has to be updated too.
+
 ## History of communication channel
 
 The communication channel has changed since first iteration, you can