diff options
Diffstat (limited to 'doc/development/geo.md')
-rw-r--r-- | doc/development/geo.md | 115 |
1 files changed, 0 insertions, 115 deletions
diff --git a/doc/development/geo.md b/doc/development/geo.md index 5c781a60bac..57959b07e49 100644 --- a/doc/development/geo.md +++ b/doc/development/geo.md @@ -209,121 +209,12 @@ To migrate the tracking database, run: bundle exec rake geo:db:migrate ``` -### Foreign Data Wrapper - -> Introduced in GitLab 10.1. - -Foreign Data Wrapper ([FDW](#fdw)) is used by the [Geo Log Cursor](#geo-log-cursor) and improves -the performance of many synchronization operations. - -FDW is a PostgreSQL extension ([`postgres_fdw`](https://www.postgresql.org/docs/11/postgres-fdw.html)) that is enabled within -the Geo Tracking Database (on a **secondary** node), which allows it -to connect to the read-only database replica and perform queries and filter -data from both instances. - -This persistent connection is configured as an FDW server -named `gitlab_secondary`. This configuration exists within the database's user -context only. To access the `gitlab_secondary`, GitLab needs to use the -same database user that had previously been configured. - -The Geo Tracking Database accesses the read-only database replica via FDW as a regular user, -limited by its own restrictions. The credentials are configured as a -`USER MAPPING` associated with the `SERVER` mapped previously -(`gitlab_secondary`). - -FDW configuration and credentials definition are managed automatically by the -Omnibus GitLab `gitlab-ctl reconfigure` command. - -#### Refreshing the Foreign Tables - -Whenever a new Geo node is configured or the database schema changes on the -**primary** node, you must refresh the foreign tables on the **secondary** node -by running the following: - -```shell -bundle exec rake geo:db:refresh_foreign_tables -``` - -Failure to do this will prevent the **secondary** node from -functioning properly. The **secondary** node will generate error -messages, as the following PostgreSQL error: - -```sql -ERROR: relation "gitlab_secondary.ci_job_artifacts" does not exist at character 323 -STATEMENT: SELECT a.attname, format_type(a.atttypid, a.atttypmod), - pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod - FROM pg_attribute a LEFT JOIN pg_attrdef d - ON a.attrelid = d.adrelid AND a.attnum = d.adnum - WHERE a.attrelid = '"gitlab_secondary"."ci_job_artifacts"'::regclass - AND a.attnum > 0 AND NOT a.attisdropped - ORDER BY a.attnum -``` - -#### Accessing data from a Foreign Table - -At the SQL level, all you have to do is `SELECT` data from `gitlab_secondary.*`. - -Here's an example of how to access all projects from the Geo Tracking Database's FDW: - -```sql -SELECT * FROM gitlab_secondary.projects; -``` - -As a more real-world example, this is how you filter for unarchived projects -on the Tracking Database: - -```sql -SELECT project_registry.* - FROM project_registry - JOIN gitlab_secondary.projects - ON (project_registry.project_id = gitlab_secondary.projects.id - AND gitlab_secondary.projects.archived IS FALSE) -``` - -At the ActiveRecord level, we have additional Models that represent the -foreign tables. They must be mapped in a slightly different way, and they are read-only. - -Check the existing FDW models in `ee/app/models/geo/fdw` for reference. - -From a developer's perspective, it's no different than creating a model that -represents a Database View. - -With the examples above, you can access the projects with: - -```ruby -Geo::Fdw::Project.all -``` - -and to access the `ProjectRegistry` filtering by unarchived projects: - -```ruby -# We have to use Arel here: -project_registry_table = Geo::ProjectRegistry.arel_table -fdw_project_table = Geo::Fdw::Project.arel_table - -project_registry_table.join(fdw_project_table) - .on(project_registry_table[:project_id].eq(fdw_project_table[:id])) - .where((fdw_project_table[:archived]).eq(true)) # if you append `.to_sql` you can check generated query -``` - ## Finders Geo uses [Finders](https://gitlab.com/gitlab-org/gitlab/tree/master/app/finders), which are classes take care of the heavy lifting of looking up projects/attachments/etc. in the tracking database and main database. -### Finders Performance - -The Finders need to compare data from the main database with data in -the tracking database. For example, counting the number of synced -projects normally involves retrieving the project IDs from one -database and checking their state in the other database. This is slow -and requires a lot of memory. - -To overcome this, the Finders use [FDW](#fdw), or Foreign Data -Wrappers. This allows a regular `JOIN` between the main database and -the tracking database. - ## Redis Redis on the **secondary** node works the same as on the **primary** @@ -397,12 +288,6 @@ migration do not need to run on the secondary nodes. A database on each Geo **secondary** node that keeps state for the node on which it resides. Read more in [Using the Tracking database](#using-the-tracking-database). -### FDW - -Foreign Data Wrapper, or FDW, is a feature built-in in PostgreSQL. It -allows data to be queried from different data sources. In Geo, it's -used to query data from different PostgreSQL instances. - ## Geo Event Log The Geo **primary** stores events in the `geo_event_log` table. Each |