summaryrefslogtreecommitdiff
diff options
context:
space:
mode:
authorGabriel Mazetto <brodock@gmail.com>2019-05-20 22:18:59 +0200
committerGabriel Mazetto <brodock@gmail.com>2019-05-29 16:25:15 +0200
commitbed5ae425bc4695d20cd4ad1980266a575472ab1 (patch)
treea4b4d12e17d7bf788abc5e6d5f440564410657f2
parent785e55c23ee2480ef90510dcf63b5a7a2c6fe9ce (diff)
downloadgitlab-ce-ee-10587-geo-development-fdw-docs.tar.gz
Added FDW development documentationee-10587-geo-development-fdw-docs
Improve existing FDW documentation in `doc/development` to include additional information about the architecture and how to use FDW as a developer.
-rw-r--r--doc/development/geo.md78
1 files changed, 72 insertions, 6 deletions
diff --git a/doc/development/geo.md b/doc/development/geo.md
index 87ec34ec5c4..6e59fab34c7 100644
--- a/doc/development/geo.md
+++ b/doc/development/geo.md
@@ -10,6 +10,7 @@ the diagram below and are described in more detail within this document.
## Replication layer
Geo handles replication for different components:
+
- [Database](#database-replication): includes the entire application, except cache and jobs.
- [Git repositories](#repository-replication): includes both projects and wikis.
- [Uploaded blobs](#uploads-replication): includes anything from images attached on issues
@@ -209,20 +210,38 @@ bundle exec rake geo:db:migrate
### Foreign Data Wrapper
-The use of [FDW](#fdw) was introduced in GitLab 10.1.
+> Introduced in GitLab 10.1.
+
+Foreign Data Wrapper ([FDW](#fdw)) is used by the [Geo Log Cursor](#geo-log-cursor) and improves
+the performance of many synchronization operations.
-This is useful for the [Geo Log Cursor](#geo-log-cursor) and improves
-the performance of some synchronization operations.
+FDW is a PostgreSQL extension ([`postgres_fdw`](https://www.postgresql.org/docs/current/postgres-fdw.html)) that is enabled within
+the Geo Tracking Database (on a **secondary** node), which allows it
+to connect to the readonly database replica and perform queries and filter
+data from both instances.
While FDW is available in older versions of PostgreSQL, we needed to
raise the minimum required version to 9.6 as this includes many
performance improvements to the FDW implementation.
+This persistent connection is configured as an FDW server
+named `gitlab_secondary`. This configuration exists within the database's user
+context only. To access the `gitlab_secondary`, GitLab needs to use the
+same database user that had previously been configured.
+
+The Geo Tracking Database accesses the readonly database replica via FDW as a regular user,
+limited by its own restrictions. The credentials are configured as a
+`USER MAPPING` associated with the `SERVER` mapped previously
+(`gitlab_secondary`).
+
+FDW configuration and credentials definition are managed automatically by the
+Omnibus GitLab `gitlab-ctl reconfigure` command.
+
#### Refeshing the Foreign Tables
-Whenever the database schema changes on the **primary** node, the
-**secondary** node will need to refresh its foreign tables by running
-the following:
+Whenever a new Geo node is configured or the database schema changes on the
+**primary** node, you must refresh the foreign tables on the **secondary** node
+by running the following:
```sh
bundle exec rake geo:db:refresh_foreign_tables
@@ -243,6 +262,53 @@ STATEMENT: SELECT a.attname, format_type(a.atttypid, a.atttypmod)
ORDER BY a.attnum
```
+#### Accessing data from a Foreign Table
+
+At the SQL level, all you have to do is `SELECT` data from `gitlab_secondary.*`.
+
+Here's an example of how to access all projects from the Geo Tracking Database's FDW:
+
+```sql
+SELECT * FROM gitlab_secondary.projects;
+```
+
+As a more real-world example, this is how you filter for unarchived projects
+on the Tracking Database:
+
+```sql
+SELECT project_registry.*
+ FROM project_registry
+ JOIN gitlab_secondary.projects
+ ON (project_registry.project_id = gitlab_secondary.projects.id
+ AND gitlab_secondary.projects.archived IS FALSE)
+```
+
+At the ActiveRecord level, we have additional Models that represent the
+foreign tables. They must be mapped in a slightly different way, and they are read-only.
+
+Check the existing FDW models in `ee/app/models/geo/fdw` for reference.
+
+From a developer's perspective, it's no different than creating a model that
+represents a Database View.
+
+With the examples above, you can access the projects with:
+
+```ruby
+Geo::Fdw::Project.all
+```
+
+and to access the `ProjectRegistry` filtering by unarchived projects:
+
+```ruby
+# We have to use Arel here:
+project_registry_table = Geo::ProjectRegistry.arel_table
+fdw_project_table = Geo::Fdw::Project.arel_table
+
+project_registry_table.join(fdw_project_table)
+ .on(project_registry_table[:project_id].eq(fdw_project_table[:id]))
+ .where((fdw_project_table[:archived]).eq(true)) # if you append `.to_sql` you can check generated query
+```
+
## Finders
Geo uses [Finders](https://gitlab.com/gitlab-org/gitlab-ee/tree/master/app/finders),