Diffstat (limited to 'doc/development/graphql_guide/pagination.md')
-rw-r--r-- | doc/development/graphql_guide/pagination.md | 173 |
1 files changed, 169 insertions, 4 deletions
diff --git a/doc/development/graphql_guide/pagination.md b/doc/development/graphql_guide/pagination.md
index bf9eaa99158..d5140363396 100644
--- a/doc/development/graphql_guide/pagination.md
+++ b/doc/development/graphql_guide/pagination.md
@@ -1,3 +1,9 @@
+---
+stage: none
+group: unassigned
+info: To determine the technical writer assigned to the Stage/Group associated with this page, see https://about.gitlab.com/handbook/engineering/ux/technical-writing/#designated-technical-writers
+---
+
 # GraphQL pagination

 ## Types of pagination
@@ -59,13 +65,13 @@ Some of the benefits and tradeoffs of keyset pagination are

 - Performance is much better.
-- Data stability is greater since you're not going to miss records due to
+- More data stability for end-users since records are not missing from lists due to
   deletions or insertions.
 - It's the best way to do infinite scrolling.
 - It's more difficult to program and maintain. Easy for `updated_at` and
-  `sort_order`, complicated (or impossible) for complex sorting scenarios.
+  `sort_order`, complicated (or impossible) for [complex sorting scenarios](#limitations-of-query-complexity).

 ## Implementation

@@ -80,12 +86,171 @@ However, there are some cases where we have to use the offset
 pagination connection, `OffsetActiveRecordRelationConnection`, such as when
 sorting by label priority in issues, due to the complexity of the sort.

-<!-- ### Keyset pagination -->
+### Keyset pagination
+
+The keyset pagination implementation is a subclass of `GraphQL::Pagination::ActiveRecordRelationConnection`,
+which is a part of the `graphql` gem. This is installed as the default for all `ActiveRecord::Relation`.
+However, instead of using a cursor based on an offset (which is the default), GitLab uses a more specialized cursor.
+
+The cursor is created by encoding a JSON object which contains the relevant ordering fields. For example:
+
+```ruby
+ordering = {"id"=>"72410125", "created_at"=>"2020-10-08 18:05:21.953398000 UTC"}
+json = ordering.to_json
+cursor = Base64Bp.urlsafe_encode64(json, padding: false)
+
+"eyJpZCI6IjcyNDEwMTI1IiwiY3JlYXRlZF9hdCI6IjIwMjAtMTAtMDggMTg6MDU6MjEuOTUzMzk4MDAwIFVUQyJ9"
+
+json = Base64Bp.urlsafe_decode64(cursor)
+Gitlab::Json.parse(json)
+
+{"id"=>"72410125", "created_at"=>"2020-10-08 18:05:21.953398000 UTC"}
+```
+
+The benefits of storing the order attribute values in the cursor:
+
+- If only the ID of the object were stored, the object and its attributes could be queried.
+  That would require an additional query, and if the object is no longer there, then the needed
+  attributes are not available.
+- If an attribute is `NULL`, then one SQL query can be used. If it's not `NULL`, then a
+  different SQL query can be used.
+
+Based on whether the main attribute field being sorted on is `NULL` in the cursor, the proper query
+condition is built. The last ordering field is considered to be unique (a primary key), meaning the
+column never contains `NULL` values.
+
+#### Limitations of query complexity
+
+We only support two ordering fields, and one of those fields needs to be the primary key.
+
+Here are two examples of pseudocode for the query:
+
+- **Two-condition query.** `X` represents the values from the cursor. `C` represents
+  the columns in the database, sorted in ascending order, using an `:after` cursor, and with `NULL`
+  values sorted last.
+
+  ```plaintext
+  X1 IS NOT NULL
+    AND
+      (C1 > X1)
+        OR
+      (C1 IS NULL)
+        OR
+      (C1 = X1
+        AND
+       C2 > X2)
+
+  X1 IS NULL
+    AND
+      (C1 IS NULL
+        AND
+       C2 > X2)
+  ```
+
+  Below is an example based on the relation `Issue.order(relative_position: :asc).order(id: :asc)`
+  with an after cursor of `relative_position: 1500, id: 500`:
+
+  ```plaintext
+  when cursor[relative_position] is not NULL
+
+      ("issues"."relative_position" > 1500)
+      OR (
+        "issues"."relative_position" = 1500
+        AND
+        "issues"."id" > 500
+      )
+      OR ("issues"."relative_position" IS NULL)
+
+  when cursor[relative_position] is NULL
+
+      "issues"."relative_position" IS NULL
+      AND
+      "issues"."id" > 500
+  ```
+
+- **Three-condition query.** The example below is not complete, but shows the
+  complexity of adding one more condition. `X` represents the values from the cursor. `C` represents
+  the columns in the database, sorted in ascending order, using an `:after` cursor, and with `NULL`
+  values sorted last.
+
+  ```plaintext
+  X1 IS NOT NULL
+    AND
+      (C1 > X1)
+        OR
+      (C1 IS NULL)
+        OR
+      (C1 = X1 AND C2 > X2)
+        OR
+      (C1 = X1
+        AND
+          X2 IS NOT NULL
+            AND
+              ((C2 > X2)
+                 OR
+               (C2 IS NULL)
+                 OR
+               (C2 = X2 AND C3 > X3)
+        OR
+          X2 IS NULL.....
+  ```
+
+By using
+[`Gitlab::Graphql::Pagination::Keyset::QueryBuilder`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/graphql/pagination/keyset/query_builder.rb),
+we're able to build the necessary SQL conditions and apply them to the Active Record relation.
+
+Complex queries can be difficult or impossible to use. For example,
+in [`issuable.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/issuable.rb),
+the `order_due_date_and_labels_priority` method creates a very complex query.
+
+These types of queries are not supported. In these instances, you can use offset pagination.
+
+### Offset pagination
+
+There are times when the [complexity of sorting](#limitations-of-query-complexity)
+is more than our keyset pagination can handle.
+
+For example, in [`IssuesResolver`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/graphql/resolvers/issues_resolver.rb),
+when sorting by `priority_asc`, we can't use keyset pagination as the ordering is much
+too complex. For more information, read [`issuable.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/models/concerns/issuable.rb).

-<!-- ### Offset pagination -->
+In cases like this, we can fall back to regular offset pagination by returning a
+[`Gitlab::Graphql::Pagination::OffsetActiveRecordRelationConnection`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/lib/gitlab/graphql/pagination/offset_active_record_relation_connection.rb)
+instead of an `ActiveRecord::Relation`:
+
+```ruby
+  def resolve(parent, finder, **args)
+    issues = apply_lookahead(Gitlab::Graphql::Loaders::IssuableLoader.new(parent, finder).batching_find_all)
+
+    if non_stable_cursor_sort?(args[:sort])
+      # Certain complex sorts are not supported by the stable cursor pagination yet.
+      # In these cases, we use offset pagination, so we return the correct connection.
+      Gitlab::Graphql::Pagination::OffsetActiveRecordRelationConnection.new(issues)
+    else
+      issues
+    end
+  end
+```

 <!-- ### External pagination -->

+### External pagination
+
+There may be times where you need to return data through the GitLab API that is stored in
+another system. In these cases you may have to paginate a third-party's API.
+
+An example of this is with our [Error Tracking](../../operations/error_tracking.md) implementation,
+where we proxy [Sentry errors](../../operations/error_tracking.md#sentry-error-tracking) through
+the GitLab API. We do this by calling the Sentry API which enforces its own pagination rules.
+This means we cannot access the collection within GitLab to perform our own custom pagination.
+
+For consistency, we manually set the pagination cursors based on values returned by the external API, using `Gitlab::Graphql::ExternallyPaginatedArray.new(previous_cursor, next_cursor, *items)`.
+
+You can see an example implementation in the following files:
+
+- [`types/error_tracking/sentry_error_collection_type.rb`](https://gitlab.com/gitlab-org/gitlab/-/blob/master/app/graphql/types/error_tracking/sentry_error_collection_type.rb) which adds an extension to `field :errors`.
+- [`resolvers/error_tracking/sentry_errors_resolver.rb`](https://gitlab.com/gitlab-org/gitlab/blob/master/app/graphql/resolvers/error_tracking/sentry_errors_resolver.rb) which returns the data from the resolver.
+
 ## Testing

 Any GraphQL field that supports pagination and sorting should be tested
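
Reviewer's note, not part of the patch: the two-condition keyset query described in the added "Limitations of query complexity" section can be sketched in plain Ruby. This is only an illustration under stated assumptions. `keyset_after_condition` is a hypothetical helper, not the API of `Gitlab::Graphql::Pagination::Keyset::QueryBuilder`; it assumes an ascending sort with `NULL` values last, an `:after` cursor, integer-valued columns, and `id` as the unique second ordering field.

```ruby
# Hypothetical helper (an assumption for illustration, not GitLab's QueryBuilder).
# Builds the WHERE fragment for ORDER BY <column> ASC NULLS LAST, id ASC
# when paginating forward from an :after cursor.
def keyset_after_condition(table, column, cursor)
  col = %("#{table}"."#{column}")
  id = %("#{table}"."id")
  id_value = Integer(cursor.fetch("id")) # cursor values may arrive as strings
  value = cursor[column]

  if value.nil?
    # The cursor row is in the trailing NULL block, so only NULL rows
    # with a larger id can follow it.
    "#{col} IS NULL AND #{id} > #{id_value}"
  else
    value = Integer(value)
    # Rows after the cursor: a larger value, the same value with a larger id,
    # or any NULL row (NULLs sort last).
    "(#{col} > #{value}) OR (#{col} = #{value} AND #{id} > #{id_value}) OR (#{col} IS NULL)"
  end
end

puts keyset_after_condition("issues", "relative_position",
                            { "relative_position" => "1500", "id" => "500" })
puts keyset_after_condition("issues", "relative_position",
                            { "relative_position" => nil, "id" => "500" })
```

The two branches correspond to the `X1 IS NOT NULL` and `X1 IS NULL` cases in the pseudocode. Coercing cursor values with `Integer` is a choice made for this sketch so that the interpolated SQL stays numeric; real code would use bound parameters instead of string interpolation.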