author | GitLab Bot <gitlab-bot@gitlab.com> | 2021-02-10 18:09:02 +0000 |
---|---|---|
committer | GitLab Bot <gitlab-bot@gitlab.com> | 2021-02-10 18:09:02 +0000 |
commit | 577bb49691b11bc8ebae3a4966153ed39af60d87 (patch) | |
tree | c34970de0f1fc58463448da0f34be13a2f3f47f9 /doc/architecture/blueprints/graphql_api | |
parent | 6cffe9ea21d0974ebd3c544a3b711ffcd35649e2 (diff) | |
Add latest changes from gitlab-org/gitlab@master
Diffstat (limited to 'doc/architecture/blueprints/graphql_api')
-rw-r--r-- | doc/architecture/blueprints/graphql_api/index.md | 183 |
1 files changed, 183 insertions, 0 deletions
diff --git a/doc/architecture/blueprints/graphql_api/index.md b/doc/architecture/blueprints/graphql_api/index.md
new file mode 100644
index 00000000000..7798d9da587
--- /dev/null
+++ b/doc/architecture/blueprints/graphql_api/index.md
@@ -0,0 +1,183 @@
+---
+stage: none
+group: unassigned
+comments: false
+description: 'GraphQL API architecture foundation'
+---
+
+# GraphQL API
+
+[GraphQL](https://graphql.org/) is a data query and manipulation language for
+APIs, and a runtime for fulfilling queries with existing data.
+
+At GitLab we want to adopt GraphQL to make it easier for the wider community to
+interact with GitLab in a reliable way, but also to advance our own product by
+modeling communication between backend and frontend components using GraphQL.
+
+We've recently increased the pace of the adoption by defining quarterly OKRs
+related to GraphQL migration. This resulted in us spending more time on the
+GraphQL development and helped to surface the need of improving tooling we use
+to extend the new API.
+
+This document describes the work that is needed to build a stable foundation that
+will support our development efforts and a large-scale usage of the [GraphQL
+API](https://docs.gitlab.com/ee/api/graphql/index.html).
+
+## Summary
+
+The GraphQL initiative at GitLab [started around three years ago](https://gitlab.com/gitlab-org/gitlab/-/commit/9c6c17cbcdb8bf8185fc1b873dcfd08f723e4df5).
+Most of the work around the GraphQL ecosystem has been done by volunteers that are
+[GraphQL experts](https://gitlab.com/groups/gitlab-org/graphql-experts/-/group_members?with_inherited_permissions=exclude).
+
+The [retrospective on our progress](https://gitlab.com/gitlab-org/gitlab/-/issues/235659)
+surfaced a few opportunities to streamline our GraphQL development efforts and
+to reduce the risk of performance degradations and possible outages that may
+be related to the gaps in the essential mechanisms needed to make the GraphQL
+API observable and operable at scale.
+
+Amongst small improvements to the GraphQL engine itself we want to build a
+comprehensive monitoring dashboard, that will enable team members to make sense
+of what is happening inside our GraphQL API. We want to make it possible to define
+SLOs, triage breached SLIs and to be able to zoom into relevant details using
+Grafana and Elastic. We want to see historical data and predict future usage.
+
+It is an opportunity to learn from our experience in evolving the REST API, for
+the scale, and to apply this knowledge onto the GraphQL development efforts. We
+can do that by building query-to-feature correlation mechanisms, adding
+scalable state synchronization support and aligning GraphQL with other
+architectural initiatives being executed in parallel, like [the support for
+direct uploads](https://gitlab.com/gitlab-org/gitlab/-/issues/280819).
+
+GraphQL should be secure by default. We can avoid common security mistakes by
+building mechanisms that will help us to enforce [OWASP GraphQL
+recommendations](https://cheatsheetseries.owasp.org/cheatsheets/GraphQL_Cheat_Sheet.html)
+that are relevant to us.
+
+Understanding what are the needs of the wider community will also allow us to
+plan deprecation policies better and to design parity between GraphQL and REST
+API that suits their needs.
+
+## Challenges
+
+### Make sense of what is happening in GraphQL
+
+Being able to see how GraphQL performs in a production environment is a
+prerequisite for improving performance and reliability of that service.
+
+We do not yet have tools that would make it possible for us to answer a
+question of how GraphQL performs and what the bottlenecks we should optimize
+are. This, combined with a pace of GraphQL adoption and the scale in which we
+expect it operate, imposes a risk of an increased rate of production incidents
+what will be difficult to resolve.
+
+We want to build a comprehensive Grafana dashboard that will focus on
+delivering insights of how GraphQL endpoint performs, while still empowering
+team members with capability of zooming in into details. We want to improve
+logging to make it possible to better correlate GraphQL queries with feature
+using Elastic and to index them in a way that performance problems can be
+detected early.
+
+- Build a comprehensive Grafana dashboard for GraphQL
+- Build a GraphQL query-to-feature correlation mechanisms
+- Improve logging GraphQL queries in Elastic
+- Redesign error handling on frontend to surface warnings
+
+### Manage volatile GraphQL data structures
+
+Our GraphQL API will evolve with time. GraphQL has been designed to make such
+evolution easier. GraphQL APIs are easier to extend because of how composable
+GraphQL is. On the other hand this is also a reason why versioning of GraphQL
+APIs is considered unnecessary. Instead of versioning the API we want to mark
+some fields as deprecated, but we need to have a way to understand what is the
+usage of deprecated fields, types and a way to visualize it in a way that is
+easy to understand. We might want to detect usage of deprecated fields and
+notify users that we plan to remove them.
+
+- Define a data-informed deprecation policy that will serve our users better
+- Build a dashboard showing usage frequency of deprecated GraphQL fields
+- Build mechanisms required to send deprecated fields usage in usage ping
+
+### Ensure consistency with the rest of the codebase
+
+GraphQL is not the only thing we work on, but it cuts across the entire
+application. It is being used to expose data collected and processed in almost
+every part of our product. It makes it tightly coupled with our monolithic
+codebase.
+
+We need to ensure that how we use GraphQL is consistent with other mechanisms
+we've designed to improve performance and reliability of GitLab.
+
+We have extensive experience with evolving our REST API. We want to apply
+this knowledge onto GraphQL and make it performant and secure by default.
+
+- Design direct uploads for GraphQL
+- Build GraphQL query depth and complexity histograms
+- Visualize the amount of GraphQL queries reaching limits
+- Add support for GraphQL etags for existing features
+
+### Design GraphQL interoperability with REST API
+
+We do not plan to deprecate our REST API. It is a simple way to interact with
+GitLab, and GraphQL might never become a full replacement of a traditional REST
+API. The two APIs will need to coexist together. We will need to remove
+duplication between them to make their codebases maintainable. This symbiosis,
+however, is not only a technical challenge we need to resolve on the backend.
+Users might want to use the two APIs interchangeably or even at the same time.
+Making it interoperable by exposing a common scheme for resource identifiers is
+a prerequisite for interoperability.
+
+- Make GraphQL and REST API interoperable
+- Design common resource identifiers for both APIs
+
+### Design scalable state synchronization mechanisms
+
+One of the most important goals related to GraphQL adoption at GitLab is using
+it to model interactions between GitLab backend and frontend components. This
+is an ongoing process that has already surfaced the need of building better
+state synchronization mechanisms and hooking into existing ones.
+
+- Design a scalable state synchronization mechanism
+- Evaluate state synchronization through pub/sub and websockets
+- Build a generic support for GraphQL feature correlation and feature etags
+- Redesign frontend code responsible for managing shared global state
+
+## Iterations
+
+1. [Build comprehensive Grafana dashboard for GraphQL](https://gitlab.com/groups/gitlab-com/-/epics/1343)
+1. [Improve logging of GraphQL requests in Elastic](https://gitlab.com/groups/gitlab-org/-/epics/4646)
+1. [Build a scalable state synchronization for GraphQL](https://gitlab.com/groups/gitlab-org/-/epics/5319)
+1. [Build GraphQL feature-to-query correlation mechanisms](https://gitlab.com/groups/gitlab-org/-/epics/5320)
+1. [Design a better data-informed deprecation policy](https://gitlab.com/groups/gitlab-org/-/epics/5321)
+1. [Add support for direct uploads for GraphQL](https://gitlab.com/gitlab-org/gitlab/-/issues/280819)
+1. [Review GraphQL design choices related to security](https://gitlab.com/gitlab-org/security/gitlab/-/issues/339)
+
+## Status
+
+Current status: in progress.
+
+## Who
+
+Proposal:
+
+<!-- vale gitlab.Spelling = NO -->
+
+| Role                         | Who
+|------------------------------|-------------------------|
+| Author                       | Grzegorz Bizon          |
+| Architecture Evolution Coach | Kamil Trzciński         |
+| Engineering Leader           | Darva Satcher           |
+| Product Manager              | Patrick Deuley          |
+| Domain Expert / GraphQL      | Charlie Ablett          |
+| Domain Expert / GraphQL      | Alex Kalderimis         |
+| Domain Expert / GraphQL      | Natalia Tepluhina       |
+| Domain Expert / Scalability  | Bob Van Landuyt         |
+
+DRIs:
+
+| Role                         | Who
+|------------------------------|------------------------|
+| Leadership                   | Darva Satcher          |
+| Product                      | Patrick Deuley         |
+| Engineering                  |                        |
+
+<!-- vale gitlab.Spelling = YES -->
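The blueprint added by this commit lists "Build GraphQL query depth and complexity histograms" among its goals without saying how depth would be measured. The following is a minimal sketch of one way to approximate it, not GitLab's implementation. The function names, the bucket bounds, and the brace-counting approach are all assumptions made for illustration. A production version would more likely walk the parsed query AST, which would also resolve fragment spreads and skip comments; this sketch does neither.

```python
from bisect import bisect_left
from collections import Counter

# Hypothetical bucket upper bounds for query depth; not taken from the blueprint.
DEPTH_BUCKETS = [2, 4, 8, 16]


def query_depth(query: str) -> int:
    """Approximate selection-set depth from curly-brace nesting.

    Braces inside string literals or inside parentheses (argument lists,
    variable definitions) are ignored, so input objects such as
    `filter: {state: "opened"}` do not count as nesting levels.
    """
    depth = max_depth = paren_depth = 0
    in_string = escaped = False
    for ch in query:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch == "(":
            paren_depth += 1
        elif ch == ")":
            paren_depth -= 1
        elif paren_depth == 0:
            if ch == "{":
                depth += 1
                max_depth = max(max_depth, depth)
            elif ch == "}":
                depth -= 1
    return max_depth


def depth_histogram(queries):
    """Count queries per depth bucket (non-cumulative buckets)."""
    counts = Counter()
    for query in queries:
        idx = bisect_left(DEPTH_BUCKETS, query_depth(query))
        if idx < len(DEPTH_BUCKETS):
            label = f"<={DEPTH_BUCKETS[idx]}"
        else:
            label = f">{DEPTH_BUCKETS[-1]}"
        counts[label] += 1
    return dict(counts)
```

Each query lands in exactly one bucket here. Metrics systems that expect cumulative histogram buckets would need the counts summed across lower buckets before export.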