How to solve the GraphQL n+1 problem
In this article, you'll take a closer look at the n+1 problem and what it looks like in GraphQL. You'll also get an overview of different methodologies and tools that you can use to avoid the problem and improve performance.
GraphQL is a query language and runtime for building APIs that returns only the data the client requests. A GraphQL server exposes a single endpoint and uses a schema to define the inputs it accepts and the shape of its responses. The schema lets clients request specific fields, so responses include only what the client needs.
In GraphQL the client specifies the data to return, but not how to fetch it from storage. Sometimes, a query could lead to unintentional, excessive backend requests. The n+1 problem is a typical example of when this can happen in GraphQL.
The n+1 problem occurs when a query requests nested data and the server needs n+1 requests to fetch it instead of just one or two. This typically happens with nested data, such as a list of musicians together with their album titles. The list of musicians can be fetched in a single query, but fetching the album titles takes at least one query per musician: one query to get n musicians, and n queries to get the albums for each of them. When n becomes sufficiently large, performance issues and failures can arise. This is easy to trigger in GraphQL because the client decides how deeply to nest a query, while each resolver on the server fetches its own data independently.
How do GraphQL runtimes return data?
GraphQL queries are made against a single endpoint. GraphQL runtimes like Apollo Server allow the backend to define a schema, which is a data model that indicates what data can be returned on each query. This allows the client to request some or all of the data, depending on what it needs.
Sample GraphQL runtime
When building a GraphQL server, each field that needs its own data fetch should have a dedicated resolver. A resolver runs only when a query selects its field, so the backend queries its databases only for the data that was requested.
The following shows a sample Apollo Server setup for nested data. The client can make calls against this runtime to get musician data, and optionally, include albums the musician has produced.
const { ApolloServer, gql } = require('apollo-server');

// Schema definitions
const typeDefs = gql`
  type Album {
    id: ID!
    title: String!
    artistId: ID!
  }
  type Musician {
    id: ID!
    name: String!
    albums: [Album]
  }
  type Query {
    musicians: [Musician]
  }
`;

// Resolver definitions
// `musicians` and `albums` stand in for the results of database fetches.
const resolvers = {
  Query: {
    musicians: () => {
      // Database fetch to get a list of musicians
      return musicians;
    },
  },
  Musician: {
    albums: (musician) => {
      // Input is a single musician
      // Database fetch to get a list of albums for this single musician
      return albums.filter((album) => album.artistId === musician.id);
    },
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
server.listen({ port: 3000 }).then(({ url }) => {
  console.log(`Starting new Apollo Server at ${url}`);
});
Sample client query
The client creates requests against the GraphQL server, tailored to include the information needed for display. In the example server, the client is able to request a list of musicians without any other information, or to request a list of musicians and associated albums, simply by changing the query. With a typical REST API, each endpoint returns a fixed response shape, so the client often receives more data than it needs (over-fetching).
The client query below requests all the data available for musicians. Since you know the schema of this GraphQL server, you can see that the way the runtime will execute this query can lead to issues. The runtime is set up to gather the albums for each musician individually, so when this query is executed, the runtime will fetch a list of musicians, and then use each musician to get a list of albums.
query {
  musicians {
    id
    name
    albums {
      title
    }
  }
}
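Because each field has its own resolver, the albums fetch only runs when a query selects that field. The sketch below imitates this with a hand-written `resolveMusicians` function. It is a hypothetical simplification of what a GraphQL runtime does, not Apollo Server's actual execution code, and the sample data is made up.

```javascript
// Hypothetical sample data standing in for database tables.
const musicians = [
  { id: '1', name: 'Musician A' },
  { id: '2', name: 'Musician B' },
];
const albums = [{ id: '10', title: 'Album X', artistId: '1' }];

let albumResolverCalls = 0;

// Stand-in for the Musician.albums resolver from the sample server.
function albumsResolver(musician) {
  albumResolverCalls += 1;
  return albums.filter((album) => album.artistId === musician.id);
}

// Simplified executor: copies the selected scalar fields and invokes the
// albums resolver only when the selection asks for albums.
function resolveMusicians(selection) {
  return musicians.map((musician) => {
    const out = {};
    for (const field of selection) {
      out[field] = field === 'albums' ? albumsResolver(musician) : musician[field];
    }
    return out;
  });
}

const namesOnly = resolveMusicians(['id', 'name']);
const callsAfterNamesOnly = albumResolverCalls;

const withAlbums = resolveMusicians(['id', 'name', 'albums']);
const callsAfterWithAlbums = albumResolverCalls;

console.log(callsAfterNamesOnly, callsAfterWithAlbums);
```

Selecting only `id` and `name` never touches the albums resolver, while adding `albums` invokes it once per musician, which is where the extra calls in the next section come from.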
The n+1 problem in GraphQL
GraphQL is very powerful because of how flexible returned data can be. The queries can be set up to return data from nested datasets, such as the albums nested underneath musicians in the example above. When querying for nested data, like in the music example, a scaling issue known as the n+1 problem can occur.
In the example, the query will fetch a list of musicians. Let’s say it finds n musicians in the database. For each musician found, the albums() resolver will be invoked to locate all the albums associated with that musician. This resolver will trigger a database call for each musician, which will be n calls. This means that in total, there are n+1 database calls occurring. This is much less efficient than having two database calls, one for musicians and one for albums.
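To make the call count concrete, the sketch below simulates both approaches against an in-memory array and counts each simulated database call. The `db` object, its sample rows, and the `queryCount` counter are hypothetical stand-ins used only for illustration.

```javascript
// Hypothetical in-memory "database" that counts every query made against it.
const db = {
  queryCount: 0,
  musicians: [
    { id: '1', name: 'Musician A' },
    { id: '2', name: 'Musician B' },
    { id: '3', name: 'Musician C' },
  ],
  albums: [
    { id: '10', title: 'Album X', artistId: '1' },
    { id: '11', title: 'Album Y', artistId: '2' },
    { id: '12', title: 'Album Z', artistId: '2' },
  ],
  findMusicians() {
    this.queryCount += 1;
    return this.musicians;
  },
  findAlbumsByArtist(artistId) {
    this.queryCount += 1;
    return this.albums.filter((album) => album.artistId === artistId);
  },
  findAlbumsByArtists(artistIds) {
    this.queryCount += 1;
    return this.albums.filter((album) => artistIds.includes(album.artistId));
  },
};

// Naive resolution: one call for the list, then one call per musician.
function resolveNaively() {
  db.queryCount = 0;
  const result = db.findMusicians().map((musician) => ({
    ...musician,
    albums: db.findAlbumsByArtist(musician.id),
  }));
  return { result, queries: db.queryCount };
}

// Two-call resolution: one call for the list, one call for all albums.
function resolveInTwoCalls() {
  db.queryCount = 0;
  const musicians = db.findMusicians();
  const albums = db.findAlbumsByArtists(musicians.map((m) => m.id));
  const result = musicians.map((musician) => ({
    ...musician,
    albums: albums.filter((album) => album.artistId === musician.id),
  }));
  return { result, queries: db.queryCount };
}

const naive = resolveNaively();
const twoCalls = resolveInTwoCalls();
console.log(naive.queries); // n + 1, where n is the number of musicians
console.log(twoCalls.queries); // 2, regardless of n
```

Both functions build the same result. Only the number of calls differs, and the gap grows with every musician added to the list.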
Symptoms of the n+1 problem
The n+1 problem can lead to several client and server issues. The first problem is simply the number of calls to the database. Each call takes time, so the page making the request will lag. Lag makes it harder to retain customers, so it should be avoided as much as possible and closely monitored by DevOps teams.
Since the amount of lag depends on the call's size, you may also experience inconsistent performance. Pages without many nested calls will load quickly, while others that require more nested data will be much slower.
If you run your database on a cloud platform like AWS or Azure, extra calls to the database can also increase your service fees.
How to solve n+1 in GraphQL
Since this is a common issue with GraphQL, there are well-established solutions for handling it. These include using data loaders or batching libraries on the server.
Data loaders in GraphQL
Data loaders were initially proposed by engineers at Facebook. The idea is to avoid running each nested fetch on its own, because a single nested resolver does not have enough information to query efficiently. Instead, a data loader defers the individual fetches requested by the resolvers and batches them into a single query.
When using a data loader, the fetcher responds with a promise, then moves to the next fetch at the same data level rather than moving on to the nested data. A promise is a placeholder for a result that will be available later, which allows processing to continue without blocking. Once all the fetches at one level have been collected as pending promises, a single request is made to get all the nested data, and the promises are fulfilled from its result.
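The deferral described above can be sketched without any library. The `TinyLoader` class below is a simplified, hypothetical stand-in for a data loader, not the real `dataloader` package. It hands back a promise for each key, waits until the current run of synchronous code has finished, and then calls the batch function once with every key it collected. The sample albums and the `batchCalls` counter are also made up for illustration.

```javascript
// Simplified stand-in for a data loader (not the real `dataloader` library).
class TinyLoader {
  constructor(batchFn) {
    // batchFn receives all queued keys and returns results in the same order.
    this.batchFn = batchFn;
    this.queue = [];
  }

  load(key) {
    return new Promise((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // Schedule a single flush when the first key of a batch is queued.
      if (this.queue.length === 1) {
        process.nextTick(() => this.flush());
      }
    });
  }

  async flush() {
    const batch = this.queue;
    this.queue = [];
    try {
      const results = await this.batchFn(batch.map((item) => item.key));
      batch.forEach((item, index) => item.resolve(results[index]));
    } catch (error) {
      batch.forEach((item) => item.reject(error));
    }
  }
}

// Hypothetical sample data and a counter for simulated database calls.
const albums = [
  { title: 'Album X', artistId: '1' },
  { title: 'Album Y', artistId: '2' },
  { title: 'Album Z', artistId: '2' },
];
let batchCalls = 0;

const albumLoader = new TinyLoader(async (artistIds) => {
  batchCalls += 1; // one simulated database call per batch
  return artistIds.map((id) => albums.filter((album) => album.artistId === id));
});

// Three load() calls at the same level produce three promises but one batch.
const pending = Promise.all(['1', '2', '3'].map((id) => albumLoader.load(id)));

pending.then((albumsPerMusician) => {
  console.log(albumsPerMusician.map((list) => list.length));
  console.log(batchCalls);
});
```

The key design choice is the deferred flush: because `load()` only queues the key, every resolver at the same level gets a chance to register its request before the single batched fetch runs.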
Implementing data loaders
The implementation of data loaders depends on which GraphQL implementation you are using. Some have data loader functionality readily available, like java-dataloader for GraphQL Java.
The GraphQL community also maintains a dataloader library for JavaScript that can be layered onto your GraphQL server. Loaders created with this utility are passed to each resolver in the context value. The example below shows what a data loader for the musicians' albums could look like.
const DataLoader = require('dataloader');

// The batch function takes in an array of musician ids and returns a Promise for
// an array of results in the same order: one list of albums per musician id.
// sqlRun is a placeholder for your database client.
const createAlbumLoader = () =>
  new DataLoader(async (musicianIds) => {
    // SQL query to fetch albums for all musicianIds at once
    // (artistId matches the Album type in the schema)
    const albums = await sqlRun('SELECT * FROM Albums WHERE artistId IN (?)', [musicianIds]);
    return musicianIds.map((id) => albums.filter((album) => album.artistId === id));
  });

// Use this context in the Apollo Server definition to pass to each resolver when executed
const context = async () => {
  const loaders = {
    album: createAlbumLoader(),
    // Create more loaders for other data that is nested in your schema
  };
  return { loaders };
};
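The `dataloader` library expects the batch function to resolve to an array with the same length and order as the keys it received. A database query with an `IN` clause gives no such guarantee, so the rows usually need regrouping. The helper below is one way to do that. `groupRowsByKey` and the sample rows are hypothetical names used only for illustration.

```javascript
// Regroup flat rows into one list per key, preserving the order of `keys`.
// Keys with no matching rows get an empty list so the output length matches.
function groupRowsByKey(keys, rows, keyField) {
  const groups = new Map(keys.map((key) => [key, []]));
  for (const row of rows) {
    const group = groups.get(row[keyField]);
    if (group) {
      group.push(row);
    }
  }
  return keys.map((key) => groups.get(key));
}

// Hypothetical rows, returned by the database in arbitrary order.
const rows = [
  { title: 'Album Z', artistId: '2' },
  { title: 'Album X', artistId: '1' },
  { title: 'Album Y', artistId: '2' },
];

const grouped = groupRowsByKey(['1', '2', '3'], rows, 'artistId');
console.log(grouped.map((list) => list.map((album) => album.title)));
```

Building the `Map` once keeps the regrouping linear in the number of rows, which matters when a batch covers many musicians.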
Once the data loader is set up and available in the context, the resolver can be updated to use that loader. The loader is only triggered once, for the list of musicians fetched, so the number of SQL queries has been reduced to only two.
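// Resolver definitions
// sqlRun is the same placeholder database client used by the loader.
const resolvers = {
  Query: {
    musicians: () => {
      // Database fetch to get a list of musicians
      return sqlRun('SELECT * FROM Musicians');
    },
  },
  Musician: {
    albums: (musician, _args, { loaders }) => {
      // Queue this musician's id; the loader batches all queued ids into one query
      return loaders.album.load(musician.id);
    },
  },
};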
Batching in GraphQL
Batching in GraphQL is an expansion of the data loader concept discussed in the previous section. Essentially, batching libraries provide ways to ensure that nested data is retrieved with fewer queries by defining how to group and load similar data. Note that the main GraphQL library also supports batch execution, which is a different concept about invoking multiple resolvers at once.
Promises are the key to how batching works in GraphQL. The request executes the appropriate resolver first, and tries to resolve all the requested fields. Where batch loaders are specified, the data is resolved as a promise. GraphQL Batch can then iterate through grouped data identifiers to fulfill the promises together by retrieving data with as few calls as the batch loader will allow.
There are many batch loaders to choose from, implemented in different languages. Choose the appropriate tool based on the language your server is implemented in. For those using Ruby, Shopify has produced an open source plugin, GraphQL Batch. Ruby users could also use another popular open source plugin, while JavaScript users can use an open source library.
Implementing batching
Using the Shopify Ruby batching plugin, we can implement a batch loader on the server side.
First, define a custom loader that will be used to group database calls:
class AlbumLoader < GraphQL::Batch::Loader
  def initialize(model)
    @model = model
  end

  def perform(ids)
    @model.where(id: ids).each { |album| fulfill(album.id, album) }
    ids.each { |id| fulfill(id, nil) unless fulfilled?(id) }
  end
end
Next, apply the batching plugin to the GraphQL schema. It is advised that the plugin be defined after mutations so the batching can extend mutation fields and allow for cache clearing.
class MySchema < GraphQL::Schema
  query MyQueryType
  mutation MyMutationType

  use GraphQL::Batch
end
Finally, use the batch loader class in the resolver with grouped identifiers to get a batch of nested data:
field :musician, Types::Musician, null: true do
  argument :id, ID, required: true
end

def musician(id:)
  AlbumLoader.for(Musician).load(id)
end
Conclusion
In this article, you've learned that the n+1 problem is an issue where a naive query can cause excessive calls against databases in GraphQL. These extra calls can lead to an expensive and laggy webpage. GraphQL’s community provides several options for building your GraphQL server more robustly to streamline processing. Tools for this include data loaders and batching, which can be built directly into your schema.
The n+1 problem is just one issue you may encounter with a GraphQL server setup. To use GraphQL for your webpage without needing to build the server side from scratch, consider using tools like Hygraph, a headless CMS that can host your data sets in a GraphQL API without requiring any server setup.