Due to the flexibility of GraphQL, we often need to execute multiple queries when we load an object's associated objects. This causes the famous N+1 query problem. To solve this problem, we can use DataLoader.
DataLoader reduces the number of database queries by merging multiple requests into a single one, and it caches query results to avoid repeated queries.
Consider the following simple objects, User and Book:
On the Book object, we have an authorID field that references the id field of the User object.
In addition, we need to prepare some simple data:
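As a minimal sketch of what these objects and the sample data could look like in plain TypeScript: only id, authorID, the IUser name, the users array and the count of 6 books come from the text, while IBook, name, title and the concrete values are assumptions made for illustration.

```typescript
// Hypothetical shapes: `name` and `title` are illustrative assumptions.
interface IUser {
  id: number
  name: string
}

interface IBook {
  id: number
  title: string
  authorID: number // references IUser.id
}

const users: IUser[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
]

// Six books, matching the count used later in this walkthrough.
const books: IBook[] = [
  { id: 1, title: "Book 1", authorID: 1 },
  { id: 2, title: "Book 2", authorID: 2 },
  { id: 3, title: "Book 3", authorID: 3 },
  { id: 4, title: "Book 4", authorID: 1 },
  { id: 5, title: "Book 5", authorID: 2 },
  { id: 6, title: "Book 6", authorID: 3 },
]
```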
Let's write a simple resolver for the Book object:
In the above code, we defined an additional author field for Book objects, which returns the User object matching the authorID field. We also defined a query called books that returns all Book objects.
Here, we use the users array directly to find users. For the following query:
We will resolve the author field for each Book instance, and each time we will traverse the users array to find the user that matches the authorID field.
Here we have 6 Book instances, so we will execute 6 lookups. Is there a better way to reduce the number of queries?
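To make the cost concrete, here is a small sketch that counts how often the users array is scanned when each author is looked up on its own. The findAuthor helper, the traversals counter and the sample values are hypothetical and only stand in for the resolver logic described above.

```typescript
interface IUser {
  id: number
  name: string
}

interface IBook {
  id: number
  title: string
  authorID: number
}

// Illustrative sample data (assumed values).
const users: IUser[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
]

const books: IBook[] = [
  { id: 1, title: "Book 1", authorID: 1 },
  { id: 2, title: "Book 2", authorID: 2 },
  { id: 3, title: "Book 3", authorID: 3 },
  { id: 4, title: "Book 4", authorID: 1 },
  { id: 5, title: "Book 5", authorID: 2 },
  { id: 6, title: "Book 6", authorID: 3 },
]

let traversals = 0

// Naive lookup: every call scans the users array on its own.
function findAuthor(book: IBook): IUser | undefined {
  traversals += 1
  return users.find((user) => user.id === book.authorID)
}

const authors = books.map(findAuthor)
console.log(traversals) // 6, one scan per book
```

With a real database, each of these scans would be a separate query, which is exactly the N+1 pattern.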
Next, we'll use the DataLoader to optimize our query.
First, we need to install the dataloader package, for example with npm i dataloader.
In the code above, we used createMemoization to create a useUserLoader function that returns a DataLoader instance.
The memoization function ensures that the same DataLoader instance is always used within the same request.
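The idea of per-request memoization can be sketched as follows. This is not the implementation of createMemoization; memoizePerRequest, FakeLoader and the plain request objects are hypothetical names, and keying a WeakMap by a request object is just one assumed way to scope a value to a request.

```typescript
// Hypothetical helper: creates the value once per request object
// and returns the cached value on every later call for that request.
function memoizePerRequest<TContext extends object, TValue>(
  factory: () => TValue
): (context: TContext) => TValue {
  const cache = new WeakMap<TContext, TValue>()
  return (context) => {
    if (!cache.has(context)) cache.set(context, factory())
    return cache.get(context) as TValue
  }
}

// Placeholder standing in for a DataLoader instance.
class FakeLoader {}

const useLoader = memoizePerRequest<object, FakeLoader>(() => new FakeLoader())

const requestA = {}
const requestB = {}

// Same request: the same instance is reused.
const sameWithinRequest = useLoader(requestA) === useLoader(requestA)
// Different requests: each one gets its own instance.
const sharedAcrossRequests = useLoader(requestA) === useLoader(requestB)

console.log(sameWithinRequest, sharedAcrossRequests) // true false
```

Scoping the loader to a single request keeps its cache from leaking data between requests while still letting every resolver in that request share one batch.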
In createMemoization, we directly constructed the DataLoader instance and passed it a query function. Let's dive into how this query function works:
First, we pass two type parameters to DataLoader: number and IUser | undefined. The first, number, is the type of the keys passed to the DataLoader, i.e. the type of the authorID property of the Book object. The second, IUser | undefined, is the type of the values returned by the DataLoader, i.e., a User object or undefined.
In the query function, we receive an array of authorIDs containing the id of each User object to be loaded. When the author field is resolved on a Book object, the DataLoader automatically merges all the authorIDs within the same request and passes them to the query function.
In the query function, we first create a Set object, which we use to quickly check whether a given id is in the authorIDs array.
We then create a Map object to store the mapping from authorID to User objects.
Next, we iterate through the users array and, for each user whose user.id exists in the authorIDSet, add that user to the authorMap under the key user.id.
Finally, we retrieve the corresponding User objects from authorMap in the order of the authorIDs array and return an array containing those User objects.
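The steps above can be sketched as a standalone function. The name findUsersByIDs and the sample users are assumptions; authorIDSet and authorMap follow the names used in the text.

```typescript
interface IUser {
  id: number
  name: string
}

// Illustrative sample data (assumed values).
const users: IUser[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
]

function findUsersByIDs(authorIDs: readonly number[]): (IUser | undefined)[] {
  // Set for constant-time membership checks.
  const authorIDSet = new Set(authorIDs)
  // Map from user id to the matching user.
  const authorMap = new Map<number, IUser>()

  // Single pass over the users array.
  for (const user of users) {
    if (authorIDSet.has(user.id)) authorMap.set(user.id, user)
  }

  // Return results in the same order as the requested ids;
  // ids without a matching user yield undefined.
  return authorIDs.map((authorID) => authorMap.get(authorID))
}

const result = findUsersByIDs([2, 99, 1])
console.log(result.map((user) => user?.name)) // [ 'Bob', undefined, 'Alice' ]
```

The dataloader package expects its batch function to return a Promise, so a function like this would typically be wrapped, for example as async (authorIDs) => findUsersByIDs(authorIDs).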
The array returned by the query function must have the same length and order as the IDs array, because DataLoader relies on this order to match each result to the request that asked for it. For more information, see the DataLoader documentation.
In this way, we can use the useUserLoader function in BookResolver to load the author field of the Book object.
When the author field is resolved for all 6 Book instances, DataLoader automatically merges these requests and iterates through the users array only once, thus improving performance.
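To illustrate the batching effect without any dependency, here is a deliberately simplified stand-in for a batching loader. TinyLoader is a hypothetical class written for this sketch, not the dataloader package: it only collects keys requested in the same tick and calls the batch function once, and it omits caching and key deduplication. The sample data and counters are also assumptions.

```typescript
interface IUser {
  id: number
  name: string
}

const users: IUser[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
]

const bookAuthorIDs = [1, 2, 3, 1, 2, 3] // authorID of each of the 6 books

type BatchFn<K, V> = (keys: readonly K[]) => Promise<V[]>
type Pending<K, V> = {
  key: K
  resolve: (value: V) => void
  reject: (reason: unknown) => void
}

class TinyLoader<K, V> {
  private queue: Pending<K, V>[] = []
  private batchFn: BatchFn<K, V>

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject })
      // The first key queued in this tick schedules a single flush.
      if (this.queue.length === 1) {
        Promise.resolve().then(() => this.flush())
      }
    })
  }

  private async flush(): Promise<void> {
    const batch = this.queue
    this.queue = []
    try {
      const values = await this.batchFn(batch.map((item) => item.key))
      batch.forEach((item, index) => item.resolve(values[index] as V))
    } catch (error) {
      batch.forEach((item) => item.reject(error))
    }
  }
}

let batchCalls = 0
let traversals = 0

const userLoader = new TinyLoader<number, IUser | undefined>(async (authorIDs) => {
  batchCalls += 1
  traversals += 1 // one pass over users for the whole batch
  const authorMap = new Map<number, IUser>()
  for (const user of users) authorMap.set(user.id, user)
  return authorIDs.map((authorID) => authorMap.get(authorID))
})

// Six loads issued in the same tick are served by one batch call.
const authorsPromise = Promise.all(
  bookAuthorIDs.map((authorID) => userLoader.load(authorID))
)

authorsPromise.then((authors) => {
  console.log(authors.length, batchCalls, traversals) // 6 1 1
})
```

Compared with one lookup per book, the six requests here collapse into a single call of the batch function.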