Dataloader
Due to the flexibility of GraphQL, we often need to execute multiple queries when we load an object's associated objects. This causes the famous N+1 query problem. To solve this problem, we can use DataLoader.
DataLoader reduces the number of database queries by merging multiple requests into a single one. It also caches query results to avoid repeated queries.
Example
Consider the following simple objects, User and Book:
On the Book object, we have an authorID field that references the id field of the User object.
In addition, we need to prepare some simple data:
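A minimal sketch of these objects and data, assuming plain TypeScript interfaces in place of a schema library. The name and title fields and all concrete values are illustrative assumptions; only id, authorID, and the count of six books come from the text.

```typescript
// Plain TypeScript stand-ins for the two objects (schema definitions omitted).
interface User {
  id: number;
  name: string; // assumed field, for illustration only
}

interface Book {
  id: number;
  title: string; // assumed field, for illustration only
  authorID: number; // references User.id
}

// Illustrative data: three users and six books.
const users: User[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
];

const books: Book[] = [
  { id: 1, title: "Book One", authorID: 1 },
  { id: 2, title: "Book Two", authorID: 2 },
  { id: 3, title: "Book Three", authorID: 1 },
  { id: 4, title: "Book Four", authorID: 3 },
  { id: 5, title: "Book Five", authorID: 2 },
  { id: 6, title: "Book Six", authorID: 1 },
];
```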
Let's write a simple resolver for the Book object:
In the above code, we have defined an additional field author for Book objects, which returns the User object matching the authorID field. We also define a query called books that returns all Book objects.
Here, we use the users array directly to find users. For the following query:
We will look up the author field for each Book instance, and in doing so, we will directly traverse the users array to find the user that matches the authorID field.
Here we have 6 Book instances, so we will execute 6 lookups. Is there a better way to reduce the number of queries?
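To make the cost concrete, here is a small library-independent sketch that counts the lookups. The findAuthor helper, the lookups counter, and the sample data are hypothetical stand-ins for the author field resolver and the data described above.

```typescript
interface User { id: number; name: string }
interface Book { id: number; authorID: number }

// Illustrative data (assumed values): three users, six books.
const users: User[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
];
const books: Book[] = [
  { id: 1, authorID: 1 },
  { id: 2, authorID: 2 },
  { id: 3, authorID: 1 },
  { id: 4, authorID: 3 },
  { id: 5, authorID: 2 },
  { id: 6, authorID: 1 },
];

let lookups = 0;

// Naive author resolution: one scan of `users` for every book.
function findAuthor(book: Book): User | undefined {
  lookups += 1;
  return users.find((user) => user.id === book.authorID);
}

const authors = books.map((book) => findAuthor(book));
console.log(lookups); // 6
```

If each lookup were a database query instead of an array scan, this would be six queries for the authors on top of the one that fetched the books.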
Using the DataLoader
Next, we'll use the DataLoader to optimize our query.
We can use the EasyDataLoader class from the @gqloom/core package for basic functionality, or opt for the more popular DataLoader.
Defining Batch Queries
In the code above, we used createMemoization to create a useUserLoader function that returns an EasyDataLoader instance.
The memoization function ensures that the same EasyDataLoader instance is always used within the same request.
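The idea of per-request memoization can be illustrated without the library. The sketch below is not how createMemoization is implemented; memoizePerRequest is a hypothetical helper that takes the request object explicitly and keeps one instance per request in a WeakMap.

```typescript
// Hypothetical helper: returns one instance of T per request object.
function memoizePerRequest<T>(factory: () => T): (request: object) => T {
  const instances = new WeakMap<object, T>();
  return (request) => {
    if (!instances.has(request)) {
      instances.set(request, factory());
    }
    return instances.get(request) as T;
  };
}

// Stand-in for a loader: each instance carries its own cache.
const useLoader = memoizePerRequest(() => ({ cache: new Map<number, string>() }));

const requestA = {};
const requestB = {};
console.log(useLoader(requestA) === useLoader(requestA)); // true
console.log(useLoader(requestA) === useLoader(requestB)); // false
```

Scoping the instance to a request keeps batching and caching effective within one query while preventing cached results from leaking between requests.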
Inside createMemoization, we directly construct an EasyDataLoader instance and pass a query function. Let's delve into how this query function works:
- We pass a batch query function that takes a parameter of type number[] when constructing the EasyDataLoader.
- In the query function, we receive an array of authorIDs containing the id of each User object to be loaded. When we call the author field on the Book object, the DataLoader automatically merges all the authorIDs within the same request and passes them to the query function.
- In the query function, we first create a Set object, which we use to quickly check if an authorID exists in the authorIDs array.
- We then create a Map object to store the mapping between authorID and User objects.
- Next, we iterate through the users array and add each user to the authorMap, keyed by user.id, if that id exists in the authorIDSet.
- Finally, we retrieve the corresponding User objects from authorMap in the order of the authorIDs array and return an array containing those User objects.
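The steps above can be sketched as a standalone function. This is a minimal illustration rather than the original listing: batchLoadUsers and the sample users are assumed names and data, and the function is kept synchronous for brevity, whereas a loader's batch function is usually asynchronous.

```typescript
interface User { id: number; name: string }

// Illustrative data (assumed values).
const users: User[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
];

// Batch query: resolves many author IDs with a single pass over `users`.
function batchLoadUsers(authorIDs: number[]): (User | undefined)[] {
  // Set for fast membership checks on the requested IDs.
  const authorIDSet = new Set<number>(authorIDs);
  // Map from authorID to the matching User.
  const authorMap = new Map<number, User>();
  for (const user of users) {
    if (authorIDSet.has(user.id)) {
      authorMap.set(user.id, user);
    }
  }
  // Return results in the same order as the requested IDs.
  return authorIDs.map((id) => authorMap.get(id));
}

console.log(batchLoadUsers([3, 1, 3]).map((user) => user?.name));
// [ 'Carol', 'Alice', 'Carol' ]
```

An ID with no matching user yields undefined at its position, so the result always has the same length as the input.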
The array returned by the query function must be in the same order as the IDs array. The DataLoader relies on this order to match each result to the request that asked for it.
In this way, we can use the useUserLoader function in BookResolver to load the author field of the Book object.
When calling the author field for all 6 Book instances, DataLoader automatically merges these requests and iterates through the users array only once, thus improving performance.
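The merging behaviour can be demonstrated with a toy loader. TinyLoader below is a hypothetical, simplified stand-in written for this sketch. It is neither EasyDataLoader nor the dataloader package, and it assumes that all load calls made in the same tick belong to one batch.

```typescript
type BatchFn<K, V> = (keys: K[]) => Promise<V[]>;
type Pending<K, V> = { key: K; resolve: (value: V) => void; reject: (error: unknown) => void };

// Hypothetical minimal loader: batches keys queued in the same tick and caches per key.
class TinyLoader<K, V> {
  private batchFn: BatchFn<K, V>;
  private queue: Pending<K, V>[] = [];
  private cache = new Map<K, Promise<V>>();

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn;
  }

  load(key: K): Promise<V> {
    const cached = this.cache.get(key);
    if (cached) return cached;
    const promise = new Promise<V>((resolve, reject) => {
      this.queue.push({ key, resolve, reject });
      // The first queued key schedules a single dispatch for the whole batch.
      if (this.queue.length === 1) {
        Promise.resolve().then(() => this.dispatch());
      }
    });
    this.cache.set(key, promise);
    return promise;
  }

  private async dispatch(): Promise<void> {
    const batch = this.queue;
    this.queue = [];
    try {
      const values = await this.batchFn(batch.map((item) => item.key));
      // Results are matched to requests by position.
      batch.forEach((item, index) => item.resolve(values[index]));
    } catch (error) {
      batch.forEach((item) => item.reject(error));
    }
  }
}

interface User { id: number; name: string }
const users: User[] = [
  { id: 1, name: "Alice" },
  { id: 2, name: "Bob" },
  { id: 3, name: "Carol" },
];
const authorIDs = [1, 2, 1, 3, 2, 1]; // assumed authorID of each of the six books

let batchCalls = 0;
const loader = new TinyLoader<number, User | undefined>(async (ids) => {
  batchCalls += 1;
  const byID = new Map<number, User>(users.map((user): [number, User] => [user.id, user]));
  return ids.map((id) => byID.get(id));
});

function loadAllAuthors(): Promise<(User | undefined)[]> {
  return Promise.all(authorIDs.map((id) => loader.load(id)));
}

loadAllAuthors().then((authors) => console.log(batchCalls, authors.length)); // 1 6
```

Six load calls result in one batch call. Because repeated keys are served from the cache, the batch function receives each distinct ID only once.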