Data Loaders

Due to the flexibility of GraphQL, loading the objects related to a given object often requires executing multiple queries. This leads to the notorious N+1 query problem. To solve it, we can use DataLoader.

DataLoader merges multiple requests into a single one, which reduces the number of database queries. It also caches query results to avoid redundant queries.

The N+1 Query Problem

Consider a scenario where we need to query all users and their respective posts. Our data table structure is as follows:

ts
import { drizzleSilk } from "@gqloom/drizzle"
import { relations } from "drizzle-orm"
import * as t from "drizzle-orm/pg-core"

export const roleEnum = t.pgEnum("role", ["user", "admin"])

export const users = drizzleSilk(
  t.pgTable("users", {
    id: t.serial().primaryKey(),
    createdAt: t.timestamp().defaultNow(),
    email: t.text().unique().notNull(),
    name: t.text(),
    role: roleEnum().default("user"),
  })
)

export const usersRelations = relations(users, ({ many }) => ({
  posts: many(posts),
}))

export const posts = drizzleSilk(
  t.pgTable("posts", {
    id: t.serial().primaryKey(),
    createdAt: t.timestamp().defaultNow(),
    updatedAt: t
      .timestamp()
      .defaultNow()
      .$onUpdateFn(() => new Date()),
    published: t.boolean().default(false),
    title: t.varchar({ length: 255 }).notNull(),
    authorId: t.integer().notNull(),
  })
)

export const postsRelations = relations(posts, ({ one }) => ({
  author: one(users, { fields: [posts.authorId], references: [users.id] }),
}))

A straightforward resolver implementation might look like this:

ts
import { field, query, resolver } from "@gqloom/core"
import { eq } from "drizzle-orm"
import { db } from "src/db"
import { posts, users } from "src/schema"

export const userResolver = resolver.of(users, {
  users: query(users.$list()).resolve(() => db.select().from(users)),

  posts: field(posts.$list())
    .derivedFrom("id")
    .resolve((user) =>
      db.select().from(posts).where(eq(posts.authorId, user.id))
    ),
})

When we execute the following query:

graphql
query usersWithPosts {
  users {
    id
    name
    posts {
      id
      title
    }
  }
}

The backend execution flow will be:

  1. Execute one query to fetch the list of all users (SELECT * FROM users).
  2. For each returned user, execute another query to fetch that user's posts (SELECT * FROM posts WHERE authorId = ?).

If the first query returns N users, fetching their posts takes a total of 1 (fetch users) + N (fetch posts for each user) queries. This is known as the "N+1 Query Problem". When N is large, the extra queries put heavy pressure on the database and become a performance bottleneck.
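The arithmetic can be made concrete with a small, self-contained sketch. It does not use GQLoom or Drizzle; `CountingDb`, its methods, and the sample rows are hypothetical stand-ins that only count how many "queries" the naive approach issues.

```ts
// In-memory stand-ins for the users and posts tables (illustrative data only).
type User = { id: number; name: string }
type Post = { id: number; title: string; authorId: number }

const userRows: User[] = [
  { id: 1, name: "Ada" },
  { id: 2, name: "Grace" },
  { id: 3, name: "Linus" },
]
const postRows: Post[] = [
  { id: 10, title: "Hello", authorId: 1 },
  { id: 11, title: "World", authorId: 1 },
  { id: 12, title: "Notes", authorId: 3 },
]

// Hypothetical fake database that counts how many "queries" it receives.
class CountingDb {
  queryCount = 0

  selectUsers(): User[] {
    this.queryCount += 1
    return userRows
  }

  selectPostsByAuthor(authorId: number): Post[] {
    this.queryCount += 1
    return postRows.filter((p) => p.authorId === authorId)
  }
}

// Mimics the naive resolver: one query for the users, then one query per user.
function loadUsersWithPostsNaively(db: CountingDb) {
  const users = db.selectUsers()
  return users.map((user) => ({
    ...user,
    posts: db.selectPostsByAuthor(user.id),
  }))
}

const naiveDb = new CountingDb()
const naiveResult = loadUsersWithPostsNaively(naiveDb)
console.log(naiveDb.queryCount) // 1 + N = 4 for the three users above
```

With three users the counter ends at four; the count grows linearly with the number of users returned by the first query.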

GQLoom provides powerful tools to elegantly solve this problem.

field().load() Method

The simplest way is to use the field().load() method. It transforms the resolver function from handling a single parent object to handling a batch of parent objects, allowing for bulk data fetching.

The load method accepts an asynchronous function as a parameter. The first parameter of this function is the array of parent objects (parents); the subsequent parameters are the field's input arguments (args). The function must return an array of the same length as the parents array, where each element is the result for the parent object at the same index.

INFO

It is crucial that the returned array strictly matches the order and length of the parents array. DataLoader relies on this order to correctly map results back to each parent object.
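The ordering requirement is easy to violate, because a database may return rows in any order and returns nothing for parents without matches. The following is a minimal sketch of one way to realign rows with their parents; `alignByKey` is a hypothetical helper written for this illustration, not part of GQLoom.

```ts
// Hypothetical helper: returns one result slot per parent, in the parents' order.
function alignByKey<Parent, Row, Key>(
  parentList: readonly Parent[],
  rowList: readonly Row[],
  parentKey: (parent: Parent) => Key,
  rowKey: (row: Row) => Key
): Row[][] {
  const buckets = new Map<Key, Row[]>()
  for (const row of rowList) {
    const key = rowKey(row)
    const bucket = buckets.get(key)
    if (bucket) bucket.push(row)
    else buckets.set(key, [row])
  }
  // Every parent gets an entry, even when no rows matched it.
  return parentList.map((parent) => buckets.get(parentKey(parent)) ?? [])
}

const sampleParents = [{ id: 3 }, { id: 1 }, { id: 2 }]
// Rows arrive in an arbitrary order, and parent 2 has no rows at all.
const sampleRows = [
  { id: 10, authorId: 1 },
  { id: 12, authorId: 3 },
  { id: 11, authorId: 1 },
]

const aligned = alignByKey(
  sampleParents,
  sampleRows,
  (p) => p.id,
  (r) => r.authorId
)
console.log(aligned.map((group) => group.map((r) => r.id)))
// [[12], [10, 11], []]
```

The result has exactly one entry per parent, in the parents' order, which is the shape the returned array needs. The manual bucket loop also avoids relying on Map.groupBy, which is only available in fairly recent runtimes.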

Let's look at an example. To solve the N+1 problem mentioned above, we can modify the resolver like this:

ts
import { field, resolver } from "@gqloom/core"
import { inArray } from "drizzle-orm"
import { db } from "src/db"
import { posts, users } from "src/schema"

export const userResolver = resolver.of(users, {
  posts: field(posts.$list())
    .derivedFrom("id")
    .load(async (userList) => {
      // 1. Fetch all posts for the users at once
      const postList = await db
        .select()
        .from(posts)
        .where(
          inArray(
            posts.authorId,
            userList.map((u) => u.id)
          )
        )

      // 2. Group posts by authorId
      const grouped = Map.groupBy(postList, (p) => p.authorId)

      // 3. Map the posts back to each user in order
      return userList.map((u) => grouped.get(u.id) ?? [])
    }),
})

In the code above, the load function receives a userList array. We extract the id of all users and use the inArray operation to fetch all related posts from the database in a single query. Then, we group the posts by authorId and finally map them back to an array whose order matches userList.

Thus, regardless of how many users we request, the query to the posts table will only be executed once.
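To illustrate that the query count stays constant, here is a self-contained sketch that follows the same three steps (fetch once, group, map back in order) against an in-memory table. `CountingPostsTable`, `loadPostsForUsers`, and the synthetic data are assumptions made for this example and are not GQLoom or Drizzle APIs.

```ts
type Post = { id: number; authorId: number }

// Hypothetical in-memory posts table that counts the queries it serves.
class CountingPostsTable {
  queryCount = 0
  private readonly rows: Post[]

  constructor(rows: Post[]) {
    this.rows = rows
  }

  // Stand-in for `WHERE authorId IN (...)`.
  selectByAuthors(authorIds: number[]): Post[] {
    this.queryCount += 1
    const wanted = new Set(authorIds)
    return this.rows.filter((row) => wanted.has(row.authorId))
  }
}

// Same three steps as the load() callback: fetch once, group, map back in order.
function loadPostsForUsers(
  table: CountingPostsTable,
  userIds: number[]
): Post[][] {
  const found = table.selectByAuthors(userIds)
  const grouped = new Map<number, Post[]>()
  for (const post of found) {
    const bucket = grouped.get(post.authorId)
    if (bucket) bucket.push(post)
    else grouped.set(post.authorId, [post])
  }
  return userIds.map((id) => grouped.get(id) ?? [])
}

// 1000 users with two posts each (synthetic data).
const manyUserIds = Array.from({ length: 1000 }, (_, i) => i + 1)
const manyPosts = manyUserIds.flatMap((authorId) => [
  { id: authorId * 2, authorId },
  { id: authorId * 2 + 1, authorId },
])

const postsTable = new CountingPostsTable(manyPosts)
const postsPerUser = loadPostsForUsers(postsTable, manyUserIds)
console.log(postsPerUser.length, postsTable.queryCount) // 1000 1
```

Even with 1000 parents the table is queried once; the cost moves from the number of round trips to the size of a single IN list.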

LoomDataLoader

field().load() is a convenient API provided by GQLoom, which internally creates and manages DataLoader instances for us. However, in some scenarios, we might need finer control, or want to share the same data loader instance across different resolvers. In such cases, we can use LoomDataLoader.

GQLoom provides the LoomDataLoader abstract class and the EasyDataLoader convenience class for creating custom data loaders.

Custom Data Loaders (LoomDataLoader)

We can create a custom data loader by extending LoomDataLoader and implementing the batchLoad method.

ts
import { LoomDataLoader, field, query, resolver } from "@gqloom/core"
import { createMemoization } from "@gqloom/core/context"
import { inArray } from "drizzle-orm"
import { db } from "src/db"
import { posts, users } from "src/schema"
import * as v from "valibot"

// 1. Create a custom DataLoader
export class UserLoader extends LoomDataLoader<
  number,
  typeof users.$inferSelect
> {
  protected async batchLoad(
    keys: number[]
  ): Promise<(typeof users.$inferSelect | Error)[]> {
    const userList = await db
      .select()
      .from(users)
      .where(inArray(users.id, keys))
    const userMap = new Map(userList.map((u) => [u.id, u]))
    return keys.map(
      (key) => userMap.get(key) ?? new Error(`User ${key} not found`)
    )
  }
}

// 2. Use createMemoization to create a shared loader instance within the request
export const useUserLoader = createMemoization(() => new UserLoader())

// 3. Use it in the resolver
export const postResolver = resolver.of(posts, {
  author: field(users)
    .derivedFrom("authorId")
    .resolve((post) => {
      const loader = useUserLoader()
      return loader.load(post.authorId)
    }),
})

export const userResolver = resolver.of(users, {
  user: query(users)
    .input({ id: v.number() })
    .resolve(({ id }) => {
      const loader = useUserLoader()
      return loader.load(id)
    }),
})

To ensure that each request has an independent data loader instance and to prevent cached data from leaking between requests, we typically combine the loader with the createMemoization function from Context. This creates a singleton loader within the lifecycle of each request.
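The idea of "one instance per request" can be sketched without GQLoom. The helper below is hypothetical and is not how createMemoization is implemented: it takes an explicit request context object as an argument, whereas useUserLoader() in this document takes none. It only shows why per-request instances keep caches apart.

```ts
// Hypothetical per-request memoization helper. This is NOT GQLoom's
// createMemoization; it only illustrates the "one instance per request" idea,
// using an explicit request context object as the cache key.
function memoizePerRequest<T>(factory: () => T): (requestContext: object) => T {
  const instances = new WeakMap<object, T>()
  return (requestContext) => {
    let instance = instances.get(requestContext)
    if (instance === undefined) {
      instance = factory()
      instances.set(requestContext, instance)
    }
    return instance
  }
}

// Stand-in for a loader: all that matters here is that it holds a cache.
class FakeLoader {
  readonly cache = new Map<number, string>()
}

const useFakeLoader = memoizePerRequest(() => new FakeLoader())

const requestA = {} // context object of a first request
const requestB = {} // context object of a second request

useFakeLoader(requestA).cache.set(1, "cached during request A")

console.log(useFakeLoader(requestA) === useFakeLoader(requestA)) // true
console.log(useFakeLoader(requestA) === useFakeLoader(requestB)) // false
console.log(useFakeLoader(requestB).cache.has(1)) // false
```

Keying the instances by a WeakMap means a loader becomes collectable once its request context is no longer referenced, so nothing cached for one request is visible to the next.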

In the UserLoader example, when useUserLoader() is called multiple times within the same GraphQL request, it returns the same UserLoader instance. Multiple calls to loader.load(id) are therefore batched automatically, and batchLoad is executed only once for those calls.
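For intuition about how calls to load() end up in one batch, here is a deliberately small sketch of the general batching technique. `TinyLoader` is a hypothetical class written for this illustration; it is not GQLoom's implementation, and it omits caching entirely.

```ts
type BatchFn<K, V> = (keys: K[]) => Promise<(V | Error)[]>

// Hypothetical minimal loader, for illustration only (not GQLoom's source).
class TinyLoader<K, V> {
  private pending: {
    key: K
    resolve: (value: V) => void
    reject: (error: Error) => void
  }[] = []
  private readonly batchFn: BatchFn<K, V>

  constructor(batchFn: BatchFn<K, V>) {
    this.batchFn = batchFn
  }

  load(key: K): Promise<V> {
    return new Promise<V>((resolve, reject) => {
      this.pending.push({ key, resolve, reject })
      // The first key of a batch schedules a flush that runs after the
      // current synchronous code has finished queuing its keys.
      if (this.pending.length === 1) {
        Promise.resolve().then(() => this.flush())
      }
    })
  }

  private async flush(): Promise<void> {
    const batch = this.pending
    this.pending = []
    try {
      const results = await this.batchFn(batch.map((item) => item.key))
      batch.forEach((item, index) => {
        const result = results[index]
        // An Error in a slot rejects only the matching load() call.
        if (result instanceof Error) item.reject(result)
        else item.resolve(result)
      })
    } catch (error) {
      // A failure of the whole batch rejects every waiting caller.
      const reason = error instanceof Error ? error : new Error(String(error))
      batch.forEach((item) => item.reject(reason))
    }
  }
}

const batchCalls: number[][] = []
const nameLoader = new TinyLoader<number, string>(async (keys) => {
  batchCalls.push(keys)
  return keys.map((key) =>
    key > 0 ? `user-${key}` : new Error(`User ${key} not found`)
  )
})

const demo = Promise.allSettled([
  nameLoader.load(1),
  nameLoader.load(2),
  nameLoader.load(-1),
]).then((settled) => {
  console.log(batchCalls.length) // 1: three load() calls, one batch
  console.log(settled.map((s) => s.status)) // ["fulfilled", "fulfilled", "rejected"]
  return settled
})
```

Because the flush is deferred to a microtask, every key requested in the same synchronous pass is collected first. This is also why the instance must be shared: two separate loader instances would each flush their own batch.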

Convenient Data Loaders (EasyDataLoader)

If you are not a fan of object-oriented programming, you can use EasyDataLoader. It accepts a batchLoad function as a constructor parameter.

The useUserLoader above can be simplified with EasyDataLoader:

ts
import { EasyDataLoader, field, resolver } from "@gqloom/core"
import { createMemoization } from "@gqloom/core/context"
import { inArray } from "drizzle-orm"
import { db } from "src/db"
import { posts, users } from "src/schema"

const useUserLoader = createMemoization(() => {
  return new EasyDataLoader<number, typeof users.$inferSelect>(
    async (keys) => {
      const userList = await db
        .select()
        .from(users)
        .where(inArray(users.id, keys))
      const userMap = new Map(userList.map((u) => [u.id, u]))
      return keys.map(
        (key) => userMap.get(key) ?? new Error(`User ${key} not found`)
      )
    }
  )
})

// The usage in the resolver remains the same
export const postResolver = resolver.of(posts, {
  author: field(users)
    .derivedFrom("authorId")
    .resolve((post) => {
      const loader = useUserLoader()
      return loader.load(post.authorId)
    }),
})
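The class style and the function style can express the same loader. The sketch below shows one plausible way the two relate, under the assumption that a function-based loader simply forwards to the method a subclass would otherwise implement. `SketchLoader`, `SketchEasyLoader`, `loadMany`, and the sample data are hypothetical names for this illustration and do not describe GQLoom's actual classes.

```ts
// Hypothetical base class: subclasses supply batchLoad (class style).
abstract class SketchLoader<K, V> {
  protected abstract batchLoad(keys: K[]): Promise<(V | Error)[]>

  // Loads several keys with a single batchLoad call, de-duplicating keys first.
  async loadMany(keys: K[]): Promise<(V | Error)[]> {
    const uniqueKeys = [...new Set(keys)]
    const results = await this.batchLoad(uniqueKeys)
    const byKey = new Map<K, V | Error>()
    uniqueKeys.forEach((key, index) => byKey.set(key, results[index]))
    return keys.map((key) => byKey.get(key) ?? new Error("Missing result"))
  }
}

// Function style: the batch function is passed to the constructor instead.
class SketchEasyLoader<K, V> extends SketchLoader<K, V> {
  private readonly batchLoadFn: (keys: K[]) => Promise<(V | Error)[]>

  constructor(batchLoadFn: (keys: K[]) => Promise<(V | Error)[]>) {
    super()
    this.batchLoadFn = batchLoadFn
  }

  protected batchLoad(keys: K[]): Promise<(V | Error)[]> {
    return this.batchLoadFn(keys)
  }
}

// Illustrative data and a shared lookup used by both styles.
const sampleNames = new Map<number, string>([
  [1, "Ada"],
  [2, "Grace"],
])
const lookupNames = async (keys: number[]): Promise<(string | Error)[]> =>
  keys.map((key) => sampleNames.get(key) ?? new Error(`User ${key} not found`))

// Class style: override batchLoad in a subclass.
class NameLoader extends SketchLoader<number, string> {
  protected batchLoad(keys: number[]): Promise<(string | Error)[]> {
    return lookupNames(keys)
  }
}

const bothStyles = Promise.all([
  new NameLoader().loadMany([1, 2, 1]),
  new SketchEasyLoader<number, string>(lookupNames).loadMany([1, 2, 1]),
])
```

Both loaders resolve to the same values for the same keys; the choice between them is mostly a matter of whether the batch logic reads better as a method or as a closure.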