How does pagination work with dataloader? about dataloader HOT 9 CLOSED

tonyghita commented on May 3, 2024

How does pagination work with dataloader?

from dataloader.

Comments (9)

leebyron commented on May 3, 2024 54

DataLoader is typically not responsible for pagination. The batching behavior it provides applies to loading many keys in one dispatch to a key-value store. It shouldn't matter if those keys are related as part of a "page" or not. Since fetching ranges of lists doesn't really fit the "load by key" model, it typically happens orthogonally to DataLoader.

I've seen pagination implemented in lots of different ways, each with tradeoffs that may make it a better fit in different scenarios (normalized vs. key-value store, 10 servers vs. 100,000 servers, on-box backend vs. distributed backend, etc.).

Here is one way that is simple to implement: a two-phase loading of paged information.

You'll need to provide two data APIs (let's use SQL as an example):

1. Given some paging criteria, produce a list of keys.
   SELECT id2 FROM friends WHERE id1 = :from AND id2 > :after ORDER BY id2 LIMIT :first
2. Given keys, provide values.
   SELECT * FROM users WHERE id IN (:ids)

When paginating, you'll likely run something like the first query without batching or caching help from DataLoader. The result will be a list of IDs to load data for. You can then provide those IDs to the second kind of query, which is exactly the kind of query DataLoader is good at batching and caching. DataLoader's .loadMany() is intended for this purpose.
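The two phases can be sketched roughly as follows. This is only an illustration in Python using the standard-library sqlite3 module so it runs without dependencies; the schema, the sample rows, and the function names `page_friend_ids` and `batch_load_users` are assumptions, not part of DataLoader. The second function has the shape DataLoader expects from a batch function: one result per requested key, in the same order.

```python
import sqlite3

# Hypothetical schema and sample data mirroring the two queries above.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, PRIMARY KEY (id1, id2));
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan');
    INSERT INTO friends VALUES (1, 2), (1, 3), (1, 4);
""")


def page_friend_ids(from_id, after, first):
    """Phase 1: turn paging criteria into an ordered list of keys."""
    rows = db.execute(
        "SELECT id2 FROM friends WHERE id1 = ? AND id2 > ? "
        "ORDER BY id2 LIMIT ?",
        (from_id, after, first),
    )
    return [row[0] for row in rows]


def batch_load_users(ids):
    """Phase 2: load values for many keys with a single query.

    Returns one entry per requested key, in request order, with None
    for keys that have no matching row.
    """
    if not ids:
        return []
    placeholders = ", ".join("?" for _ in ids)
    rows = db.execute(
        f"SELECT id, name FROM users WHERE id IN ({placeholders})",
        list(ids),
    )
    by_id = {user_id: name for user_id, name in rows}
    return [by_id.get(user_id) for user_id in ids]


# Page of two friends of user 1, starting after friend id 2.
page = page_friend_ids(from_id=1, after=2, first=2)
names = batch_load_users(page)
```

In a real resolver, the list returned by phase 1 would be handed to the loader's .loadMany() rather than calling the batch function directly, so that overlapping keys across the request are batched and cached.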

Advanced techniques

Of course, with SQL you could write this as a join query, though it could result in over-fetching should you have many queries in a single request with a high probability of overlapping (that is, high odds of a DataLoader cache hit). Also, if you want to populate the DataLoader cache after a join query, you'll need to use .prime().
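As a rough sketch of the join variant, the snippet below fetches a page of friends together with their user rows in one query and then primes a cache with each row. It is written in Python with sqlite3 purely for illustration; `RequestCache` is a hypothetical stand-in that only mimics the documented behaviour of .prime() (no change if the key is already cached), and the schema and data are assumptions.

```python
import sqlite3

# Hypothetical schema and sample data.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, PRIMARY KEY (id1, id2));
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cat'), (4, 'dan');
    INSERT INTO friends VALUES (1, 2), (1, 3), (1, 4);
""")


class RequestCache:
    """Hypothetical per-request cache standing in for a DataLoader."""

    def __init__(self):
        self.values = {}

    def prime(self, key, value):
        # Mirrors .prime(): an existing entry for the key is left untouched.
        self.values.setdefault(key, value)


def page_friends_joined(cache, from_id, after, first):
    """Fetch a page of friends and their user rows in one join query."""
    rows = db.execute(
        "SELECT u.id, u.name FROM friends f "
        "JOIN users u ON u.id = f.id2 "
        "WHERE f.id1 = ? AND f.id2 > ? ORDER BY f.id2 LIMIT ?",
        (from_id, after, first),
    ).fetchall()
    for user_id, name in rows:
        # Later loads of these keys in the same request can hit the cache.
        cache.prime(user_id, name)
    return [user_id for user_id, _ in rows]


cache = RequestCache()
ids = page_friends_joined(cache, from_id=1, after=0, first=2)
```

The trade-off mentioned above still applies: the join fetches full user rows even for users that are already cached.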

Other sorts of data storage backends have different best practices for pagination which I recommend you investigate before thinking about how you would use DataLoader alongside it.

h3yduck commented on May 3, 2024 2

I think we can batch the id queries as well using UNION statements without over-fetching:

(SELECT id1, id2 FROM friends WHERE id1 = :from1 AND id2 > :after1 ORDER BY id2 LIMIT :first)
UNION
(SELECT id1, id2 FROM friends WHERE id1 = :from2 AND id2 > :after2 ORDER BY id2 LIMIT :first)
UNION
(SELECT id1, id2 FROM friends WHERE id1 = :from3 AND id2 > :after3 ORDER BY id2 LIMIT :first);

@leebyron's solution + this will always result in 2 SQL queries.
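A minimal sketch of building such a batched query for a variable number of pages, again in Python with sqlite3 for illustration only. Several details are assumptions: the function name `batch_page_friend_ids`, the sample data, the use of UNION ALL instead of UNION (the branches cannot overlap when every id1 is distinct), and the extra `SELECT * FROM (...)` wrapper, which SQLite needs because it does not accept parenthesized branches with their own ORDER BY and LIMIT.

```python
import sqlite3

# Hypothetical sample data.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, PRIMARY KEY (id1, id2));
    INSERT INTO friends VALUES
        (1, 2), (1, 3), (1, 4), (2, 1), (2, 5), (2, 6), (3, 7);
""")


def batch_page_friend_ids(pages, first):
    """Load one page of friend ids per (from_id, after) pair in one query.

    Assumes every from_id appears at most once in `pages`; results are
    returned in the same order as `pages`.
    """
    if not pages:
        return []
    branch = (
        "SELECT * FROM (SELECT id1, id2 FROM friends "
        "WHERE id1 = ? AND id2 > ? ORDER BY id2 LIMIT ?)"
    )
    sql = " UNION ALL ".join(branch for _ in pages)
    params = [
        value
        for from_id, after in pages
        for value in (from_id, after, first)
    ]
    grouped = {}
    for id1, id2 in db.execute(sql, params):
        grouped.setdefault(id1, []).append(id2)
    # Sort in Python so the order does not depend on how the database
    # happens to emit rows from a compound query.
    return [sorted(grouped.get(from_id, [])) for from_id, _ in pages]


result = batch_page_friend_ids([(1, 2), (2, 0), (3, 7)], first=2)
```

A function with this shape could serve as the batch function of a second loader keyed by paging criteria, which is what brings the total down to two queries.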

leebyron commented on May 3, 2024 1

You're correct that a DataLoader is best scoped to a single request.

In my previous comment I explained how DataLoader can help as part of pagination in modeling a "join query" to request a set of elements as part of a single request. Both loading the edges in a page and loading the data at the end of each edge occur within a single request.

amzhang commented on May 3, 2024 1

Just want to add that the first query:

SELECT id2 FROM friends WHERE id1 = :from AND id2 > :after ORDER BY id2 LIMIT :first

is likely an index-only scan, which makes it quite a bit faster.
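One way to check this is to ask the database for its query plan. The sketch below uses SQLite via Python only because it runs without setup; the index name `friends_id1_id2` and the extra `since` column are assumptions. SQLite reports an index-only read as a "COVERING INDEX", while other databases use different wording and tooling (for example EXPLAIN in PostgreSQL or MySQL).

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Hypothetical table with one column the paging query never reads.
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, since TEXT);
    CREATE INDEX friends_id1_id2 ON friends (id1, id2);
""")

plan = db.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id2 FROM friends WHERE id1 = ? AND id2 > ? "
    "ORDER BY id2 LIMIT ?",
    (1, 0, 10),
).fetchall()

# The human-readable description is the fourth column of each plan row.
details = [row[3] for row in plan]
uses_covering_index = any("COVERING INDEX" in detail for detail in details)
```

Because both id1 and id2 live in the index, the query can be answered without touching the table rows at all.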

tonyghita commented on May 3, 2024

Great explanation, that helps a ton. Thanks @leebyron!

tvvignesh commented on May 3, 2024

@leebyron Thanks for the great explanation. One doubt though: since dataloader is typically used to cache on a per-request basis, is this needed? Moving to the next page is a brand new request, so why should we even bother doing that? What if we just store the keys in the dataloader as per the pagination results? The next time the user changes the page, all existing keys in the dataloader are erased anyway, and we store the next set of keys in the dataloader.

Am I right in my assumption?

jychen7 commented on May 3, 2024

@leebyron sorry, I still have a question about how DataLoader works with pagination.

for example,

{
  me {
    name
    followers(first: 3) {
      name
      followers(first: 2) {
        name
      }
    }
  }
}

suppose it is MySQL

followers

id	user_id	follower_id
1	m	a
2	m	b
3	m	c
4	m	d
5	a	e
6	a	f
7	a	g
8	b	h
9	b	i
10	b	j
11	c	k
12	c	l
13	c	m

without pagination, the batch resolver can be

select follower_id from followers where user_id = 'm'; # result is [a, b, c, d]
select user_id, follower_id from followers where user_id IN ('a', 'b', 'c', 'd');

but if first query is

select follower_id from followers where user_id = 'm' limit 3; # result is [a, b, c]

How can we batch-query the first 2 followers for each user in [a, b, c]?

Thanks.

leebyron commented on May 3, 2024

I’m definitely not a SQL expert and don’t have an answer to your question. There may be a way to model what you’re trying to do in a single query but I’m not aware of it. Perhaps https://www.postgresql.org/docs/9.1/static/queries-with.html could be helpful?
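For the per-user limit asked about above, one option that may fit is a window function that numbers each user's followers and keeps only the first few. This is a suggestion not confirmed anywhere in this thread, and it assumes a database with window-function support (MySQL 8.0+, PostgreSQL, or SQLite 3.25+). The sketch runs the query in SQLite via Python against the followers table from the question; the function name `first_followers` and the choice to order by `id` are assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE followers "
    "(id INTEGER PRIMARY KEY, user_id TEXT, follower_id TEXT)"
)
# Sample rows copied from the followers table in the question.
db.executemany(
    "INSERT INTO followers VALUES (?, ?, ?)",
    [
        (1, "m", "a"), (2, "m", "b"), (3, "m", "c"), (4, "m", "d"),
        (5, "a", "e"), (6, "a", "f"), (7, "a", "g"),
        (8, "b", "h"), (9, "b", "i"), (10, "b", "j"),
        (11, "c", "k"), (12, "c", "l"), (13, "c", "m"),
    ],
)


def first_followers(user_ids, first):
    """Return the first `first` followers of each user, in one query."""
    if not user_ids:
        return []
    placeholders = ", ".join("?" for _ in user_ids)
    sql = f"""
        SELECT user_id, follower_id
        FROM (
            SELECT id, user_id, follower_id,
                   ROW_NUMBER() OVER (
                       PARTITION BY user_id ORDER BY id
                   ) AS rn
            FROM followers
            WHERE user_id IN ({placeholders})
        ) AS ranked
        WHERE rn <= ?
        ORDER BY user_id, id
    """
    grouped = {}
    for user_id, follower_id in db.execute(sql, [*user_ids, first]):
        grouped.setdefault(user_id, []).append(follower_id)
    # One list per requested key, in request order, as a batch
    # function would return it.
    return [grouped.get(user_id, []) for user_id in user_ids]


top_level = first_followers(["m"], first=3)[0]
nested = first_followers(top_level, first=2)
```

On databases without window functions, the UNION approach of one limited subquery per user is an alternative.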

jychen7 commented on May 3, 2024

@leebyron we are using MySQL here, but never mind, thanks for your help as well
