
Comments (9)

leebyron commented on May 3, 2024 54

DataLoader is typically not responsible for pagination. The batching behavior it provides applies to loading many keys in one dispatch to a key-value store. It shouldn't matter if those keys are related as part of a "page" or not. Since fetching ranges of lists doesn't really fit the "load by key" model, it typically happens orthogonally to DataLoader.

I've seen pagination implemented in lots of different ways, each with different tradeoffs that may make it a better fit in some scenarios than others (normalized vs. key-value store, 10 servers vs. 100,000 servers, on-box backend vs. distributed backend, etc.).

Here is one way that is simple to implement: a two-phase loading of paged information.

You'll need to provide two data APIs (let's use SQL as an example):

  1. Given some paging criteria, produce a list of keys.
    SELECT id2 FROM friends WHERE id1 = :from AND id2 > :after ORDER BY id2 LIMIT :first

  2. Given keys, provide values.
    SELECT * FROM users WHERE id IN (:ids)

When paginating, you'll likely run something like the first query without batching or caching help from DataLoader. The result is a list of IDs to load data for. You can then pass those IDs to the second kind of query, which is exactly the kind of query DataLoader is good at batching and caching. DataLoader's .loadMany() is intended for this purpose.
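The two phases described above can be sketched roughly as follows. This is only an illustration, not DataLoader itself: it uses Python's built-in sqlite3 module, the schema and sample rows are made up, and `load_many_users` is a hypothetical synchronous stand-in for what .loadMany() does (the real DataLoader also coalesces separate load calls made within the same tick).

```python
import sqlite3

# Hypothetical schema and sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, PRIMARY KEY (id1, id2));
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy'), (4, 'dee'), (5, 'eve');
    INSERT INTO friends VALUES (1, 2), (1, 3), (1, 4), (1, 5);
""")

batch_calls = []  # ids sent to the store on each batched query, for inspection
cache = {}        # per-request cache: user id -> row


def page_of_friend_ids(from_id, after, first):
    """Phase 1: paging criteria in, list of keys out (no batching or caching)."""
    rows = conn.execute(
        "SELECT id2 FROM friends WHERE id1 = ? AND id2 > ? ORDER BY id2 LIMIT ?",
        (from_id, after, first),
    ).fetchall()
    return [row[0] for row in rows]


def load_many_users(ids):
    """Phase 2: keys in, values out; only uncached keys hit the database."""
    missing = [i for i in dict.fromkeys(ids) if i not in cache]
    if missing:
        batch_calls.append(missing)
        placeholders = ",".join("?" * len(missing))
        query = f"SELECT id, name FROM users WHERE id IN ({placeholders})"
        for user_id, name in conn.execute(query, missing):
            cache[user_id] = {"id": user_id, "name": name}
    return [cache.get(i) for i in ids]


page_one = load_many_users(page_of_friend_ids(from_id=1, after=0, first=2))
page_two = load_many_users(page_of_friend_ids(from_id=1, after=3, first=2))
```

Each page costs one key query plus at most one value query, and any user already in the cache is not fetched again.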

Advanced techniques

Of course with SQL you could write this as a join query, though it could result in over-fetching should you have many queries in a single request which have a high probability of overlapping (that is, high odds of a DataLoader cache-hit). Also if you want to populate the DataLoader cache after a join query, you'll need to use .prime().
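As a rough sketch of the join variant, again with Python's sqlite3 and a made-up schema: the page is fetched with a single join, and each returned row is then written into the cache. The `prime` function here is a hypothetical stand-in that copies one documented property of DataLoader's .prime(): a key that is already cached is left unchanged.

```python
import sqlite3

# Hypothetical schema and sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, PRIMARY KEY (id1, id2));
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy'), (4, 'dee');
    INSERT INTO friends VALUES (1, 2), (1, 3), (1, 4);
""")

cache = {}  # per-request cache: user id -> row


def prime(key, value):
    """Stand-in for .prime(): store the value unless the key is already cached."""
    cache.setdefault(key, value)


def friends_page_joined(from_id, after, first):
    """Fetch one page of friends, keys and values together, in a single join."""
    rows = conn.execute(
        "SELECT u.id, u.name FROM friends f JOIN users u ON u.id = f.id2 "
        "WHERE f.id1 = ? AND f.id2 > ? ORDER BY f.id2 LIMIT ?",
        (from_id, after, first),
    ).fetchall()
    users = [{"id": user_id, "name": name} for user_id, name in rows]
    for user in users:
        prime(user["id"], user)  # later loads by key can now be served from cache
    return users


page = friends_page_joined(from_id=1, after=0, first=3)
```

The trade-off mentioned above still applies: the join re-reads user rows even when they were already cached.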

Other sorts of data storage backends have different best practices for pagination which I recommend you investigate before thinking about how you would use DataLoader alongside it.

from dataloader.

h3yduck commented on May 3, 2024 2

I think we can batch the id queries as well using UNION statements without over-fetching:

(SELECT id1, id2 FROM friends WHERE id1 = :from1 AND id2 > :after1 ORDER BY id2 LIMIT :first)
UNION
(SELECT id1, id2 FROM friends WHERE id1 = :from2 AND id2 > :after2 ORDER BY id2 LIMIT :first)
UNION
(SELECT id1, id2 FROM friends WHERE id1 = :from3 AND id2 > :after3 ORDER BY id2 LIMIT :first);

@leebyron's solution + this will always result in 2 SQL queries.
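A minimal sketch of how such a batched key query might be built, assuming a made-up `friends` table and run against SQLite through Python's sqlite3. SQLite does not accept parenthesized branches in a compound SELECT, so each branch is wrapped in `SELECT * FROM (...)` instead; `batched_friend_pages` is a hypothetical helper name, and it assumes each `from` id is requested only once.

```python
import sqlite3

# Hypothetical table and sample rows, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, PRIMARY KEY (id1, id2));
    INSERT INTO friends VALUES
        (1, 2), (1, 3), (1, 4), (1, 5), (2, 1), (2, 5), (2, 6), (3, 1);
""")


def batched_friend_pages(requests, first):
    """Run one UNION statement for several (from_id, after) page requests.

    Returns {from_id: [id2, ...]} with each list in ascending order.
    """
    branch = (
        "SELECT * FROM (SELECT id1, id2 FROM friends "
        "WHERE id1 = ? AND id2 > ? ORDER BY id2 LIMIT ?)"
    )
    sql = " UNION ".join([branch] * len(requests))
    params = [v for from_id, after in requests for v in (from_id, after, first)]

    pages = {from_id: [] for from_id, _ in requests}
    for id1, id2 in conn.execute(sql, params):
        pages[id1].append(id2)
    for ids in pages.values():
        ids.sort()  # do not rely on the row order of the compound query
    return pages


pages = batched_friend_pages([(1, 2), (2, 0), (3, 0)], first=2)
```

The statement grows by one branch per requested page, so very large batches may run into database limits on compound queries.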


leebyron commented on May 3, 2024 1

You're correct that a DataLoader instance is best scoped to a single request.

In my previous comment I explained how DataLoader can help as part of pagination in modeling a "join query" to request a set of elements as part of a single request. Both loading the edges in a page and loading the data at the end of each edge occur within a single request.


amzhang commented on May 3, 2024 1

Just want to add that the first query:

SELECT id2 FROM friends WHERE id1 = :from AND id2 > :after ORDER BY id2 LIMIT :first

is likely an index-only scan, which makes it quite a bit faster.
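One way to check this on a given database is to look at the query plan. The sketch below does that on SQLite via Python's sqlite3, where an index-only read is reported as a "covering index". The table layout, the extra `since` column, and the index name are assumptions made for the example; other databases word their plans differently.

```python
import sqlite3

# Hypothetical table with a non-key column, plus an index on (id1, id2).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE friends (id1 INTEGER, id2 INTEGER, since TEXT);
    CREATE INDEX friends_id1_id2 ON friends (id1, id2);
""")

plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT id2 FROM friends WHERE id1 = ? AND id2 > ? ORDER BY id2 LIMIT ?",
    (1, 0, 10),
).fetchall()

# The last column of each plan row is the human-readable description.
details = [row[3] for row in plan]
uses_covering_index = any("COVERING INDEX" in detail for detail in details)

for detail in details:
    print(detail)
```

Because the query only touches `id1` and `id2`, the index alone can answer it and the table rows never need to be read.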


tonyghita commented on May 3, 2024

Great explanation, that helps a ton. Thanks @leebyron!


tvvignesh commented on May 3, 2024

@leebyron Thanks for the great explanation. One doubt, though. Since DataLoader is most typically used to cache on a per-request basis, is this needed? Moving to the next page is a brand new request, so why should we even bother doing that? What if we just store the keys in the DataLoader as per the pagination results? The next time the user changes the page, all existing keys in the DataLoader are erased anyway, and we store the next set of keys.

Am I right in my assumption?


jychen7 commented on May 3, 2024

@leebyron Sorry, I still have a question about how DataLoader works with pagination.

For example:

{
  me {
    name
    followers(first: 3) {
      name
      followers(first: 2) {
        name
      }
    }
  }
}

suppose it is MySQL

followers

id  user_id  follower_id
1   m        a
2   m        b
3   m        c
4   m        d
5   a        e
6   a        f
7   a        g
8   b        h
9   b        i
10  b        j
11  c        k
12  c        l
13  c        m

Without pagination, the batch resolver can be:

select follower_id from followers where user_id = 'm'; # result is [a, b, c, d]
select user_id, follower_id from followers where user_id in ('a', 'b', 'c', 'd');

But if the first query is

select follower_id from followers where user_id = 'm' order by id limit 3; # result is [a, b, c]

how can we batch the query for the first 2 followers of each user in [a, b, c]?

Thanks.


leebyron commented on May 3, 2024

I’m definitely not a SQL expert and don’t have an answer to your question. There may be a way to model what you’re trying to do in a single query, but I’m not aware of it. Perhaps https://www.postgresql.org/docs/9.1/static/queries-with.html could be helpful?
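For what it's worth, one possible single-query approach that was not proposed in this thread is a window function: number each user's followers with ROW_NUMBER() and keep only the first N per user. The sketch below runs it on SQLite (3.25 or newer) through Python's sqlite3, using the followers table from the question. Whether it applies to the original setup is an assumption, since it needs a database with window function support (MySQL has them from 8.0). `first_followers` is a hypothetical helper name.

```python
import sqlite3

# The followers table from the question above.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE followers (id INTEGER PRIMARY KEY, user_id TEXT, follower_id TEXT)"
)
conn.executemany(
    "INSERT INTO followers VALUES (?, ?, ?)",
    [
        (1, "m", "a"), (2, "m", "b"), (3, "m", "c"), (4, "m", "d"),
        (5, "a", "e"), (6, "a", "f"), (7, "a", "g"),
        (8, "b", "h"), (9, "b", "i"), (10, "b", "j"),
        (11, "c", "k"), (12, "c", "l"), (13, "c", "m"),
    ],
)


def first_followers(user_ids, first):
    """Return {user_id: [follower_id, ...]} with at most `first` followers each."""
    placeholders = ",".join("?" * len(user_ids))
    sql = f"""
        SELECT user_id, follower_id FROM (
            SELECT user_id, follower_id,
                   ROW_NUMBER() OVER (PARTITION BY user_id ORDER BY id) AS rn
            FROM followers
            WHERE user_id IN ({placeholders})
        ) AS ranked
        WHERE rn <= ?
        ORDER BY user_id, rn
    """
    pages = {user_id: [] for user_id in user_ids}
    for user_id, follower_id in conn.execute(sql, [*user_ids, first]):
        pages[user_id].append(follower_id)
    return pages


nested = first_followers(["a", "b", "c"], first=2)
```

A function shaped like this takes a list of keys and returns one page per key, which is the shape a DataLoader batch function expects.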


jychen7 commented on May 3, 2024

@leebyron we are using MySQL here, but never mind, thanks for your help as well

