Comments (8)

lemonmade commented on July 17, 2024

My understanding is that there is no way to implement batching with hooks and suspense, at least not for hooks run in the same component. Asynchronous hooks signal to React that they need to activate the nearest suspense boundary by throwing a promise. This means that, in the actual execution of a component, only the first hook that suspends is ever "seen" — React can’t just move past the error to execute the rest of the component because it may depend on the (non-error) result of the first suspending hook.

I haven't checked, but my guess is that React will at least find all the suspending components that are siblings in a single pass, which should allow for batching across components, but it might be confusing if that same behavior doesn't work within a single component.

from hydrogen-v1.

jplhomer commented on July 17, 2024

Thanks for the deep-dive @lancelafontaine! Super interesting to get all that context. I really like the ergonomics you propose, too.

I think @lemonmade nails it with regard to the hooks run in the same component. Thankfully, it seems unlikely that a developer would query the same data source with different queries in the same component.

However, this raises a bigger issue with how Suspense data fetching works and our decision to use react-query.

As you've unpacked, we recently decided to use react-query instead of our own server-fetching hook. This is because we struggled with our implementation of a cache that persists across suspended render attempts.

Suspense works by throwing pending Promises to the nearest <Suspense> boundary. When those promises resolve, they indicate to the React renderer that it's time to try rendering again. This behavior is abstracted away from the developer and instead lives within data-fetching libraries (like react-query, react-fetch, react-fs, react-pg, etc). This is why asynchronous functions look synchronous when they're really suspending.
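
That throw/resolve contract can be sketched in a few lines. This is a minimal illustration, not React's actual implementation, and `createResource` is a made-up helper rather than a React API:

```javascript
// Minimal sketch of a Suspense "resource": read() looks synchronous to the
// component, but throws the pending Promise so the nearest <Suspense>
// boundary can catch it and retry rendering once it resolves.
// (`createResource` is a made-up helper, not a React API.)
function createResource(promiseFactory) {
  let status = 'pending';
  let result;
  const promise = promiseFactory().then(
    (value) => { status = 'resolved'; result = value; },
    (error) => { status = 'rejected'; result = error; }
  );
  return {
    read() {
      if (status === 'pending') throw promise; // caught by <Suspense>
      if (status === 'rejected') throw result; // caught by an error boundary
      return result; // looks synchronous to the caller
    },
  };
}
```

A data-fetching library then keeps a map of these resources between render attempts, which is exactly the inter-render cache problem described earlier.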

While I was struggling to find a good solution for an inter-render cache, react-query seemed to handle it perfectly. That's pretty much the only reason I chose to adopt it, beyond having really slick useQuery ergonomics. It sounds like you've been diving into the source code more than I have, so it's possible you know even more about that implementation than I do at this point 😄

It's becoming more apparent that our limited use of react-query for server-side data fetching may not be the best fit for Hydrogen. We'll probably end up writing a custom hook which uses React's new (experimental) built-in Suspense cache or adopt Suspense-friendly IO helpers like react-fetch.

So, can batching work with Suspense?

Gosh — maybe?

From what I understand, DataLoader manages an internal clock while it accepts keys to fetch from consumer code. After a certain window has passed (e.g. one tick of the Node.js event loop), it executes a batch request and returns the results to the consumer(s).

The consumers, meanwhile, are waiting on a Promise to resolve. This is where we'd need to adjust our expectations for a Suspense world: instead of returning a Promise, DataLoader would need to throw a promise.
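
For illustration, here is a hypothetical sketch of that adjustment (`createSuspenseLoader` and `batchFetch` are invented names, and this is not DataLoader's actual implementation): keys requested during the same tick share one batch, and load() throws the shared batch Promise instead of returning one.

```javascript
// Hypothetical sketch of a DataLoader-style batcher adapted for Suspense.
// `batchFetch(keys)` is an assumed user-supplied function that resolves
// many keys in a single request.
function createSuspenseLoader(batchFetch) {
  const cache = new Map(); // key -> { status, value, promise }
  let batch = null;        // keys collected during the current tick

  function load(key) {
    let entry = cache.get(key);
    if (!entry) {
      if (!batch) {
        // First uncached key this tick: schedule one flush on the
        // microtask queue (DataLoader uses a similar per-tick window).
        const current = { keys: [] };
        current.promise = Promise.resolve().then(() => {
          batch = null; // later loads start a new batch
          return batchFetch(current.keys).then((values) => {
            current.keys.forEach((k, i) => {
              const e = cache.get(k);
              e.status = 'resolved';
              e.value = values[i];
            });
          });
        });
        batch = current;
      }
      batch.keys.push(key);
      entry = { status: 'pending', promise: batch.promise };
      cache.set(key, entry);
    }
    if (entry.status === 'pending') throw entry.promise; // suspend
    return entry.value; // cached: looks synchronous
  }

  return load;
}
```

In this sketch, two components that call load() during the same render pass end up suspended on the same Promise, and their keys go out in one request.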

Some notes, questions, caveats here:

  • Is this feasible with the way DataLoader works? E.g. does a thrown Promise in the React renderer end the current event-loop tick, meaning DataLoader would move ahead and execute the current (single-request) batch?
  • Based on how a developer structures their project, batching will be inconsistent. E.g. if a server component renders a child server component, the child will not be included in the batch (because the parent will be suspended). Still, if we find the benefits of batching to be of value, it's probably worth building toward this outcome.
  • What can we do next? We should investigate returning to a react-query-less world.

lancelafontaine commented on July 17, 2024

Thank you both for the context! I'm still trying to wrap my mind around Suspense/React's execution model but this discussion is definitely making things a lot clearer 🙇

I think @lemonmade nails it with regard to the hooks run in the same component. Thankfully, it seems unlikely that a developer would query the same data source with different queries in the same component.

Suspense works by throwing pending Promises to the nearest boundary. When those promises resolve, they indicate to the React renderer that it's time to try rendering again.

Gotcha! This was a great explanation, thanks.

I could imagine a given component making two queries to two separate services. With this existing throwing-promises mechanism, it seems like those couldn't even be run in parallel (unless I'm misunderstanding).

I haven't checked, but my guess is that React will at least find all the suspending components that are siblings in a single pass, which should allow for batching across components

Yep, this would be a good opportunity to batch in "two dimensions":

  1. Distinct queries to the same service (e.g. useBatchedGraphQLQuery) across sibling components. This would allow a future in which each component encapsulates the data it needs to render, without incurring the cost of an additional request per component.
  2. The same query to the same service but with different records (e.g. useBatchedRecordsQuery) across sibling components. I could imagine this being useful for many identical components with the same selection sets but different records being rendered on the same page (e.g. one card per article in a blog, one card per product in a collection, etc.).
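
For dimension 1, a deliberately naive sketch of the merging step (`mergeQueries` is a made-up helper; a real implementation would have to handle variables, fragments, and colliding top-level fields via aliases):

```javascript
// Naive sketch: merge several anonymous GraphQL query documents into one
// request by concatenating their top-level selection sets. A real
// implementation would alias colliding fields and merge variable definitions.
function mergeQueries(queries) {
  const selections = queries.map((q) =>
    q.trim().replace(/^query\s*\{/, '').replace(/\}\s*$/, '').trim()
  );
  return `query {\n  ${selections.join('\n  ')}\n}`;
}
```

The batch boundary would call this over all the query descriptors collected during a tick, fire one request, and distribute the aliased results back to each suspended consumer.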

This is where we'd need to adjust our expectations for a Suspense world: instead of returning a Promise, DataLoader would need to throw a promise.

I haven't had the chance to look into the mechanics of this yet, but I'd expect that for batching to occur, these hooks would need to throw a Promise wrapping a literal value that only describes the GraphQL request that would need to be made to fulfill that data (without actually making the fetch calls), and that all dataloading would have to occur at the Suspense boundary where all thrown promises are aggregated and combined into one batched request. Definitely sounds interesting, although I wouldn't be surprised if we needed to dig into React's internals to tweak this behaviour.

E.g. if a server component loads a child server component, the child will not be included in the batch (because the parent will be suspended).

I'll spend more time thinking about this one, but my gut reaction is that without multiple passes, you're probably right that batched data fetching would only apply to sibling components and not child components 🤔 It could be worth investigating whether encouraging a pattern that can't be fully batched due to these constraints would result in a better experience than our current hoist-data-fetching-to-the-highest-level pattern.

lancelafontaine commented on July 17, 2024

Last thought: any idea whether any of this changes in an RSC world? Or any indication of what data fetching might look like then? Mostly asking because the few resources I could find suggest that this sort of Suspense-bound scheduling might not be necessary there, and that we could rely on async libraries that integrate dataloaders more simply (e.g. db.notes.get(props.id) in the official RSC RFC; that RFC even mentions "Waterfalls are still not ideal on the server, so we will provide an API to preload data requests as an optimization.")

jplhomer commented on July 17, 2024

I could imagine one given component making two queries to two separate services within one component. With this existing throwing-promises mechanism, it seems like those couldn't even be run in parallel (unless I'm misunderstanding).

Yeah agreed that they can't run in parallel.

Last thought: any idea whether any of this changes in an RSC world?

Nothing really changes in an RSC world. React 18 introduces streaming SSR, and RSC is a somewhat separate concept of server/client component separation which still leverages the streaming SSR added in React 18.

FWIW, the db.notes.get(props.id) in the RSC RFC is likely alluding to something like react-pg, which is Suspense-based (it throws Promises until the data resolves).

I hadn't heard of the preload optimization you pulled out. I'll ask the React team about that!

jplhomer commented on July 17, 2024

Update: I asked about preloading and got an in-depth answer from Sebastian!


There's no automatic framework that does this for you but all the React I/O APIs should expose a preload() method.

E.g. react-fetch has a fetch(...) API and a preload(...) API.

https://github.com/facebook/react/blob/main/packages/react-fetch/src/ReactFetchNode.js#L226

The principle is that most data fetching should be done in two steps:

  1. preload
  2. read

A preload just starts downloading the thing, whereas a read (e.g. fetch(...)) blocks and returns the actual result.

To achieve optimal performance, it's often necessary to start the preload as early as you think you might need the data. I.e. it can be better to start fetching data you might not end up needing than to wait and spend more time blocked on it. However, if you do that for everything you might flood the network and incur too high a cost. This is not a problem that can be directly automated, since it's a constraint-solving problem and the ideal solution likely involves statistics. So there's a research/framework space to make this ideal.

However, in simple cases it's easy to just add manual preload.

Often that can be in a component closer to the root rather than deep in the tree. If you're always going to need some piece of data, you might as well start loading it as early as possible. However, it would be unnecessary to block on it until deeper in the tree, where you really need it.

Since preloads don't block, you can parallelize by calling multiple preloads.

function App() {
  preload('a.json');
  preload('b.json');
  preload('c.json');
  return <Stuff><Component /></Stuff>;
}

function Component() {
  let a = fetch('a.json').json();
  let b = fetch('b.json').json();
  let c = fetch('c.json').json();
  return <div>{a + b + c}</div>;
}

Sometimes it might not work and you have to put them in the same component but you should still use the same pattern:

function Component() {
  let items = fetch('items.json').json();
  for (let item of items) preload(item.url);
  let children = items.map(item => {
    let data = fetch(item.url).json();
    return <Item data={data} />;
  });
  return <div>{children}</div>;
}

It might be tempting to try to unify them into some kind of Promise.all thing, but that pattern doesn't scale up to help you preload earlier in a completely different component. In fact, IMO, once you get your head around thinking this new way, Promise.all starts looking kind of ridiculous.

Even in the example I used, it's often better to do the read within each item rather than in the parent. That allows React to also parallelize the rendering of children, instead of blocking on all of them loading before it can start work on any of them.

function Component() {
  let items = fetch('items.json').json();
  for (let item of items) preload(item.url);
  let children = items.map(item => {
    return <Item url={item.url} />;
  });
  return <div>{children}</div>;
}

function Item({url}) {
  let data = fetch(url).json();
  // ...
}

I hadn't even considered preload in our world yet. I wonder if we can leverage this for both 3P and 1P queries. Unfortunately, tossing all of our 3P queries into a useQuery-type function makes this difficult.

There's also considerations for interacting with a subrequest cache and whether this method is compatible.

Pinging @igrigorik for visibility!

igrigorik commented on July 17, 2024

First, perhaps as an obvious disclaimer: preloading and batching are entirely different optimizations. I suspect we should spin out the preload discussion into a separate thread. With that as context...

Eager-fetch of critical 1P+3P data is definitely something we should explore. Unlocking it requires either an explicit signal from the developer that a data fetch nested in a component tree should be done early, or automated smarts to learn or heuristically determine these optimizations.

Naively, using Suspense already gets you pretty far by eliminating blocking fetch-and-render-before-proceed boundaries, allowing fetches to be kicked off and streamed in parallel. There might still be room for "start it earlier", but it's not clear how much that will unlock. As an idea, perhaps one thing we could experiment with is adding some form of preload: true option to useQuery, which we could then leverage via static analysis? Plumbing this knowledge into the runtime is another matter, though.

In short, I propose we open a new RFC/discussion thread on preload to explore this space, but my intuition is that proper use of Suspense will already get you pretty far, and we should first gather some telemetry here.

jplhomer commented on July 17, 2024

Update to interested parties: we shipped preloading support in Shopify/hydrogen#700 using preload: true and preload: '*' 🎉

Additional work could be done to analyze and combine GraphQL queries in a way that doesn't create an even slower or more complex resulting query.

For now, we should see how this works for our use case.
