Coder Social home page Coder Social logo

Comments (15)

lukejagodzinski avatar lukejagodzinski commented on May 3, 2024 14

Many good replies here but I will add my solution too:

const ensureOrder = options => {
  const {
    docs,
    keys,
    prop,
    error = key => `Document does not exist (${key})`
  } = options;
  // Put documents (docs) into a map where key is a document's ID or some
  // property (prop) of a document and value is a document.
  const docsMap = new Map();
  docs.forEach(doc => docsMap.set(doc[prop], doc));
  // Loop through the keys and for each one retrieve proper document. For not
  // existing documents generate an error.
  return keys.map(key => {
    return (
      docsMap.get(key) ||
      new Error(typeof error === "function" ? error(key) : error)
    );
  });
};

Usage

new DataLoader(async ids => {
  const users = await getUsersByIds(ids);
  return ensureOrder({
    docs: users,
    keys: ids,
    prop: "id",
    error: id => `User does not exist (${id})`
  });
});

Field will be null if there is no document matching key/ID.
You can also pass error function/string to generate custom error that will be sent back to the client in the errors field.

from dataloader.

andrewreedy avatar andrewreedy commented on May 3, 2024 8

We've faced similar issues with sorting based on keys of multiple types. The risk of returning incorrect data for the wrong key is too high to not be bullet proof. We've built on the previous examples and built out a utility function with tests that may be useful to others. Keys can be in the form of strings, numbers, or objects.

https://github.com/reachifyio/dataloader-sort

We also write tests for all our our dataloaders just to be 100% sure.

from dataloader.

joanrodriguez avatar joanrodriguez commented on May 3, 2024 6

I do it in Javascript like this:

  static findByKeys(keys) {
    return db.select(...this.fields)
      .from('table_name')
      .whereIn('id', keys)
      .then(rows => keys.map(k => rows.filter(r => r.id === k)[0]));
  }

from dataloader.

ThisIsMissEm avatar ThisIsMissEm commented on May 3, 2024 3

What about using an object for the batch return: { [key:any]: any }, that is, when you return you return a hash of values keyed by their ID.

Granted, this may be less performant in cases where you can rely on your database to return a sorted set of values, with undefined values representing values that don't exist.

from dataloader.

ericclemmons avatar ericclemmons commented on May 3, 2024 2

I use:

db(tableName)
  .whereIn("id", ids)
  .orderByRaw(`FIELD(id, ${ids})`)
  .then(this.normalize)

But I'm wondering the same thing: how to maintain key order for when the new results are merged with existing.

from dataloader.

panetta-net-au avatar panetta-net-au commented on May 3, 2024 1

From my point of view, I think there has been sufficient discussion in this issue to close it but I'll leave that up to leebyron.

I think the reworked documentation is sufficient to at least push people in the right direction and the suggestions provided have given me some reasonable options.

For me, any of the below will work fine for sorting

  • writing my own helper method
  • finding an NPM package
  • leveraging database sorting as suggested by ericclemmons

Thanks all for your helpful and detailed comments

from dataloader.

JeffreySoriano5 avatar JeffreySoriano5 commented on May 3, 2024 1

Here is a helper function made by me and @jorgecuesta which we think could be useful for people facing the same problem. It also has better performance compared to the above sort solution.

Here is the gist: https://gist.github.com/JeffreySoriano5/05e8c66f3b1fbec1d20403a774b0c15d

Hope someone finds it useful.

from dataloader.

nagarajay avatar nagarajay commented on May 3, 2024

Great Observation! I also ran into this problem. I am working with MySQL, GraphQL and Sequelize and realized that when using dataloader, the key to the value returned from the promise has to be in sync or else the results are incorrect.(Not to mention the length of the Array has to be maintained with the same number of returned values)
Also faced a lot of problems with JOINS as the order was pretty random. It was totally dependent on the type of JOIN that I had marked. Later I realized that in some I needed INNER JOIN and in some LEFT OUTER JOIN. Finally I realized that the problem is not with the JOINS but how Dataloader treats the batch load function.
@leebyron If we could have a predictable behavior with respect to some sorting in place it would be a boon. This is my opinion.

@Solution
Currently I have used a helper function to maintain the output from the Database promise so that the returned values are maintained in the same order as the index of the input array and also maintain the number of returned values.

Apologies if this sounds confusing.

Look forward to some direction in this regard.

Thanks

from dataloader.

leebyron avatar leebyron commented on May 3, 2024

Thanks for the report. I just added a section in the Readme that made the constraints of the batch function more clear, hopefully that aids future confusion.

I took a look at your gist and it's definitely non performant, but I don't believe that it's inherent to the initial order of the batched keys, but just an artifact of the implementation you used.

Specifically the "push-filter" implementation is O(N^2) with respect to the number of keys being batched, it requires copying, and filter is typically not optimized by the VM. As a result, it's not very fast compared to sorting. Also, because of the N^2 factor, if your set of batched keys is typically small, then even push-filter could be fine. With your gist using 100 keys per batch (which seems pretty high to me) it's of course significantly slower, but with 5-10 keys, it's a little slower but not extremely so.

However that is also not the only possible implementation of re-ordering results, and others can perform much better, even beating that of sorting keys and values. I created a variation of your gist with one example (though certainly not the only possible one) which builds an index map to sort the values. It's a bit faster than simply sorting both.

However both of these techniques may fall short if your backend doesn't return anything for non-existent data.

Of course, if your backend returns results matching the order they were requested, then you wouldn't need to do anything at all.

I'm certainly open to ideas on how to improve the library, but the goal should be to provide as thin a solution as is reasonable so as not to create new performance burdens where they're not necessary and to keep the set of things you need to learn low.

from dataloader.

leebyron avatar leebyron commented on May 3, 2024

That might be tricky since not all keys can be used as object property names, for instance MongoDB keys are often represented as objects. Also, this return style wouldn't work for non-caching DataLoaders which could have duplicate keys.

However, I think there's ample npm territory for anyone to seek out a transform function that does this! It would only work in some circumstances, but it could be a nice utility for those cases

from dataloader.

Shahor avatar Shahor commented on May 3, 2024

@joanrodriguez Yes but then you would have to do that in almost every DataLoader which gets annoying pretty fast :|

@andrewreedy 's packaged solution could do the trick for now

from dataloader.

leebyron avatar leebyron commented on May 3, 2024

Great to see all the various approaches here! Ecosystems ftw!

from dataloader.

kihonq avatar kihonq commented on May 3, 2024

Many good replies here but I will add my solution too:

const ensureOrder = options => {
  const {
    docs,
    keys,
    prop,
    error = key => `Document does not exist (${key})`
  } = options;
  // Put documents (docs) into a map where key is a document's ID or some
  // property (prop) of a document and value is a document.
  const docsMap = new Map();
  docs.forEach(doc => docsMap.set(doc[prop], doc));
  // Loop through the keys and for each one retrieve proper document. For not
  // existing documents generate an error.
  return keys.map(key => {
    return (
      docsMap.get(key) ||
      new Error(typeof error === "function" ? error(key) : error)
    );
  });
};

Usage

new DataLoader(async ids => {
  const users = await getUsersByIds(ids);
  return ensureOrder({
    docs: users,
    keys: ids,
    prop: "id",
    error: id => `User does not exist (${id})`
  });
});

Field will be null if there is no document matching key/ID.
You can also pass error function/string to generate custom error that will be sent back to the client in the errors field.

Not sure why, but I had to toString() the key and the prop or else it keeps throwing me error.

from dataloader.

lukejagodzinski avatar lukejagodzinski commented on May 3, 2024

@qeehong does the error come from the ensureOrder function or DataLoader or just database? It's very unlikely that this code will have this kind of problem.

from dataloader.

kihonq avatar kihonq commented on May 3, 2024

@qeehong does the error come from the ensureOrder function or DataLoader or just database? It's very unlikely that this code will have this kind of problem.

It's coming from the ensureOrder. Or maybe it just me because I need to use ObjectId for the key. Had to make following changes:

const ensureOrder = (options) => {
  const { docs, keys, prop, error = (key) => `Document does not exist (${key})` } = options;
  const docsMap = new Map();
  docs.forEach((doc) => {
    const stringKey = doc[prop].toString();
    docsMap.set(stringKey, doc);
  });

  return keys.map((key) => {
    const stringKey = key.toString();
    return (
      docsMap.get(stringKey) ||
      new Error(typeof error === 'function' ? error(stringKey) : error)
    );
  });
};

from dataloader.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.