overblog / dataloader-php

DataLoaderPhp is a generic utility to be used as part of your application's data fetching layer to provide a simplified and consistent API over various remote data sources such as databases or web services via batching and caching.

License: MIT License

PHP 100.00%
dataloader cache php batch graphql

dataloader-php's People

Contributors

alafon, jpastoor, mcg-web, owlycode, renatomefi, ruudk, simpod, vhenzl

dataloader-php's Issues

Correct use without deprecated `GraphQL::execute()`?

GraphQL::execute() was deprecated many versions ago.

I'm aware of two examples of how to use dataloaders with webonyx/graphql-php:

A test in this project:

$response = GraphQL::execute($schema, $query);

And https://github.com/mcg-web/sandbox-dataloader-graphql-php. But both use this deprecated API.

What's the correct use without deprecated GraphQL::execute() (particularly in combination with the "native" SyncPromise)?

GraphQL::executeQuery() can't be used, as it internally always uses a GraphQL\Executor\Promise\Adapter\SyncPromiseAdapter instance, not the Overblog\DataLoader\Promise\Adapter\Webonyx\GraphQL\SyncPromiseAdapter instance set with GraphQL::setPromiseAdapter(), and its execution results in a "GraphQL\Error\InvariantViolation: Could not resolve promise" error.

Judging from this comment, the whole use of GraphQL::setPromiseAdapter() should be considered deprecated, and GraphQL::promiseToExecute() should be used in all cases that don't use GraphQL\Executor\Promise\Adapter\SyncPromiseAdapter.

Given these adapters:

use Overblog\DataLoader\Promise\Adapter\Webonyx\GraphQL\SyncPromiseAdapter;
use Overblog\PromiseAdapter\Adapter\WebonyxGraphQLSyncPromiseAdapter;

$graphQLPromiseAdapter = new SyncPromiseAdapter();
$dataLoaderPromiseAdapter = new WebonyxGraphQLSyncPromiseAdapter($graphQLPromiseAdapter);

I see two possible solutions. Either:

$promise = GraphQL::promiseToExecute($graphQLPromiseAdapter, $schema, $query /* ... */);
$result = $graphQLPromiseAdapter->wait($promise);

which kinda replicates what GraphQL::execute() does, or:

$promise = GraphQL::promiseToExecute($graphQLPromiseAdapter, $schema, $query /* ... */);
$result = DataLoader::await($promise)->toArray();    

Is that correct? Both work and seem to behave identically. Is there any real difference between them? Any pros and cons of using one over the other?

Promise returned from DataLoader::{load,loadMany} is not "thenable"

Forget it.

I spent 3 hours looking for the problem, finally decided to open an issue, and exactly 3 minutes later I noticed a missing return, so the resolved promise was never enqueued.

Had I kept looking for another 2 hours I would not have found it. It's when you post that you see things...

And the whole time I was thinking: "it did work, it can't be the DataLoader". So the lesson is: listen to your mind when it's whispering hints to you...

Naming confusion

The interface naming ties a knot in my head every time I want to do something, even though I understand the inner workings of the two:

  • Overblog\PromiseAdapter\PromiseAdapterInterface
  • GraphQL\Executor\Promise\PromiseAdapter

The first is used to promise the IDs, while the second is then'd with all IDs to resolve the final data using the batch load callback. Or is it not?

All the overblog related packages are very helpful, but regarding the naming I always get knots in my head ;(

The second comes from the webonyx project and makes sense: it is an interface for a promise-based result. Maybe the first should indicate that it is an aggregate, higher-order, or collector promise (something in that direction)?

I know, naming things is not easy :( .

http://hilton.org.uk/blog/why-naming-things-is-hard

Log activities to help profiling / debugging

As a Symfony developer, I want to build a DataCollector to learn more about our usage during development.

A dedicated DataLoaderLogger with logged get and cached activities could be used to develop such a DataCollector.

  • How many times keys have been fetched, with a HIT / MISS ratio.
  • How many batches have been executed to fetch data.
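
A minimal sketch of the counters such a logger could keep, written without any dependency on the library. LoadStats and its method names are hypothetical and not part of overblog/dataloader-php; the calls would still have to be wired in wherever keys are looked up and batches are dispatched.

```php
<?php

// Hypothetical collector; not part of overblog/dataloader-php.
final class LoadStats
{
    private int $hits = 0;
    private int $misses = 0;
    private int $batches = 0;

    // Record one key lookup and whether it was served from the cache.
    public function recordLoad(bool $cacheHit): void
    {
        if ($cacheHit) {
            $this->hits++;
        } else {
            $this->misses++;
        }
    }

    // Record one execution of the batch load function.
    public function recordBatch(): void
    {
        $this->batches++;
    }

    // Fraction of lookups served from the cache (0.0 when nothing was loaded).
    public function hitRatio(): float
    {
        $total = $this->hits + $this->misses;

        return $total === 0 ? 0.0 : $this->hits / $total;
    }

    public function toArray(): array
    {
        return [
            'hits' => $this->hits,
            'misses' => $this->misses,
            'batches' => $this->batches,
            'hit_ratio' => $this->hitRatio(),
        ];
    }
}

$stats = new LoadStats();
$stats->recordLoad(false); // first request for a key: miss
$stats->recordLoad(true);  // same key again: hit
$stats->recordLoad(false); // another new key: miss
$stats->recordBatch();     // both misses fetched in one batch

print_r($stats->toArray());
```

A Symfony DataCollector could then expose the array returned by toArray() in the profiler.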

Memory leak at WebonyxGraphQLSyncPromiseAdapter

In \Overblog\PromiseAdapter\Adapter\WebonyxGraphQLSyncPromiseAdapter::create(), $canceller is always added to the array but never removed.
The problem begins here: $this->cancellers[spl_object_hash($promise)] = $canceller;

How to ensure keys get mapped to the correct objects?

Hi, I'm not sure if it's an issue or just a misunderstanding. Suppose we have the following query:

{
  u1: user (id: 25) {
    id
  }
  u2: user (id: 30) {
    id
  } 
  u3: user (id: 20) {
    id
  }
}

And our batchLoadFn is something like this (Laravel):

...
$userLoader = new DataLoader(function ($ids) use ($promiseAdapter) {
  $collection = User::whereIn('id', $ids)->get();

  return $promiseAdapter->createFulfilled($collection);
}, $promiseAdapter);

Since there's no key mapping between the collection items and the keys, this will give us the following incorrect result (the resulting user objects are in the wrong order):

{
  "data": {
    "u1": {
      "id": "20"
    },
    "u2": {
      "id": "25"
    },
    "u3": {
      "id": "30"
    }
  }
}

How can I ensure the keys are mapped to the correct objects?
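
One common way to meet the ordering requirement is to index the fetched rows by ID and then walk the requested keys in order. The sketch below uses plain arrays in place of the Eloquent collection; orderByKeys is a hypothetical helper name, and returning null for a key with no matching row is an assumption made for the example.

```php
<?php

/**
 * Return one entry per requested key, in the same order as $keys.
 * Keys without a matching row are represented by null (assumption).
 */
function orderByKeys(array $keys, array $rows, string $idField = 'id'): array
{
    // Index the rows by their ID so each key can be looked up directly.
    $byId = [];
    foreach ($rows as $row) {
        $byId[$row[$idField]] = $row;
    }

    return array_map(
        static fn ($key) => $byId[$key] ?? null,
        $keys
    );
}

// The database returns rows sorted by id, not in the requested order.
$rows = [['id' => 20], ['id' => 25], ['id' => 30]];
$ordered = orderByKeys([25, 30, 20, 99], $rows);

echo json_encode($ordered), "\n"; // [{"id":25},{"id":30},{"id":20},null]
```

In the batch function from the question, the reordered array is what would be passed to createFulfilled(). With an Eloquent collection, keyBy('id') could play the role of the $byId index.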

DataLoader::await resolves all loaders

Hi,
great job on the lib, guys! It really does bring cool possibilities to PHP-driven apps :)

Quite recently I started looking into introducing DataLoader for a new kind of API in my project. While it fits the use case really well, it has one downside. I managed to work around it, but perhaps the fix could be a good addition to the lib in general.

In the project I have a couple of independently sourced loaders for different kinds of data. All of the loaders but one return promises that are resolved during the serialization/normalization of the API response, and this gives a great result: the code is clean and the number of backend/database roundtrips is significantly reduced.
However, since I have to resolve one loader's data at runtime while the response is being generated, I use the DataLoaderInterface::await method, and the issue is that it resolves ALL the promises from all of the loaders. It's understandable that the method exists and works the way it does, but in this particular case, instead of having x(amount of loaders - 1) + y(unique runtime resolutions) roundtrips, I end up with x*y roundtrips.

As a workaround, I prepared a new extending interface that adds a non-static awaitSingle method. In the implementation, instead of using Overblog\DataLoader\Promise\Adapter\Webonyx\GraphQL\SyncPromiseAdapter I use GraphQL\Executor\Promise\Adapter\SyncPromiseAdapter, which allows resolving only a specific loader.

Any thoughts on the topic? I could give it a try and prepare the appropriate implementation if this would sound like something you'd like to have in the solution :)

Returned data preconditions

– The Array of values must be the same length as the Array of keys.
– Each index in the Array of values must correspond to the same index in the Array of keys.

This confused me. Can you point me to the part of the code where I can examine this logic? Thanks!
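
To illustrate what the two preconditions mean in practice, here is a standalone shape check. It is not the library's actual validation code; assertBatchResult is a hypothetical name, and the check can only verify the shape of the result, not that each value really belongs to the key at the same position.

```php
<?php

// Hypothetical check mirroring the two documented preconditions.
function assertBatchResult(array $keys, $values): void
{
    if (!is_array($values)) {
        throw new UnexpectedValueException('Batch function must resolve to an array.');
    }
    // Precondition 1: same length as the array of keys.
    if (count($values) !== count($keys)) {
        throw new UnexpectedValueException(sprintf(
            'Expected %d values, got %d.',
            count($keys),
            count($values)
        ));
    }
    // Precondition 2: positional correspondence needs a 0..n-1 indexed list.
    if ($values !== [] && array_keys($values) !== range(0, count($values) - 1)) {
        throw new UnexpectedValueException('Values must be a sequential list.');
    }
}

assertBatchResult([25, 30, 20], ['user25', 'user30', 'user20']); // passes

try {
    assertBatchResult([25, 30, 20], ['user20', 'user25']);
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), "\n"; // Expected 3 values, got 2.
}
```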

Error when doing a GraphQL request: "Cannot change rejection reason"

Hello,

When I try to execute a request in the production environment, I get this error:

{"code":500,"message":"Internal Server Error"}
Fatal error:  Uncaught Exception: Cannot change rejection reason in /my/local/directory/poc/vendor/webonyx/graphql-php/src/Executor/Promise/Adapter/SyncPromise.php:63
Stack trace:
#0 /my/local/directory/poc/vendor/overblog/dataloader-php/src/DataLoader.php(418): GraphQL\Executor\Promise\Adapter\SyncPromise->reject(Object(RuntimeException))
#1 /my/local/directory/poc/vendor/overblog/dataloader-php/src/DataLoader.php(367): Overblog\DataLoader\DataLoader->failedDispatch(Array, Object(RuntimeException))
#2 /my/local/directory/poc/vendor/overblog/dataloader-php/src/DataLoader.php(349): Overblog\DataLoader\DataLoader->dispatchQueueBatch(Array)
#3 /my/local/directory/poc/vendor/overblog/dataloader-php/src/DataLoader.php(225): Overblog\DataLoader\DataLoader->dispatchQueue()
#4 /my/local/directory/poc/vendor/overblog/dataloader-php/src/DataLoader.php(297): Overblog\DataLoader\DataLoader->process()
#5 /my/local/directory/poc/vendor/over in /my/local/directory/poc/vendor/webonyx/graphql-php/src/Executor/Promise/Adapter/SyncPromise.php on line 63

Something happens in the destructor, but I have no idea why.
Is it misconfigured?

When I debug, the message passed into the GraphQL\Error\Error exception is "DataLoader destroyed before promise complete.".

Everything works fine without the loader.
My end-to-end tests are green with the dataloader (which is disturbing...).

My graphql configuration is:

overblog_dataloader:
    defaults:
        promise_adapter: "overblog_dataloader.webonyx_graphql_sync_promise_adapter"
        options:
            batch: true
            cache: true
            max_batch_size: 100
            cache_map: "overblog_dataloader.cache_map"

    loaders:
        users:
            alias: "attribute_dataloader"
            batch_load_fn: "@pim_research.infrastructure.persistence.database.cached_attribute_repository:withCodes"

My type declaration:

    pim_research.infrastructure.delivery.api.graphql.type.family_type:
        class: '%pim_research.infrastructure.delivery.api.graphql.type.family_type.class%'
        arguments:
            - '@pim_research.infrastructure.delivery.api.graphql.types'
            - '@pim_research.infrastructure.persistence.database.cached_attribute_repository'
            - '@attribute_dataloader'
        tags:
            - { name: 'pim_research.infrastructure.delivery.api.graphql.type' }
        lazy: true

and my class:

class FamilyType extends ObjectType
{
    public function __construct(Types $types, AttributeRepository $attributeRepository, DataLoader $attributeDataLoader)
    {
        $config = [
            'name' => 'family',
            'description' => 'Family',
            'fields' => function() use ($types) {
                return [
                    'code' => Type::string(),
                    'attributes' => Type::listOf($types->get(AttributeType::class))
                ];
            },
            'resolveField' => function(Family $family, $args, $context, ResolveInfo $info) use ($attributeRepository, $attributeDataLoader) {
                switch ($info->fieldName) {
                    case 'code':
                        return $family->code()->getValue();
                    case 'attributes':
                        return $attributeDataLoader->loadMany($family->attributeCodes());
                        //return $attributeRepository->withCodes($family->attributeCodes());
                    default:
                        return null;
                }
            }
        ];
        parent::__construct($config);
    }
}

My dependency versions:

"webonyx/graphql-php": "v0.11.2"
"overblog/dataloader-php": "v0.5.2"
"overblog/dataloader-bundle": "v0.4.0"

Thanks.

Promise never triggered

I used the example here, but the promise returned by the loadMany method never gets triggered. Why?

class CustomerLoader
{
    /**
     * @var WebonyxGraphQLSyncPromiseAdapter
     */
    private $promiseAdapter;

    public function __construct($promiseAdapter)
    {
        $this->promiseAdapter = $promiseAdapter;
    }

    public function all(array $userIds)
    {
        $users = CSessionInfo::query()
            ->whereIn('id', $userIds)
            ->get()
            ->groupBy('id');

        $indexed = array_reduce($userIds, function ($curry, $current) use ($users) {
            $curry[] = $users->has($current) ? $users->get($current) : null;
            return $curry;
        }, []);

        return $this->promiseAdapter->all($indexed);
    }

    public function getLoader()
    {
        return new DataLoader(function (array $userIds) {
            $this->all($userIds);
        }, $this->promiseAdapter);
    }

    // This method is mapped to the author field.
    public function resolveAuthor($parent)
    {
        return $this->getLoader()->load($parent['authorId']);
    }
}

$graphQLPromiseAdapter = new SyncPromiseAdapter();
$dataLoaderPromiseAdapter = new WebonyxGraphQLSyncPromiseAdapter($graphQLPromiseAdapter);
$userLoader = new DataLoader(function ($keys) { /* ... */ }, $dataLoaderPromiseAdapter);

GraphQL::setPromiseAdapter($graphQLPromiseAdapter);

Throw/Unwrap PHP7 \Error in await()

PHP7 introduced the concept of Errors in addition to Exceptions. (Both implement the shared \Throwable interface).

The current behaviour of await() is to check whether $rejectedReason is an \Exception, but this does not include errors like TypeError. So when a TypeError is thrown in the promise chain, it is discarded in await(), which makes for some hard-to-debug situations.

Would you be open to a pull request that also checks whether $rejectedReason is a \Throwable?

if ($isPromiseCompleted) {
    // rejected ?
    if ($rejectedReason instanceof \Exception || (interface_exists('\Throwable') && $rejectedReason instanceof \Throwable)) {
        if (!$unwrap) {
            return $rejectedReason;
        }
        throw $rejectedReason;
    }

    return $resolvedValue;
}
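
The distinction can be checked in isolation. The snippet below triggers a TypeError under PHP 7+ and shows that it fails the \Exception check but passes the \Throwable one; needsInt is a throwaway function for the demonstration only.

```php
<?php

function needsInt(int $n): int
{
    return $n;
}

$reason = null;
try {
    // Passing an array where an int is declared raises a TypeError.
    needsInt([]);
} catch (\Throwable $caught) {
    $reason = $caught;
}

var_dump($reason instanceof \Exception); // bool(false)
var_dump($reason instanceof \Throwable); // bool(true)
var_dump($reason instanceof \TypeError); // bool(true)
```

A check limited to instanceof \Exception would therefore let this rejection reason slip through unnoticed.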

Context for fetching

Hello. I need to load data from an Elasticsearch instance. Everything works fine, but my issue is overfetching: data is read that is not used in the response (for instance, querying for the name field also reads a description field that is not actually needed).

So I was thinking about something like a context for every load. The queue would be grouped by unique context values, and one request would be made for each group. In my case, the context would be the fields that I need.

If this sounds OK, I can try a pull request. Or maybe there is another way of doing this that I'm not aware of.
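
A sketch of the grouping step described above, assuming each queued load carries a key plus the list of fields it needs. groupByContext and the shape of the queue are hypothetical and only illustrate the idea; this is not an existing DataLoader feature.

```php
<?php

/**
 * Group queued loads by their context so that one request can be made per
 * distinct set of fields. Field order is ignored when comparing contexts.
 *
 * @param array<int, array{key: mixed, fields: string[]}> $queue
 * @return array<string, array{fields: string[], keys: array}>
 */
function groupByContext(array $queue): array
{
    $groups = [];
    foreach ($queue as $item) {
        // Normalize the field list so ['a', 'b'] and ['b', 'a'] match.
        $fields = array_values(array_unique($item['fields']));
        sort($fields);
        $contextId = implode(',', $fields);

        if (!isset($groups[$contextId])) {
            $groups[$contextId] = ['fields' => $fields, 'keys' => []];
        }
        $groups[$contextId]['keys'][] = $item['key'];
    }

    return $groups;
}

$queue = [
    ['key' => 1, 'fields' => ['name']],
    ['key' => 2, 'fields' => ['name', 'description']],
    ['key' => 3, 'fields' => ['name']],
    ['key' => 4, 'fields' => ['description', 'name']],
];

foreach (groupByContext($queue) as $group) {
    printf(
        "fields=[%s] keys=[%s]\n",
        implode(',', $group['fields']),
        implode(',', $group['keys'])
    );
}
// fields=[name] keys=[1,3]
// fields=[description,name] keys=[2,4]
```

Each group would then translate into one Elasticsearch request restricted to that group's fields.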

Thanks.

Problem with 1:N relation

The problem is apparent even in the readme of this package:

$userType = new ObjectType([
  'name' => 'User',
  'fields' => function () use (&$userType, $userLoader, $dbh) {
     return [
            'name' => ['type' => Type::string()],
            'bestFriend' => [
                'type' => $userType,
                'resolve' => function ($user) use ($userLoader) {
                    return $userLoader->load($user['bestFriendID']);
                }
            ],
            'friends' => [
                'args' => [
                    'first' => ['type' => Type::int() ],
                ],
                'type' => Type::listOf($userType),
                'resolve' => function ($user, $args) use ($userLoader, $dbh) {
                    $sth = $dbh->prepare('SELECT toID FROM friends WHERE fromID=:userID LIMIT :first');
                    $sth->bindParam(':userID', $user['id'], PDO::PARAM_INT);
                    $sth->bindParam(':first', $args['first'], PDO::PARAM_INT);
                    $sth->execute();
                    $friendIDs = $sth->fetchAll(PDO::FETCH_COLUMN);

                    return $userLoader->loadMany($friendIDs);
                }
            ]
        ];
    }
]);

Now if I ask for several users + their best friends it will work nicely. All best friends will be loaded together.

{
  users {
    name
    bestFriend {
      name
    }
  }
}

However, if I ask for several users + 5 friends for each user, then the query SELECT toID FROM friends WHERE fromID=:userID LIMIT :first will be repeated separately for each user instead of being run just once. So the N+1 problem is not really gone, just better hidden.

{
  users {
    name
    friends(first: 5) {
      name
    }
  }
}

How should I eliminate this behavior?

Support for PHP 8

The library should be updated to support PHP 8.

In doing so, support for older PHP versions could be dropped, the tests for PHP 7.3 (currently failing) should be fixed, and tests for PHP 7.4 should be added. Also, dev dependencies should be bumped: webonyx/graphql-php is at v0.11 versus the current v14.

I'm happy to prepare a PR…

Unable to run dataloader for simple example

I am really struggling to create even a simple version of this using the documentation.

Here is my code:

use GuzzleHttp\Promise\Promise;
use Overblog\DataLoader\DataLoader;
use Overblog\PromiseAdapter\Adapter\GuzzleHttpPromiseAdapter;

class Sandbox
{
    public function handle()
    {
        $myBatchGetUsers = function ($keys) {
            echo "Running myBatchGetUsers()...\n";
            $promise = new Promise();
            $promise->then(function ($value) {
                echo "Running data promise..."; // never runs!
                return [
                    ['name' => 'John'],
                    ['name' => 'Sara'],
                ];
            });
            return $promise;
        };

        $promiseAdapter = new GuzzleHttpPromiseAdapter();
        $userLoader = new DataLoader($myBatchGetUsers, $promiseAdapter);

        $userLoader->load(4)
            ->then(function ($user) use ($userLoader) {
                echo "{$user['name']}\n"; // never runs
            });

        $userLoader->load(5)
            ->then(function ($user) use ($userLoader) {
                echo "{$user['name']}\n"; // never runs
            });

        $userLoader->await();
    }
}

The expected output is:

Running myBatchGetUsers()...
Running data promise...
John
Sara

However the actual output is:

Running myBatchGetUsers()...

As you can see it is a very simple example, yet it does not work.
What am I doing wrong?

Many Thanks :-)

Thoughts: ORM, DDD conflicting?

Prelude:

When I found Facebook's dataloader concept via this package, it was the last piece missing from the whole concept behind GraphQL: it makes it "composable" without the 1+n overkill. Before, I had to use ResolveInfo (overblog/graphql) to look ahead into the selection, further than the resolver for the specific type should need to be aware of. This also made the (Doctrine) hydration extremely time-consuming, because multiple joins blew up the result set.

Against a GraphQL API of an existing application, with a sample of about 450 datasets in different related tables, I ran a super-query against everything. Then I started to adopt dataloader everywhere.
I started at about 1700 queries. Applying dataloading to all entity resolvers reduced that to 1500. What I noticed is that in most cases it barely helps in 1:1 situations, because Doctrine ORM is smart enough with lazy loading (proxies, in this case) to keep objects deduplicated, so once an entity is loaded through one relation, the same entity is loaded everywhere. So the 200 loaded are "different" 1:1 entities from different proxy objects with different identities. Dataloading would actually have saved more if Doctrine had not already done most of the job.

What caused the most trouble were the {1,*}:n associations between entities.
When using an ORM, you get a lazy collection, which has no means of efficiently loading IDs, and even if it did, you would still query all the IDs and so still have a kind of 1+n. Without an ORM, you would still have to query the IDs for each association.

So I came up with the idea of a loader for associations that is keyed by parent ID. For all the requested parents it queries their children and groups them by parent in the correct order (which is kind of tricky, but works). Then I "pipe" the IDs into the EntityLoader. Using class metadata from Doctrine, the forwarding can be automated (though the association loader still caches only the ID list).

It works: I now have 6 queries for ALL of the app's data (down from 1700), with lots of :n relations, no matter how deep the GraphQL query is. And the funny thing is: no huge joins, only implicit ones on n:m tables to batch-query association IDs (so at most depth 2).
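
The grouping step of such an association loader might look like the following sketch. In-memory rows stand in for the result of a single query over all requested parents; groupChildIdsByParent and the column names parent_id and child_id are assumptions made for illustration, not the author's code.

```php
<?php

/**
 * Turn flat (parent_id, child_id) rows into one list of child IDs per
 * requested parent, in the order the parents were requested.
 * Parents without children get an empty list.
 */
function groupChildIdsByParent(array $parentIds, array $rows): array
{
    // Start every requested parent with an empty list.
    $grouped = array_fill_keys($parentIds, []);
    foreach ($rows as $row) {
        $grouped[$row['parent_id']][] = $row['child_id'];
    }

    // Emit the lists positionally, matching the order of $parentIds.
    $result = [];
    foreach ($parentIds as $parentId) {
        $result[] = $grouped[$parentId];
    }

    return $result;
}

$rows = [
    ['parent_id' => 2, 'child_id' => 11],
    ['parent_id' => 1, 'child_id' => 10],
    ['parent_id' => 2, 'child_id' => 12],
];

$childIdLists = groupChildIdsByParent([1, 2, 3], $rows);

echo json_encode($childIdLists), "\n"; // [[10],[11,12],[]]
```

Each inner list could then be handed to the entity loader's loadMany(), so the entities themselves are still fetched in one batch.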

Actual problems (or not?):

Now come things like computed properties, which I usually implement on domain objects (for example, said entities). The computed data might depend on owned and associated data of an entity, which it needs to access to compute the result. The problem is that accessing those (lazy) collections inside the entity obviously bypasses dataloading. It does not matter whether all the associated entities are already managed by the entity manager at that point, because the lazy collection has not yet fetched the associations. It would still trigger a query for each entity you want to compute a result for.

Before dataloader, I looked into ResolveInfo inside the resolver for the computed property and join-fetched the data required for the computation. A bad thing, because resolvers had to know what the entity needs internally to compute the result...

So what? Well, I could make a dataloader for said computed values, but that would require me to ditch much of the model from the entities. So far I have not found any other solution. I'm not quite sure whether that is "normal" or whether I just don't see "it".

I'm at the point of thinking about ditching Doctrine in this case and having my dataloading API become the domain for data access, with a flexible implementation behind it (it could be a simple in-memory array source for testing, or backed by a database). Because, as it looks to me, an ORM loses its point when I cannot implement my model.

Argument 1 passed to GraphQL\Executor\Promise\Adapter\SyncPromiseAdapter::wait() must be an instance of GraphQL\Executor\Promise\Promise, instance of GuzzleHttp\Promise\Promise given, called

I think I discovered a bug in DataLoader.

It has a static variable DataLoader::$instances where it keeps instances of itself. On __destruct, the instance being destructed is removed from the $instances array.

I now have a weird situation where I run a full test suite that includes both functional tests and unit tests.

First a functional test runs, and later a unit test runs that calls MyLoader::await($promise);.

That one fails with this error:

TypeError : Argument 1 passed to GraphQL\Executor\Promise\Adapter\SyncPromiseAdapter::wait() must be an instance of GraphQL\Executor\Promise\Promise, instance of GuzzleHttp\Promise\Promise given, called in /Volumes/CS/www/vendor/overblog/dataloader-php/lib/promise-adapter/src/Adapter/WebonyxGraphQLSyncPromiseAdapter.php on line 126
 /Volumes/CS/www/vendor/webonyx/graphql-php/src/Executor/Promise/Adapter/SyncPromiseAdapter.php:144
 /Volumes/CS/www/vendor/overblog/dataloader-php/lib/promise-adapter/src/Adapter/WebonyxGraphQLSyncPromiseAdapter.php:126
 /Volumes/CS/www/vendor/overblog/dataloader-php/src/DataLoader.php:282
 /Volumes/CS/www/tests/Infrastructure/GraphQL/Resolver/PlatformAccountResolverTest.php:60

Somehow, the previous instance is not cleaned up nicely.

How to solve this?
