First off, thanks for the work on this – it's exciting stuff! I'm aware that any optim

Thanks <a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-u

<a class="user-mention notranslate" data-hovercard-type="user" data-hovercard-url="/us

Benchmarks about graphql-clj HOT 9 CLOSED

tendant commented on June 8, 2024

Benchmarks

from graphql-clj.

Comments (9)

aew commented on June 8, 2024 2

Having thought a bit more about this, I think that validation is intended to be a mandatory step (albeit one completed outside the inner loop of repeatedly executing a given query against a given schema):

https://facebook.github.io/graphql/#sec-Validation

An invalid request is still technically executable, and will always produce a stable result as defined by the procedures in the Execution section, however that result may be ambiguous, surprising, or unexpected relative to the request containing validation errors, so execution should not occur for invalid requests.

Typically validation is performed in the context of a request immediately before execution, however a GraphQL service may execute a request without explicitly validating it if that exact same request is known to have been validated before.`

This part of the spec explicitly states that execution should not occur for invalid requests. On top of this, it is more work to also support transformations directly from parse output to execution input rather than consistently arriving at execution via the validation phase, which introduces additional transformations needed for validation (and likely useful for an execution tree visit too).

from graphql-clj.

tendant commented on June 8, 2024

Thanks @xsc for putting time on the performance.

The performance is really important for this library.

For the parser performance, it is easy to support caching. I plan to enhance executor to accept parsed query. Once it is done, it will be up to user on how to cache parsed query.

It is even a security feature. Since you can choose to accept only cached query. And the parser will never been exposed to arbitrary query. I have heard this is what Facebook doing.

from graphql-clj.

tendant commented on June 8, 2024

I am debating the interface of executor.

Two options so far:

Provide an new function, which support parsed query.
Use multi-method to define exiting execute function. However multi-method might have performance concern.

I will leave this issue open for more comments and ideas.

from graphql-clj.

aew commented on June 8, 2024

I'm in favor of a new function, and in general I think that separating the parsing (and validation) step from the execution step (with a cache in between) makes a lot of sense. This takes performance optimization pressure off of the parse / validate steps, which are complex, and also reduces the number of corner cases that can leak into the execution phase.

If it were me, I'd focus on (in order):

robustness
feature completeness
code cleanliness
execution phase performance
documentation and examples
other stuff

from graphql-clj.

aew commented on June 8, 2024

I've thought about this a bit more, and I would propose we make validation a mandatory upstream step from execution.

My reasoning:

In order to preserve line and column metadata necessary to produce standards-compliant error messages, our parser needs to do fewer transformations than it is currently doing. Transformations to raw strings / values wipe out metadata, which instaparse uses to track locations in the AST. Also, to do line and column, we have to do another parser pass, which makes it slower.
I find it natural that for most applications, queries will be parsed and validated during a development phase (where good errors are as important as latency, and user permissions may be different) and then cached associated with a hash of the query document + schema or something like that.
The optimal data structure for validation can be different than the data structure resulting from parsing and validation, which would be designed for execution (we can do transformations as part of the validation tree visit). Some of the indirection / extra wrappers / vectors of maps instead of key value maps / etc are only necessary for validation and can be further optimized after validation passes.
There are problems today where the data structure that exits validation cannot be passed into execution, which we need to fix in any event.

from graphql-clj.

tendant commented on June 8, 2024

Just want to have more discussion around data and flow:

Existing flow:

Query/Schema String -- Parser --> Parsed AST -- Transformation --> Transformed Map -- Executor --> Result

Complete flow:

Query/Schema String -- Parser --> Parsed AST -- Transformation --> Transformed Map -- Validator --> Validation Error or Validated Map -- Executor --> Executor Error or Result

Transformed Map and Validated Map should be same format if possible. So that validation process can be skipped if needed. Although some data can be added to validated map.
Provide new executor function, it will take Validated Map (Transformed Map) as parameter in addition to existing execute function which takes only query string. Validated Map and Transformed Map can be cached and even persisted (like edn).
Validation Error should be same format as Executor Error, which is defined in GraphQL Spec. So that it can be return to client as result.

from graphql-clj.

xsc commented on June 8, 2024

Adding to that, it might be reasonable to have a separate repository/artifact specifying (as in clojure.spec) the format of what you named "Transformed Map", "Validation Error" and "Execution Error". It would then be possible to provide different parser/validator/executor implementations, as long as they fulfill those specs. (I'm not sure, though, whether there is a lot left that would remain within a graphql-clj project, maybe default implementations and executor contracts?)

For example, I've been trying out ANTLR4 for parsing (here), producing an AST conforming to this spec. I'd very much like to reuse an existing validator implementation (or plug in a different executor subsystem, e.g. one based on claro) but for that we need a common AST format that isn't prone to change (bar a GraphQL spec update).

(This also adds to @aew's "robustness" and "code cleanliness" points since those things can be tested and tuned independently.)

Thus, generally, I'm very much in favour of using more of a pipeline approach (parser -> validator -> executor) as long as the interfaces between the pipeline parts are well-defined and stable.

from graphql-clj.

tendant commented on June 8, 2024

@xsc Parsing, validation and execution phase are separated intentionally, in order to make those parts pluggable. It is still better to keep it in one repository, so it is easy to maintain and release. I am open to separate the library to multiple repositories, once it is stable enough. You can still use some parts of graphql-clj as you wish.

I agree it is very important to come up right spec/format for "Transformed Map". @aew did great job in introducing clojure.spec in validation. This helps us understand what exactly needs to be in the spec/format. I hope we can come to certain conclusion once validation is done.

For executor, I prefer to reuse the same spec/format as "Transformed Map" if possible. It will be painful to maintain two set of spec/format. (#35)

Current executor is not the ideal solution. There are a few things needed here:

(#36) Executor: which can walk through "Transformed Map" and realize the result map.
(#37) Dataloader: Generic data-loader can be used anywhere to provide caching and solution for n+1 problem. For example: https://github.com/facebook/dataloader. It should be a independent library.
(#38) Resolver: Provide mapping from (Type, field) -> fn. We need figure out what is good definition of resolver-fn.

@xsc I have a quick peek of claro, looks pretty promising. I had looked into muse before. I prefer something simple instead.

There are a lot of things can be improved, I will create issue for some of them. It will be great, we can keep getting comments/ideas around them.

from graphql-clj.

tendant commented on June 8, 2024

The parser performance is about 200 times faster in most recent version 0.2.3 with Java Parser.

Here is a result for comparison:

Case: :complex-query
Evaluation count : 7699500 in 60 samples of 128325 calls.
Execution time mean : 7.910276 µs
Execution time std-deviation : 183.697851 ns
Execution time lower quantile : 7.670498 µs ( 2.5%)
Execution time upper quantile : 8.329677 µs (97.5%)

Found 9 outliers in 60 samples (15.0000 %)
low-severe 9 (15.0000 %)
Variance from outliers : 11.0094 % Variance is moderately inflated by outliers

Detail performance bench can be found in:

https://github.com/xsc/graphql-clj-bench/blob/master/result-0.2.3.txt

from graphql-clj.

Benchmarks about graphql-clj HOT 9 CLOSED

Comments (9)

Related Issues (20)

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent