cypher-stream's Issues

Benchmarking

I've been doing some preliminary benchmarking to determine the performance impact of stream-parsing vs waiting for the complete result set to load.

I took CypherStream, replaced oboe with request, and ran siege against a REST endpoint that runs a Cypher query and returns the results.

Jump to the bottom to see the conclusion if you're not interested in the numbers.

Tests

20 Records

Non-streaming

Transactions:          10177 hits
Availability:         100.00 %
Elapsed time:          59.45 secs
Data transferred:        96.36 MB
Response time:            0.09 secs
Transaction rate:       171.19 trans/sec
Throughput:           1.62 MB/sec
Concurrency:           15.25
Successful transactions:       10177
Failed transactions:             0
Longest transaction:          1.00
Shortest transaction:         0.00

Streaming

Transactions:           7705 hits
Availability:         100.00 %
Elapsed time:          59.54 secs
Data transferred:        69.86 MB
Response time:            0.27 secs
Transaction rate:       129.41 trans/sec
Throughput:           1.17 MB/sec
Concurrency:           35.38
Successful transactions:        7705
Failed transactions:             0
Longest transaction:          0.91
Shortest transaction:         0.01

1763 Records

Non-streaming

Transactions:            228 hits
Availability:          87.69 %
Elapsed time:          59.57 secs
Data transferred:         3.91 MB
Response time:           13.25 secs
Transaction rate:         3.83 trans/sec
Throughput:           0.07 MB/sec
Concurrency:           50.71
Successful transactions:         228
Failed transactions:            32
Longest transaction:         30.50
Shortest transaction:         0.12

Streaming

Transactions:            329 hits
Availability:          95.36 %
Elapsed time:          59.95 secs
Data transferred:         6.93 MB
Response time:           11.61 secs
Transaction rate:         5.49 trans/sec
Throughput:           0.12 MB/sec
Concurrency:           63.69
Successful transactions:         329
Failed transactions:            16
Longest transaction:         30.50
Shortest transaction:         0.04

Conclusion

Stream-parsing small, fast data sets doesn't make sense. Since that is likely the most common use case, I don't think the current implementation makes sense as a default, generic database-querying mechanism.
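
To put the numbers above in relative terms, the sketch below computes the percentage change in siege's transaction rate when moving from non-streaming to streaming. The helper name percentChange is only illustrative; the rates are copied from the results above.

```javascript
// Percentage change from a baseline rate to a new rate.
function percentChange(baseline, value) {
  return ((value - baseline) / baseline) * 100;
}

// Transaction rates (trans/sec) copied from the siege output above.
var small = percentChange(171.19, 129.41); // 20 records: non-streaming -> streaming
var large = percentChange(3.83, 5.49);     // 1763 records: non-streaming -> streaming

console.log('20 records:   ' + small.toFixed(1) + '%');
console.log('1763 records: ' + large.toFixed(1) + '%');
```

By these figures, streaming is roughly a quarter slower on the small result set and over 40% faster on the large one.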

Proposal

Some options I can think of:

  1. Stream-parse only when the response exceeds a certain size or time threshold.
  2. Use sockets. This is probably going to be a huge performance gain in general. I see massive potential in this. Unfortunately it would require users to install a neo4j plugin to see the gains.
  3. Never stream-parse. Just wait for the complete response, then emit results, maintaining the streaming API.
  4. Investigate other parsing mechanisms. Perhaps there are more efficient solutions than Oboe.js.
  5. Just accept the cost and use another library if streaming doesn't make sense for you.
  6. Allow the user to control whether to stream-parse or not. This allows for fine-tuned optimizations, but also requires a higher level of knowledge from the user.
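
Option 3 could look roughly like the sketch below: parse the complete response body in one go, then replay the rows through an object-mode stream so callers keep the same 'data'/'end' interface. It assumes the JSON shape of Neo4j's transactional endpoint (results[].columns and results[].data[].row). The names rowsFromBody and rowsToStream are hypothetical and not part of cypher-stream.

```javascript
var Readable = require('stream').Readable;

// Parse a fully buffered response body and map each row to an object
// keyed by column name. Assumes the transactional endpoint's JSON shape.
function rowsFromBody(body) {
  var parsed = JSON.parse(body);
  var rows = [];
  parsed.results.forEach(function (result) {
    result.data.forEach(function (datum) {
      var obj = {};
      result.columns.forEach(function (column, i) {
        obj[column] = datum.row[i];
      });
      rows.push(obj);
    });
  });
  return rows;
}

// Replay already-parsed rows through an object-mode stream.
function rowsToStream(rows) {
  var stream = new Readable({ objectMode: true, read: function () {} });
  rows.forEach(function (row) { stream.push(row); });
  stream.push(null); // signal end of stream
  return stream;
}

// Sample body, made up for illustration.
var body = JSON.stringify({
  results: [{ columns: ['n'], data: [{ row: [{ name: 'a' }] }, { row: [{ name: 'b' }] }] }],
  errors: []
});

rowsToStream(rowsFromBody(body))
  .on('data', function (row) { console.log(row.n.name); })
  .on('end', function () { console.log('all done'); });
```

The trade-off is that the whole body sits in memory before the first result is emitted, which is what makes this a poor fit for very large result sets.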

Other Notes

Streaming could also make sense for query results that require heavy I/O processing, since you could do the processing eagerly and in parallel. Maybe backpressure would be useful in this case too (I have yet to test whether backpressure actually works).

I'm assuming the slowdown is due to CPU bottlenecking. I'm not sure if multiple cores are being taken advantage of currently.

No operations allowed until you send an INIT message successfully

When running a basic test example, I get this error.

{ [Error: No operations allowed until you send an INIT message successfully.] message: 'No operations allowed until you send an INIT message successfully.', code: 'Neo.ClientError.Request.Invalid' }

var cypher = require('cypher-stream')('bolt://localhost','neo4j','neo4j')

cypher('MATCH (n:User) RETURN n LIMIT 1')
    .on('data', function(result){
        console.log(result)
    })
    .on('error', function(error){
        console.log(error)
    })

Installed using:
npm install [email protected]

Am I missing something really obvious here? I don't see anything that requires anything else. I'm running Neo4j 3.0.4, if that makes a difference.

Differentiate results across queries in a transaction

Hey @brian-gates, I'm working on thingdom/node-neo4j#143 now, specifically the design of the transactional API that'd wrap this guy. =)

Thinking about a need / use case we at @fiftythree have in our own app, one thing we'd need is the ability to differentiate results across the various queries that make up a transaction.

AFAICT right now, the current cypher-stream design doesn't differentiate. A transaction is a single stream of results, with data events that don't tell you which query (statement) each result corresponds to.

For example, if the caller doesn't know in advance how many results a query will give, they don't know where the results for one query end and those for the next begin.

What are your thoughts for how to achieve that?
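
One possible direction, sketched under the assumption that the underlying response keeps one entry per statement in results (as Neo4j's transactional endpoint does), is to attach the statement's index to every emitted result. The function tagByStatement and the statement field are hypothetical, not existing cypher-stream API.

```javascript
// Flatten a transactional response into a list of results, each tagged
// with the index of the statement that produced it.
function tagByStatement(response) {
  var tagged = [];
  response.results.forEach(function (result, statementIndex) {
    result.data.forEach(function (datum) {
      tagged.push({ statement: statementIndex, row: datum.row });
    });
  });
  return tagged;
}

// Sample response with two statements, made up for illustration.
var response = {
  results: [
    { columns: ['a'], data: [{ row: [1] }, { row: [2] }] },
    { columns: ['b'], data: [{ row: [3] }] }
  ],
  errors: []
};

tagByStatement(response).forEach(function (item) {
  console.log('statement ' + item.statement + ':', item.row);
});
```

A caller could then group data events by the statement field without knowing in advance how many results each query returns.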

[Error: write after end] statusCode: undefined,

I'm running a server-side process that web clients connect to via socket.io.

On server startup, a cypherStream(url) is created once,
and queries are then issued based on what the web clients request.

It works fine when the server process is just started. When I reconnect the client, it gives errors.

Getting the following errors (attaching to the error event of cypher-stream):
{ [Error: write after end] statusCode: undefined, body: undefined, jsonBody: undefined }
followed by a stream of [Error: stream.push() after EOF]

Even when I do not connect the cypher stream to the client (so it stays serverside only), I get these errors.
I'm trying to figure out which underlying stream is having the issue.

Test case suite fails on Neo4j v 2.3

I am getting an error with Neo4j 2.3 and Node 4.2.2

[email protected] test /Users/redpanda/work/github/cypher-stream
make test

./node_modules/.bin/mocha -b
(node) child_process: options.customFds option is deprecated. Use options.stdio instead.

․․

1 passing (656ms)
1 failing

  1. Cypher stream handles errors:
  expected 'AssertionError
  + expected - actual

  +"Error: Query Failure: Invalid input 'i': expected <init> (line 1, column 1)\n\"invalid query\"\n ^"
  -"AssertionError: expected 'Error: Query Failure: Invalid input \\'i\\': expected <init> (line 1, column 1 (offset: 0))\\n\"invalid query\"\\n ^' to be 'Error: Query Failure: Invalid input \\'i\\': expected <init> (line 1, column 1)\\n\"invalid query\"\\n ^'"

  at Assertion.prop.(anonymous function) [as exactly] (/Users/redpanda/work/github/cypher-stream/node_modules/should/lib/should.js:60:14)
  at CypherStream.<anonymous> (/Users/redpanda/work/github/cypher-stream/test/cypher-stream-test.js:50:30)
  at emitOne (events.js:77:13)
  at CypherStream.emit (events.js:169:7)
  at CypherStreamHandleFailure (/Users/redpanda/work/github/cypher-stream/CypherStream.js:182:12)
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at Object.emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2323:33
  at apply (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:171:14)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2300:10
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at emitMatchingNode (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2104:7)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2149:13
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at nodeClosed (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1494:7)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1048:19
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at handleData (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:761:14)
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at Object.emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at IncomingMessage.<anonymous> (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1128:34)
  at emitOne (events.js:77:13)
  at IncomingMessage.emit (events.js:169:7)
  at IncomingMessage.Readable.read (_stream_readable.js:360:10)
  at flow (_stream_readable.js:743:26)
  at resume_ (_stream_readable.js:723:3)
  at doNTCallback2 (node.js:441:9)
  at process._tickCallback (node.js:355:17)

make: *** [test] Error 1
npm ERR! Test failed. See above for more details.

Shouldn't emit 'end' if error?

Hi there,

Fantastic library! I'm the author/maintainer of node-neo4j, looking to simplify and experiment. (Especially since Neo4j 2.0 embraces Cypher.) This lib was recommended to me by @wfreeman. (Thanks Wes!)

I'm trying it out in our main backend service at @fiftythree, and one thing I noticed was that the stream emits an end event even in the case of errors. This strikes me as unusual, but after a fair bit of googling, doc reading, and experimentation, I'm not able to confirm that.

I can just say that as a caller, if I want to wrap the streaming in a callback-based API, I can't simply say callback(error) on error and callback(null, results) on end; I have to keep track of whether an error happened or not myself. I would expect the stream to do that, but am I wrong?
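
The bookkeeping described above might look like the following sketch. It uses a plain EventEmitter as a stand-in for the stream, and collect is a hypothetical wrapper name, so nothing here depends on cypher-stream itself.

```javascript
var EventEmitter = require('events').EventEmitter;

// Collect results from a stream-like emitter and invoke the callback
// at most once, even if 'end' is emitted after 'error'.
function collect(stream, callback) {
  var results = [];
  var done = false;
  function finish(error, value) {
    if (done) return; // ignore anything after the first outcome
    done = true;
    callback(error, value);
  }
  stream.on('data', function (result) { results.push(result); });
  stream.on('error', function (error) { finish(error); });
  stream.on('end', function () { finish(null, results); });
}

// Stand-in for a stream that emits 'end' even after an error.
var fake = new EventEmitter();
var calls = [];
collect(fake, function (error, results) { calls.push([error, results]); });
fake.emit('data', { n: 1 });
fake.emit('error', new Error('boom'));
fake.emit('end');

console.log(calls.length); // 1
```

If the stream did not emit end after an error, the done flag would be unnecessary.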

Thanks for the consideration, and great work again!

Cheers,
Aseem

Custom (and default?) headers

An important use case for us at @fiftythree is to pass custom headers in the underlying requests. (And they vary across requests, so it's not just a static default.)

Examples of how we use those:

  • User-Agent. The Neo4j team encourages drivers to set this (to driver/version), btw, so cypher-stream should do this by default I think (e.g. cypher-stream/0.2.1), but we're also thinking of expanding this to include our app's name, since we'll soon be having two apps talking to the same Neo4j database. (That'll let us differentiate and understand our load across the two.)
  • Tracing. We send along a header with an ID for the outermost "app-level" request, so we can both visualize all the database requests that made up an app request, and debug failed app requests. This is a common technique, e.g. Twitter's Zipkin (what we use), Heroku's X-Request-ID, etc.
  • Query names. We want to be able to analyze the performance of our queries, but using the full query body would be untenable. So with every query we send along an X-Query-Name header containing a short, human-readable name for that query (e.g. User_getByEmail), which we then log.

Example screenshot of the last case: [image: home-stream-perf-queries]

So it'd be really valuable to be able to pass custom headers along to Cypher requests.

WDYT of supporting these alternate signatures then?

var cypher = require('cypher-stream')({
    url: 'http://localhost:7474',
    headers: {...}  // default headers, if you want to set any; e.g. User-Agent
});

cypher({
    query: 'match (user:User {email: {email}}) return user',
    params: {email: '[email protected]'},
    headers: {...}  // e.g. X-Query-Name
})
  .on('data', function (result){
    console.log(result.user.first_name);
  })
  .on('end', function() {
    console.log('all done');
  })
;
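
Internally, combining the two levels of headers could be as simple as the sketch below, where per-query headers override the defaults. The function mergeHeaders is a hypothetical helper, and the sketch does not normalize header-name casing.

```javascript
// Combine default headers with per-query headers; per-query values win.
function mergeHeaders(defaults, perQuery) {
  return Object.assign({}, defaults, perQuery);
}

// Header values taken from the examples in this issue.
var defaults = { 'User-Agent': 'cypher-stream/0.2.1' };
var perQuery = { 'X-Query-Name': 'User_getByEmail' };
var merged = mergeHeaders(defaults, perQuery);

console.log(merged);
```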

Use new Cypher endpoint with leaner JSON?

Neo4j 2.0 supports a new "transactional Cypher" endpoint which, importantly, returns leaner JSON: just property data, no more hyperlinks. You're already returning just the data, not extracting native node IDs or similar, so in theory this shouldn't lose any functionality.

http://docs.neo4j.org/chunked/stable/rest-api-transactional.html#rest-api-begin-and-commit-a-transaction-in-one-request

Would you be open to updating to this? I'd be happy to give a pull request a stab if so. The only thing is that I don't have any experience with Oboe, so it might take me some time. =)

One thing to note is that error handling will get a bit more complex now, because errors will no longer result in a different HTTP response.

http://docs.neo4j.org/chunked/stable/rest-api-transactional.html#rest-api-handling-errors
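
As a rough sketch of what that means for callers: with the transactional endpoint, failures show up in an errors array in the response body, so the body has to be inspected instead of relying on the HTTP status. The function firstError is a hypothetical helper, and the sample error below is made up for illustration.

```javascript
// Return the first error reported in a transactional response body,
// or null if the errors array is missing or empty.
function firstError(body) {
  var errors = body.errors || [];
  if (errors.length === 0) return null;
  var error = new Error(errors[0].message);
  error.code = errors[0].code;
  return error;
}

var ok = { results: [], errors: [] };
var bad = {
  results: [],
  errors: [{ code: 'Neo.ClientError.Statement.InvalidSyntax', message: 'Invalid input' }]
};

console.log(firstError(ok));
console.log(firstError(bad).code);
```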

Documentation error

Hello,
The username and password have to be passed in a single hash like { username: username, password: password }, not as two separate arguments as the documentation says.
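
A minimal sketch of the shape described above follows. The helper credentials is hypothetical, and the commented-out call is an unverified assumption about where the hash goes, based only on this report.

```javascript
// Build the credentials hash in the shape this issue describes.
function credentials(username, password) {
  return { username: username, password: password };
}

var auth = credentials('neo4j', 'neo4j');
console.log(auth);

// With cypher-stream installed and Neo4j running, the hash would presumably
// be passed as the second argument (assumption, not verified here):
// var cypher = require('cypher-stream')('bolt://localhost', auth);
```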

Authentication

Hi,
Can you tell me how to pass authentication details using cypher-stream?
For example, with the neo4j driver I could do something like
neo4j.driver("bolt://localhost", neo4j.auth.basic("neo4j", "neo4j"));

Thanks,

Properly tagged versions

I'd recommend tagging your final versions, which allows npm and other package managers to distinguish semver updates.

Support for REST format

Quick bug first: extractData recursively calls itself if an object has a data property, so this means it'll incorrectly drop legitimate property data if there happens to be a property named data. =)
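
To illustrate the failure mode, here is a hypothetical reconstruction of such a recursive unwrapping function. It is not the library's actual code, only a sketch of the behaviour described above.

```javascript
// Hypothetical reconstruction: unwrap any object that has a `data` property.
function extractData(item) {
  if (item && typeof item === 'object' && 'data' in item) {
    return extractData(item.data);
  }
  return item;
}

// REST-style node whose own properties include one named `data`.
var node = {
  self: 'http://localhost:7474/db/data/node/1',
  data: { name: 'report', data: 'payload' }
};

// Logs 'payload' instead of { name: 'report', data: 'payload' }.
console.log(extractData(node));
```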

But wait! That bug's not worth fixing, because (I think) you don't even need that function anymore, since the transactional endpoint returns just property data by default now. So you should just remove it.

But on that note, I'd like to request an option to return the REST format instead of the lean property-only format. Having node and relationship metadata is needed for ORM/OGM-type libraries.

Easy enough to request it:

http://neo4j.com/docs/stable/rest-api-transactional.html#rest-api-execute-statements-in-an-open-transaction-in-rest-format-for-the-return

This could potentially be just another option to add to #10's options-based API, e.g. format.
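
As a sketch of how a format option might translate into the request, the transactional endpoint accepts a resultDataContents array on each statement. The function buildPayload and its format argument are hypothetical names for illustration.

```javascript
// Build the request body for the transactional endpoint, optionally
// asking for a specific result format such as 'REST'.
function buildPayload(statement, parameters, format) {
  var entry = { statement: statement, parameters: parameters || {} };
  if (format) entry.resultDataContents = [format];
  return { statements: [entry] };
}

var payload = buildPayload('MATCH (n) RETURN n LIMIT 1', {}, 'REST');
console.log(JSON.stringify(payload));
```

Leaving format undefined omits resultDataContents, so the server's default lean format applies.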

WDYT? SGTY? ORLY?
