codex-digital / cypher-stream Goto Github PK

Neo4j Cypher queries as Node.js object streams

JavaScript 97.49% Makefile 1.03% Shell 1.48%

neo4j bolt cypher stream streaming-api streams streaming

cypher-stream's Introduction

cypher-stream

Neo4j Cypher queries as Node.js object streams.

Installation

npm install cypher-stream

Basic usage

var cypher = require('cypher-stream')('bolt://localhost', 'username', 'password');

cypher('match (user:User) return user')
  .on('data', function (result){
    console.log(result.user.first_name);
  })
  .on('end', function() {
    console.log('all done');
  })
;

Handling errors

var cypher = require('cypher-stream')('bolt://localhost', 'username', 'password');
var should = require('should');
it('handles errors', function (done) {
  var errored = false;
  cypher('invalid query')
    .on('error', error => {
      errored = true;
      should.equal(
        error.code,
        'Neo.ClientError.Statement.SyntaxError'
      );
      should.equal(
        error.message,
        'Invalid input \'i\': expected <init> (line 1, column 1 (offset: 0))\n"invalid query"\n ^'
      );
    })
    .on('end', () => {
      should.equal(true, errored);
      done();
    })
    .resume()
    ;
});

Transactions

Transactions are duplex streams that allow you to write query statements and read the results.

Transactions have three methods: write, commit, and rollback. write adds queries to the queue, commit commits the queue, and rollback rolls it back.

Creating a transaction

var transaction = cypher.transaction(options)

Adding queries to a transaction

transaction.write(query_statement);

A query_statement can be a string or a query statement object. A query statement object consists of a statement property and an optional parameters property. Additionally, you can pass an array of either.

The following are all valid options:

var transaction = cypher.transaction();

transaction.write('match (n:User) return n');

transaction.write({ statement: 'match (n:User) return n' });

transaction.write({
  statement  : 'match (n:User) where n.first_name = {first_name} return n',
  parameters : { first_name: "Bob" }
});

transaction.write([
  {
    statement  : 'match (n:User) where n.first_name = {first_name} return n',
    parameters : { first_name: "Bob" }
  },
  'match (n:User) where n.first_name = {first_name} return n'
]);
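
The accepted shapes above (string, statement object, or an array of either) can all be reduced to one list of statement objects. The following is a minimal sketch of that idea, not the library's internal code; normalizeStatements is a hypothetical helper name, and defaulting parameters to an empty object is an assumption.

```javascript
// Hypothetical helper (not part of cypher-stream): turn any accepted
// input shape into an array of { statement, parameters, ... } objects.
function normalizeStatements(input) {
  var list = Array.isArray(input) ? input : [input];
  return list.map(function (item) {
    if (typeof item === 'string') {
      return { statement: item, parameters: {} };
    }
    // Copy the object so extra properties (commit, rollback, callback)
    // are preserved; default parameters to an empty object (assumption).
    return Object.assign({ parameters: {} }, item);
  });
}

var normalized = normalizeStatements([
  'match (n:User) return n',
  {
    statement  : 'match (n:User) where n.first_name = {first_name} return n',
    parameters : { first_name: 'Bob' }
  }
]);
console.log(normalized.length, normalized[1].parameters.first_name);
```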

Committing or rolling back

transaction.commit();
transaction.rollback();

Alternatively, a query statement may contain a commit or rollback property.

transaction.write({ statement: 'match (n:User) return n', commit: true });

transaction.write({
  statement  : 'match (n:User) where n.first_name = {first_name} return n',
  parameters : { first_name: "Bob" },
  commit     : true
});

Stream per statement

To get a separate stream for each statement, add a callback property to the statement object. The callback receives that statement's result stream. This works for both regular cypher calls and transactions.

var results = 0;
var calls   = 0;
var ended   = 0;
var query   = 'match (n:Test) return n limit 2';

function callback(stream) {
  stream
    .on('data', function (result) {
      result.should.eql({ n: { test: true } });
      results++;
    })
    .on('end', function () {
      ended++;
    })
  ;
  calls++;
}

var statement = { statement: query, callback: callback };

cypher([ statement, statement ])
.on('end', function () {
  calls.should.equal(2);
  ended.should.equal(2);
  results.should.equal(4);
  done();
})
.resume();

var results = 0;
var calls   = 0;
var ended   = 0;
var query   = 'match (n:Test) return n limit 2';

function callback(stream) {
  stream
    .on('data', function (result) {
      result.should.eql({ n: { test: true } });
      results++;
    })
    .on('end', function () {
      ended++;
    })
  ;
  calls++;
}

var statement = { statement: query, callback: callback };
var transaction = cypher.transaction();

transaction.write(statement);

transaction.write(statement);

transaction.commit();

transaction.resume();

transaction.on('end', function() {
  calls.should.equal(2);
  ended.should.equal(2);
  results.should.equal(4);
  done();
});

Unsafe Integers

Unsafe integers* are returned as strings. If your system deals with particularly large or small numbers, this will require special handling.

See "A note on numbers and the Integer type" on the neo4j-javascript-driver README for more information.

* Unsafe integers are any integers greater than Number.MAX_SAFE_INTEGER or less than Number.MIN_SAFE_INTEGER.
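
One possible way to handle such values on the consumer side is to convert the returned strings to BigInt. This is only a sketch: readInteger is a hypothetical helper, it assumes a Node.js version with BigInt support, and it should only be applied to fields you already know hold integers, since an ordinary string property made of digits would be converted too.

```javascript
// Hypothetical helper: normalize an integer field that may arrive either
// as a JavaScript number (safe range) or as a string (unsafe range).
function readInteger(value) {
  if (typeof value === 'string' && /^-?\d+$/.test(value)) {
    return BigInt(value); // exact, arbitrary-precision integer
  }
  return value; // already a safe number (or not an integer at all)
}

// Number.isSafeInteger shows where the boundary lies.
console.log(Number.isSafeInteger(Number.MAX_SAFE_INTEGER));     // safe
console.log(Number.isSafeInteger(Number.MAX_SAFE_INTEGER + 1)); // not safe

console.log(readInteger(42));
console.log(readInteger('9007199254740993').toString());
```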

cypher-stream's Issues

[Error: write after end] statusCode: undefined,

I'm running a server-side process that web clients connect to via socket.io.

On server startup, a cypherStream(url) instance is created once,
and it issues queries based on what the web clients request.

It works fine when the server process is just started. When I reconnect the client, it gives errors.

Getting the following errors (attaching to the error event of cypher-stream):
{ [Error: write after end] statusCode: undefined, body: undefined, jsonBody: undefined }
followed by a stream of [Error: stream.push() after EOF]

Even when I do not connect the cypher stream to the client (so it stays serverside only), I get these errors.
Trying to figure out what underlying stream is having the issue....

Benchmarking

I've been doing some preliminary benchmarking to determine the performance impact of stream-parsing vs waiting for the complete result set to load.

I took CypherStream and replaced oboe with request and ran some sieges against a REST endpoint that runs a cypher query and returns the results.

Jump to the bottom to see the conclusion if you're not interested in the numbers.

Tests

20 Records

Non-streaming

Transactions:          10177 hits
Availability:         100.00 %
Elapsed time:          59.45 secs
Data transferred:        96.36 MB
Response time:            0.09 secs
Transaction rate:       171.19 trans/sec
Throughput:           1.62 MB/sec
Concurrency:           15.25
Successful transactions:       10177
Failed transactions:             0
Longest transaction:          1.00
Shortest transaction:         0.00

Streaming

Transactions:           7705 hits
Availability:         100.00 %
Elapsed time:          59.54 secs
Data transferred:        69.86 MB
Response time:            0.27 secs
Transaction rate:       129.41 trans/sec
Throughput:           1.17 MB/sec
Concurrency:           35.38
Successful transactions:        7705
Failed transactions:             0
Longest transaction:          0.91
Shortest transaction:         0.01

1763 Records

Non-streaming

Transactions:            228 hits
Availability:          87.69 %
Elapsed time:          59.57 secs
Data transferred:         3.91 MB
Response time:           13.25 secs
Transaction rate:         3.83 trans/sec
Throughput:           0.07 MB/sec
Concurrency:           50.71
Successful transactions:         228
Failed transactions:            32
Longest transaction:         30.50
Shortest transaction:         0.12

Streaming

Transactions:            329 hits
Availability:          95.36 %
Elapsed time:          59.95 secs
Data transferred:         6.93 MB
Response time:           11.61 secs
Transaction rate:         5.49 trans/sec
Throughput:           0.12 MB/sec
Concurrency:           63.69
Successful transactions:         329
Failed transactions:            16
Longest transaction:         30.50
Shortest transaction:         0.04

Conclusion

Stream-parsing small, fast data sets doesn't make sense. Since this is likely the most common use case, I don't think the current implementation makes sense as a default, generic database-querying mechanism.

Proposal

Some options I can think of:

Stream-parse only when the response exceeds a certain size or time threshold.
Use sockets. This is probably going to be a huge performance gain in general. I see massive potential in this. Unfortunately it would require users to install a neo4j plugin to see the gains.
Never stream-parse. Just wait for the complete response, then emit results, maintaining the streaming API.
Investigate other parsing mechanisms. Perhaps there are more efficient solutions than Oboe.js.
Just accept the cost and use another library if streaming doesn't make sense for you.
Allow the user to control whether to stream-parse or not. This allows for fine-tuned optimizations, but also requires a higher level of knowledge from the user.
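
The "never stream-parse" option can be sketched as follows: wait for the whole response body, parse it once, then push the rows into an object-mode Readable so callers still see a stream. This is an illustration only. bufferedResultStream is a hypothetical name, and the body is assumed to be a plain JSON array of result rows, which is simpler than the actual Neo4j response format.

```javascript
var Readable = require('stream').Readable;

// Hypothetical helper: expose an already-buffered JSON body as an
// object stream, so the streaming API is kept without stream-parsing.
function bufferedResultStream(body) {
  var rows = JSON.parse(body); // assumption: body is a JSON array of rows
  var stream = new Readable({
    objectMode: true,
    read: function () {} // nothing to fetch lazily; rows are pushed below
  });
  rows.forEach(function (row) {
    stream.push(row);
  });
  stream.push(null); // signal end of results
  return stream;
}

bufferedResultStream('[{"user":{"first_name":"Bob"}}]')
  .on('data', function (result) {
    console.log(result.user.first_name);
  })
  .on('end', function () {
    console.log('all done');
  });
```

A size or time threshold, as in the first option, could choose between this path and the stream-parsing path per response.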

Other Notes

Streaming could also make sense for query results that require heavy I/O processing, since you could do the processing eagerly and in parallel. Maybe back-pressure would be useful in this case too (I have yet to test whether back-pressure actually works).

I'm assuming the slowdown is due to CPU bottlenecking. I'm not sure if multiple cores are being taken advantage of currently.

Shouldn't emit 'end' if error?

Hi there,

Fantastic library! I'm the author/maintainer of node-neo4j, looking to simplify and experiment. (Especially since Neo4j 2.0 embraces Cypher.) This lib was recommended to me by @wfreeman. (Thanks Wes!)

I'm trying it out in our main backend service at @fiftythree, and one thing I noticed was that the stream emits an end event even in the case of errors. This strikes me as unusual, but after some fair bit of googling, doc reading, and experimentation, I'm not able to confirm that.

I can just say that as a caller, if I want to wrap the streaming in a callback-based API, I can't simply say callback(error) on error and callback(null, results) on end; I have to keep track of whether an error happened or not myself. I would expect the stream to do that, but am I wrong?

Thanks for the consideration, and great work again!

Cheers,
Aseem
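
The bookkeeping described in this issue, wrapping the stream in a callback-based API when end is emitted even after an error, might look like the sketch below. collect is a hypothetical helper, not part of cypher-stream; it works with any emitter that fires data, error, and end events.

```javascript
// Hypothetical wrapper: gather all results and call back exactly once,
// even if the stream emits 'end' after an 'error'.
function collect(stream, callback) {
  var results = [];
  var failed = false;

  stream
    .on('data', function (result) {
      results.push(result);
    })
    .on('error', function (error) {
      if (failed) { return; } // report only the first error
      failed = true;
      callback(error);
    })
    .on('end', function () {
      if (!failed) {
        callback(null, results);
      }
    });
}

// Usage with a cypher-stream query would look like:
//   collect(cypher('match (user:User) return user'), function (err, users) { ... });
```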

Custom (and default?) headers

An important use case for us at @fiftythree is to pass custom headers in the underlying requests. (And they vary across requests, so it's not just a static default.)

Examples of how we use those:

User-Agent. The Neo4j team encourages drivers to set this (to driver/version), btw, so cypher-stream should do this by default I think (e.g. cypher-stream/0.2.1), but we're also thinking of expanding this to include our app's name, since we'll soon be having two apps talking to the same Neo4j database. (That'll let us differentiate and understand our load across the two.)
Tracing. We send along a header with an ID for the outermost "app-level" request, so we can both visualize all the database requests that made up an app requests, and debug failed app requests. This is a common technique, e.g. Twitter's Zipkin (what we use), Heroku's X-Request-ID, etc.
Query names. We want to be able to analyze the performance of our queries, but using the full query body would be untenable. So we send along an X-Query-Name header with every query that's a short, human-readable name for this query (e.g. User_getByEmail), which we then log.

[Screenshot: example of the last case]

So it'd be really valuable to be able to pass custom headers along to Cypher requests.

WDYT of supporting these alternate signatures then?

var cypher = require('cypher-stream')({
    url: 'http://localhost:7474',
    headers: {...}  // default headers, if you want to set any; e.g. User-Agent
});

cypher({
    query: 'match (user:User {email: {email}}) return user',
    params: {email: '[email protected]'},
    headers: {...}  // e.g. X-Query-Name
})
  .on('data', function (result){
    console.log(result.user.first_name);
  })
  .on('end', function() {
    console.log('all done');
  })
;
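
For the proposal above, default headers and per-query headers would need to be combined, with the per-query values winning. A minimal sketch of that merge follows; mergeHeaders is a hypothetical helper, and lowercasing the names is a design choice made here because HTTP header names are case-insensitive.

```javascript
// Hypothetical helper: combine default headers with per-request headers.
// Names are lowercased so 'User-Agent' and 'user-agent' do not both end
// up in the result; later sources override earlier ones.
function mergeHeaders(defaults, overrides) {
  var merged = {};
  [defaults, overrides].forEach(function (headers) {
    Object.keys(headers || {}).forEach(function (name) {
      merged[name.toLowerCase()] = headers[name];
    });
  });
  return merged;
}

var headers = mergeHeaders(
  { 'User-Agent': 'cypher-stream/0.2.1' },
  { 'X-Query-Name': 'User_getByEmail' }
);
console.log(headers);
```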

Authentication

Hi,
Can you tell me, how can I pass authentication details using cypher-stream?
For example, if I have to use neo4j driver I could do something like
neo4j.driver("bolt://localhost", neo4j.auth.basic("neo4j", "neo4j"));

Thanks,

Test case suite fails on Neo4j v 2.3

I am getting an error with Neo4j 2.3 and Node 4.2.2:

[email protected] test /Users/redpanda/work/github/cypher-stream
make test

./node_modules/.bin/mocha -b
(node) child_process: options.customFds option is deprecated. Use options.stdio instead.

․․

1 passing (656ms)
1 failing

Cypher stream handles errors:

  expected 'AssertionError
  + expected - actual

  +"Error: Query Failure: Invalid input 'i': expected <init> (line 1, column 1)\n\"invalid query\"\n ^"
  -"AssertionError: expected 'Error: Query Failure: Invalid input \\'i\\': expected <init> (line 1, column 1 (offset: 0))\\n\"invalid query\"\\n ^' to be 'Error: Query Failure: Invalid input \\'i\\': expected <init> (line 1, column 1)\\n\"invalid query\"\\n ^'"

  at Assertion.prop.(anonymous function) [as exactly] (/Users/redpanda/work/github/cypher-stream/node_modules/should/lib/should.js:60:14)
  at CypherStream.<anonymous> (/Users/redpanda/work/github/cypher-stream/test/cypher-stream-test.js:50:30)
  at emitOne (events.js:77:13)
  at CypherStream.emit (events.js:169:7)
  at CypherStreamHandleFailure (/Users/redpanda/work/github/cypher-stream/CypherStream.js:182:12)
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at Object.emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2323:33
  at apply (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:171:14)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2300:10
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at emitMatchingNode (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2104:7)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:2149:13
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at nodeClosed (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1494:7)
  at /Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1048:19
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at handleData (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:761:14)
  at applyEach (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:496:20)
  at Object.emit (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1929:10)
  at IncomingMessage.<anonymous> (/Users/redpanda/work/github/cypher-stream/node_modules/oboe/dist/oboe-node.js:1128:34)
  at emitOne (events.js:77:13)
  at IncomingMessage.emit (events.js:169:7)
  at IncomingMessage.Readable.read (_stream_readable.js:360:10)
  at flow (_stream_readable.js:743:26)
  at resume_ (_stream_readable.js:723:3)
  at doNTCallback2 (node.js:441:9)
  at process._tickCallback (node.js:355:17)

make: *** [test] Error 1
npm ERR! Test failed. See above for more details.
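
The failure above comes from comparing the full error message, whose position suffix differs between server versions ("(line 1, column 1)" versus "(line 1, column 1 (offset: 0))"). One way to make such a test less brittle is to match only the stable part of the message. The sketch below uses the two message variants from the output above; isInitSyntaxError is a hypothetical helper name.

```javascript
// The two message variants seen in the failing test output.
var withoutOffset =
  "Error: Query Failure: Invalid input 'i': expected <init> (line 1, column 1)\n\"invalid query\"\n ^";
var withOffset =
  "Error: Query Failure: Invalid input 'i': expected <init> (line 1, column 1 (offset: 0))\n\"invalid query\"\n ^";

// Match the stable prefix only, ignoring the optional offset suffix.
var stablePart = /Invalid input 'i': expected <init> \(line 1, column 1\b/;

// Hypothetical helper for use inside a test's 'error' handler.
function isInitSyntaxError(message) {
  return stablePart.test(message);
}

console.log(isInitSyntaxError(withoutOffset), isInitSyntaxError(withOffset));
```

Asserting on error.code, where the server provides one, avoids depending on message text at all.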

Differentiate results across queries in a transaction

Hey @brian-gates, I'm working on thingdom/node-neo4j#143 now, specifically the design of the transactional API that'd wrap this guy. =)

Thinking about a need / use case we at @fiftythree have in our own app, one thing we'd need is the ability to differentiate results across the various queries that make up a transaction.

AFAICT right now, the current cypher-stream design doesn't differentiate. A transaction is a single stream of results, with data events that don't tell you which query (statement) each result corresponds to.

For example, if the caller doesn't know in advance how many results a query will give, they don't know when the results for one query end and the next begin.

What are your thoughts for how to achieve that?

No operations allowed until you send an INIT message successfully

When running a basic test example, I get this error.

{ [Error: No operations allowed until you send an INIT message successfully.] message: 'No operations allowed until you send an INIT message successfully.', code: 'Neo.ClientError.Request.Invalid' }

var cypher = require('cypher-stream')('bolt://localhost','neo4j','neo4j')

cypher('MATCH (n:User) RETURN n LIMIT 1')
    .on('data', function(result){
        console.log(result)
    })
    .on('error', function(error){
        console.log(error)
    })

Installed using:
npm install [email protected]

Am I missing something really obvious here? I don't see anything that requires anything else. I'm running Neo4j 3.0.4, if that makes a difference.

Use new Cypher endpoint with leaner JSON?

Neo4j 2.0 supports a new "transactional Cypher" endpoint which, importantly, returns leaner JSON: just property data, no more hyperlinks. You're already returning just the data, not extracting native node IDs or similar, so in theory this shouldn't lose any functionality.

http://docs.neo4j.org/chunked/stable/rest-api-transactional.html#rest-api-begin-and-commit-a-transaction-in-one-request

Would you be open to updating to this? I'd be happy to give a pull request a stab if so. The only thing is that I don't have any experience with Oboe, so it might take me some time. =)

One thing to note is that error handling will get a bit more complex now, because errors will no longer result in a different HTTP response.

http://docs.neo4j.org/chunked/stable/rest-api-transactional.html#rest-api-handling-errors

Support Basic HttpAuth

A URL with embedded credentials, like:
http://user:[email protected]:7474/

does not seem to work with oboe.

Can you confirm?
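
As a possible workaround on the caller's side, the credentials can be taken out of the URL and sent as an Authorization header instead. The sketch below uses Node's built-in URL class; basicAuthFromUrl is a hypothetical helper, and the URL in the usage line is a made-up example. Whether the HTTP client in use accepts the resulting headers is not verified here.

```javascript
// Hypothetical helper: split "http://user:pass@host/" into a clean URL
// plus a Basic Authorization header.
function basicAuthFromUrl(rawUrl) {
  var parsed = new URL(rawUrl);
  if (!parsed.username) {
    return { url: rawUrl, headers: {} };
  }
  // username/password are percent-encoded in the URL, so decode them.
  var credentials =
    decodeURIComponent(parsed.username) + ':' + decodeURIComponent(parsed.password);
  var header = 'Basic ' + Buffer.from(credentials).toString('base64');

  // Strip the credentials from the URL itself.
  parsed.username = '';
  parsed.password = '';
  return { url: parsed.toString(), headers: { Authorization: header } };
}

console.log(basicAuthFromUrl('http://user:secret@localhost:7474/'));
```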

Support for REST format

Quick bug first: extractData recursively calls itself if an object has a data property, so this means it'll incorrectly drop legitimate property data if there happens to be a property named data. =)
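
To illustrate the kind of bug described, here is a hypothetical reconstruction, not the library's actual source: a function that unwraps any object with a data property by recursing into it also unwraps a node whose own properties include one named data.

```javascript
// Hypothetical reconstruction of the described behavior (NOT the real
// cypher-stream code): unwrap objects that have a 'data' property.
function extractData(item) {
  if (item && typeof item === 'object' && 'data' in item) {
    return extractData(item.data); // recursion is where the bug arises
  }
  return item;
}

// A REST-style node wrapper: properties live under 'data'.
var ordinaryNode = { self: 'node/1', data: { name: 'Bob' } };
console.log(extractData(ordinaryNode)); // the node's properties

// A node that legitimately has a property called 'data': the second
// recursion step discards the other properties.
var trickyNode = { self: 'node/2', data: { name: 'Bob', data: 'payload' } };
console.log(extractData(trickyNode)); // only the inner 'data' value survives
```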

But wait! That bug's not worth fixing, because (I think) you don't even need that function anymore, since the transactional endpoint returns just property data by default now. So you should just remove it.

But on that note, I'd like to request an option to return the REST format instead of the lean property-only format. Having node and relationship metadata is needed for ORM/OGM-type libraries.

Easy enough to request it:

http://neo4j.com/docs/stable/rest-api-transactional.html#rest-api-execute-statements-in-an-open-transaction-in-rest-format-for-the-return

This could potentially be just another option to add to #10's options-based API, e.g. format.

WDYT? SGTY? ORLY?

Documentation error

Hello,
The username/password have to be passed in a single hash like { username: username, password: password }, not 2 separate arguments like it says in the documentation.

Properly tagged versions

I'd recommend tagging your final versions, which allows npm and other package managers to distinguish semver updates.
