
Comments (12)

mikeal commented on July 3, 2024

Prior art by @jeresig : http://ejohn.org/blog/node-js-stream-playground/

from docs.

maxogden commented on July 3, 2024

modules you should use if you care about error handling (aka "why i dont use .pipe in production"):

other ones I use all the time

good streaming data format modules

remy commented on July 3, 2024

So, what Jan was alluding to (re: twitter) is that I built a project that initially blew up node (due to buffering so much data in memory) so I moved to streams.

A quick list (off the top of my head) of the main pain points of working with streams (though I had it working in the end):

  • There's no real information on when a stream starts or why it might stop
  • Large streams with lots of through points can randomly stop (I was supposed to be processing 1/2 million objects, but my streaming code would just randomly stop at ~120,000 objects)
  • I'd regularly see streams bail halfway through, but I'd never see an actual error (and I wasn't entirely sure where I was supposed to put an error handler)
  • Telling whether the stream was no longer readable because it hadn't started or because it had finished wasn't clear (there's a readable property, but I couldn't find the documentation that went with it, and I'm wary of relying on guesswork)
  • I wrote a peek method to look at the first object in a stream, but it would frequently totally bork my stream - I had tests working, but not the production code (wrapped up in promises - obviously bespoke to me)

I used through a lot/exclusively, and found it the simplest way of working.

You can see the project source code that I ended up with here: https://github.com/eHealthAfrica/universal-exporter/

The big issue I kept hitting is that all the examples and demos around streaming were for extremely small datasets, which is fine, but this isn't what streaming solves. Streaming is a perfect match for very large datasets. Buffered programming is less taxing on the brain, so if I'm committing to the new/better way of framing the problem around streams, I need to have examples that are dealing with equally large datasets (and yes, stdin is infinite, but you rarely push a gig or more through it during a demo).
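To make the large-dataset point concrete, here is a minimal sketch that pushes half a million objects through a small pipeline. It uses only Node core (`stream.pipeline` and `Readable.from`) rather than any module mentioned in this thread. The `records` generator, the `keepEven` filter and the `countRecords` helper are hypothetical names invented for illustration. Objects are produced lazily, so the full dataset is never held in memory at once.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Hypothetical source: lazily generates `total` objects, one at a time.
function* records(total) {
  for (let i = 0; i < total; i++) yield { id: i, value: i * 2 };
}

// Streams `total` objects through one transform and counts what reaches the sink.
function countRecords(total, done) {
  let seen = 0;

  const keepEven = new Transform({
    objectMode: true,
    transform(record, _encoding, callback) {
      // Pass records with an even id downstream; drop the rest.
      if (record.id % 2 === 0) callback(null, record);
      else callback();
    }
  });

  const sink = new Writable({
    objectMode: true,
    write(_record, _encoding, callback) {
      seen++;
      callback();
    }
  });

  // pipeline reports the first error from any stage, or no error on success.
  pipeline(Readable.from(records(total)), keepEven, sink, (err) => done(err, seen));
}

countRecords(500000, (err, seen) => {
  if (err) throw err;
  console.log('records that reached the sink:', seen);
});
```

Swapping the generator for a real source (a database cursor or a file parser, say) should leave the rest of the pipeline unchanged, which is the property small demos tend to hide.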

maxogden commented on July 3, 2024

@remy great list! here's some random feedback

  • for peeking we use https://www.npmjs.com/package/peek-stream
  • definitely use through2 over through
  • for error handling, always use pump with a callback and .destroy
  • on debugging, I started this discussion rvagg/through2#33 but it is not resolved yet. the current idea is to monkeypatch the Stream base constructor to insert debug info but nobody has done this

in general I think debugging sucks (lack of stream specific debugging tools), but on perf and error handling I am happy
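As a rough illustration of the error-handling advice above, the sketch below uses `stream.pipeline` from Node core, which plays a similar role to pump: it takes a final callback and destroys the streams in the chain when one of them fails. This is a swapped-in core API, not pump itself, and `runPipeline`, `check` and the sample records are hypothetical names made up for the example.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Runs a three-stage pipeline whose middle stage fails on the record with id `failAt`.
function runPipeline(failAt, done) {
  const source = Readable.from([{ id: 1 }, { id: 2 }, { id: 3 }]);

  const check = new Transform({
    objectMode: true,
    transform(record, _encoding, callback) {
      if (record.id === failAt) callback(new Error('bad record ' + record.id));
      else callback(null, record);
    }
  });

  const received = [];
  const sink = new Writable({
    objectMode: true,
    write(record, _encoding, callback) {
      received.push(record.id);
      callback();
    }
  });

  // One place to handle failure: the callback gets the error from whichever stage broke.
  pipeline(source, check, sink, (err) => done(err, received));
}

runPipeline(2, (err) => {
  if (err) console.error('pipeline failed:', err.message);
});
```

With plain `.pipe()` the same error would only surface on the `check` stream, and the other streams would be left open, which is one way a pipeline can appear to stop silently.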

remy commented on July 3, 2024

@maxogden I did look at that peek-stream module and it made no sense to me. It only seemed to solve the specific problem of redirecting the stream. I read through the code several times before - trust me, I definitely did not want to re-invent anything!

Re: through2 vs through - why? When I asked, through2 just seemed to give more control/options and thus complexity.

What I'd love to see: a visualisation of stream pipelines, and some way of injecting a test object into the stream to watch it mutate through the stream(s). ...but I can dream on! :)
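A very rough approximation of the "inject a test object and watch it mutate" idea can be sketched with core streams alone, assuming each stage can be expressed as a function from object to object. The helpers `tap`, `mapStage` and `trace` are hypothetical and not part of any module mentioned here. Each `tap` records a JSON snapshot of the object as it passes between stages.

```javascript
const { Readable, Transform, Writable, pipeline } = require('stream');

// Hypothetical helper: a pass-through stage that records a copy of every object it sees.
function tap(label, log) {
  return new Transform({
    objectMode: true,
    transform(obj, _encoding, callback) {
      log.push({ label, snapshot: JSON.parse(JSON.stringify(obj)) });
      callback(null, obj);
    }
  });
}

// Hypothetical helper: wraps a per-object function as an object-mode transform.
function mapStage(fn) {
  return new Transform({
    objectMode: true,
    transform(obj, _encoding, callback) {
      callback(null, fn(obj));
    }
  });
}

// Sends one test object through the named stages and collects a snapshot after each.
function trace(testObject, stages, done) {
  const log = [];
  const chain = [Readable.from([testObject]), tap('input', log)];
  stages.forEach(([name, fn]) => chain.push(mapStage(fn), tap('after ' + name, log)));
  // Discard the final output; only the recorded snapshots matter here.
  chain.push(new Writable({ objectMode: true, write(_obj, _encoding, callback) { callback(); } }));
  pipeline(...chain, (err) => done(err, log));
}

trace(
  { name: 'ada' },
  [
    ['upper', (o) => ({ ...o, name: o.name.toUpperCase() })],
    ['tag', (o) => ({ ...o, tagged: true })]
  ],
  (err, log) => {
    if (err) throw err;
    log.forEach((entry) => console.log(entry.label, entry.snapshot));
  }
);
```

This only covers stages that map one object to one object; stages that buffer, split or drop objects would need a richer trace.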

maxogden commented on July 3, 2024

@remy ah ok, sorry about that. I'll try and improve the docs. To use it to only peek at the value you could treat it like a transform stream, e.g.

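var fs = require('fs')
var peek = require('peek-stream')
var pump = require('pump')

var read = fs.createReadStream('orig.txt')
var write = fs.createWriteStream('copy.txt')
var peeker = peek(function (data, swap) {
  console.log(data) // log the peeked data
  swap(null, write) // immediately swap; the first argument is an error, if any
})

pump(read, peeker, function (err) {
  if (err) console.error('copy failed', err)
})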

re: through vs through2, all the reasons are related to streams2, the tl;dr of which is that backpressure works differently, and better, in streams2 and above, so it's important to use streams2 modules with other streams2 modules to make sure things get buffered correctly etc
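For anyone unsure what backpressure looks like in practice, here is a minimal sketch using a core `Writable` directly (through2 is not needed to show it). In streams2 and later, `write()` returns `false` once the internal buffer reaches `highWaterMark`, and the producer is expected to wait for `'drain'` before writing more. `makeSlowSink` and `writeAll` are hypothetical helpers, and the tiny `highWaterMark` of 2 is chosen only to make the pauses easy to observe.

```javascript
const { Writable } = require('stream');

// A deliberately slow sink with a tiny buffer (highWaterMark of 2 objects).
function makeSlowSink(stored) {
  return new Writable({
    objectMode: true,
    highWaterMark: 2,
    write(item, _encoding, callback) {
      stored.push(item);
      setImmediate(callback); // simulate slow I/O
    }
  });
}

// Writes 0..count-1, pausing whenever write() reports a full buffer.
function writeAll(sink, count, done) {
  let i = 0;
  let pauses = 0;
  sink.on('finish', () => done(null, pauses));
  sink.on('error', done);

  (function next() {
    while (i < count) {
      const ok = sink.write(i++);
      if (!ok) {
        pauses++;
        sink.once('drain', next); // resume only after the buffer has emptied
        return;
      }
    }
    sink.end();
  })();
}

const stored = [];
writeAll(makeSlowSink(stored), 10, (err, pauses) => {
  if (err) throw err;
  console.log('items written:', stored.length, 'pauses for drain:', pauses);
});
```

`.pipe()`, pump and `stream.pipeline` do this waiting for you, which is why mixing in a stage that ignores the signal can lead to unbounded buffering.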

jeresig commented on July 3, 2024

👍 to good stream tutorials! Feel free to do what you'd like with my playground, as well: https://github.com/jeresig/node-stream-playground

If you'd like to host it (!) that'd be super-appreciated, as well. Would be happy to transfer the domain name and everything. Let me know how I can help!

maxogden commented on July 3, 2024

@jeresig oooh maybe @finnp would be interested in collaborating there, he has been working on http://www.finnpauls.de/streams-editor/

Qard commented on July 3, 2024

👍 Streams are rather confusing to new devs. There's a lot of magic in there.

We've discussed having separate guides for using streams and for implementing streams, along with a topic page on understanding stream internals. PRs are of course welcome. Do you think that set of docs pages could cover everything well?

maxogden commented on July 3, 2024

The biggest antipattern I've seen in 'implementing streams' education is where people (both teachers and learners) think they have to know all aspects of streams1, 2 and 3, exhaustively list all caveats and properties, methods etc. In reality you don't need to know 80% of that stuff to get started implementing streams, and trying to learn it all at once always fails and causes people to give up. Also I don't think people should require('stream') personally to implement streams (it's just too hard to learn), they should use the abstractions that already exist.

kharandziuk commented on July 3, 2024

I wrote a short article about a real use case for streams in Node.js: https://hackernoon.com/node-js-streams-in-action-1495c22fafec
Maybe somebody will find it useful.

Trott commented on July 3, 2024

Closing as this repository is dormant and likely to be archived soon. If this is still an issue, feel free to open it as an issue on the main node repository.
