Coder Social home page Coder Social logo

Comments (8)

Lukasa avatar Lukasa commented on June 1, 2024

The backpressure handler addresses one real case. The lens through which it makes sense is to conceptualise of reading data as “accepting more work”, and writing data as “completing work”. The backpressure handler addresses the following attack vector (illustrative python code):

sock, _= socket.create_connection((host, port))
while True:
    sock.send(SOME_HTTP_REQUEST)

This attack can cause an application to accept wildly too much work, buffering the output data that the other peer is not reading.

This attack is a trivial and common DoS against naive async servers, and is not resolved by applying backpressure from the DB: applying that backpressure only slows the attack down, but can still lead to unbounded memory consumption.

However, you are very right that the BackpressureHandler is not a one-size fits all solution to Backpressure, and we should certainly make it clear in the handler description that it is only one part of a serious backpressure solution.

from swift-nio.

helje5 avatar helje5 commented on June 1, 2024

@Lukasa Please re-read what I wrote. My complaint is that the current BP handler specifically does not address the attack vector you describe. I'm going to bring down any real world HTTP endpoint using NIO which relies on the BP handler within minutes (it is naive in precisely the way you describe) :-)

The core of your misconception is that you consider

writing data (at the same volume to the same socket!) as “completing work”.

This is almost never the case (also notice that the amount of writing needs to be balanced w/ the input to make it work).
As I explain in the issue, input data rarely (to a degree of "never") means output data on the same socket. There are three scenarios to consider here:

  1. a pure transforming endpoint. E.g. you send JSON to the service, and it returns you the same data as XML
  2. pretty much any common TCP protocol (HTTP just for illustration): 2a) big POST with a lot of data, return an OK (big=>small), OR 2b) send a request to return a lot of data (small=>big).
  3. bidirectional services. There is both, a lot of read and a lot of write traffic, but they do not correlate. (e.g. a chat connection)

The BP handler only addresses 1). Apart from maybe leftpad.io (and echo) I can see 0 applications of this BP handler. (which is why I think it should be hidden somewhere deep and annotated w/ a lot of warnings to use it).

Makes sense? (I admit I may have a flaw in my thinking still, but if so I can't see it yet :-) )

P.S.: You may think that this might address 2b), but it doesn't either, because the 2b) attack doesn't produce a lot of input (there is no second 'read' which would stop the processing). (in that scenario the BP of the output needs to be attached to the producers output, e.g. a database channel)

P.S.2: "applying that backpressure only slows the attack down" - no, applying that back pressure ensures the system is not overloaded w/ I/O at any time. You never read (or write) more than you can process - which is what BP means.

from swift-nio.

Lukasa avatar Lukasa commented on June 1, 2024

My complaint is that the current BP handler specifically does not address the attack vector you describe

Sure it does, but only if you read my code carefully. 😉 I am very deliberately producing well-formed requests for work: I'm not shoving an infinitely large request body down the pipeline, which will be subject to your notes, I am just sending well-formed short web requests and never reading the bodies. This is related to, but not the same as, sending request bodies of unbounded size.

I should note as well that volume is irrelevant. The BP handler doesn't confirm that there are the same amount of bytes going in and out, only that if we produce output bytes they have somewhere to go. The assumption it makes is that if we cannot send our output data (because of either network backlog or the client refusing to read it), we should not keep reading from that socket.

P.S.2: "applying that backpressure only slows the attack down" - no, applying that back pressure ensures the system is not overloaded w/ I/O at any time. You never read (or write) more than you can process - which is what BP means.

This demonstrates that I don't think you've understood the attack to which I refer.

Suppose you propagate back pressure based on database transactions: you will only allow 1 concurrent DB transaction at a time, so while your channel handler waits to obtain the DB transaction lock it stops reading from the socket (so as to not process further requests). Eventually, my request will be served and your server will generate a response, at which time it will hand back the DB transaction: the work is done, no need to prevent other requests being served.

Based on this, I can construct a malicious client that pipelines one million HTTP requests down that connection and never reads from its socket. Now, each requests will only be served once at a time, and I'll wait a lot for you to obtain your DB transaction. But the request is good, so each time you'll serve the request, generate the response, write it to the pipeline and release the transaction lock.

Because my client is never reading from its socket, though, what happens? Well, first your TCP send buffer in the kernel fills up. Now our socket isn't marked writable, so we start queueing your response data in the Channel. Eventually, the Channel is no longer marked writable.

At this point the BackpressureHandler says "ok, enough, until this send backlog shrinks we are not reading any more data from the connection" and my attack is foiled. If you do not have the back pressure handler, you will keep reading my requests and keep buffering ever more response data, until eventually the pipeline falls over.

You see what I mean? Yes, of course you need to propagate back pressure from the DB. But you also need to propagate backpressure from the network. The key thing is that back pressure is a complex, multifaceted thing: you need many sources of back pressure, all working together, to ensure that the system never overloads. This is why we added the HTTPServerPipelineHandler as well, which actually does address your DB back pressure concern.

But the BP handler as it stands today is not useless. It's just one part of a complete system.

from swift-nio.

helje5 avatar helje5 commented on June 1, 2024

Sure it does, but only if you read my code carefully. 😉

Thanks, I read it carefully. What is wrong in my analysis of it? It associates readability with writability on the same socket. Anything I missed?

You are drifting off and mixin in other topics. As part of this issue I'm solely concerned about the topic "back pressure" and the (non)usefulness of the existing BackPressureHandler.

I'm not trying to consider topics like "how do you properly implement BP" (yes, only waiting on the DB is also wrong) or "how to protect against arbitrary" DoS attacks (limiters are a separate topic). I'm happy to discuss those as well, but this is not what this issue is about at all.

Stripping down the DB example you reiterate over:

Because my client is never reading from its socket ... Well, first your TCP send buffer in the kernel fills up

The core point I'm trying to explain is: it doesn't! As outlined the usual response will be a simple 200 OK at the end.
This is why the write volume as extremely relevant. It need to be close to your read volume to make this BP even kick in. Which is extremely rare in real world endpoints. (see scenarios 1,2,3 above)

There rarely is a mapping between read volume and write volume (almost never). Those are in all but exceptional situations disconnected aspects of an endpoint.

But the BP handler as it stands today is not useless. It's just one part of a complete system.

OK, lets agree that we disagree. In my opinion it is not only useless but very dangerous (w/ the current class name and w/o detailed instructions what it actually does [more importantly NOT does]). Because it doesn't do what you assume it does: Managing backpressure :-) ¯\(ツ)

from swift-nio.

Lukasa avatar Lukasa commented on June 1, 2024

As outlined the usual response will be a simple 200 OK at the end.

Sure, per request. But if the client doesn't read from their socket, where do you think the bytes end up? They don't magically vanish. The client's receive buffer initially fills up, then the server's send buffer, then NIO itself has to start buffering it. That's what I'm getting at here.

from swift-nio.

helje5 avatar helje5 commented on June 1, 2024

But if the client doesn't read from their socket, where do you think the bytes end up?

If the OS receive buffer is full, TCP flow control will take over and block the client, whose write buffer will fill up and eventually stop sending. We call that back pressure ;-)

from swift-nio.

Lukasa avatar Lukasa commented on June 1, 2024

If the OS receive buffer is full, TCP flow control will take over and block the client

That is not how TCP flow control works. TCP is fully-duplex: being unable to send in one direction does not block sending in the other direction.

If the client's receive buffer is full, it prevents the server from sending, but so long as the server's is not full the client can continue to send. That is specifically what the BackpressureHandler addresses: it performs the function you are (incorrectly) saying the OS performs.

from swift-nio.

helje5 avatar helje5 commented on June 1, 2024

Man, you are really stressing it ... I mean, I already closed the issue in a "lets agree that we disagree" ;-)

TCP flow works exactly how I think it does. I'm TCP flowing for like 25 years now ;-)
You asked:

But if the client doesn't read from their socket, where do you think the bytes end up?

There is nothing duplex in that. If one end doesn't read what the other writes - and the reverse, the other end gets blocked. This is plain back pressure and it works like it always worked. The same happens in Unix pipe back pressure and so on (you know cat /tmp/hugefile | sed s#x#y#g | grep murks).
The only point where it doesn't happen like that, is the BP handler of NIO ;-) Why is that, because you insist to tie input pressure to output pressure, which is a correlation that just doesn't usually exist. Those are separate channels and need separate handling.

But seriously, lets leave it alone. If you think it is useful and won't hurt - your framework your choice. I already closed the issue. Lets just hope that people using NIO properly understand the issue (🤣), there is no chance that this can end in a bad way!

from swift-nio.

Related Issues (20)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.