
How to handle ws::ErrorKind::Queue? (ws-rs, closed)

nickhs commented on August 26, 2024
How to handle ws::ErrorKind::Queue?


Comments (5)

housleyjk commented on August 26, 2024

In order to avoid overfilling the queue, you need to set the queue size high enough. In this case, you need Settings::max_connections. The event queue is configured to be 5 times the number of allowed connections. The default is 100, so 500 events at a time. You can increase the number of max connections up to usize::max_value() / 5 (which is 3689348814741910323 on 64-bit systems; if you anticipate handling that many simultaneous websockets, you will need more than one instance of WS-RS 😄).

The bench-server example shows how to increase the maximum number of allowed connections. In that example, max_connections is set to 10,000, which gives a queue size of 50,000.

In my experience, 10_000 is a nice number for a production environment, but you can go higher if you need to.
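
For reference, a minimal sketch of what raising the limit might look like through the builder, modeled on the bench-server example (the address and the echo handler here are just placeholders):

    use ws::{Builder, Sender, Settings};

    fn main() -> ws::Result<()> {
        // Allow up to 10_000 connections; the event queue is sized relative
        // to this (5x at the time of this thread), so roughly 50_000 events.
        let settings = Settings {
            max_connections: 10_000,
            ..Settings::default()
        };

        let ws = Builder::new()
            .with_settings(settings)
            .build(|out: Sender| {
                // Placeholder echo handler
                move |msg| out.send(msg)
            })?;

        ws.listen("127.0.0.1:3012")?;
        Ok(())
    }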


nickhs commented on August 26, 2024

Thanks Jason! That totally makes sense. I guess what I'm asking for here is: instead of making the queue size "large enough hopefully", is there some sort of feedback mechanism I can get from the sender that tells me when the queue is about to be full, so I can try again later? Something like an EAGAIN?


housleyjk commented on August 26, 2024

Simply encountering a full queue shouldn't bring down a connection. The queue error exists to inform the caller that the queue is full, so that they can try the operation again later. Sleeping or blocking or trying to schedule a timeout all won't help because to consume events, we need to return to the event loop immediately. Luckily, this should never happen if max_connections is set to a reasonably high value.

Anyway, it was never intended behavior for a queue error to result in a connection disconnecting, so this is a bug. The fix I propose is to delegate handling queue errors to handlers and also add a setting that lets people increase the per-connection queue size without needing to accept more connections. You should be able to do something like max_connections=1, queue_size=7000 if you know you will only ever have one connection but want to queue a bunch of messages all at once.
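
A sketch of what that configuration might look like once the proposed setting lands (assuming queue_size ends up as a field on Settings, as described above):

    use ws::Settings;

    // One connection, but enough per-connection queue room to buffer
    // 7_000 queued messages before send returns ErrorKind::Queue.
    fn replay_settings() -> Settings {
        Settings {
            max_connections: 1,
            queue_size: 7_000,
            ..Settings::default()
        }
    }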

You can handle the queue error yourself with something like this even now:

    fn replay_lines(&mut self, count: usize) {
        let mut buf = Vec::new();
        for _ in 0..count {
            // Clear the buffer so each send contains only the current record
            buf.clear();
            let res = self.fh.read_until(self.sess_args.delim, &mut buf);
            let bytes_read = res.expect("cannot read file");
            if bytes_read == 0 {
                // End of file: stop replaying
                self.active = false;
                break;
            }
            // Note, this ignores other errors that may occur; if you want to handle more than
            // just a queue error, you will want to match on the error and handle each case
            if let Err(ws::Error { kind: ws::ErrorKind::Queue(_), .. }) = self.sender.send(&buf[..]) {
                // Since we are reading a file, our position is already tracked, so we don't need to
                // store our unfinished work anywhere else. Just break and wait for the timeout to
                // occur to read the next section
                println!("Waiting for queue to clear before queuing next message.");
                break;
            }
        }
    }
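
To expand on the comment about matching on the error, here is a hedged sketch of handling more than just the queue case; the send_or_defer helper and its payload are placeholders, not part of the code above:

    fn send_or_defer(sender: &ws::Sender, payload: &[u8]) -> bool {
        // Returns true if the caller should stop and retry on the next timeout.
        match sender.send(payload) {
            Ok(()) => false,
            Err(ws::Error { kind: ws::ErrorKind::Queue(_), .. }) => {
                // The event loop queue is full; back off and let it drain.
                println!("Waiting for queue to clear before queuing next message.");
                true
            }
            Err(err) => {
                // Any other error (Io, Capacity, etc.) is likely fatal for this
                // connection; surface it rather than silently dropping it.
                println!("send failed: {:?}", err);
                true
            }
        }
    }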

But I did test this out, and the connection is disconnected because of the queue failure at another point, which is a defect the new changes will fix.

I noticed in your code that it's the UpfrontPlayback setting that is exhausting the queue. For example, if you have an UpfrontPlayback::Count(_) approaching 500, then line 88 of your code will hit the queue limit before any messages have a chance to be sent or received.

It's up to you of course, but my advice is that you should set the new queue_size setting to be greater than the count or perc setting passed into your cli. That by itself will prevent this issue. If this code goes into production, you should probably also not use expect anywhere while listen is running because a panic will drop all connections, whereas simply encountering an error due to hitting the max_connections limit or the event loop queue limit will only ever terminate the offending connection.
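
On the expect point, a possible sketch of reading without panicking; the read_record helper and its arguments are hypothetical, not taken from the issue's code:

    use std::fs::File;
    use std::io::{BufRead, BufReader};

    // Hypothetical helper: read one delimited record without panicking.
    // An I/O failure becomes a ws::Error for this connection instead of a
    // panic that would take the whole event loop (and every connection) down.
    fn read_record(reader: &mut BufReader<File>, delim: u8, buf: &mut Vec<u8>) -> ws::Result<usize> {
        reader
            .read_until(delim, buf)
            .map_err(|e| ws::Error::new(ws::ErrorKind::Io(e), "cannot read replay file"))
    }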


nickhs commented on August 26, 2024

Simply encountering a full queue shouldn't bring down a connection.

Okay, I agree. Does that mean I'm misinterpreting https://ws-rs.org/api_docs/src/ws/src/result.rs.html#44 when it states that the connection will disconnect? Or is that documentation wrong in light of the bug?

Sleeping or blocking or trying to schedule a timeout all won't help because to consume events, we need to return to the event loop immediately.

Does mio have any sort of provision for a yield type call? Or is returning my only option?

But I did test this out, and the connection is disconnected because of the queue failure at another point, which is a defect the new changes will fix.

That's what I saw too - even if I did a return on receiving a queue full error I'd still see a disconnect. Glad to know I wasn't going insane 😄 .

It's up to you of course, but my advice is that you should set the new queue_size setting to be greater than the count or perc setting passed into your cli

Yeap, seems like an obvious win. I'd have to do some pre-processing in the percentage case (determine the total number of lines in the file to figure out the message count) but nothing horrid.

As an aside (and this is my naivety), where is the 5 * max_connections buffer created? I was unable to find it after grepping around, aside from a reference in src/io.rs.

Thanks for your time! (and apologies again if I'm being dumb)


housleyjk commented on August 26, 2024

You are not dumb. This was definitely a weakness in the documentation and the implementation.

Thank you for catching that in result.rs. It used to be true, but it's not now that we let the handler decide what to do in response to a full queue. I will update the docs.

Sadly, there is no yield in vanilla mio and no yield yet in Rust. Maybe someday, but for now, the only way to get the queue cleared is to return from the function.

The size of the queue is set here or here. If you look at those lines in master, you will see the old 5.

Thank you for pointing this stuff out!
