
High CPU usage (reverb) · OPEN · 23 comments

bbashy commented on July 22, 2024
High CPU usage

from reverb.

Comments (23)

bbashy commented on July 22, 2024

I've set REVERB_APP_PING_INTERVAL=3600 and restarted Reverb. It's back to 0-1% CPU, and I'll keep monitoring the process.

It must be something to do with the volume of pings combined with connections subscribing/unsubscribing.

I use it as an online list and it's included on most pages, so every page change means a disconnect/connect.
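
For a sense of scale, the effect of raising the ping interval can be estimated with simple arithmetic (a back-of-envelope sketch; the one-ping-per-connection-per-interval assumption is mine, not taken from Reverb's source):

```javascript
// Rough estimate of server-originated ping traffic: each connection
// is pinged once per REVERB_APP_PING_INTERVAL seconds (assumption).
function pingsPerMinute(connections, pingIntervalSeconds) {
  return connections * (60 / pingIntervalSeconds);
}

// ~180 concurrent users at the default 60s interval vs. 3600s:
console.log(pingsPerMinute(180, 60));   // 180 pings/minute
console.log(pingsPerMinute(180, 3600)); // 3 pings/minute
```

So the change from 60s to 3600s cuts the ping volume by a factor of 60, which fits the observed drop back to 0-1% CPU if pings are a contributing factor.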

clue commented on July 22, 2024

@bbashy Sudden CPU spikes are definitely not to be expected. If you've encountered the exact same error in both the original Laravel Websockets package and now Reverb, this might in fact indicate a problem in the underlying Ratchet/ReactPHP components. Similar reports have been posted in the past and we have some ideas about what could cause this, but we've never been able to reproduce it locally; see ratchetphp/Ratchet#939 (comment) / ratchetphp/Ratchet#939 (comment). If this matches your issue and you can reproduce it locally, please reach out and I'm happy to take a look.

Alex0Fury commented on July 22, 2024

@bbashy
I faced a similar problem. Supervisor runs inside a Docker container, and the Reverb process is started there. The process was loaded at 100% with 500+ simultaneous connections.

The solution turned out to be installing an event loop extension, as described in the Event Loop section of the documentation.

Dockerfile

FROM php:8.3.3-fpm-bullseye
#....
#....
RUN apt-get update && apt-get install -y openssl
RUN docker-php-ext-install sockets
RUN pecl channel-update pecl.php.net
RUN apt-get update && apt-get install -y libevent-dev

RUN printf "\n\n\n\n\nno\nyes\n\n" | pecl install event
RUN docker-php-ext-enable event

I also raised the open file limit on the host system: ulimit -n 32766

In the Supervisor settings (supervisor.conf):

[supervisord]
nodaemon=true
chmod=0770
logfile=/var/log/supervisor/supervisord.log
pidfile=/var/run/supervisord.pid
minfds=10000

[program:laravel-websocket]
numprocs=8
process_name=%(program_name)s_%(process_num)02d
chmod=0770
command=php /var/www/html/artisan reverb:start
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/www/html/storage/logs/reverb.log

Finally:

(screenshot)

bbashy commented on July 22, 2024

Might have fixed an issue for you, but I believe mine was slightly different.

I have no problems when running it under Soketi on the same machine/ulimit.

joedixon commented on July 22, 2024

This is still on my radar, I just haven't had the chance to take a deep dive on it just yet.

joedixon commented on July 22, 2024

Hey @bbashy, how many messages are you sending across the connections and what sort of server are you running?

Also, are you running Reverb alongside another application or standalone?

bbashy commented on July 22, 2024

Hi.

Using it as a presence channel for online user list. Hit 180 users at certain times. It is used on most pages across the site, so people will be leaving/joining on page change. People visit a few pages every few seconds.

Using a Hetzner cloud server AMD EPYC (8c 16GB).

No other applications; this server is just for this site. I was using Soketi (no CPU issues), but it doesn't work on Node v20, which is why I switched to Reverb after the Laravel upgrade.

Echo.join('online-users')
    .here((users) => {
        this.onlineUsers = users;
    })
    .joining((user) => {
        this.onlineUsers.push(user);
    })
    .leaving((user) => {
        this.onlineUsers = this.onlineUsers.filter(u => u.id !== user.id);
    });

In channels.php it just returns user data: username, country, link to profile.
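
One thing worth noting about the `.joining` handler above: it pushes unconditionally, so a fast disconnect/reconnect on a page change could briefly leave a duplicate entry in the list. A defensive sketch of the same bookkeeping as plain functions (hypothetical helper names, runnable outside Echo):

```javascript
// Presence-list bookkeeping mirroring Echo's .joining()/.leaving()
// callbacks, but deduplicated by user id so a quick reconnect cannot
// leave the same user in the list twice.
function addMember(list, user) {
  return list.some((u) => u.id === user.id) ? list : [...list, user];
}

function removeMember(list, user) {
  return list.filter((u) => u.id !== user.id);
}

let onlineUsers = [];
onlineUsers = addMember(onlineUsers, { id: 1, name: 'bbashy' });
onlineUsers = addMember(onlineUsers, { id: 1, name: 'bbashy' }); // duplicate join ignored
console.log(onlineUsers.length); // 1
```

This doesn't address the CPU issue itself, but it keeps the rendered list stable while connections churn.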

It's been fine since I deployed last - 24/03/2024 22:00 UTC.

joedixon commented on July 22, 2024

Thanks for the reply. Going to close this for now, but please mention me in this thread if you see the issue again and I will reopen.

bbashy commented on July 22, 2024

@joedixon Started happening again. If there's any debugging I can do, let me know.

(Screenshot: 2024-03-27 at 9:12:33 PM)

joedixon commented on July 22, 2024

@bbashy Would be good to get a better picture of your environment and use case. Would you mind answering the following:

  1. Does the CPU continually increase over time or does it suddenly spike to 100%?
  2. If you leave it, does the CPU eventually drop back down?
  3. Are you using the Pulse integration?
  4. Do you have debug mode enabled?
  5. Are you broadcasting messages (if so, roughly how many) or just using for presence indication?
  6. Are you able to use tinker to run the code samples below to get the current number of connections and channels:
// Connections
app(\Illuminate\Broadcasting\BroadcastManager::class)
  ->getPusher()
  ->get('/connections');

// Channels
app(\Illuminate\Broadcasting\BroadcastManager::class)
  ->getPusher()
  ->get('/channels', ['info' => 'subscription_count']);

bbashy commented on July 22, 2024
  1. I've only noticed it being 1% or 99-100%.
  2. It's still at 100% since my last reply (16 hours ago).
  3. No but I plan to once out of beta. Shall I install?
  4. No. Logs just show "INFO Stopping server on 0.0.0.0:2096. INFO Starting secure server on 0.0.0.0:2096 (example.com)." etc.
  5. We have a 'chat room' but it's hardly used.
  6. See below.
Psy Shell v0.12.2 (PHP 8.2.17 — cli) by Justin Hileman
> app(\Illuminate\Broadcasting\BroadcastManager::class)->getPusher()->get('/connections');
= {#5634
    +"connections": 117,
  }

> app(\Illuminate\Broadcasting\BroadcastManager::class)->getPusher()->get('/channels', ['info' => 'subscription_count']);
= {#5606
    +"channels": {#5612
      +"servers-booked": {#5610
        +"subscription_count": 14,
      },
      +"stats-counter": {#5605
        +"subscription_count": 14,
      },
      +"presence-public-chat": {#5607},
    },
  }

joedixon commented on July 22, 2024

No need to install Pulse, just wanted to rule it out as a factor.

Would be interesting to run with debug mode enabled for a little while (which in itself will increase CPU) to see if anything untoward is happening when the CPU spikes.

bbashy commented on July 22, 2024

Running with debug now, and I'm running a script to gather the CPU % every 2 seconds.

Will report here with data.

bbashy commented on July 22, 2024

It's gone up to 100% again. Here's the CPU usage; the jump is pretty much instant, not gradual. It's been running in debug for 24 hours.

2024-04-01 20:33:46 :: reverb:start --debug [2282066] CPU Usage: 1%
2024-04-01 20:33:48 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:33:50 :: reverb:start --debug [2282066] CPU Usage: 3%
2024-04-01 20:33:52 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:33:54 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:33:56 :: reverb:start --debug [2282066] CPU Usage: 3%
2024-04-01 20:33:58 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:34:00 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:34:02 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:34:04 :: reverb:start --debug [2282066] CPU Usage: 4%
2024-04-01 20:34:06 :: reverb:start --debug [2282066] CPU Usage: 1%
2024-04-01 20:34:08 :: reverb:start --debug [2282066] CPU Usage: 1%
2024-04-01 20:34:10 :: reverb:start --debug [2282066] CPU Usage: 4%
2024-04-01 20:34:12 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:34:14 :: reverb:start --debug [2282066] CPU Usage: 3%
2024-04-01 20:34:16 :: reverb:start --debug [2282066] CPU Usage: 2%
2024-04-01 20:34:18 :: reverb:start --debug [2282066] CPU Usage: 3%
2024-04-01 20:34:20 :: reverb:start --debug [2282066] CPU Usage: 52%
2024-04-01 20:34:22 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:24 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:26 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:28 :: reverb:start --debug [2282066] CPU Usage: 99%
2024-04-01 20:34:30 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:32 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:34 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:36 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:38 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:40 :: reverb:start --debug [2282066] CPU Usage: 99%
2024-04-01 20:34:42 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:44 :: reverb:start --debug [2282066] CPU Usage: 100%
2024-04-01 20:34:46 :: reverb:start --debug [2282066] CPU Usage: 99%
2024-04-01 20:34:48 :: reverb:start --debug [2282066] CPU Usage: 100%

The Reverb logs don't show much, apart from maybe this. It appears regularly throughout the whole log, even when usage was low.

Message Handled ........................................ 285448861.320304079  
Connection Closed ...................................... 880659721.152382869  
Pruning Stale Connections ..................................................  
Pinging Inactive Connections ...............................................  
Connection Pinged ...................................... 373144331.352179112  
Connection Pinged ...................................... 770897394.824165909  
Connection Pinged ...................................... 128326687.479625382  
Connection Pinged ...................................... 238258499.280963052  
Connection Pinged ...................................... 711357617.733355114  
Connection Pinged ...................................... 913581555.461208734  
Connection Pinged ...................................... 750335547.483737098  
Connection Pinged ...................................... 824799629.748502904  
Connection Pinged ...................................... 373757589.569113713  
Connection Pinged ....................................... 88650420.659599424  
Connection Pinged ...................................... 875588527.671792201  
Connection Pinged ...................................... 645841583.993685622  
Connection Pinged ....................................... 69443263.770122343  
Connection Pinged ....................................... 969366386.39638130  
Connection Pinged ....................................... 97541188.835314754  
Connection Pinged ...................................... 336550205.253472927  
Connection Pinged ...................................... 837108525.856830003  
Connection Pinged ...................................... 843470688.187968972  
Connection Pinged ...................................... 903186882.544248154  
Connection Pinged ....................................... 70865328.772779885  

Then, after ~170 lines of that, it logs loads of:

Message Received ....................................... 913581555.461208734  

{
    "event": "pusher:ping",
    "data": []
}

Message Handled ........................................ 913581555.461208734  
Message Received ......................................... 746315846.6165177  

{
    "event": "pusher:ping",
    "data": []
}

joedixon commented on July 22, 2024

@bbashy the logs look normal, but I'm interested to see if the issue is related to the pinging of connections, which Reverb defaults to running every 60 seconds. Can you try increasing the value of REVERB_APP_PING_INTERVAL to see whether that makes any difference? Perhaps start at 1 hour (3600) and increase from there.

joedixon commented on July 22, 2024

Right, but I would say that's pretty standard usage. Perhaps the difference is the online list, as that will cause events to be broadcast to all connected clients on every connection/disconnection. Pusher actually limits this functionality to 100 connections.
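
Because presence events fan out to every subscriber, the write volume grows with churn multiplied by member count, which may explain why the limit exists. A rough model (the two-events-per-page-change assumption is mine, not from the thread):

```javascript
// Writes the server must perform per second on a presence channel:
// each page change = 1 member_removed + 1 member_added event,
// each broadcast to all n subscribed members (assumption).
function presenceMessagesPerSecond(members, pageChangesPerSecond) {
  return pageChangesPerSecond * 2 * members;
}

// 180 members with 5 page changes/second across the site:
console.log(presenceMessagesPerSecond(180, 5)); // 1800 writes/second
```

Since both factors scale with the number of active users, peak-hour load grows roughly quadratically, which would fit CPU spikes appearing only at the busiest times.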

bbashy commented on July 22, 2024

Yep, I saw that, and Soketi limits it too. I increased the limit on Soketi to 500 and the CPU usage was still below 10%, even after a month of uptime.

The online list is only on the homepage, though; other pages only have a counter, but I guess it still updates all clients.

bbashy commented on July 22, 2024

Gone up again. Shall I try a higher ping interval? It's weird that it goes straight from 0% to 100% in 4 seconds.

2024-04-04 19:28:00 :: reverb:start --debug [237336] CPU Usage: 2%
2024-04-04 19:28:02 :: reverb:start --debug [237336] CPU Usage: 0%
2024-04-04 19:28:04 :: reverb:start --debug [237336] CPU Usage: 1%
2024-04-04 19:28:06 :: reverb:start --debug [237336] CPU Usage: 0%
2024-04-04 19:28:08 :: reverb:start --debug [237336] CPU Usage: 1%
2024-04-04 19:28:10 :: reverb:start --debug [237336] CPU Usage: 1%
2024-04-04 19:28:12 :: reverb:start --debug [237336] CPU Usage: 0%
2024-04-04 19:28:14 :: reverb:start --debug [237336] CPU Usage: 1%
2024-04-04 19:28:16 :: reverb:start --debug [237336] CPU Usage: 0%
2024-04-04 19:28:18 :: reverb:start --debug [237336] CPU Usage: 0%
2024-04-04 19:28:20 :: reverb:start --debug [237336] CPU Usage: 87%
2024-04-04 19:28:22 :: reverb:start --debug [237336] CPU Usage: 100%
2024-04-04 19:28:24 :: reverb:start --debug [237336] CPU Usage: 99%
2024-04-04 19:28:26 :: reverb:start --debug [237336] CPU Usage: 100%
2024-04-04 19:28:28 :: reverb:start --debug [237336] CPU Usage: 100%
2024-04-04 19:28:30 :: reverb:start --debug [237336] CPU Usage: 100%

If it's not something others have had and we can't resolve it, I'll look into changing the online user list to something WebSocket-message based.
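
One message-based approach would be to coalesce the join/leave churn and broadcast at most one count update per interval, instead of a presence event to every client on every change. A sketch with a hypothetical `makeCounter` helper (not a Reverb or Echo API):

```javascript
// Coalesce rapid join/leave churn into at most one count broadcast
// per interval. The clock is injectable so the behaviour can be
// demonstrated deterministically.
function makeCounter(broadcast, intervalMs, now = Date.now) {
  let count = 0;
  let lastSent = -Infinity;
  return {
    change(delta) {
      count += delta;
      if (now() - lastSent >= intervalMs) {
        lastSent = now();
        broadcast(count);
      }
    },
  };
}

// Simulated clock:
const sent = [];
let t = 0;
const counter = makeCounter((c) => sent.push(c), 1000, () => t);
counter.change(+1); // first event broadcasts immediately: 1
counter.change(+1); // same instant: suppressed
t = 1000;
counter.change(-1); // interval elapsed: broadcasts 1
console.log(sent); // [1, 1]
```

A production version would also flush the last suppressed value on a timer; this sketch only shows the rate-limiting idea that decouples broadcast volume from connection churn.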

joedixon commented on July 22, 2024

It would be good to try increasing the ping interval to see if it has any impact on when the CPU spikes. Agree it's strange that it ramps up so quickly.

Are you in a position to temporarily disable the online user list so we can categorically determine whether that is actually the issue?

bbashy commented on July 22, 2024

@clue Looks very similar and relates to my use case of a high volume of connect/disconnect events. In theory, all we'd need to do to reproduce it is replicate a lot of users (100+) connecting/disconnecting.
It started looking like that was the cause once I saw the time it shot up: 19:30-20:30 is when my site has the most users on it, and they'll be doing a lot of page changes (disconnect/connect).

Alex0Fury commented on July 22, 2024

finally

Might have fixed an issue for you, but I believe mine was slightly different.

I believe our answers can help other developers deal with this kind of problem, and the repository owners may take some of our practices into account.

Alex0Fury commented on July 22, 2024

This is still on my radar, I just haven't had the chance to take a deep dive on it just yet.

Yesterday I experimented with installing the package via pecl install ev, which similarly saves the situation. In a production environment with 600+ connections, the load on the process has decreased. There is a single process running via artisan reverb:start.

You need to specify extension=ev.so in php.ini

(screenshot)

rawezhcode commented on July 22, 2024

(screenshot)

When the number of connections exceeds 1,000, CPU usage climbs to 100% at a strange rate, causing the process to stop and restart.
This happens on a VPS (8 cores, 16 GB RAM).
