High CPU usage (oomd issue, closed)

hakavlad commented on July 19, 2024
High CPU usage

from oomd.

Comments (8)

hakavlad commented on July 19, 2024

Comparison of userspace killers:

user@user-pc:~$ sudo service earlyoom status && sudo service nohang status && sudo service oomd status
● earlyoom.service - Early OOM Daemon
   Loaded: loaded (/lib/systemd/system/earlyoom.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-08-12 22:06:23 +09; 6min ago
     Docs: man:earlyoom(1)
           https://github.com/rfjakob/earlyoom
 Main PID: 590 (earlyoom)
    Tasks: 1 (limit: 2302)
   Memory: 916.0K
      CPU: 58ms
   CGroup: /system.slice/earlyoom.service
           └─590 /usr/bin/earlyoom -r 60

авг 12 22:06:24 user-pc earlyoom[590]: earlyoom v1.2
авг 12 22:06:24 user-pc earlyoom[590]: mem  total: 1990 MiB, sending SIGTERM at 10 %, SIGKILL at  5 %
авг 12 22:06:24 user-pc earlyoom[590]: swap total:    0 MiB, sending SIGTERM at 10 %, SIGKILL at  5 %
авг 12 22:06:24 user-pc earlyoom[590]: mem avail: 1792 of 1990 MiB (90 %), swap free:    0 of    0 MiB ( 0 %)
авг 12 22:07:24 user-pc earlyoom[590]: mem avail: 1503 of 1990 MiB (75 %), swap free:    0 of    0 MiB ( 0 %)
авг 12 22:08:25 user-pc earlyoom[590]: mem avail: 1393 of 1990 MiB (69 %), swap free: 3981 of 3981 MiB (100 %)
авг 12 22:09:25 user-pc earlyoom[590]: mem avail: 1392 of 1990 MiB (69 %), swap free: 3981 of 3981 MiB (100 %)
авг 12 22:10:25 user-pc earlyoom[590]: mem avail: 1393 of 1990 MiB (69 %), swap free: 3981 of 3981 MiB (100 %)
авг 12 22:11:25 user-pc earlyoom[590]: mem avail: 1393 of 1990 MiB (69 %), swap free: 3981 of 3981 MiB (100 %)
авг 12 22:12:26 user-pc earlyoom[590]: mem avail: 1393 of 1990 MiB (69 %), swap free: 3981 of 3981 MiB (100 %)
● nohang.service - Highly configurable OOM prevention daemon
   Loaded: loaded (/etc/systemd/system/nohang.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-08-12 22:06:24 +09; 6min ago
     Docs: man:nohang(1)
           https://github.com/hakavlad/nohang
 Main PID: 619 (python3)
    Tasks: 1 (limit: 50)
   Memory: 7.8M (max: 100.0M)
      CPU: 259ms
   CGroup: /nohang.slice/nohang.service
           └─619 python3 /usr/local/bin/nohang --config /etc/nohang/nohang.conf

авг 12 22:06:24 user-pc systemd[1]: Started Highly configurable OOM prevention daemon.
авг 12 22:06:36 user-pc nohang[619]: config: /etc/nohang/nohang.conf
авг 12 22:06:36 user-pc nohang[619]: Monitoring has started!
● oomd.service - Userland out-of-memory killer daemon
   Loaded: loaded (/etc/systemd/system/oomd.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-08-12 22:06:23 +09; 6min ago
 Main PID: 591 (oomd_bin)
    Tasks: 3 (limit: 2302)
   Memory: 4.8M (min: 64.0M low: 64.0M)
      CPU: 25.900s
   CGroup: /system.slice/oomd.service
           └─591 /usr/local/bin/oomd_bin --interval 1 --config /etc/oomd.json

авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/user.slice/user-1000.slice/[email protected]/memory.max/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/memory.swap.current/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/memory.swap.max/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/cgroup.procs/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/pids.max/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/io.pressure/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/memory.low/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/cgroup.max.depth/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/cgroup.events/memory.stat
авг 12 22:13:20 user-pc oomd[591]: [../util/Fs.cpp:224] Unable to open /sys/fs/cgroup/system.slice/memory.pressure/memory.stat
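Note on the oomd log above: every failing path is a regular cgroup control file (pids.max, io.pressure, and so on) with memory.stat appended, as if each entry under system.slice were treated as a child cgroup. That reading is an inference from the log only, and nothing in this thread confirms it. A minimal Python sketch of a walk that descends only into directories follows. It is not oomd's code, and the helper names child_cgroups and read_memory_stat are hypothetical.

```python
import os


def child_cgroups(parent):
    """Return the sorted paths of the subdirectories of a cgroup directory.

    In cgroup v2 only subdirectories are child cgroups; regular files such
    as pids.max or memory.pressure are control files and are skipped here.
    """
    with os.scandir(parent) as entries:
        return sorted(e.path for e in entries if e.is_dir(follow_symlinks=False))


def read_memory_stat(cgroup_dir):
    """Parse <cgroup_dir>/memory.stat into a dict, or return None if unreadable."""
    try:
        with open(os.path.join(cgroup_dir, "memory.stat")) as f:
            # Each line of memory.stat is "<key> <integer value>".
            return {key: int(value)
                    for key, value in (line.split() for line in f if line.strip())}
    except OSError:
        return None
```

With a filter like this, read_memory_stat would be called once per child cgroup rather than once per directory entry.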

danobi commented on July 19, 2024

We see this internally too: something like 4.5% of a core all the time. When we profiled it, we saw most of the cost coming from walking and reading cgroupfs. At some point in the next 6 months I'm planning on using the edge-triggered PSI epoll interface to see if we can cut down on CPU.
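For readers unfamiliar with the PSI trigger interface mentioned here, a rough Python sketch follows. It is not oomd's code. It follows the pattern in the kernel's PSI documentation: write a trigger string to a pressure file, then wait for POLLPRI. It uses poll() rather than epoll for brevity. The function names psi_trigger and wait_for_pressure are hypothetical, and registering a trigger needs a kernel with PSI trigger support and sufficient privileges.

```python
import os
import select


def psi_trigger(kind, stall_us, window_us):
    """Build a PSI trigger string such as "some 150000 1000000".

    kind is "some" or "full"; stall_us is the stall time threshold and
    window_us the tracking window, both in microseconds. The kernel
    documentation limits the window to between 500 ms and 10 s.
    """
    if kind not in ("some", "full"):
        raise ValueError("kind must be 'some' or 'full'")
    if not 500_000 <= window_us <= 10_000_000:
        raise ValueError("window must be between 500 ms and 10 s")
    if not 0 < stall_us <= window_us:
        raise ValueError("stall threshold must be positive and fit in the window")
    return f"{kind} {stall_us} {window_us}"


def wait_for_pressure(path, trigger, timeout_ms):
    """Register a trigger on a PSI file and block until it fires.

    Returns True if the threshold was crossed, False on timeout.
    """
    fd = os.open(path, os.O_RDWR | os.O_NONBLOCK)
    try:
        # The kernel example writes the string including its trailing NUL.
        os.write(fd, trigger.encode() + b"\0")
        poller = select.poll()
        poller.register(fd, select.POLLPRI)
        for _, event in poller.poll(timeout_ms):
            if event & select.POLLERR:
                raise OSError("PSI event source is no longer valid")
            if event & select.POLLPRI:
                return True
        return False
    finally:
        os.close(fd)
```

A call might look like wait_for_pressure("/proc/pressure/memory", psi_trigger("some", 150000, 1000000), 5000). The process sleeps until the kernel reports the stall instead of re-reading the file every interval.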

hakavlad commented on July 19, 2024

What files does oomd monitor besides memory.pressure? memory.stat? memory.current? something else?

Can I configure oomd so that it only monitors /proc/pressure/memory or only /foo/bar/memory.pressure (and nothing more to decrease CPU usage)?

danobi commented on July 19, 2024

https://github.com/facebookincubator/oomd/blob/master/Oomd.cpp#L192-L201

That information is collected for every cgroup oomd is assigned to monitor.

> Can I configure oomd so that it only monitors /proc/pressure/memory or only /foo/bar/memory.pressure

Yes, that should be possible. You probably only want the pressure_above plugin. See the docs for details on how to configure.
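To illustrate what watching only a pressure file involves, here is a small Python sketch that parses the documented PSI format (a "some" line and a "full" line, each with avg10, avg60, avg300 and total fields). It is not oomd's implementation or configuration, and the function names parse_pressure and read_pressure are hypothetical.

```python
def parse_pressure(text):
    """Parse the contents of a PSI file into a nested dict.

    Example input line:
        some avg10=0.00 avg60=1.50 avg300=0.25 total=12345
    The avg* fields are percentages; total is stall time in microseconds.
    """
    result = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        kind, *fields = line.split()
        values = {}
        for field in fields:
            key, value = field.split("=", 1)
            values[key] = int(value) if key == "total" else float(value)
        result[kind] = values
    return result


def read_pressure(path="/proc/pressure/memory"):
    """Read and parse one PSI file; the same format is used by memory.pressure."""
    with open(path) as f:
        return parse_pressure(f.read())
```

Each poll is then one small file read per monitored path, in contrast to the set of per-cgroup files collected by the code linked above.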

hakavlad commented on July 19, 2024

Thank you!

> That information is collected for every cgroup

It seems the PSI epoll interface won't solve the problem, since memory.pressure files are only a small portion of the files being read.

danobi commented on July 19, 2024

> Seems like PSI epoll interface won't solve the problem (memory.pressure files is small portion of readable files).

Yeah that's true. I was thinking we might be able to introduce some short circuit behavior w/ the rule handling. But the problem is that some plugins need to be run to update their sliding windows. We may need to do something clever.

danobi commented on July 19, 2024

That, or I go and try to improve the perf on the kernel side. Need to decide which is more feasible. The kernel stuff would probably benefit everyone else the most.

cdown commented on July 19, 2024

3013322 should also help a bit in the interim, since it avoids going through these code paths. In my tests, depending on cgroup hierarchy size, it can save 10-20% CPU.
