Comments (10)
I also am experiencing this issue -- I just updated from 1.44.3 to 1.45.1 and it segfaults on startup.
Added:
[plugin:proc]
/sys/devices/pci/aer = no
to the config, and it no longer crashes.
Here is the core dump it gave me:
Process 1506142 (netdata) of user 134 dumped core.
Module /opt/netdata/bin/srv/netdata without build-id.
Stack trace of thread 1506589:
#0 0x000000000088a46d recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#1 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#2 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#3 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#4 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#5 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#6 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#7 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#8 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#9 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#10 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#11 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#12 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#13 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#14 0x000000000088a67f recursively_find_pci_aer.isra.0 (/opt/netdata/bin/srv/net>
#15 0x000000000088b265 do_proc_sys_devices_pci_aer (/opt/netdata/bin/srv/netdata>
#16 0x00000000008230cb proc_main (/opt/netdata/bin/srv/netdata + 0x4230cb)
#17 0x00000000009794bc netdata_thread_init (/opt/netdata/bin/srv/netdata + 0x579>
#18 0x0000000000f5f733 start (/opt/netdata/bin/srv/netdata + 0xb5f733)
#19 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#20 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#21 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#22 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#23 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#24 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#25 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#26 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#27 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#28 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#29 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#30 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#31 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#32 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#33 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#34 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#35 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#36 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#37 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#38 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#39 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#40 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#41 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#42 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#43 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#44 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#45 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#46 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#47 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#48 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#49 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#50 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#51 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#52 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#53 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#54 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#55 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#56 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
#57 0x0000000000f613e3 __clone (/opt/netdata/bin/srv/netdata + 0xb613e3)
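The trace above shows recursively_find_pci_aer calling itself many levels deep before the crash. As a rough illustration only (this is not Netdata's code or its fix), a directory walk can avoid deep call chains by using an explicit work list with a depth cap. The function name, the `max_depth` parameter, and the `aer_dev_fatal` file name used in the demo are assumptions made for this sketch.

```python
import os
import tempfile


def find_dirs_with_file(root, filename, max_depth=16):
    """Iteratively collect directories under root that contain filename.

    Hypothetical helper: uses an explicit stack instead of recursion and
    never descends more than max_depth levels below root.
    """
    found = []
    stack = [(root, 0)]
    while stack:
        path, depth = stack.pop()
        try:
            entries = list(os.scandir(path))
        except OSError:
            # Unreadable or vanished directory: skip it.
            continue
        for entry in entries:
            if entry.is_file(follow_symlinks=False) and entry.name == filename:
                found.append(path)
            elif entry.is_dir(follow_symlinks=False) and depth < max_depth:
                # Symlinked directories are not followed, which avoids loops.
                stack.append((entry.path, depth + 1))
    return sorted(found)


# Demo on a throwaway tree rather than the real /sys hierarchy.
with tempfile.TemporaryDirectory() as demo_root:
    os.makedirs(os.path.join(demo_root, "pci0000:00", "0000:00:01.0"))
    open(os.path.join(demo_root, "pci0000:00", "0000:00:01.0", "aer_dev_fatal"), "w").close()
    print(find_dirs_with_file(demo_root, "aer_dev_fatal"))
```

Not following symlinks and capping the depth are the two guards that matter here; whether either applies to the actual bug is not established in this thread.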
from netdata.
Wow! This is bad!
I see several issues:
- @Ferroin this is Ubuntu 22.04, and apparently @Woart used kickstart.sh to install Netdata. However, he got a static install instead of a binary package.
- @vkalintiris @stelfrag this is the latest stable Netdata and it is crashing every minute. Please step in to find the issue and fix it.
from netdata.
@Woart hey. There appears to be an issue with a proc.plugin collector causing crashes.
Can you help us find which one?
- Open netdata.conf.
- Search for the [plugin:proc] section.
- Uncomment one entry, change yes to no, and restart the netdata service. I suggest starting with /sys/class/drm.
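For anyone who wants to script the uncomment-and-set-to-no step, here is a minimal sketch. It only transforms the text of netdata.conf and does not touch any file or service; the function name `disable_proc_collector` and its behaviour when the key or section is missing are assumptions of this example, not part of Netdata.

```python
import re


def disable_proc_collector(conf_text, key, section="plugin:proc"):
    """Return conf_text with `key = no` set (uncommented) inside [section].

    Hypothetical helper: if the key is missing it is added to the section,
    and if the section is missing it is appended at the end.
    """
    header = f"[{section}]"
    # Matches "key = value", with an optional leading "#" comment marker.
    entry = re.compile(r"^\s*#?\s*" + re.escape(key) + r"\s*=")
    out = []
    in_section = False
    seen_section = False
    done = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            if in_section and not done:
                # Leaving the target section without finding the key: add it.
                out.append(f"\t{key} = no")
                done = True
            in_section = stripped == header
            seen_section = seen_section or in_section
        elif in_section and not done and entry.match(line):
            line = f"\t{key} = no"
            done = True
        out.append(line)
    if not done:
        if not seen_section:
            out.append(header)
        out.append(f"\t{key} = no")
    return "\n".join(out) + "\n"


sample = (
    "[plugin:proc]\n"
    "\t# /sys/devices/pci/aer = yes\n"
    "\t# /sys/class/drm = yes\n"
)
print(disable_proc_collector(sample, "/sys/class/drm"))
```

To use it for real, read the config file, pass its text through the function, write it back, and restart the netdata service; disabling one collector per restart keeps it clear which one triggers the crash.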
from netdata.
@Woart hey. There appears to be an issue with a proc.plugin collector causing crashes.
Can you help us find which one?
- Open netdata.conf.
- Search for the [plugin:proc] section.
- Uncomment one entry, change yes to no, and restart the netdata service. I suggest starting with /sys/class/drm.
/opt/netdata/etc/netdata/netdata.conf
[plugin:proc]
# /proc/net/dev = yes
# /proc/pagetypeinfo = no
# /proc/stat = yes
# /proc/uptime = yes
# /proc/loadavg = yes
# /proc/sys/fs/file-nr = yes
# /proc/sys/kernel/random/entropy_avail = yes
# /proc/pressure = yes
# /proc/interrupts = yes
# /proc/softirqs = yes
# /proc/vmstat = yes
# /proc/meminfo = yes
# /sys/kernel/mm/ksm = yes
# /sys/block/zram = yes
# /sys/devices/system/edac/mc = yes
# /sys/devices/pci/aer = yes
# /sys/devices/system/node = yes
# /proc/net/wireless = yes
# /proc/net/sockstat = yes
# /proc/net/sockstat6 = yes
# /proc/net/netstat = yes
# /proc/net/sctp/snmp = yes
# /proc/net/softnet_stat = yes
# /proc/net/ip_vs/stats = yes
# /sys/class/infiniband = yes
# /proc/net/stat/conntrack = yes
# /proc/net/stat/synproxy = yes
# /proc/diskstats = yes
# /proc/mdstat = yes
# /proc/net/rpc/nfsd = yes
# /proc/net/rpc/nfs = yes
# /proc/spl/kstat/zfs/arcstats = yes
# /proc/spl/kstat/zfs/pool/state = yes
# /sys/fs/btrfs = yes
# ipc = yes
# /sys/class/power_supply = yes
/sys/class/drm = no
Mar 29 16:15:49 MS184 systemd[1]: Reloading.
Mar 29 16:16:08 MS184 systemd[1]: netdata.service: Scheduled restart job, restart counter is at 15553.
Mar 29 16:16:08 MS184 systemd[1]: Stopped Real time performance monitoring.
Mar 29 16:16:08 MS184 systemd[1]: Starting Real time performance monitoring...
Mar 29 16:16:09 MS184 systemd[1]: systemd-journald@netdata.service: Deactivated successfully.
Mar 29 16:16:09 MS184 systemd[1]: Starting Journal Service for Namespace netdata...
Mar 29 16:16:09 MS184 systemd-journald[3416793]: Failed to open /dev/kmsg, ignoring: Operation not permitted
Mar 29 16:16:09 MS184 systemd[1]: Started Real time performance monitoring.
Mar 29 16:16:09 MS184 systemd[1]: Started Journal Service for Namespace netdata.
Mar 29 16:16:10 MS184 kernel: [55055388.647775] P[proc][3417093]: segfault at 7f5d9b5607d8 ip 0000000000889c6d sp 00007f5d9b5607d8 error 6 in netdata[401000+b68000]
Mar 29 16:16:10 MS184 kernel: [55055388.647798] Code: 48 89 c7 5b e9 94 19 09 00 0f 1f 40 00 41 57 41 56 41 55 41 54 55 53 48 81 ec 00 10 00 00 48 83 0c 24 00 48 81 ec 00 10 00 00 <48> 83 0c 24 00 48 83 ec 28 64 48 8b 04 25 28 00 00 00 48 89 84 24
Mar 29 16:16:10 MS184 systemd[1]: netdata.service: Main process exited, code=killed, status=11/SEGV
Mar 29 16:16:10 MS184 systemd[1]: netdata.service: Failed with result 'signal'.
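The kernel segfault line in this log can be picked apart mechanically. The sketch below assumes the x86 log format shown above and decodes the page-fault error code bits (bit 0: page was present, bit 1: write access, bit 2: user mode). The function and field names are made up for this example.

```python
import re

# Pattern for the x86 kernel message format seen in the log above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Return a dict describing a kernel segfault line, or None if no match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    addr, ip, sp, base = (int(m[k], 16) for k in ("addr", "ip", "sp", "base"))
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "ip_offset_in_mapping": hex(ip - base),
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
        "fault_addr_minus_sp": addr - sp,
    }


line = (
    "P[proc][3417093]: segfault at 7f5d9b5607d8 ip 0000000000889c6d "
    "sp 00007f5d9b5607d8 error 6 in netdata[401000+b68000]"
)
for field, value in decode_segfault(line).items():
    print(f"{field}: {value}")
```

For the line in this log, the faulting address equals the stack pointer and the access is a user-mode write to a page that is not present. That pattern is consistent with a thread running past the end of its stack, though the log alone does not prove it.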
from netdata.
I have servers with the same configuration and /opt/netdata installations (same version, 1.45) that run without problems.
The difference is the Ubuntu version:
it works on Ubuntu 20.04
and segfaults on Ubuntu 22.04.
from netdata.
@Woart can you try disabling [plugin:proc] modules one by one and find the one that is causing Netdata to crash?
from netdata.
@Woart can you try disabling [plugin:proc] modules one by one and find the one that is causing Netdata to crash?
Netdata works after setting:
/sys/devices/pci/aer = no
from netdata.
@Woart, any chance you can install from source to get the coredump?
sudo apt-get install systemd-coredump
git clone --recursive https://github.com/netdata/netdata
# install required packages for building from the source
cd netdata
sudo packaging/installer/install-required-packages.sh netdata
# build from source
sudo CFLAGS="-O0 -ggdb" ./netdata-installer.sh --install-prefix /opt --disable-telemetry
# after installation Netdata will try to start and will crash; list the core dumps with
sudo coredumpctl list
# to open the latest core dump in the debugger
sudo coredumpctl debug
# press Enter until you get a prompt, then type
bt
from netdata.
CFLAGS="-O0 -ggdb" ./netdata-installer.sh --install-prefix /opt --disable-telemetry
# /opt/netdata/bin/netdata -v
netdata v1.45.0-101-g926956efc
/opt/netdata/etc/netdata # cat netdata.conf
...
[plugin:proc]
# /proc/net/dev = no
# /proc/pagetypeinfo = no
# /proc/stat = no
# /proc/uptime = no
# /proc/loadavg = no
# /proc/sys/fs/file-nr = no
# /proc/sys/kernel/random/entropy_avail = no
# /proc/pressure = no
# /proc/interrupts = no
# /proc/softirqs = no
# /proc/vmstat = no
# /proc/meminfo = no
# /sys/kernel/mm/ksm = no
# /sys/block/zram = no
# /sys/devices/system/edac/mc = no
# /sys/devices/pci/aer = no
# /sys/devices/system/node = no
# /proc/net/wireless = no
# /proc/net/sockstat = no
# /proc/net/sockstat6 = no
# /proc/net/netstat = no
# /proc/net/sctp/snmp = no
# /proc/net/softnet_stat = no
# /proc/net/ip_vs/stats = no
# /sys/class/infiniband = no
# /proc/net/stat/conntrack = no
# /proc/net/stat/synproxy = no
# /proc/diskstats = no
# /proc/net/rpc/nfsd = no
# /proc/mdstat = no
# /proc/net/rpc/nfs = no
# /proc/spl/kstat/zfs/arcstats = no
# /proc/spl/kstat/zfs/pool/state = no
# /sys/fs/btrfs = no
# ipc = no
# /sys/class/power_supply = no
# /sys/class/drm = no
Netdata doesn't crash anymore.
from netdata.
This should be fixed in tomorrow's nightly version. Can you guys test it?
from netdata.