icinga / icinga2-diagnostics
Shell script for analyzing Icinga 2 installations.
License: GNU General Public License v3.0
Could someone please run virt-what on a VMware guest (Linux) and enter the output in this issue?
This could be useful for determining:
depends on #3
This needs #3 to be implemented first
Make sure to check the setting that's actually used, especially with FPM
Check if InfluxDB is installed and if so, get some basic information like version.
If sar is installed, use its output. Otherwise use tools like free, uptime and vmstat.
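A minimal sketch of the fallback described above. The function names (have, pick_perf_source, show_performance) and the chosen sar options are assumptions for illustration, not taken from the script itself.

```shell
# Hypothetical helper: succeed if a command is available in PATH.
have() {
  command -v "$1" >/dev/null 2>&1
}

# Print which data source would be used: "sar" or "basic".
pick_perf_source() {
  if have sar; then
    echo "sar"
  else
    echo "basic"
  fi
}

# Print basic performance data from the preferred source.
show_performance() {
  if [ "$(pick_perf_source)" = "sar" ]; then
    sar -u 1 1    # one CPU utilisation sample
    sar -r 1 1    # one memory utilisation sample
  else
    # Fall back to whatever basic tools exist on this host.
    if have uptime; then uptime; fi
    if have free; then free -m; fi
    if have vmstat; then vmstat 1 2; fi
  fi
}
```

Calling show_performance prints the sar samples when sysstat is installed and the output of the basic tools otherwise.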
Check environment variables, curlrc, etc. for configured proxies.
Take care to check them for root, icinga and every user that might be involved.
Only show whether a proxy is set (and maybe how it was set), not the proxy itself, because the configuration might contain passwords.
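One way this could look, as a sketch only: it reports the names of proxy variables that are set and whether a curlrc file contains a proxy line, but never prints the values. The function names and the list of variable names are assumptions, and the sketch only covers the environment of the user running it. Checking root, icinga and other users would need to be layered on top.

```shell
# Report which proxy variables are set in the current environment,
# printing only the variable name and never its value.
proxy_vars_set() {
  for name in http_proxy https_proxy ftp_proxy no_proxy HTTP_PROXY HTTPS_PROXY; do
    # Indirect lookup via eval; names come only from the fixed list above.
    eval "value=\${$name:-}"
    if [ -n "$value" ]; then
      echo "proxy set via environment variable $name"
    fi
  done
}

# Succeed if the given curlrc file contains a proxy setting.
# Matches lines such as 'proxy = ...' or '--proxy ...' without printing them.
curlrc_has_proxy() {
  file=$1
  [ -r "$file" ] && grep -Eq '^[[:space:]]*-{0,2}proxy' "$file"
}
```

A caller could run proxy_vars_set and then curlrc_has_proxy "$HOME/.curlrc" and print a single line such as "proxy set via curlrc" when it succeeds.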
Add an option for a mode which analyzes the setup and gives hints on where to look for problems and how to fix them yourself. Maybe this could be the first step when having a problem. Rerunning the script in its current default mode to provide information when asking for help could become the second step when none of the suggestions helped with debugging.
Thanks @dnsmichi for the idea.
Especially the one Director uses if installed.
The standard output contains some passwords (e.g. for API users). Mask them by default.
Think about adding an option to have them in clear text though.
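A possible shape for this, sketched as a filter the output is piped through. The function name mask_passwords and the SHOW_PASSWORDS switch are hypothetical. The pattern assumes password attributes written as password = "..." as in Icinga 2 configuration objects.

```shell
# Filter stdin and replace quoted password values with asterisks.
# Setting SHOW_PASSWORDS=1 (hypothetical switch) passes the text through unchanged.
mask_passwords() {
  if [ "${SHOW_PASSWORDS:-0}" = "1" ]; then
    cat
  else
    sed -E 's/(password[[:space:]]*=[[:space:]]*)"[^"]*"/\1"********"/'
  fi
}
```

Any section that prints configuration could then be written as some_command | mask_passwords, so that clear text only appears when it is explicitly requested.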
In rare cases the sync between nodes, especially master nodes, can break. If it breaks completely you see the nodes as disconnected, but in even rarer cases the sync continues but misses some objects.
E.g. you acknowledge a problem on one node but the other one misses the acknowledgement. This way you end up with an acknowledged problem but you might still get notifications for it.
We should find a way to check whether two nodes within a zone are synced completely.
Add an option to have the script write its output not to stdout but to a tarball which includes the whole Icinga 2 configuration, logs, etc.
e.g. openssl.
Depends on #6
Shows "inactive" even when iptables is activated
Needs #6 to be implemented first
Just output "local firewall off/on"
A user's setup showed Icinga 2 segfaulting. Search for entries like this in the syslog logs and report them as an anomaly.
This is a thing we see very often in support.
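A rough sketch of such a search. The function name is hypothetical, and it assumes that segfault entries contain both the word "segfault" and the process name "icinga2" on the same line, which is how kernel messages typically look but is not guaranteed for every syslog setup.

```shell
# Count log lines that mention both "segfault" and "icinga2".
# Prints 0 when the file is unreadable or nothing matches.
count_icinga_segfaults() {
  logfile=$1
  if [ ! -r "$logfile" ]; then
    echo 0
    return 0
  fi
  # grep -c prints the count but exits non-zero on 0 matches, hence "|| true".
  grep -i 'segfault' "$logfile" | grep -ci 'icinga2' || true
}
```

The caller would pass the syslog file of the distribution, for example /var/log/messages or /var/log/syslog, and add an entry to the anomaly list when the count is greater than zero.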
Since there are no packages for Director right now, there are different ways to get Director installed, and sometimes it's hard for support engineers and even the user to determine the current version.
Possible ways of installation I can think of:
git clone
git clone with checkout of specific tag or release
Work through the logs to see which plugins are called. Don't just list the loglines but only count the plugin with full path. Options to plugins often contain sensitive data so we don't want that to be part of the output.
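The counting step could be sketched as below. This assumes the plugin command lines have already been extracted from the logs, one per line on stdin, which depends on the log format and is not shown here. The function name count_plugins is hypothetical. Only the first word (the plugin path) is kept, so arguments never reach the output.

```shell
# Read command lines on stdin and print "count path" per plugin,
# most frequently called first. Arguments are dropped entirely.
count_plugins() {
  awk -v q="'" '
    NF {
      p = $1
      gsub(q, "", p)      # strip single quotes around the path, if any
      n[p]++
    }
    END { for (p in n) print n[p], p }
  ' | sort -k1,1nr -k2,2
}
```

Because only the first field is ever stored, passwords or tokens passed as plugin options cannot leak into the diagnostics output.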
The empty headline "Anomalies found" can be misleading and users might think there were anomalies found while this is only a headline. When there is nothing below the headline there were no anomalies found.
A user's setup showed lots of zombie processes which only disappeared after restarting Icinga 2.
Check for such processes and show them in the anomaly list.
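A small sketch of how zombie processes could be listed. The function names are hypothetical. It relies on the process state column of ps starting with Z for zombies and does not yet restrict the result to children of Icinga 2.

```shell
# Read "stat pid ppid comm" lines on stdin and print pid, ppid and
# command name of every process whose state starts with Z (zombie).
filter_zombies() {
  awk '$1 ~ /^Z/ { print $2, $3, $4 }'
}

# List zombies on the running system. Each column gets its own -o so
# that the empty header ("=") applies to that column only.
list_zombies() {
  ps -e -o stat= -o pid= -o ppid= -o comm= | filter_zombies
}
```

The parent PID column makes it possible to check afterwards whether the zombies belong to an icinga2 process before reporting them as an anomaly.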
Verify that the icinga CheckCommand is used.
Even better would be to use its return value to check for all Icinga 2 versions used in the setup. Some users cannot update all nodes within a reasonable timeframe, so checking if they are up-to-date might be important for debugging.
Maybe it makes sense to have some external script to process this data? Because it takes ~10 seconds to generate output for a configuration with ~5k services, ~400 hosts and ~15 zones.
Check if the checks are evenly distributed between nodes within a zone.
When this new feature is implemented, we should collect some sample data from setups and then introduce thresholds for anomaly detection.
Ways to find out whether the nodes are unbalanced are described in https://www.icinga.com/docs/icinga2/latest/doc/15-troubleshooting/#late-check-results-in-distributed-environments and https://www.icinga.com/2016/08/11/analyse-icinga-2-problems-using-the-console-api/
Check if Graphite is installed and if so, get some basic information like version.
This will significantly shorten the output but not the time it takes to generate the output. Maybe we should print the headline of packages and start searching
Use variables to list all packages not signed by the icinga key. Just add every matching package to a list (and maybe increase a counter). Show the list in the end where anomalies are listed.
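The list-plus-counter idea could be sketched like this. The variable and function names are hypothetical, and the actual signature check that decides when to call add_anomaly is left out. The headline is only printed when something was recorded.

```shell
# Collected anomalies: a counter and a newline-separated list.
ANOMALY_COUNT=0
ANOMALIES=""

# Record one anomaly; must be called in the current shell, not a subshell.
add_anomaly() {
  ANOMALY_COUNT=$((ANOMALY_COUNT + 1))
  ANOMALIES="${ANOMALIES}* $1
"
}

# Print the headline and the list only if at least one anomaly was recorded.
print_anomalies() {
  if [ "$ANOMALY_COUNT" -gt 0 ]; then
    echo "Anomalies found: $ANOMALY_COUNT"
    printf '%s' "$ANOMALIES"
  fi
}
```

During the package check each unsigned package would lead to a call such as add_anomaly "package some-package is not signed by the Icinga key" (package name made up), and print_anomalies runs once at the end of the script.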
Change to a more elegant way than lots of nested ifs.
Just an idea.
OS Version: cat: /etc/redhat-release: No such file or directory
Ubuntu
14.04
Hypervisor: Running on hardware or unknown hypervisor
CPU cores: 12
RAM: 20G
Firewall: active
Checked script on Ubuntu machine
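The cat error in the output above comes from reading /etc/redhat-release on a system that does not have it. A possible fix, sketched under the assumption that /etc/os-release is available on the distributions of interest, is to try that file first and only then fall back. The function name and the optional root prefix argument (used for testing) are hypothetical.

```shell
# Print a short OS description. The optional argument is a root prefix,
# so the function can be pointed at a test directory.
os_version() {
  root=${1:-}
  if [ -r "$root/etc/os-release" ]; then
    # Source the file in a subshell so its variables do not leak out.
    ( . "$root/etc/os-release"; echo "${NAME:-unknown} ${VERSION_ID:-}" )
  elif [ -r "$root/etc/redhat-release" ]; then
    cat "$root/etc/redhat-release"
  else
    echo "unknown"
  fi
}
```

With this, the first line of the overview could be printed as echo "OS Version: $(os_version)" without an error message on non Red Hat systems.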
Maybe even when a release as a git clone is found because this might indicate that the user just wants to use git but it also might indicate that they just hit a release by default.
Be silent about checked but not found anomalies.
Currently 2 is the maximum.
I think we don't need a version check for the maximum, it seems, this will persist for some time.
Only do this check when -z is added because it takes some time on bigger installations.
The script still finds the old PHP installation used before Icinga Web 2 2.5 or still there from updating. Best would be to check for both so side effects of leaving old PHP when upgrading to FPM can still be seen.
A good start might be
SHOW GLOBAL VARIABLES LIKE '%version%';
SHOW ENGINE INNODB STATUS \G
I borrowed these examples and the idea from @lazyfrosch from Monitoring Portal. Thank you.