cicerops / monitoring-check-grafana

Monitor a Grafana datasource against data becoming stale to detect data loss or other dropout conditions.

License: GNU Affero General Public License v3.0


monitoring-check-grafana

About

A monitoring sensor for checking a Grafana datasource against data becoming stale. This will let you detect data loss or other dropout conditions of feeds into your datasources.

Goals

Tired of operations or engineering tinkering under the hood with the datasources or sensors you are watching in Grafana?

This plugin is an attempt at a basic end-to-end monitoring probe covering the whole path of data flowing from arbitrary sensors into a (timeseries) database and on to its display in Grafana. The probe checks for success in:

  • Acquisition: Data is received by the DAQ system.
  • Storage: Measurements are stored into the database.
  • Retrieval: Measurements are retrieved from the database.
  • Display: Data is displayed in Grafana (almost).

This is nearly to-the-glass monitoring, as it probes the very same Grafana API endpoints that the frontend uses to fetch metric data just before rendering it to the display.
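As a rough sketch of what such a probe could look like, assuming an InfluxDB datasource behind the Grafana datasource proxy: the helper below builds an InfluxQL query that asks for a single data point newer than a given age. The function name `build_freshness_query` and the exact query text are assumptions for illustration and are not taken from the plugin's source; authentication against Grafana is left out.

```shell
#!/bin/sh
# Hypothetical helper: build an InfluxQL query returning at most one data
# point from the given table (measurement) that is newer than the given age.
build_freshness_query() {
    table="$1"
    max_age="$2"
    printf 'SELECT * FROM "%s" WHERE time > now() - %s LIMIT 1\n' "$table" "$max_age"
}

# Example (not executed here): send the query through the datasource proxy
# with HTTPie and count the returned series with jq. "$uri" and "$database"
# correspond to the --uri and --database options.
#   http --ignore-stdin GET "$uri" db=="$database" \
#       q=="$(build_freshness_query temperature 12h)" \
#       | jq '.results[0].series | length'
```

If the count is zero, no data point arrived within the given window, which is the condition the sensor reports as stale.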

References

We are currently using this plugin for monitoring freshness of data flows from different sources into InfluxDB:

Kudos to all the people working behind the scenes for providing these great open data resources to the community!

Usage

$ ./check-grafana-datasource-stale.sh --help

Options:
-u, --uri           Grafana API datasource proxy URI
-d, --database      Database name

-t, --table         Table name
-w, --warning       Maximum age threshold of data to result in warning status
-c, --critical      Maximum age threshold of data to result in critical status

-h, --help          Print detailed help
-V, --version       Print version information
-v, --verbose       Turn on verbose output
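The `--warning` and `--critical` thresholds take an age such as `12h` or `3d`. A minimal sketch of validating such a value before using it, assuming a positive integer followed by one of the units s, m, h, d or w; the plugin's actual parsing rules may differ, and `is_valid_age` is a hypothetical name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the argument is a positive integer
# followed by exactly one unit character out of s, m, h, d, w.
# The set of accepted units is an assumption for illustration.
is_valid_age() {
    number="${1%?}"         # everything except the last character
    unit="${1#"$number"}"   # the remaining last character
    case "$unit" in
        s|m|h|d|w) ;;
        *) return 1 ;;
    esac
    case "$number" in
        ''|0*|*[!0-9]*) return 1 ;;   # empty, leading zero, or non-digit
    esac
    return 0
}
```

For example, `is_valid_age 12h` succeeds, while `is_valid_age 12` and `is_valid_age h` fail.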

Example

Sensor invocation:

./check-grafana-datasource-stale.sh \
    --uri https://datahub.example.org/grafana/api/datasources/proxy/42/query \
    --database testdrive \
    --table temperature \
    --warning 12h \
    --critical 3d \
    --verbose

Sensor output:

INFO:  Checking testdrive:temperature for data not older than 3d
INFO:  Checking testdrive:temperature for data not older than 12h
WARNING - Data in testdrive:temperature is stale for 12h or longer
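The output above suggests that the critical threshold is probed first and the warning threshold second. Monitoring plugins conventionally signal their result through the exit code: 0 for OK, 1 for WARNING, 2 for CRITICAL and 3 for UNKNOWN. The following sketch maps two probe results to a status line and exit code under that convention. The function name `report_status`, its arguments, and the wording of the OK and CRITICAL messages are assumptions; only the WARNING line is modelled on the output shown above.

```shell
#!/bin/sh
# Hypothetical helper: turn two probe results into a status line and a
# plugin exit code.
#   $1  label, e.g. "testdrive:temperature"
#   $2  number of data points found within the critical window
#   $3  number of data points found within the warning window
#   $4  warning threshold, e.g. "12h"
#   $5  critical threshold, e.g. "3d"
report_status() {
    label="$1"; fresh_critical="$2"; fresh_warning="$3"
    warning="$4"; critical="$5"
    if [ "$fresh_critical" -eq 0 ]; then
        echo "CRITICAL - Data in $label is stale for $critical or longer"
        return 2
    elif [ "$fresh_warning" -eq 0 ]; then
        echo "WARNING - Data in $label is stale for $warning or longer"
        return 1
    fi
    echo "OK - Data in $label is fresh"
    return 0
}
```

Checking the critical window first means the most severe applicable state wins: data older than the critical threshold is also older than the warning threshold, but only CRITICAL is reported.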

Screenshot

Data acquisition from luftdaten.info triggered a data loss warning when the people operating the platform had to perform some maintenance work on the database:

If someone is wondering: The API is down for maintenance. Today we received value no. ‘2^31+1’ . But the database was defined with a maximum of 2^31 values. We are currently changing this to 2^63. But this may need some time.

— OK Lab Stuttgart (@codeforS) March 31, 2018

Install prerequisites

This sensor relies on the fine programs HTTPie and jq. Please install both on your system.

Debian

apt install httpie jq

# Optionally
pip install httpie

macOS

brew install httpie jq

Setup Icinga plugin

Plugin environment

mkdir -p /opt/monitoring/plugins

Edit /etc/icinga2/constants.conf:

const CustomPluginDir = "/opt/monitoring/plugins"

Installation

git clone https://github.com/daq-tools/monitoring-check-grafana /opt/monitoring-check-grafana
ln -s /opt/monitoring-check-grafana/check-grafana-datasource-stale.sh /opt/monitoring/plugins/check-grafana-datasource-stale
ln -s /opt/monitoring-check-grafana/icinga-command-check-grafana.conf /etc/icinga2/conf.d/command-check-grafana.conf

Configuration

A blueprint for a typical service configuration object:

object Service "Grafana datasource freshness for testdrive:temperature" {
  import "generic-service"
  check_command         = "check-grafana-datasource-stale"

  host_name             = "datahub.example.org"
  vars.sla              = "24x7"

  vars.grafana_uri      = "https://datahub.example.org/grafana/api/datasources/proxy/42/query"
  vars.grafana_database = "testdrive"
  vars.grafana_table    = "temperature"
  vars.grafana_warning  = "1h"
  vars.grafana_critical = "12h"

  # Optionally restrict notifications for this service to these recipients
  #vars.notification.mail.users  = [ "bruce-lee", "chuck-norris" ]
  #vars.notification.mail.groups = [ "null" ]
}

See also icinga-service-check-grafana.example.conf.

Project information

About

The "monitoring-check-grafana" sensor program is released under the GNU AGPL license. Its source code lives on GitHub.

If you'd like to contribute you're most welcome! Spend some time taking a look around, locate a bug, design issue or spelling mistake and then send us a pull request or create an issue.

Thanks in advance for your efforts, we really appreciate any help or feedback.

License

Licensed under the GNU AGPL license. See LICENSE file for details.
