flapjackfeeder's Introduction

Flapjack

Flapjack is a flexible monitoring notification routing system that handles:

  • Alert routing (determining who should receive alerts based on interest, time of day, scheduled maintenance, etc.)
  • Alert summarisation (with per-user, per-medium summary thresholds)
  • Your standard operational tasks (setting scheduled maintenance, acknowledgements, etc.)

Flapjack will be immediately useful to you if:

  • You want to identify failures faster by rolling up your alerts across multiple monitoring systems.
  • You monitor infrastructures that have multiple teams responsible for keeping them up.
  • Your monitoring infrastructure is multitenant, and each customer has a bespoke alerting strategy.
  • You want to dip your toe in the water and try alternative check execution engines like Sensu, Icinga, or cron in parallel to Nagios.

Try it out with the Quickstart Guide

The Quickstart guide will help you get Flapjack up and running in a VM locally using Vagrant and VirtualBox.

The technical low-down

Flapjack provides a scalable method for dealing with events representing changes in system state (OK -> WARNING -> CRITICAL transitions) and alerting appropriate people as necessary.

At its core, Flapjack processes events received from external check execution engines, such as Nagios. Nagios provides a 'perfdata' event output channel, which writes to a named pipe. flapjack-nagios-receiver reads from this named pipe, converts each line to JSON, and adds it to the events queue.

Flapjack sits downstream of check execution engines (like Nagios, Sensu, Icinga, or cron), processing events to determine:

  • if a problem has been detected
  • who should know about the problem
  • how they should be told

Additional check engines can be supported by adding additional receiver processes similar to the nagios receiver.
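As a rough illustration of what such a receiver does, here is a minimal Python sketch. It is not the code of flapjack-nagios-receiver. The event field names are copied from the event JSON quoted in the issues further down this page, the tab-separated input format is purely hypothetical, and `push` stands in for whatever appends a string to the events queue.

```python
import json
import time


def build_event(entity, check, state, summary, details="", timestamp=None):
    """Build an event dict for one check result.

    Field names follow the event JSON quoted in the issue tracker;
    they are not taken from a formal schema.
    """
    return {
        "entity": entity,
        "check": check,
        "type": "service",
        "state": state,
        "summary": summary,
        "details": details,
        "time": int(timestamp if timestamp is not None else time.time()),
    }


def parse_line(line):
    """Parse a hypothetical line: time, entity, check, state, summary (tab-separated)."""
    timestamp, entity, check, state, summary = line.rstrip("\n").split("\t", 4)
    return build_event(entity, check, state, summary, timestamp=int(timestamp))


def feed(lines, push):
    """Serialise each non-empty line to JSON and hand it to `push`.

    Returns the number of events pushed.
    """
    count = 0
    for line in lines:
        if not line.strip():
            continue
        push(json.dumps(parse_line(line)))
        count += 1
    return count
```

In a real receiver, `lines` would be the open named pipe (or another engine's output) and `push` would write to the events queue in Redis; passing `some_list.append` is enough to try the sketch locally.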

Installing

NB: v2 packages will be ready soon -- for the moment these instructions will not work

Ubuntu Precise 64 (12.04):

Tell apt to trust the Flapjack package signing key:

gpg --keyserver keys.gnupg.net --recv-keys 803709B6
gpg -a --export 803709B6 | sudo apt-key add -

Add the Flapjack Debian repository to your Apt sources:

echo "deb http://packages.flapjack.io/deb/v2 precise main" | sudo tee /etc/apt/sources.list.d/flapjack.list

Install the latest Flapjack package:

sudo apt-get update
sudo apt-get install flapjack

Alternatively, download the deb and install using sudo dpkg -i <filename>

The Flapjack package is an Omnibus package and as such contains most dependencies under /opt/flapjack, including Redis.

Installing the package will start Redis (on a non-standard port) and Flapjack. You should now be able to access the Flapjack Web UI at:

http://localhost:3080/

And consume the REST API at:

http://localhost:3081/

N.B. The Redis installed by Flapjack runs on a non-standard port (6380), so it doesn't conflict with other Redis instances you may already have installed.

Other OSes:

Currently we only make a package for Ubuntu Precise (amd64). If you feel comfortable getting a Ruby environment going on your preferred OS, then you can also just install Flapjack from rubygems.org:

gem install flapjack

Using a tool like rbenv or rvm is recommended to keep your Ruby applications from interfering with one another.

Alternatively, you can add support for your OS of choice to omnibus-flapjack and build a native package. Pull requests welcome, and we'll help you make this happen!

You'll also need Redis >= 2.6.12.

Configuring

Have a look at the default config file and modify things as required. The package installer copies this to /etc/flapjack/flapjack_config.toml if it doesn't already exist.

# edit the config
sudo vi /etc/flapjack/flapjack_config.toml

# reload the config
sudo /etc/init.d/flapjack reload

Running

Ubuntu Precise 64:

After installing the Flapjack package, Redis and Flapjack should be automatically started.

First up, start Redis if it's not already started:

# status:
sudo /etc/init.d/redis-flapjack status

# start:
sudo /etc/init.d/redis-flapjack start

Operating Flapjack:

# status:
sudo /etc/init.d/flapjack status

# reload:
sudo /etc/init.d/flapjack reload

# restart:
sudo /etc/init.d/flapjack restart

# stop:
sudo /etc/init.d/flapjack stop

# start:
sudo /etc/init.d/flapjack start

Usage

Please see the documentation.

Developing Flapjack

Information on developing more Flapjack components or contributing to core Flapjack development can be found in the Developing section of the docs.

Note that the master branch corresponds to Flapjack 2; maintenance builds for Flapjack 1 are built from the maint/1.x branch.

Documentation Submodule

The documentation for this project lives on a GitHub wiki and is also referenced as a submodule at /doc in this project. Run the following commands to populate the local doc/ directory:

git submodule init
git submodule update

If you make changes to the documentation locally, here's how to publish them:

  • Check out master within the doc subdir, otherwise you'll be committing to no branch, a.k.a. no man's land.
  • git add, commit and push from inside the doc subdir
  • Add, commit and push the doc dir from the root (this updates the pointer in the main git repo to the correct ref in the doc repo, we think...)

RTFM

All of the documentation.

flapjackfeeder's People

Contributors

ali-graham, bs-github, jessereynolds

flapjackfeeder's Issues

tagging checks

question

We are just trying to deploy flapjack with Icinga.

We need to tag checks going from Icinga to Flapjack, or add some special field from Icinga such as "servicegroups" or the "membership" of a host/service. We would like to use tagging as in Sensu, where you can define any tags for every check.

I found a description of the NEB module mechanism that Flapjackfeeder uses at http://larsmichelsen.com/nagios/nagios-event-broker/. It notes that the full list of information available from Icinga through this module can be found in nebcallbacks.h: https://github.com/Icinga/Icinga-core/blob/master/include/nebcallbacks.h. But I'm not sure whether it is possible to get the information mentioned above about a check. If it is, we could adjust Flapjackfeeder to send the relevant data to Flapjack, where we can parse it.

flapjackfeeder no longer works against naemon 1.0.4

[1466853358] Error: Could not load module '/usr/local/lib/flapjackfeeder4.o' -> /usr/local/lib/flapjackfeeder4.o: undefined symbol: schedule_new_event
[1466853358] Error: Failed to load module '/usr/local/lib/flapjackfeeder4.o'.

Looks like it moved to schedule_event

Two flapjackfeeders don't work

If you try and configure two flapjackfeeder brokers in your Nagios, only the first of them will actually transmit events on to Flapjack, however both will establish their own TCP sockets to redis.

I also tried changing the NEBMODULE_MODINFO_TITLE to "flapjackfeeder-staging" and compiling it to "flapjackfeeder-staging.o" but this makes no difference.

Perhaps there are some global variables that are in scope in both instances of the module, or something.

Quotes are not being escaped in check name (nagios service)

2014-04-03T17:19:52+11:00 [WARN] :: flapjack-processor :: 
  Error deserialising event json: unexpected character at line 1, column 187 [parse.c:625], 
  raw json: "{\"entity\":\"foo-app-01\",\"check\":\"HTTP - status check - www.example.com\",\"type\":\"service\",\"state\":\"OK\",\"summary\":\"HTTP OK: Status line output matched \"HTTP\" - 503 bytes in 0.182 second response time\",\"details\":\"(null)\",\"time\":\"1396505992\"}"
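The failure in that log can be reproduced outside the module. The following Python sketch is only an illustration of the problem, not the fix applied in flapjackfeeder (which is written in C); the function names are made up for the example. It builds the same kind of event once by plain string formatting and once with a JSON serialiser that escapes embedded quotes.

```python
import json


def naive_event_json(check, summary):
    # Hand-built JSON with no escaping: breaks as soon as a value contains a quote.
    return '{"check":"%s","summary":"%s"}' % (check, summary)


def safe_event_json(check, summary):
    # json.dumps escapes quotes, backslashes and control characters.
    return json.dumps({"check": check, "summary": summary})


check = "HTTP - status check - www.example.com"
summary = 'HTTP OK: Status line output matched "HTTP" - 503 bytes'

try:
    json.loads(naive_event_json(check, summary))
    naive_ok = True
except ValueError:
    # The unescaped quotes around HTTP end the string early, so parsing fails.
    naive_ok = False

# The escaped version parses and returns the original summary unchanged.
round_tripped = json.loads(safe_event_json(check, summary))
```

Any value taken from plugin output (check name, summary, details) can contain quotes or backslashes, so each of them needs escaping before it is embedded in the event string.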

Employ make or other appropriate build tool

So we can build by just running 'make'. I'm not sure how best to handle downloading of the redis library and any other dependencies.

We can also then have travis automatically building flapjackfeeder and telling us about any compilation errors.

We might even like to add some tests :-)

Merge Nagios Host Tags with Service Tags

If I add a custom _tag variable to a Nagios host definition the value does not get added to the event data sent for service checks associated with that host.

I'd like to be able to add tags at the host level and have them passed along to Flapjack with the service check data as well. Right now I have to add the _tag variable and value to all of my checks used by that host. This works but is not as flexible as I'd like it.
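The requested behaviour could look roughly like the sketch below. This is an assumption about the desired semantics, not existing flapjackfeeder behaviour: tags are modelled as plain lists of strings, service tags come first, and host tags are appended unless already present. `merge_tags` is a hypothetical helper name.

```python
def merge_tags(host_tags, service_tags):
    """Return service tags followed by any host tags not already present.

    Empty strings are dropped and the order of first appearance is kept.
    """
    merged = []
    for tag in list(service_tags) + list(host_tags):
        if tag and tag not in merged:
            merged.append(tag)
    return merged
```

With this rule, a host tagged `prod` and `web` and a service tagged `http` and `prod` would produce the service event tags `http`, `prod`, `web`.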

This is something I can easily do with Sensu but unfortunately I have to maintain a Legacy Nagios system for a while yet.

Compiling on Mac OS X fails

I think the build instructions in the readme currently only work on Linux. Here's what happens when I somewhat naively try building on Mac OS X 10.8.5:

jesse@Heart-of-Gold flapjackfeeder $ (cd src ; gcc -fPIC -g -O2 -DHAVE_CONFIG_H -DNSCORE -o flapjackfeeder.o flapjackfeeder.c -shared -fPIC ../../hiredis/libhiredis.a)
i686-apple-darwin11-llvm-gcc-4.2: ../../hiredis/libhiredis.a: No such file or directory

On Mac, hiredis builds a libhiredis.dylib, not libhiredis.a, so changing that name gets a little bit further:

jesse@Heart-of-Gold flapjackfeeder $ (cd src ; gcc -fPIC -g -O2 -DHAVE_CONFIG_H -DNSCORE -o flapjackfeeder.o flapjackfeeder.c -shared -fPIC ../../hiredis/libhiredis.dylib)
Undefined symbols for architecture x86_64:
  "_find_host", referenced from:
      _npcdmod_handle_data in ccUpaUZW.o
  "_find_service", referenced from:
      _npcdmod_handle_data in ccUpaUZW.o
  "_neb_deregister_callback", referenced from:
      _nebmodule_deinit in ccUpaUZW.o
  "_neb_register_callback", referenced from:
      _nebmodule_init in ccUpaUZW.o
  "_neb_set_module_info", referenced from:
      _nebmodule_init in ccUpaUZW.o
  "_schedule_new_event", referenced from:
      _nebmodule_init in ccUpaUZW.o
  "_strip", referenced from:
      _npcdmod_process_config_var in ccUpaUZW.o
  "_write_to_all_logs", referenced from:
      _npcdmod_process_config_var in ccUpaUZW.o
      _npcdmod_handle_data in ccUpaUZW.o
      _npcdmod_file_roller in ccUpaUZW.o
      _nebmodule_deinit in ccUpaUZW.o
      _nebmodule_init in ccUpaUZW.o
ld: symbol(s) not found for architecture x86_64
collect2: ld returned 1 exit status

Module connects to Redis but doesn't send events

I'm using flapjackfeeder4-v0.0.5.o on a Nagios 4.0.8 install (wired up to a Redis 3.0.2 server, if that matters), and when I start Nagios the log shows:

[1436914037] Nagios 4.0.8 starting... (PID=94053)
[1436914037] Local time is Tue Jul 14 15:47:17 PDT 2015
[1436914037] LOG VERSION: 2.0
[1436914037] qh: Socket '<omitted>/nagios/var/rw/nagios.qh' successfully initialized
[1436914037] qh: core query handler registered
[1436914037] nerd: Channel hostchecks registered successfully
[1436914037] nerd: Channel servicechecks registered successfully
[1436914037] nerd: Channel opathchecks registered successfully
[1436914037] nerd: Fully initialized and ready to rock!
[1436914037] wproc: Successfully registered manager as @wproc with query handler
[1436914037] wproc: Registry request: name=Core Worker 94054;pid=94054
[1436914037] wproc: Registry request: name=Core Worker 94055;pid=94055
[1436914037] wproc: Registry request: name=Core Worker 94057;pid=94057
[1436914037] wproc: Registry request: name=Core Worker 94058;pid=94058
[1436914037] wproc: Registry request: name=Core Worker 94059;pid=94059
[1436914037] wproc: Registry request: name=Core Worker 94063;pid=94063
[1436914037] wproc: Registry request: name=Core Worker 94062;pid=94062
[1436914037] wproc: Registry request: name=Core Worker 94060;pid=94060
[1436914037] wproc: Registry request: name=Core Worker 94056;pid=94056
[1436914037] wproc: Registry request: name=Core Worker 94064;pid=94064
[1436914037] wproc: Registry request: name=Core Worker 94065;pid=94065
[1436914037] wproc: Registry request: name=Core Worker 94061;pid=94061
[1436914037] flapjackfeeder: Copyright (c) 2013-2015 Birger Schmidt, derived from npcdmod
[1436914037] flapjackfeeder: This is version 'v0.0.5' running.
[1436914037] flapjackfeeder: redis connection (<omitted>.net:6379debug_level=0) has to be (re)established.
[1436914037] flapjackfeeder: redis connection (<omitted>.net:6379debug_level=0) established.
[1436914037] Error: Failed to add event to squeue '(nil)' with prio 1: Operation now in progress
[1436914037] Event broker module '<omitted>/nagios/lib/flapjackfeeder4-v0.0.5.o' initialized successfully.
[1436914037] Successfully launched command file worker with pid 94066

I'm not sure if that Error message is related, but while there's an active connection to the Redis database, events aren't being sent (validated by querying Redis and running a tcpdump, which shows no traffic other than the initial connection establishment).

EDIT: I just fixed a bug in the nagios config (missing line break), so the redis connection lines are now correct:

[1436916334] flapjackfeeder: Copyright (c) 2013-2015 Birger Schmidt, derived from npcdmod
[1436916334] flapjackfeeder: This is version 'v0.0.5' running.
[1436916334] flapjackfeeder: redis connection (<omitted>.net:6379) has to be (re)established.
[1436916334] flapjackfeeder: redis connection (<omitted>.net:6379) established.
[1436916334] Error: Failed to add event to squeue '(nil)' with prio 1: Operation now in progress
[1436916334] Event broker module '/opt/palantir/houston/releases/current/services/nagios/lib/flapjackfeeder4-v0.0.5.o' initialized successfully.

...but I'm still getting the same error and the same behavior.

Config option needed for Flapjack major version, and change Redis write behaviour based on that

Flapjack v2 requires a write (containing any data) to the events_actions list once new data has been placed on the events queue, for safely interruptible event processing. flapjackfeeder will need to know whether or not to place that data there, and then do it if so.

(That queue name is actually configurable, and then suffixed with "_actions", I'll see if I can make that configurable via the config line as well.)
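A minimal Python sketch of the two-step write described above, under stated assumptions: `FakeRedis` is an in-memory stand-in used only so the example runs without a server, `flapjack_major` is a hypothetical name for the proposed config option, and the "+" payload is an arbitrary choice, since any data is said to work.

```python
import json


class FakeRedis:
    """In-memory stand-in exposing only an lpush-like method, for illustration."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, value):
        # Prepend the value, as Redis LPUSH does, and return the new length.
        self.lists.setdefault(key, []).insert(0, value)
        return len(self.lists[key])


def enqueue_event(client, event, queue="events", flapjack_major=2):
    """Push the event, then write to the actions list when targeting Flapjack v2."""
    client.lpush(queue, json.dumps(event))
    if flapjack_major >= 2:
        # The actions list name is the queue name suffixed with "_actions".
        client.lpush(queue + "_actions", "+")
```

Deriving the second list name from `queue` keeps the two names in step if the queue name is changed through the config line.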
