librato-collectd-docker

This project contains a custom collectd Exec plugin for gathering statistics from running Docker containers using the Docker API. At this time, the installation steps are brief but manual.

The output of this plugin is formatted for the Librato monitoring service, although you could technically send it to any compatible metrics receiver. Each metric defines its plugin_instance in the format librato-<container_id>, which is then rewritten on the fly in Librato's API to extract the container identifier into Librato's source dimension, and to remove the librato- marker.

An example metric originating from the plugin might look like this:

collectd/docker-librato-cc899ab3e11b/cpu/kernel

... before being rewritten into:

collectd.docker.cpu.kernel

... with a source of cc899ab3e11b.
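The rewrite itself happens on Librato's side and its implementation is not shown here. As a rough illustration only, the transformation in the example above could be sketched as follows; the `rewrite_metric` helper and its regular expression are hypothetical and merely mirror the example.

```python
import re

# Hypothetical pattern: matches a path segment such as
# "docker-librato-cc899ab3e11b" (plugin name, marker, container id).
# This is an assumption based on the README example, not Librato's code.
_MARKER = re.compile(r"^(?P<plugin>[^-/]+)-librato-(?P<source>[^/]+)$")


def rewrite_metric(path):
    """Split a collectd-style path into (metric_name, source)."""
    source = None
    cleaned = []
    for part in path.split("/"):
        match = _MARKER.match(part)
        if match:
            # Keep the plugin name, move the container id into the source.
            source = match.group("source")
            cleaned.append(match.group("plugin"))
        else:
            cleaned.append(part)
    return ".".join(cleaned), source


print(rewrite_metric("collectd/docker-librato-cc899ab3e11b/cpu/kernel"))
```

A path without the librato- marker passes through with only its separators changed and a source of None.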

Usage

Dependencies

  • Python >= 2.7
  • Docker >= 1.5
  • collectd >= 4.0 (Exec plugin support)

Environment Variables

None required. Collectd will pass the necessary COLLECTD_INTERVAL and COLLECTD_HOSTNAME variables into the script at runtime.
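For illustration, an Exec script might read these two variables and use them when emitting PUTVAL lines, roughly as below. This is a minimal sketch, not the plugin's actual code: the `format_putval` helper and the fallback values ("localhost" and 60) are assumptions made here for the example.

```python
import os


def format_putval(identifier, value, environ=os.environ):
    """Build one PUTVAL line for collectd's Exec plugin.

    `identifier` is the "plugin-instance/type-instance" part of the path.
    The fallback hostname and interval are assumptions for this sketch.
    """
    hostname = environ.get("COLLECTD_HOSTNAME", "localhost")
    # collectd may pass the interval with a fractional part, e.g. "60.000".
    interval = int(float(environ.get("COLLECTD_INTERVAL", "60")))
    return 'PUTVAL "%s/%s" interval=%d N:%s' % (
        hostname, identifier, interval, value)


print(format_putval("docker-librato-7e1810f450be/cpu-total", 7213927309))
```

Passing `environ` explicitly makes the helper easy to exercise outside of collectd.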

Installation

The custom plugin must be installed along with an additional types database (docker.db) that is loaded alongside the stock types.db. The plugin also needs access to the UNIX socket where the Docker API is listening, so an unprivileged user must be added to the docker system group. For our purposes, we've chosen the nobody system user, although you can adjust this as needed for your environment. Once these steps are done, restart the collectd service.

$ git clone https://github.com/librato/librato-collectd-docker.git
$ sudo cp collectd-docker.py /usr/share/collectd/
$ sudo cp docker.conf /etc/collectd/collectd.conf.d/
$ sudo cp docker.db /etc/collectd/collectd.conf.d/
$ sudo chmod +x /usr/share/collectd/collectd-docker.py
$ sudo usermod -a -G docker nobody
$ sudo service collectd restart

Configuration

The included docker.conf should be installed into your collectd configuration directory as demonstrated above. If your setup does not load configuration files from such a directory, add the following configuration to your collectd.conf instead. Any configuration change must be followed by a service restart.

LoadPlugin exec
<Plugin exec>
  Exec "nobody:docker" "/usr/share/collectd/collectd-docker.py"
</Plugin>

# Add custom TypesDB for network counter stats
TypesDB "/usr/share/collectd/types.db" "/etc/collectd/collectd.conf.d/docker.db"

Note that the script supports connections to the Docker API via either the default UNIX socket at unix://var/run/docker.sock or a TCP port. To change the default URL, edit the Exec line above to include the URL as an argument. For example, if your Docker API is listening via TCP on port 2375, edit the line as follows:

  Exec "nobody" "/usr/share/collectd/collectd-docker.py" "http://127.0.0.1:2375"
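How the script interprets this argument is not shown in this README. As a hedged sketch, a script could choose between the two transports like this; `resolve_endpoint` and `DEFAULT_URL` are hypothetical names introduced for the example.

```python
try:
    from urllib.parse import urlparse  # Python 3
except ImportError:
    from urlparse import urlparse  # Python 2.7

# Default taken from the text above; the constant name is an assumption.
DEFAULT_URL = "unix://var/run/docker.sock"


def resolve_endpoint(argv):
    """Return ("unix", socket_path) or ("tcp", (host, port)).

    The URL is read from the first command-line argument when present.
    """
    url = argv[1] if len(argv) > 1 else DEFAULT_URL
    parsed = urlparse(url)
    if parsed.scheme == "unix":
        # Accept both unix://var/run/... and unix:///var/run/... spellings.
        return ("unix", "/" + (parsed.netloc + parsed.path).lstrip("/"))
    return ("tcp", (parsed.hostname, parsed.port))


print(resolve_endpoint(["collectd-docker.py"]))
print(resolve_endpoint(["collectd-docker.py", "http://127.0.0.1:2375"]))
```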

License

This project is distributed under the MIT license.

librato-collectd-docker's People

Contributors

obfuscurity, chancefeick, vaidyg

librato-collectd-docker's Issues

Python 2.6 compatibility

$ DEBUG=1 /usr/share/collectd/collectd-docker.py
  File "/usr/share/collectd/collectd-docker.py", line 175
    'docker-librato.\w+.cpu_stats.*',
                                    ^
SyntaxError: invalid syntax

Reported by @jzruscio.

Works fine, but...

Is there documentation describing which data is sent and which units are used?

Docker monitoring crash during PUTVAL stats

We use ECS and the Librato agent on EC2 instances.

Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = Traceback (most recent call last):
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = File "/opt/collectd/share/collectd/collectd-docker.py", line 458, in <module>
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = format_stats(stats)
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = File "/opt/collectd/share/collectd/collectd-docker.py", line 409, in format_stats
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = build_network_stats_for(stats)
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = File "/opt/collectd/share/collectd/collectd-docker.py", line 385, in build_network_stats_for
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = for interface, interface_stats in stats['networks'].iteritems():
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: exec_read_one: error = KeyError: 'networks'
Jun 19 11:31:24 ip-172-22-6-184 collectd[11299]: exec plugin: Program `/opt/collectd/share/collectd/collectd-docker.py' has closed STDERR.

PUTVAL "localhost/docker-librato-7e1810f450be/blkio-io_service_bytes_read" interval=60 N:35463168
PUTVAL "localhost/docker-librato-7e1810f450be/memory_stats/failcnt" interval=60 N:0
PUTVAL "localhost/docker-librato-7e1810f450be/memory_stats/stats/unevictable" interval=60 N:0
PUTVAL "localhost/docker-librato-7e1810f450be/memory-page_faults" interval=60 N:9001
PUTVAL "localhost/docker-librato-7e1810f450be/cpu-total" interval=60 N:7213927309
PUTVAL "localhost/docker-librato-7e1810f450be/blkio-io_service_bytes_total" interval=60 N:35491840
PUTVAL "localhost/docker-librato-7e1810f450be/network-rx_packets" interval=60 N:20930
PUTVAL "localhost/docker-librato-7e1810f450be/memory-active_file" interval=60 N:4706304
Traceback (most recent call last):
  File "/opt/collectd/share/collectd/collectd-docker.py", line 458, in <module>
    format_stats(stats)
  File "/opt/collectd/share/collectd/collectd-docker.py", line 409, in format_stats
    build_network_stats_for(stats)
  File "/opt/collectd/share/collectd/collectd-docker.py", line 385, in build_network_stats_for
    for interface, interface_stats in stats['networks'].iteritems():
KeyError: 'networks'

"errors":[]}-1 1 Type `cpu_stats/system_cpu_usage' isn't defined.

  • OS version: CentOS 7.3.1611 in an AWS EC2 instance.
  • Docker version: docker-engine-17.05.0.ce-1.el7.centos.x86_64
  • Collectd version: collectd-core-5.7.1_librato1.413-0.x86_64

Steps:

  1. Install docker on the instance and run some containers.
  2. Download the Ansible Librato role and patch it with librato/ansible-librato#4, because the varnish-libs RPM was missing from the repo at the time of writing.
  3. Provision with these relevant settings:
librato_rh_version: '5.7.1_librato1.413'
librato_hostname: "{{ ec2_tag_Hostname }}"
librato_fqdn_lookup: false
librato_enabled_plugins: ['docker']
librato_docker_host: "{{ inventory_hostname }}"
librato_docker_port: 2375
librato_logging_use_log_file: true
librato_logging_log_file_log_level: err
librato_logging_log_file_filename: stdout

I'm getting some metrics, but I'm also getting these errors (use sudo journalctl -u collectd -f to see them):

Jul 14 03:47:02 myhost collectd[15084]: {"measurements":{"summary":{"total":34,"accepted":34,"failed":0}},"errors":[]}{"measurements":{"summary":{"total":37,"accepted":37,"failed":0}},"errors":[]}{"measurements":{"summary":{"total":67,"accepted":67,"failed":0}},"errors":[]}-1 1 Type `cpu_stats/system_cpu_usage' isn't defined.
Jul 14 03:47:02 myhost collectd[15084]: -1 1 Type `memory_stats/stats/hierarchical_memsw_limit' isn't defined.
Jul 14 03:47:02 myhost collectd[15084]: -1 1 Type `memory_stats/stats/swap' isn't defined.
Jul 14 03:47:02 myhost collectd[15084]: -1 1 Type `cpu_stats/online_cpus' isn't defined.
Jul 14 03:47:02 myhost collectd[15084]: -1 1 Type `cpu_stats/throttling_data/periods' isn't defined.
Jul 14 03:47:02 myhost collectd[15084]: -1 1 Type `memory_stats/stats/unevictable' isn't defined.
Jul 14 03:47:02 myhost collectd[15084]: -1 1 Type `memory_stats/usage' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: {"measurements":{"summary":{"total":41,"accepted":41,"failed":0}},"errors":[]}-1 1 Type `memory_stats/stats/unevictable' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: -1 1 Type `cpu_stats/online_cpus' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: -1 1 Type `cpu_stats/throttling_data/periods' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: -1 1 Type `memory_stats/stats/hierarchical_memsw_limit' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: -1 1 Type `cpu_stats/system_cpu_usage' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: -1 1 Type `memory_stats/stats/swap' isn't defined.
Jul 14 03:47:06 myhost collectd[15084]: -1 1 Type `memory_stats/usage' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: {"measurements":{"summary":{"total":34,"accepted":34,"failed":0}},"errors":[]}-1 1 Type `memory_stats/usage' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: -1 1 Type `cpu_stats/online_cpus' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: -1 1 Type `cpu_stats/system_cpu_usage' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: -1 1 Type `memory_stats/stats/unevictable' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: -1 1 Type `memory_stats/stats/swap' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: -1 1 Type `memory_stats/stats/hierarchical_memsw_limit' isn't defined.
Jul 14 03:47:10 myhost collectd[15084]: -1 1 Type `cpu_stats/throttling_data/periods' isn't defined.

Note: I've compared the collectd-docker.py at /opt/collectd/share/collectd/collectd-docker.py on the instance with this repo's collectd-docker.py file. Everything except the copyright is the same.

Stats collection fails for containers with no network

I recently deployed this plugin on our Nomad Docker servers and found that it crashed during statistics collection when interacting with a container launched with network mode set to "host". In this situation, the "networks" section of the JSON object is missing, not empty. This causes the script to crash on line 374, which in turn prevents the other metrics from being collected.

This should be correctable with a simple check to see if the key is present before iterating.
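A minimal sketch of such a check follows. The function name is taken from the traceback reported for this crash, but the body, the return shape, and the metric naming are assumptions made for illustration; the real function may differ.

```python
def build_network_stats_for(stats):
    """Collect per-interface counters, tolerating a missing 'networks' key.

    Returns a dict of {metric_name: value}. The naming scheme used for
    metric_name is hypothetical.
    """
    results = {}
    # Containers using host networking have no 'networks' section at all,
    # so fall back to an empty mapping instead of raising KeyError.
    networks = stats.get("networks") or {}
    # .items() is used instead of .iteritems() so the sketch runs on
    # both Python 2.7 and Python 3.
    for interface, interface_stats in networks.items():
        for counter, value in interface_stats.items():
            results["network-%s-%s" % (interface, counter)] = value
    return results


# A stats document without 'networks' yields no network metrics
# and lets the remaining collectors run.
print(build_network_stats_for({"cpu_stats": {}}))
print(build_network_stats_for({"networks": {"eth0": {"rx_packets": 20930}}}))
```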
