Coder Social home page Coder Social logo

lure's Introduction

lure

Lustre Filesystem Realtime Resource Explorer

What is it?

One single binary does it all! lure is a tool allowing you to monitor lustre clients, OSS and MDS servers in realtime with a very simple to use commandline utility.

It also provides the stats available via web browser interface and allows to feed server and client statistics into an InfluxDB.

What's new in this version?

  • report Lustre client performance stats

Note

Developed on an ldiskfs based lustre system. Should work on ZFS based systems too, let me know!

Also, I started this project as I was in need of a very simple to use tool which doesn't have 3rd party package or other software dependencies, can be easily distributed(just copy the binary to the server you want to look at) and doesn't require a PhD to get it built and working. The code might look a bit clunky here and there in its first iteration, but it does the job for me, and I'll see where I can improve it, if required. I'll also add more code documentation as I work on it and time allows.

Current functionality:

Lustre client Stats

  • Report throughput and metadata statistics

MDS Stats

  • Report MDT metadata performance statistics
  • Report MDT jobstats

OST Stats

  • Report OST throughput statistics
  • Report OST jobstats

Web/JSON interface

  • Report client, MDT and OST performance statistics, incl. jobstats
  • All stats can be pulled via HTTP Get. Details further down

InfluxDB support

  • feed all client, MDT and OST stats and jobstats directly into an InfluxDB
  • support for InfluxDB 1.8 or 2.x or later
  • Note: You can use this tool with Prometheus too, just use the Prometheus json exporter.

Stuff I'm working on for the next release

  • Continuous code clean up
  • Report OS, MDT and client IOPs details
  • Add capacity and inode ldiskfs consumption reporting

Installation

Quite simple actually. Just download the binary from here and run it on your Lustre client, MDS or OSS.

Or git clone https://github.com/storagebit/lure/ and cd into the bin directory where you find the binary or build and compile it from the source using the build_lure.sh bash script in the src directory. The choice is yours.

How to use it

Also, quite simple.

$ ./lure -h
Usage of ./lure:
  -daemon
    	Run as daemon in the background. No console output but stats available via web interface.
  -feedtoinflux
    	Store statistics in InfluxDB
  -ignore
    	Don't report OST stats.
  -ignoremdt
    	Don't report MDT stats.
  -influxbucket string
    	InfluxDB bucket (default "lure")
  -influxorg string
    	InfluxDB org (default "storagebit")
  -influxport string
    	InfluxDB server port (default "8086")
  -influxserver string
    	InfluxDB server name or IP (default "localhost")
  -influxtoken string
    	Read/Write token for the bucket or user:password in the InfluxDB (default "lure:password")
  -interval int
    	Sample interval in seconds (default 1)
  -jobstats
    	Report Lustre Jobstats for MDT and OST devices.
  -port int
    	HTTP port used to access the the stats via web browser. (default 8666)
  -version
    	Print version information.

To access the stats via web browser use: http://<ip address>:<port number>/stats as the URL

If you want to web access only, run lure in a fashion similar to: nohup ./lure -daemon /dev/null 2>&1 & Or you can also write a systemd unit file and run it as a lightweight daemon or service. That's how I run it.

Sample command line output(web will look very similar)

MDT Metadata Stats /s:
         Device      open     close     mknod      link    unlink     mkdir     rmdir    rename   getattr   setattr  getxattr  setxattr    statfs      sync  s_rename  c_rename
  testfs-MDT0001         0         0         0         0         0         0         0         0         0         0         0         0         1         0         0         0
  testfs-MDT0000         0         0         0         0         0         0         0         0         0         0         0         0         1         0         0         0

OST Operation Stats /s:
         Device  write_bytes   read_bytes      setattr       statfs       create      destroy        punch         sync     get_info     set_info
  testfs-OST0001            0            0            0            2            0            0            0            0            0            0
  testfs-OST0000            0            0            0            2            0            0            0            0            0            0

Stats in JSON format via HTTP Get

  • Lustre client stats via HTTP Get at http://<ip address>:<port number>/json?stats=client
  • MDT stats via HTTP Get at http://<ip address>:<port number>/json?stats=mdt
  • OST stats via HTTP Get at http://<ip address>:<port number>/json?stats=ost
  • MDT Jobstats via HTTP Get at http://<ip address>:<port number>/json?stats=mdtjob
  • OST Jobstats via HTTP Get at http://<ip address>:<port number>/json?stats=ostjob

Returns HTTP status 204 if there is no data to display, HTTP status 500 if there is an internal error, or HTTP status 400 if the request/URL was incorrect.

Note on InfluxDB

  • lure supports v1.8+ and the new InfluxDB format as introduced with version 2.x+
  • If you use v1.8+, as I do mostly, create the DB manually and setup user credentials with read/write access for the DB
  • For v1.8+, use the database name or the database/retention_policy name as "bucket" and the user:password for the token

Note on the example Grafana dashboard

  • setup the InfluxDB data source as v1 InfluxDB connection
  • Don't forget to match the sample interval to the lure interval

Happy Lustre Real-Time Monitoring!

lure's People

Contributors

storagebit avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.