gsxryan / storj_telegraf_mon
StorJ V3 Health and Success Rates output using telegraf inputs.exec to Grafana Dashboard
Instead of:
storj_telegraf_mon/folder_size.sh
Line 16 in a5d077f
Why not query the SQLite database or the dashboard API?
Error, Etherscan API returned: {"status":"1","message":"OK","result":"############"}
[[: not found
in the if statements.
I'm using Ubuntu 18.04 LTS.
I see that there is a new host variable to avoid two datasources, and I am wondering how I should set this.
I can't seem to get any kind of output into Grafana. I know that all my scripts are set up correctly and working, and they appear to be configured correctly in Telegraf. Is there any way I can look into InfluxDB to see what data, if any, is being written to the database?
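One way to check what has reached InfluxDB is its 1.x HTTP `/query` endpoint. The sketch below assumes an InfluxDB 1.x server at `http://localhost:8086` and a database named `telegraf`; both values, and the helper names, are assumptions for illustration and need adjusting to the actual setup.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: InfluxDB 1.x on localhost:8086 and a database called "telegraf".
INFLUX_URL = "http://localhost:8086"
DATABASE = "telegraf"


def build_query_url(query, base_url=INFLUX_URL, database=DATABASE):
    """Return the /query URL for one InfluxQL statement."""
    params = urllib.parse.urlencode({"db": database, "q": query})
    return f"{base_url}/query?{params}"


def extract_values(payload):
    """Flatten the rows of every series in a decoded /query JSON response."""
    rows = []
    for result in payload.get("results", []):
        for series in result.get("series", []):
            rows.extend(series.get("values", []))
    return rows


def run_query(query):
    """Send the statement to InfluxDB and return its rows (needs a running server)."""
    with urllib.request.urlopen(build_query_url(query), timeout=10) as response:
        return extract_values(json.load(response))
```

Calling `run_query("SHOW MEASUREMENTS")` should list measurements such as StorJHealth if Telegraf is writing anything, and `run_query('SELECT * FROM "StorJHealth" ORDER BY time DESC LIMIT 5')` shows the most recent points. An empty list means nothing has been stored under that database name.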
Hi :)
I am having an issue with tokens.sh.
When I run: telegraf --debug --config /etc/telegraf/telegraf.conf --input-filter exec --test
I get: D! [inputs.exec] Error in plugin: metric parse error: expected tag at 1:7: "Error, Etherscan API returned: "
However when I run ./tokens.sh through terminal I get the correct response: e.g.
StorJToken,stat=tokens,WalletAddress="0xXXXXXXXXXXXXXXXXXXXXXXXXX" BalanceSTORJ=10,BalanceUSD=1,BalanceEUR=1 1
StorJToken,stat=prices STORJPriceUSD=0.1221,STORJPriceEUR=0.1109 1
Also if I just run the query in browser I get the correct response with 'OK'
The other two plugins work as expected.
Am I missing something very obvious?
Thanks in advance guys, and thanks for doing this awesome work!
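The "expected tag at 1:7" message is consistent with Telegraf reading the script's error text as line protocol: "Error" is taken as the measurement name, the comma announces a tag, and a space follows instead. A rough pre-check like the sketch below can show which lines of a script's output would be rejected. The pattern is a simplified approximation of line protocol (no escaping, no quoted strings containing spaces), and `invalid_lines` is a hypothetical helper, not part of Telegraf.

```python
import re

# Simplified shape of one line-protocol record:
#   measurement[,tag=value...] field=value[,field=value...] [timestamp]
LINE_RE = re.compile(
    r"^[^,\s]+"                # measurement name
    r"(,[^,=\s]+=[^,\s]+)*"    # optional tags
    r" [^,=\s]+=[^,\s]+"       # first field
    r"(,[^,=\s]+=[^,\s]+)*"    # further fields
    r"( -?\d+)?$"              # optional timestamp
)


def invalid_lines(output):
    """Return the non-empty lines that do not look like line protocol."""
    return [
        line
        for line in output.splitlines()
        if line.strip() and not LINE_RE.match(line)
    ]
```

Feeding it the text captured from tokens.sh in the same environment Telegraf uses (same user, same working directory) helps separate a script problem from a parsing problem.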
The answer is somewhere in here:
https://github.com/Pentium100MHz/storjv3-tools
`<?php
//Storj v3 concurrent connection and service time monitoring script (cacti version), by Pentium100.
error_reporting(0);
$log=array();
// Read the last 2 minutes of storagenode logs and strip ANSI color codes.
exec("/usr/bin/docker logs storagenode --since 2m 2>&1 | sed 's/\x1b[[0-9;]*m//g'",$log);
$pieces=array(); // open request count per satellite / piece / action
$times=array();  // start timestamps per satellite / piece / action
$requests=0; $requests_max=0;
$requestsup=0; $requestsup_max=0;
$requestsdown=0; $requestsdown_max=0;
$time_up=0; $req_up=0;
$time_down=0; $req_down=0;
$time_total=0; $req_total=0;
foreach ($log as $line) {
    $parts=explode("\t",$line);
    if ($parts[1] == "INFO") {
        $json=json_decode($parts[4],true);
        $action=$parts[3];
        switch ($action) {
            case "upload started":
                if (isset($pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]])) {
                    $pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]++;
                } else {
                    $pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]=1;
                }
                // Remember when this request started (seconds with microseconds).
                $times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]][]=date_format(date_create_from_format('Y-m-d?H:i:s.uT',$parts[0]),"U.u");
                $requestsup++;
                if ($requestsup > $requestsup_max) $requestsup_max=$requestsup;
                break;
            case "download started":
                if (isset($pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]])) {
                    $pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]++;
                } else {
                    $pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]=1;
                }
                $times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]][]=date_format(date_create_from_format('Y-m-d?H:i:s.uT',$parts[0]),"U.u");
                $requestsdown++;
                if ($requestsdown > $requestsdown_max) $requestsdown_max=$requestsdown;
                break;
            case "uploaded":
            case "upload failed":
                if (isset($pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]])) {
                    if ($pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]] > 0) {
                        $pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]--;
                        $requestsup--;
                        // Duration = end time minus the oldest recorded start time.
                        $endtime=date_format(date_create_from_format('Y-m-d?H:i:s.uT',$parts[0]),"U.u");
                        $duration=$endtime-$times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]][0];
                        unset($times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]][0]);
                        $times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]=array_values($times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]);
                        $req_up++;
                        $time_up+=$duration;
                        $req_total++;
                        $time_total+=$duration;
                    }
                }
                break;
            case "downloaded":
            case "download failed":
                if (isset($pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]])) {
                    if ($pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]] > 0) {
                        $pieces[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]--;
                        $requestsdown--;
                        $endtime=date_format(date_create_from_format('Y-m-d?H:i:s.uT',$parts[0]),"U.u");
                        $duration=$endtime-$times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]][0];
                        unset($times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]][0]);
                        $times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]=array_values($times[$json["SatelliteID"]][$json["Piece ID"]][$json["Action"]]);
                        $req_down++;
                        $time_down+=$duration;
                        $req_total++;
                        $time_total+=$duration;
                    }
                }
                break;
        } //switch
    } //if
} //foreach
// Average durations; report 0 when no request completed, to avoid division by zero.
$avg_up=($req_up > 0) ? $time_up/$req_up : 0;
$avg_down=($req_down > 0) ? $time_down/$req_down : 0;
$avg_total=($req_total > 0) ? $time_total/$req_total : 0;
printf ("up:%d down:%d t_up:%.3f t_down:%.3f t_total:%.3f\n",$requestsup_max,$requestsdown_max,$avg_up,$avg_down,$avg_total);
?>`
Audit Rate Alert at < 75%, critical at < 60%
curl -s http://localhost:14002/api/satellite/118UWpMCHzs6CvSgWd9BfFVjw5K9pZbJjkfZJexMtSkmKxvvAW | jq .data.audit.score
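The two thresholds can be written down as a small classifier. This is only a sketch: it assumes the value returned by the `jq` expression is a fraction between 0 and 1, so that 75% is 0.75, and `audit_alert_level` is a hypothetical name, not something the repo provides.

```python
def audit_alert_level(score, alert_below=0.75, critical_below=0.60):
    """Map an audit score (assumed to be a 0..1 fraction) to an alert level."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"audit score out of range: {score!r}")
    if score < critical_below:
        return "critical"
    if score < alert_below:
        return "alert"
    return "ok"
```

In Grafana the same rule would normally be two alert conditions on the audit score series; the function only makes the boundaries explicit (a score of exactly 0.75 is still "ok", and exactly 0.60 is "alert").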
Add custom alerts for critical thresholds;
EX:
DONE:
PENDING:
I cannot get any values for the Disk Space and Bandwidth per month graphs, although all of these values are present on my web dashboard.
The dashboard part of my telegraf.conf seems correct:
[[inputs.exec]]
commands = [
"curl -s 10.1.0.200:14002/api/dashboard" # Open SNO API by mapping ports when running your SNO docker instance
]
timeout = "60s"
interval = "1m"
data_format = "json"
tag_keys = [ "data_nodeID" ]
name_override = "StorJHealth"
I saw this error in my Telegraf log:
telegraf | 2020-04-15T12:21:00Z E! [inputs.exec] Error in plugin: invalid character '<' looking for beginning of value
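That message means the JSON parser met `<` as the first character of the command output, which usually indicates the body is HTML or XML (for example an error page) rather than JSON. A quick way to see what the endpoint really returns is to classify the raw body before Telegraf parses it. The sketch below uses a hypothetical helper, `describe_body`, and makes no assumption about the fields of the dashboard response.

```python
import json


def describe_body(body):
    """Classify a response body as 'json', 'markup', 'empty' or 'invalid'."""
    text = body.lstrip()
    if not text:
        return "empty"
    if text.startswith("<"):
        # Bodies starting with '<' are typically HTML or XML, not JSON.
        return "markup"
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return "invalid"
    return "json"
```

Running the same `curl -s` command from inside the Telegraf container and passing its output to this function shows whether Telegraf is reaching the API or some other page on that address and port.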
Hi all.
I just got my Grafana dashboard with storj_telegraf_mon up and running. But as you can see in the picture, I don't have any data points on the docker0 interface.
Here is my telegraf.conf:
[[inputs.docker]]
endpoint = "unix:///var/run/docker.sock"
gather_services = false
container_names = []
container_name_include = []
container_name_exclude = []
timeout = "5s"
perdevice = true
total = false
docker_label_include = []
docker_label_exclude = []
Dev:
Docker-Compose.yml
telegraf.conf
ToDo:
Test docker-compose up and links
Default grafana imports for dashboard
See Project for status
https://github.com/gsxryan/storj_telegraf_mon/projects/1
If your node has processed 0 repair requests, Telegraf fails to parse the stats:
Here is my telegraf stats when running test (fresh install):
docker exec -i -t telegraf /bin/bash
root@2b893255a418:/# telegraf --debug --config /etc/telegraf/telegraf.conf --input-filter exec --test
2019-08-07T03:04:19Z I! Starting Telegraf 1.11.3
2019-08-07T03:04:21Z E! [inputs.exec]: Error in plugin: metric parse error: expected field at 3:77: "StorJHealth,NodeId=12bqvr GETRepairFail=0,GETRepairSuccess=0,GETRepairRatio=nan,PUTRepairFailed=0,PUTRepairSuccess=0,PUTRepairRatio=0.000 1565147061042414707"
> StorJToken,WalletAddress="0x0000000000000000",host=2b893255a418,stat=tokens BalanceEUR=1.01111,BalanceSTORJ=1.0111,BalanceUSD=1.0111 1565147063000000000
> StorJToken,host=2b893255a418,stat=prices STORJPriceEUR=0.1395,STORJPriceUSD=0.1565 1565147063000000000
> StorJHealth,host=2b893255a418,path="/storage/storj" directory_size_kilobytes=301643780 1565147063000000000
(balances and ETH address are replaced by dummy values)
I don't know what "metric parse error: expected field" actually means, but I suspect that the problem is caused by GETRepairRatio=nan.
The NaN here is caused by a division by zero on line 88: https://github.com/gsxryan/storj_telegraf_mon/blob/master/successrate.sh#L88
(Actually, the ratio-based calculations in successrate.sh are missing a division-by-zero check.)
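A guard of the following kind would keep `nan` out of the output. It is a sketch in Python rather than the script's own shell, and it assumes the ratio is a success percentage computed as success / (success + failed) * 100, which may differ from what successrate.sh actually calculates; the function names are hypothetical.

```python
def success_ratio(success, failed):
    """Success percentage, or 0.0 when no requests were processed."""
    total = success + failed
    if total == 0:
        # Nothing to divide by: report 0.0 instead of producing NaN.
        return 0.0
    return 100.0 * success / total


def format_ratio(success, failed):
    """Render the ratio with three decimals, as in the sample output."""
    return f"{success_ratio(success, failed):.3f}"
```

With zero repair requests, `format_ratio(0, 0)` yields `0.000`, matching the PUTRepairRatio=0.000 form that Telegraf parsed without complaint.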
OK, I am coming from here with my problem now:
Why don't I have any pie chart on the left side, and only a straight line on the right?
As you can see from the link, I really checked many things. The output from "telegraf -test" looks good, I think, but the output from InfluxDB is not OK.
Does anybody have any idea?
Specs Organized by Sat.
GETs
Sat1: 160
sat 2: 15
How to format items in *.secrets?
So I'm running Docker on Windows Server 2016, but am not running my telegraf in a docker instance. I'm trying to get the docker network stats working, but it's not collecting any datapoints, even when checking with the '--test' flag.
In my telegraf.conf, I have:
[[inputs.net]]
## NIC Traffic Monitor
interfaces = ["docker0"]
Which seems like it should work, but it doesn't. Everything else is working as expected, except for this. Any ideas?
two data sources are not needed.