hamann / check-barman
nagios plugin for PostgreSQL backup tool barman
License: MIT License
Hello,
I have an issue with NRPE and check-barman, and I don't understand where the problem is.
I modified /etc/sudoers, adding this:
Defaults:nagios !requiretty
nagios ALL=(ALL) NOPASSWD:/usr/lib64/nagios/plugins/check-barman/check-barman.rb
When I execute the command as the nagios user, it works:
-sh-4.1$ whoami
nagios
-sh-4.1$ sudo /usr/lib64/nagios/plugins/check-barman/check-barman.rb -a pg -s KBILL-SAP -h /opt/data/MSA/barman
PG connection ok
-sh-4.1$ rvmsudo /usr/lib64/nagios/plugins/check-barman/check-barman.rb -a pg -s KBILL-SAP -h /opt/data/MSA/barman
PG connection ok
I configured the NRPE file:
dont_blame_nrpe=1
allowed_hosts=X.X.X.X
command[check-barman]=/usr/local/rvm/bin/rvmsudo /usr/lib64/nagios/plugins/check-barman/check-barman.rb -a
command[check_barman_pg]=sudo /usr/lib64/nagios/plugins/check-barman/check-barman.rb -a pg -s KBILL-SAP -h /opt/data/MSA/barman
Other NRPE commands work, but the barman plugin does not. I get this error:
NRPE: Unable to read output
Did you encounter this error?
Cédric
Hello,
With the new barman version 1.3.0, the command "barman list-backup KBILL-SAP" returns:
KBILL-SAP 20140422T220001 - Wed Apr 23 10:00:53 2014 - Size: 124.5 GiB - WAL Size: 4.2 GiB (tablespaces: kbill_tbs:/opt/data/postgres)
KBILL-SAP 20140417T220002 - Thu Apr 17 22:38:50 2014 - Size: 120.1 GiB - WAL Size: 4.0 GiB (tablespaces: kbill_tbs:/opt/data/postgres)
The command "./check-barman.rb -a last_wal_received -s KBILL-SAP -w 300 -c 600" returns an error:
/usr/local/rvm/gems/ruby-2.1.0/gems/rbarman-0.0.12/lib/rbarman/cli_command.rb:269:in `size_in_bytes': unit not one of B|KiB|MiB|GiB|TiB (ArgumentError)
from /usr/local/rvm/gems/ruby-2.1.0/gems/rbarman-0.0.12/lib/rbarman/cli_command.rb:217:in `block in parse_backup_list'
from /usr/local/rvm/gems/ruby-2.1.0/gems/rbarman-0.0.12/lib/rbarman/cli_command.rb:205:in `each'
from /usr/local/rvm/gems/ruby-2.1.0/gems/rbarman-0.0.12/lib/rbarman/cli_command.rb:205:in `parse_backup_list'
from /usr/local/rvm/gems/ruby-2.1.0/gems/rbarman-0.0.12/lib/rbarman/cli_command.rb:67:in `backups'
from /usr/local/rvm/gems/ruby-2.1.0/gems/rbarman-0.0.12/lib/rbarman/backups.rb:20:in `all'
from ./check-barman.rb:78:in `check_last_wal_received'
from ./check-barman.rb:202:in
The issue is in the regular expression that matches these lines in your library, rbarman/cli_command.rb, line 215:
sizematch = l.match(/.+Size:\s(.+)\s(.+)\s-.+Size:\s(.+)\s(.+)/)
You need to modify it so that it does not match " (tablespaces: kbill_tbs:/opt/data/postgres)" at the end of the output. You can change it to this match:
sizematch = l.match(/.+Size:\s(.+)\s(.+)\s-.+Size:\s(.+)\s(.+)\s\(/)
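A pattern that requires a trailing parenthesis would presumably stop matching servers that have no tablespaces. As a rough sketch of an alternative (the constant `BACKUP_SIZES` and the helper `parse_backup_sizes` are hypothetical names, not part of rbarman), the greedy `.+` groups can be replaced with `\S+`, which stops at whitespace and so copes with both forms of the line:

```ruby
# Hypothetical pattern, not the one shipped in rbarman.
# \S+ stops at whitespace, so a trailing "(tablespaces: ...)" is never captured.
BACKUP_SIZES = /Size:\s(\S+)\s(\S+)\s-\sWAL Size:\s(\S+)\s(\S+)/

# Returns the size and WAL size as [number, unit] pairs, or nil when the
# line does not contain both sizes.
def parse_backup_sizes(line)
  match = line.match(BACKUP_SIZES)
  return nil if match.nil?

  {
    :size => [match[1].to_f, match[2]],
    :wal_size => [match[3].to_f, match[4]]
  }
end

with_tablespaces = 'KBILL-SAP 20140422T220001 - Wed Apr 23 10:00:53 2014 - ' \
                   'Size: 124.5 GiB - WAL Size: 4.2 GiB ' \
                   '(tablespaces: kbill_tbs:/opt/data/postgres)'
p parse_backup_sizes(with_tablespaces)
```

Returning nil for a non-matching line also lets the caller skip or report it instead of failing later on a nil match.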
When I execute
./check-barman.rb -a last_wal_received -s postgresdb -w 300 -c 600
I get this error:
/var/lib/gems/2.1.0/gems/mixlib-shellout-1.3.0/lib/mixlib/shellout.rb:253:in `invalid!': Expected process to exit with [0], but received '1' (Mixlib::ShellOut::ShellCommandFailed)
---- Begin output of /usr/bin/barman list-files --target wal postgresdb 20150610T150001 ----
STDOUT:
STDERR: EXCEPTION: 'NoneType' object has no attribute 'rfind'
See log file for more details.
---- End output of /usr/bin/barman list-files --target wal postgresdb 20150610T150001 ----
Ran /usr/bin/barman list-files --target wal postgresdb 20150610T150001 returned 1
from /var/lib/gems/2.1.0/gems/mixlib-shellout-1.3.0/lib/mixlib/shellout.rb:239:in `error!'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:352:in `run_barman_command'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:118:in `wal_files'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:71:in `block in backups'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:68:in `each'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:68:in `backups'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:54:in `backup'
from /var/lib/gems/2.1.0/gems/rbarman-0.0.14/lib/rbarman/backup.rb:298:in `by_id'
from ./check-barman.rb:82:in `check_last_wal_received'
from ./check-barman.rb:235:in `
I am using pgbarman version 1.4.1 on Ubuntu 14.04.
Hi
Which user should install this plugin, and in which directory?
I am using Red Hat, and I assume it should be installed on the barman server, where a Nagios agent is already installed?
Thank you,
Ayad
Hi
I'm using Ubuntu Server 12.04 x64. Every time I try to run the following command
./check-barman.rb -a backups_available -s main -w 5 -c 6
I receive the following error:
/usr/lib/ruby/1.9.1/time.rb:267:in `parse': no time information in "None" (ArgumentError)
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/backup.rb:116:in `backup_start='
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/cli_command.rb:239:in `block in parse_backup_info_file'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/cli_command.rb:235:in `each'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/cli_command.rb:235:in `parse_backup_info_file'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/cli_command.rb:69:in `block in backups'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/cli_command.rb:68:in `each'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/cli_command.rb:68:in `backups'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.13/lib/rbarman/backups.rb:20:in `all'
from ./check-barman.rb:68:in `check_backups_available'
from ./check-barman.rb:199:in `
Thanks a lot for the help
Kind regards
luker
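The trace above shows `Time.parse` being handed the literal string "None", which suggests (this is an assumption) that one backup has no recorded start time, for example because it never completed. A minimal sketch of a guard, using a hypothetical helper `parse_backup_time` that is not part of rbarman:

```ruby
require 'time'

# Hypothetical helper: returns nil instead of raising when the backup info
# carries no usable timestamp (nil, empty, or the literal "None").
def parse_backup_time(value)
  text = value.to_s.strip
  return nil if text.empty? || text == 'None'

  Time.parse(text)
end

p parse_backup_time('None')
p parse_backup_time('2015-08-25 15:30:01.470980-07:00')
```

The caller would then have to decide how to treat a backup whose start time is nil, for instance by counting it as incomplete.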
$ ./check-barman.rb -a backups_available -s test1 -w 5 -c 6
"4 backups available"
check number of backups and set warning if number of backups is > 5 or set critical if number of backups is > 6
Shouldn't the ">" (larger than) be "<" (smaller than) in the sentence above explaining the available backups and warning level?
So, give a warning if the number of available backups is "smaller than" 5?
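To make the two readings concrete, here is a small sketch that evaluates the same count under both comparisons. Both helpers are hypothetical and say nothing about which comparison the plugin actually implements:

```ruby
# Hypothetical helpers for comparing the two readings of -w and -c.

# Reading 1: alert when the count rises above the thresholds.
def status_when_above(count, warning, critical)
  return :critical if count > critical
  return :warning if count > warning

  :ok
end

# Reading 2: alert when the count falls below the thresholds.
def status_when_below(count, warning, critical)
  return :critical if count < critical
  return :warning if count < warning

  :ok
end

# The example from the question: 4 backups with -w 5 -c 6.
p status_when_above(4, 5, 6)
p status_when_below(4, 5, 6)
```

Under the "smaller than" reading, the critical threshold would normally be set lower than the warning threshold, for example -w 5 -c 3.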
root@db:~# /usr/lib/nagios/plugins/check-barman -a last_wal_received -s db01 -w 1800 -c 2700
/usr/lib/ruby/1.8/rbarman/cli_command.rb:272:in `size_in_bytes': unit not one of B|KiB|MiB|GiB|TiB (ArgumentError)
from /usr/lib/ruby/1.8/rbarman/cli_command.rb:219:in `parse_backup_list'
from /usr/lib/ruby/1.8/rbarman/cli_command.rb:207:in `each'
from /usr/lib/ruby/1.8/rbarman/cli_command.rb:207:in `parse_backup_list'
from /usr/lib/ruby/1.8/rbarman/cli_command.rb:66:in `backups'
from /usr/lib/ruby/1.8/rbarman/backups.rb:20:in `all'
from /usr/lib/nagios/plugins/check-barman:78:in `check_last_wal_received'
from /usr/lib/nagios/plugins/check-barman:202
This is the content of the variable lines passed to parse_backup_list as an argument:
GiB
tbl_spc_hdd:/var/lib/postgresql_p1)
root@db:~# barman list-backup db1
db1 20140528T093503 - Wed May 28 09:53:56 2014 - Size: 35.6 GiB - WAL Size: 154.5 GiB (tablespaces: tbl_spc_hdd:/var/lib/postgresql_p1)
This started when I added a tablespace in PostgreSQL.
Hi,
First, this is a very good plugin for us, awesome job! We ran into a problem recently:
barman --version
1.4.0
ssh works fine
pg works fine
backups_available works fine
last_wal_received fails with an error
failed_backups works fine
./check-barman.rb -a last_wal_received -s u-master -w 300 -c 600
/var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:125:in `wal_files': undefined method `downcase' for :gzip:Symbol (NoMethodError)
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:121:in `each'
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:121:in `wal_files'
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:71:in `backups'
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:68:in `each'
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:68:in `backups'
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/cli_command.rb:54:in `backup'
from /var/lib/gems/1.8/gems/rbarman-0.0.14/lib/rbarman/backup.rb:298:in `by_id'
from ./check-barman.rb:82:in `check_last_wal_received'
from ./check-barman.rb:235
latest backup info
barman show-backup u-master latest
Backup 20150825T153001:
Server Name : u-master
Status : DONE
PostgreSQL Version : 90404
PGDATA directory : /var/lib/postgresql/9.4/main
Base backup information:
Disk usage : 22.5 GiB (22.5 GiB with WALs)
Incremental size : 22.5 GiB (-0.00%)
Timeline : 1
Begin WAL : 000000010000002B0000006F
End WAL : 000000010000002B00000079
WAL number : 11
WAL compression ratio: 68.10%
Begin time : 2015-08-25 15:30:01.470980-07:00
End time : 2015-08-25 16:02:05.825740-07:00
Begin Offset : 5025552
End Offset : 0
Begin XLOG : 2B/6F4CAF10
End XLOG : 2B/7A000000
WAL information:
No of files : 199
Disk usage : 1.1 GiB
WAL rate : 0.35/hour
Compression ratio : 63.54%
Last available : 000000010000002C00000040
Catalog information:
Retention Policy : VALID
Previous Backup : 20150821T153001
Next Backup : - (this is the latest base backup)
barman status u-master
Server u-master:
Description: U master PostgreSQL Database
PostgreSQL version: 9.4.4
pgespresso extension: Available
PostgreSQL Data directory: /var/lib/postgresql/9.4/main
PostgreSQL 'archive_command' setting: if [ ! -f /var/lib/postgresql/stop_archive ]; then rsync -a -e "ssh -p 20002" %p barman@prod-pgbackup-1:/var/lib/barman/u-master/incoming/%f; else cd .; fi
Last archived WAL: 000000010000002C00000040
Failures of WAL archiver: 0
Retention policies: enforced (mode: auto, retention: RECOVERY WINDOW OF 1 WEEKS, WAL retention: MAIN)
No. of available backups: 3
First available backup: 20150818T153001
Last available backup: 20150825T153001
Minimum redundancy requirements: satisfied (3/0)
Is it because the output of the new barman version changed?
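One thing the trace hints at: the gem path is /var/lib/gems/1.8, and in Ruby 1.8 `Symbol` has no `downcase` method (it was added in Ruby 1.9). So the failure may come from the Ruby version rather than from barman's output. A sketch of a version-independent normalization, using a hypothetical helper `normalize_compression` that is not part of rbarman:

```ruby
# Hypothetical helper: converts through String so that it also works on
# Ruby 1.8, where Symbol does not respond to downcase.
def normalize_compression(value)
  value.to_s.downcase.to_sym
end

p normalize_compression(:GZIP)
p normalize_compression('Bzip2')
```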
When I call
/check-barman.rb -a last_wal -s <HOST> -w 300 -c 600
I get this error:
.../check-barman.rb:95:in `check_last_wal_received': undefined method `created' for nil:NilClass (NoMethodError)
Apparently the object latest.wal_files.last does not exist.
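`Array#last` returns nil for an empty array, so this error would be expected whenever the latest backup has no WAL files listed. A minimal sketch of a guard, where `WalStub` and `seconds_since_last_wal` are hypothetical stand-ins and not the plugin's own code:

```ruby
# Hypothetical stand-in for a WAL file object that exposes #created.
WalStub = Struct.new(:created)

# Returns the age of the newest WAL file in seconds, or nil when the
# backup has no WAL files at all.
def seconds_since_last_wal(wal_files, now = Time.now)
  last = wal_files.last
  return nil if last.nil?

  now - last.created
end

p seconds_since_last_wal([])
p seconds_since_last_wal([WalStub.new(Time.now - 120)])
```

A check built on this could report UNKNOWN or CRITICAL for the nil case rather than crashing.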
Hi,
I've encountered an error with check-barman for the actions last_wal and missing_wal that is caused by a WAL history file. The history file is created when the timeline changes. In my case this always occurs after performing a failover in a master/slave cluster managed by crm.
./check-barman.rb -a last_wal -s mypgcluster -w 5 -c 10
The check command is causing the following error:
/var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/wal_file.rb:109:in `parse': name has to be exactly 24 chars (RBarman::InvalidWalFileNameError)
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:196:in `block in parse_wal_files_list'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:194:in `each'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:194:in `parse_wal_files_list'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:119:in `wal_files'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:71:in `block in backups'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:68:in `each'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:68:in `backups'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:54:in `backup'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/backup.rb:298:in `by_id'
from ./check-barman.rb:84:in `check_last_wal_received'
from ./check-barman.rb:237:in
The error disappears if the 00000005.history file is deleted and the xlog is rebuilt.
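Rather than deleting the history file, the file list could be filtered before parsing. A rough sketch, assuming WAL segment names are exactly 24 uppercase hexadecimal characters, as in the barman output quoted elsewhere on this page; `WAL_SEGMENT_NAME` and `wal_segment_names` are hypothetical names, not rbarman API:

```ruby
# Hypothetical filter: keeps only entries whose base name looks like a WAL
# segment (24 hex characters), dropping timeline history files and the like.
WAL_SEGMENT_NAME = /\A[0-9A-F]{24}\z/

def wal_segment_names(paths)
  paths.select { |path| File.basename(path) =~ WAL_SEGMENT_NAME }
end

listing = [
  '000000010000002B0000006F',
  '00000005.history',
  '000000010000002B00000079'
]
p wal_segment_names(listing)
```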
Hello,
I get an error when I try to execute the check as the nagios user with sudo:
-sh-4.1$ whoami
nagios
-sh-4.1$ sudo /usr/lib64/nagios/plugins/check-barman/check-barman.rb -a pg -h /opt/data/MSA/barman -s KBILL-SAP
/usr/lib64/nagios/plugins/check-barman/check-barman.rb:26:in `require': no such file to load -- rbarman (LoadError)
from /usr/lib64/nagios/plugins/check-barman/check-barman.rb:26
However, I have the rbarman gem installed :
-sh-4.1$ gem list rbarman
*** LOCAL GEMS ***
rbarman (0.0.13, 0.0.12)
And in sudoers file, I followed your advice and I added these lines :
Defaults:nagios !requiretty
nagios ALL=NOPASSWD: /usr/lib64/nagios/plugins/check-barman/check-barman.rb
And when I list gems with sudo, I also see the rbarman gem:
-sh-4.1$ sudo gem list rbarman
*** LOCAL GEMS ***
rbarman (0.0.13, 0.0.12)
Isn't there something missing in your documentation about using check-barman with sudo and the nagios user?
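One way to narrow this down is to compare what the Ruby interpreter sees with and without sudo, since sudo may select a different interpreter or gem path than the login shell (an assumption, not something the output above confirms). A small diagnostic sketch; `gem_load_report` is a hypothetical helper:

```ruby
require 'rubygems'

# Hypothetical diagnostic: reports the interpreter version, the gem search
# path, and whether the named library can be required from this process.
def gem_load_report(name)
  loadable = begin
    require name
    true
  rescue LoadError
    false
  end

  { :ruby_version => RUBY_VERSION, :gem_paths => Gem.path, :loadable => loadable }
end

p gem_load_report('rbarman') if __FILE__ == $0
```

Saving this to a file and running it once directly and once through sudo should show whether the two environments differ.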
I don't know why I get this result.
I do have an old failed backup, but I think this is a bug in parsing the output?
/opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/backup.rb:136:in `status=': only :empty, :started, :done or :failed allowed! (ArgumentError)
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/cli_command.rb:239:in `block in parse_backup_info_file'
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/cli_command.rb:235:in `each'
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/cli_command.rb:235:in `parse_backup_info_file'
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/cli_command.rb:69:in `block in backups'
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/cli_command.rb:68:in `each'
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/cli_command.rb:68:in `backups'
from /opt/sensu/embedded/lib/ruby/gems/2.4.0/gems/rbarman-0.0.16/lib/rbarman/backups.rb:20:in `all'
from ./check-barman.rb:102:in `check_failed_backups'
Greetings dev.
Recently updated our pgbarman installation to 1.6.0 and decided it was time to have some proper checks to verify the backups are proper.
However I am having some issues when running most of the check-barman actions. The psql setup is master --> slave replication with wal_level hot standby. Barman is running on the slave host, dumping to a locally mounted share.
whoami
barman
gem list rbarman
*** LOCAL GEMS ***
rbarman (0.0.15)
barman -v
1.6.0
The only check that works
./check-barman.rb -a pg -s main
PG connection ok
The checks that fail:
./check-barman.rb -a ssh -s main
SSH connection failed!
Although SSH works according to barman itself
barman check main | grep ssh
ssh: OK (PostgreSQL server)
I can live without the ssh check. However, the "good" checks produce more failures, which I interpret as being related to rbarman:
./check-barman.rb -a last_wal_received -s main -w 300 -c 600
/var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:216:in `block in parse_backup_list': undefined method `[]' for nil:NilClass (NoMethodError)
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:205:in `each'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:205:in `parse_backup_list'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:67:in `backups'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/backups.rb:20:in `all'
from ./check-barman.rb:43:in `latest_backup_id'
from ./check-barman.rb:84:in `check_last_wal_received'
from ./check-barman.rb:237:in `
./check-barman.rb -a failed_backups -s main -w 1 -c 2
/var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:216:in `block in parse_backup_list': undefined method `[]' for nil:NilClass (NoMethodError)
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:205:in `each'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:205:in `parse_backup_list'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/cli_command.rb:67:in `backups'
from /var/lib/gems/1.9.1/gems/rbarman-0.0.15/lib/rbarman/backups.rb:20:in `all'
from ./check-barman.rb:102:in `check_failed_backups'
from ./check-barman.rb:240:in
barman show-backup main latest
Backup 20160425T220004:
Server Name : main
Status : DONE
PostgreSQL Version : 90312
PGDATA directory : /var/lib/postgresql/9.3/main
Base backup information:
Disk usage : 10.6 GiB (10.6 GiB with WALs)
Incremental size : 9.7 GiB (-7.99%)
Timeline : 1
Begin WAL : 000000010000001B00000020
End WAL : 000000010000001B00000021
WAL number : 2
WAL compression ratio: 75.47%
Begin time : 2016-04-25 22:00:04.299301+02:00
End time : 2016-04-25 22:45:06.371444+02:00
Begin Offset : 9289632
End Offset : 0
Begin XLOG : 1B/208DBFA0
End XLOG : 1B/22000000
WAL information:
No of files : 7
Disk usage : 28.5 MiB
WAL rate : 0.59/hour
Compression ratio : 74.56%
Last available : 000000010000001B00000028
Catalog information:
Retention Policy : VALID
Previous Backup : 20160424T220003
Next Backup : - (this is the latest base backup)
Any help towards a solution would be appreciated.
Hi,
Thank you for the plugin. I have Barman 2.5 installed and I can successfully:
ssh [email protected] to my PostgreSQL 11 instance.
However, I can't explain why the plugin fails to do the same:
barman@dev-gb-mgt-pg-01:/usr/lib/nagios/plugins/./check-barman.rb -a ssh -s dev-gb-gal-pg11-01
SSH connection failed!
I can however connect using action pg:
barman@dev-gb-mgt-pg-01:/usr/lib/nagios/plugins/./check-barman.rb -a pg -s dev-gb-gal-pg11-01
PG connection ok
Thank you.