opmdg / check_pgactivity
Nagios remote agent
Home Page: http://opm.readthedocs.io/probes/check_pgactivity.html
License: PostgreSQL License
The ready_archives service is fine, but checking the number of WAL files ready to be archived is not fine-grained enough to know whether the archiving process is just lagging or really failing in some way.
We should add a new service able to detect whether archiving is doing fine or not.
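A rough, untested sketch of what such a service could query, assuming PostgreSQL 9.4 or later where pg_stat_archiver is available:
SELECT archived_count, last_archived_wal, last_archived_time,
       failed_count, last_failed_wal, last_failed_time
FROM pg_stat_archiver;
A growing failed_count, or a last_failed_time newer than last_archived_time, would point to the archiver actually failing rather than merely lagging.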
Hi,
When pg_dump_backups is run for the first time, the following error appears:
Remote command execution failed: Use of uninitialized value in subtraction (-) at /home/dalibo/bin/check_pgactivity line 4493.
This error should be silenced.
Counters buffers_backend and buffers_backend_fsync should be in Bps instead of Nps.
Hello,
I have been trying this for the last 2 days and it didn't work. Please guide me.
Thanks
I use last_vacuum and last_analyze, and these are the same function: check_last_maintenance.
In my case, pg_stat_user_tables has NULL values in the last_vacuum & last_analyze columns
because we don't run vacuum and analyze manually.
So only the last_auto${type} column has a value.
So I modified the statement as follows.
SELECT coalesce(min(
extract(epoch FROM current_timestamp -
CASE (last_auto${type} > last_${type}) or (last_${type} is null)
WHEN 't' THEN last_auto${type}
ELSE last_${type}
END
)::float
), 'NaN'::float),
coalesce(sum(${type}_count), 0) AS ${type}_count,
coalesce(sum(auto${type}_count), 0) AS auto${type}_count
FROM pg_stat_user_tables
Please check whether the above is right or not.
Thank you in advance.
Commit befd73c50f11a6c6a6719dae20f0de7b7585bef4 in PostgreSQL core adds the pg_ls_logdir() and pg_ls_waldir() functions. This affects at least the wal_files and archiver probes.
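For reference, a minimal, untested example of what these functions expose on a PostgreSQL 10 server:
SELECT count(*) AS wal_segments FROM pg_ls_waldir();
SELECT name, size, modification FROM pg_ls_logdir();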
Commit 25fff40798fc4ac11a241bfd9ab0c45c085e2212 introduces monitoring roles. We should prepare some documentation for it.
If there was no access since the last check, no perfdata should be returned for the database(s).
Currently, we have no way to determine if some transaction is blocked for a certain period of time.
A service oldest_waiting seems like a good idea.
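A rough sketch of the query such a service could run, assuming the pre-9.6 waiting column (on 9.6 and later, wait_event_type = 'Lock' would be the equivalent filter):
SELECT coalesce(max(extract(epoch FROM current_timestamp - query_start)), 0) AS oldest_waiting
FROM pg_stat_activity
WHERE waiting;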
Hello,
On line 1001, the code is check_pgactivity -h localhost -p 5492 -s last_vacuum -w 30m -c 1h30m; the parameter -p 5492 should be -p 5432, I think.
Thanks for OPM, I can now monitor some services of my database; I added locks, btree_bloat and backends a few days ago, and it seems quite fine. However, when I add a service using a command like check_pgactivity -h localhost -p 5432 -s last_vacuum -w 30m -c 1h30m
in the nagios service definition file, nagios returns "return code of 13 is out of bounds". Do any other command parameters need to be configured? The database_size service returns the same code of 13.
Thanks a lot.
On a 9.1.1 server, it gives me this error message:
ERROR: set-valued function called in context that cannot accept a set
I tried on a 9.1.15, and it works for me. I don't think this is an issue with the minor releases, but it might be worth looking further into it.
The SQL query fired by check_pgactivity errors out when the timeline, which is hexadecimal, contains digits beyond 0-9:
ERROR: invalid input syntax for integer: "0000001B"
This is probably because the query tries to convert the timeline to an integer from a substring:
substring(s.f from 1 for 8)::integer AS tli
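Since the timeline is hexadecimal, one possible fix (untested sketch) is to cast it through a bit string instead of a plain integer cast:
-- the 'x' prefix makes the bit input parser treat the string as hexadecimal
('x' || substring(s.f from 1 for 8))::bit(32)::integer AS tli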
Currently, the message is for instance:
POSTGRES_STREAMING_DELTA CRITICAL: critical flush lag for name@ip
Adding the human-readable lag size would be better. For instance, something like:
POSTGRES_STREAMING_DELTA CRITICAL: critical flush lag XMB for name@ip
Documentation reports the following for the -d | --dbname option:
Database name to connect to (default: "postgres").
Actually, template1 is what is really used.
With logical replication introduced in PostgreSQL 10, there will be active sessions running for each active publication on the sending server, as shown in pg_stat_activity:
-[ RECORD 2 ]----+--------------------------------------------------
datid | 17359
datname | src
pid | 13786
usesysid | 10
usename | thomas
application_name | local
client_addr | ::1
client_hostname |
client_port | 53628
backend_start | 2017-01-27 18:08:19.93667+01
xact_start |
query_start | 2017-01-27 18:08:19.946993+01
state_change | 2017-01-27 18:08:19.947523+01
wait_event_type | Client
wait_event | WalSenderWaitForWAL
state | active
backend_xid |
backend_xmin |
query | walsender
We will have to handle these sessions correctly in backends and backends_status.
Also, we may monitor subscriptions on the receiving side with the pg_stat_subscription view. Some information is available here: https://www.postgresql.org/docs/devel/static/logical-replication-monitoring.html
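As a starting point on the subscriber side, something like this could be used, assuming the PostgreSQL 10 pg_stat_subscription view:
SELECT subname, pid, received_lsn, latest_end_lsn, latest_end_time
FROM pg_stat_subscription;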
Currently, there's no way to check how bloated the indexes on TOAST relations are.
A dedicated service to check such bloat would be nice.
I would expect a warning to be triggered when running check_pgactivity
with the following arguments:
-s database_size -w 10GB -c 20GB
but the output is the following:
POSTGRES_DB_SIZE OK: 19 database(s) checked | sentry=11414456504B;10737418240;21474836480 ...
where obviously 11414456504B (10.63GB) is above threshold 10737418240 (10GB).
Hello,
Would it be welcome to distinguish page and relation locks held as SIReadLocks, used in SSI, in the locks service?
The aim of this question is to help determine which kind of lock is effectively held while using SSI. Thus, it would help to explain some application or PostgreSQL behavior while using SSI.
Thoughts ?
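For what it's worth, a minimal, untested sketch of a query that distinguishes the granularity of SIReadLocks:
SELECT locktype, count(*)
FROM pg_locks
WHERE mode = 'SIReadLock'
GROUP BY locktype;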
The pending_restart column of pg_settings has been introduced in 9.5 to report setting changes that will only apply on the next restart.
This should probably be integrated in the settings service.
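A minimal sketch of the query the settings service could add, assuming pg_settings.pending_restart (available since 9.5):
SELECT name, setting
FROM pg_settings
WHERE pending_restart;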
I want to be able to use /usr/local/lib/monitoring/plugins/check_pgactivity -s $service
without any connection parameters - and it should initiate psql without connection parameters.
The default behaviour for psql is to connect through the unix socket file, found at the compile-time configured default socket directory (/tmp in the official postgres release, but /var/run/postgres under most distributions), and then connect as the logged-in user (I'll add an nrpe user with superuser privileges, or alternatively just do sudo -u postgres).
Two things:
It's required to add at least one connection parameter. This is just an aesthetic issue; I just don't want to pass a "dummy" connection parameter.
The default host is "localhost". This causes a TCP connection, and the user is no longer authenticated by the OS, so one is asked for a password:
$ sudo -u postgres psql -h localhost
Password:
psql: fe_sendauth: no password supplied
Possibly this could be solved by running up an ident server, but I don't want to go that route.
I could add '--host /tmp' or '--host /var/run/postgres', but I don't like to do such hard-coding as it may be distro-specific where the socket file is located.
I could add some config service file on the host with a password ... but I don't want to complicate the setup.
Last, -h "" seems to work, but it gives some warnings, so it's perhaps best to avoid it:
$ sudo -u postgres /usr/local/lib/monitoring/plugins/check_pgactivity -h '' -s autovacuum
Use of uninitialized value in sprintf at /usr/local/lib/monitoring/plugins/check_pgactivity line 879.
POSTGRES_AUTOVACUUM OK: Number of autovacuum: 0 | ANALYZE=0 VACUUM=0 VACUUM_FREEZE=0 VACUUM_ANALYZE=0 oldest_autovacuum=NaNs max_workers=3
I'm considering writing up a pull request on this one.
The line "use v5.8.0;" prevents check_pgactivity from working with Perl 5.10.0. It's obviously a bug as related in : http://www.nntp.perl.org/group/perl.perl5.porters/2008/01/msg132524.html
Adding use 5.006; before this line fix the issue.
On Amazon RDS, pg_statistic is owned by the rdsadmin role, so many checks fail with a permission denied error.
The pg_stats view should be used instead.
I think https://github.com/ioguix/pgsql-bloat-estimation has an updated query, but if this is not the case, please see the Bucardo postgres nagios plugin here, which I can confirm works with Amazon RDS.
Thank you.
When psql warns about something, the warning is captured, shown in the first line of the result, then parsed, leading to various errors that are not handled, as psql exits with rc=0.
This comes from this piece of code where we redirect stderr to stdout:
$psqlcmd = qq{ $args{'psql'} --set "ON_ERROR_STOP=1" }
. qq{ -qXAf $tmpfile -R $RS -F $FS };
$psqlcmd .= qq{ --dbname='$db' } if defined $db;
$psqlcmd .= qq{ -t } unless defined $get_fields;
$res = qx{ $psqlcmd 2>&1 };
$rc = $?;
For instance:
root# sudo -u postgres psql -qXAt --set "ON_ERROR_STOP=1" -c "select 'test'"
could not change directory to "/root": Permission denied
test
root# echo $?
0
A solution to explore for this bug is to redirect stderr to a temp file and check its size and content before getting further in the parsing etc.
I'll work on this later.
Hi,
Thanks.
Not sure if this is a bug or not.
When I use the pg_dump_backup service and the delta is beyond the warning and/or critical values, it gives me the size of the latest backup in the perfdata. It looks like this:
POSTGRES_PGDUMP_BACKUP CRITICAL: 'b1_delta'=2301687B
From the name of the label, I thought it was the size of the delta for my b1 database. But actually, it is the size of the latest backup. Shouldn't it be the size of the delta?
For now, the probe only calculates the delta between the maximum number of connections (max_connections) and the number of backend connections (pg_stat_activity).
Also, should we bother with superuser_reserved_connections? If yes, we should find a way to handle applications connecting as superuser.
Database connection limit and user connection limit could also be considered, but I think they're too specific and can lead to unexpected alerts, so it would be better not to handle them.
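If superuser_reserved_connections were taken into account, the slots still available to non-superusers could be computed roughly like this (sketch, not what the probe currently does):
SELECT current_setting('max_connections')::int
     - current_setting('superuser_reserved_connections')::int
     - count(*) AS available_for_normal_users
FROM pg_stat_activity;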
PostgreSQL will change its numbering policy starting with the next release, 10, dropping the two-digit major release number in favor of a single number.
To keep backward compatibility with the old policy, pg_version_num will stay compatible with its old behavior.
check_pgactivity does not rely on pg_version_num because it only appeared in 8.2, so it recomputes it based on the "human" format. The current code will likely break with the new version scheme.
We should fix it to use pg_version_num first and fall back on the current implementation if pg_version_num fails.
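For reference, assuming the GUC meant here is server_version_num (available since 8.2), the numeric version can be read directly:
SELECT current_setting('server_version_num')::int AS version_num;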
ERROR: recovery is in progress
HINT: WAL control functions cannot be executed during recovery.
pg_last_xlog_receive_location() should be used on standby instead of pg_current_xlog_location().
The case relpages = 0 is not handled:
[...]
100 * (sub.relpages-est_pages)::float / sub.relpages AS bloat_ratio, is_na
[...]
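One possible fix (untested sketch) is to guard the division with NULLIF, which yields NULL instead of a division-by-zero error when relpages is 0:
100 * (sub.relpages - est_pages)::float
    / nullif(sub.relpages, 0) AS bloat_ratio, is_na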
Hello,
Thank you for this useful plugin we are currently putting in place with Icinga to monitor our Postgres DBs.
While setting up the monitoring, I found out that commit_ratio gives performance data which contains units like 'tps'. Unfortunately, Icinga is unable to process this because it follows the Nagios plugin guidelines:
https://nagios-plugins.org/doc/guidelines.html
http://docs.icinga.org/latest/en/perfdata.html
which do not describe tps as a performance data unit.
Could you take a look into this, please?
--- check_pgactivity.orig 2016-03-16 17:47:57.933416750 +0100
+++ check_pgactivity 2016-03-16 17:49:29.766926601 +0100
@@ -2443,9 +2443,9 @@
$global_commits += $commits;
$global_rollbacks += $rollbacks;
- push @perfdata, sprintf "'%s_commit_rate'=%.2ftps", $db, $commit_rate;
- push @perfdata, sprintf "'%s_rollback_rate'=%.2ftps", $db, $rollback_rate;
- push @perfdata, sprintf "'%s_xact_rate'=%.2ftps", $db, $xact_rate;
+ push @perfdata, sprintf "'%s_commit_rate'=%.2f", $db, $commit_rate;
+ push @perfdata, sprintf "'%s_rollback_rate'=%.2f", $db, $rollback_rate;
+ push @perfdata, sprintf "'%s_xact_rate'=%.2f", $db, $xact_rate;
if ( $rollbacks == 0) {
$ratio = 0;
@@ -2464,7 +2464,7 @@
next THRESHOLD_LOOP;
}
if ( $val eq "rollback_rate" and $crit{$val} < $rollback_rate ) {
- push @msg_crit => sprintf "'%s'=%.2ftps", $prefix, $rollback_rate;
+ push @msg_crit => sprintf "'%s'=%.2f", $prefix, $rollback_rate;
next THRESHOLD_LOOP;
}
if ( $val eq "rollback_ratio" and $crit{$val} < $ratio ) {
@@ -2479,7 +2479,7 @@
next THRESHOLD_LOOP;
}
if ( $val eq "rollback_rate" and $warn{$val} < $rollback_rate ) {
- push @msg_warn => sprintf("'%s'=%.2ftps", $prefix, $rollback_rate );
+ push @msg_warn => sprintf("'%s'=%.2f", $prefix, $rollback_rate );
next THRESHOLD_LOOP;
}
if ( $val eq "rollback_ratio" and $warn{$val} < $ratio ) {
I am using check_pgactivity with Icinga2 instead of Nagios and am getting the "File «/usr/lib64/nagios/plugins/check_pgactivity.data» not recognized as a check_pgactivity status file." error on every call to check_pgactivity (for all services using check_pgactivity), and then 1 minute later the call to check_pgactivity works fine. This pattern repeats on every subsequent call.
Here is an example from Icinga2:
Could this be a race condition for all the services competing for read access to the same status file?
Any assistance would be greatly appreciated,
Thanks.
Hello,
This bug only occurs with the human-readable output format. If the unit finishes with "s", the output is automatically converted to a duration. This happens at least with bgwriter with 'Bps' and commit_ratio with 'tps'.
$ ./check_pgactivity -h /tmp -p 5410 -U thomas -F human -s bgwriter
Service : POSTGRES_BGWRITER
Returns : 0 (OK)
Message : No writes
Perfdata : buffers_backend=0s
Perfdata : checkpoint_timed=0s
Perfdata : checkpoint_req=0s
Perfdata : buffers_checkpoint=0s
Perfdata : buffers_clean=0s
Perfdata : maxwritten_clean=0s
Perfdata : buffers_backend_fsync=0s
Perfdata : buffers_alloc=0s
It's OK in Nagios output format :
$ ./check_pgactivity -h /tmp -p 5410 -U thomas -s bgwriter
POSTGRES_BGWRITER OK: No writes | buffers_backend=0Bps checkpoint_timed=0.00249376558603491Nps checkpoint_req=0Nps buffers_checkpoint=0Bps buffers_clean=0Bps maxwritten_clean=0Nps buffers_backend_fsync=0Nps buffers_alloc=0Bps
$ ./check_pgactivity -h /tmp -p 5410 -U thomas -F human -s commit_ratio
Service : POSTGRES_COMMIT_RATIO
Returns : 0 (OK)
Message : Commits: 2 - Rollbacks: 0
Perfdata : template1_commit_rate=0s
Perfdata : template1_rollback_rate=0s
Perfdata : template1_xact_rate=0s
Perfdata : template1_rollback_ratio=0.00%
Perfdata : postgres_commit_rate=0s
Perfdata : postgres_rollback_rate=0s
Perfdata : postgres_xact_rate=0s
Perfdata : postgres_rollback_ratio=0.00%
The problem lies in sub human_output, which doesn't call is_time. We should verify the other output formats for the same issue.
We use Grafana to display the perf measures in a dashboard, but it's difficult when we have a service with duplicated labels. For example with temp_files, each database has 4 perf measures with the same label, the database name.
I think it would be great to add a suffix (or prefix) to the database name indicating the kind of measure.
For example:
postgres_Fpm=0Fpm postgres_Bpm=0Bpm postgres_temp_files=0Files postgres_temp_size=0B
Instead of:
postgres=0Fpm postgres=0Bpm postgres=0Files postgres=0B
The longest_query service uses something like what I am describing.
Is it possible to add a check for unused indexes?
Josh Berkus had a nice blog post about that in:
http://www.databasesoup.com/2014/05/new-finding-unused-indexes-query.html
Probably also a check for sequential scans on a table since the last execution of the check.
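For the unused-index part, a starting point along the lines of that post could be the following untested sketch (it deliberately skips unique indexes, which may be unused yet still needed):
SELECT schemaname, relname, indexrelname,
       pg_relation_size(indexrelid) AS index_size
FROM pg_stat_user_indexes
JOIN pg_index USING (indexrelid)
WHERE idx_scan = 0
  AND NOT indisunique
ORDER BY pg_relation_size(indexrelid) DESC;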
Some tools take a looong time to execute, and they hold a transaction open as long as they run. Yeah, pg_dump is my target :) I shouldn't get an alert if pg_dump holds a transaction for a long time. It looks like this:
03:39:12.951715 | pg_dump | 14555 | COPY public.<bip> (<bip>) TO stdout;
03:39:12.95212 | pg_dump | 14554 | COPY <bip>.<bip> (<bip>) TO stdout;
03:39:12.952371 | pg_dump | 14552 | COPY <bip>.<bip> (<bip>) TO stdout;
03:39:12.952425 | pg_dump | 14550 | COPY <bip>.<bip> (<bip>) TO stdout;
Probably it should filter out pg_dump connection lines (something like "WHERE application_name='pg_dump' ").
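As an untested sketch of that filtering idea, assuming pg_dump reports itself in application_name:
SELECT max(extract(epoch FROM current_timestamp - xact_start)) AS oldest_xact
FROM pg_stat_activity
WHERE xact_start IS NOT NULL
  AND coalesce(application_name, '') <> 'pg_dump';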
Thanks.
Hello,
Thanks for OPM.
I want to show some queries on the OPM UI, like the current number of connections for each user. I used the --custom_query
service and the connection information appeared on the Nagios UI, but I also want to show it on the OPM UI. What should I do?
Thanks.
I'd like to graph the max_connections value. Would it be possible to add the max_connections value as another "database"?
This should do it:
diff --git a/check_pgactivity b/check_pgactivity
index 14bdadb..2781c40 100755
--- a/check_pgactivity
+++ b/check_pgactivity
@@ -1164,6 +1164,9 @@ sub check_backends {
     FROM pg_stat_database AS s
     JOIN pg_database d ON d.oid=s.datid
     WHERE d.datallowconn
+    UNION
+    SELECT 'max_connections', current_setting('max_connections')::integer,
+        current_setting('max_connections')
 };
@@ -1196,7 +1199,7 @@ sub check_backends {
 LOOP_DB: foreach my $db (@rs) {
-    $num_backends += $db->[1];
+    $num_backends += $db->[1] if $db->[0] !~ '^max_connections$';
     push @perfdata,
         "$db->[0]=$db->[1];$args{'warning'};$args{'critical'};0;$db->[2]";
 }
Quick-n-dirty patch, you're warned :)
The thresholds sent to database_size are applied to the database growth since the last query; this may be slightly unexpected given the name of the service.
Further, the perfdata gives the absolute database size and not the growth. I was staring at the numbers for a while believing I had found a bug, until I read the documentation. I don't think it's appropriate that the thresholds apply to one variable while the actual reported data is another variable. And I even have a use-case where I'd actually like to monitor the absolute database size.
Suggestion: mark database_size as deprecated and split it into two: database_size_growth as a rename of the current database_size (but it should return growth values in the perfdata), and database_total_size, database_disk_size or something like that for a service that checks and returns the actual database size. Check the available disk space on the postgres data partition (if possible) and let percentages in the thresholds apply to the ratio database_size / (database_size + free disk space).
When someone drops a column from a table, the table is not rewritten and VACUUM won't reclaim the space. And check_pgactivity won't detect the space that is used for nothing. Wouldn't it be right for check_pgactivity to check for that kind of issue?
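Detecting the situation itself is cheap; here is an untested sketch of a query listing tables that still carry dropped columns (how much space they waste is another matter):
SELECT n.nspname, c.relname, count(*) AS dropped_columns
FROM pg_attribute a
JOIN pg_class c ON c.oid = a.attrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE a.attisdropped
  AND c.relkind = 'r'
GROUP BY n.nspname, c.relname;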
Commit edc4cf8 introduced some basic support for wait events in 9.6. However, it only handles strong locks, not Buffer Pin waits and the two kinds of LWLock waits. We should think about how we treat them.
Commit edc4cf8 simply handles strong locks as queries waiting for locks, which is the "historical" behavior of the waiting column of pg_stat_activity in older versions of PostgreSQL.
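For discussion, a rough sketch of what the full wait-event picture looks like on 9.6:
SELECT wait_event_type, wait_event, count(*)
FROM pg_stat_activity
WHERE wait_event IS NOT NULL
GROUP BY wait_event_type, wait_event;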
The check_postgres.pl probe offers a simple output format which only outputs one number, and returns zero regardless of the defined threshold.
This would be very useful to have in check_pgactivity too, because it could then be used with Zabbix.
A "one number to rule them all" should be defined for each service. Here is a suggestion for most of them; some would require significant changes to provide anything useful, so I put them at the bottom of the list:
Service Name | Returned value |
---|---|
archive_folder | Number of archived wal files OR age of the latest one ? |
archiver | Number of wal files ready to archive |
autovacuum | Age of the oldest running autovacuum (in seconds) |
backup_label_age | Age of backup label file (in seconds) |
btree_bloat | Bloat size for the most bloated index |
commit_ratio | Commit ratio since last execution |
custom_query | First value of the first column of the resultset |
database_size | Total variation of database sizes, using the already implemented dbinclude and dbexclude options |
hit_ratio | Total hit ratio for all databases according to dbinclude and dbexclude options |
invalid_indexes | Number of invalid indexes |
last_analyze | Age of the oldest analyze or autoanalyze (in seconds) |
last_vacuum | Age of the oldest vacuum or autovacuum (in seconds) |
locks | Total number of locks. Optionally, filter by lock type? |
longest_query | Age of the longest running query (in seconds) |
max_freeze_age | Oldest database in transaction age. |
minor_version | Version number (as in PG_NUM_VERSION ?) |
oldest_2pc | Age of the oldest 2PC xact |
oldest_idlexact | Age of the oldest idle xact |
pga_version | Version number of check_pga |
table_bloat | Bloat in bytes of the most bloated table |
table_unlogged | Number of unlogged tables |
temp_files | Rate (in bytes/s) of temp file generation |
wal_files | Total number of WAL files. |
streaming_delta | Delta in bytes between master and standby. By default, replayed delta but could be filtered |
hot_standby_delta | Delta in bytes between master and standby. By default, replayed delta but could be filtered |
backends | Percentage of used connections OR raw number ? dbinclude and dbexclude could be useful |
backends_status | Maybe an option to filter on a set of specific status should be implemented. Then, number of connections in those status |
bgwriter | Maybe an option to filter on a specific counter should be implemented. Then, this counter |
replication_slots | Delta in bytes on the most lagging replication slot. Possibly filtered by replication slot ? |
configuration | No idea |
connection | No idea |
is_hot_standby | No idea |
is_master | No idea |
is_replay_paused | No idea |
pg_dump_backup | No idea |
settings | No idea |
We do not check for errors while saving the status file in the save() sub.
For reference, here is the doc about this: http://perldoc.perl.org/Storable.html#ERROR-REPORTING
I'm trying to run check_pgactivity via NRPE.
Running command: sudo -u postgres /usr/lib/nagios/plugins/check_pgactivity -h /var/run/postgresql/ -s autovacuum
Command completed with return code 1 and output:
Return Code: 1, Output: NRPE: Unable to read output
For some reason NRPE fails to read the output. Could there be a problem with flushing stdout?
The nrpe daemon runs under the nagios user, so I've added sudo -u postgres. The nagios user should be able to connect to the DB without a password.
/etc/sudoers:
nagios ALL=NOPASSWD:/usr/lib/nagios/plugins/check_pgactivity,/usr/bin/psql
Any ideas what could possibly go wrong?
For instance, pg_stat_bgwriter:
Perfdata contains the difference from the pg_stat_bgwriter counters since last execution.
This has no meaning for the user: you never know the exact duration between 2 executions. If you have a downtime, then you'll have a spike at the next collection.
A more rigorous measure would be the difference divided by the elapsed time. You'd get consistent results no matter whether there are 10 seconds or 20 minutes between calls. And the unit would then make sense: buffer allocations per minute, for instance. Currently, it is just "buffers". It's like mixing distance and speed...
The current command is:
curl --silent --max-time=$timeout -H 'Pragma: no-cache' $url
and should be:
'curl' => "curl --silent --max-time $timeout -H 'Pragma: no-cache' $url"
At the very least with curl 7.19.7.
It is quite annoying to receive an alert when a very small backup grows from e.g. 20 to 22 MB.