Comments (9)
Hi @Pupkur,
A few questions/observations:
- One thing I see right away is that your monitor is scheduled to run every 1 hour, but your range only looks at the past 5 minutes. So on each hourly run the monitor sees only 5 of the previous 60 minutes of data. Is this intended? Are you positive you have logs in those 5 minutes that would trigger an alert?
- It seems your initial post contained a range query for system.filesystem.used.pct and now you're using an aggregation?
- Your trigger painless script looks only at the first of your hits, with no sort specified, while your query seems to return all documents with system.filesystem.used.pct. That means there's no guarantee that the document you're comparing has the highest pct in that 5 minute interval.
- Your new query uses an aggregation, so you probably want to use ctx.results[0].aggregations.when.value in your trigger painless script to do your comparison instead.
- If you do end up using the aggregation, you can add size: 0 to your query since you do not need the hits.
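As a minimal sketch of the last three points, the fragment below shows only the parts of the monitor that would change: size 0 on the query, the max aggregation named when, and a trigger condition that reads the aggregation value instead of the first hit. The now-1h range is an assumption that lines the range up with the 1 hour schedule; keep now-5m if sampling 5 minutes per hour is what you want. The must_not clause, the action, and all other monitor fields are left out here for brevity, not because they need to go.

```json
{
  "inputs": [
    {
      "search": {
        "indices": ["metrics-dev-envs-*"],
        "query": {
          "size": 0,
          "query": {
            "bool": {
              "must": [
                { "exists": { "field": "system.filesystem.used.pct" } },
                { "range": { "@timestamp": { "from": "now-1h", "to": "now" } } }
              ]
            }
          },
          "aggregations": {
            "when": {
              "max": { "field": "system.filesystem.used.pct" }
            }
          }
        }
      }
    }
  ],
  "triggers": [
    {
      "name": "Server low disk PPI DEV envs",
      "severity": "5",
      "condition": {
        "script": {
          "source": "ctx.results[0].aggregations.when.value > 0.8",
          "lang": "painless"
        }
      },
      "actions": []
    }
  ]
}
```

With this shape the comparison is made against the maximum over the whole range rather than an arbitrary document. If no documents fall inside the range, the max value is likely to come back as null, so it may be worth guarding for that in the condition.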
from alerting.
Hi @Pupkur,
Can you provide your complete Monitor configuration so I can take a look at your query and trigger?
Thanks,
Drew
from alerting.
You can see the monitor query in the first post.
from alerting.
Sorry, I meant the whole JSON configuration from the API; interested in seeing what you wrote for your painless script in your trigger so I can attempt to reproduce your issue.
from alerting.
Sorry for the delay.
Where can I get this JSON?
from alerting.
There is a GET Monitor API you can use:
https://opendistro.github.io/for-elasticsearch-docs/docs/alerting/api/#get-monitor
from alerting.
{
"_id" : "S5fyB2sB35Cz7fEWkyEF",
"_version" : 11,
"monitor" : {
"type" : "monitor",
"name" : "Server low disk DEV envs",
"enabled" : true,
"enabled_time" : 1559206466297,
"schedule" : {
"period" : {
"interval" : 1,
"unit" : "HOURS"
}
},
"inputs" : [
{
"search" : {
"indices" : [
"metrics-dev-envs-*"
],
"query" : {
"query" : {
"bool" : {
"must" : [
{
"match_all" : {
"boost" : 1.0
}
},
{
"exists" : {
"field" : "system.filesystem.used.pct",
"boost" : 1.0
}
},
{
"range" : {
"@timestamp" : {
"from" : "now-5m",
"to" : "now",
"include_lower" : true,
"include_upper" : true,
"format" : "epoch_millis",
"boost" : 1.0
}
}
}
],
"must_not" : [
{
"match_phrase" : {
"system.filesystem.mount_point" : {
"query" : "/run/docker/*",
"slop" : 0,
"zero_terms_query" : "NONE",
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : false,
"boost" : 1.0
}
},
"aggregations" : {
"when" : {
"max" : {
"field" : "system.filesystem.used.pct"
}
}
}
}
}
}
],
"triggers" : [
{
"id" : "oZg1CGsB35Cz7fEWvpp8",
"name" : "Server low disk PPI DEV envs",
"severity" : "5",
"condition" : {
"script" : {
"source" : "ctx.results[0].hits.hits[0]._source.system.filesystem.used.pct > 0.8",
"lang" : "painless"
}
},
"actions" : [
{
"name" : "Mail to ME",
"destination_id" : "3kAf6moB35Cz7fEWRD5v",
"subject_template" : {
"source" : "",
"lang" : "mustache"
},
"message_template" : {
"source" : """
{
"title": "{{#ctx.results.0.hits.hits}}{{ctx.results.0.hits.hits.}}<FONT size=3>Server {{_source.beat.hostname}}<br> <FONT size=3>Mount point {{_source.system.filesystem.mount_point}} <FONT size=3>: {{_source.system.filesystem.used.pct}}<br><br>{{/ctx.results.0.hits.hits}}",
"text": "Low disk space on DEV env server. Please see mail for more details."
}
""",
"lang" : "mustache"
}
}
]
}
],
"last_update_time" : 1559219481868
}
}
from alerting.
Using ctx.results[0].aggregations.when.value works! Thx!
from alerting.
Hello,
I was reading this very interesting discussion and I have a problem of my own. I would like to format the numeric disk usage percentage values into something more readable (currently 0.9261000156402588; I would like it to become 92.0). Also, is it possible to set up a loop that prints all the results in the mail? For now I have found an inelegant method. Here is the JSON extracted from the API call:
{
"_id" : "txcC83cB3jrnAXbeQFjs",
"_version" : 4,
"_seq_no" : 93,
"primary_term" : 26,
"monitor" : {
"type" : "monitor",
"schema_version" : 3,
"name" : "(GlobalDash)Filesystem_Usage",
"user" : {
"name" : "admin",
"backend_roles" : [
"admin"
],
"roles" : [
"all_access",
"own_index"
],
"custom_attribute_names" : [ ]
},
"enabled" : false,
"enabled_time" : null,
"schedule" : {
"period" : {
"interval" : 15,
"unit" : "MINUTES"
}
},
"inputs" : [
{
"search" : {
"indices" : [
"idxmon*"
],
"query" : {
"size" : 0,
"query" : {
"bool" : {
"filter" : [
{
"range" : {
"timestamp" : {
"from" : "now-15m",
"to" : "now",
"include_lower" : true,
"include_upper" : true,
"boost" : 1.0
}
}
}
],
"adjust_pure_negative" : true,
"boost" : 1.0
}
},
"aggregations" : {
"groupByHost" : {
"terms" : {
"field" : "system.hostname.mount_point.keyword",
"size" : 10,
"min_doc_count" : 1,
"shard_min_doc_count" : 0,
"show_term_doc_count_error" : false,
"order" : [
{
"maxBYHOST" : "desc"
},
{
"_key" : "asc"
}
]
},
"aggregations" : {
"maxBYHOST" : {
"max" : {
"field" : "system.filesystem.used.pct"
}
},
"sales_bucket_filter" : {
"bucket_selector" : {
"buckets_path" : {
"filesystem_over_threshold" : "maxBYHOST"
},
"script" : {
"source" : "params.filesystem_over_threshold > 0.85",
"lang" : "painless"
},
"gap_policy" : "skip"
}
}
}
}
}
}
}
}
],
"triggers" : [
{
"id" : "0RgT83cB3jrnAXbeL0jI",
"name" : " (GlobalDash)Filesystem_Usage-Trigger",
"severity" : "1",
"condition" : {
"script" : {
"source" : "ctx.results[0].aggregations.groupByHost.buckets.length > 0",
"lang" : "painless"
}
},
"actions" : [
{
"id" : "0hgT83cB3jrnAXbeL0jI",
"name" : " (GlobalDash)Filesystem_sending_mail",
"destination_id" : "ThM0gXcBkntdlDdiBfwj",
"message_template" : {
"source" : """Monitor {{ctx.monitor.name}} just entered alert status. Please investigate the issue.
- Trigger: {{ctx.trigger.name}}
- Severity: {{ctx.trigger.severity}}
- Period start: {{ctx.periodStart}}
- Period end: {{ctx.periodEnd}}
RESULTS:
Hostname-Filesystem with % occupation:
{{&ctx.results.0.aggregations.groupByHost.buckets.0.key}} {{ctx.results.0.aggregations.groupByHost.buckets.0.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.1.key}} {{ctx.results.0.aggregations.groupByHost.buckets.1.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.2.key}} {{ctx.results.0.aggregations.groupByHost.buckets.2.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.3.key}} {{ctx.results.0.aggregations.groupByHost.buckets.3.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.4.key}} {{ctx.results.0.aggregations.groupByHost.buckets.4.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.5.key}} {{ctx.results.0.aggregations.groupByHost.buckets.5.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.6.key}} {{ctx.results.0.aggregations.groupByHost.buckets.6.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.7.key}} {{ctx.results.0.aggregations.groupByHost.buckets.7.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.8.key}} {{ctx.results.0.aggregations.groupByHost.buckets.8.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.9.key}} {{ctx.results.0.aggregations.groupByHost.buckets.9.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.10.key}} {{ctx.results.0.aggregations.groupByHost.buckets.10.maxBYHOST.value}}""",
"lang" : "mustache"
},
"throttle_enabled" : false,
"subject_template" : {
"source" : "Qualification - global filesystem check - alerting",
"lang" : "mustache"
}
}
]
}
],
"last_update_time" : 1614778756288
}
}
And this is an example of mail results:
Monitor (GlobalDash)Filesystem_Usage just entered alert status. Please investigate the issue.
- Trigger: (GlobalDash)Filesystem_Usage-Trigger
- Severity: 1
- Period start: 2021-03-04T09:51:52Z
- Period end: 2021-03-04T10:06:52Z
RESULTS:
Hostname-Filesystem with % occupation:
serverXXXXX1-/tmp 1
serverXXXXX2-/apps/JRE8 0.928600013256073
serverXXXXX3-/apps/JRE8 0.9279000163078308
serverXXXXX4-/apps/JRE8 0.9261000156402588
serverXXXXX5-/apps/JRE8 0.9261000156402588
serverXXXXX6-/apps 0.8676999807357788
serverXXXXX7-/apps/logs/XXXX-xxxx-xx 0.8634999990463257
serverXXXXX8-/apps/nginx/logs 0.8515999913215637
from alerting.
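One possible approach to both questions, sketched under stated assumptions rather than confirmed by anyone in this thread: a Mustache section can iterate over the buckets array (the first monitor above already uses the same pattern for hits), and a bucket_script pipeline aggregation can compute a whole-number percentage per bucket. The aggregation name pctFloor and the bucket path name maxPct are hypothetical names introduced for this example; everything else is taken from the monitor above, trimmed to the relevant fields.

```json
{
  "inputs": [
    {
      "search": {
        "indices": ["idxmon*"],
        "query": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                { "range": { "timestamp": { "from": "now-15m", "to": "now" } } }
              ]
            }
          },
          "aggregations": {
            "groupByHost": {
              "terms": {
                "field": "system.hostname.mount_point.keyword",
                "size": 10,
                "order": [ { "maxBYHOST": "desc" }, { "_key": "asc" } ]
              },
              "aggregations": {
                "maxBYHOST": {
                  "max": { "field": "system.filesystem.used.pct" }
                },
                "pctFloor": {
                  "bucket_script": {
                    "buckets_path": { "maxPct": "maxBYHOST" },
                    "script": "Math.floor(params.maxPct * 100)"
                  }
                },
                "sales_bucket_filter": {
                  "bucket_selector": {
                    "buckets_path": { "filesystem_over_threshold": "maxBYHOST" },
                    "script": "params.filesystem_over_threshold > 0.85"
                  }
                }
              }
            }
          }
        }
      }
    }
  ],
  "triggers": [
    {
      "name": "(GlobalDash)Filesystem_Usage-Trigger",
      "severity": "1",
      "condition": {
        "script": {
          "source": "ctx.results[0].aggregations.groupByHost.buckets.length > 0",
          "lang": "painless"
        }
      },
      "actions": [
        {
          "name": "(GlobalDash)Filesystem_sending_mail",
          "destination_id": "ThM0gXcBkntdlDdiBfwj",
          "subject_template": {
            "source": "Qualification - global filesystem check - alerting",
            "lang": "mustache"
          },
          "message_template": {
            "source": "Hostname-Filesystem with % occupation:\n{{#ctx.results.0.aggregations.groupByHost.buckets}}{{&key}} {{pctFloor.value}}\n{{/ctx.results.0.aggregations.groupByHost.buckets}}",
            "lang": "mustache"
          }
        }
      ]
    }
  ]
}
```

Inside the section, key and pctFloor.value are resolved against the current bucket, so one template line replaces the eleven hard-coded bucket indexes and prints only as many rows as there are buckets. Math.floor is used because the requested output for 0.9261000156402588 was 92.0; Math.round would be the choice if rounding to the nearest percent is preferred.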