
Comments (9)

dbbaughe avatar dbbaughe commented on June 8, 2024 1

Hi @Pupkur,

A few questions/observations:

  • One thing I see right away is that your monitor is scheduled to run every 1 hour, but your range only looks at the past 5 minutes. So each hourly run examines only 5 of the previous 60 minutes of data. Is this intended? Are you positive you have logs in those 5 minutes that would trigger an alert?
  • It seems your initial post contained a range query for system.filesystem.used.pct and now you're using an aggregation?
  • Your trigger painless script looks only at the first result of your hits, with no sort specified, while your query seems to return all documents that have system.filesystem.used.pct. That means there's no guarantee that the document you're comparing has the highest pct in that 5 minute interval.
  • Your new query is using an aggregation so you probably want to use ctx.results[0].aggregations.when.value instead in your trigger painless script to do your comparison.
  • If you do end up using the aggregation you can add size: 0 to your query since you do not need the hits.
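
To make the points above concrete, here is a minimal sketch of how the search input and trigger condition might look once the aggregation is used. The index pattern, field names, trigger name, and the 0.8 threshold come from the monitor configuration posted later in this thread. The `now-1h` window (chosen to match the 1 hour schedule) and the null check on the aggregation value are assumptions, and the `must_not` mount point filter and the actions are left out for brevity. JSON has no comment syntax, so these assumptions are stated here rather than inline.

```json
{
  "inputs": [
    {
      "search": {
        "indices": ["metrics-dev-envs-*"],
        "query": {
          "size": 0,
          "query": {
            "bool": {
              "filter": [
                { "exists": { "field": "system.filesystem.used.pct" } },
                { "range": { "@timestamp": { "gte": "now-1h", "lte": "now" } } }
              ]
            }
          },
          "aggregations": {
            "when": { "max": { "field": "system.filesystem.used.pct" } }
          }
        }
      }
    }
  ],
  "triggers": [
    {
      "name": "Server low disk PPI DEV envs",
      "severity": "5",
      "condition": {
        "script": {
          "source": "ctx.results[0].aggregations.when.value != null && ctx.results[0].aggregations.when.value > 0.8",
          "lang": "painless"
        }
      },
      "actions": []
    }
  ]
}
```

With `size: 0` the response carries no hits, so a message template that iterates over `ctx.results.0.hits.hits` would render nothing; keep the hits if the mail body needs per-document details.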

from alerting.

dbbaughe avatar dbbaughe commented on June 8, 2024

Hi @Pupkur,

Can you provide your complete Monitor configuration so I can take a look at your query and trigger?

Thanks,
Drew

Pupkur avatar Pupkur commented on June 8, 2024

image
You can see the monitor query in the first post.

dbbaughe avatar dbbaughe commented on June 8, 2024

@Pupkur,

Sorry, I meant the whole JSON configuration from the API; interested in seeing what you wrote for your painless script in your trigger so I can attempt to reproduce your issue.

Pupkur avatar Pupkur commented on June 8, 2024

Sorry for the delay.
Where can I get this JSON?

dbbaughe avatar dbbaughe commented on June 8, 2024

@Pupkur,

There is a GET Monitor API you can use:
https://opendistro.github.io/for-elasticsearch-docs/docs/alerting/api/#get-monitor

Pupkur avatar Pupkur commented on June 8, 2024
{
  "_id" : "S5fyB2sB35Cz7fEWkyEF",
  "_version" : 11,
  "monitor" : {
    "type" : "monitor",
    "name" : "Server low disk DEV envs",
    "enabled" : true,
    "enabled_time" : 1559206466297,
    "schedule" : {
      "period" : {
        "interval" : 1,
        "unit" : "HOURS"
      }
    },
    "inputs" : [
      {
        "search" : {
          "indices" : [
            "metrics-dev-envs-*"
          ],
          "query" : {
            "query" : {
              "bool" : {
                "must" : [
                  {
                    "match_all" : {
                      "boost" : 1.0
                    }
                  },
                  {
                    "exists" : {
                      "field" : "system.filesystem.used.pct",
                      "boost" : 1.0
                    }
                  },
                  {
                    "range" : {
                      "@timestamp" : {
                        "from" : "now-5m",
                        "to" : "now",
                        "include_lower" : true,
                        "include_upper" : true,
                        "format" : "epoch_millis",
                        "boost" : 1.0
                      }
                    }
                  }
                ],
                "must_not" : [
                  {
                    "match_phrase" : {
                      "system.filesystem.mount_point" : {
                        "query" : "/run/docker/*",
                        "slop" : 0,
                        "zero_terms_query" : "NONE",
                        "boost" : 1.0
                      }
                    }
                  }
                ],
                "adjust_pure_negative" : false,
                "boost" : 1.0
              }
            },
            "aggregations" : {
              "when" : {
                "max" : {
                  "field" : "system.filesystem.used.pct"
                }
              }
            }
          }
        }
      }
    ],
    "triggers" : [
      {
        "id" : "oZg1CGsB35Cz7fEWvpp8",
        "name" : "Server low disk PPI DEV envs",
        "severity" : "5",
        "condition" : {
          "script" : {
            "source" : "ctx.results[0].hits.hits[0]._source.system.filesystem.used.pct > 0.8",
            "lang" : "painless"
          }
        },
        "actions" : [
          {
            "name" : "Mail to ME",
            "destination_id" : "3kAf6moB35Cz7fEWRD5v",
            "subject_template" : {
              "source" : "",
              "lang" : "mustache"
            },
            "message_template" : {
              "source" : """
{
"title": "{{#ctx.results.0.hits.hits}}{{ctx.results.0.hits.hits.}}<FONT size=3>Server {{_source.beat.hostname}}<br> <FONT size=3>Mount point {{_source.system.filesystem.mount_point}} <FONT size=3>: {{_source.system.filesystem.used.pct}}<br><br>{{/ctx.results.0.hits.hits}}",

"text": "Low disk space on DEV env server. Please see mail for more details."
}
""",
              "lang" : "mustache"
            }
          }
        ]
      }
    ],
    "last_update_time" : 1559219481868
  }
}

Pupkur avatar Pupkur commented on June 8, 2024

This ctx.results[0].aggregations.when.value works! Thx!

uids2 avatar uids2 commented on June 8, 2024

Hello,

I was reading this very interesting discussion, and I have a related problem. I would like to format the numerical disk usage percentage values as something more readable (currently 0.9261000156402588; I would like it to become 92.0). Also, is it possible to set up a loop that prints all the results in the mail? For now I have found only an inelegant method. Here is the JSON returned by the API call:

{
  "_id" : "txcC83cB3jrnAXbeQFjs",
  "_version" : 4,
  "_seq_no" : 93,
  "primary_term" : 26,
  "monitor" : {
    "type" : "monitor",
    "schema_version" : 3,
    "name" : "(GlobalDash)Filesystem_Usage",
    "user" : {
      "name" : "admin",
      "backend_roles" : [
        "admin"
      ],
      "roles" : [
        "all_access",
        "own_index"
      ],
      "custom_attribute_names" : [ ]
    },
    "enabled" : false,
    "enabled_time" : null,
    "schedule" : {
      "period" : {
        "interval" : 15,
        "unit" : "MINUTES"
      }
    },
    "inputs" : [
      {
        "search" : {
          "indices" : [
            "idxmon*"
          ],
          "query" : {
            "size" : 0,
            "query" : {
              "bool" : {
                "filter" : [
                  {
                    "range" : {
                      "timestamp" : {
                        "from" : "now-15m",
                        "to" : "now",
                        "include_lower" : true,
                        "include_upper" : true,
                        "boost" : 1.0
                      }
                    }
                  }
                ],
                "adjust_pure_negative" : true,
                "boost" : 1.0
              }
            },
            "aggregations" : {
              "groupByHost" : {
                "terms" : {
                  "field" : "system.hostname.mount_point.keyword",
                  "size" : 10,
                  "min_doc_count" : 1,
                  "shard_min_doc_count" : 0,
                  "show_term_doc_count_error" : false,
                  "order" : [
                    {
                      "maxBYHOST" : "desc"
                    },
                    {
                      "_key" : "asc"
                    }
                  ]
                },
                "aggregations" : {
                  "maxBYHOST" : {
                    "max" : {
                      "field" : "system.filesystem.used.pct"
                    }
                  },
                  "sales_bucket_filter" : {
                    "bucket_selector" : {
                      "buckets_path" : {
                        "filesystem_over_threshold" : "maxBYHOST"
                      },
                      "script" : {
                        "source" : "params.filesystem_over_threshold > 0.85",
                        "lang" : "painless"
                      },
                      "gap_policy" : "skip"
                    }
                  }
                }
              }
            }
          }
        }
      }
    ],
    "triggers" : [
      {
        "id" : "0RgT83cB3jrnAXbeL0jI",
        "name" : " (GlobalDash)Filesystem_Usage-Trigger",
        "severity" : "1",
        "condition" : {
          "script" : {
            "source" : "ctx.results[0].aggregations.groupByHost.buckets.length > 0",
            "lang" : "painless"
          }
        },
        "actions" : [
          {
            "id" : "0hgT83cB3jrnAXbeL0jI",
            "name" : " (GlobalDash)Filesystem_sending_mail",
            "destination_id" : "ThM0gXcBkntdlDdiBfwj",
            "message_template" : {
              "source" : """Monitor {{ctx.monitor.name}} just entered alert status. Please investigate the issue.

  • Trigger: {{ctx.trigger.name}}
  • Severity: {{ctx.trigger.severity}}
  • Period start: {{ctx.periodStart}}
  • Period end: {{ctx.periodEnd}}

RESULTS:

Hostname-Filesystem with % occupation:
{{&ctx.results.0.aggregations.groupByHost.buckets.0.key}} {{ctx.results.0.aggregations.groupByHost.buckets.0.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.1.key}} {{ctx.results.0.aggregations.groupByHost.buckets.1.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.2.key}} {{ctx.results.0.aggregations.groupByHost.buckets.2.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.3.key}} {{ctx.results.0.aggregations.groupByHost.buckets.3.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.4.key}} {{ctx.results.0.aggregations.groupByHost.buckets.4.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.5.key}} {{ctx.results.0.aggregations.groupByHost.buckets.5.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.6.key}} {{ctx.results.0.aggregations.groupByHost.buckets.6.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.7.key}} {{ctx.results.0.aggregations.groupByHost.buckets.7.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.8.key}} {{ctx.results.0.aggregations.groupByHost.buckets.8.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.9.key}} {{ctx.results.0.aggregations.groupByHost.buckets.9.maxBYHOST.value}}
{{&ctx.results.0.aggregations.groupByHost.buckets.10.key}} {{ctx.results.0.aggregations.groupByHost.buckets.10.maxBYHOST.value}}""",
              "lang" : "mustache"
            },
            "throttle_enabled" : false,
            "subject_template" : {
              "source" : "Qualification - global filesystem check - alerting",
              "lang" : "mustache"
            }
          }
        ]
      }
    ],
    "last_update_time" : 1614778756288
  }
}

And this is an example of mail results:

Monitor (GlobalDash)Filesystem_Usage just entered alert status. Please investigate the issue.

  • Trigger: (GlobalDash)Filesystem_Usage-Trigger
  • Severity: 1
  • Period start: 2021-03-04T09:51:52Z
  • Period end: 2021-03-04T10:06:52Z

RESULTS:

Hostname-Filesystem with % occupation:
serverXXXXX1-/tmp 1
serverXXXXX2-/apps/JRE8 0.928600013256073
serverXXXXX3-/apps/JRE8 0.9279000163078308
serverXXXXX4-/apps/JRE8 0.9261000156402588
serverXXXXX5-/apps/JRE8 0.9261000156402588
serverXXXXX6-/apps 0.8676999807357788
serverXXXXX7-/apps/logs/XXXX-xxxx-xx 0.8634999990463257
serverXXXXX8-/apps/nginx/logs 0.8515999913215637
