Comments (15)

dotdc commented on May 18, 2024

Really interesting point!

On my setups, I try to have most of my pods in the 50-80% range, I then consider them to be correctly sized.
In my experience, you can start having reliability issues and weird behaviors above 80% resource usage.
I also consider pods running under 50% usage to be over-sized.

I decided to go for a "standard" color scheme for these because I think it's what makes sense for most users.
We need to keep in mind that usage can also go above 100% of the request if the limit is higher, so you could have something like red > yellow > green > red, and I think that would be really confusing for users. We could also argue about the thresholds themselves, since they depend on everyone's use cases and policies.
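
As a rough illustration of the point about exceeding 100% (a sketch only, with a hypothetical helper name and made-up millicore values, not code from the dashboard): when the limit is higher than the request, the same usage can sit above 100% of the request while still being below 100% of the limit.

```python
def usage_percentages(usage: float, request: float, limit: float) -> tuple[float, float]:
    """Return usage as a percentage of the request and of the limit."""
    if request <= 0 or limit <= 0:
        raise ValueError("request and limit must be positive")
    return usage / request * 100, usage / limit * 100


# Hypothetical container: 100m CPU request, 200m CPU limit, 150m actual usage.
pct_of_request, pct_of_limit = usage_percentages(150, 100, 200)
print(f"{pct_of_request:.0f}% of request, {pct_of_limit:.0f}% of limit")
# prints "150% of request, 75% of limit"
```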

Other ideas would be to use a single color, or another color scheme (not green, yellow and red), but I think it's just a little bit weird...
Users like you that know what's best for their use-cases will just ignore the color anyway, so it's not a big deal in my opinion.

Keeping it this way is maybe safer for most users, what do you think?

If anyone wants to comment with thoughts or ideas, I think it's a good topic! 😊

from grafana-dashboards-kubernetes.

dotdc commented on May 18, 2024

For the second part:

  • Yes, it's a good idea to add the real usage in the table; I will make a PR this week to add this.
  • On Kubernetes, resources are set per container, not per pod, so I think it can only be "by container".

If you have a pod with more than one container, you should have one plot line per container like this:

image

  • For the last point, I know it could be confusing depending on the pods/containers configuration, but I didn't find a way to make it more readable than this.

Good to know:

  • I mostly use this dashboard to size my pods based on average or peak usage
  • The table can really help you understand what's wrong with your setup (see screenshot above)
  • Gauges can be hard to read if requests and limits are not set the same way on all containers
  • The requests gauges can disappear if no requests are set

A nice (but old) thread by @thockin on limits: https://www.reddit.com/r/kubernetes/comments/all1vg/comment/efgyygu/

reefland commented on May 18, 2024

On my setups, I try to have most of my pods in the 50-80% range, I then consider them to be correctly sized. In my experience, you can start having reliability issues and weird behaviors above 80% resource usage. I also consider pods running under 50% usage to be over-sized.

It's not clear whether you target your pods to be 50-80% of the LIMIT or of the REQUEST. I try to stay within 20% of the REQUEST as ideal. If it's constantly over the request (20%+), then I would bump that up when tuning, as clearly the request I asked for was too low. For the LIMIT, I want usage within 50%-70% as a starting point, to avoid OOM kills and leave wiggle room.

I decided to go for a "standard" color scheme for these because I think it's what makes sense for most users. We need to keep in mind that usage can also go above 100% of the request if the limit is higher, so you could have something like red > yellow > green > red, and I think that would be really confusing for users. We could also argue about the thresholds themselves, since they depend on everyone's use cases and policies.

I don't think that is confusing. The request number should be the center point of GREEN; left and right of center is an arbitrary number we pick that feels right... +/- 25% from center ??. This defines the green area. Then 20% on either side of that would be yellow, and the last 5% on either side is red. If you are significantly under or over the request, that is a problem.
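
A minimal sketch of this banding, under one possible reading of the numbers: the gauge is assumed to span 0 to 2 times the request, so the request sits at the midpoint, and the percentages are taken as fractions of the full gauge span (25% + 20% + 5% on each side covers the whole gauge). The function name `request_color` and its parameters are hypothetical and are not part of the dashboard.

```python
def request_color(usage: float, request: float,
                  green: float = 0.25, yellow: float = 0.20) -> str:
    """Classify usage against a request on a gauge spanning 0 to 2x the request.

    `green` and `yellow` are band widths on each side of the center,
    expressed as fractions of the full gauge span (assumed values).
    """
    if request <= 0:
        raise ValueError("request must be positive")
    # Position on the gauge as a fraction of its span: 0.5 means usage == request.
    position = usage / (2 * request)
    distance = abs(position - 0.5)
    if distance <= green:
        return "green"
    if distance <= green + yellow:
        return "yellow"
    # Everything further out, including usage beyond 2x the request, is red.
    return "red"


for usage in (5, 30, 100, 140, 160, 195):
    print(usage, request_color(usage, request=100))
```

With a request of 100, usage of 100 or 140 lands in green, 30 or 160 in yellow, and 5 or 195 in red, so a value sitting right on the request is never shown as red.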

I think it's more confusing now, as new users will see a very good request value as RED, be confused, and alter the values to get it GREEN, which really is not what they should be doing.

Other ideas would be to use a single color, or another color scheme (not green, yellow and red), but I think it's just a little bit weird... Users like you that know what's best for their use-cases will just ignore the color anyway, so it's not a big deal in my opinion.

I've been trying to use Goldilocks to get an idea for requests and limits, and its values are all over the map. Pretty much every time you hit refresh you get a different recommendation. I found your dashboard to be WAY easier to tune with. It's just that the request colors are off; you need to know that and not base your tuning on the colors. But if we can correct the colors, I think it would be an excellent tool for this.

Keeping it this way is maybe safer for most users, what do you think?

I think no color is safer than the current color pattern. The way it is now, I think, encourages the wrong action to make it green. But I don't want no color :(

This is how I think it should look:
image

You're a bit over, still ok, should not be red:
image

Significantly under should indicate you can improve:
image

reefland commented on May 18, 2024

For the above, these are the changes I made to the graph:

  • Standard Options
    • Min: auto (but zero looks good too, not sure of the difference)
    • Max: 2
    • Decimals: 1

And thresholds:
image


I'd also like to see a timeline graph of CPU and RAM usage, each plotted with its respective request / limit lines. This would allow an overall view over time (Last 1 hour, 6 hours, 2 days, etc).

dotdc commented on May 18, 2024

Thank you for this @reefland, you just shared many good points and ideas!
I'm still unsure about requests, to be honest, because it highly depends on how you manage your Kubernetes resources (requests = limits, requests < limits...). So I would still keep them neutral for now, but we can still iterate on this.

I just created a new version (didn't commit yet):

  • Switched to blue color for requests (pod total) and left limits with green, yellow & red
  • Added "Used" CPU & Memory in the table
  • Added 2 new panels with % usage on requests & limits with thresholds as colored areas

The rest of the dashboard is left unchanged.

Used 20%, 30%, 70% & 80% as thresholds, as I think it's pretty conservative.
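
One way to read these four thresholds, sketched under the assumption that the colors are symmetric: red below 20%, yellow from 20% to 30%, green from 30% to 70%, yellow from 70% to 80%, and red from 80% up. The color assignment, the names, and the use of `bisect` are illustrative and are not taken from the dashboard JSON.

```python
from bisect import bisect_right

# Threshold values from the comment; the color order is an assumption.
THRESHOLDS = [20, 30, 70, 80]
COLORS = ["red", "yellow", "green", "yellow", "red"]


def usage_band(percent: float) -> str:
    """Map a usage percentage to the band it falls in.

    A value equal to a threshold belongs to the band that starts there.
    """
    return COLORS[bisect_right(THRESHOLDS, percent)]


for pct in (10, 25, 50, 75, 95):
    print(f"{pct}% -> {usage_band(pct)}")
```

Keeping the thresholds and colors in two parallel lists makes it easy to try other cut-off values without touching the lookup itself.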

What do you think?

Screenshots:
image
image
image
image

reefland commented on May 18, 2024

Yeah! These look neat! Look forward to trying them.

dotdc commented on May 18, 2024

Just pushed the new version, try it and let me know what you think.
Maybe we can do a pros/cons list for the requests colors?

reefland commented on May 18, 2024

ok, I'll check it out this weekend!

Is there any way to detect when request = limit and make it blue in that case, and otherwise use a color scale like the one I suggested?

reefland commented on May 18, 2024

I need to figure out this missing image= key. As-is, I get nothing. I'll have to re-work each gauge to remove that reference.

image

reefland commented on May 18, 2024

Sigh, another issue: besides not having image=, I also do not have container=

The container_cpu_usage_seconds_total{namespace="mosquitto", pod="mosquitto-mqtt-0"} yields:

container_cpu_usage_seconds_total{cpu="total", endpoint="https-metrics", id="/kubepods/burstable/podcc153a2a-d87e-4b18-b37b-159fa6907cd4", instance="k3s02", job="kubelet", metrics_path="/metrics/cadvisor", namespace="mosquitto", node="k3s02", pod="mosquitto-mqtt-0", service="prometheus-kubelet"}

Which returns an empty set using by (container):

sum(rate(container_cpu_usage_seconds_total{namespace="mosquitto", pod="mosquitto-mqtt-0"}[1m])) by (container)

dotdc commented on May 18, 2024

OK, I think it's time to run a copy of your k3s setup to solve both of these.
I'll do my best to do it this week or during the weekend.
Will keep you updated, hopefully with a fix.

dotdc commented on May 18, 2024

We'll keep this issue on topic.
The investigation into the missing labels will continue in #18.

dotdc commented on May 18, 2024

Did you manage to test the latest version?
I think it now includes most of what we discussed in this issue.
Let me know.

reefland commented on May 18, 2024

Nah... without the container level metrics I can't really test it properly.

dotdc commented on May 18, 2024

Hope you will find a solution to get this working on your setup 🤞
Thanks again for your time and ideas on this!
Closing this issue.
