Comments (3)
Currently, I collect the containers that use ext resources by traversing the pods on the Kubernetes node, and then get their CPU usage from cAdvisor. Should I read it directly from the kernel (/proc/bt_stat) instead, or take it straight from cAdvisor (filtered by label or cgroup.priority?).
from crane.
sum(count(node_cpu_seconds_total{mode="idle",instance=~"%s.*"}) by (mode, cpu)) - sum(irate(node_cpu_seconds_total{mode="idle",instance=~"%s.*"}[%s]))
is already the node's current CPU usage. It comes from the node exporter's CPU metrics, which are global to the node, so there is no need to also subtract
sum(irate(node_ext_cpu_usage_seconds_total{node="%s"}[%s]))
which is the usage consumed by the ext pods. The former query already covers all CPU usage on the node, including node_ext_cpu_usage_seconds_total.
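To make the two expressions easier to compare, here is a minimal sketch that fills in their %s placeholders. It assumes the placeholders are, in order, the instance name (twice) and the range window for the first query, and the node name and the range window for the second. The values "10.0.0.1", "node-1" and "1m", and the helper name render_queries, are hypothetical and not taken from crane.

```python
# Query templates copied from the comment above; the %s slots are placeholders.
NODE_CPU_USAGE_TMPL = (
    'sum(count(node_cpu_seconds_total{mode="idle",instance=~"%s.*"}) by (mode, cpu))'
    ' - sum(irate(node_cpu_seconds_total{mode="idle",instance=~"%s.*"}[%s]))'
)
EXT_CPU_USAGE_TMPL = 'sum(irate(node_ext_cpu_usage_seconds_total{node="%s"}[%s]))'


def render_queries(instance, node, window):
    """Hypothetical helper: fill the placeholders of both templates."""
    node_usage = NODE_CPU_USAGE_TMPL % (instance, instance, window)
    ext_usage = EXT_CPU_USAGE_TMPL % (node, window)
    return node_usage, ext_usage


# Example values are assumptions, used only for illustration.
node_q, ext_q = render_queries("10.0.0.1", "node-1", "1m")
print(node_q)  # CPU count minus idle rate: busy cores on the whole node
print(ext_q)   # cores consumed by ext (offline) pods only
```

The first rendered query measures busy cores for the whole node, while the second covers only the ext pods, which is the overlap being discussed here.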
I do not understand why you need to subtract node_ext_cpu_usage_seconds_total. If you mean that the ext pods consume the ext resource the predictor forecast as available as soon as they are scheduled to the node, then crane-agent can subtract it when it patches the node's ext resource, after it has watched the pods start.
If what you want is the node's total CPU minus the CPU usage of all online pods, and then minus all offline CPU usage, to get the CPU that is still available for offline workloads, note that this only counts the CPU usage of Kubernetes-managed pods. It does not include processes running on the node that are not managed by Kubernetes.
So the node's available CPU resource and the node's Kubernetes ext resource are two different things. The former makes sure we have the node's total usage; the latter makes sure what we have is the Kubernetes-managed resource. I think both are needed for node QoS.
Maybe you could give an example and a more detailed description.
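The difference between the node-level view and the Kubernetes-managed view can be sketched as a simple gap calculation. This is only an illustration: the function name, the pod names and all numbers below are hypothetical, and it assumes that node usage and per-pod usage are both expressed in cores over the same window.

```python
def unmanaged_cpu_cores(node_usage_cores, pod_usage_cores):
    """CPU used on the node that no Kubernetes-managed pod accounts for."""
    return node_usage_cores - sum(pod_usage_cores)


# Hypothetical numbers: node-level usage (e.g. from node exporter) versus
# the per-pod usage that Kubernetes knows about (e.g. from cAdvisor).
node_usage = 20.0
pods = {"online-a": 10.0, "online-b": 4.0, "offline-c": 3.0}

gap = unmanaged_cpu_cores(node_usage, pods.values())
print(gap)  # cores consumed by processes outside Kubernetes
```

A calculation based only on pod usage would miss this gap, while one based only on node usage cannot tell online from offline consumption.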
from crane.
Maybe I didn't make it clear, so here is an example.
At first, the node has 64 CPU cores and the online service uses 32 cores, so the ext-CPU is calculated to be 32000.
Offline service A is then scheduled to this node, requesting 16000 ext-CPU; assume it actually uses 15 cores.
At this point, if the 15 cores used by the ext-resource service (offline service A) are not excluded from the measured usage, the recalculated ext-CPU is 17000 ((64-32-15)*1000), and the remaining allocatable ext-CPU is only 1000 (17000-16000). But the actual idle CPU is 17 cores, so the remaining allocatable ext-CPU should be 16000 (32000-16000), so that more ext-resource services can be scheduled to the node.
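The arithmetic in this example can be checked with a short sketch. The function and variable names are hypothetical and this is not crane's actual implementation; it only reproduces the two calculations described above, assuming 1000 ext-CPU units per core.

```python
MILLI_PER_CORE = 1000


def ext_cpu_capacity(total_cores, node_usage_cores, ext_usage_cores, exclude_ext_usage):
    """Ext-CPU (in milli-cores) left after removing the counted usage from the total.

    When exclude_ext_usage is True, the cores consumed by ext (offline) pods
    are taken back out of the node usage before computing the capacity.
    """
    used = node_usage_cores - ext_usage_cores if exclude_ext_usage else node_usage_cores
    return (total_cores - used) * MILLI_PER_CORE


total = 64                          # cores on the node
online = 32                         # cores used by the online service
offline_used = 15                   # cores actually used by offline service A
requested = 16000                   # ext-CPU requested by offline service A
node_usage = online + offline_used  # 47 cores busy on the node

naive = ext_cpu_capacity(total, node_usage, offline_used, exclude_ext_usage=False)
corrected = ext_cpu_capacity(total, node_usage, offline_used, exclude_ext_usage=True)

print(naive, naive - requested)          # 17000 1000
print(corrected, corrected - requested)  # 32000 16000
```

Without the subtraction, only 1000 ext-CPU appears schedulable even though 17 cores are idle; with it, 16000 remains allocatable.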
from crane.