Add support for node affinity strategy (descheduler issue, closed)

kubernetes-sigs commented on July 19, 2024

Add support for node affinity strategy

from descheduler.

Comments (14)

concaf commented on July 19, 2024

brain dump:
So the documentation says -

	// The scheduler will prefer to schedule pods to nodes that satisfy
	// the affinity expressions specified by this field, but it may choose
	// a node that violates one or more of the expressions. The node that is
	// most preferred is the one with the greatest sum of weights, i.e.
	// for each node that meets all of the scheduling requirements (resource
	// request, requiredDuringScheduling affinity expressions, etc.),
	// compute a sum by iterating through the elements of this field and adding
	// "weight" to the sum if the node matches the corresponding matchExpressions; the
	// node(s) with the highest sum are the most preferred.
	// +optional
	PreferredDuringSchedulingIgnoredDuringExecution []PreferredSchedulingTerm `json:"preferredDuringSchedulingIgnoredDuringExecution,omitempty" protobuf:"bytes,2,rep,name=preferredDuringSchedulingIgnoredDuringExecution"`

IIUC, the pod is scheduled on the node with the highest calculated weight. For every preferred term that a node matches, the weight specified in the nodeAffinity is added to that node's pre-calculated weight, and the pod is then scheduled on the node with the highest total.
So, in order to decide whether to evict a pod, we need to calculate the weights of the nodes where the pod can fit (based on its nodeAffinity rules), and evict the pod if some node gets a higher weight than the node it is currently running on.

WDYT about the approach @ravisantoshgudimetla @aveshagarwal?
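
A minimal sketch of the comparison proposed above, not the descheduler's actual code. The types `preferredTerm` and `node` and the functions `preferredWeight` and `shouldEvict` are hypothetical stand-ins; a real implementation would work with the Kubernetes API types and full matchExpressions. It assumes label matching is exact-match only and that `candidates` has already been filtered down to nodes where the pod can fit.

```go
package main

import "fmt"

// preferredTerm is a hypothetical, simplified stand-in for a
// PreferredSchedulingTerm: a weight plus labels that must all match.
type preferredTerm struct {
	Weight int32
	Labels map[string]string
}

// node is a hypothetical stand-in for a cluster node.
type node struct {
	Name   string
	Labels map[string]string
}

// matches reports whether the node carries every label the term asks for.
func matches(t preferredTerm, n node) bool {
	for k, want := range t.Labels {
		got, ok := n.Labels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// preferredWeight sums the weights of all preferred terms the node matches.
func preferredWeight(terms []preferredTerm, n node) int32 {
	var sum int32
	for _, t := range terms {
		if matches(t, n) {
			sum += t.Weight
		}
	}
	return sum
}

// shouldEvict reports whether some other candidate node has a strictly
// higher preferred weight than the node the pod currently runs on.
func shouldEvict(terms []preferredTerm, current node, candidates []node) bool {
	cur := preferredWeight(terms, current)
	for _, c := range candidates {
		if c.Name != current.Name && preferredWeight(terms, c) > cur {
			return true
		}
	}
	return false
}

func main() {
	terms := []preferredTerm{
		{Weight: 10, Labels: map[string]string{"disk": "ssd"}},
		{Weight: 5, Labels: map[string]string{"zone": "a"}},
	}
	current := node{Name: "n1", Labels: map[string]string{"zone": "a"}}
	better := node{Name: "n2", Labels: map[string]string{"disk": "ssd", "zone": "a"}}

	fmt.Println("current weight:", preferredWeight(terms, current))
	fmt.Println("candidate weight:", preferredWeight(terms, better))
	fmt.Println("evict:", shouldEvict(terms, current, []node{current, better}))
}
```

Using a strict comparison means a pod is left alone when another node merely ties with its current node, which avoids evicting pods for no gain.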

concaf commented on July 19, 2024

I believe @ravisantoshgudimetla and @aveshagarwal are the right folks for that ;)

aveshagarwal commented on July 19, 2024

@containscafeine I'd say first address requiredDuringSchedulingIgnoredDuringExecution (hard affinity) before preferredDuringSchedulingIgnoredDuringExecution (soft affinity).

> For pods with node affinity set using preferredDuringSchedulingIgnoredDuringExecution, it might be possible that the preferred node was unavailable during scheduling and the pod was scheduled on another node. In this case, if the descheduler is run, it does the following -

It's not just about initial scheduling, since during initial scheduling the scheduler makes its best effort to fulfill a pod's requirements. In fact, it's more about what happens over time: changes in a cluster may lead to requirements being violated, because these requirements are not checked at run time.

concaf commented on July 19, 2024

@aveshagarwal a bit confused on how to handle requiredDuringSchedulingIgnoredDuringExecution, since the key says anyway that we need to ignore during execution.
Let's say the labels on the current node have changed during runtime and the affinity rules are not met anymore, then should we evict that pod, since it explicitly says to ignore during execution. What am I missing? :(

aveshagarwal commented on July 19, 2024

> @aveshagarwal a bit confused on how to handle requiredDuringSchedulingIgnoredDuringExecution, since the key says anyway that we need to ignore during execution.
> Let's say the labels on the current node have changed during runtime and the affinity rules are not met anymore, then should we evict that pod,

Yes.

> since it explicitly says to ignore during execution. What am I missing? :(

"Ignored" means that the scheduler does not take care of it. That is something we can take care of in the descheduler.
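
A rough sketch of that hard-affinity check, under stated assumptions rather than as the descheduler's real implementation. The names `selectorTerm`, `clusterNode`, `satisfiesRequired`, and `shouldEvictForHardAffinity` are hypothetical. Each term is simplified to exact-match labels; as in Kubernetes node affinity, the labels inside one term must all match while the terms themselves are ORed. The sketch also assumes a pod should only be evicted when some other node still satisfies its requirements, so that it has somewhere to go.

```go
package main

import "fmt"

// selectorTerm is a hypothetical, simplified stand-in for one required
// node selector term: every label in the map must match.
type selectorTerm map[string]string

// clusterNode is a hypothetical stand-in for a cluster node.
type clusterNode struct {
	Name   string
	Labels map[string]string
}

// termMatches reports whether the node carries every label in the term.
func termMatches(t selectorTerm, n clusterNode) bool {
	for k, want := range t {
		got, ok := n.Labels[k]
		if !ok || got != want {
			return false
		}
	}
	return true
}

// satisfiesRequired reports whether the node matches at least one term.
// A pod with no required terms has no hard affinity to violate.
func satisfiesRequired(terms []selectorTerm, n clusterNode) bool {
	if len(terms) == 0 {
		return true
	}
	for _, t := range terms {
		if termMatches(t, n) {
			return true
		}
	}
	return false
}

// shouldEvictForHardAffinity reports whether the current node no longer
// satisfies the required terms while some other node does.
func shouldEvictForHardAffinity(terms []selectorTerm, current clusterNode, others []clusterNode) bool {
	if satisfiesRequired(terms, current) {
		return false
	}
	for _, n := range others {
		if n.Name != current.Name && satisfiesRequired(terms, n) {
			return true
		}
	}
	return false
}

func main() {
	terms := []selectorTerm{{"disk": "ssd"}}

	// The "disk" label on n1 was changed after the pod was scheduled there.
	current := clusterNode{Name: "n1", Labels: map[string]string{"disk": "hdd"}}
	other := clusterNode{Name: "n2", Labels: map[string]string{"disk": "ssd"}}

	fmt.Println("evict:", shouldEvictForHardAffinity(terms, current, []clusterNode{current, other}))
}
```

If no other node satisfies the terms, the sketch leaves the pod in place, since evicting it would only leave it unschedulable.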

concaf commented on July 19, 2024

Support for requiredDuringSchedulingIgnoredDuringExecution was merged in #56.
I will now implement preferredDuringSchedulingIgnoredDuringExecution.

concaf commented on July 19, 2024

ping @ravisantoshgudimetla @aveshagarwal

nitishkumar71 commented on July 19, 2024

Hey @containscafeine,
are you still working on this feature? It would be really helpful.

concaf commented on July 19, 2024

@nitishkumar71 unfortunately I don't have bandwidth to work on this right now, sorry about that.

nitishkumar71 commented on July 19, 2024

@containscafeine No worries. Can you point me in the right direction so I can understand the codebase? I can give it a try.

fejta-bot commented on July 19, 2024

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

fejta-bot commented on July 19, 2024

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

fejta-bot commented on July 19, 2024

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

k8s-ci-robot commented on July 19, 2024

@fejta-bot: Closing this issue.

