openkruise / kruise-game
Game Servers Management on Kubernetes
Home Page: https://openkruise.io/kruisegame/introduction
License: Apache License 2.0
The current probe state of the service-quality feature can only be marked true/false, so revealing multiple states often requires multiple detection scripts. The problem with multiple quality-of-service detection scripts is that, due to subtle differences in their probe execution cycles, there is still a certain probability that multiple detection results return true at the same time, even if conflicting logic is avoided at the script level, resulting in conflicting status settings.
It is now proposed that a single detection can return multiple results, with the user configuring a different action for each result.
API
// Not changed
type ServiceQuality struct {
	corev1.Probe  `json:",inline"`
	Name          string `json:"name"`
	ContainerName string `json:"containerName,omitempty"`
	// Permanent indicates whether the GameServerSpec stays unchanged after the ServiceQualityAction is executed.
	// When Permanent is true, the ServiceQualityAction is executed only once, regardless of later detection results.
	// When Permanent is false, the ServiceQualityAction can be executed again even after it has already been executed.
	Permanent            bool                   `json:"permanent"`
	ServiceQualityAction []ServiceQualityAction `json:"serviceQualityAction,omitempty"`
}

type ServiceQualityAction struct {
	State bool `json:"state"`
	// Result indicates the probe message returned by the script.
	// When Result is defined, the action is executed only when the script actually returns the corresponding Result.
	Result         string `json:"result,omitempty"`
	GameServerSpec `json:",inline"`
}
type ServiceQualityCondition struct {
	Name   string `json:"name"`
	Status string `json:"status,omitempty"`
	// Result indicates the probe message returned by the script.
	Result                   string      `json:"result,omitempty"`
	LastProbeTime            metav1.Time `json:"lastProbeTime,omitempty"`
	LastTransitionTime       metav1.Time `json:"lastTransitionTime,omitempty"`
	LastActionTransitionTime metav1.Time `json:"lastActionTransitionTime,omitempty"`
}
Add a Result field to ServiceQualityAction. When it is specified, the Spec of the GameServer is changed only when the script returns the corresponding value. In this way, a single detection script can report multiple statuses.
It is recommended to add an update strategy that allows the image to be updated only when pods are in certain states.
When the container status is controlled through scripts, the following problems arise:
1. If the pod's state is Allocated, the image is updated anyway, which affects players who are currently playing.
2. To work around problem 1, you need to rebuild gss.yaml, and if there is a scaling mechanism you also need to rebuild scaled.yaml, then delete the previous yaml, which is cumbersome. Of course there are other ways.
We encountered an issue while configuring HostPort network mode using OKG's GameServerSet.
First, the GameServerSet was deployed in the cluster, its pod had started, and the pod's status was Running.
Then I deleted the GameServerSet with "kubectl delete"; the pod's status became Terminating.
Before the pod finished exiting, I re-applied the GameServerSet, but the newly created pod failed to obtain a HostPort.
Later, I deleted the GameServerSet again, waited for the pod to exit completely, and then re-applied it; this time the pod obtained its HostPort information correctly.
On investigation, I found that the kruise-game controller panicked during this operation, causing a restart; I suspect the operation as a whole triggered the issue.
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: trunk
spec:
  replicas: 1
  updateStrategy:
    rollingUpdate:
      podUpdatePolicy: InPlaceIfPossible
  network:
    networkType: Kubernetes-HostPort
    networkConf:
      - name: ContainerPorts
        value: "container1:5000/TCP"
  gameServerTemplate:
    spec:
      containers:
        - image: container1-image
          imagePullPolicy: IfNotPresent
          name: container1
          env:
            - name: KRUISE_CONTAINER_PRIORITY
              value: "2"
          volumeMounts:
            - name: network
              mountPath: /opt/network
        - image: container2-image
          imagePullPolicy: IfNotPresent
          name: container2
          env:
            - name: KRUISE_CONTAINER_PRIORITY
              value: "1"
      volumes:
        - name: network
          downwardAPI:
            items:
              - path: "annotations"
                fieldRef:
                  fieldPath: metadata.annotations['game.kruise.io/network-status']
    volumeClaimTemplates:
      - metadata:
          name: db-storage
        spec:
          accessModes: ["ReadWriteOnce"]
          storageClassName: "cfs"
          resources:
            requests:
              storage: 10Gi
1.686739246445438e+09 DEBUG controller-runtime.webhook.webhooks received request {"webhook": "/validate-v1alpha1-gss", "UID": "e5c1cf00-f3d3-46ae-b86a-9b5b8ad537a6", "kind": "game.kruise.io/v1alpha1, Kind=GameServerSet", "resource": {"group":"game.kruise.io","version":"v1alpha1","resource":"gameserversets"}}
1.686739246445752e+09 DEBUG controller-runtime.webhook.webhooks wrote response {"webhook": "/validate-v1alpha1-gss", "code": 200, "reason": "pass validating", "UID": "e5c1cf00-f3d3-46ae-b86a-9b5b8ad537a6", "allowed": true}
1.6867392465119815e+09 DEBUG events Normal {"object": {"kind":"GameServerSet","namespace":"default","name":"a4","uid":"587e3fb3-cc3a-4b12-aa3b-6254ff0d7875","apiVersion":"game.kruise.io/v1alpha1","resourceVersion":"33261501457"}, "reason": "CreateWorkload", "message": "created Advanced StatefulSet"}
1.6867392465414512e+09 DEBUG controller-runtime.webhook.webhooks received request {"webhook": "/mutate-v1-pod", "UID": "a8cf9c48-0ab7-4366-978a-885777aee37f", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
1.6867392465423882e+09 DEBUG controller-runtime.webhook.webhooks wrote response {"webhook": "/mutate-v1-pod", "code": 200, "reason": "", "UID": "a8cf9c48-0ab7-4366-978a-885777aee37f", "allowed": true}
... (12 more near-identical mutate-v1-pod request/response pairs, all "allowed": true, omitted) ...
1.6867392744261086e+09 DEBUG controller-runtime.webhook.webhooks received request {"webhook": "/mutate-v1-pod", "UID": "dedc8191-1383-4fd4-a151-ada8b4a73caf", "kind": "/v1, Kind=Pod", "resource": {"group":"","version":"v1","resource":"pods"}}
panic: runtime error: index out of range [-1]
goroutine 1790 [running]:
github.com/openkruise/kruise-game/cloudprovider/kubernetes.(*HostPortPlugin).deAllocate(0xc0003cf8b0, {0xc0008025d0, 0x1, 0x1a1172e?}, {0xc00099acd0, 0xc})
/workspace/cloudprovider/kubernetes/hostPort.go:251 +0x169
github.com/openkruise/kruise-game/cloudprovider/kubernetes.(*HostPortPlugin).OnPodDeleted(0xc0003cf8b0, {0x7000000000000?, 0xc00077a060?}, 0xc000700800, {0x0?, 0x0?})
/workspace/cloudprovider/kubernetes/hostPort.go:175 +0x14a
github.com/openkruise/kruise-game/pkg/webhook.(*PodMutatingHandler).Handle.func1()
/workspace/pkg/webhook/mutating_pod.go:81 +0xdf
created by github.com/openkruise/kruise-game/pkg/webhook.(*PodMutatingHandler).Handle
/workspace/pkg/webhook/mutating_pod.go:72 +0x37d
Currently, GameServers are recycled by the GameServerSet controller: when the GameServerSet finds that a managed pod no longer exists in the cluster, the corresponding GameServer is deleted. One problem with this recycling method is that the GameServer's lifecycle is not deterministic. For example, when a pod is deleted and rebuilt: if the pod is rebuilt quickly, the GameServer is not recycled and its attributes are retained; but if the rebuild is slow, the GameServer is deleted too, and once the pod is created a new GameServer with the same name is created, losing the status previously recorded in the GameServer.
I think we need a more deterministic approach to lifecycle management, leaving the choice to the user. The user decides whether the owner of a GameServer is the GameServerSet or the Pod. If the owner is the GameServerSet, the GameServer is not deleted when the pod is deleted; it is deleted only when the GameServerSet is deleted. If the owner is the Pod, the GameServer is deleted when the pod is deleted.
Kruise version: 1.5.0
Kruise game version: 0.6.1
Game server set sample from https://openkruise.io/zh/kruisegame/user-manuals/service-qualities :
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: minecraft
  namespace: default
spec:
  replicas: 3
  gameServerTemplate:
    spec:
      containers:
        - image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:idle
          name: minecraft
  updateStrategy:
    rollingUpdate:
      podUpdatePolicy: InPlaceIfPossible
      maxUnavailable: 100%
  serviceQualities: # an "idle" service quality is configured
    - name: idle
      containerName: minecraft
      permanent: false
      # Similar to a native probe; this example runs a script to detect
      # whether the game server is idle, i.e. has no players.
      exec:
        command: ["bash", "./idle.sh"]
      serviceQualityAction:
        # No players: mark the game server's opsState as WaitToBeDeleted
        - state: true
          opsState: WaitToBeDeleted
        # Players present: mark the game server's opsState as None
        - state: false
          opsState: None
kubectl -n kruise-system logs -f kruise-daemon-pk252
This problem also occurs in Kruise version 1.5.1
First, please note the following series of events:
A Pod was created in the cluster using OKG's GSS, with a HostPort requested, and it ran normally.
After updating the image and the GSS configuration to add a ReadinessProbe, an error prevented the new Pod from launching (a gRPC ReadinessProbe was applied, but the cluster cannot yet accommodate this feature).
I then removed the problematic ReadinessProbe configuration, which triggered a rebuild, and the new Pod was created successfully.
Curiously, the new Pod did not request a HostPort after it was created.
Inspecting the controller logs, it appears the original Pod never triggered a deallocation on deletion, so the controller perpetually regards the Pod's HostPort as still requested. The related allocation paths merit further investigation to prevent similar errors in future scenarios.
How can such cases be avoided? Is there any mechanism to verify that the HostPort is deallocated when the Pod is deleted? Any advice on resolving this issue would be greatly appreciated.
kruise-game/docs/中文/快速开始/游戏服水平伸缩.md
Line 211 in b6fdc23
The gss configuration is as follows:
...
spec:
  replicas: 5
  reserveGameServerIds:
    - 0
    - 2
  scaleStrategy:
    scaleDownStrategyType: ReserveIds
...
The gs information is as follows:
NAME STATE OPSSTATE DP UP AGE
nginx-okg-1 Ready None 0 0 23h
nginx-okg-3 Ready None 0 0 6h57m
nginx-okg-4 Ready None 0 0 7m26s
nginx-okg-5 Ready None 0 0 7m26s
nginx-okg-6 Ready None 0 0 6m19s
kubectl edit gss nginx-okg
spec:
  replicas: 3
  reserveGameServerIds:
    - 0
    - 4
  scaleStrategy:
    scaleDownStrategyType: ReserveIds
...
...
spec:
  replicas: 3
  reserveGameServerIds:
    - 0
    - 4
    - 6
  scaleStrategy:
    scaleDownStrategyType: ReserveIds
...
NAME STATE OPSSTATE DP UP AGE
nginx-okg-1 Ready None 0 0 24h
nginx-okg-2 Ready None 0 0 2m9s
nginx-okg-3 Ready None 0 0 7h3m
1. My first expectation was that gs [1, 3, 5] would be retained and reserveGameServerIds updated to [0, 4, 6].
2. I also do not yet understand the rationale for creating a new gs-2. Assuming gs-2's creation is expected, why was reserveGameServerIds not updated to [0, 4, 5, 6]?
Hello,
I've been working with the OpenKruise Game project, specifically with the GameServerSet custom resource. I noticed that when using GameServerSet, the generated Pods do not have individual DNS records, which seems to be due to the missing subdomain field in the Pods' spec part.
In contrast, when using a StatefulSet with a specified serviceName, the Pods have a subdomain field, which allows for individual DNS resolution for each Pod.
To improve the functionality of GameServerSet, I would like to suggest adding support for a serviceName-like field or directly using the subdomain field. This would enable individual DNS resolution for each Pod managed by a GameServerSet, making it more convenient for use cases that require addressing individual Pods.
Please let me know if this is something that can be considered for implementation in the OpenKruise Game project or if there are any workarounds available to achieve this functionality.
Creating a gss with the following yaml does not, as expected, create gs numbered [1-4]; the final gs and pod numbers are still [0-3]:
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: minecraft
spec:
  replicas: 4
  reserveGameServerIds: [0]
  gameServerTemplate:
    spec:
      containers:
        - name: minecraft
          image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
Inspecting the events shows that the pod numbered 4 was deleted:
LAST SEEN TYPE REASON OBJECT MESSAGE
3m7s Normal Scheduled pod/minecraft-0 Successfully assigned default/minecraft-0 to ssl-k8s-126-3
3m6s Normal Pulling pod/minecraft-0 Pulling image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2"
3m5s Normal Pulled pod/minecraft-0 Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2" in 353.010049ms (685.430123ms including waiting)
3m5s Warning ContainersNotReady gameserver/minecraft-0 containers with unready status: [minecraft]
3m5s Normal Created pod/minecraft-0 Created container minecraft
3m5s Normal Started pod/minecraft-0 Started container minecraft
3m4s Normal GsStateChanged gameserver/minecraft-0 State turn from Creating to Ready
3m7s Normal Scheduled pod/minecraft-1 Successfully assigned default/minecraft-1 to ssl-k8s-126-3
3m7s Warning ContainersNotReady gameserver/minecraft-1 containers with unready status: [minecraft]
3m6s Normal Pulling pod/minecraft-1 Pulling image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2"
3m6s Normal Pulled pod/minecraft-1 Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2" in 338.014ms (338.119396ms including waiting)
3m6s Normal Created pod/minecraft-1 Created container minecraft
3m6s Normal Started pod/minecraft-1 Started container minecraft
3m5s Normal GsStateChanged gameserver/minecraft-1 State turn from Creating to NotReady
3m3s Normal GsStateChanged gameserver/minecraft-1 State turn from NotReady to Ready
3m7s Normal Scheduled pod/minecraft-2 Successfully assigned default/minecraft-2 to ssl-k8s-126-3
3m7s Warning ContainersNotReady gameserver/minecraft-2 containers with unready status: [minecraft]
3m7s Normal Pulling pod/minecraft-2 Pulling image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2"
3m6s Normal Pulled pod/minecraft-2 Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2" in 299.383572ms (299.388314ms including waiting)
3m6s Normal Created pod/minecraft-2 Created container minecraft
3m6s Normal Started pod/minecraft-2 Started container minecraft
3m5s Normal GsStateChanged gameserver/minecraft-2 State turn from Creating to Ready
3m7s Normal Scheduled pod/minecraft-3 Successfully assigned default/minecraft-3 to ssl-k8s-126-3
3m7s Warning ContainersNotReady gameserver/minecraft-3 containers with unready status: [minecraft]
3m6s Normal Pulling pod/minecraft-3 Pulling image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2"
3m6s Normal Pulled pod/minecraft-3 Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2" in 349.925159ms (616.176777ms including waiting)
3m6s Normal Created pod/minecraft-3 Created container minecraft
3m6s Normal Started pod/minecraft-3 Started container minecraft
3m4s Normal GsStateChanged gameserver/minecraft-3 State turn from Creating to Ready
3m7s Normal Scheduled pod/minecraft-4 Successfully assigned default/minecraft-4 to ssl-k8s-126-3
3m6s Warning ContainersNotReady gameserver/minecraft-4 containers with unready status: [minecraft]
3m6s Normal Pulling pod/minecraft-4 Pulling image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2"
3m5s Normal Pulled pod/minecraft-4 Successfully pulled image "registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2" in 363.68806ms (772.307662ms including waiting)
3m5s Normal Created pod/minecraft-4 Created container minecraft
3m5s Normal Started pod/minecraft-4 Started container minecraft
3m4s Normal Killing pod/minecraft-4 Stopping container minecraft
3m3s Normal GsStateChanged gameserver/minecraft-4 State turn from Creating to Deleting
2m32s Warning ContainersNotReady; ContainerTerminated:Error gameserver/minecraft-4 containers with unready status: [minecraft]; ExitCode: 137
3m7s Normal CreateWorkload gameserverset/minecraft created Advanced StatefulSet
3m7s Normal Scale gameserverset/minecraft scale from 0 to 4
3m7s Normal SuccessfulCreate statefulset/minecraft create Pod minecraft-1 in StatefulSet minecraft successful
3m7s Normal SuccessfulCreate statefulset/minecraft create Pod minecraft-2 in StatefulSet minecraft successful
3m7s Normal SuccessfulCreate statefulset/minecraft create Pod minecraft-3 in StatefulSet minecraft successful
3m7s Normal SuccessfulCreate statefulset/minecraft create Pod minecraft-4 in StatefulSet minecraft successful
3m7s Normal SuccessfulCreate statefulset/minecraft create Pod minecraft-0 in StatefulSet minecraft successful
3m7s Normal SuccessfulDelete statefulset/minecraft delete Pod minecraft-4 in StatefulSet minecraft successful
We are experiencing an issue where updating our GameServerSet (GSS), which causes all managed Pods to rebuild, leaves some Pods (out of the 6 running GameServers) unable to retrieve network information, resulting in a "NotReady" network status. Below are the specific details and steps that lead to this issue:
Network plugin: HostPort
Number of GameServer replicas in the GSS: 6
Expected behavior: after the update and subsequent Pod recreation, all Pods successfully retrieve their network information and show a "Ready" network status.
I am observing logs from kruise-game-manager that warrant attention. Here are the specific log entries:
2024-01-26T14:59:46+08:00 I0126 06:59:46.237778 1 hostPort.go:73] Receiving pod dev/gs-dev-a4-3 ADD Operation
2024-01-26T14:59:46+08:00 I0126 06:59:46.237840 1 hostPort.go:80] There is a pod with same ns/name(dev/gs-dev-a4-3) exists in cluster, do not allocate
Currently, the pod mutating webhook handler contains two parts:
When the network plugin was originally designed, an asynchronous mechanism was proposed that allows network provisioning and pod generation to proceed asynchronously: the network plugin is allowed to fail in OnCreate/OnUpdate, and the controller triggers another Update event to ensure the network is created and available. Therefore, when the webhook handler encounters an error returned by the network plugin, it will: 1) not modify the pod fields (it returns the original pod directly); 2) allow the pod to be created or updated normally, treating the plugin's action as invalid and ignoring it.
However, the current mechanism has two problems:
It is recommended to follow Kubernetes' error-handling semantics and retrigger the event when an error is encountered. Then:
When the network plugin reports an error, the webhook handler rejects the pod creation/update, marks it as failed, and the pod create/update action is retriggered until it executes successfully without errors. The network ready check is still performed asynchronously. The purpose of this proposal is to guarantee that an operation executes successfully, regardless of the operation's subsequent results.
Note that for current network plugins such as slb, where the network only takes effect after the pod is created, the new mechanism would make the pod create action fail repeatedly, so the pod could never be created; such plugins therefore need additional modification.
Currently, GameServerSet supports batch updates in a user-defined manner by setting UpdatePriority and Partition. However, under this strategy users need to operate on gss and gs objects frequently, and often users want to complete rolling updates in a more automated way.
There are currently two scenario requirements:
type UpdateStrategy struct {
	// Type indicates the type of the StatefulSetUpdateStrategy.
	// Default is RollingUpdate.
	// +optional
	Type apps.StatefulSetUpdateStrategyType `json:"type,omitempty"`
	// RollingUpdate is used to communicate parameters when Type is RollingUpdateStatefulSetStrategyType.
	// +optional
	RollingUpdate *RollingUpdateStatefulSetStrategy `json:"rollingUpdate,omitempty"`
	// AutoUpdateStrategy means that the update process will be performed automatically without user intervention.
	// +optional
	AutoUpdateStrategy *AutoUpdateStrategy `json:"autoUpdateStrategy,omitempty"`
}

type AutoUpdateStrategy struct {
	// +kubebuilder:validation:Required
	Type AutoUpdateStrategyType `json:"type"`
	// Only GameServers in SpecificStates will be updated.
	// +optional
	SpecificStates []OpsState `json:"specificStates,omitempty"`
}

type AutoUpdateStrategyType string

const (
	// OnlyNewAutoUpdateStrategyType indicates that existing GameServers will never be updated;
	// new GameServers will be created from the new template.
	OnlyNewAutoUpdateStrategyType AutoUpdateStrategyType = "OnlyNew"
	// SpecificStateAutoUpdateStrategyType indicates that only GameServers with specific OpsStates will be updated.
	SpecificStateAutoUpdateStrategyType AutoUpdateStrategyType = "SpecificState"
)
We use Kibana to search logs, and it only supports structured logs in JSON format. It is recommended to add a log-format configuration option so that output can be configured as JSON, for example:
{"time":"2024-05-31T04:23:41.168044065Z","level":"INFO","source":{"function":"github.com/CloudNativeGame/kruise-game-open-match-director/pkg/logger.InfoContext","file":"/go/src/director/pkg/logger/logger.go","line":56},"msg":"begin FetchMatches","traceid":"3bc82e362d67a3cf6c55a9104d24456e","sampled":true}
Managing the configuration of game servers is a common issue after game containerization. The configuration of game servers can be presented through labels or annotations in Kubernetes (k8s), and then passed down to the containers using the Downward API for business awareness. However, in scenarios like PvE games or MMORPGs, each game server has its own unique configuration. This means that each game server requires distinct labels or annotations. Generally, the keys of these labels and annotations are the same across different game servers, only the values differ. We need a way to manage the different labels and annotations of different game servers in a batch, automatic, and persistent manner. Therefore, I propose a new custom resource definition (CRD) object called GameServerConfig.
type GameServerConfigSpec struct {
	GameServerSetName string            `json:"gameServerSetName"`
	LabelConfigs      []StringMapConfig `json:"labelConfigs,omitempty"`
	AnnotationConfigs []StringMapConfig `json:"annotationConfigs,omitempty"`
}

type StringMapConfig struct {
	Type       StringMapConfigType `json:"type"`
	KeyName    string              `json:"keyName"`
	IdValues   []IdValue           `json:"idValues,omitempty"`
	RenderRule string              `json:"renderRule,omitempty"`
}

type StringMapConfigType string

const (
	SpecifyID StringMapConfigType = "SpecifyID"
	RenderID  StringMapConfigType = "RenderID"
)

type IdValue struct {
	IdList []int  `json:"idList,omitempty"`
	Value  string `json:"value,omitempty"`
}

type GameServerConfig struct {
	metav1.TypeMeta   `json:",inline"`
	metav1.ObjectMeta `json:"metadata,omitempty"`
	Spec   GameServerConfigSpec   `json:"spec,omitempty"`
	Status GameServerConfigStatus `json:"status,omitempty"`
}

type GameServerConfigList struct {
	metav1.TypeMeta `json:",inline"`
	metav1.ListMeta `json:"metadata,omitempty"`
	Items []GameServerConfig `json:"items"`
}

type GameServerConfigState string

const (
	Pending GameServerConfigState = "Pending"
	Succeed GameServerConfigState = "Succeed"
)

type GameServerConfigStatus struct {
	State              GameServerConfigState `json:"state,omitempty"`
	LastTransitionTime metav1.Time           `json:"lastTransitionTime,omitempty"`
}
There are 8 GameServers managed by the GameServerSet minecraft. If a GameServerConfig as follows is applied:
gsc := GameServerConfig{
	Spec: GameServerConfigSpec{
		GameServerSetName: "minecraft",
		LabelConfigs: []StringMapConfig{
			{
				Type:    SpecifyID,
				KeyName: "zone-id",
				IdValues: []IdValue{
					{
						IdList: []int{1, 3, 4},
						Value:  "8001",
					},
					{
						IdList: []int{0, 2, 5},
						Value:  "8002",
					},
					{
						IdList: []int{6},
						Value:  "8003",
					},
					{
						IdList: []int{7},
						Value:  "8004",
					},
				},
			},
		},
		AnnotationConfigs: []StringMapConfig{
			{
				Type:       RenderID,
				KeyName:    "group-name",
				RenderRule: "group-<id>",
			},
		},
	},
}
The GameServers' labels & annotations will be:
GameServer Name | Label | Annotation |
---|---|---|
minecraft-0 | zone-id: 8002 | group-name: group-0 |
minecraft-1 | zone-id: 8001 | group-name: group-1 |
minecraft-2 | zone-id: 8002 | group-name: group-2 |
minecraft-3 | zone-id: 8001 | group-name: group-3 |
minecraft-4 | zone-id: 8001 | group-name: group-4 |
minecraft-5 | zone-id: 8002 | group-name: group-5 |
minecraft-6 | zone-id: 8003 | group-name: group-6 |
minecraft-7 | zone-id: 8004 | group-name: group-7 |
The OKG autoscaler is implemented against Keda's external scaler mechanism, which provides two interfaces: GetMetricSpec, which exposes the Target, and GetMetrics, which exposes the Value.
The current OKG autoscaler uses the GameServerSet's replica count as the Target in GetMetricSpec, and the replica count minus the number of WaitToBeDeleted GameServers as the Value in GetMetrics. Since GetMetricSpec and GetMetrics are called asynchronously, the replica counts observed by the two calls can differ at some moments, and the desired replicas calculated by the HPA then do not meet expectations.
An improvement is proposed: fix the target value returned by GetMetricSpec, let GetMetrics alone determine whether to scale down, and change the scaler's metric type from Value to AverageValue. After this change, the ratio of value to target is always less than or equal to 1, and scale-down is performed only when the ratio is less than 1, which solves the current problem of occasional unexpected scale-up.
The cloud load balancer is a mature network product that is well known and widely used by developers. However, in game scenarios, because game servers are stateful, user traffic cannot be balanced across different game servers, which runs counter to the concept of a Service in Kubernetes.
A Service matches the corresponding Pods and balances the traffic carried by the LB across them. As shown in the figure below, the Service port is 80 and the targetPort is 80; only one port is opened on the LB.
In the game server scenario, a single LB should open different ports and forward traffic to the corresponding Pod. As shown in the figure below, traffic is forwarded from LB port 555 to port 80 of pod0, from LB port 556 to port 80 of pod1, and from LB port 557 to port 80 of pod2. This way of using an LB is what game servers need.
This is implemented with the OKG [cloud provider & network plugin mechanism](#15) and is used as follows:
Specify network configuration when deploying GameServerSet:
cat <<EOF | kubectl apply -f -
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
name: gs-slb
namespace: default
spec:
replicas: 1
updateStrategy:
rollingUpdate:
podUpdatePolicy: InPlaceIfPossible
network:
networkType: AlibabaCloud-SLB
networkConf:
- name: SlbIds
#Fill in Alibaba Cloud LoadBalancer Id here
value: "lb-xxxxxxxxxxxxxxxxx"
- name: PortProtocols
#Fill in the exposed ports and their corresponding protocols here.
#If there are multiple ports, the format is as follows: {port1}/{protocol1},{port2}/{protocol2}...
#If the protocol is not filled in, the default is TCP
value: "80"
- name: Fixed
#Fill in here whether a fixed IP is required [optional] ; Default is false
value: "false"
gameServerTemplate:
spec:
containers:
- image: registry.cn-hangzhou.aliyuncs.com/gs-demo/gameserver:network
name: gameserver
EOF
Check network status in GameServer:
networkStatus:
createTime: "2022-11-24T01:27:30Z"
currentNetworkState: Ready
desiredNetworkState: Ready
externalAddresses:
- ip: 47.97.167.217
ports:
- name: "80"
port: "611"
protocol: TCP
internalAddresses:
- ip: 172.16.0.17
ports:
- name: "80"
port: "80"
protocol: TCP
lastTransitionTime: "2022-11-24T01:27:30Z"
networkType: Ali-SLB
ACK (Alibaba Cloud Container Service for Kubernetes) supports SLB multiplexing in Kubernetes: different Services can use different ports of the same SLB. Accordingly, the Ali-SLB network plugin records the port assignments of each SLB. For game servers whose network type is Ali-SLB, the plugin automatically allocates a port and creates a Service object; once the public IP in the Service's ingress field is successfully created, the GameServer network enters the Ready state and the process is complete.
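A rough sketch of the per-SLB port bookkeeping described above; the type names and the lowest-free-port policy are assumptions for illustration, not the plugin's actual code:

```go
package main

import (
	"errors"
	"fmt"
)

// slbAllocator is a simplified sketch of per-SLB port bookkeeping: each SLB id
// owns a range of ports and hands out the lowest free one for every new
// GameServer Service.
type slbAllocator struct {
	minPort, maxPort int32
	used             map[string]map[int32]bool // slbID -> allocated ports
}

func newSLBAllocator(min, max int32) *slbAllocator {
	return &slbAllocator{minPort: min, maxPort: max, used: map[string]map[int32]bool{}}
}

// Allocate returns the lowest unused port of the given SLB instance.
func (a *slbAllocator) Allocate(slbID string) (int32, error) {
	if a.used[slbID] == nil {
		a.used[slbID] = map[int32]bool{}
	}
	for p := a.minPort; p <= a.maxPort; p++ {
		if !a.used[slbID][p] {
			a.used[slbID][p] = true
			return p, nil
		}
	}
	return 0, errors.New("no free port on " + slbID)
}

func main() {
	alloc := newSLBAllocator(600, 1099)
	p1, _ := alloc.Allocate("lb-xxx")
	p2, _ := alloc.Allocate("lb-xxx")
	fmt.Println(p1, p2) // 600 601
}
```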
When Fixed is set to true in the network configuration of the GameServerSet, the fixed-IP function takes effect: even if the Pod is deleted and rebuilt, the traffic path from the SLB port to the Pod port does not change.
When creating the Service, its ownerReference is set according to Fixed. When Fixed is true, the owner of the Service is the GameServerSet, so the Service is deleted only when the GameServerSet is deleted; when Fixed is false, the owner is the Pod, and the Service is deleted along with the Pod.
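The ownerReference rule can be sketched with simplified types (the real implementation sets a metav1.OwnerReference on the Service object; `ownerRef` and `svcOwner` below are illustrative stand-ins):

```go
package main

import "fmt"

// ownerRef is a simplified stand-in for metav1.OwnerReference, enough to show
// the rule: Fixed=true parents the Service to the GameServerSet,
// Fixed=false parents it to the Pod.
type ownerRef struct {
	Kind string
	Name string
}

func svcOwner(fixed bool, gssName, podName string) ownerRef {
	if fixed {
		return ownerRef{Kind: "GameServerSet", Name: gssName}
	}
	return ownerRef{Kind: "Pod", Name: podName}
}

func main() {
	fmt.Println(svcOwner(true, "gs-slb", "gs-slb-0"))  // {GameServerSet gs-slb}
	fmt.Println(svcOwner(false, "gs-slb", "gs-slb-0")) // {Pod gs-slb-0}
}
```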
The Ali-SLB network plugin provides network isolation: even while a Pod is Ready, the game server's external network can be removed.
When the networkDisabled field of GameServer.Spec is set to true, the Ali-SLB network plugin isolates the game server from the network by changing the Service type from LoadBalancer to ClusterIP, cutting off external traffic. This is useful for scenarios such as testing a game server after an update before reopening it to players, or cutting off traffic when the game server is abnormal.
OpenKruiseGame has been adopted by lots of outstanding companies and we plan to set up a new project named kruise-game-dashboard to help more developers.
kruise-game-dashboard is a work in progress for two popular open-source Kubernetes dashboards, following the directions below.
Developers are welcome to provide information and suggestions on the key features you are interested in.
First of all, thanks sincerely for watching Kruise-Game. We will try our best to keep Kruise-Game better, and keep community and eco-system growing.
We’d like to listen to the community to make Kruise-Game better.
We want to attract more people to contribute to Kruise-Game.
We're willing to learn more Kruise-Game use scenarios for better planning.
Please submit a comment in this issue to include the following information:
Your company, school or organization.
Your city and country.
Your contact info: blog, email, WeChat or Twitter (at least one).
What are the obstacles that block your game server migration to Kubernetes.
You can refer to the following sample answer for the format:
Organization/Company: Alibaba
Location: Hangzhou, China
Contact: [email protected]
Obstacles/Scenario: game server hot upgrade and static ip/port.
Thanks again for your participation!
Kruise-Game Community
In the game area, servers are usually grouped into partitions. Each partition of servers is set up to serve a specific range of players. In each partition, there can be multiple types of servers that provide different services and communicate with each other within the same partition, like battle servers or scene servers.
Each type of server can contain multiple replicas and be modeled as a GameServerSet. The GameServerSet creates an OpenKruise StatefulSet to pull up pods, and each pod is attached with an additional GameServer object to help set the operational state of the pod.
Operating and managing a large number of GameServerSets across different partitions can be laborious. To alleviate the burden of repetitive operations on GameServerSets, we could introduce KubeVela to model the higher-level application on top of them.
In KubeVela, applications are used to model resources and manage their spec and lifecycle. Besides, there are delivery pipelines that describe operation actions as code and can be reused.
Specifically, we could model the GameServerSets in each partition as a single KubeVela application and let it manage the desired state and delivery process of the GameServerSets, such as updates. On top of that, for the whole game, we could add another application to manage the partition applications.
With this architecture, it is easy to modify the desired state of the GameServerSets along with the partitions or the types (roles).
First, to model the partition application, we need a KubeVela ComponentDefinition as the abstraction of GameServerSet. The CUE template below defines how the GameServerSet is formed, which parameters are exposed, and how the health state is evaluated.
"game-server-set": {
alias: ""
annotations: {}
description: "The GameServerSet."
type: "component"
attributes: {
workload: type: "autodetects.core.oam.dev"
status: {
customStatus: #"""
status: {
replicas: *0 | int
} & {
if context.output.status != _|_ {
if context.output.status.readyReplicas != _|_ {
replicas: context.output.status.readyReplicas
}
}
}
message: "\(context.name): \(status.replicas)/\(context.output.spec.replicas)"
"""#
healthPolicy: #"""
status: {
replicas: *0 | int
generation: *-1 | int
} & {
if context.output.status != _|_ {
if context.output.status.readyReplicas != _|_ {
replicas: context.output.status.readyReplicas
}
if context.output.status.observedGeneration != _|_ {
generation: context.output.status.observedGeneration
}
}
}
isHealth: (context.output.spec.replicas == status.replicas) && (context.output.metadata.generation == status.generation)
"""#
}
}
}
template: {
parameter: {
// +usage=The image of the Game Server
image: string
// +usage=The number of replicas
replicas: *1 | int
}
output: {
apiVersion: "game.kruise.io/v1alpha1"
kind: "GameServerSet"
spec: {
updateStrategy: rollingUpdate: podUpdatePolicy: "InPlaceIfPossible"
gameServerTemplate: spec: containers: [{
image: parameter.image
name: "\(context.name)"
}]
}
metadata: name: "\(context.name)"
spec: replicas: parameter.replicas
}
}
On top of that, we could have the abstraction of partition applications, indicated as `game-server-sets` below. The `game-server-sets` definition is a template for generating a partition application, and it exposes the configuration of the different types of underlying GameServerSets in its parameter.
"game-server-sets": {
alias: ""
annotations: {}
description: "The Game Server Sets of one region."
type: "component"
attributes: {
workload: type: "autodetects.core.oam.dev"
status: {
customStatus: #"""
status: {
phase: *"initializing" | string
} & {
if context.output.status != _|_ {
if context.output.status.status != _|_ {
phase: context.output.status.status
}
}
}
message: "\(context.name): \(status.phase)"
"""#
healthPolicy: #"""
status: {
phase: *"initializing" | string
generation: *-1 | int
} & {
if context.output.status != _|_ {
if context.output.status.status != _|_ {
phase: context.output.status.status
}
if context.output.status.observedGeneration != _|_ {
generation: context.output.status.observedGeneration
}
}
}
isHealth: (status.phase == "running") && (status.generation >= context.output.metadata.generation)
"""#
}
}
}
template: {
#GameServerSet: {
// +usage=The image of the Game Server
image: string
// +usage=The number of replicas
replicas: *1 | int
// +usage=The dependencies of the Game Server
dependsOn: *[] | [...string]
}
parameter: [string]: #GameServerSet
output: {
apiVersion: "core.oam.dev/v1beta1"
kind: "Application"
metadata: name: context.name
spec: {
components: [for role, gss in parameter {
name: "\(context.name)-\(role)"
type: "game-server-set"
properties: {
image: gss.image
replicas: gss.replicas
}
_dependsOn: [for d in gss.dependsOn {"\(context.name)-\(d)"}]
if len(_dependsOn) > 0 {
dependsOn: _dependsOn
}
}]
workflow: steps: [{
type: "deploy"
name: "deploy"
properties: policies: []
}]
}
}
}
Finally, we have the application that manages all the partition applications as the user interface, shown below:
apiVersion: core.oam.dev/v1beta1
kind: Application
metadata:
name: mmo
namespace: game
spec:
components:
- type: game-server-sets
name: partition-1
properties:
battle:
image: nginx:1.17
replicas: 2
scenes:
image: nginx:1.20
replicas: 2
ai:
image: nginx:1.21
replicas: 1
- type: game-server-sets
name: partition-2
dependsOn: ["partition-1"]
properties:
battle:
image: nginx:1.17
replicas: 3
scenes:
image: nginx:1.20
replicas: 3
ai:
image: nginx:1.21
replicas: 2
- type: game-server-sets
name: partition-3
properties:
battle:
image: nginx:1.17
replicas: 1
scenes:
image: nginx:1.20
replicas: 1
ai:
image: nginx:1.21
replicas: 1
policies:
- type: override
name: global-config
properties:
components:
- properties:
scenes:
dependsOn: ["battle"]
- type: apply-once
name: apply-once
properties:
enable: true
workflow:
steps:
- type: deploy
name: deploy
properties:
policies: ["global-config"]
The dependsOn fields specify the delivery order between different partitions and between different types of GameServerSets. In the above example, we require that partition-2 is updated after partition-1, and that the scenes GameServerSets are updated after the battle GameServerSets.
To add a new partition, users can add a new `game-server-sets` component to the top-layer MMO application. To update the image of the GameServerSets, there are two ways. One is to set the image in each `game-server-sets` component's configuration, which gives users fine-grained control over the image of each GameServerSet in each partition. The other is to set the image field in the `global-config` policy, where users only need to configure it once and it takes effect across all partitions.
Another action is to set the state of a GameServer for a pod, to mark the pod's deletion state, priority and other configurations. This can be achieved with a KubeVela WorkflowRun. The reason for using WorkflowRun instead of Application for managing GameServers is that GameServer objects are not directly managed through Applications or GameServerSets; they are attached to pods after creation, so managing them through a side channel is simpler.
To define the operation behaviour, we use the following CUE template. The `operate-gs` step defines the detailed update operation on the GameServer object: it first reads the GameServer from Kubernetes and then re-assembles it with the updated fields.
import (
"vela/op"
)
"operate-gs": {
type: "workflow-step"
description: "Operate GameServer."
}
template: {
#Operation: {
deletionPriority?: int
opsState?: "None" | "WaitToBeDeleted"
updatePriority?: int
}
handle: op.#Steps & {
for gsName, o in parameter {
"\(gsName)": op.#Steps & {
read: op.#Read & {
value: {
apiVersion: "game.kruise.io/v1alpha1"
kind: "GameServer"
metadata: {
name: gsName
namespace: context.namespace
}
}
} @step(1)
apply: op.#Apply & {
value: {
for k, v in read.value if k != "spec" {
"\(k)": v
}
if read.value.spec != _|_ {
spec: {
for k, v in read.value.spec {
if k != "deletionPriority" && k != "opsState" && k != "updatePriority" {
"\(k)": v
}
}
if o.deletionPriority != _|_ {
deletionPriority: o.deletionPriority
}
if o.deletionPriority == _|_ && read.value.spec.deletionPriority != _|_ {
deletionPriority: read.value.spec.deletionPriority
}
if o.opsState != _|_ {
opsState: o.opsState
}
if o.opsState == _|_ && read.value.spec.opsState != _|_ {
opsState: read.value.spec.opsState
}
if o.updatePriority != _|_ {
updatePriority: o.updatePriority
}
if o.updatePriority == _|_ && read.value.spec.updatePriority != _|_ {
updatePriority: read.value.spec.updatePriority
}
}
}
}
} @step(2)
}
}
}
parameter: [string]: #Operation
}
The atomic action is used as follows:
apiVersion: core.oam.dev/v1alpha1
kind: WorkflowRun
metadata:
name: edit-gs
namespace: game
spec:
workflowSpec:
steps:
- type: operate-gs
name: operate-gs
properties:
partition-1-scenes-0:
opsState: WaitToBeDeleted
partition-2-battle-1:
opsState: WaitToBeDeleted
partition-2-scenes-0:
deletionPriority: 20
This WorkflowRun is a one-time execution. It sets the opsState to WaitToBeDeleted for the first replica of the Scene GameServerSet in partition 1. Similar behaviors are applied to partition 2.
KubeVela also provides resource topology, which can be used to display the internal architecture of an application. To visualize the relationships between GameServerSets and StatefulSets, Pods and GameServers, we could apply the following configuration to the KubeVela system; then the full architecture of the KubeVela application can be visualized.
apiVersion: v1
kind: ConfigMap
metadata:
name: game-server-set-relation
namespace: vela-system
labels:
"rules.oam.dev/resource-format": "yaml"
"rules.oam.dev/resources": "true"
data:
rules: |-
- parentResourceType:
group: game.kruise.io
kind: GameServerSet
childrenResourceType:
- apiVersion: apps.kruise.io/v1beta1
kind: StatefulSet
- apiVersion: game.kruise.io/v1alpha1
kind: GameServer
- parentResourceType:
group: apps.kruise.io
kind: StatefulSet
childrenResourceType:
- apiVersion: v1
kind: Pod
When using a load balancer (LB) to expose services, the generated Service needs to allocate a NodePort, and the default NodePort range (30000-32767) provides only around 2700 ports. When there are many container service ports, this can lead to NodePort exhaustion.
By setting the allocateLoadBalancerNodePorts field of the Service to false, you can keep the generated Service from allocating NodePorts. However, this is only applicable in scenarios where LB traffic is passed directly to the pods.
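A minimal Service manifest using this field might look as follows; allocateLoadBalancerNodePorts is a standard Kubernetes Service field (stable since 1.24), while the names and selector below are placeholders:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: gs-lb-svc
spec:
  type: LoadBalancer
  allocateLoadBalancerNodePorts: false  # do not consume a NodePort
  selector:
    app: gameserver
  ports:
  - port: 80
    targetPort: 80
    protocol: TCP
```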
https://kubernetes.io/docs/concepts/services-networking/service/
As is well known, there are differences between the individual servers of PvE games, and these differences become more apparent over time. This usually manifests in two scenarios:
The number of players typically fluctuates over time, so the resource allocation of game servers needs to be adjusted accordingly to adapt to these changes and avoid degraded service quality or wasted resources. Ideally, the resource allocation will therefore differ between servers.
Some games have the concept of gameplay modes, and these gameplay strategies often have variations. Additionally, there are concepts like test servers and experimental servers, where different versions of the same service may exist on different servers. Ideally, the image versions will therefore differ between servers.
Based on the above situations, I propose to enhance the targeted management capability of OKG to support different resource configurations and image versions for different GameServers under the same GameServerSet.
To achieve this, I suggest adding a new field called "Containers" to the GameServerSpec.
// GameServerSpec defines the desired state of GameServer
type GameServerSpec struct {
OpsState OpsState `json:"opsState,omitempty"`
UpdatePriority *intstr.IntOrString `json:"updatePriority,omitempty"`
DeletionPriority *intstr.IntOrString `json:"deletionPriority,omitempty"`
NetworkDisabled bool `json:"networkDisabled,omitempty"`
// Containers can be used to make the corresponding GameServer container fields
// different from the fields defined by GameServerTemplate in GameServerSetSpec.
Containers []GameServerContainer `json:"containers,omitempty"`
}
type GameServerContainer struct {
// Name indicates the name of the container to update.
Name string `json:"name"`
// Image indicates the image of the container to update.
Image string `json:"image,omitempty"`
// Resources indicates the resources of the container to update.
Resources corev1.ResourceRequirements `json:"resources,omitempty"`
}
When the image or resources configuration of a GameServer's containers is different from the pod spec, the corresponding fields of the pod will be updated. In case of conflicts, the content declared in the GameServer takes precedence.
Newly created GameServers will follow the default settings specified in the GameServerTemplate within the GameServerSet.
Please refer to the diagram below for an illustration of the proposed effects:
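The merge semantics described above could look roughly like this; the types and the `applyOverrides` helper are simplified stand-ins for illustration, not the proposed implementation:

```go
package main

import "fmt"

// container is a simplified pod container (name, image, cpu request) used to
// sketch the proposed merge: fields declared in GameServer.Spec.Containers
// override the ones inherited from the GameServerTemplate.
type container struct {
	Name  string
	Image string
	CPU   string
}

// applyOverrides returns the pod containers with GameServer-level overrides
// applied; empty override fields leave the template value untouched.
func applyOverrides(podContainers []container, overrides []container) []container {
	out := append([]container(nil), podContainers...)
	for _, o := range overrides {
		for i := range out {
			if out[i].Name != o.Name {
				continue
			}
			if o.Image != "" {
				out[i].Image = o.Image
			}
			if o.CPU != "" {
				out[i].CPU = o.CPU
			}
		}
	}
	return out
}

func main() {
	pod := []container{{Name: "game", Image: "game:v1", CPU: "500m"}}
	gs := []container{{Name: "game", Image: "game:v2"}} // image overridden, CPU kept
	fmt.Println(applyOverrides(pod, gs))                // [{game game:v2 500m}]
}
```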
As we all know, pod.containers[*].resources can't be updated by default.
However, Kubernetes 1.27 adds a new feature gate named InPlacePodVerticalScaling, which lets a pod be vertically scaled in place (the pod is not recreated; containers may or may not be restarted). This means GameServers could resize their resources without affecting the players on them.
To avoid update failures, we should add a GameServer validating webhook that only allows updating containers that declare a resizePolicy.
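A container declaring a resizePolicy, which the proposed webhook would accept, might look like this (requires the InPlacePodVerticalScaling feature gate; names and image are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resizable-gs
spec:
  containers:
  - name: gameserver
    image: registry.example.com/gameserver:v1  # placeholder image
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired       # resize CPU in place, no restart
    - resourceName: memory
      restartPolicy: RestartContainer  # memory changes restart the container
    resources:
      requests:
        cpu: 500m
        memory: 512Mi
```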
type GameServerTemplate struct {
corev1.PodTemplateSpec `json:",inline"`
VolumeClaimTemplates []corev1.PersistentVolumeClaim `json:"volumeClaimTemplates,omitempty"`
// new
Owner GameServerOwner `json:"owner"`
}
type GameServerOwner string
const (
OwnerPod GameServerOwner = "Pod"
OwnerGameServerSet GameServerOwner = "GameServerSet"
)
The owner is the Pod - created when the pod is created and deleted when the pod is deleted, consistent with the pod life cycle.
The owner is GameServerSet - created before the pod is created and deleted after the pod is actually deleted. Specific examples:
Default Owner is Pod.
The ack-kruise-game in ACK grants excessive permissions when defining the Service Account named "kruise-game-controller-manager". Moreover, this Service Account is mounted in a pod named "kruise-game-controller-manager-675bb6974d-4m6d7", which makes it possible for attackers to escalate privileges to administrator.
# Attacking Strategy
If a malicious user controls a worker node that hosts the Pod mentioned above, or steals the Service Account token mentioned above, he or she can escalate permissions to administrator level and control the whole cluster.
For example,
# A few questions
AlibabaCloud-NLB
AlibabaCloud
AlibabaCloud-NLB enables game servers to be accessed from the Internet by using Layer 4 Network Load Balancer (NLB) of Alibaba Cloud. AlibabaCloud-NLB uses different ports of the same NLB instance to forward Internet traffic to different game servers. The NLB instance only forwards traffic, but does not implement load balancing.
This network plugin supports network isolation.
NlbIds
PortProtocols
Fixed
AllowNotReadyContainers
[alibabacloud]
enable = true
[alibabacloud.nlb]
# Specify the range of available ports of the NLB instance. Ports in this range can be used to forward Internet traffic to pods. In this example, the range includes 500 ports.
max_port = 1500
min_port = 1000
The DingTalk chat group is not available now; can anyone update it?
There are some scenarios:
Each game server has an independent EIP, which is the best solution to solve the above problems.
AlibabaCloud-EIP
AlibabaCloud
ReleaseStrategy
PoolId
ResourceGroupId
Bandwidth
BandwidthPackageId
ChargeType
None
In my GameServer, I have configured two game processes. Process 1 relies on the OKG network annotation and will only start after reading this annotation. Additionally, I have set up a readinessProbe to monitor whether this process's gRPC listener is ready. Process 2 depends on process 1 and will only start after process 1 is ready. This setup utilizes OKG's startup sequence control.
In the given background, there is an occasional issue where Pods fail to retrieve network annotation, causing them to remain in a pending state indefinitely.
During my actual usage, when the GameServerSet replicas are set to 4, I encountered a situation where one Pod remains in the pending state while the others start up normally.
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
name: gss
labels:
gs-group: test
spec:
replicas: 4
updateStrategy:
rollingUpdate:
podUpdatePolicy: ReCreate
network:
networkType: Kubernetes-HostPort
networkConf:
- name: ContainerPorts
value: "process1:5000/TCP"
gameServerTemplate:
metadata:
labels:
gs-group: test
spec:
affinity:
podAntiAffinity:
preferredDuringSchedulingIgnoredDuringExecution:
- weight: 1
podAffinityTerm:
topologyKey: "kubernetes.io/hostname"
labelSelector:
matchLabels:
gs-group: test
imagePullSecrets:
- name: qcloudregistrykey
containers:
- image: IMAGE_1
imagePullPolicy: IfNotPresent
name: process1
env:
- name: KRUISE_CONTAINER_PRIORITY
value: "2"
readinessProbe:
tcpSocket:
port: 6000
initialDelaySeconds: 5
periodSeconds: 5
successThreshold: 1
failureThreshold: 3
volumeMounts:
- name: network
mountPath: /opt/network
- image: IMAGE_2
imagePullPolicy: IfNotPresent
name: process2
env:
- name: KRUISE_CONTAINER_PRIORITY
value: "1"
- name: network
downwardAPI:
items:
- path: "annotations"
fieldRef:
fieldPath: metadata.annotations['game.kruise.io/network-status']
Multi-cloud and hybrid cloud expand the boundary of the cloud and have become one of the trends of cloud-native development. In order to reduce users' access cost to cloud infrastructure (such as the network), we designed the Cloud Provider module, which aims to integrate the parts coupled with cloud infrastructure into OpenKruiseGame (hereinafter referred to as OKG) in a pluggable way. From the beginning of OKG's design, we believed that game server networking is an important function that cannot be ignored for game operation and maintenance, so we first supported the network plugin of the Cloud Provider. Users can specify the network type and configuration by defining the network field of the GameServerSet workload, and obtain the corresponding network information in the GameServer object.
Cloud Provider is integrated into the webhook of kruise-game-manager in an in-tree manner; its architecture diagram is as follows:
...
spec:
network:
networkType: HostPortNetwork #This example uses the HostPort network
networkConf:
#The network configuration is imported in the form of K-V, and each plugin specifies the incoming format
#The format of HostPort is as follows, {containerName}:{port1}/{protocol1},{port2}/{protocol2},...
- name: ContainerPorts
value: game-server:25565/TCP
...
...
status:
networkStatus:
createTime: "2022-11-10T14:26:43Z"
#The current state, as determined by the network plugin
currentNetworkState: Ready
#The desired state, Ready/NotReady, is related to the GameServer.Spec.NetworkDisabled field.
#When NetworkDisabled is true, the expected state is NotReady;
#When NetworkDisabled is false, it is Ready, and the default is Ready when created.
desiredNetworkState: Ready
externalAddresses:
- ip: 38.111.149.177
ports:
- name: game-server-25565
port: 8870
protocol: TCP
internalAddresses:
- ip: 192.168.0.88
ports:
- name: game-server-25565
port: 25565
protocol: TCP
lastTransitionTime: "2022-11-10T14:26:43Z"
networkType: HostPortNetwork
...
OKG allows users to temporarily disable the network to interrupt network traffic:
...
spec:
networkDisabled: true
...
...
status:
networkStatus:
createTime: "2022-11-10T14:26:43Z"
currentNetworkState: NotReady #The network plugin senses the disabled operation and returns the current state after completing the network isolation
desiredNetworkState: NotReady
externalAddresses:
- ip: 38.111.149.177
ports:
- name: game-server-25565
port: 8870
protocol: TCP
internalAddresses:
- ip: 192.168.0.88
ports:
- name: game-server-25565
port: 25565
protocol: TCP
lastTransitionTime: "2022-11-10T14:29:01Z"
networkType: HostPortNetwork
...
As shown in the figure above, when the user defines the network field when creating the GameServerSet:
(1) The gameserverset-controller initiates a create-pod request by creating an Advanced StatefulSet; the webhook intercepts it and calls the OnPodCreate() function to create the network. After the create request passes, the pod is successfully created.
(2) The gameserver-controller perceives that the pod has a corresponding network status annotation and writes it back to the GameServer Status.
(3) When the GameServer is Ready, if the gameserver-controller finds that desiredNetworkState is consistent with currentNetworkState, the reconcile is terminated; when the two fields are inconsistent, the reconcile is retriggered, an update-pod request is initiated, the webhook intercepts the request and calls the OnPodUpdate() function, and the network plugin returns the current up-to-date network status. The reconcile interval is 5s and the total waiting time is one minute; if the status is still inconsistent after one minute, the request is abandoned.
Let's take a look at the changes in the fields of each object during the process:
(1) First, the user specifies the type and config of the GameServerSet.
(2) Specify the corresponding encoded annotation in the spec.template of the newly created Advanced StatefulSet.
(3) Through the pod mutating webhook, the pod adds the encoded status annotation. The network status fields that need to be returned by the plugin include externalAddress, internalAddress, and currentNetworkState.
(4) The GameServer generates status.networkStatus from the status annotation of the Pod, in which externalAddresses, internalAddresses and currentNetworkState are inherited from the pod's annotations, and the others are generated or modified during the controller reconcile process.
(5) When currentNetworkState is inconsistent with desiredNetworkState, the Pod's network-trigger-time annotation is set to the current time to trigger an update request (the trigger interval defaults to 5 seconds), until the timeout (1 minute by default) or the states converge.
As shown in the figure above, the networkDisabled field of the GameServer is specified when the user wants to isolate or unisolate the network:
(1) The gameserver-controller synchronizes the networkDisabled field to the corresponding pod annotation, triggering an update request that is intercepted by the webhook, which calls the OnPodUpdate() function. The network plugin performs network isolation/unisolation according to the pod's networkDisabled annotation and changes the currentNetworkState. The updated Pod carries the latest status annotation, which includes whether the current network status is Ready.
(2) The gameserver-controller perceives the status change of the pod and synchronizes the networkStatus to the GameServer.
(3) Similar to the creation process, whether to continue reconciling is decided by comparing the network status for consistency.
Changes in the fields of each object during the process:
The meaning of the fields is similar to that of creating the network, so it is not repeated here.
The process of deleting the network is very simple: as shown in the figure below, after the pod deletion request is intercepted by the webhook, the network plugin's OnPodDelete() function is called to delete the network resources.
OKG allows developers to customize cloud providers and network plugins according to their own needs. First, let's look at the call-relationship diagram of the modules in the webhook to understand the meaning of each interface of the network plugin:
(1) When the webhook registers the mutating pod Handler, it initializes the provider manager and the corresponding cloud providers. At this point, the network plugins will register itself in the map of the corresponding cloud provider, waiting to be accessed.
(2) When the pod mutating req is triggered, the handler will first extract the network plugin name corresponding to the pod, and find the corresponding network plugin object according to the network plugin name. Then call the corresponding function according to the action of req:
The network plugin needs to implement the following interface:
type Plugin interface {
Name() string
Alias() string
Init(client client.Client) error
OnPodAdded(client client.Client, pod *corev1.Pod) (*corev1.Pod, error)
OnPodUpdated(client client.Client, pod *corev1.Pod) (*corev1.Pod, error)
OnPodDeleted(client client.Client, pod *corev1.Pod) error
}
The meanings of OnPodAdded(), OnPodUpdated() and OnPodDeleted() will not be described in detail here.
If OKG does not currently support the cloud provider you expect, you can add it by implementing the following interface:
type CloudProvider interface {
Name() string
ListPlugins() (map[string]Plugin, error)
}
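The registration flow described above can be sketched with simplified versions of these interfaces (the real ones also take a controller-runtime client and pod objects; `myProvider` and `hostPortPlugin` are hypothetical names):

```go
package main

import "fmt"

// Plugin and CloudProvider are simplified versions of the interfaces quoted
// in the text, trimmed to what the registration flow needs.
type Plugin interface {
	Name() string
}

type CloudProvider interface {
	Name() string
	ListPlugins() (map[string]Plugin, error)
}

// myProvider is a hypothetical custom cloud provider holding its plugins in a
// map keyed by plugin name, as the webhook's lookup expects.
type myProvider struct {
	plugins map[string]Plugin
}

func (p *myProvider) Name() string                            { return "MyCloud" }
func (p *myProvider) ListPlugins() (map[string]Plugin, error) { return p.plugins, nil }

type hostPortPlugin struct{}

func (hostPortPlugin) Name() string { return "MyCloud-HostPort" }

func main() {
	provider := &myProvider{plugins: map[string]Plugin{}}
	plugin := hostPortPlugin{}
	provider.plugins[plugin.Name()] = plugin // plugin registers itself by name
	ps, _ := provider.ListPlugins()
	fmt.Println(provider.Name(), len(ps))
}
```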
After long-term operation, the resource specs required by each group of servers differ considerably, so different resource specs need to be set for each gs, e.g. with the method described at:
https://openkruise.io/zh/kruisegame/best-practices/pve-game#%E5%AE%9A%E5%90%91%E6%9B%B4%E6%96%B0%E6%B8%B8%E6%88%8F%E6%9C%8D%E9%95%9C%E5%83%8F%E4%B8%8E%E8%B5%84%E6%BA%90%E8%A7%84%E6%A0%BC
However, managing this by editing YAML is not intuitive when the number of gs is large, so we hope for a unified view to manage the differentiated resource specs of gs.
Possible implementation:
I have followed openkruise for a long time and recently took a look at OpenKruiseGame. I have some questions about deploying game servers: https://openkruise.io/zh/kruisegame/user-manuals/deploy-gameservers. For monolithic game services like game-1, game-2 and game-3, I need to load different configuration files (ConfigMaps), such as custom environment variables and database connection strings. What is a good way to implement this?
Currently, the metadata in the GameServerTemplate only applies to pods.
In fact, there is also a need for batch labels/annotations management on GameServers, so the metadata in the GameServerTemplate should also be synchronized to the GameServers belonging to that GameServerSet.
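A minimal sketch of what this could look like under the proposed synchronization; the label key/value is hypothetical, and the expectation is that it would appear on each GameServer as well as on the pod:

```yaml
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
metadata:
  name: minecraft
spec:
  replicas: 3
  gameServerTemplate:
    metadata:
      labels:
        region: cn-hangzhou   # hypothetical label, synchronized to GameServers under the proposal
    spec:
      containers:
        - name: minecraft
          image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
```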
We hope parameter validation is added for the spec.opsState of GameServer. Currently, when the value is modified to "kill" instead of "Kill", the modification succeeds and the queried GameServer opsState is kill:
NAME STATE OPSSTATE DP UP AGE
nginx-okg-0 Ready None 0 0 13h
nginx-okg-1 Ready None 0 0 13h
nginx-okg-2 Ready kill 0 0 13h
nginx-okg-4 Ready None 0 0 13h
Could serviceQualities provide a customizable restart policy, such as Always, OnFailure, or Never, so that the whole pod can be restarted when some of its processes crash? This would solve the data-inconsistency problems caused by partial process crashes.
The game server is stateful, and we need a protection mechanism to prevent the players who are playing the game from being affected when the GameServerSet reduces the number of replicas.
Introduce a scale-down strategy in GameServerSet, and add a new type of scale-down strategy called protected. When the scale-down type is protected, the user can select protected objects to prevent them from being deleted.
There are two types of protection policies: the threshold type, in which GameServers whose priority is less than the threshold value are protected, and the specified type, in which GameServers with specified ids or matching labels are protected.
type ScaleStrategy struct {
    ScaleDown ScaleDownStrategy `json:"scaleDown,omitempty"`
    ...
}

type ScaleDownStrategy struct {
    Type              ScaleDownStrategyType `json:"type,omitempty"`
    ProtectedStrategy ProtectedStrategy     `json:"protectedPolicy,omitempty"`
}

type ScaleDownStrategyType string

const (
    ProtectedScaleDownStrategyType ScaleDownStrategyType = "Protected"
)

type ProtectedStrategy struct {
    Type              ProtectedStrategyType      `json:"type,omitempty"`
    ThresholdStrategy ThresholdProtectedStrategy `json:"thresholdStrategy,omitempty"`
    SpecifiedStrategy SpecifiedProtectedStrategy `json:"specifiedStrategy,omitempty"`
}

type ProtectedStrategyType string

const (
    SpecifiedProtectedStrategyType ProtectedStrategyType = "Specified"
    ThresholdProtectedStrategyType ProtectedStrategyType = "Threshold"
)

type ThresholdProtectedStrategy struct {
    OpsState         OpsState            `json:"opsState,omitempty"`
    DeletionPriority *intstr.IntOrString `json:"deletionPriority,omitempty"`
}

type SpecifiedProtectedStrategy struct {
    GameServerIds []int                `json:"gameServerIds,omitempty"`
    LabelSelector metav1.LabelSelector `json:"labelSelector,omitempty"`
}
example 1:
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
spec:
scaleStrategy:
scaleDown:
type: protected
protectedPolicy:
type: threshold
thresholdStrategy:
opsState: None
deletionPriority: 30
In this example, GameServers whose opsState is None or Maintaining, and GameServers whose deletionPriority is less than or equal to 30, will be protected.
example 2:
apiVersion: game.kruise.io/v1alpha1
kind: GameServerSet
spec:
scaleStrategy:
scaleDown:
type: protected
protectedPolicy:
type: specified
specifiedStrategy:
gameServerIds:
- 2
- 3
labelSelector:
matchLabels:
GameServerLabel: "xxx"
In this example, GameServers with serial numbers 2 and 3, and GameServers with the GameServerLabel: "xxx" key-value pair, will be protected.
Also note that the replicas of the GameServerSet may differ from the actual number of GameServers when the scale-down strategy type is protected.
e.g.:
serviceQualities: # an idle service quality is configured
- name: healthy
containerName: minecraft
permanent: false
exec:
command: ["bash", "./healthy.sh"]
serviceQualityAction:
- state: false
opsState: Maintaining
- state: true
opsState: None
- name: idle
containerName: minecraft
initialDelaySeconds: 10
permanent: false
exec:
command: [ "bash", "./idle.sh" ]
serviceQualityAction:
- state: true
opsState: WaitToBeDeleted
- state: false
opsState: None
Currently, idle and healthy may conflict. For example, the healthy probe may set opsState to Maintaining, and in the next round the idle probe may set it back to None even though the server is still not healthy.
A better way could be to make the ops state changes atomic, which means the user only needs one probe that wraps all of that logic in a single call (sh, http, etc.).
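Under the multi-result ServiceQualityAction API proposed above, the two scripts could be merged into a single probe whose result string selects the action. The script name and result strings below are hypothetical:

```yaml
serviceQualities:
  - name: server-state
    containerName: minecraft
    permanent: false
    exec:
      # hypothetical script that prints one of: "unhealthy", "idle", "normal"
      command: ["bash", "./state.sh"]
    serviceQualityAction:
      - state: true
        result: unhealthy
        opsState: Maintaining
      - state: true
        result: idle
        opsState: WaitToBeDeleted
      - state: true
        result: normal
        opsState: None
```

Because a single script produces exactly one result per probe cycle, the conflicting true/true case described above cannot occur.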
As #15 mentioned, OKG already supports the cloud providers & plugins mechanism via a Kubernetes webhook. In addition, in order to increase network availability, OKG also supports asynchronous network readiness, running the plugin within a limited time to repeatedly establish & confirm the network until it is ready.
However, asynchronous network readiness requires that the webhook still allow the pod-creation operation when an error occurs, which actually conflicts with synchronous plugins such as the Kubernetes-HostPort plugin, because synchronous plugins require the pod & network to be ready at the same time.
Add a new function IsSynchronous to the Plugin interface to determine whether to allow pod creation when an error occurs.
The new Plugin interface would be:
type Plugin interface {
    Name() string
    // Alias defines a plugin with a similar function across multiple cloud providers
    Alias() string
    Init(client client.Client, options CloudProviderOptions, ctx context.Context) error
    // Pod event handlers
    OnPodAdded(client client.Client, pod *corev1.Pod, ctx context.Context) (*corev1.Pod, errors.PluginError)
    OnPodUpdated(client client.Client, pod *corev1.Pod, ctx context.Context) (*corev1.Pod, errors.PluginError)
    OnPodDeleted(client client.Client, pod *corev1.Pod, ctx context.Context) errors.PluginError
    // IsSynchronous determines whether to allow pod creation when errors occur.
    // If set to false, the webhook allows creating pods despite errors. If set to true, the webhook denies creating pods when errors occur.
    IsSynchronous() bool
}
When you want to delete a specific GameServer, deleting the GameServer CR directly would not help. We need an immediate way to delete a GameServer.
How can kruise-game use the reserveOrdinals feature of Advanced StatefulSet?
Following the tutorial, I installed kruise and kruise-game.
However, when I followed the document https://openkruise.io/zh/kruisegame/installation to deploy a game server service,
I used the following yaml:
apiVersion: v1
kind: GameServerSet
metadata:
name: minecraft
namespace: kruise-game-system
labels:
app: minecraft
spec:
replicas: 3
updateStrategy:
rollingUpdate:
podUpdatePolicy: InPlaceIfPossible
gameServerTemplate:
spec:
containers:
- image: registry.cn-hangzhou.aliyuncs.com/acs/minecraft-demo:1.12.2
name: minecraft
May I ask what is wrong with my configuration?
Attached:
kind: Service
apiVersion: v1
metadata:
name: kruise-webhook-service
namespace: kruise-system
labels:
app.kubernetes.io/managed-by: Helm
app.kubesphere.io/instance: kruise-cn7tvf
annotations:
meta.helm.sh/release-name: kruise-cn7tvf
meta.helm.sh/release-namespace: okg-learn
spec:
ports:
- protocol: TCP
port: 443
targetPort: 9876
selector:
control-plane: controller-manager
clusterIP: 10.233.28.39
clusterIPs:
- 10.233.28.39
type: ClusterIP
sessionAffinity: None
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
internalTrafficPolicy: Cluster
kind: Service
apiVersion: v1
metadata:
name: kruise-game-controller-manager-metrics-service
namespace: kruise-game-system
labels:
app.kubernetes.io/managed-by: Helm
app.kubesphere.io/instance: kruise-game-b68lck
control-plane: kruise-game-controller-manager
annotations:
meta.helm.sh/release-name: kruise-game-b68lck
meta.helm.sh/release-namespace: okg-learn
spec:
ports:
- name: https
protocol: TCP
port: 8443
targetPort: https
selector:
control-plane: kruise-game-controller-manager
clusterIP: 10.233.48.173
clusterIPs:
- 10.233.48.173
type: ClusterIP
sessionAffinity: None
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
internalTrafficPolicy: Cluster
kind: Service
apiVersion: v1
metadata:
name: kruise-game-external-scaler
namespace: kruise-game-system
labels:
app.kubernetes.io/managed-by: Helm
app.kubesphere.io/instance: kruise-game-b68lck
annotations:
meta.helm.sh/release-name: kruise-game-b68lck
meta.helm.sh/release-namespace: okg-learn
spec:
ports:
- protocol: TCP
port: 6000
targetPort: 6000
selector:
control-plane: kruise-game-controller-manager
clusterIP: 10.233.60.166
clusterIPs:
- 10.233.60.166
type: ClusterIP
sessionAffinity: None
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
internalTrafficPolicy: Cluster
kind: Service
apiVersion: v1
metadata:
name: kruise-game-webhook-service
namespace: kruise-game-system
labels:
app.kubernetes.io/managed-by: Helm
app.kubesphere.io/instance: kruise-game-b68lck
annotations:
meta.helm.sh/release-name: kruise-game-b68lck
meta.helm.sh/release-namespace: okg-learn
spec:
ports:
- protocol: TCP
port: 443
targetPort: 9876
selector:
control-plane: kruise-game-controller-manager
clusterIP: 10.233.22.252
clusterIPs:
- 10.233.22.252
type: ClusterIP
sessionAffinity: None
ipFamilies:
- IPv4
ipFamilyPolicy: SingleStack
internalTrafficPolicy: Cluster
This issue outlines that the annotation tied to the network may not be updated correctly during the deletion and rebuilding phases of a Pod.
What happened:
The service currently uses the default QPS and Burst settings for requests to the ApiServer (QPS=20, Burst=30). In some business scenarios, the default QPS and Burst may not be sufficient, and they need to be adjustable.
What you expected to happen:
We hope the service supports user-defined QPS and Burst, for example by setting them under the deployment's command, e.g.:
args:
- --api-server-qps=5
- --api-server-qps-burst=10