jezogwza / testaha

Testing HAA with garbage repository

testaha's Issues

k8s API Server Resilience

Multiple stateless, self-hosted, self-healing API servers behind an HA load balancer, built out by the default "kube-up" automation on GCE, AWS and basic bare metal (BBM). Note that the single-host approach of having etcd listen only on localhost, to ensure that only the API server can connect to it, will no longer work, so alternative security will be needed in that regard (firewall rules, SSL certs, or something else). All flags necessary to enable SSL between the API server and etcd are currently supported (OpenShift runs like this out of the box), but this needs to be woven into the "kube-up" and related scripts. The design of self-hosting, the related bootstrapping, and catastrophic failure recovery will be covered in a separate design doc.
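As a rough illustration of the SSL point above, the sketch below assembles the `kube-apiserver` arguments that point it at TLS-secured etcd endpoints. The flag names (`--etcd-servers`, `--etcd-cafile`, `--etcd-certfile`, `--etcd-keyfile`) are real `kube-apiserver` flags; the helper name, the endpoint addresses, and the certificate paths are assumptions made for the example, not something "kube-up" does today.

```python
from typing import List


def apiserver_etcd_tls_args(
    etcd_endpoints: List[str],
    ca_file: str,
    cert_file: str,
    key_file: str,
) -> List[str]:
    """Build kube-apiserver arguments for a TLS-secured etcd connection.

    Hypothetical helper: only the flag names come from kube-apiserver itself.
    """
    # Once etcd no longer listens only on localhost, a plaintext endpoint
    # would expose cluster state on the network, so reject anything not https.
    for endpoint in etcd_endpoints:
        if not endpoint.startswith("https://"):
            raise ValueError(f"etcd endpoint is not https: {endpoint}")
    return [
        "--etcd-servers=" + ",".join(etcd_endpoints),
        "--etcd-cafile=" + ca_file,      # CA used to verify the etcd server cert
        "--etcd-certfile=" + cert_file,  # client cert presented to etcd
        "--etcd-keyfile=" + key_file,    # private key for the client cert
    ]


if __name__ == "__main__":
    # Example values only; the addresses and paths are placeholders.
    for arg in apiserver_etcd_tls_args(
        ["https://10.0.0.1:2379", "https://10.0.0.2:2379"],
        "/etc/kubernetes/pki/etcd/ca.crt",
        "/etc/kubernetes/pki/apiserver-etcd-client.crt",
        "/etc/kubernetes/pki/apiserver-etcd-client.key",
    ):
        print(arg)
```

Client certificates are only one of the options named above; firewall rules restricting the etcd port to API server hosts would be an alternative or a complement.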

kubeadm resilience

Expand Aric's tool (which leverages kubeadm) to solve resiliency issues there as a stop-gap, and introduce fixes into kubeadm itself (long-term).

_**Adding some extra text**_

Controller manager and scheduler

Multiple self-hosted, self-healing, warm-standby, stateless controller managers and schedulers with leader election and automatic failover of API server clients, automatically installed by the default "kube-up" automation.
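The warm-standby behaviour described above can be sketched as a lease-based leader election. This is a minimal in-memory model, not the mechanism the controller manager or scheduler actually uses: `LeaseLock` and its method are hypothetical names, and a real deployment would keep the lease record in a shared store and update it with an atomic compare-and-swap.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Lease:
    holder: Optional[str] = None  # identity of the current leader, if any
    expires_at: float = 0.0       # time after which the lease may be taken over


class LeaseLock:
    """In-memory stand-in for a shared lease record (hypothetical)."""

    def __init__(self, duration: float) -> None:
        self.duration = duration
        self.lease = Lease()

    def try_acquire_or_renew(self, candidate: str, now: float) -> bool:
        """Return True if `candidate` holds the lease after this call."""
        lease = self.lease
        expired = now >= lease.expires_at
        # The current holder may always renew; anyone may take a free or
        # expired lease. Otherwise the candidate stays on warm standby.
        if lease.holder is None or lease.holder == candidate or expired:
            self.lease = Lease(holder=candidate, expires_at=now + self.duration)
            return True
        return False


if __name__ == "__main__":
    lock = LeaseLock(duration=10.0)
    print(lock.try_acquire_or_renew("scheduler-a", now=0.0))   # a becomes leader
    print(lock.try_acquire_or_renew("scheduler-b", now=5.0))   # b stays standby
    print(lock.try_acquire_or_renew("scheduler-b", now=12.0))  # a stopped renewing
```

Each replica would call `try_acquire_or_renew` periodically, at an interval shorter than the lease duration; a standby takes over only once the leader stops renewing, which is what gives automatic failover.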

From Github new after WebHook

k8s etcd resilience

Allocate a new node (not necessary if running etcd as a pod, in which case specific measures are required to prevent user pods from interfering with system pods, for example using node selectors as described in dynamic member addition). In the case of a remote persistent disk, the etcd state can be recovered by attaching the disk to the replacement node, so the state is recoverable even if all other replicas are down. There are also significant performance differences between local disks and remote persistent disks. For example, the sustained throughput of local disks in GCE is approximately 20x that of remote disks. Hence we suggest that self-healing be provided by remotely mounted persistent disks in non-performance-critical, single-zone cloud deployments. For performance-critical installations, faster local SSDs should be used, in which case remounting on node failure is not an option, so etcd runtime configuration should be used to replace the failed machine. Similarly, for cross-zone self-healing, cloud persistent disks are zonal, so automatic runtime configuration is required. Basic bare metal deployments likewise cannot generally rely on remote persistent disks, so the same approach applies there.
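The recommendation above reduces to a small decision rule, sketched here. The enum and function names are hypothetical; the rule itself only restates the paragraph: remount a remote persistent disk when the deployment is a non-performance-critical, single-zone cloud deployment, and fall back to etcd runtime configuration otherwise.

```python
from enum import Enum


class RecoveryStrategy(Enum):
    REMOUNT_PERSISTENT_DISK = "attach the remote persistent disk to the replacement node"
    RUNTIME_RECONFIGURATION = "replace the failed member via etcd runtime configuration"


def choose_recovery_strategy(
    performance_critical: bool,
    cross_zone: bool,
    bare_metal: bool,
) -> RecoveryStrategy:
    """Pick an etcd self-healing strategy for a deployment (illustrative only)."""
    # Local SSDs cannot be remounted, cloud persistent disks are zonal, and
    # bare metal generally has no remote persistent disks: in each of these
    # cases the failed member has to be replaced through runtime configuration.
    if performance_critical or cross_zone or bare_metal:
        return RecoveryStrategy.RUNTIME_RECONFIGURATION
    return RecoveryStrategy.REMOUNT_PERSISTENT_DISK


if __name__ == "__main__":
    strategy = choose_recovery_strategy(
        performance_critical=False, cross_zone=False, bare_metal=False
    )
    print(strategy.value)
```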

Test Issue Github Source as

ddddd

Load balance

Multiple (3-5) etcd quorum members behind a load balancer with session affinity (to prevent clients from being bounced from one to another). Regarding self-healing, if a node running etcd goes down, it is always necessary to do three things:
