
open-simulator's Introduction

Open-Simulator



Introduction

Open-Simulator is a cluster simulator for Kubernetes. With it, users can create a fake Kubernetes cluster and deploy workloads on it. Open-Simulator simulates the kube-controller-manager to create pods for the workloads, and simulates the kube-scheduler to assign those pods to appropriate nodes.

Use Cases

  • Capacity Planning: work out how many nodes are needed to install the cluster and deploy its applications successfully, given the existing server specifications (number of CPU cores, memory size, disk capacity, etc.) and the application workload files (replicas, affinity rules, resource requirements, etc.).
  • Simulating Application Deployment: determine whether the applications can all be deployed in one pass by simulating their deployment on a running Kubernetes cluster. If the cluster is too small to meet the applications' resource requirements, work out how many nodes to add.
  • Pod Migration: in a running Kubernetes cluster, pods can be migrated between nodes according to a migration policy (such as scaling down the cluster or defragmentation).

Open-Simulator aims to reduce labor costs in the delivery phase and maintenance costs in production, and to improve overall cluster resource utilization, by addressing the problems listed above.

✅ Features

  • Create fake Kubernetes clusters of any size
  • Deploy various workloads in a custom order
  • Simulate kube-scheduler and report the resulting deployment topology of the applications
  • Extend the scheduling algorithm
  • Set the average resource utilization during capacity planning
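
As a rough illustration of how a target average utilization feeds into capacity planning, the sketch below computes a lower bound on the node count. The `Resources` type and the `MinNodes` function are hypothetical names introduced for this example; they are not part of Open-Simulator, which schedules pods through a simulated kube-scheduler rather than by arithmetic alone.

```go
package main

import (
	"fmt"
	"math"
)

// Resources is a hypothetical two-dimensional resource vector
// (CPU in millicores, memory in MiB). It is not an Open-Simulator type.
type Resources struct {
	MilliCPU  int64
	MemoryMiB int64
}

// MinNodes returns a lower bound on the number of identical nodes needed so
// that the total requests stay within maxUtilization (0 < maxUtilization <= 1)
// of the total capacity in every dimension. It ignores affinity rules and
// bin-packing effects, so a real scheduling simulation may need more nodes.
func MinNodes(total, perNode Resources, maxUtilization float64) (int, error) {
	if maxUtilization <= 0 || maxUtilization > 1 {
		return 0, fmt.Errorf("maxUtilization must be in (0, 1], got %v", maxUtilization)
	}
	if perNode.MilliCPU <= 0 || perNode.MemoryMiB <= 0 {
		return 0, fmt.Errorf("node capacity must be positive")
	}
	byCPU := math.Ceil(float64(total.MilliCPU) / (float64(perNode.MilliCPU) * maxUtilization))
	byMem := math.Ceil(float64(total.MemoryMiB) / (float64(perNode.MemoryMiB) * maxUtilization))
	return int(math.Max(byCPU, byMem)), nil
}

func main() {
	total := Resources{MilliCPU: 50000, MemoryMiB: 100000} // sum of all pod requests
	node := Resources{MilliCPU: 8000, MemoryMiB: 32768}    // one node's capacity
	n, err := MinNodes(total, node, 0.8)
	if err != nil {
		panic(err)
	}
	fmt.Println("nodes needed (lower bound):", n)
}
```

With these example numbers CPU is the binding dimension, and the bound comes out at 8 nodes.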

User guide

Contact

Join us on DingTalk: Group No. 44890136

License

Apache 2.0 License

open-simulator's People

Contributors

liwling, qzweng, sulaimangari, thearas, thebeatles1994, youngseokyoon


open-simulator's Issues

English version README doc

Why do you need it?

We need an English version of README.md to make it easier for people who do not read Chinese to understand what open-simulator is.

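[Demo] Phase 2 implementation: container migration

Phase 2 content:

deadline: January 15, 2021 (Friday)

Concept definitions

Container migration: moving a Pod from its original node to a specified node.

Fragment: if the remaining resources on a node are insufficient, in any dimension, to fit even one more of the Pods in the current cluster, then all of those remaining resources are fragments. For example, if a node's CPU requests have reached 95% while its memory requests are only 40%, the node has a large amount of memory fragmentation.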
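
The fragment definition above can be expressed as a small predicate. This is a minimal sketch under stated assumptions: `Resources`, `fits`, and `IsFragmented` are hypothetical names, only CPU and memory are modeled, and an empty cluster is treated as "not fragmented", which is a choice made here rather than something the issue specifies.

```go
package main

import "fmt"

// Resources is a hypothetical resource vector (CPU in millicores, memory in MiB).
type Resources struct {
	MilliCPU  int64
	MemoryMiB int64
}

// fits reports whether request fits into free in every dimension.
func fits(request, free Resources) bool {
	return request.MilliCPU <= free.MilliCPU && request.MemoryMiB <= free.MemoryMiB
}

// IsFragmented reports whether a node's free resources are fragments, i.e.
// whether none of the pod requests seen in the cluster fits into them.
// With no pods to compare against it returns false (an assumption of this sketch).
func IsFragmented(free Resources, podRequests []Resources) bool {
	if len(podRequests) == 0 {
		return false
	}
	for _, r := range podRequests {
		if fits(r, free) {
			return false
		}
	}
	return true
}

func main() {
	// A 16-core / 64 GiB node with 95% of CPU and 40% of memory requested.
	free := Resources{MilliCPU: 800, MemoryMiB: 39321}
	pods := []Resources{
		{MilliCPU: 1000, MemoryMiB: 2048},
		{MilliCPU: 2000, MemoryMiB: 4096},
	}
	// Plenty of memory is left, but no pod fits because CPU is exhausted.
	fmt.Println("fragmented:", IsFragmented(free, pods))
}
```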

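Demo content

When a cluster is scaled down, the Pods on a node must be migrated before the node is taken offline. This phase demonstrates container migration before a node goes offline. Defragmentation is not demonstrated in this phase.

  • Simulate the existing cluster from its kube-config
  • Dry run
    • Select n nodes from the simulated cluster that can be taken offline (n is configurable)
      • When filtering nodes, support a resource exclusion list, for example leaving resources in a given namespace or Pods with a given label untouched
    • Show graphically how each resource changes before and after the scale-down
  • Container migration (Run)
    • Based on the dry-run result, migrate the Pods in the cluster to the specified Nodes in batches

Node decommissioning must be exposed as an SDK so that external projects can use it.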

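// NodeStatus embeds MigrationPlan and adds two fields:
// isRemovable reports whether the node can be taken offline;
// reason explains why the node cannot be taken offline.
type NodeStatus struct {
  MigrationPlan
  isRemovable bool
  reason      string
}

type MigrationResult struct {
  nodeStatus []NodeStatus
}

// ScaleDownCluster
// Parameters:
// 1. cluster is built by the caller.
// 2. nodelist is the list of nodes the user wants to take offline.
// Return values:
// 1. A non-nil error means the function failed.
// 2. A nil error means the function succeeded; the scale-down simulation
//    results are available through MigrationResult. UnscheduledPods holds the
//    Pods that could not be scheduled, and an empty UnscheduledPods means the
//    simulated scheduling succeeded. NodeStatus records the Pods on each Node
//    in detail.
func ScaleDownCluster(cluster ResourceTypes, nodelist []string, opts ...Option) (*MigrationResult, error)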
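
To make the dry-run idea concrete, here is a toy sketch of what a scale-down simulation has to compute. It does not implement `ScaleDownCluster`: the `Pod` and `Node` types and the `DryRunScaleDown` function are hypothetical, only CPU is modeled, and a simple first-fit placement stands in for the simulated kube-scheduler.

```go
package main

import "fmt"

// Pod and Node are hypothetical, simplified types (CPU only, in millicores).
type Pod struct {
	Name     string
	MilliCPU int64
}

type Node struct {
	Name     string
	MilliCPU int64 // allocatable CPU
	Pods     []Pod
}

// free returns the CPU not yet requested by the pods on the node.
func (n Node) free() int64 {
	f := n.MilliCPU
	for _, p := range n.Pods {
		f -= p.MilliCPU
	}
	return f
}

// DryRunScaleDown simulates taking the named nodes offline and re-placing
// their pods first-fit on the remaining nodes. It returns the names of the
// pods that could not be placed; an empty result means the scale-down fits.
// The caller's nodes are not modified.
func DryRunScaleDown(nodes []Node, offline []string) []string {
	drop := make(map[string]bool, len(offline))
	for _, name := range offline {
		drop[name] = true
	}

	var remaining []Node
	var evicted []Pod
	for _, n := range nodes {
		if drop[n.Name] {
			evicted = append(evicted, n.Pods...)
			continue
		}
		c := n
		c.Pods = append([]Pod(nil), n.Pods...) // copy so the input stays untouched
		remaining = append(remaining, c)
	}

	var unscheduled []string
	for _, p := range evicted {
		placed := false
		for i := range remaining {
			if remaining[i].free() >= p.MilliCPU {
				remaining[i].Pods = append(remaining[i].Pods, p)
				placed = true
				break
			}
		}
		if !placed {
			unscheduled = append(unscheduled, p.Name)
		}
	}
	return unscheduled
}

func main() {
	nodes := []Node{
		{Name: "n1", MilliCPU: 4000, Pods: []Pod{{"a", 1000}}},
		{Name: "n2", MilliCPU: 4000, Pods: []Pod{{"b", 3000}}},
		{Name: "n3", MilliCPU: 4000, Pods: []Pod{{"c", 2000}, {"d", 1500}}},
	}
	fmt.Println("unscheduled after removing n3:", DryRunScaleDown(nodes, []string{"n3"}))
	fmt.Println("unscheduled after removing n1:", DryRunScaleDown(nodes, []string{"n1"}))
}
```

In this example removing n3 leaves pod d without a home, so n3 would be reported as not removable, while removing n1 succeeds.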

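Why does the Simulator's Close method need to create a test Pod and wait for it to be scheduled?

Question

If an error is returned at this line, the deferred Close method blocks: the scheduler has not started running yet, so the test pod never gets a scheduling result, and Close blocks on the sim.simulatorStop channel.

To reproduce: delete this line, then run the tests in pkg/simulator/core_test.go. They will hang.

One way to fix this is to call NewSimulator only after the Pods have been generated. That way, if generation fails, there is no simulator to Close.

My question, though, is why the Simulator's Close method needs to create a test Pod and wait for it to finish scheduling. What would go wrong if the test Pod were removed?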
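
The blocking behavior described in this question can be modeled in a few lines. The sketch below is not Open-Simulator's code: `toySimulator` and its methods are hypothetical, and a timeout is added to `Close` only so that the demo terminates instead of hanging the way the real test reportedly does.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// toySimulator is a hypothetical stand-in for a simulator whose Close
// submits a sentinel pod and waits until the scheduler has handled it.
type toySimulator struct {
	pending chan string   // pods waiting for the scheduler
	stopped chan struct{} // closed once the sentinel pod has been scheduled
}

func newToySimulator() *toySimulator {
	return &toySimulator{
		pending: make(chan string, 1),
		stopped: make(chan struct{}),
	}
}

// Run plays the role of the scheduler loop: it consumes pods until it sees
// the sentinel pod, then signals that the simulator may stop.
func (s *toySimulator) Run() {
	for pod := range s.pending {
		if pod == "test-pod" {
			close(s.stopped)
			return
		}
	}
}

// Close submits the sentinel pod and waits for the scheduler to process it.
// Without the timeout it would block forever if Run was never started.
func (s *toySimulator) Close(timeout time.Duration) error {
	s.pending <- "test-pod"
	select {
	case <-s.stopped:
		return nil
	case <-time.After(timeout):
		return errors.New("close blocked: scheduler never ran")
	}
}

// demoBroken models the error path: Close runs before the scheduler started.
func demoBroken() error {
	s := newToySimulator()
	return s.Close(50 * time.Millisecond)
}

// demoHealthy models the normal path: the scheduler is running before Close.
func demoHealthy() error {
	s := newToySimulator()
	go s.Run()
	return s.Close(time.Second)
}

func main() {
	fmt.Println("scheduler not started:", demoBroken())
	fmt.Println("scheduler started:", demoHealthy())
}
```

The sketch also shows why the ordering fix suggested in the question works: if nothing that can fail sits between constructing the simulator and starting its scheduler, Close is never reached while the scheduler is idle.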

[feature] Support Helm Chart

Open-Simulator currently supports simulating workload scheduling only from YAML files of the StatefulSet and Deployment types. We need to support Helm charts as well.

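[Demo] Phase 1 implementation: cluster resource planning

Demo content:

deadline: November 15, 2021 (Monday)

  • Simulate bringing up a Kubernetes cluster: node specifications (CPU, memory, disk) and node counts are given in a YAML file. The cluster contains the following nodes:
    • master nodes
    • ordinary worker nodes
    • dedicated nodes
  • Prepare the user's Helm Chart files (for example web server / nginx / mysql / redis applications)
    • The Chart contains standard K8s Workload files; each Workload specifies its own resource quota, scheduling rules (affinity and anti-affinity, both between applications and between applications and nodes), and replica count
    • The Chart contains custom CR resources
    • There are multiple Chart files, with dependencies between Charts
  • Open-Simulator reads the Helm Charts and reports the following:
    • Whether the current simulated cluster can deploy all of the above applications in one pass; if not, a recommended cluster size
    • The cluster resource allocation rate after the applications are deployed
    • The cluster scheduling topology after the applications are deployed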
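
The "recommend a cluster size" and "allocation rate" outputs listed above can be sketched as a loop that keeps adding worker nodes until everything fits. This is an illustrative simplification, not Open-Simulator's algorithm: the names `Resources`, `schedule`, `RecommendNodes`, and `AllocationRate` are hypothetical, all nodes are assumed identical, and first-fit placement replaces the simulated kube-scheduler and its affinity rules.

```go
package main

import "fmt"

// Resources is a hypothetical resource vector (CPU in millicores, memory in MiB).
type Resources struct {
	MilliCPU  int64
	MemoryMiB int64
}

// schedule places pods first-fit onto n identical nodes and reports whether
// every pod found a node.
func schedule(pods []Resources, n int, nodeCap Resources) bool {
	free := make([]Resources, n)
	for i := range free {
		free[i] = nodeCap
	}
	for _, p := range pods {
		placed := false
		for i := range free {
			if p.MilliCPU <= free[i].MilliCPU && p.MemoryMiB <= free[i].MemoryMiB {
				free[i].MilliCPU -= p.MilliCPU
				free[i].MemoryMiB -= p.MemoryMiB
				placed = true
				break
			}
		}
		if !placed {
			return false
		}
	}
	return true
}

// RecommendNodes returns the smallest node count in [current, maxNodes] at
// which the first-fit heuristic places all pods, or false if none does.
func RecommendNodes(pods []Resources, current, maxNodes int, nodeCap Resources) (int, bool) {
	if current < 0 {
		current = 0
	}
	for n := current; n <= maxNodes; n++ {
		if schedule(pods, n, nodeCap) {
			return n, true
		}
	}
	return 0, false
}

// AllocationRate returns requested/capacity for CPU and memory on n nodes.
func AllocationRate(pods []Resources, n int, nodeCap Resources) (cpu, mem float64) {
	var req Resources
	for _, p := range pods {
		req.MilliCPU += p.MilliCPU
		req.MemoryMiB += p.MemoryMiB
	}
	cpu = float64(req.MilliCPU) / float64(int64(n)*nodeCap.MilliCPU)
	mem = float64(req.MemoryMiB) / float64(int64(n)*nodeCap.MemoryMiB)
	return cpu, mem
}

func main() {
	nodeCap := Resources{MilliCPU: 4000, MemoryMiB: 8192}
	pods := []Resources{
		{2500, 2048}, {2500, 2048}, {2500, 2048}, // three replicas of one app
		{1500, 4096}, {1500, 4096}, // two replicas of another
	}
	n, ok := RecommendNodes(pods, 2, 10, nodeCap)
	if !ok {
		fmt.Println("no cluster size up to 10 nodes fits the workloads")
		return
	}
	cpu, mem := AllocationRate(pods, n, nodeCap)
	fmt.Printf("recommended nodes: %d, CPU allocation: %.1f%%, memory allocation: %.1f%%\n",
		n, cpu*100, mem*100)
}
```

Starting from two nodes, the third replica of the first application has nowhere to go, so the sketch recommends three nodes for this example.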

Support for custom scheduler

Question

Thank you for this awesome project.
We use the Volcano scheduler for ML workload scheduling, and we are planning to use open-simulator for scenario testing.
Is it possible to use the Volcano scheduler instead of the default kube-scheduler?

Thank you!
