cloudnativegame / aigc-gateway

A user gateway that provides serverless AIGC experience.

License: Apache License 2.0

Dockerfile 1.56% JavaScript 4.94% HTML 1.16% Vue 17.38% CSS 3.36% Go 71.60%
aigc kubernetes openkruisegame

aigc-gateway's People

Contributors

chrisliu1995, ringtail, smartwang, wangying-ly, wuwenrufeng

aigc-gateway's Issues

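Improve the interactivity of the web front end

Currently, after the INSTALL, PAUSE, or RESUME button is clicked in the web front end, the page does not change at all; its state only updates after a manual refresh. This makes for a poor experience and can easily lead users to believe the operation failed and repeat it.
The page should update automatically and promptly after a button is clicked, for example by showing an in-progress indicator, while also preventing repeated clicks.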
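
One way the requested behavior could be approached is a small guard that tracks a per-button pending flag and drops clicks that arrive while a request is still in flight. This is only a sketch: `createActionGuard`, the key format, and the idea of binding the flag to the button's disabled state are assumptions for illustration, not code from the dashboard.

```javascript
// Hypothetical helper: runs an async action for a button while tracking a
// per-button "pending" flag, and ignores clicks that arrive while it is set.
function createActionGuard() {
  const pending = {}; // e.g. { "stable-diffusion-cpu:PAUSE": true }

  async function run(key, action) {
    if (pending[key]) {
      return false; // a previous click is still in flight; drop this one
    }
    pending[key] = true;
    try {
      await action();
      return true;
    } finally {
      pending[key] = false; // re-enable the button on success or failure
    }
  }

  return { run, isPending: (key) => Boolean(pending[key]) };
}
```

In a Vue component the `pending` object would need to be reactive data so that a spinner or a disabled button re-renders when the flag changes; the plain object here only illustrates the control flow.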

[Feature] Design Proposal for aigc-gateway

Introduction

This project addresses the resource management problem of AIGC instances by providing a serverless AIGC gateway built on the auto-scaling capabilities of cloud-native architecture.

The gateway has the following features:

  • User-state management. Each user has their own AIGC instance, and the gateway maintains the mapping between users and instances.
  • User-level resource management. AIGC computing instances are created and destroyed according to each user's login/offline status, while the user's persistent data is preserved.
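
The user-to-instance mapping described above can be pictured with a small in-memory registry. This is a sketch under stated assumptions: `InstanceRegistry`, the `"<model>-<id>"` naming scheme, and the in-memory storage are hypothetical and are not taken from the gateway's source.

```javascript
// Hypothetical in-memory registry: maps (user, model) to a stable instance.
class InstanceRegistry {
  constructor() {
    this.instances = new Map(); // "user/model" -> { id, name }
    this.nextId = new Map();    // model -> next unused numeric ID
  }

  // Returns the existing mapping, or allocates one on the user's first login.
  resolve(user, model) {
    const key = `${user}/${model}`;
    if (!this.instances.has(key)) {
      const id = this.nextId.get(model) ?? 0;
      this.nextId.set(model, id + 1);
      // Assumed naming scheme: "<model>-<id>".
      this.instances.set(key, { id, name: `${model}-${id}` });
    }
    return this.instances.get(key);
  }
}
```

Because `resolve` never reassigns an existing entry, a returning user is always pointed at the same instance name, which is the property the proposal relies on when it keeps the endpoint and storage stable across sessions.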

Design

User Usage Flow

As shown in the figure below, AIGC-Gateway can manage multiple AIGC model collections. When user A logs in, they send a request to the gateway and select a model; the gateway returns the access endpoint of the corresponding instance, which the user then connects to.

image

New User Online

As shown in the figure below, when a new user B logs in and selects an AIGC model, the gateway calls the instance collection interface to create a new instance for that user.

image

Old User Offline

As shown in the figure below, when an existing user A logs out or their session expires, the gateway calls the instance collection interface to scale down that specific instance, releasing its computing resources while preserving its storage resources.

image

Old User Back Online

As shown in the figure below, when an existing user A logs in again and selects an AIGC model, the gateway calls the instance collection interface to recreate the instance for user A. The instance name and access endpoint stay the same as before, and the data on the mounted persistent storage disk is not lost.

image
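
The offline and back-online flows can be modeled as pure updates to two scaling fields. The sketch below assumes the gateway drives scaling through `replicas` and `reserveGameServerIds`, the two fields that appear on the GameServerSet object quoted in a later issue on this page; the function names and the exact update rules are illustrative assumptions, not the gateway's implementation or the OpenKruiseGame API.

```javascript
// Simplified, hypothetical model of the scaling fields on a GameServerSet spec.
// Only the field names `replicas` and `reserveGameServerIds` come from the
// object quoted on this page; the update rules are assumptions.

// User goes offline: release compute for instance `id` but remember its ID.
function scaleDownInstance(spec, id) {
  if (spec.reserveGameServerIds.includes(id)) {
    return spec; // already offline; nothing to do
  }
  return {
    ...spec,
    replicas: Math.max(0, spec.replicas - 1),
    reserveGameServerIds: [...spec.reserveGameServerIds, id],
  };
}

// User comes back: bring instance `id` up again under the same ID.
function scaleUpInstance(spec, id) {
  if (!spec.reserveGameServerIds.includes(id)) {
    // Not reserved: treated as already online. Allocating a fresh ID for a
    // brand-new user is not modeled here.
    return spec;
  }
  return {
    ...spec,
    replicas: spec.replicas + 1,
    reserveGameServerIds: spec.reserveGameServerIds.filter((x) => x !== id),
  };
}
```

Both functions return a new object instead of mutating the input, so the result could be diffed against the current spec before anything is sent to the cluster.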

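Request: improve resource release and template cleanup

Please add the following features or documentation.

1 To prevent problems after installation, please add:
a A note on the Terway version dependency, plus an FAQ entry describing the network dependency issue and its workaround (for Alibaba Cloud customers).
b An update of the kruise-game version: --version 0.3.1

2 After a user logs in, errors sometimes occur that make the application deployed from the template unusable. Please provide a way to delete resources, so that customers can manually release and recreate them.

3 Please provide a way for users to manually delete a template, for cases where deletion is needed.
For example, deletion should be allowed when a customer no longer uses the image (application) defined in the current template and wants to switch to another application, or when they stop using AIGC-Gateway.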

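[new feature] Add restart/delete instance functionality

Background

When an AIGC instance becomes abnormal, for example when the underlying compute, storage, or network resources fail to start or the model hangs, users need to be able to decide for themselves whether to reclaim or restart the instance.

Design

  • Add a /restart endpoint that users can call to restart the container. Correspondingly, add a Restart button to the Dashboard that users can click while the instance shows the loading spinner or is running normally.

  • Add a /delete endpoint that deletes the computing instance together with its storage. Correspondingly, add an Uninstall button to the Dashboard that users can click at any stage after installation.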
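
The visibility rules for the two new buttons can be written as a small pure function. This is a minimal sketch: the status strings (`"NotInstalled"`, `"Starting"`, `"Running"`) and the function name are hypothetical, and the dashboard's real state values may differ.

```javascript
// Hypothetical status values; only the rules themselves come from the proposal:
//   Restart   -> while the instance is starting (spinner) or running
//   Uninstall -> at any stage after installation
function newButtonsFor(status) {
  if (status === "NotInstalled") {
    return []; // nothing to restart or uninstall yet
  }
  const buttons = ["Uninstall"];
  if (status === "Starting" || status === "Running") {
    buttons.unshift("Restart");
  }
  return buttons;
}
```

Keeping the rule in one function means the template only has to render whatever list comes back, so the conditions stay in a single place if more states are added later.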

Is it appropriate for the dashboard to show a default instance module named stable-diffusion-cpu when OKG is not installed?

line 16 in ./aigc-dashboard/src/components/engine.vue

  methods: {
    getData() {
      this.axios.get("/resources").then((response) => {
        this.items = response.data
      }).catch((error) => {
        this.items = [{"kind":"GameServerSet","apiVersion":"game.kruise.io/v1alpha1","metadata":{"name":"stable-diffusion-cpu","namespace":"default","uid":"198242b9-f6c8-4d33-8a80-5bbee44fa353","resourceVersion":"3534496","generation":11,"creationTimestamp":"2023-05-19T11:05:15Z","annotations":{"game.kruise.io/reserve-ids":"3,2,1","kubectl.kubernetes.io/last-applied-configuration":"{\"apiVersion\":\"game.kruise.io/v1alpha1\",\"kind\":\"GameServerSet\",\"metadata\":{\"annotations\":{},\"name\":\"stable-diffusion-cpu\",\"namespace\":\"default\"},\"spec\":{\"gameServerTemplate\":{\"spec\":{\"containers\":[{\"args\":[\"--listen\",\"--skip-torch-cuda-test\",\"--no-half\"],\"command\":[\"python3\",\"launch.py\"],\"env\":[{\"name\":\"POD_NAME\",\"valueFrom\":{\"fieldRef\":{\"apiVersion\":\"v1\",\"fieldPath\":\"metadata.name\"}}}],\"image\":\"yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/stable-diffusion:v1.0.0-cpu\",\"name\":\"stable-diffusion\",\"readinessProbe\":{\"failureThreshold\":3,\"initialDelaySeconds\":5,\"periodSeconds\":10,\"successThreshold\":1,\"tcpSocket\":{\"port\":7860},\"timeoutSeconds\":1}}]}},\"network\":{\"networkConf\":[{\"name\":\"IngressClassName\",\"value\":\"nginx\"},{\"name\":\"Port\",\"value\":\"7860\"},{\"name\":\"Host\",\"value\":\"instances\\u003cid\\u003e.c5464a5f2c39341d3b3eda6e2dd37b505.cn-hangzhou.alicontainer.com\"},{\"name\":\"PathType\",\"value\":\"ImplementationSpecific\"},{\"name\":\"Path\",\"value\":\"/\"},{\"name\":\"Annotation\",\"value\":\"nginx.ingress.kubernetes.io/auth-url: https://dashboard.c5464a5f2c39341d3b3eda6e2dd37b505.cn-hangzhou.alicontainer.com/sign-in\"},{\"name\":\"Annotation\",\"value\":\"nginx.ingress.kubernetes.io/auth-signin: 
https://dashboard.c5464a5f2c39341d3b3eda6e2dd37b505.cn-hangzhou.alicontainer.com/\"}],\"networkType\":\"Kubernetes-Ingress\"},\"replicas\":0,\"scaleStrategy\":{\"scaleDownStrategyType\":\"ReserveIds\"},\"updateStrategy\":{\"rollingUpdate\":{\"maxUnavailable\":\"100%\",\"podUpdatePolicy\":\"InPlaceIfPossible\"}}}}\n"},"managedFields":[{"manager":"manager","operation":"Update","apiVersion":"game.kruise.io/v1alpha1","time":"2023-05-19T11:06:04Z","fieldsType":"FieldsV1","fieldsV1":{"f:status":{".":{},"f:availableReplicas":{},"f:currentReplicas":{},"f:labelSelector":{},"f:maintainingReplicas":{},"f:observedGeneration":{},"f:readyReplicas":{},"f:replicas":{},"f:updatedReadyReplicas":{},"f:updatedReplicas":{},"f:waitToBeDeletedReplicas":{}}},"subresource":"status"},{"manager":"kubectl-client-side-apply","operation":"Update","apiVersion":"game.kruise.io/v1alpha1","time":"2023-05-22T08:05:50Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{".":{},"f:kubectl.kubernetes.io/last-applied-configuration":{}}},"f:spec":{".":{},"f:gameServerTemplate":{".":{},"f:spec":{}},"f:network":{".":{},"f:networkConf":{},"f:networkType":{}},"f:scaleStrategy":{},"f:updateStrategy":{".":{},"f:rollingUpdate":{".":{},"f:maxUnavailable":{},"f:podUpdatePolicy":{}}}}}},{"manager":"manager","operation":"Update","apiVersion":"game.kruise.io/v1alpha1","time":"2023-05-22T08:05:50Z","fieldsType":"FieldsV1","fieldsV1":{"f:metadata":{"f:annotations":{"f:game.kruise.io/reserve-ids":{}}}}},{"manager":"aigc-gateway","operation":"Update","apiVersion":"game.kruise.io/v1alpha1","time":"2023-05-22T08:12:48Z","fieldsType":"FieldsV1","fieldsV1":{"f:spec":{"f:gameServerTemplate":{"f:metadata":{".":{},"f:creationTimestamp":{}},"f:spec":{"f:containers":{}}},"f:replicas":{},"f:reserveGameServerIds":{}}}}]},"spec":{"replicas":2,"gameServerTemplate":{"metadata":{"creationTimestamp":null},"spec":{"containers":[{"name":"stable-diffusion","image":"yunqi-registry.cn-shanghai.cr.aliyuncs.com/lab/stable-diffu
sion:v1.0.0-cpu","command":["python3","launch.py"],"args":["--listen","--skip-torch-cuda-test","--no-half"],"env":[{"name":"POD_NAME","valueFrom":{"fieldRef":{"apiVersion":"v1","fieldPath":"metadata.name"}}}],"resources":{},"readinessProbe":{"tcpSocket":{"port":7860},"initialDelaySeconds":5,"timeoutSeconds":1,"periodSeconds":10,"successThreshold":1,"failureThreshold":3}}]}},"reserveGameServerIds":[3,2,1],"updateStrategy":{"rollingUpdate":{"maxUnavailable":"100%","podUpdatePolicy":"InPlaceIfPossible"}},"scaleStrategy":{},"network":{"networkType":"Kubernetes-Ingress","networkConf":[{"name":"IngressClassName","value":"nginx"},{"name":"Port","value":"7860"},{"name":"Host","value":"instances\u003cid\u003e.c5464a5f2c39341d3b3eda6e2dd37b505.cn-hangzhou.alicontainer.com"},{"name":"PathType","value":"ImplementationSpecific"},{"name":"Path","value":"/"},{"name":"Annotation","value":"nginx.ingress.kubernetes.io/auth-url: https://dashboard.c5464a5f2c39341d3b3eda6e2dd37b505.cn-hangzhou.alicontainer.com/sign-in"},{"name":"Annotation","value":"nginx.ingress.kubernetes.io/auth-signin: https://dashboard.c5464a5f2c39341d3b3eda6e2dd37b505.cn-hangzhou.alicontainer.com/"}]}},"status":{"observedGeneration":11,"replicas":2,"readyReplicas":1,"availableReplicas":1,"currentReplicas":2,"updatedReplicas":2,"updatedReadyReplicas":1,"maintainingReplicas":0,"waitToBeDeletedReplicas":0,"labelSelector":"game.kruise.io/owner-gss=stable-diffusion-cpu"}}]
      })
    }
  },

I would prefer that the dashboard show nothing when OKG is not installed.
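
A minimal sketch of that suggestion, assuming the component keeps the same `axios` instance and `items` field as the snippet above: the error branch clears the list instead of substituting a hard-coded GameServerSet. Returning the promise is an addition for illustration so that callers can wait for the request to settle.

```javascript
// Sketch of the `methods` entry with an empty fallback instead of sample data.
const engineMethods = {
  getData() {
    return this.axios
      .get("/resources")
      .then((response) => {
        this.items = response.data;
      })
      .catch(() => {
        // The request failed (for example, OKG is not installed): show nothing.
        this.items = [];
      });
  },
};
```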

After deploying Logto, creating a web application is very likely to fail with an error
