kedacore / keda

KEDA is a Kubernetes-based Event Driven Autoscaling component. It provides event driven scale for any container running in Kubernetes

Home Page: https://keda.sh

License: Apache License 2.0

Go 99.09% Shell 0.29% Dockerfile 0.15% Makefile 0.46%
kubernetes serverless autoscaling event-driven keda hacktoberfest

keda's People

Contributors

aarthisk, ahmelsayed, anirudhgarg, aviadlevy, coderanger, dependabot[bot], dttung2905, eldarrin, gauron99, goku321, jeffhollan, jorturfer, lee0c, patnaikshekhar, ppatierno, ramcohen, renovate[bot], ritikaa96, samuelmacko, shubham82, silenceper, spiritzhou, tbickford, tomkerkhove, tsuyoshiushio, turbaszek, v-shenoy, yaron2, yoongon, zroubalik


keda's Issues

Get logging out of only debug level

Some logs should be emitted by default to help with debugging. Specifically, I think these logs would be interesting:

  • When a new scaledObject is detected / registered
  • When activation occurs
  • When a scaling decision is made
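The gating the issue asks for can be sketched as follows. This is a minimal illustration (not KEDA's actual logging code) of emitting the lifecycle events above at info level by default while keeping per-poll chatter behind a debug flag; all names and messages here are hypothetical.

```go
package main

import "fmt"

// Logger is a minimal level-gated logger sketch. Lines are captured in a
// slice for demonstration; a real logger would write to stderr.
type Logger struct {
	Debug bool
	Lines []string
}

// Infof records a message unconditionally (visible by default).
func (l *Logger) Infof(format string, args ...interface{}) {
	l.Lines = append(l.Lines, "INFO: "+fmt.Sprintf(format, args...))
}

// Debugf records a message only when debug logging is enabled.
func (l *Logger) Debugf(format string, args ...interface{}) {
	if l.Debug {
		l.Lines = append(l.Lines, "DEBUG: "+fmt.Sprintf(format, args...))
	}
}

func main() {
	l := &Logger{}
	// Events the issue suggests should be visible by default:
	l.Infof("detected new ScaledObject %s/%s", "default", "my-function")
	l.Infof("activation: scaling %s from 0 to 1", "my-function")
	l.Infof("scaling decision: queue length 120 -> 24 replicas")
	// Per-poll chatter stays hidden unless Debug is set:
	l.Debugf("polled queue length: 120")
	for _, line := range l.Lines {
		fmt.Println(line)
	}
}
```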

Keda as a generic pull-based autoscaler for K8s and Knative

I'd love to see Kore become a generic pull-based autoscaler for Kubernetes and Knative. Below are the general high-level things I believe are needed to make that happen:

  • Open source Kore
  • Define an extension or plugin mechanism. If Kore supports various Azure services and Kafka out of the box, how would a user add support for another service? AMQP, other cloud provider services, etc? Where would that code live? Does it all need to live in the Kore repo? How would the person deploying Kore enable or disable specific service support?
  • Where would the Knative PodAutoscaler implementation that delegates to Kore live? Should that part live in Knative itself and just depend on Kore, which keeps Kore from needing to take a dependency on Knative? Or should it live in Kore? Or in its own repo, independent of either?
  • Add unit and e2e tests so we have confidence in releases
  • Give Kore the option to watch a single namespace instead of only all namespaces. This is trivial if we move to an operator-sdk style layout, but even with what's here today it's just adding an environment variable and a bit of logic to control which namespaces we watch. I believe this is important because Kore handles Secret objects and, at least while it's fairly new, limiting its privileges to reading Secrets in a single namespace seems like a good idea.

This issue is to have a conversation about the above and track any other bits of work we need to make it happen.

Kore metrics as custom/external Kubernetes metrics

Kubernetes defines APIs to get custom/external metrics into the system. They can also be used to feed the HPA for example.

As Kore progresses, we could think about organizing the code in a way that makes it straightforward to eventually plug it into the HPA (or other autoscalers). I think these APIs make for a very nice uniform way to provide the metrics Kore needs.

See https://github.com/kubernetes-incubator/custom-metrics-apiserver/blob/master/pkg/provider/interfaces.go#L104-L108 for the interface to implement. This might directly influence #19 as it could be a partial answer as to how the code should be organized.

I'd envision a custom/external metrics adapter per event source. Kore's own autoscaler would poll these metrics at a fixed interval (as today) for now, to cover the cases Kore needs.
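To give a feel for the contract in the linked interfaces.go, here is a sketch of what a per-event-source adapter would roughly implement. The types below are simplified local stand-ins, not the real k8s.io imports (the real interface also takes a label selector and returns resource.Quantity values), and queueLengthProvider is a hypothetical example.

```go
package main

import "fmt"

// ExternalMetricInfo identifies one external metric (simplified stand-in).
type ExternalMetricInfo struct {
	Metric string
}

// ExternalMetricValue is a single metric sample (the real API uses
// resource.Quantity instead of int64).
type ExternalMetricValue struct {
	MetricName string
	Value      int64
}

// ExternalMetricsProvider mirrors the shape of the provider interface in
// custom-metrics-apiserver, simplified for illustration.
type ExternalMetricsProvider interface {
	GetExternalMetric(namespace string, info ExternalMetricInfo) ([]ExternalMetricValue, error)
	ListAllExternalMetrics() []ExternalMetricInfo
}

// queueLengthProvider is a hypothetical adapter exposing a queue-length
// metric for one event source.
type queueLengthProvider struct{}

func (p *queueLengthProvider) GetExternalMetric(ns string, info ExternalMetricInfo) ([]ExternalMetricValue, error) {
	// A real adapter would query the event source (e.g. a queue) here.
	return []ExternalMetricValue{{MetricName: info.Metric, Value: 42}}, nil
}

func (p *queueLengthProvider) ListAllExternalMetrics() []ExternalMetricInfo {
	return []ExternalMetricInfo{{Metric: "queue-length"}}
}

func main() {
	var p ExternalMetricsProvider = &queueLengthProvider{}
	vals, _ := p.GetExternalMetric("default", ExternalMetricInfo{Metric: "queue-length"})
	fmt.Println(vals[0].MetricName, vals[0].Value)
}
```

An autoscaler (Kore's own loop, or the HPA via the external metrics API) would then poll GetExternalMetric on its interval and scale on the returned values.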

Ability to set a minimum number of instances for a deployment

Rather than forcing all functions to scale to 0 when there are no events, you may want some functions (e.g. HTTP ones) to only ever scale down to 1. You still want them to scale 1 -> n if the event source gets noisy, but this would let you protect against cold starts.
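The feature amounts to clamping the computed replica count in the scale loop. A hypothetical sketch, with illustrative names rather than KEDA's actual fields:

```go
package main

import "fmt"

// targetReplicas clamps the scaler's desired replica count between a
// per-deployment floor and ceiling. With minReplicas = 1, an idle HTTP
// function stays warm at 1 instead of scaling to 0.
func targetReplicas(desired, minReplicas, maxReplicas int32) int32 {
	if desired < minReplicas {
		return minReplicas
	}
	if desired > maxReplicas {
		return maxReplicas
	}
	return desired
}

func main() {
	fmt.Println(targetReplicas(0, 1, 10))  // idle, but the floor keeps 1 warm
	fmt.Println(targetReplicas(7, 1, 10))  // normal 1 -> n scale-out untouched
	fmt.Println(targetReplicas(50, 1, 10)) // capped at the ceiling
}
```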

examples/scaledobject.yaml out of date

The example ScaledObject in examples/scaledobject.yaml has fields that don't match the spec. I can look at the source and figure out what's expected here, but I'm just adding an issue as a reminder to update the example.

Stop scaling out new resources if the function isn't successfully processing

There are two scenarios we have today where we stop scaling new instances even as the queue length increases:

  1. No available partitions. If you have an event hub with 5 partitions whose queue length is growing, we will scale to 5 instances, and then 6. The 6th one gets scaled out but can't get a lock on a partition, so it never actually consumes any events. We never scale to a 7th because we see that the 6th isn't executing, so we stop scaling.
  2. A function is misconfigured and is never actually able to start consuming. Rather than scaling out a broken app, we stop scaling if we see no executions are happening.

In our service we do that by looking at the execution and billing metrics for the instance, and if no billing or execution metrics are emitting we stop scaling more.

We'd need a similar pattern in Kore so that if the event consumer isn't actually processing new messages, we don't keep scaling, both for the "no available partition" scenario and the "my app isn't even working" scenario. We could also solve each of these in two separate ways (e.g. maybe expose partition info to the scaler?)
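For the first scenario, exposing partition info to the scaler would reduce to a simple cap. An illustrative sketch (not KEDA's implementation):

```go
package main

import "fmt"

// cappedReplicas never requests more consumers than the event source has
// partitions, since any extras can't acquire a partition lock and sit idle.
func cappedReplicas(desiredFromQueueLength, partitionCount int) int {
	if desiredFromQueueLength > partitionCount {
		return partitionCount
	}
	return desiredFromQueueLength
}

func main() {
	// Event hub with 5 partitions: queue depth suggests 6 consumers,
	// but a 6th could never acquire a partition lock.
	fmt.Println(cappedReplicas(6, 5)) // 5
}
```

The second scenario ("my app isn't even working") still needs an execution signal, since no partition count can distinguish a healthy consumer from a broken one.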

Issue with api-resources and custom.metrics.k8s.io

After deploying via the current master branch deploy, things seem to set up OK, but when I run kubectl api-resources I get the following:

error: unable to retrieve the complete list of server APIs: custom.metrics.k8s.io/v1beta1: the server could not find the requested resource

I tried on a fresh install and got the same thing.

Client Version: version.Info{Major:"1", Minor:"13", GitVersion:"v1.13.4", GitCommit:"c27b913fddd1a6c480c229191a087698aa92f0b1", GitTreeState:"clean", BuildDate:"2019-03-01T23:34:27Z", GoVersion:"go1.12", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.9", GitCommit:"16236ce91790d4c75b79f6ce96841db1c843e7d2", GitTreeState:"clean", BuildDate:"2019-03-25T06:30:48Z", GoVersion:"go1.10.8", Compiler:"gc", Platform:"linux/amd64"}

HTTP integration into Keda tooling (func)

People deploying a function set to consume HTTP events or a function set to consume queue events should work through the same publishing flow.

I understand from a meta-point of view, the decision on where or how to handle HTTP scale-to-zero and scale-to-n is slightly different than our non-HTTP triggers, and there are concepts like endpoint controllers, and forwarding of events. In addition, there are a few options on how to deal with this (Osiris, Knative, even some discussion about supporting this on the HPA directly).

I think the short-term desired action is for the tooling to be able to integrate with and help establish an HTTP pipeline without burdening the user. The default should likely be something with few dependencies, like Osiris, but I think it makes sense to make it configurable or extensible if people want a deployment that works with Knative eventing. For now, I'm scoping this issue to Osiris support.

Func cli command to generate deployment artifacts

Using the functions command line tools, I should be able to deploy a function project and have the necessary artifacts generated. This would include:

  • Ability to create scale controller resources per cluster (either idempotently or an explicit gesture once per cluster) (Needs linked issue)
  • Ability to generate the appropriate ScaledObjects based on the function.json metadata of the function (Needs linked issue)

Keep track of k8s api server usage

This issue is to examine KEDA's CPU and Memory implications in regards to the Kubernetes API server.
This is to ensure that KEDA does not cause over-utilization of the API server, as it communicates bidirectionally with the API server (the KEDA controller being asked for metrics, and looking up Kubernetes objects using the Kubernetes API).

Have helm chart delete CRD or add option to not create CRD

Deleting a helm release of keda and then re-installing currently fails, as helm does not delete CRDs by default. There are a few possible workarounds:

  1. Add a values file flag that sets whether or not CRDs are created
  2. Add a hook that cleans up the ScaledObject CRD post-deletion
  3. Add a hook that cleans up the ScaledObject CRD pre-install (wipes old versions if they exist)

We might want to do 1 plus one of 2 or 3. Since deleting the ScaledObject CRD will, as far as I know, delete deployed ScaledObjects, adding option 1 and making either 2 or 3 dependent on that value gives users a way to make other edits without touching their deployed ScaledObjects.

I haven't checked the func cli behavior for this issue, but the func cli also doesn't offer a command to delete/remove keda from a cluster, so at that point it's manual anyway.

Add information to NOTES.txt in helm chart

Add installation notes, next steps, methods to see output, etc. to NOTES.txt: general information that would be useful to someone who has just installed the chart.

Most likely: a note about configuring ScaledObjects, a note about seeing which external metrics are in use via the apis/external.metrics... link, how to see current HPAs, etc.

Reduce level of logs

We're seeing noisy logs from the service controller: lots of requests to create Kubernetes resources, and lots of custom and external metrics requests and responses. It would be good if we could move those to debug level.

/cc @Aarthisk

Consider updating repo layout to align with operator-sdk or kubebuilder

This repo is based on the Kubernetes sample-controller, which is a bit outdated in its layout. Operator SDK and kubebuilder are both tools with a more modern approach to generating Kubernetes controllers. Specifically, if this repo ever evolves to contain more than one API or controller reconcile loop, the scaffolded layouts of those two tools are a more flexible approach.

I'm not sure that's needed; it depends on where we'd like to land the Knative PodAutoscaler implementation: in this repo itself, in Knative, or in a third new repo.

Align func core tools & trigger naming conventions

There are mismatched naming conventions for triggers when moving between func core tools, function container env vars, and Kore Go files.

ex: storage queue functions have trigger type queueTrigger in Functions, but type azure-queue in Kore.

If something in func core tools is deliberately standardizing those names for the k8s setting,

  1. that should be documented somewhere for cases where someone writes their own ScaledObject yaml (related: #96 )
  2. serviceBusTrigger should also be standardized by func core tools when creating a scaled object

Kafka trigger - recover broken function instances?

I'm not sure of the right fix for this. I was playing with the Kafka trigger again today; here's the cycle:

  1. Created a kafka topic with a single partition
  2. Deploy a function with KEDA. KEDA activated, the first function locked the partition
  3. KEDA kept scaling out (which is fine for now) until I had 4 instances. Only 1 was active (the first one). Once it caught up KEDA scaled down to 1 instance.

However, at this point the instance that was left was one of the additional instances that never got a lock. When checking the logs for that function, it was more or less dead.

info: Host.General[0]
Host lock lease acquired by instance ID '000000000000000000000000448490CC'.
fail: Host.Triggers.Kafka[0]
kafka-cp-kafka-headless:9092/bootstrap: Failed to resolve 'kafka-cp-kafka-headless:9092': Temporary failure in name resolution (after 5298ms in state CONNECT)
fail: Host.Triggers.Kafka[0]
1/1 brokers are down

I'm not sure if I really had a reliability issue, or if this was one of the ones that didn't have an available partition.

In my mind a few thoughts:

  1. Should the Kafka trigger keep retrying to connect if it fails? I assume the runtime in general doesn't do this?
  2. Should Kubernetes know that this function is in a dead state so it can apply the CrashLoopBackOff cycle and restart it? If so, is there an existing health probe we should be hooking up?

I realize this isn't really a KEDA issue, but I didn't know where else to put it.

/cc @ahmedelnably @fabiocav would be interested to get your thoughts here

core-tools kubernetes commands design

Install:

func kubernetes install kore
func kubernetes install osiris

# Other options:
# --namespace     Default: "default"
# --version       Default: latest
# --chart-option  Default: none.

Remove

func kubernetes remove kore 
func kubernetes remove osiris

# Other options:
# --namespace     Default: "default"

Update

func kubernetes update kore
func kubernetes update osiris

# Other options:
# --version    Default: latest

Deploy

# Auto-build container
func kubernetes deploy --name {name} --registry ahmelsayed.azurecr.io
func kubernetes deploy --name {name} --registry DockerHubUser

# Use image
func kubernetes deploy --name {name} --image-name {image-name}

# Other options:
# --image-pull-secret        Default: none.
# --enable-osiris            Default: true if the function app has any http trigger
# --function-per-deployment  Default: false
# --secrets-name             Default: generate a secrets collection.

Generate

# Use image
func kubernetes generate --name {name} --image-name {image-name}

# Other options:
# --image-pull-secret        Default: none.
# --enable-osiris            Default: true if the function app has any http trigger
# --function-per-deployment  Default: false
# --output                   Default: yaml. Options: yaml, json

Core tools generation fixes

Per my investigation, I ran into two issues in the core tools. Creating them here for detail instead of in the core tools repo.

  • ScaledObject needs to be deployed after the Deployment. Today, running func kubernetes deploy --dry-run serializes the ScaledObject first.
  • The ScaledObject needs a deploymentName label, whose value should be the name of the Deployment, to hook up the custom metrics adapter.
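The first fix amounts to ordering the generated manifests by kind before serializing. An illustrative sketch; the kind priority table here is hypothetical, not the core tools' actual code:

```go
package main

import (
	"fmt"
	"sort"
)

// kindOrder assigns an ordering priority so dependencies are serialized
// before the resources that reference them.
var kindOrder = map[string]int{
	"Secret":       0,
	"Deployment":   1,
	"ScaledObject": 2, // must come after the Deployment it targets
}

// sortManifests returns the kinds in deploy order without mutating the
// input; the stable sort preserves relative order within the same kind.
func sortManifests(kinds []string) []string {
	sorted := append([]string(nil), kinds...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return kindOrder[sorted[i]] < kindOrder[sorted[j]]
	})
	return sorted
}

func main() {
	fmt.Println(sortManifests([]string{"ScaledObject", "Deployment", "Secret"}))
	// [Secret Deployment ScaledObject]
}
```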

/cc @ahmelsayed @Aarthisk

[Scaler] Kafka

Creates the appropriate metrics and drives logic for scaling Kafka event sources.
