acouvreur / sablier

Start your containers on demand, shut them down automatically when there's no activity. Docker, Docker Swarm Mode and Kubernetes compatible.

Home Page: https://acouvreur.github.io/sablier/

License: GNU Affero General Public License v3.0

Go 80.88% JavaScript 1.58% Dockerfile 0.43% Makefile 1.05% HTML 10.95% Shell 5.11%
traefik-plugin traefik docker docker-swarm plugin kubernetes nginx podman hacktoberfest

sablier's Introduction

Sablier

Free and open-source software that starts workloads on demand and stops them after a period of inactivity.

Whether you don't want to overload your Raspberry Pi, or your QA environment is used only once a week and wastes resources by keeping workloads up and running, Sablier is a project that might interest you.

🎯 Features

📝 Documentation

See the documentation at https://acouvreur.github.io/sablier/

Contributors

  • Alexis Couvreur: 💬 🐛 💻 📖 💡 🤔
  • Matthias Schneider: 💻 📖 👀
  • Alexandre HILTCHER: 💻 🤔
  • tandy1000: 📖 🤔
  • Sam R.: 📖
  • Stanislas Bruhière: 💻 🤔
  • Jenn Wheeler: 💻 🤔
  • Romain Duminil: 💻 🤔

sablier's People

Contributors

acouvreur, massimeddu-sj, mschneider82, mscreations, nastaliss, patcher-ms, renovate[bot], romdum, sam-r, semantic-release-bot, sourgrasses, tandy-1000, tomaszduda23, valexz

sablier's Issues

Scaling down to zero does not work the first time

Scaling down to zero does not work if there has been no HTTP request to the service at all. For example, while the cluster is being created, all pods stay up. Only after the first HTTP request is the deployment scaled down to zero.

Feature Request: option to block request until container loads

Thank you for this cool idea! I am hoping to use it to reduce the load on a server that has a few rarely-used containers.

It would be great to have a feature that simply keeps the request open until the container is up and running, without a loading screen.

A great use case for this is a simple container I run that serves a single json file (a .wellknown file). If I put the ondemand plugin on this, it would break clients as they don't know how to parse the loading screen. However, if it simply blocked until the container is up and running, it would work.

An additional parameter for "wait" time after a container starts may be necessary if it takes some time for the container to be ready to serve an image.
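
The blocking behaviour requested here could be sketched roughly as follows. This is only an illustration in Python, not the plugin's implementation: `wait_until_ready`, `is_ready`, `timeout` and `poll_interval` are hypothetical names, and the readiness check is assumed to be supplied by the caller (for example, a probe against the container).

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout: float = 60.0,
    poll_interval: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Block until is_ready() returns True or the timeout elapses.

    Returns True when the instance became ready in time, False otherwise.
    The request would be held open for the whole duration of this call.
    """
    deadline = clock() + timeout
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_interval)
```

A caller would forward the original request only when this returns True, and answer with an error status otherwise. The extra "wait" time mentioned above would amount to one more sleep after the readiness check succeeds.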

[Traefik Plugin Catalog] Plugin Analyzer has detected a problem.

The plugin was not imported into Traefik Plugin Catalog.

Cause:

failed to read manifest content: yaml: line 6: did not find expected key

Traefik Plugin Analyzer will restart when you will close this issue.

If you believe there is a problem with the Analyzer or this issue is the result of a false positive, please contact us.

Improve logging

Add logging that helps with debugging, such as more debug-level logs that report how long it took to spin up an instance.

ondemand and traefik IngressRoute

I've been trying to get this working on k8s for quite some time, but all I see is the error page.

So I am curious:

  • Does this work with Traefik IngressRoute?
  • The image "gchr.io/acouvreur/traefik-ondemand-service" does not exist. Is "acouvreur/traefik-ondemand-service" up to date?

Does anyone have an actual example of how this works with k8s?

[Traefik Plugin Catalog] Plugin Analyzer has detected a problem.

The plugin was not imported into Traefik Plugin Catalog.

Cause:

the import "github.com/acouvreur/sablier/plugins/traefik" must be related to the module name "github.com/acouvreur/sablier/v2"

Traefik Plugin Analyzer will restart when you will close this issue.

If you believe there is a problem with the Analyzer or this issue is the result of a false positive, please contact us.

Showing error page instead of loading page

Hi Acouvreur,

Great service. Love the plugin and the service. I do however have a problem, that the plugin loads the error page every time, instead of the loading page.
Setup. I start all my services using a .sh script, calling docker compose files. Therefore using docker classic...

Error:

Error loading Who am I.

There was an error loading your instance.
Get "http://ondemand:10000?name=who&timeout=1m0s": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

The service does seem to respond correctly:

09/23/2022 8:37:11 AM
time="2022-09-23T08:37:11+02:00" level=info msg="Scaling up who to 1"
09/23/2022 8:38:27 AM
time="2022-09-23T08:38:27+02:00" level=info msg="Scaling down who to 0"

Below snippets from dynamic-conf in Traefik. (http/services/middlewares are missing. This is only snippets of the specific conf.)

    chain-who:
      chain:
        middlewares:
          - odm-who
          - chain-default

    odm-who:
      plugin:
        traefik-ondemand-plugin:
          serviceUrl: http://ondemand:10000
          name: who
          displayname: "Who am I"
          timeout: 1m
          waitui: true

    who:
      loadBalancer:
        servers:
        - url: "http://who:80"

    who:
      rule: Host("who.domain.tld")
      entryPoints:
        - "websecure"
      middlewares:
        - chain-who
      service: "who"
      tls: true

I have tried adding below to my static conf.

      - "--serversTransport.forwardingTimeouts.dialTimeout=30s"
      - "--serversTransport.forwardingTimeouts.idleConnTimeout=30s"

But this didn't make any difference. Do you have any idea what might be causing the client timeout? All my traffic is sent through Cloudflare Argo tunnels, but this seems to be happening only server-side?

Feature Request: Support multiple containers

I would love to use this on a bunch of different rarely-used containers. It seems like today in order to use the ondemand plugin, I would have to create a new ondemand container per use case. Ideally I am reducing the number of containers running at a given time, but this would actually increase.

Is this actually possible given the information at request-time?

Ondemand container is not setting the variables I inject at docker run

Hi,

I tried to deploy your Docker image for onDemand, but when I pass the variable to disable swarmMode, I always get swarmMode true in the logs.
Here is the command I used to deploy your image:

docker run -d --name='ondemand' --net='xxx' \
  -e TZ="Europe/Berlin" \
  -e 'swarmode'='false' -e 'swarmMode'='false' -e 'kubernetesMode'='true' \
  -e '--swarmMode'='false' -e '--swarmode'='false' \
  -l net.unraid.docker.managed=dockerman \
  -l 'traefik.http.middlewares.ondemand.plugin.traefik-ondemand-plugin.name'='radarr' \
  -l 'traefik.http.middlewares.ondemand.plugin.traefik-ondemand-plugin.serviceUrl'='http://ondemand:10000' \
  -l 'traefik.http.middlewares.ondemand.plugin.traefik-ondemand-plugin.timeout'='1m' \
  -l 'traefik.http.services.ondemand.loadbalancer.server.port'='10000' \
  -l 'traefik.enable'='true' \
  -p '10000:10000/tcp' \
  -v '/var/run/docker.sock':'/var/run/docker.sock':'rw' \
  -v '/mnt/user/appdata/ondemand':'/storagePath':'rw' \
  'acouvreur/traefik-ondemand-service:latest'

code-server on demand isn't working as expected

Hello,

Great plugin, but I have a problem: code-server does not keep the container active. Let me explain.

The plugin always stops the container exactly 1 minute after startup, even while I am writing code. When this happens (code-server shutting down while I'm using it), it doesn't come back up if I try to reach it via its URL with the Traefik host rule.
Is this supposed to happen? Can the plugin work the way I'm expecting with vscode?

docker-compose.yml :

version: "3.3"

services:
  traefik:
    image: "traefik:latest"
    container_name: traefik
    command:
      - --api=true
      - --api.insecure=true
      - --entrypoints.web.address=:80
      - --providers.docker
      # plugin: ondemand
      - --experimental.plugins.traefik-ondemand-plugin.moduleName=github.com/acouvreur/traefik-ondemand-plugin
      - --experimental.plugins.traefik-ondemand-plugin.version=v1.2.0
      - --providers.file.filename=/etc/traefik/dynamic-config.yml
      - --log.level=ERROR
    ports:
      - "80:80"
      - "8080:8080"
    networks:
      - traefik_local
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
      - "/root/dynamic-config.yml:/etc/traefik/dynamic-config.yml"
    labels:
      - "traefik.http.routers.http-catchall.rule=hostregexp(`{host:.+}`)"
      - "traefik.http.routers.http-catchall.entrypoints=web"

  portainer:
    image: portainer/portainer-ce:latest
    container_name: portainer
    command: -H unix:///var/run/docker.sock
    networks:
      - traefik_local
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - portainer_data:/data
    restart: always
    labels:
      # Frontend
      - "traefik.enable=true"
      - "traefik.http.routers.frontend.rule=Host(`portainer.fritz.box`)"
      - "traefik.http.routers.frontend.entrypoints=web"
      - "traefik.http.services.frontend.loadbalancer.server.port=9000"
      - "traefik.http.routers.frontend.service=frontend"

      # Edge
      - "traefik.http.routers.edge.rule=Host(`edge.fritz.box`)"
      - "traefik.http.routers.edge.entrypoints=web"
      - "traefik.http.services.edge.loadbalancer.server.port=8000"
      - "traefik.http.routers.edge.service=edge"

  ondemand:
    image: ghcr.io/acouvreur/traefik-ondemand-service:latest
    container_name: ondemand
    command: 
      - --swarmMode=false
    networks:
      - traefik_local
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.ondemand.name=ondemand"
      - "traefik.http.middlewares.ondemand.plugin.traefik-ondemand-plugin.name=code-server"
      - "traefik.http.middlewares.ondemand.plugin.traefik-ondemand-plugin.serviceUrl=http://ondemand:10000"
      - "traefik.http.middlewares.ondemand.plugin.traefik-ondemand-plugin.timeout=1m"
      - "traefik.http.services.ondemand.loadbalancer.server.port=10000"

  code-server:
    image: lscr.io/linuxserver/code-server:latest
    container_name: code-server
    environment:
      - PUID=0
      - PGID=0
      - TZ=Europe/Rome
    ports:
      - 8443:8443
    networks:
      - traefik_local
    volumes:
      - /root/containers/codeserver/config:/config
      - /root:/photonos_vm/root/
    restart: on-failure
    labels:
      # vscode
      - "traefik.enable=true"
      - "traefik.http.routers.code-server.name=code-server"

networks:
  traefik_local:
    name: traefik_local

volumes:
  portainer_data:
    name: portainer_data

dynamic-config.yml :

http:
  services:
    code-server:
      loadBalancer:
        servers:
        - url: "http://code-server:8443"
  routers:
    code-server:
      rule: Host(`vscode.fritz.box`)
      entryPoints:
        - "web"
      middlewares:
        - ondemand@docker
      service: "code-server"

Parameter to set individual loading.go page refresh time

I am testing the beta functionality to start/stop several containers.
I have to start 3 containers, which scale up quickly but take some time to fully load.
Until then, the loading page quickly disappears and a "Bad request" page is shown until everything is fully loaded.

I know that this specific container (and its dependencies) needs 15 seconds to be fully loaded.
It would be great to have a parameter (similar to "displayname") to set the refresh time separately for each middleware. I'd keep "5" as the default value.
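
To make the request concrete, here is a minimal sketch of a per-middleware refresh setting with "5" as the default. It is written in Python purely for illustration and is not how the plugin's loading page is built: `LoadingPageOptions`, `refresh_seconds` and `render_loading_page` are hypothetical names, and the use of an HTML meta refresh is an assumption.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class LoadingPageOptions:
    """Hypothetical per-middleware options for the loading page."""

    display_name: str
    refresh_seconds: int = 5  # default value suggested in the request


def render_loading_page(opts: LoadingPageOptions) -> str:
    """Return a minimal HTML page that reloads itself after refresh_seconds."""
    name = escape(opts.display_name)
    return (
        "<!DOCTYPE html><html><head>"
        f'<meta http-equiv="refresh" content="{opts.refresh_seconds}">'
        f"<title>Loading {name}</title></head>"
        f"<body><p>{name} is starting, please wait.</p></body></html>"
    )
```

With this shape, the slow stack described above would be configured as `LoadingPageOptions("my-app", refresh_seconds=15)`, while every other middleware keeps the default.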

Thanks for developing this great plugin!

Update JSON API

The API is a bit inconsistent. Change it to camelCase and add documentation for it.

Error: plugin: unknown plugin type: traefik-ondemand-plugin

Hello there,

I am trying to play with and understand this plugin, and tried to run the example for classic Docker. Given that the README already led to errors, I simply started the docker-compose, and it leads to the following error:

level=error msg="plugin: unknown plugin type: traefik-ondemand-plugin" entryPointName=http routerName=whoami@file

I raised the version to 1.2.0 and am trying to understand how to load the plugin correctly but any help would be greatly appreciated.

Allow to start/stop systemd services

Hi there, I really like using your plugin, but some of my services no longer run in Docker (I use Proxmox).

If acouvreur/traefik-ondemand-service runs outside of the Docker image, it could start and stop systemd services directly on the host, given some changes to the code.
That would let anybody who doesn't run Traefik inside Docker use this great plugin, and open it up to many more use cases (start a VM or LXC container, start a local httpd server, start a game server like Minecraft, ...).

Because a custom URL can be set in each middleware, it would also be possible to run the ondemand service directly inside the VM, where it can start the service with systemctl start httpd.service.
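
As a rough illustration of the idea, a systemd-backed provider could be little more than a wrapper around `systemctl`. The sketch below is Python, not the service's Go code, and `SystemdProvider`, `run_command` and the injectable `runner` are hypothetical names. Only the `systemctl start`, `systemctl stop` and `systemctl is-active --quiet` commands are real.

```python
import subprocess
from typing import Callable, Sequence

# A runner executes a command and returns its exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(cmd: Sequence[str]) -> int:
    """Run a command on the host and return its exit code."""
    return subprocess.run(list(cmd), check=False).returncode


class SystemdProvider:
    """Hypothetical provider that starts and stops systemd units."""

    def __init__(self, runner: Runner = run_command) -> None:
        self.runner = runner

    def start(self, unit: str) -> bool:
        return self.runner(["systemctl", "start", unit]) == 0

    def stop(self, unit: str) -> bool:
        return self.runner(["systemctl", "stop", unit]) == 0

    def is_active(self, unit: str) -> bool:
        # `systemctl is-active --quiet` exits with 0 only when the unit is active.
        return self.runner(["systemctl", "is-active", "--quiet", unit]) == 0
```

Calling `SystemdProvider().start("httpd.service")` would then play the role that scaling a container to 1 plays today. The process would need enough privileges to manage units on the host.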

Possibility to use for API

Thanks for this great plugin! I'd like to have an option for an API to just delay the request until the container has been started. Right now it responds with HTML, which is not nice for a JSON API. Is this something that is remotely implementable? I'd like to hear!

Ability To Add Dynamic Loading

First off, I think that this plugin is great and I'd love to use it in "production", but the only thing stopping me is that I don't like that it requires an extra step on the user's end (the refresh).

I would like to propose the following two options to see if either is available or is even possible with a setup like this:

1.) Do not load a page until the underlying service is available. I would almost prefer this than having another step for the user, but depending on the startup time, the user may think that the server might not reply and get impatient. Therefore, I'm not sure if this is a great solution.

2.) I would much rather prefer this solution. Say you have one container that stays on continuously in order to serve a static page. This same container can be used across all services that implement this on demand service. On this webpage, it would have a loading animation almost similar to Cloudflare's "checking your browser" page. Then, there could be some simple JavaScript that can detect if the real web server is up and automatically refresh the page for the user once it is.

With that said, I understand this is a lot for a community open-source project and I understand if this just isn't possible, but I just wanted to share my thoughts on this as I think it's already a really great idea.

Thanks!

Inquiry for use-case when nodepool is also scaled to zero

I need to deploy a GPU service in GKE with Traefik which will respond to HTTP processing requests (as an API not for users) and most time I need this scaled to zero.
The nodepool will be also configured to scale down to zero when there are no pods deployed in it.

When there is HTTP traffic I'd expect Sablier to intercept the requests and deploy the pods.
Then GKE should be responsible to provision the nodes and download the images in order to deploy the services.
The whole cold-start time may take up to 5 minutes.

When there is no traffic, Sablier should scale the application down to zero and GKE should scale the nodepool down to zero.

Is Sablier suitable for this use-case?

Also, one thing I couldn't figure out from the docs: when there is an HTTP request, how many replicas are deployed? Does Sablier provide a queueing system to monitor HTTP load and scale dynamically, or does it just deploy the configured number of replicas, e.g. 10?

Last but not least is there a minimum Traefik version in order to use Sablier as a plugin with it?

Blocking strategy is not working with Kubernetes on Traefik - 503 Service Unavailable

The Blocking Strategy intercepts the incoming request to a non-existent service, which we allow in Traefik with the flag --allowEmptyServices on the provider (either Swarm or Kubernetes).

But the request seems to be created without a specific endpoint to reach, because the endpoint does not exist yet, and this remains true even after Sablier has ensured that the service is created with a reachable endpoint.

This is probably because the request is already at the middleware level, where the service destination can no longer be modified unless a new request is issued.

I have no idea how to fix that in Traefik; I'd have to find where in the Traefik repo such a request is created.

Note that it works on docker classic. Probably because it is set inside a static configuration.

See https://github.com/acouvreur/sablier/actions/runs/3242670189/jobs/5316271973#step:7:172

--- FAIL: Test_Blocking (0.00s)
    printer.go:54: GET http://localhost:8080/blocking/whoami
    reporter.go:23: 
        	Error Trace:	/home/runner/work/sablier/sablier/e2e/reporter.go:23
        	            				/home/runner/work/sablier/sablier/e2e/chain.go:21
        	            				/home/runner/work/sablier/sablier/e2e/response.go:585
        	            				/home/runner/work/sablier/sablier/e2e/response.go:151
        	            				/home/runner/work/sablier/sablier/e2e/e2e_test.go:39
        	Error:      	
        	            	expected status equal to:
        	            	 "200 OK"
        	            	
        	            	but got:
        	            	 "503 Service Unavailable"
        	Test:       	Test_Blocking
FAIL
FAIL	github.com/acouvreur/sablier/e2e	0.009s
FAIL

problems downloading V2.0.0

Hi,
I have some problems...

--experimental.plugins.sablier.modulename=github.com/acouvreur/sablier/plugins/traefik --experimental.plugins.sablier.version=v2.0.0
gives error

time="2022-10-01T21:29:50+02:00" level=error msg="Plugins are disabled because an error has occurred." error="failed to download plugin github.com/acouvreur/sablier: error: 500: {\"error\":\"Failed to get plugin github.com/acouvreur/sablier@v2.0.0\"}\n"

Then I tried

--experimental.plugins.sablier.modulename=github.com/acouvreur/sablier --experimental.plugins.sablier.version=v2.0.0

gives

time="2022-10-01T21:23:00+02:00" level=error msg="Plugins are disabled because an error has occurred." error="failed to download plugin github.com/acouvreur/sablier: error: 500: {\"error\":\"Failed to get plugin github.com/acouvreur/sablier@v2.0.0\"}\n"

Okay... Back to orig setup...

--experimental.plugins.sablier.modulename=github.com/acouvreur/traefik-ondemand-plugin --experimental.plugins.sablier.version=v1.3.0

But still error...

time="2022-10-01T21:27:33+02:00" level=error msg="Plugins are disabled because an error has occurred." error="failed to check archive integrity of the plugin github.com/acouvreur/traefik-ondemand-plugin: plugin integrity check failed"

Any ideas how to proceed?

Websocket traffic is not detected as activity

Hello,
I'm using webtop / guacamole as ondemand service.
If, after an initial HTTP request, the only traffic is over a WebSocket channel, it is not counted as activity. After the configured timeout the service is scaled down (due to "inactivity"), even though the connection is actively in use.
Is there some configuration setting I'm missing to consider this?
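
The behaviour described is consistent with a session whose expiry is only pushed forward when a new HTTP request arrives. The sketch below models that assumption in Python. It is not taken from the service's code, and `Session`, `touch` and `expired` are hypothetical names.

```python
class Session:
    """Tracks when an instance should be considered idle."""

    def __init__(self, timeout: float, now: float) -> None:
        self.timeout = timeout
        self.expires_at = now + timeout

    def touch(self, now: float) -> None:
        """Record activity: push the expiry forward by one full timeout."""
        self.expires_at = now + self.timeout

    def expired(self, now: float) -> bool:
        return now >= self.expires_at


# A single HTTP request at t=0 opens a 60 s session. WebSocket frames at
# t=30 and t=59 never reach touch(), so the session lapses at t=60.
upgrade_only = Session(timeout=60, now=0)
print(upgrade_only.expired(now=61))  # True

# If each frame (or a periodic heartbeat) counted as activity instead:
with_heartbeat = Session(timeout=60, now=0)
with_heartbeat.touch(now=30)
with_heartbeat.touch(now=59)
print(with_heartbeat.expired(now=61))  # False
```

Under this model, a long-lived connection stays alive only if something calls `touch` while it is open, which a middleware that sees just the initial upgrade request cannot do on its own.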

Best regards
Sven

Rename the project

Rename the project to something like Sablier ?

The goal is to detach from being a traefik plugin, but rather be an API that can be plugged to reverse proxies via their integration plugins

Use of custom pages - how to?

Hi!

What command do I need to add to the plugin container to use custom pages?

I don't quite understand, based on the description there is only a block, can I use labels in the plugin container, if so which ones?

Cannot use service labels?

Hi,

Impressive plugin.

Can you clarify this

Cannot use service labels

I normally attach traefik labels under deploy of a compose file for swarm.

These hold the domain name, service port, etc. These are still supported, right?

Scale down/up multiple containers with one middleware

Hi and thank you for developing this plugin !
I tested it on my infrastructure and it works perfectly. However some of the apps I want to scale down / up are composed of up to 10 Kubernetes deployments. This makes scaling up painfully slow as some of them take some time to start, and having 10 middlewares means the deployments start one by one.
Having so many deployments also makes managing all the middlewares finicky.
I've started working on a modification on this plugin to take multiple resources as an input.
I'll keep you posted on my progress.
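
The difference between ten middlewares that start deployments one by one and a single middleware that takes several resources can be sketched like this. The code is an illustration in Python, not the modification mentioned above: `start_all` is a hypothetical name, and `start` stands in for whatever call scales one deployment up and waits for it.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def start_all(names: Iterable[str], start: Callable[[str], bool]) -> Dict[str, bool]:
    """Start every named resource concurrently and report success per name."""
    names = list(names)
    if not names:
        return {}
    # One worker per resource, so the slowest start bounds the total wait
    # instead of the sum of all start times.
    with ThreadPoolExecutor(max_workers=len(names)) as pool:
        results = pool.map(start, names)  # results come back in input order
        return dict(zip(names, results))
```

The loading page would then be shown until every entry in the returned mapping is True.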

[Traefik Plugin Catalog] Plugin Analyzer has detected a problem.

The plugin was not imported into Traefik Plugin Catalog.

Cause:

failed to run the plugin with Yaegi: the load of the plugin takes too much time(10s), or an error, inside the plugin, occurs during the load: 1:21: import "github.com/acouvreur/sablier/v2/plugins/traefik" error: unable to find source related to: "github.com/acouvreur/sablier/v2/plugins/traefik"

Traefik Plugin Analyzer will restart when you will close this issue.

If you believe there is a problem with the Analyzer or this issue is the result of a false positive, please contact us.

[Traefik Pilot] Traefik Plugin Analyzer has detected a problem.

The plugin was not imported into Traefik Pilot.

Cause:

failed to run with Yaegi: plugin: failed to create a new plugin instance: name cannot be null

Traefik Plugin Analyzer will restart when you will close this issue.

If you believe there is a problem with the Analyzer or this issue is the result of a false positive, please contact us.

Display name for the loading page

It would be nice to specify a custom display name to show on the loading page. Currently for Kubernetes you see an internal object name with format "namespace-middlewarename@kubernetescrd". I'm sure a nicer name could be achieved with the custom page feature, but a simple property in the middleware config would be nice.

Docs have issues

Your docs on the Traefik plugin page are misleading:

--swarmode instead of -swarmMode

Error with API version: "client version 1.41 is too new"

Hi!

Can you help me understand this?

      # On demand plugin
      - --experimental.plugins.traefik-ondemand-plugin.moduleName=github.com/acouvreur/traefik-ondemand-plugin
      - --experimental.plugins.traefik-ondemand-plugin.version=v1.2.0
  traefik-ondemand:
    image: ghcr.io/acouvreur/traefik-ondemand-service:latest

What is the problem? Thanks.

Add cache with short TTL once the service is started

Plugins that hit the API on every request should have some kind of cache.
Should it live inside the plugin, or directly inside the service?

If you make 50 requests within a few seconds and the service is already started, these checks are useless.

This works as long as the TTL is lower than the timeout.
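
A minimal sketch of such a cache, written in Python for illustration only: `ReadyCache`, `check_ready` and `query_api` are hypothetical names, and `query_api` stands in for the call to the service. Only a positive "ready" answer is cached, so an instance that is still starting is always re-checked.

```python
import time
from typing import Callable, Dict


class ReadyCache:
    """Remembers, for a short TTL, that an instance was recently seen ready."""

    def __init__(self, ttl: float, clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl = ttl
        self.clock = clock
        self._expiry: Dict[str, float] = {}  # instance name -> cache expiry time

    def is_fresh(self, name: str) -> bool:
        expiry = self._expiry.get(name)
        return expiry is not None and self.clock() < expiry

    def mark_ready(self, name: str) -> None:
        self._expiry[name] = self.clock() + self.ttl


def check_ready(name: str, cache: ReadyCache, query_api: Callable[[str], bool]) -> bool:
    """Return readiness, skipping the API call while the cached answer is fresh."""
    if cache.is_fresh(name):
        return True
    ready = query_api(name)
    if ready:
        cache.mark_ready(name)
    return ready
```

The TTL constraint follows from the fact that a cache hit never reaches the service: if, as assumed here, each API call is also what renews the instance's session, the TTL has to stay below the session timeout so that a real call happens before the instance is stopped.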

Configure scale number for Swarm service

Hi, thanks for this plugin!

Is it possible to configure the number of containers to scale in a docker swarm service? For example, upon receiving the first request, I know a lot of other requests will follow and want to scale to 20 replicas as opposed to just one.

Detection for manually stop/start services and resetting spin-down timer

I have a service that is set to spin down after 10 minutes and I've noticed that if I take the service offline and back online during that time frame via docker stack rm and deploy, I won't be able to access the service until the spin-down timer has been reached. I am currently running this plugin in local mode instead of using Pilot (not sure if this issue applies if deployed using Pilot).

Is there a way to manually reset the spin-down timer?
