
katanaml / katana-skipper


Simple and flexible ML workflow engine

Home Page: https://katanaml.io/

License: Apache License 2.0

Python 50.17% Dockerfile 2.11% Shell 4.87% PureBasic 29.86% JavaScript 12.99%
katana pipeline orchestration tensorflow machine-learning kubernetes k8s docker ingress docker-compose

katana-skipper's Introduction

Katana ML Skipper


This is a simple and flexible ML workflow engine. It helps to orchestrate events across a set of microservices and create an executable flow to handle requests. The engine is designed to be configurable with any set of microservices. Enjoy!

Skipper

The Engine and Communication parts are generic and can be reused. A group of ML services is provided for sample purposes; you should replace it with your own services. The current group of ML services works with Boston Housing data. The data service fetches Boston Housing data and converts it into a format suitable for TensorFlow model training. The training service builds the TensorFlow model. The serving service is scaled to two instances and serves prediction requests.

One of the services, mobilenetservice, shows how to use a JavaScript-based microservice with Skipper. This allows you to use containers with various programming languages: Python, JavaScript, Java, etc. You can run ML services with Python frameworks, Node.js, or any other stack of your choice.

Author

Katana ML, Andrej Baranovskij

Instructions

Start/Stop

Docker Compose

Start:

docker-compose up --build -d

This will start Skipper services and RabbitMQ.

Stop:

docker-compose down

Web API FastAPI endpoint:

http://127.0.0.1:8080/api/v1/skipper/tasks/docs
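
Once the stack is up, you can submit a task to the Web API. Below is a rough sketch using Python requests; the exact task endpoint path, task type, and payload shape are assumptions (the request model is assumed to carry task_type and payload), so check the interactive docs above for the actual schema:

import requests

# Submit a task to the Skipper Web API (endpoint path and payload shape are assumptions).
response = requests.post(
    'http://127.0.0.1:8080/api/v1/skipper/tasks/',
    json={
        'task_type': 'serving',      # assumed task type handled by the sample serving service
        'payload': {'data': []},     # assumed payload shape, replace with real model input
    },
)
print(response.status_code)
print(response.json())               # response typically carries task_id, task_status, outcome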

Kubernetes

NGINX Ingress Controller:

If you are using a local Kubernetes setup, install the NGINX Ingress Controller.

Build Docker images:

docker-compose -f docker-compose-kubernetes.yml build

Setup Kubernetes services:

./kubectl-setup.sh

The Skipper API endpoint is published through NGINX Ingress (you can set up your own host in /etc/hosts; see the example entry below):

http://kubernetes.docker.internal/api/v1/skipper/tasks/docs
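
For a local setup, an /etc/hosts entry for this host would look like the line below (replace the host name if you publish the Ingress under your own host):

127.0.0.1    kubernetes.docker.internal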

Check NGINX Ingress Controller pod name:

kubectl get pods -n ingress-nginx

Sample response; copy the name of the 'Running' pod:

NAME                                       READY   STATUS      RESTARTS   AGE
ingress-nginx-admission-create-dhtcm       0/1     Completed   0          14m
ingress-nginx-admission-patch-x8zvw        0/1     Completed   0          14m
ingress-nginx-controller-fd7bb8d66-tnb9t   1/1     Running     0          14m

NGINX Ingress Controller logs:

kubectl logs -n ingress-nginx -f <POD NAME>

Skipper API logs:

kubectl logs -n katana-skipper -f -l app=skipper-api

Remove Kubernetes services:

./kubectl-remove.sh

Components

  • api - Web API implementation
  • workflow - workflow logic
  • services - a set of sample microservices; you should replace these with your own services and update the references in docker-compose.yml
  • rabbitmq - service for RabbitMQ broker
  • skipper-lib - reusable Python library to streamline event communication through RabbitMQ (a rough sketch of the underlying pattern follows this list)
  • skipper-lib-js - reusable Node.js library to streamline event communication through RabbitMQ
  • logger - logger service
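
The skipper-lib and skipper-lib-js libraries wrap plain RabbitMQ consumers and producers. A rough pika-only sketch of the consumer pattern the Python library streamlines is shown below; the queue name and host are assumptions, and the credentials are the default skipper/welcome1 noted in the API URLs section:

import pika

# Connect to the RabbitMQ broker started by docker-compose (default credentials, assumed host).
credentials = pika.PlainCredentials('skipper', 'welcome1')
connection = pika.BlockingConnection(
    pika.ConnectionParameters(host='127.0.0.1', credentials=credentials))
channel = connection.channel()

# Declare the queue this service listens on (the queue name is an assumption).
channel.queue_declare(queue='skipper_sample')

def on_message(ch, method, properties, body):
    # Run the service logic on the task payload, then acknowledge the message.
    print('Received task:', body)
    ch.basic_ack(delivery_tag=method.delivery_tag)

channel.basic_consume(queue='skipper_sample', on_message_callback=on_message)
channel.start_consuming()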

API URLs

  • Web API:
http://127.0.0.1:8080/api/v1/skipper/tasks/docs

If running on local Kubernetes with Docker Desktop:

http://kubernetes.docker.internal/api/v1/skipper/tasks/docs
  • RabbitMQ:
http://localhost:15672/ (skipper/welcome1)

If running on local Kubernetes, make sure port forwarding is enabled:

kubectl -n rabbits port-forward rabbitmq-0 15672:15672

Skipper Library on PyPI

  • skipper-lib is available on PyPI

Skipper Library on NPM

  • skipper-lib-js is available on NPM

Cloud Deployment Guides

  • OKE - deployment guide for Oracle Container Engine for Kubernetes

  • GKE - deployment guide for Google Kubernetes Engine

Usage

You can use the Skipper engine to run the Web API and workflow, and to communicate with a group of ML microservices implemented under the services package.

Skipper can be deployed to any cloud vendor with Kubernetes or Docker support. You can scale the Skipper runtime in the cloud using Kubernetes commands.
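
For example, to add serving instances on Kubernetes (the deployment name below is an assumption; list the real names with kubectl get deployments -n katana-skipper):

kubectl -n katana-skipper scale deployment serving-service --replicas=3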


License

Licensed under the Apache License, Version 2.0. Copyright 2020-2021 Katana ML, Andrej Baranovskij. Copy of the license.

katana-skipper's People

Contributors

abaranovskis-redsamurai, ladrua, xandrade


katana-skipper's Issues

How can we move from docker compose to kubernetes?

Hello Andrej,
I would like to ask how to move from docker-compose to Kubernetes. Do we have to use tools like kompose, or something else? I would appreciate it if you could guide me a little on how to perform this conversion, so that we can run our services on Skipper using Kubernetes instead of docker-compose. Thank you.

Docker-compose up not working

Hi

Thank you for the wonderful katana-skipper.
I am trying to digest the library and execute the docker-compose.yml.
But it seems like it is not working.

Would appreciate it if you could take a look

The difference between event_producer and exchange_producer

Hello,
Thanks for sharing your ML workflow. I would appreciate it if you could explain the difference between event_producer and exchange_producer. event_producer is used to publish an event to RabbitMQ, but exchange_producer is not clear to me. Can't we use event_producer in place of exchange_producer?

Encountering Authentication Issues

When I run the start command with Docker, I get the following error in the data-service container. I would greatly appreciate guidance on how to fix this issue.
data-service (katanaml/data-service, RUNNING)

Traceback (most recent call last):
  File "main.py", line 19, in <module>
    main()
  File "main.py", line 15, in main
    'http://127.0.0.1:5001/api/v1/skipper/logger/log_receiver'))
  File "/usr/local/lib/python3.7/site-packages/skipper_lib/events/event_receiver.py", line 16, in __init__
    credentials=credentials))
  File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 360, in __init__
    self._impl = self._create_connection(parameters, _impl_class)
  File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 451, in _create_connection
    raise self._reap_last_connection_workflow_error(error)
pika.exceptions.AMQPConnectionError

Traceback (most recent call last):
  File "main.py", line 19, in <module>
    main()
  File "main.py", line 15, in main
    'http://127.0.0.1:5001/api/v1/skipper/logger/log_receiver'))
  File "/usr/local/lib/python3.7/site-packages/skipper_lib/events/event_receiver.py", line 16, in __init__
    credentials=credentials))
  File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 360, in __init__
    self._impl = self._create_connection(parameters, _impl_class)
  File "/usr/local/lib/python3.7/site-packages/pika/adapters/blocking_connection.py", line 451, in _create_connection
    raise self._reap_last_connection_workflow_error(error)
pika.exceptions.ProbableAuthenticationError: ConnectionClosedByBroker: (403) 'ACCESS_REFUSED - Login was refused using authentication mechanism PLAIN. For details see the broker logfile.'

Cache EventProducer

I found that caching the EventProducer can improve performance by 40%. I tried it, but it blocks my requests when I increase the load in the speed test. Do you have a suggestion on how to fix that?

How to check if a service is running or not?

I have a question regarding the availability of the created microservices: how can we check whether a service is running or not? For each service, we have an event receiver running via python main.py. What if one service fails to run for some reason, how can we check in code whether a certain service is running or not? If a service is not running, we need to show a message saying that this service is not running. Thank you.

Doc: How to add a new service with a new queue

How do we add a new service with a new queue called translator?

  1. I add a new router with a new path for my new service, defining a new prefix and a tag named translator.
  2. I create a new request model for my new service in models.py containing task_type (expecting the type translator) and a payload.
  3. I define a new service container with the correct variables and set my SERVICE=translator and QUEUE_NAME=skipper_translator

I am able to call the new endpoint and it returns:

task_id: "-", 
task_status: "Success", 
outcome: "<starlette.responses.JSONResponse object at 0x7ff2672dbed0>"

However the container is never triggered.

What am I missing?
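
For reference, the request model described in step 2 looks roughly like this (a simplified sketch; the actual entry in models.py may differ):

from pydantic import BaseModel

# Request model for the new translator service; field names follow the steps above.
class TranslatorRequest(BaseModel):
    task_type: str   # expected to be 'translator'
    payload: dict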

What is the usage of logger component?

Hello, can you provide more detail about the logger service and how to benefit from it, and how to track services in case of an error or failure in some of them? And what is the difference between logger and workflow? My understanding is that both of them are used to log information related to services, but it is not clear to me what kind of logging info each one provides. Thank you.

heartbeat issues

Hi there, I am developing a microservice that takes more than 60 seconds to process some data with AI. Sadly, this causes the RabbitMQ server to disconnect my client for not acking the message within 60 seconds.

How can I modify this timeout setting?

I've tried setting an environment variable in the RabbitMQ docker-compose yml:

  • RABBITMQ_HEARTBEAT=600

but the connection still has its timeout at 60 seconds.

How can we get around this, since the engine is aimed at model processing, which generally takes way more than 60 seconds?

Another way would be to configure the pika connector and set the heartbeat manually, but the library doesn't allow it by default.

I would like to know how everyone gets around this.

Thanks!

rabbitmq-service | 2023-11-21 13:45:42.224 [error] <0.19553.0> closing AMQP connection <0.19553.0> (172.27.0.6:53454 -> 172.27.0.4:5672):
rabbitmq-service | missed heartbeats from client, timeout: 60s
rabbitmq-service | 2023-11-21 13:45:42.225 [info] <0.19791.0> Closing all channels from connection '172.27.0.6:53454 -> 172.27.0.4:5672' because it has been closed
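
For reference, setting the heartbeat directly on the pika connection parameters would look roughly like this (a sketch of the manual approach mentioned above; it assumes the connection can be opened outside the library):

import pika

# Open a connection with a longer heartbeat so long-running model work is not
# disconnected by the broker (600 seconds is just an example value).
credentials = pika.PlainCredentials('skipper', 'welcome1')
params = pika.ConnectionParameters(host='127.0.0.1', credentials=credentials, heartbeat=600)
connection = pika.BlockingConnection(params)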
