fuseml / fuseml

FuseML aims to provide an MLOps framework as the medium dynamically integrating together the AI/ML tools of your choice. It's an extensible tool built through collaboration, where Data Engineers and DevOps Engineers can come together and contribute with reusable integration code.

License: Apache License 2.0

Makefile 2.09% Go 84.06% Shell 12.77% Mustache 0.52% Dockerfile 0.55%
machine-learning mlops devops-engineers data-engineers automation-recipes artificial-intelligence mlflow tekton data-scientists kfserving

fuseml's Introduction


FuseML

Fuse your favourite AI/ML tools together for MLOps orchestration

Build your own custom MLOps orchestration workflows from composable automation recipes adapted to your favorite AI/ML tools, to get you from ML code to inference serving in production as fast as lighting a fuse.

Overview

Use FuseML to build a coherent stack of community shared AI/ML tools to run your ML operations. FuseML is powered by a flexible framework designed for consistent operations and a rich collection of integration formulas reflecting real world use cases that help you reduce technical debt and avoid vendor lock-in.

Inception and Roadmap

FuseML originated as a fork of our sister open source project Epinio, a lightweight open source PaaS built on top of Kubernetes, and has been gradually transformed and infused with the MLOps concepts that make it the AI/ML orchestration tool it is today.

The project is under heavy development, following three main directions:

  1. adding features and enhancements to improve flexibility and extensibility
  2. adding support for more community shared AI/ML tools
  3. creating more composable automation blocks adapted to the existing as well as new AI/ML tools

Take a look at our Project Board to see what we're working on and what's in store for the next release.

Basic Workflow

The basic FuseML workflow can be described as an MLOps type of workflow that starts with your ML code and automatically runs all the steps necessary to build and serve your machine learning model. FuseML's job begins when your machine learning code is ready for execution.

  1. install FuseML in a kubernetes cluster of your choice (see Installation Instructions)
  2. write your code using the AI/ML library of your choice (e.g. TensorFlow, PyTorch, SKLearn, XGBoost)
  3. organize your code using one of the conventions and experiment tracking tools supported by FuseML
  4. use the FuseML CLI to push your code to the FuseML Orchestrator instance and, optionally, supply parameters to customize the end-to-end MLOps workflow
  5. from this point onward, the process is completely automated: FuseML takes care of all aspects that involve building and packaging code, creating container images, running training jobs, storing and converting ML models in the right format and finally serving those models
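The steps above can be sketched as a short shell session. The exact fuseml subcommands and flags shown here are illustrative only and not verified against a released binary, so a dry-run helper prints each command instead of executing it:

```shell
# Dry-run sketch of the basic workflow; 'run' echoes commands instead of
# executing them, because the fuseml CLI syntax here is illustrative only.
run() { echo "+ $*"; }

run fuseml install                  # step 1: install into a Kubernetes cluster
run git add MLproject conda.yaml    # steps 2-3: code organized for a supported tracker
run fuseml push my-model            # step 4: push code to the FuseML Orchestrator
run fuseml apps                     # step 5: the rest of the pipeline is automated
```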

Supported 3rd Party Tools

Experiment Tracking and Versioning

  • MLFlow
  • DVC (TBD)

Model Training

  • MLFlow
  • DeterminedAI (TBD)

Model Serving and Monitoring

  • KServe
  • Seldon Core (coming soon)
  • Knative Serving (coming soon)

Project Layout

This repository contains the code for the FuseML installer and is the main project repository. Other repositories of interest are:


fuseml's Issues

[Docs] - FuseML Windows binary missing sh and openssl

Apparently this is a hard requirement to run FuseML under Windows.

What is happening

When fuseml.exe is executed, the user receives the following error:

PS C:\Users\AlessandroFesta\Downloads> .\fuseml.exe
πŸ”₯ Not found: sh
πŸ”₯ Not found: openssl
Cannot operate: Please check your PATH, some of our dependencies were not found

What is expected

The user should get the FuseML help as output.

Suggested fix

Option A:
Document user strict requirements in a "Quick Start Guide"

Option B:
Remove the windows binary until next release

[Epic] - DataScientist - MlOperator collaboration flow

Following Google's definition of the collaboration between:

  • Data Engineer (responsible for data preparation and ingestion)
  • Data Scientist (responsible for developing a model to be trained and for verifying that the model satisfies project requirements such as precision, accuracy, and performance)
  • ML Operator (responsible for gluing together the data and the model, and for creating an end-to-end workflow that trains the model and serves it for inference to the final application)

MlOps Google

We may define the ideal flow to satisfy the following steps:

  1. Data extraction: You select and integrate the relevant data from various data sources for the ML task.
  2. Data analysis: You perform exploratory data analysis (EDA) to understand the available data for building the ML model. This process leads to the following:
    • Understanding the data schema and characteristics that are expected by the model.
    • Identifying the data preparation and feature engineering that are needed for the model.
  3. Data preparation: The data is prepared for the ML task. This preparation involves data cleaning, where you split the data into training, validation, and test sets. You also apply data transformations and feature engineering to the model that solves the target task. The outputs of this step are the data splits in the prepared format.
  4. Model training: The data scientist implements different algorithms with the prepared data to train various ML models. In addition, you subject the implemented algorithms to hyperparameter tuning to get the best performing ML model. The output of this step is a trained model.
  5. Model evaluation: The model is evaluated on a holdout test set to assess the model quality. The output of this step is a set of metrics describing the quality of the model.
  6. Model validation: The model is confirmed to be adequate for deployment; that is, its predictive performance is better than a certain baseline.
  7. Model serving: The validated model is deployed to a target environment to serve predictions. This deployment can be one of the following:
    • Microservices with a REST API to serve online predictions.
    • An embedded model on an edge or mobile device.
    • Part of a batch prediction system.
  8. Model monitoring: The model's predictive performance is monitored to potentially invoke a new iteration in the ML process.

We may assume that FUSEML, in its first releases, will have to provide a simple way for these three personas to collaborate, while removing the majority of the friction and complexity.

To do so, in the initial phase we will focus on a subset of the above points.

We may describe this subset as:

  1. We assume the DE knows how to prepare the data and simply exposes the datasets to the DS in some way (e.g. S3 store, remote URL, etc.).
  2. The DS codes in their preferred tool/IDE and, once ready, pushes the files to FUSEML as a branch of the final repo.
  3. We will embed these coding artifacts in a very simple pipeline that has only 2 or 3 steps:
    • data ingestion/preparation
    • training
    • outcome
  4. The simple pipeline will deploy everything the DS needs (e.g. an MLFlow instance for the experimentation phase).
  5. Once the DS is satisfied with the training outcome, they make a request to the ML Operator (in Git terms, a PR) notifying them that the training code is ready to be picked up.
  6. The ML Operator injects the code into a more complex pipeline (merge) and starts the end-to-end workflow.

Codesets: separate location argument out of the API

In the current implementation of codesets (fuseml/fuseml-core#8):

"location is still showing in the REST API as a POST op parameter. I think the only way to avoid this is to leave it out of the goa design spec completely and add it manually only as a CLI command line parameter, given that the location only makes sense on the client side."

Add fuseml-core to installer

The fuseml installer needs to be extended to deploy the fuseml-core service as well as to "connect" it to the other foundation services (gitea/tekton) by providing the relevant configuration (secrets, env variables, arguments) pertaining to the location and credentials needed to connect to and use those services.

[Epic] - FUSEML Recipes Central Store

To simplify the user experience, we have to consider how the different personas will make use of the "recipes" we will offer in FUSEML.

We may envision that a user will try to find them through a list that is expected to be dynamic (i.e. changed from a central point) and presented to the user in the form of:

  • A list in a command-line tool
  • A marketplace in a web UI
  • A map of arrays in an API call

To solve the three above needs we would have to build a central point of access or in better terms a central store for the FuseML recipes.

Currently, FUSEML recipes are hard-coded into the CLI. This design hinders the user experience and is not manageable when multiple recipes are available for consumption.

A central store for FUSEML recipe templates should satisfy, at a minimum, the following:

  • The recipe templates are kept server-side, where they can be listed, inspected, classified, and sorted into categories, etc.
  • The recipe templates are stored on the server; persistent storage is optional (i.e. in-memory or temporary file storage is sufficient).
  • New recipe templates can be registered at runtime and become immediately available to all running clients (i.e. we need a registration mechanism).
  • The clients can interact with server-side recipes.

[Epic] - FUSEML Installation Flows and User Experience

One important piece of the FUSEML solution, as for any other project/solution, is the UX provided to the end user. Taking into account that we want to serve different personas, this overall experience may vary.

We already identified three possible entry points in FUSEML and those are:

  • A command-line
  • A web UI
  • An API stack

The current implementation of the original idea from [Carrier](https://github.com/SUSE/carrier) is tightly linked to a command-line-only UX. We should investigate, at a minimum:

  • The capability to decouple FUSEML from other underlying solutions like Carrier
  • The capability to deploy FUSEML using different patterns (have the command line act as a simple client and not as an initiator of the solution)
  • The capability to create a more extensive UX that is not necessarily limited to the deployment of the FUSEML components.

When pushing app, check if the deployment succeeded

When pushing an app, the last step should check whether the deployment succeeded before stating that the app is online.

There are cases where, although the workload was created successfully on Kubernetes, the deployment can fail.

failure when pushing image to registry: http2: response body closed

Sometimes the pipeline fails during the build-and-push step with this error coming from the pod

[APP] step-build-and-push error pushing image: failed to push to destination registry.fuseml-registry/apps/mlflow-85042120: Patch "https://registry.fuseml-registry/v2/apps/mlflow-85042120/blobs/uploads/69499f5e-6cbf-47a2-8b0f-87e253d76e20?_state=_pqnsBy9rkd7HGyP4FF4IvnF22UB1DGU0c1ZZxjJlWx7Ik5hbWUiOiJhcHBzL21sZmxvdy04NTA0MjEyMCIsIlVVSUQiOiI2OTQ5OWY1ZS02Y2JmLTQ3YTItOGIwZi04N2UyNTNkNzZlMjAiLCJPZmZzZXQiOjAsIlN0YXJ0ZWRBdCI6IjIwMjEtMDMtMTdUMTQ6MjQ6NDkuNjc0NzQxODI0WiJ9": http2: response body closed .

Suggest format for FuseML recipes

FuseML recipe templates (or actions) are the main concept FuseML is based on. Currently, a FuseML action is expressed as a Tekton pipeline, but as we add more and more actions, we need to define a more specific format that captures the particular characteristics of a FuseML action:

  • classification into categories (e.g. serving, packaging, model conversion, model training), each with its own conventions and characteristics (type of parameters, operations, behavior)
  • templating: parameters can either be automatically filled in by the FuseML orchestrator (e.g. location of a model available in the store, credentials for a kubernetes cluster, gitea repository and path) or filled in by the user when the action is instantiated
  • composability: two or more FuseML actions can be connected to one another based on their inputs and outputs forming a graph representing a more complex MLOps workflow
  • polymorphism: FuseML actions belonging to the same category (e.g. serving, training) expose a similar interface and can be used interchangeably to achieve the same goal

Create a CONTRIBUTING.MD file

This issue implements Epic #22.

Many projects have a CONTRIBUTING.md file where they explain how a user may contribute to the project. This is more of a governance model that tries to avoid unexpected and rogue behaviors within the project.

  • GitHub has some samples [here](https://github.com/github/docs/blob/main/CONTRIBUTING.md)
  • KubeFlow [here](https://github.com/kubeflow/community/blob/master/CONTRIBUTING.md)

We should define our own version. Attached here is a sample we may use:

How to Contribute

We'd love to accept your patches and contributions to this project. There are
just a few small guidelines you need to follow.

Git

  • Each bug/feature must have its own branch.
  • You create the branch locally (avoiding forking the project)
  • Don't push directly to the master branch
  • The PR must be validated, don't merge your own code
  • Squash the commits to one for each PR
  • The PR must have the Jira reference
  • You should sign the commits

Recommended process

Environment setup

  • Fork this project to your individual account.
  • Clone this project to your R&D environment, like git clone [email protected]:<username>/fuseml.git
  • In your fuseml directory, configure upstream: git remote add upstream '[email protected]:fuseml/fuseml.git'
  • Don't push to upstream directly: git remote set-url --push upstream no_push
  • Configure your name for this project: git config user.name "your name"
  • Configure your email for this project: git config user.email "[email protected]"

Update commits from upstream

  • Fetch commits from upstream: git fetch upstream
  • Merge into your repo: git merge upstream/master. We recommend merging upstream commits only into your master branch, then merging your master branch into the branch you want to update.

Create pull request

  • Create a branch for your PR: git checkout -b <proposal>
  • Modify the code you want.
  • Push to your repo: git push origin <proposal>
  • In your repo, choose <proposal> branch, then create a pull request.
  • Fill in the pull request template and click "Create".
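The branch-and-push flow above can be rehearsed end-to-end using throwaway local repositories standing in for GitHub (all repository paths, names, and the email below are placeholders):

```shell
# Self-contained rehearsal of the PR flow; a local bare repo stands in for
# your GitHub fork, so nothing here touches a real remote.
set -e
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/origin.git"          # stands in for your fork
git clone -q "$sandbox/origin.git" "$sandbox/repo" 2>/dev/null
cd "$sandbox/repo"
git config user.name "your name"                  # per-project identity, as above
git config user.email "you@example.com"
git checkout -q -b proposal                       # one branch per bug/feature
echo "change" > feature.txt
git add feature.txt
git commit -q -m "Implement proposal (squash to one commit per PR)"
git push -q origin proposal                       # now open the PR from 'proposal'
git ls-remote --heads origin proposal | grep -q proposal && echo "branch pushed"
```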

Long tasks

If your task requires multiple days, please post an update here every couple of days.
If possible, split the task into sub-tasks so the whole team can check the progress.

Make sure the Pod is not just running but also ready

When a pod is created by the application, fuseml should wait until it is ready (not just running) before reporting that the deployment succeeded. Moving to the ready state can take a few more seconds, and if we check the state immediately after pushing (as the testsuite does), the app will (correctly) be reported as not ready.

[Epic] - FUSEML Contributing guidelines and Documentation

We should aim to have contributing guidelines that help those who want to contribute get "enabled" quickly.
We should have a proposal, in the form of one or more issues to be approved, that will produce, as an outcome, a CONTRIBUTING.md file implementing this epic and its user stories (the proposal above).

Also, we should set up a proper way to document the project, through the wiki and/or other solutions like:

  • readthedocs
  • gitbooks

[Epic] - FUSEML Design Architecture

To be successful as an open-source project, a clear view of the architectural design is needed.

This view has to satisfy a minimum set of requirements:

  • The design architecture should represent the components that compose the FuseML solution.
  • The design architecture is a high-level abstraction of the components so that both the technical and non-technical audience may have the chance to understand the project.
  • The reference architecture is a dynamic design that is subject to change; still, some core, immutable components need to be highlighted in the design architecture.
  • The reason why we would need the design architecture is that we need to define the subsequent tasks to build and glue together the various components.

The design architecture will also require a more detailed set of views, defined as a reference architecture for each component, expressed in various forms such as:

  • Proposals through issues/PR's
  • Internal/External documents:
    • GitHub Wikis
    • Presentations
    • Community MeetUp's
    • GitHub Discussions

Create recipe for serving the pre-trained model

This could be seen as a subset of #13 ... or rather a step that should be taken to understand the bigger issue better.

Currently we have recipes covering the workflow from training the model to serving it. Assume the user already has a trained model (stored in an S3 store, for example) and prepare a recipe that only serves it. We need to modify the current pipeline, or perhaps introduce specialized pipelines.

Strategy for deleting artifacts and workloads managed by fuseml

We need to define a sane strategy for deleting or garbage collecting the various artifacts and kubernetes workloads that are "orphaned", for example when:

  • deleting a project (git org)
  • deleting a codeset
  • deleting a workflow
  • deleting a runnable

This could be modeled as an optional argument passed by the end-user when the top-level parent objects are deleted.

Examples of state that need to be deleted:

  • the MLFlow models stored in its S3 store
  • gitea repository or organization
  • tekton tasks and pipelines and task runs
  • tool specific k8s workloads (e.g. prediction services)
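The "optional argument" idea might surface in the CLI roughly as follows; the subcommand and flag names here are purely hypothetical, so a dry-run helper prints the commands instead of executing them:

```shell
# Dry-run sketch: a hypothetical --cascade flag asking fuseml to also garbage
# collect orphaned artifacts (models, git repos, tekton runs, k8s workloads).
run() { echo "+ $*"; }

run fuseml project delete my-project --cascade   # hypothetical flag name
run fuseml codeset delete my-codeset             # default: leave artifacts alone
```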

Feature: service registry

We need to manage the services deployed by the pipelines (e.g. prediction services) as first class citizens in fuseml. This enables users to show/list/delete running services.

Research: Persistent state support options for fuseml-core components

We need to persist the state held in the fuseml-core service in some form of persistent storage, to avoid the problems that occur when the fuseml-core service is restarted and the runtime state stored in the other components (gitea, tekton, docker-registry) no longer matches.

This research should summarize the available options and prepare a high-level proposal (blueprint) based on which implementation stories will be created.

Options to be explored:

  • relational database
  • key-value stores
  • a combination of the previous two
  • file-system / persistent volume
  • kubernetes resources (e.g. secrets, configmaps)
  • kubernetes backed (i.e. re-designing the fuseml core as a k8s controller/operator)

Other considerations:

  • upgrades incl. schema migration
  • backwards-compatibility necessity
  • scalability
  • high-availability

[Docs] - FuseML missing requirements

As in #48, the Linux requirements also need to be better documented.

What error is displayed

πŸ”₯ Not found: helm
πŸ”₯ Not found: git
Cannot operate: Please check your PATH, some of our dependencies were not found

We need to document the mandatory requirements to run fuseml
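A documented preflight check along these lines would make the requirement explicit. The dependency names are taken from the error messages above; the real per-platform list used by fuseml may differ, and the loop below only checks universally available binaries so it runs anywhere:

```shell
# Report missing binaries up front, mirroring fuseml's startup check.
# The list below is illustrative; substitute fuseml's actual dependencies
# (helm, git, sh, openssl, depending on the platform).
missing=""
for dep in sh ls; do
  command -v "$dep" >/dev/null 2>&1 || missing="$missing $dep"
done
if [ -n "$missing" ]; then
  echo "Not found:$missing"
else
  echo "all dependencies found"
fi
```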

Apps should reference the prediction URL

When deploying a model the URL it references does not include the prediction path. This causes confusion when serving models using different inference services as each service may have a different URL format.

This also helps on fixing #14

Update Tekton components

The tekton components seem to be pretty outdated:

  • Tekton Pipeline: v0.19.0
  • Tekton Triggers: v0.11.1
  • Tekton Dashboard: v0.11.1

We should update them to the latest versions:

  • Tekton Pipeline: v0.22.0
  • Tekton Triggers: v0.12.1
  • Tekton Dashboard: v0.15.0

Feature: FuseML users

Currently the server creates a "fuseml user" every time a project is created; the name of this user is derived from the project name and the password is hard-coded. This user is used by the client for pushing the actual code to the git repository.

We might want to change the server behavior so that once the user is created, its credentials are returned and the client has to use them explicitly (or even has the chance to provide its own) when calling the client action for pushing the code.

MLFlow extension: container image for MLFlow builder

We need a container image implementing the MLFlow "builder": a pipeline step that creates a container image out of an MLFlow compatible codeset that can be used as an environment to execute the MLFlow entrypoints specified in the MLproject file. This image will be associated with the MLFlow builder runnable artifact, but can also be referenced inline in fuseml workflows.

Some highlights about what the container image should contain and do when invoked as a pipeline step:

  • contains all software utilities able to interpret `MLproject` and `conda.yaml` files and install the specified python requirements
  • contains or inherits from a base image (preferable) all software utilities able to build a container image (inside a container image) and push it to a remote OCI registry
  • (optional) contains or inherits from a base image (preferable) all software utilities required to register FuseML runnables (i.e. the fuseml CLI)
  • expects as inputs, modeled as volumes, command line args and/or env variables:
    • an MLFlow codeset (directory containing MLproject and conda.yaml files) to be mounted as volume. The mount path is either hard-coded as a FuseML wide convention, or supplied as an additional parameter with a default value. The latter is recommended, because it's more flexible.
    • service URLs and credentials required to push container images and register artifacts:
      1. URI and credentials for an OCI registry
      2. (optional) URI and credentials for the fuseml service API
  • when executed:
    • generates the contents and Dockerfile for the output container image
    • (optional) generates a runnable.yaml file with a description of the FuseML runnable representing the MLFlow environment
    • pushes the container image to the OCI registry
    • (optional) registers the FuseML runnable with the fuseml API
  • generates as outputs, modeled as volumes or files (depending on fuseml conventions on passing pipeline step outputs):
    • the full container image URL (including tag) where the image is saved
    • (optional) the runnable URL (including version)

The code (Dockerfiles, scripts etc) needed to build the builder image should be available under the fuseml github org, either as a separate repository holding only the MLFlow specific extensions, or as part of a global repository holding all FuseML extensions.
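The "additional parameter with a default value" recommendation for the codeset mount path could look roughly like this inside the builder's entrypoint script. All variable names and default values here are hypothetical, not part of any existing FuseML convention:

```shell
# Resolve builder inputs from env variables with fallbacks, so the codeset
# mount path is a parameter rather than a hard-coded convention.
# FUSEML_CODESET_PATH and FUSEML_REGISTRY_URI are hypothetical names.
CODESET_PATH="${FUSEML_CODESET_PATH:-/workspace/codeset}"
REGISTRY_URI="${FUSEML_REGISTRY_URI:-registry.fuseml-registry}"
echo "building image from $CODESET_PATH, pushing to $REGISTRY_URI"
```

Making the path overridable keeps the builder reusable across workflows that mount the codeset at different locations.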

Create a CODE_OF_CONDUCT.md

As for the CONTRIBUTING file we should have a Code of Conduct file to be sure everyone is:

  • kind to the other users contributing to the project
  • proactive in stopping any violation of the above point
  • inclusive

A good sample is GitHub's CoC [here](https://github.com/github/docs/blob/main/CODE_OF_CONDUCT.md)

Failed application cannot be deleted

See this progress:

πŸ•ž  Creating application resources... [dots repeat for the duration of the timeout] ...error pushing app: waiting for app failed: waiting for app to be created failed: timed out waiting for the condition
✘-255 jsuchome@suse:~/kubernetes/fuseml [serving-with-seldon|✚ 1 …16]> fuseml apps

🚒 Listing applications
Organization: workspace
error listing apps: No deployment for application workspace.my-mlflow-04 found
✘-255 jsuchome@suse:~/kubernetes/fuseml [serving-with-seldon|✚ 1 …16]> fuseml delete my-mlflow-04

🚒 Deleting application...
Name: my-mlflow-04

Cloning application code ...

βœ”οΈ  Application clone successful
error deleting app: failed to delete application deploymentError from server (NotFound): error when deleting "/tmp/fuseml-app361824138/.fuseml/serve.yaml": services.serving.knative.dev "workspace-my-mlflow-04" not found
: exit status 1

The application does not exist because the deployment failed, so it's not listed by the apps command.
However, the git repository does exist, and if it is not deleted, the next attempt to push the application behaves like updating an existing one.
But deleting such a repo is not possible, because first it tries to delete the deployment, which fails because there is none, and only then would it delete the git project.

multiple hostnames are used to access the built-in docker-registry service

The built-in docker-registry service needs to be accessed from several places simultaneously, and this is reflected in the URLs used for the container images that it stores:

  • the container runtime engines running on the k8s nodes in the same cluster. Currently, the registry service is exposed using a nodeport and can be accessed on the k8s nodes using the localhost and nodeport values: 127.0.0.1:30500
  • other containerized services running in the same cluster (e.g. tekton jobs building containers, the tekton operator itself). The hostname used in this case is the one provided by the k8s cluster DNS services, computed from the namespace name and the service name: registry.fuseml-registry
  • we may also need to open the docker-registry API for access outside the k8s cluster. One such case is working with one OCI registry across several k8s clusters, or synchronizing between multiple OCI registries managed by fuseml in multiple k8s clusters (i.e. inter-cluster replication).

This shows that our deployment setup is currently suffering from a multiple identity problem regarding the location of the OCI registry endpoint, and this already has some negative side-effects:

  • the most pressing problem is that in our workflow definitions we can't identify a container image using a single location, like we do with public container registries. The image location needs to be different depending on the consumer: a pipeline step that builds an image needs to use a different location than the step that uses that image.
  • in some cases, Tekton needs to access the OCI registry to inspect its entrypoint, but that is not possible if the nodeport hostname is used. An unsavory workaround is currently in place to fix this (see this comment for details).
  • another problem with using hard-coded nodeports is that the hard-coded nodeport value might not be available in the cluster.

Some ideas on how to fix this:

  1. use the k8s DNS name (registry.fuseml-registry) and cluster port on the k8s nodes instead of the localhost/nodeport, if possible
  2. expose the registry using an ingress. This might make the matter of generating matching certificates more complicated. Any post-installation changes made to the ingress hostname or domain name will also invalidate the configured pipelines.
  3. don't encode the registry hostname explicitly in the workflow and runnable definitions. Using a placeholder attribute value that says "this image comes from the built-in docker-registry", instead of pointing to it explicitly, can help with the identity crisis.

NOTE: when this issue is fixed, the Tekton entrypoint workaround mentioned above should also be removed.
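For idea 1 above, the single in-cluster location is fully determined by the service and namespace names (the values below match the ones quoted earlier in this issue):

```shell
# Compute the cluster-DNS registry location from service + namespace names,
# instead of hard-coding the 127.0.0.1:30500 nodeport address.
SERVICE="registry"
NAMESPACE="fuseml-registry"
REGISTRY_HOST="${SERVICE}.${NAMESPACE}"   # resolvable by in-cluster DNS
echo "$REGISTRY_HOST"                     # -> registry.fuseml-registry
```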

Define data structures and interfaces for the codeset element

As covered in the wiki, the codeset is the business logic element used to represent the machine learning application files (code, configuration, scripts, etc) created by data scientists during the ML experimentation phase and out of which the MLOps pipeline components are built and ultimately executed.

The FuseML core service needs an internal representation of these codesets, as well as an interface defining the operations that it needs to perform on them (referred to as the codeset store) for the purpose of implementing other operational concepts such as runnables, pipeline templates and pipelines.

Ideally, the core code used to define the codeset element and operations should be designed as "business domain" code: independent of outer layer logic that deals with presentation, serialization, storage etc.

Failures of mlflow serving containers with conda.yaml updated for seldon

With conda.yaml explicitly requiring python 3 (bb035a4), non-seldon serving containers fail with:

Traceback (most recent call last):
  File "/opt/conda/envs/tutorial/bin/mlflow", line 8, in <module>
    sys.exit(cli())
  File "/opt/conda/envs/tutorial/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/envs/tutorial/lib/python3.6/site-packages/click/core.py", line 760, in main
    _verify_python3_env()
  File "/opt/conda/envs/tutorial/lib/python3.6/site-packages/click/_unicodefun.py", line 130, in _verify_python3_env
    " mitigation steps.{}".format(extra)
RuntimeError: Click will abort further execution because Python 3 was configured to use ASCII as encoding for the environment. Consult https://click.palletsprojects.com/python3/ for mitigation steps.

This system supports the C.UTF-8 locale which is recommended. You might be able to resolve your issue by exporting the following environment variables:

export LC_ALL=C.UTF-8
export LANG=C.UTF-8

Purge old fuseml CLI code from installer

With the new fuseml CLI in place, the old CLI inherited from carrier can be repurposed as an installer and all other code pertaining to pushing code to gitea, setting up webhooks, configuring pipelines and so on can be removed, and the binary renamed to reflect its new function.

Apps not shown in `fuseml apps` list

No app is shown for seldon-core deployment. In the end, this means CI fails because it scans the output of this command.

I've also seen cases where, even for successfully deployed apps of the deployment/knative type, I sometimes do not see any result under apps. This needs further investigation.

CLI should print more verbose information about workflow runs

After (or during) a fuseml workflow run, it would be nice to be able to retrieve more verbose information through the CLI about the state of the workflow and the specific artifacts that are created (e.g. S3 endpoints for generated models, serving URLs).

Explore goa as a code generator for fuseml-core

Goa (https://goa.design/learn/getting-started/) features code generation not only for REST APIs, but also for gRPC. This will be extremely useful in the context of detaching 3rd party tool specific code (e.g. artifact stores and workload agents) from the fuseml core service and running them as independent micro-services. Communication between the fuseml core service and its agents should be done using gRPC, and GOA gives us the option of generating the transport specific code stubs independent of the information being transported. Another advantage for using GOA is that it's compatible with newer openapi versions (3.0), whereas go-swagger, the current option being explored for fuseml-core, is stuck at 2.0 without any plans in sight to add support for 3.0.

Aspects to consider when exploring GOA to generate the REST/gRPC code stubs for fuseml-core:

  • where else is goa used (what other projects are using goa successfully) ? does that compare to go-swagger ?
  • how expressive is it ? does it provide more or less flexibility than go-swagger ?
  • how easy is it to work with ? is there a learning curve that is comparable to that required to work with the openapi specification ?
  • does it provide middleware features comparable or better than what is supported with go-swagger (e.g. support for integration with authentication and authorization frameworks such as OAuth and LDAP, logging, tracing etc.)

Remove static.go from git repository

As long as we embed files into the compiled binary, we need statik. However, it does not seem to make sense to track the generated static.go file in the git repository: the file changes on every code change, which makes branch rebases a nightmare due to file conflicts.

Feature: extension registry

We need a place to configure and store information about the extensions that are connected to FuseML at runtime (e.g. MLFlow URL and S3 backend credentials). This information is then more easily consumable by the workflow steps or runnables.
