marvinbuss / mldevops

ML DevOps using GitHub Actions and Azure Machine Learning

License: MIT License

Python 100.00%
mlops githubactions azureml azuremlservice machine-learning machinelearning ci-cd automation mldevops devops-for-data-science data-science lifecycle-management

mldevops's Introduction

INFO: We have built a new and even simpler method to do MLOps with GitHub Actions. Please visit this repository to get started: https://github.com/machine-learning-apps/ml-template-azure

ML DevOps with GitHub Actions and Azure ML

This repository demonstrates how to automate the machine learning lifecycle using the CI/CD pipeline tools of GitHub Actions together with Azure Machine Learning for training and deployment. The repository does not use Azure DevOps; instead, it uses GitHub Actions as a future-proof backend for workflow automation.

The repository includes the following features:

  • GitHub Actions for the continuous integration (CI) and continuous delivery (CD) pipeline
  • Azure Machine Learning as a backend for training and deployment of machine learning models
  • CI/CD pipeline as code: the repository uses the Azure Machine Learning Python SDK to define the CI/CD steps and implements almost all features of this framework
  • Central settings file in JSON format to enable quick customization of each step of the pipeline

Implemented Azure ML features in the CI/CD pipeline

The repository and the CI/CD pipeline make use of the following key features of Azure Machine Learning:

  • Loading or deployment of Workspace
  • Training on different compute engines: Azure Machine Learning Compute, Data Science VM, Remote VM
    • Automatically creates or attaches the compute engine, if it is not available yet
    • Allows extensive customizations of the compute engine depending on your requirements
  • Granular adjustment of training process
    • Custom docker images and container registries
    • Distributed training backends: MPI, Parameter Server, Gloo, NCCL
    • Supports all frameworks: TensorFlow, PyTorch, SKLearn, Chainer, etc.
    • Registration of environment
    • Hyperparameter Tuning
  • Comparison of models before registration in workspace
    • Comparison of production model and newly trained model based on metrics defined in central settings file
  • Model profiling after training has completed successfully
    • Recommends number of CPUs and RAM size for deployment
  • Deployment with testing in three phases:
    • Dev deployment: deployment on Azure Container Instance
    • Test deployment: deployment on Azure Kubernetes Service with purpose DEV_TEST
    • Production deployment: deployment on Azure Kubernetes Service with purpose FAST_PROD
  • And many others ...
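The model comparison step listed above can be sketched as follows. This is a minimal illustration, not the repository's actual code: the function name `should_register`, the metric names, and the "maximize"/"minimize" convention are all assumptions made for the example.

```python
from typing import Dict


def should_register(
    new_metrics: Dict[str, float],
    prod_metrics: Dict[str, float],
    metrics_to_compare: Dict[str, str],
) -> bool:
    """Return True if the new model is at least as good as production.

    metrics_to_compare maps a metric name to "maximize" or "minimize"
    (a hypothetical convention; the real settings file may differ).
    """
    for name, direction in metrics_to_compare.items():
        if direction not in ("maximize", "minimize"):
            raise ValueError(f"unknown direction {direction!r} for {name!r}")
        if name not in new_metrics:
            raise KeyError(f"new model is missing metric {name!r}")
        if name not in prod_metrics:
            # Nothing to compare against, e.g. on the very first run.
            continue
        new, prod = new_metrics[name], prod_metrics[name]
        if direction == "maximize" and new < prod:
            return False
        if direction == "minimize" and new > prod:
            return False
    return True
```

With this rule, a new model is registered only if it does not regress on any of the configured metrics; a stricter or looser rule (for example, a tolerance threshold) would be an equally valid design.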

What is ML DevOps?

MLOps empowers data scientists and app developers to combine their knowledge and skills to simplify model development as well as model release and deployment. ML DevOps enables you to track, version, test, certify and reuse assets in every part of the machine learning lifecycle, and it provides orchestration services to streamline managing this lifecycle. This makes it possible to automate the end-to-end machine learning lifecycle so that you can frequently update models, test new models, and continuously roll out new ML models alongside your other applications and services.

This repository enables Data Scientists to focus on the training and deployment code of their machine learning project (code folder of this repository). Once new code is checked into the code folder of the master branch of this repository, the CI/CD pipeline is triggered and the training process starts automatically in the linked Azure Machine Learning workspace. Once the training process is completed successfully, the deployment of the model takes place in three stages: dev, test and production stage.
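The three-stage rollout described above can be pictured as a simple gated loop. The sketch below is only an illustration: the stage names and targets come from this README, while `run_stages`, the `deploy_and_test` callback, and the stop-on-first-failure behavior are assumptions.

```python
from typing import Callable, List, Tuple

# Stage names and deployment targets as listed in this README.
STAGES: List[Tuple[str, str]] = [
    ("dev", "Azure Container Instance"),
    ("test", "Azure Kubernetes Service (DEV_TEST)"),
    ("production", "Azure Kubernetes Service (FAST_PROD)"),
]


def run_stages(deploy_and_test: Callable[[str, str], bool]) -> List[str]:
    """Run the stages in order and stop at the first failure.

    deploy_and_test is a hypothetical callback that deploys to the given
    target, tests the deployment, and returns True on success.
    Returns the names of the stages that passed.
    """
    passed: List[str] = []
    for name, target in STAGES:
        if not deploy_and_test(name, target):
            break
        passed.append(name)
    return passed
```

Gating each stage on the previous one means a model that fails its dev test never reaches the Kubernetes clusters.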

Key challenges solved by ML DevOps

Model reproducibility & versioning

  • Track, snapshot & manage assets used to create the model
  • Enable collaboration and sharing of ML pipelines

Model auditability & explainability

  • Maintain asset integrity & persist access control logs
  • Certify model behavior meets regulatory & adversarial standards

Model packaging & validation

  • Support model portability across a variety of platforms
  • Certify model performance meets functional and latency requirements

Model deployment & monitoring

  • Release models with confidence
  • Monitor & know when to retrain by analyzing signals such as data drift

Prerequisites

The following prerequisites are required to make this repository work:

  • Azure subscription
  • Contributor access to the Azure subscription
  • Access to the GitHub Actions Beta

If you don’t have an Azure subscription, create a free account before you begin. Try the free or paid version of Azure Machine Learning today.

Settings file

The repository uses a central settings file, /aml_service/settings.json, to enable quick customization of the end-to-end pipeline. The file can be used to adjust the parameters of:

  • The compute engines for training and deployment,
  • The training process (experiment and run) and
  • The deployment process.
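Reading such a settings file from a pipeline step could look like the sketch below. The helper names and the section names used in the usage note ("compute", "training", "deployment") are assumptions for illustration; the actual keys in settings.json may differ.

```python
import json
from pathlib import Path
from typing import Any, Dict


def load_settings(path: str) -> Dict[str, Any]:
    """Load the central JSON settings file into a dictionary."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def get_section(settings: Dict[str, Any], section: str) -> Any:
    """Return one top-level section, failing loudly if it is absent."""
    try:
        return settings[section]
    except KeyError:
        raise KeyError(f"settings file has no section {section!r}") from None
```

A training step might then call `get_section(load_settings("aml_service/settings.json"), "training")`, where "training" is a hypothetical section name.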

GitHub Workflow

The GitHub Workflow requires the following secrets:

  • AZURE_CREDENTIALS: Used for the az login action in the GitHub Actions Workflow. See the documentation of that GitHub Action for a setup tutorial.
  • FRIENDLY_NAME: Friendly name of the Azure ML workspace.
  • LOCATION: Location of the workspace (e.g. westeurope, etc.)
  • RESOURCE_GROUP: Resource group where Azure Machine Learning was or will be deployed.
  • SUBSCRIPTION_ID: ID of the Azure subscription that should be used.
  • WORKSPACE_NAME: Name of your workspace or the workspace that should be created by the pipeline.
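A pipeline step can check that all of these values are present before doing any work. The sketch below assumes the workflow exposes each secret to the step as an environment variable of the same name; that mapping, and the helper `read_secrets`, are assumptions rather than something this README specifies.

```python
import os
from typing import Dict, Mapping

# Secret names as listed above.
REQUIRED_SECRETS = (
    "AZURE_CREDENTIALS",
    "FRIENDLY_NAME",
    "LOCATION",
    "RESOURCE_GROUP",
    "SUBSCRIPTION_ID",
    "WORKSPACE_NAME",
)


def read_secrets(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Collect the required secrets, reporting every missing one at once."""
    missing = [name for name in REQUIRED_SECRETS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required secrets: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_SECRETS}
```

Failing early with the full list of missing names is usually easier to debug than an authentication error halfway through a run.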

TODO

  • Implement automatic Swagger generation
  • Handover of model name to training script
  • Bugfix in model evaluation
  • Stop the pipeline from failing if the new model performs worse: use features in GitHub Actions for improvement

mldevops's People

Contributors

marvinbuss

mldevops's Issues

Can you add output variables

Hi @marvinbuss, I'm working with @aronchick at GitHub. We saw your cool example of CI/CD. One thing that would help us demo a cool integration: in each of your Python pipeline steps, can you emit the following output variables?

  • dashboard_link: a link to the results, for example https://something.aml.com/your_dashboard/training
  • runtime: the run time formatted in hours and minutes, for example 1h2m
  • commit_sha: the SHA of the commit in GitHub that the step was run on

You can emit output variables with this syntax:

print("::set-output name=action_fruit::strawberry")

For example, to emit the output variable dashboard_link you would do the following:

print("::set-output name=dashboard_link::https://something.aml.com/your_dashboard/training")

The idea is we could report back status upon the successful completion of each step into the PR with a link to the results. You might know better than us where to get the links (if available) as I am not familiar with Azure. Thanks would really love to use this workflow in an upcoming demo!
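A small helper along the following lines could emit the three requested variables. It is a sketch under assumptions: `format_runtime` and `set_output` are hypothetical names, and the `::set-output` syntax is the one quoted in this issue (GitHub has since deprecated it in favor of appending to the file named by the GITHUB_OUTPUT environment variable). GITHUB_SHA is the environment variable GitHub Actions sets to the commit that triggered the workflow.

```python
import os


def format_runtime(seconds: float) -> str:
    """Format a duration as hours and minutes, e.g. 3720 s becomes 1h2m."""
    total_minutes = int(seconds) // 60
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours}h{minutes}m"


def set_output(name: str, value: str) -> str:
    """Print a workflow command that sets an output variable.

    Returns the printed line so callers can log or inspect it.
    """
    line = f"::set-output name={name}::{value}"
    print(line)
    return line


if __name__ == "__main__":
    # Placeholder link taken from the example in this issue.
    set_output("dashboard_link", "https://something.aml.com/your_dashboard/training")
    set_output("runtime", format_runtime(3720))
    set_output("commit_sha", os.environ.get("GITHUB_SHA", "unknown"))
```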

cc: @T-Holland, @inc0, @awmatheson
