ml_pipeline_tfx_libraries

ML pipeline illustration with TFx libraries on Google Cloud Platform

General Description

This repo contains the implementation of a continuous training pipeline that orchestrates the training and deployment of a neural network model for bike sharing demand prediction, built with TFX libraries on Google Cloud Platform.

The model uses the bike sharing demand data from Kaggle as a use case for the implementation: https://www.kaggle.com/c/bike-sharing-demand/data. The raw data from Kaggle can be found in

doc/bike-sharing-demand

For educational purposes, some of the numeric variables in the dataset (which are actually categorical) have been converted to strings. This allows us to better interpret these variables (especially for TFMA slice analysis). The data is also copied to a GCS bucket.

You can check how the data was prepared in doc/0_preparing_data.

The train, validation and test splits of the prepared data can be found in doc/bike-sharing-data, as well as in the GCS bucket gs://bike-sharing-data.

Pipeline Components

The training ML pipeline has the following five components: data validation, data transform, model training, model analysis and model deployment. It is orchestrated with Kubeflow Pipelines and runs on Google Cloud AI Platform infrastructure.

We chose a functional decomposition of the pipeline into components that represent the main ML tasks.
Each component is built as a Docker container, which ensures its isolation, autonomy and maintainability.

You can manually build a new version of each component by running the following command:

$ cd [containers/component_folder]
$ bash build.sh [project_id] [tag]

You can also find a notebook version of each component, alongside the Python packages required to run the notebooks, in the doc directory.

This will create a new image of the component and push it to Google Container Registry.
You can then provide the GCR image for each component and execute a Jupyter notebook to compile the pipeline and submit it for execution with Kubeflow Pipelines.
Examples of Jupyter notebooks for pipeline compilation can be found in pipeline/bike_sharing_pipeline.

Pipeline Parameters

Global parameters

  • pipeline_version: Version of the pipeline code being used, which contributes to the model version name
  • region: Region used for the underlying GCP infrastructure
  • project: GCP project ID where the pipeline will be executed
  • bucket: GCS bucket used to store ML workflow data
  • bucket_staging: GCS bucket used for staging logs
  • raw_data_path: GCS path from which the prepared data is ingested
  • tfversion: Version of the TensorFlow package to use (e.g. 1.14)
  • runner_validation: Runner used for the data validation component (local or Dataflow)
  • runner_transform: Runner used for the data transform component (local or Dataflow)
  • runner_training: Runner used for the model training component (local or AIplatform)
  • runner_analysis: Runner used for the model analysis component (local or Dataflow)
  • runner_deployment: Runner used for the model deployment component (local or AIplatform)
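As a sketch, one run of the pipeline could be parameterized with a dictionary like the following; every value is an illustrative placeholder, not a default taken from the repo:

```python
# Illustrative parameter set for one pipeline run (all values are placeholders).
pipeline_params = {
    "pipeline_version": "v0_3",
    "region": "europe-west1",
    "project": "my-gcp-project",
    "bucket": "gs://bike-sharing-workflow",
    "bucket_staging": "gs://bike-sharing-staging",
    "raw_data_path": "gs://bike-sharing-data",
    "tfversion": "1.14",
    "runner_validation": "Dataflow",
    "runner_transform": "Dataflow",
    "runner_training": "AIplatform",
    "runner_analysis": "Dataflow",
    "runner_deployment": "AIplatform",
}

# Simple sanity check before submitting a run: every runner must be
# one of the values listed above.
assert all(v in ("local", "Dataflow", "AIplatform")
           for k, v in pipeline_params.items() if k.startswith("runner_"))
```

A check like this catches typos in runner names before the pipeline is compiled and submitted, rather than at component runtime.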

GCR docker images

  • sin_validation: GCR Docker image for the data validation component
  • sin_transform: GCR Docker image for the data transform component
  • sin_hypertune: GCR Docker image for the model training component
  • sin_analysis: GCR Docker image for the model analysis component
  • sin_deploy: GCR Docker image for the model deployment component
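These images follow the standard gcr.io/[project_id]/[name]:[tag] naming produced by the build script. A small hypothetical helper (not code from the repo; project ID and tag are placeholders) shows how the image URIs for a run could be derived:

```python
# Hypothetical helper deriving the GCR image URI each component runs as.
def gcr_image(project_id: str, component: str, tag: str) -> str:
    """Return the fully qualified Container Registry image URI."""
    return f"gcr.io/{project_id}/{component}:{tag}"

# Component image names as listed in the README.
COMPONENTS = ["sin_validation", "sin_transform", "sin_hypertune",
              "sin_analysis", "sin_deploy"]

# Map each component to the image a pipeline run would pull.
images = {c: gcr_image("my-project", c, "v0.3") for c in COMPONENTS}
# e.g. images["sin_transform"] == "gcr.io/my-project/sin_transform:v0.3"
```

Deriving the URIs in one place keeps the project ID and tag consistent across all five components when compiling the pipeline.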

Model Versioning

Once deployed, the pipeline is responsible for creating new ML models by retraining on newly arriving data.
Three pieces of information define a unique model version deployed as an inference endpoint on AI Platform:

  • The pipeline version that was executed (e.g. v1.1)
  • The version of the dataset used for training and tuning the model, represented by the date and time of preprocessing (e.g. 201231_125959)
  • The version of the hyperparameter tuning, represented by the date and time of the search for the best model structure (e.g. 201231_125959)

Example of a model version: v0_3__200429_083501__200429_084356
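A minimal sketch of how such a version string could be assembled from the three parts above; the helper is hypothetical rather than code from the repo, and the dot-to-underscore substitution is an assumption (AI Platform version names only allow letters, digits and underscores):

```python
from datetime import datetime

def model_version(pipeline_version: str, data_ts: str, tune_ts: str) -> str:
    """Assemble a unique model version name from its three parts.

    Dots are replaced with underscores because AI Platform version
    names cannot contain dots (assumption, matching the v0_3 example).
    """
    return "__".join([pipeline_version.replace(".", "_"), data_ts, tune_ts])

# Timestamps use the YYMMDD_HHMMSS format shown above.
data_ts = datetime(2020, 4, 29, 8, 35, 1).strftime("%y%m%d_%H%M%S")  # "200429_083501"
version = model_version("v0.3", data_ts, "200429_084356")
# -> "v0_3__200429_083501__200429_084356"
```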

Next Steps

  • Write the pipeline/workflow.py script
  • Set up a CI/CD pipeline with Cloud Build

More Info

For more information about the ML production pipeline, please refer to ProductionML.md.

Contributors

maggiemhanna
