
codechef's Introduction

MLOps:

  • After developing a machine learning or neural network model, deploying it to production means exposing it through an API on a prediction server, built with Flask or any other web framework, alongside the rest of the application code. The prediction server can run in the cloud or at the edge. In manufacturing, edge deployment is often preferred because the factory can keep operating even when the internet connection goes down.
  • For example, an edge device runs inspection software with camera control. It captures images and sends them in an API request to the prediction server. The prediction server runs the model on the images and returns the prediction in the API response. Based on this prediction, the software on the edge device decides whether to accept or reject the product.
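
The request/response loop described above can be sketched without committing to a particular web framework. In this minimal sketch, the JSON field names (`image_id`, `pixels`, `defect_score`), the `0.5` threshold, and the stand-in `predict` function are all assumptions for illustration. A real server would load a trained model and call something like `handle_request` from a Flask (or other framework) route.

```python
import json

DEFECT_THRESHOLD = 0.5  # hypothetical cut-off; would be tuned on validation data


def predict(pixels):
    """Stand-in for a trained model: returns a defect score in [0, 1].

    The score is just the mean pixel intensity, purely for illustration.
    """
    return sum(pixels) / (255 * len(pixels))


def handle_request(body: bytes) -> bytes:
    """Prediction-server side: parse the API request and build the response."""
    request = json.loads(body)
    score = predict(request["pixels"])
    return json.dumps(
        {"image_id": request["image_id"], "defect_score": score}
    ).encode()


def decide(response_body: bytes) -> str:
    """Edge-device side: accept or reject the product based on the prediction."""
    response = json.loads(response_body)
    return "reject" if response["defect_score"] >= DEFECT_THRESHOLD else "accept"


# The edge device captures an image and sends it as an API request.
request_body = json.dumps(
    {"image_id": "part-001", "pixels": [10, 20, 30, 40]}
).encode()
decision = decide(handle_request(request_body))
print(decision)
```

Keeping `handle_request` independent of the transport layer means the same function can be served from the cloud or from a local edge server.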

image

Machine Learning Project Lifecycle

  • 1) Scoping: In this phase, you define the project by deciding what to work on and what exactly you want to apply machine learning to. You need to identify the features (X) and the target variable (Y).

    • Estimate the key metrics: accuracy, latency (prediction time), throughput (how many queries per second), and the resources needed (time, compute, budget).
  • 2) Data: Next, you collect the data for your algorithm. This includes defining the data sources, establishing a baseline, labeling, and organizing the data.

  • 3) Modeling: After you have the data, you train the model. This involves selecting an appropriate algorithm, training the model, and performing error analysis. Since machine learning is an iterative process, during error analysis, you may need to update the model or decide to collect more data if necessary.

    • Algorithm/neural network architecture code, hyperparameters
  • 4) Deployment: In this step, you deploy the model into production. This includes writing the software needed for deployment, monitoring the system, and tracking the incoming data. If the data distribution changes, you will need to update the model.

  • 5) Maintenance: After the initial deployment, maintenance often involves performing more error analysis and possibly retraining the model. This may also mean taking the feedback from the data you receive and using that data to continuously improve and update the model until a more accurate version is deployed.
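
The latency and throughput figures estimated during scoping can later be checked against a candidate model with a small timing harness. The sketch below is one possible way to do this. The `predict` function and the synthetic queries are hypothetical placeholders, and real measurements should be taken on the target hardware with representative inputs.

```python
import statistics
import time


def predict(x):
    """Hypothetical stand-in for a model's prediction call."""
    return sum(x) > 1.0


def measure(predict_fn, queries):
    """Return per-query latencies (seconds) and throughput (queries/second)."""
    latencies = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        predict_fn(q)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    throughput = len(queries) / elapsed if elapsed > 0 else float("inf")
    return latencies, throughput


queries = [[0.1 * i, 0.2, 0.3] for i in range(1000)]
latencies, qps = measure(predict, queries)

# quantiles(n=20) returns 19 cut points; the last one is the 95th percentile.
p95 = statistics.quantiles(latencies, n=20)[-1]
print(f"median latency: {statistics.median(latencies) * 1e6:.1f} us")
print(f"p95 latency:    {p95 * 1e6:.1f} us")
print(f"throughput:     {qps:.0f} queries/s")
```

Reporting a high percentile alongside the median is useful because a few slow queries can matter more in production than the typical case.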

Challenges in deploying machine learning models:

  • 1) Machine learning or Statistical issues:
    • Concept drift
    • Data drift: Gradual change, Sudden change
  • 2) Software engineering issues: When you are implementing a prediction service whose job is to take queries x and output predictions y, you have many design choices that affect how the software is built:
    • Do you need real-time predictions or batch predictions?
    • Does your prediction service run in the cloud or on an edge device?
    • How much compute do you have (CPU/GPU)?
    • Latency, throughput (queries per second)
    • Logging
    • Security and Privacy
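
Two of the design choices listed above, real-time versus batch prediction and logging, can be illustrated with a small sketch. The `model` function, the logger name, and the batch size are assumptions made for the example, not part of any particular serving framework.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("prediction_service")  # hypothetical logger name


def model(x):
    """Hypothetical model: flags a query when its feature sum exceeds 1.0."""
    return int(sum(x) > 1.0)


def predict_realtime(x):
    """Real-time path: one query in, one prediction out, logged immediately."""
    y = model(x)
    log.info("query=%s prediction=%s", x, y)
    return y


def predict_batch(queries, batch_size=2):
    """Batch path: process accumulated queries in chunks, e.g. in a nightly job."""
    results = []
    for i in range(0, len(queries), batch_size):
        chunk = queries[i:i + batch_size]
        results.extend(model(x) for x in chunk)
        log.info("processed batch of %d queries", len(chunk))
    return results


queries = [[0.2, 0.3], [0.9, 0.4], [0.5, 0.5], [1.0, 1.0]]
print(predict_realtime(queries[0]))
print(predict_batch(queries))
```

Note that the real-time path logs the raw query. If queries contain personal data, the security and privacy requirements may call for logging only identifiers or aggregates instead.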

image

Important terminology:

  • Data drift: Data drift refers to changes in the data distribution over time that can negatively impact the performance of a machine learning model. It occurs when, after deployment, the data used for inference differs from the data the model was trained on, causing the model's predictions to become less accurate.
  • For example, imagine you build a machine learning model to predict house prices based on historical data, including features such as square footage, number of bedrooms, location, and year built. The model is trained on data from 2010 to 2020. During training, the model learns patterns in the data, such as the average price per square foot in different neighborhoods and how specific features affect the price. Over time, several factors change in the housing market, such as an economic downturn, changes in local employment rates, or a new housing development that alters demand in certain neighborhoods. As a result, the data seen at inference time no longer matches the distribution the model was trained on, and its price predictions become less accurate.
  • Concept drift: Concept drift refers to the phenomenon where the underlying relationship between input data and the target output changes over time. This can lead to a decline in the performance of a machine learning model, as the model may not accurately predict outcomes based on the new patterns in the data that were not present during training.
  • For example, suppose you develop a machine learning model to predict customer churn for a subscription-based service. The model is trained on historical data, which includes features like customer age, subscription length, usage patterns, and customer support interactions. During training, the model learns that older customers with longer subscription lengths and lower usage are more likely to churn. Over time, the service introduces new features that appeal to younger customers, leading to a shift in customer behavior. Younger customers begin subscribing at a higher rate, and usage patterns change: they may prefer shorter subscription plans with flexible cancellation options. The relationship the model learned between these features and churn no longer holds, so its predictions degrade.
  • Edge device: An edge device is a piece of hardware that processes data locally, closer to where it is generated, rather than sending it to a centralized server or cloud. It can perform tasks like collecting data, running AI models, or controlling systems, often used in IoT, manufacturing, and automation.
  • Data-centric approach:
  • Model-centric approach:
  • MLOps:
  • Real-time predictions or batch predictions:
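
The two kinds of drift defined above can be monitored with simple checks. The sketch below is one possible approach, not a standard recipe. It uses the population stability index (PSI) to compare a feature's training-time and live distributions, and a drop in accuracy on freshly labeled data as a symptom of concept drift (data drift can cause the same symptom). The `0.2` PSI threshold is a commonly quoted rule of thumb, and the `0.05` accuracy tolerance and all sample data are assumptions for illustration.

```python
import math


def psi(expected, actual, bins=10):
    """Population stability index between training-time and live feature values."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if all values are equal

    def proportions(values):
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1  # clamp out-of-range values
        # A small floor avoids log(0) for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def concept_drift_suspected(baseline_acc, y_true, y_pred, tolerance=0.05):
    """Flag possible concept drift when live accuracy drops below the baseline."""
    return accuracy(y_true, y_pred) < baseline_acc - tolerance


# Data drift: square footage seen in production shifts upward by 500.
train_sqft = [1000 + 10 * i for i in range(100)]
live_sqft = [v + 500 for v in train_sqft]
print("data drift suspected:", psi(train_sqft, live_sqft) > 0.2)

# Concept drift: accuracy on newly labeled churn data vs. a 0.9 baseline.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
y_pred = [1, 0, 0, 1, 1, 1, 1, 0, 0, 1]
print("concept drift suspected:", concept_drift_suspected(0.9, y_true, y_pred))
```

In practice both checks would run on a schedule over recent production data, with an alert triggering error analysis or retraining.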

Libraries:

  • TFX
  • TensorFlow
  • Keras
  • PyTorch

Diagrams for better understanding

image
