barrosm / end-to-end-automl-insurance

This project forked from kennethleungty/end-to-end-automl-insurance

End-to-End AutoML with H2O, MLflow, FastAPI, and Streamlit for Insurance Cross-Sell

Link to writeup: https://towardsdatascience.com/end-to-end-automl-train-and-serve-with-h2o-mlflow-fastapi-and-streamlit-5d36eedfe606

Overview - Business Aspect

  • Cross-selling in insurance is the practice of promoting products that are complementary to the policies that existing customers already own.
  • The goal of cross-selling is to create a win-win situation where customers can obtain comprehensive protection at a lower bundled cost, while insurers can boost revenue through enhanced policy conversions.
  • The aim of this project is to build a predictive ML pipeline (on the Health Insurance Cross-Sell dataset) to identify health insurance customers who are interested in purchasing additional vehicle insurance, in a bid to make cross-sell campaigns more efficient and targeted.

Overview - Technical Aspect

  • Traditional machine learning (ML) model development is time-consuming, resource-intensive, and requires a high degree of technical expertise along with many lines of code.
  • This model development process has been accelerated with the advent of automated machine learning (AutoML), allowing teams to generate performant and scalable models efficiently.
  • Beyond model development, a production-ready ML system has multiple other components, each of which requires substantial work.
  • In this comprehensive guide, we explore how to set up, train, and serve an ML system using the powerful capabilities of H2O AutoML, MLflow, FastAPI, and Streamlit to build an insurance cross-sell prediction model.

Objective

  • Make cross-selling more efficient and targeted by building a predictive ML pipeline to identify health insurance customers interested in purchasing additional vehicle insurance.

Pipeline Components

  • Data Acquisition and Preprocessing
  • H2O AutoML training with MLflow tracking
  • Deployment of best H2O model via FastAPI
  • Streamlit user interface to post test data to FastAPI endpoint
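
The last step of the pipeline, posting test data to the FastAPI endpoint, could look roughly like the sketch below. It is a minimal illustration using only the Python standard library, not the project's actual client code. The endpoint path /predict, the {"records": [...]} payload shape, and the field names in the sample record are all assumptions made for this example. The README only tells us that the server is started on port 8000.

```python
import json
from urllib import request

# Hypothetical endpoint: the README gives the port (8000) but not the route.
PREDICT_URL = "http://localhost:8000/predict"


def build_predict_request(records, url=PREDICT_URL):
    """Wrap a list of customer records in a JSON POST request (not yet sent)."""
    body = json.dumps({"records": records}).encode("utf-8")
    return request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_records(records, url=PREDICT_URL, timeout=10):
    """Send the request and decode the JSON response returned by the server."""
    req = build_predict_request(records, url)
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Illustrative record: the field names are placeholders, not a confirmed schema.
sample = [{"Age": 44, "Vehicle_Damage": "Yes"}]
req = build_predict_request(sample)
```

Building the request separately from sending it makes the payload easy to inspect without a running server. Calling post_records(sample) would only succeed once the FastAPI app is up and exposes a matching route.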

UI Demo

[Image: Streamlit UI demo; the gif and webm are in the /demo folder]


Project Files and Folders

  • /data - Folder containing the raw data, processed data and output data (predictions JSON file)
  • /demo - Folder containing the gif and webm of Streamlit UI demo
  • /submissions - Folder containing the CSV files for Kaggle submission to retrieve model accuracy scores
  • /utils - Folder containing Python scripts with helper functions
  • 01_EDA_and_Data_PreProcessing.ipynb - Notebook detailing the data acquisition, data cleaning and feature engineering steps
  • 02_XGBoost_Baseline_Model.ipynb - Notebook running the XGBoost baseline model for subsequent comparison
  • 03_H2O_AutoML_with_MLflow.ipynb - Notebook showing the full H2O AutoML training and MLflow tracking process, along with model inference to get predictions
  • train.py - Python script that runs H2O AutoML training with MLflow tracking. For example, run it from the CLI with: python train.py --target 'Response'
  • main.py - Python script that selects the best H2O model and serves it as a FastAPI endpoint. For example, run it from the CLI with: uvicorn main:app --host=0.0.0.0 --port=8000
  • ui.py - Python script for the Streamlit web app, which connects to the FastAPI endpoint for model inference. For example, run it from the CLI with: streamlit run ui.py
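
The train.py command above passes the target column through a --target flag. A minimal sketch of how such a flag could be parsed with argparse follows. It is not the script's actual implementation: only --target comes from the README, and the --max-models option with its default of 10 is a hypothetical addition shown purely for illustration.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line options for a training run."""
    parser = argparse.ArgumentParser(
        description="Train H2O AutoML with MLflow tracking"
    )
    # --target appears in the README's example command.
    parser.add_argument("--target", required=True, help="Name of the target column")
    # Hypothetical option (not from the README), included to show a typed default.
    parser.add_argument(
        "--max-models", type=int, default=10, help="Upper bound on models to train"
    )
    return parser.parse_args(argv)


# Equivalent to: python train.py --target 'Response'
args = parse_args(["--target", "Response"])
print(args.target, args.max_models)
```

Passing argv explicitly keeps the parser testable. When the function is called with no argument, argparse reads the real command line instead.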
