SageMaker CatBoost Multi-Model Endpoint

This repo shows how to use a custom container to host multiple CatBoost models on a SageMaker Multi-Model Endpoint.

The catboost-mme.ipynb notebook contains the steps to build the custom image and push it to ECR, deploy the SageMaker endpoint, and run inference against the Multi-Model Endpoint.
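The inference step can be sketched roughly as follows. This is a minimal sketch rather than the notebook's actual code: the endpoint name, the artifact name and the CSV payload format are assumptions made for illustration. It relies on the `TargetModel` parameter of the `sagemaker-runtime` `invoke_endpoint` call, which tells a Multi-Model Endpoint which artifact under its S3 prefix should serve the request.

```python
def build_invoke_request(endpoint_name, target_model, rows):
    """Build keyword arguments for sagemaker-runtime invoke_endpoint.

    target_model is the artifact path relative to the S3 prefix the
    endpoint was created with. The CSV body format is an assumption.
    """
    body = "\n".join(",".join(str(v) for v in row) for row in rows)
    return {
        "EndpointName": endpoint_name,
        "TargetModel": target_model,
        "ContentType": "text/csv",
        "Body": body.encode("utf-8"),
    }


def invoke(runtime_client, endpoint_name, target_model, rows):
    """Call the endpoint and return the decoded response body."""
    response = runtime_client.invoke_endpoint(
        **build_invoke_request(endpoint_name, target_model, rows)
    )
    return response["Body"].read().decode("utf-8")


# Usage with hypothetical endpoint and artifact names:
#   import boto3
#   client = boto3.client("sagemaker-runtime")
#   print(invoke(client, "catboost-mme", "model-a.tar.gz", [[1.0, 2.0, 3.0]]))
```

Passing the client in as an argument keeps the helpers free of a hard boto3 dependency, so they can be exercised with a stub before an endpoint exists.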

The container folder contains the files needed for the custom image.

├── container
│   ├── dockerd-entrypoint.py
│   ├── Dockerfile
│   └── model_handler.py
  • dockerd-entrypoint.py is the entry point script that starts the multi-model server.
  • Dockerfile contains the container definition used to build the image, including the packages that need to be installed.
  • model_handler.py contains the logic to load a model and run inference.
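The shape of such a handler can be sketched as below. This is an illustrative sketch, not the repo's model_handler.py: the method names mirror the ones in the profiling table further down, while the artifact file name `model.cbm`, the CSV input format, the batch size of 1 and the `loader` argument are assumptions. The `loader` hook is hypothetical and exists only so the class can be tried without CatBoost installed.

```python
import os


class ModelHandler:
    """Sketch of a Multi Model Server custom handler for one CatBoost model."""

    def __init__(self, loader=None):
        self.initialized = False
        self.model = None
        # Hypothetical injection point; defaults to loading with CatBoost.
        self._loader = loader or self._load_catboost

    @staticmethod
    def _load_catboost(model_path):
        from catboost import CatBoost  # lazy import, needed only when serving

        model = CatBoost()
        model.load_model(model_path)
        return model

    def initialize(self, context):
        # The server exposes the extracted model directory here.
        model_dir = context.system_properties.get("model_dir")
        # "model.cbm" is an assumed artifact file name.
        self.model = self._loader(os.path.join(model_dir, "model.cbm"))
        self.initialized = True

    def preprocess(self, request):
        # Assumes CSV payloads; each request item carries it under "body" or "data".
        rows = []
        for item in request:
            payload = item.get("body") or item.get("data")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            for line in payload.strip().splitlines():
                rows.append([float(v) for v in line.split(",")])
        return rows

    def inference(self, model_input):
        return self.model.predict(model_input)

    def postprocess(self, inference_output):
        # Assumes batch size 1: one response entry for the single request.
        return [",".join(str(p) for p in inference_output)]

    def handle(self, data, context):
        return self.postprocess(self.inference(self.preprocess(data)))


_service = ModelHandler()


def handle(data, context):
    """Module-level entry point the model server calls for every request."""
    if not _service.initialized:
        _service.initialize(context)
    if data is None:
        return None
    return _service.handle(data, context)
```

Loading the model lazily in `initialize` rather than in `__init__` matches the profiling table, where `initialize` dominates the first call and does not run again afterwards.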

Benchmarking and load testing:

Load tests

All tests were conducted on a single ml.m5.xlarge instance.

1) Uncompressed 569 KB model, in-memory test: ~460 TPS

End to end:

Response time percentiles (approximated)
Type                   Name                                 50%     66%     75%     80%     90%     95%     98%     99%   99.9%  99.99%    100%  # reqs
---------------------  --------------------------------  ------  ------  ------  ------  ------  ------  ------  ------  ------  ------  ------  ------
custom_protocol_boto3  sagemaker_client_invoke_endpoint      30      32      34      35      39      43      50      56      85     280    2100  137879

Model and Overhead Latency (p99) and Invocations (Sum), 1-minute period (figure: metric1)

2) Uncompressed 70 MB model, in-memory test: ~238 TPS

End to end:

Response time percentiles (approximated)
Type                   Name                                 50%     66%     75%     80%     90%     95%     98%     99%   99.9%  99.99%    100%  # reqs
---------------------  --------------------------------  ------  ------  ------  ------  ------  ------  ------  ------  ------  ------  ------  ------
custom_protocol_boto3  sagemaker_client_invoke_endpoint      59      64      67      69      75      80      87      93     220     940    1000   71230

Model and Overhead Latency (p99) and Invocations (Sum), 1-minute period (figure: metric1)
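Throughput and percentile figures of this kind could be gathered with a small harness along the following lines. This is a standard-library sketch and not the tooling used for the results above. The `send` callable, the request count and the concurrency level are placeholders, and the nearest-rank percentile used here may differ slightly from the approximated percentiles reported in the tables.

```python
import math
import time
from concurrent.futures import ThreadPoolExecutor


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an ascending, non-empty list (pct in 0..100)."""
    rank = max(1, math.ceil(len(sorted_values) * pct / 100))
    return sorted_values[rank - 1]


def run_load_test(send, total_requests, concurrency):
    """Call send() total_requests times from a thread pool.

    Returns throughput in requests per second and latencies in milliseconds.
    """
    def timed_call(_):
        start = time.perf_counter()
        send()
        return (time.perf_counter() - start) * 1000.0

    wall_start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = sorted(pool.map(timed_call, range(total_requests)))
    elapsed = time.perf_counter() - wall_start

    return {
        "tps": total_requests / elapsed,
        "p50": percentile(latencies, 50),
        "p95": percentile(latencies, 95),
        "p99": percentile(latencies, 99),
        "max": latencies[-1],
    }
```

In practice `send` would wrap a single `invoke_endpoint` call, so the measured latency is end to end from the client's point of view.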

Code profiling (Big model)

Function           Initial run time (ms)   Subsequent run time (ms)
-----------------  ----------------------  ------------------------
perf __init__      0.000953674             -
perf initialize    258.2206726             -
perf handle_out    0.001907349             0.00166893
perf preprocess    0.005245209             0.005483627
perf inference     20.75648308             3.942251205
perf postprocess   0.031471252             0.021219254
perf handle in     32.42993355             12.28523254
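Per-function timings like these can be produced with a small decorator. The sketch below is one way to do it and is not taken from the repo: the `perf` decorator name, the `timings` list and the log line format are assumptions chosen to resemble the table.

```python
import functools
import time

# Collected (function name, elapsed milliseconds) pairs, in call order.
timings = []


def perf(func):
    """Record and print the wall-clock run time of func in milliseconds."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            timings.append((func.__name__, elapsed_ms))
            print(f"perf {func.__name__} {elapsed_ms:.9f}")
    return wrapper


@perf
def preprocess(payload):
    """Stand-in for a handler method: parses one CSV line into floats."""
    return [float(v) for v in payload.split(",")]
```

Calling a decorated function twice and comparing the two entries in `timings` gives the same initial versus subsequent split shown in the table.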
