
This project forked from microsoft/olive






OLive - ONNX Runtime Go Live

OLive, meaning ONNX Runtime(ORT) Go Live, is a python package that automates the process of accelerating models with ONNX Runtime(ORT). It contains two parts including model conversion to ONNX with correctness checking and auto performance tuning with ORT. Users can run these two together through a single pipeline or run them independently as needed.

Model conversion to ONNX

OLive simplifies the conversion experience from multiple frameworks to ONNX by integrating existing ONNX conversion tools into a single package and validating the converted models' correctness. Currently supported frameworks are PyTorch and TensorFlow.

  • TensorFlow: OLive supports converting TensorFlow models in saved model, frozen graph, and checkpoint formats. For frozen graph and checkpoint conversion, users need to provide the inputs' and outputs' names.
  • PyTorch: Users need to provide the inputs' names and shapes to convert a PyTorch model. For TorchScript PyTorch models, the outputs' names and shapes are also required.

Auto performance tuning with ORT

ONNX Runtime (ORT) is a high-performance inference engine for running ONNX models. It exposes many advanced tuning knobs that let users further optimize inference performance. OLive heuristically explores the optimization search space in ORT to select the best ORT settings for a specific model on specific hardware, and outputs the option combination with the best performance for latency or for throughput.
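The idea behind the search can be sketched as a loop over candidate setting combinations that keeps the fastest one. This is an illustrative placeholder, not OLive's actual implementation: the knob names mirror ORT session options, and the latency function is a stand-in for real timed inference runs.

```python
import itertools

# Hypothetical search space over ORT tuning knobs (names are illustrative).
search_space = {
    "execution_mode": ["sequential", "parallel"],
    "intra_op_num_threads": [1, 2, 4],
    "graph_optimization_level": ["basic", "all"],
}

def measure_latency(settings):
    # Placeholder: a real tuner would time inference runs with these settings.
    # Here we pretend more threads and more graph optimization mean lower latency.
    bonus = 5 if settings["graph_optimization_level"] == "all" else 0
    return 100 / settings["intra_op_num_threads"] - bonus

# Enumerate every combination and keep the one with the lowest latency.
candidates = [
    dict(zip(search_space, combo))
    for combo in itertools.product(*search_space.values())
]
best_settings = min(candidates, key=measure_latency)
```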

Optimization fields:

Getting Started

The OLive package can be installed with the command pip install onnxruntime_olive==0.5.0 -f https://olivewheels.azureedge.net/oaas/onnxruntime-olive . Supported Python versions: 3.7, 3.8, 3.9.

Users need to install CUDA and cuDNN dependencies for performance tuning with OLive on GPU. The table below shows the ORT version and the required CUDA and cuDNN versions in the latest OLive.

ONNX Runtime | CUDA | cuDNN
1.11.0       | 11.4 | 8.2

There are three ways to use OLive:

  1. Use With Command Line: Run OLive from the command line using Python.
  2. Use With Jupyter Notebook: Quickstart tutorial for OLive using a Jupyter Notebook.
  3. Use With OLive Server: Set up a local OLive server for model conversion, optimization, and visualization services.

Run inference on your model with the OLive auto performance tuning result

  1. Get the best tuning result with best_test_name, which includes inference session settings, environment variable settings, and the latency result.
  2. Set the related environment variables in your environment.
    • OMP_WAIT_POLICY
    • OMP_NUM_THREADS
    • KMP_AFFINITY
    • OMP_MAX_ACTIVE_LEVELS
    • ORT_TENSORRT_FP16_ENABLE
  3. Create an onnxruntime inference session with the related settings.
    • inter_op_num_threads
    • intra_op_num_threads
    • execution_mode
    • graph_optimization_level
    • execution_provider
    import onnxruntime as ort

    # Apply the session settings reported by the best tuning result.
    sess_options = ort.SessionOptions()
    sess_options.inter_op_num_threads = inter_op_num_threads
    sess_options.intra_op_num_threads = intra_op_num_threads
    sess_options.execution_mode = execution_mode
    sess_options.graph_optimization_level = ort.GraphOptimizationLevel(graph_optimization_level)

    # Create the session on the tuned execution provider.
    onnx_session = ort.InferenceSession(model_path, sess_options, providers=[execution_provider])
    
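Step 2 above can be sketched with os.environ. The values below are illustrative only and must come from your own tuning result; note that OpenMP variables generally need to be set before onnxruntime is imported in order to take effect.

```python
import os

# Illustrative values only; use the settings from your best_test_name result.
tuning_env = {
    "OMP_WAIT_POLICY": "ACTIVE",
    "OMP_NUM_THREADS": "4",
    "KMP_AFFINITY": "granularity=fine,compact,1,0",
    "OMP_MAX_ACTIVE_LEVELS": "1",
    "ORT_TENSORRT_FP16_ENABLE": "0",
}

# Apply the variables to the current process environment.
os.environ.update(tuning_env)
```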

Key Updates

10/28/2021

Updated OLive from docker-container-based usage to python-package-based usage for more flexibility.

Enabled more optimization options for performance tuning with ORT, including INT8 quantization, mixed precision in ORT-TensorRT, and transformer model optimization.

Contributing

We welcome your contributions to OLive. Please refer to CONTRIBUTING.md.

License

Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

