
Adlik: Toolkit for Accelerating Deep Learning Inference

License: Apache License 2.0


adlik's Introduction

Adlik


Adlik [ædlik] is an end-to-end optimizing framework for deep learning models. The goal of Adlik is to accelerate the deep learning inference process in both cloud and embedded environments.

Adlik schematic diagram

With the Adlik framework, different deep learning models can be deployed to different platforms with high performance in a flexible and easy way.

Using Adlik to Deploy Models in Cloud/Edge/Device

  1. In a cloud environment, the compiled model and the Adlik Inference Engine should be built into a docker image and deployed as a container.

  2. In an edge environment, the Adlik Inference Engine should be deployed as a container. The compiled model should be transferred to the edge environment, where the Adlik Inference Engine automatically updates and loads it.

  3. In a device environment, the Adlik Inference Engine and the compiled model should be compiled into a binary file (a shared object or library). Users who want to run model inference on the device should link the user-defined AI function and the Adlik binary into the executable and run it directly.

We tested the inference performance of Adlik on the same CPU or GPU using a simple CNN model (MNIST), ResNet-50, and Inception-V3 with different serving engines.
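
As a rough guide to reproducing such a comparison, the sketch below measures mean latency and throughput around a hypothetical infer() callable; infer is a placeholder for a single client inference request and is not part of Adlik's API:

    # Minimal latency/throughput sketch. `infer` is a hypothetical placeholder
    # for a single inference request against whichever serving engine is under test.
    import time

    def benchmark(infer, warmup=10, iterations=100):
        for _ in range(warmup):          # warm up caches and lazy initialization
            infer()
        start = time.perf_counter()
        for _ in range(iterations):
            infer()
        elapsed = time.perf_counter() - start
        print(f'mean latency: {elapsed / iterations * 1000:.2f} ms, '
              f'throughput: {iterations / elapsed:.1f} inferences/s')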

Contents

Model optimizer focuses on specific hardware and runs on it to achieve acceleration. It mainly consists of two categories of algorithm components, i.e. the pruner and the quantizer.

Model compiler supports several optimization technologies such as pruning, quantization, and structural compression, which can easily be applied to models developed with TensorFlow, Keras, PyTorch, etc.

Serving Engine provides deep learning models with an optimized runtime based on the deployment environment. Put simply, given a deep learning model, users of Adlik can optimize it with the model compiler and then deploy it to a target platform with the Adlik serving platform.

Getting Started

Docker images

All Adlik compiler and serving images are stored on Alibaba Cloud. These images can be downloaded and used directly; users do not need to build Adlik on Ubuntu. The compiler images can compile models from H5, CheckPoint, FrozenGraph, ONNX, and SavedModel formats to OpenVINO, TensorFlow, TensorFlow Lite, or TensorRT formats. The serving images can be used for model inference.

Docker pull command:

docker pull docker_image_name:tag

Compiler docker images

The compiler docker images can be used on CPU and GPU. On the CPU, you can compile a model from a source format to a TensorFlow, OpenVINO, or TensorFlow Lite model. On the GPU, you can compile a model from a source format to a TensorFlow or TensorRT model. The name and tag of the compiler image are shown below; the first half of the tag represents the TensorRT version, and the latter part represents the CUDA version:

registry.cn-beijing.aliyuncs.com/adlik/model-compiler:v1.0

Using the model compiler image to compile a model

  1. Run the image.

    docker run -it --rm -v source_model:/mnt/model \
        registry.cn-beijing.aliyuncs.com/adlik/model-compiler:v1.0 bash
  2. Configure the JSON file or environment variables required to compile the model.

    The config_schema.json file describes the JSON fields; for an example, see compiler_json_example.json. For the environment variable field descriptions, see env_field.txt; for an example, see compiler_env_example.txt. A scripted end-to-end example is sketched after this list.

    Note: For checkpoint models, the input and output op names must be given when compiling; other model formats can be compiled without them.

  3. Compile the model.

    Compilation instructions (json file mode):

    python3 "-c" "import json; import model_compiler as compiler; file=open('/mnt/model/serving_model.json','r');
    request = json.load(file);compiler.compile_model(request); file.close()"

    Compilation instructions (environment variable mode):

    python3 "-c" "import model_compiler.compiler as compiler;compiler.compile_from_env()"

Serving docker images

The serving docker images include CPU and GPU images. The tag of the OpenVINO image represents the OpenVINO version. For the TensorRT image, the first half of the tag represents the TensorRT version and the latter part represents the CUDA version. The names and tags of the serving images are as follows:

CPU:

registry.cn-beijing.aliyuncs.com/adlik/serving-tflite-cpu:v1.0

registry.cn-beijing.aliyuncs.com/adlik/serving-tensorflow-cpu:v1.0

registry.cn-beijing.aliyuncs.com/adlik/serving-openvino:v1.0

registry.cn-beijing.aliyuncs.com/adlik/serving-libtorch-cpu:v1.0

GPU:

registry.cn-beijing.aliyuncs.com/adlik/serving-tftrt-gpu:v1.0

registry.cn-beijing.aliyuncs.com/adlik/serving-tensorrt:v1.0

registry.cn-beijing.aliyuncs.com/adlik/serving-libtorch-gpu:v1.0

Using the serving images for model inference

  1. Run the image, making sure to map the service port out of the container.

    docker run -it --rm -p 8500:8500 -v compiled_model:/model \
        registry.cn-beijing.aliyuncs.com/adlik/serving-openvino:v1.0 bash
  2. Load the compiled model in the image and start the service.

    adlik-serving --grpc_port=8500 --http_port=8501 --model_base_path=/model
  3. Install the client wheel package (the adlik serving package or the adlik serving gpu package) locally, then execute the inference code to perform inference.

Note: If the service port is not mapped when you run the image, you need to install the adlik serving package or the adlik serving gpu package inside the container, then execute the inference code and perform inference inside the container.
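
Before running the client, you can verify from the host that the mapped service port is reachable. The check below is a minimal sketch using only the Python standard library; it assumes the -p 8500:8500 mapping shown in step 1:

    # Minimal sketch: check that the adlik-serving gRPC port is reachable from
    # the host. Assumes the -p 8500:8500 mapping from step 1; add 8501 here if
    # you also map the HTTP port.
    import socket

    for port in (8500,):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(2)
            status = sock.connect_ex(('localhost', port))
            print(f'port {port}:', 'open' if status == 0 else 'not reachable')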

Build

This guide is for building Adlik on Ubuntu systems.

First, install Git and Bazel.

Then, clone Adlik and change the working directory into the source directory:

git clone https://github.com/Adlik/Adlik.git
cd Adlik

Build clients

  1. Install the following packages:

    • python3-setuptools
    • python3-wheel
  2. Build clients:

    bazel build //adlik_serving/clients/python:build_pip_package -c opt
  3. Build pip package:

    mkdir /tmp/pip-packages && bazel-bin/adlik_serving/clients/python/build_pip_package /tmp/pip-packages

Build serving

First, install the following packages:

  • automake
  • libtbb2
  • libtool
  • make
  • python3-six

Build serving with OpenVINO runtime

  1. Install the openvino-<VERSION> package from OpenVINO.

  2. Assuming the installation path of OpenVINO is /opt/intel/openvino_<VERSION>, run the following command:

    export INTEL_CVSDK_DIR=/opt/intel/openvino_2022
    export InferenceEngine_DIR=$INTEL_CVSDK_DIR/runtime/cmake
    bazel build //adlik_serving \
        --config=openvino \
        -c opt

Build serving with TensorFlow CPU runtime

  1. Run the following command:

    bazel build //adlik_serving \
        --config=tensorflow-cpu \
        -c opt

Build serving with TensorFlow GPU runtime

Assume building with CUDA version 11.6.

  1. Install the following packages from the NVIDIA CUDA and TensorRT repositories:

    • cuda-nvprune-11-6
    • cuda-nvtx-11-6
    • cuda-cupti-dev-11-6
    • libcublas-dev-11-6
    • libcudnn8=*+cuda11.6
    • libcudnn8-dev=*+cuda11.6
    • libcufft-dev-11-6
    • libcurand-dev-11-6
    • libcusolver-dev-11-6
    • libcusparse-dev-11-6
    • libnvinfer8=8.4.*+cuda11.6
    • libnvinfer-dev=8.4.*+cuda11.6
    • libnvinfer-plugin8=8.4.*+cuda11.6
    • libnvinfer-plugin-dev=8.4.*+cuda11.6
  2. Run the following command:

    env TF_CUDA_VERSION=11.6 TF_NEED_TENSORRT=1 \
        bazel build //adlik_serving \
            --config=tensorflow-gpu \
            -c opt \
            --incompatible_use_specific_tool_files=false

Build serving with TensorFlow Lite CPU runtime

  1. Run the following command:

    bazel build //adlik_serving \
        --config=tensorflow-lite-cpu \
        -c opt

Build serving with TensorRT runtime

Assume building with CUDA version 11.6.

  1. Install the following packages from the NVIDIA CUDA and TensorRT repositories:

    • cuda-cupti-dev-11-6
    • cuda-nvml-dev-11-6
    • cuda-nvrtc-11-6
    • libcublas-dev-11-6
    • libcudnn8=*+cuda11.6
    • libcudnn8-dev=*+cuda11.6
    • libcufft-dev-11-6
    • libcurand-dev-11-6
    • libcusolver-dev-11-6
    • libcusparse-dev-11-6
    • libnvinfer8=8.4.*+cuda11.6
    • libnvinfer-dev=8.4.*+cuda11.6
    • libnvonnxparsers8=8.4.*+cuda11.6
    • libnvonnxparsers-dev=8.4.*+cuda11.6
  2. Run the following command:

    env TF_CUDA_VERSION=11.6 \
        bazel build //adlik_serving \
            --config=tensorrt \
            -c opt \
            --action_env=LIBRARY_PATH=/usr/local/cuda-11.6/lib64/stubs \
            --incompatible_use_specific_tool_files=false

Build serving with TF-TRT runtime

Assume building with CUDA version 11.6.

  1. Install the following packages from the NVIDIA CUDA and TensorRT repositories:

    • cuda-cupti-dev-11-6
    • libcublas-dev-11-6
    • libcudnn8=*+cuda11.6
    • libcudnn8-dev=*+cuda11.6
    • libcufft-dev-11-6
    • libcurand-dev-11-6
    • libcusolver-dev-11-6
    • libcusparse-dev-11-6
    • libnvinfer8=8.4.*+cuda11.6
    • libnvinfer-dev=8.4.*+cuda11.6
    • libnvinfer-plugin8=8.4.*+cuda11.6
    • libnvinfer-plugin-dev=8.4.*+cuda11.6
  2. Run the following command:

    env TF_CUDA_VERSION=11.6 TF_NEED_TENSORRT=1 \
        bazel build //adlik_serving \
            --config=tensorflow-tensorrt \
            -c opt \
            --incompatible_use_specific_tool_files=false

Build serving with TVM runtime

  1. Install the following packages:

    • build-essential
    • cmake
    • tvm
  2. Run the following command:

    bazel build //adlik_serving \
       --config=tvm \
       -c opt

Build in Docker

The ci/docker/build.sh script can be used to build a Docker image that contains all the requirements for building Adlik. You can build Adlik with this Docker image.

Note: If you build a runtime with GPU support in Docker, you need to add the CUDA environment variables to the Dockerfile, for example:

ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility

Release

Serving engine versions supported by Adlik:

  • Enflame 2.0
  • TensorFlow 2.10.1
  • OpenVINO 2022.3.0
  • TensorRT 8.4.3.1
  • PaddlePaddle 2.4.0

Supported source model frameworks: Keras, TensorFlow, PyTorch, PaddlePaddle, OneFlow, and OpenVINO.

License

Apache License 2.0


adlik's Issues

TfLite run-time should support ARM CPU

Consider two scenarios:

  • Use the QEMU ARM64 emulator.
  1. Provide a cross-compile tool-chain for ARM64.
  2. Compile the ServingLite framework and the TfLite runtime, and run an inference test with a TfLite model (ResNet-50).
  3. Considering the available optimization methods for TfLite models, the specification of the performance tests should be clear and cover as many scenarios as possible.
  • Support a real device such as a Raspberry Pi or an Nvidia Jetson Nano.
  1. Provide a device-specific cross-compile tool-chain.
  2. Compile the ServingLite framework and the TfLite runtime, and run a TfLite model on the device successfully.

Indefinite declaration in Readme.md

"Build serving with OpenVINO runtime
Install intel-openvino-ie-rt-core package from OpenVINO."

Is this an installation process or a building process?

Rewrite model compiler

Currently, there are some design flaws in the model compiler. We should come up with a new design.

Inaccurate description in "Build in Docker"

"You can build Adlik inside the Docker image."

An image is not a runtime, so you can't compile Adlik inside an image. There are two ways to use the image:

  1. Use the image as a base image.
  2. Run a container from the image.

Support FPGA runtime

The ServingLite runtime framework should support running deep learning models on FPGAs.

  1. Execute the instruction pipeline produced by the model compiler.
  2. Divide ops between those running on the CPU and those running on the FPGA.
  3. Run ops on the CPU.
  4. Call the FPGA interface when running ops on the FPGA.

Adlik Benchmark Test framework

Support benchmark tests:

  1. Test the performance of the different runtime frameworks that Adlik supports.
  2. Test the performance of the Adlik inference scheduler.

Bazel build does not support 2020 OpenVINO

In the readme.md

Build serving with OpenVINO runtime
Install intel-openvino-ie-rt-core package from OpenVINO.

If we install intel-openvino-runtime-ubuntu18-2020.1.023 and intel-openvino-dev-ubuntu18-2020.1.023 as instructed by OpenVINO, we cannot successfully build the serving.

Model Optimizer

Adlik should provide a framework to optimize trained deep learning models, such as TensorFlow checkpoints.
The optimization techniques include:

  1. Model quantization.
  2. Model pruning.

Add more CI tests for building on macOS

Add the following CI checks:

  • Build clients on macOS
  • Build serving on macOS
    • Build serving with OpenVINO runtime on macOS
    • Build serving with TensorFlow CPU runtime on macOS
    • Build serving with TensorFlow GPU runtime on macOS
    • Build serving with TensorRT runtime on macOS

Compiling an ONNX model to an OpenVINO or TensorRT model fails

When I compile an ONNX model to an OpenVINO or TensorRT model, the error is as follows:
TypeError: expected str, bytes or os.PathLike object, not NoneType {'status': 'failure', 'error_msg': 'expected str, bytes or os.PathLike object, not NoneType'}
But there is an ONNX model in the input_model directory.
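
Since the TypeError indicates that a required path resolved to None, a quick sanity check of the JSON request may help narrow this down. This is only a sketch and assumes the JSON-file configuration described in the Getting Started section:

    # Sketch: list top-level request values that are None, since the TypeError
    # suggests a required path resolved to None. Assumes the JSON-file
    # configuration described in the Getting Started section.
    import json

    with open('/mnt/model/serving_model.json') as file:
        request = json.load(file)

    for key, value in request.items():
        if value is None:
            print('missing value for:', key)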

The bug of model_compiler

There is a bug on line 326 of model_loader.py in model_compiler:

if len(self.input_formats) < len(self.input_names):
    self.input_formats.extend([None for _ in range(len(self.input_formats), len(self.input_formats))])
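
The range above starts and ends at len(self.input_formats), so the extend call never adds anything. A presumed fix, assuming the intent is to pad input_formats with None up to the number of input names, would be:

    # Presumed intent (assumption): pad input_formats with None so that it has
    # one entry per input name.
    if len(self.input_formats) < len(self.input_names):
        self.input_formats.extend([None] * (len(self.input_names) - len(self.input_formats)))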
