
This project forked from alibaba/bladedisc


BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

License: Apache License 2.0


BladeDISC Introduction

Overview

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads and one of the key components of Alibaba's PAI-Blade. BladeDISC provides general, transparent, and easy-to-use performance optimization for TensorFlow/PyTorch workloads on GPGPU and CPU backends. The architecture natively supports dynamic shape workloads, with careful attention to performance in both static and dynamic shape scenarios. It also supports multiple, flexible deployment solutions, including Plugin Mode inside the TensorFlow/PyTorch runtime and Standalone Mode for AOT standalone execution. The project is based on MLIR and closely related to the mlir-hlo project.

Refer to our website for more information, including the setup tutorial, developer guide, demo examples, and developer documentation.

Features and Roadmap

Frontend Framework Support Matrix

|           | TensorFlow [1] | PyTorch [2] |
|-----------|----------------|-------------|
| Inference | Yes            | Yes         |
| Training  | Yes [3]        | Ongoing     |

[1] TensorFlow 1.12, 1.15, 2.4 & 2.5 are supported and fully verified. For other versions, some slight adaptation work may be needed.

[2] PyTorch versions satisfying 1.6.0 <= version < 1.9.0 have been fully verified.

[3] Although training is supported, there is still much room for improvement in Op coverage for training workloads.
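The verified range in note [2] is a half-open interval; a small illustrative helper (hypothetical, not part of BladeDISC) makes the boundary behavior explicit:

```python
# Illustrative only: check whether a PyTorch version string falls in the
# verified range [1.6.0, 1.9.0) from note [2] above.
def in_verified_range(version):
    major, minor = (int(p) for p in version.split(".")[:2])
    return (1, 6) <= (major, minor) < (1, 9)

print(in_verified_range("1.8.1"))  # True: inside the verified range
print(in_verified_range("1.9.0"))  # False: the upper bound is exclusive
```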

Backend Support Matrix

| Backend    | Status  |
|------------|---------|
| Nvidia GPU | Yes     |
| AMD GPU    | Ongoing |
| Hygon DCU  | Yes     |
| X86        | Yes     |
| AArch64    | Yes     |

Deployment Solutions

  • Plugin Mode - BladeDISC works as a plugin of TensorFlow or PyTorch. Only the supported ops are clustered and compiled; the unsupported ones are executed by the original TensorFlow or PyTorch runtime. We recommend this mode to most users for its transparency and ease of use.

  • Standalone Mode - In Standalone Mode, the input workload is compiled into a binary that can be executed by itself, i.e., it does not rely on a TensorFlow or PyTorch runtime. In this mode, all ops must be supported.
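The clustering idea behind Plugin Mode can be sketched as follows; this is an illustrative partitioning routine, not BladeDISC's actual implementation or API:

```python
# Sketch of Plugin Mode clustering (illustrative, not BladeDISC's API):
# consecutive supported ops are grouped into clusters for the compiler,
# while unsupported ops fall back to the framework runtime.
def cluster_supported_ops(ops, supported):
    clusters, current = [], []
    for op in ops:
        if op in supported:
            current.append(op)  # extend the current compilable cluster
        elif current:
            clusters.append(current)  # an unsupported op ends the cluster
            current = []
    if current:
        clusters.append(current)
    return clusters

# "custom_op" is a hypothetical unsupported op that splits the graph
print(cluster_supported_ops(
    ["matmul", "relu", "custom_op", "add"],
    {"matmul", "relu", "add"},
))  # [['matmul', 'relu'], ['add']]
```

The unsupported op stays on the framework runtime, which is why Plugin Mode is transparent to the user but Standalone Mode requires full op coverage.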

Performance Numbers on Typical Workloads

Evaluated on a set of typical machine learning workloads used in production, BladeDISC shows up to 8.66x speedup compared with TensorFlow/PyTorch. Moreover, compared to static optimizing compilers (e.g., XLA and TensorRT), DISC shows comparable or even better performance.

Fig.1 Performance speedup over the framework baseline. "Framework" means either TensorFlow or PyTorch. FastSpeech2 is a TensorFlow model; the others are PyTorch models. The static compiler for TensorFlow is XLA and that for PyTorch is TensorRT. Note that S2T and T5 have no TensorRT numbers because TensorRT produced incorrect results on them.

Advantage in Dynamic Shape Workloads

Specifically, for the BERT-large inference on T4 that we provide in the examples, static compiler optimization (XLA) shows severe performance degradation due to its compilation overhead, while DISC shows a 1.75x speedup.

|         | TensorFlow | XLA     | DISC   |
|---------|------------|---------|--------|
| Latency | 1.78 s     | 41.69 s | 1.02 s |
| Speedup | 1x         | -       | 1.75x  |
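The speedup row follows directly from the latency row; a quick check of the arithmetic, using the latencies quoted above:

```python
# Speedups relative to the TensorFlow baseline, from the latencies above.
latencies = {"TensorFlow": 1.78, "XLA": 41.69, "DISC": 1.02}  # seconds
baseline = latencies["TensorFlow"]
speedups = {name: round(baseline / t, 2) for name, t in latencies.items()}
print(speedups["DISC"])  # 1.75 -> the 1.75x figure in the table
```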

API QuickView

For TensorFlow Users

Only two lines of code are needed to enable BladeDISC in a native TensorFlow program:

```python
import numpy as np
import tensorflow as tf

# enable BladeDISC on the TensorFlow program
import blade_disc_tf as disc
disc.enable()

# construct the TensorFlow Graph and run it
g = tf.Graph()
with g.as_default():
    ...
    with tf.Session() as sess:
        sess.run(...)
```

For more information, please refer to QuickStart for TensorFlow Users.

For PyTorch Users

PyTorch users only need the following few lines of code to enable BladeDISC:

```python
import torch
import torch.nn as nn

import torch_blade

# construct a PyTorch Module
class MyModule(nn.Module):
    ...

module = MyModule()

with torch.no_grad():
    # blade_module is the module optimized by BladeDISC;
    # x and y are example inputs to the module
    blade_module = torch_blade.optimize(module, allow_tracing=True, model_inputs=(x, y))

# run the optimized module
blade_module(x, y)
```

`torch_blade.optimize` accepts an `nn.Module` object and returns the optimized module. For more information, please refer to Quickstart for PyTorch Users.

Setup and Examples

Publications

Tutorials and Documents for Developers

Presentations and Talks

How to Contribute

FAQ

Roadmap with mlir-hlo Project

BladeDISC has a close relationship with the mlir-hlo project. Part of its building blocks, including the MHLO op definitions, TF-to-MHLO conversions, and some general-purpose passes, have been upstreamed to the mlir-hlo repository. We will continue to cooperate closely with the mlir-hlo project over the longer term.

Contact Us

DingTalk

