TLCBench

Benchmark scripts for TVM

Requirement

Tested with
TVM commit id: 91e07e1f3a7 (Feb. 5, 2021)
mxnet==1.7.0
gluonnlp==0.10.0

Intel CPU

Results on AWS c5.9xlarge (Intel Xeon Platinum 8124M @ 3.00 GHz, 18 cores)

  • AutoTVM
-------------------------------------------------------------
Network Name       Batch size   Mean Inference Time (std dev)
-------------------------------------------------------------
resnet_50          1            5.40 ms             (0.08 ms)
mobilenet_v2       1            1.33 ms             (0.05 ms)
bert               1            31.31 ms            (0.11 ms)
-------------------------------------------------------------
  • AutoScheduler
-------------------------------------------------------------
Network Name       Batch size   Mean Inference Time (std dev)
-------------------------------------------------------------
resnet_50          1            5.30 ms             (0.05 ms)
mobilenet_v2       1            0.91 ms             (0.02 ms)
bert               1            16.52 ms            (0.16 ms)
-------------------------------------------------------------
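To compare the two tuners at a glance, the mean latencies in the tables above can be turned into speedup ratios. The following is a minimal Python sketch with the numbers copied from the tables; the variable names are illustrative and not part of TLCBench.

```python
# Mean inference times (ms) copied from the c5.9xlarge tables above.
autotvm_ms = {"resnet_50": 5.40, "mobilenet_v2": 1.33, "bert": 31.31}
autoscheduler_ms = {"resnet_50": 5.30, "mobilenet_v2": 0.91, "bert": 16.52}

# Speedup of AutoScheduler over AutoTVM (a ratio above 1 means AutoScheduler is faster).
speedup = {name: autotvm_ms[name] / autoscheduler_ms[name] for name in autotvm_ms}

for name, ratio in speedup.items():
    print(f"{name:<15} {ratio:.2f}x")
```

On this machine the gap is small for resnet_50 and largest for bert.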

Benchmark All Networks

The following commands read pre-tuned logs from the saved_logs/latest directory and benchmark the latency of all networks.

  • Commands for AutoTVM
python3 benchmark_autotvm.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
  • Commands for AutoScheduler
python3 benchmark_autoscheduler.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest

Benchmark One Network

The following commands read pre-tuned logs from the saved_logs/latest directory and benchmark the latency of a single network. You can replace "resnet_50" below with "mobilenet_v2" or "bert".

  • Commands for AutoTVM
python3 benchmark_autotvm.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
  • Commands for AutoScheduler
python3 benchmark_autoscheduler.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m" --logdir saved_logs/latest
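If you want to benchmark the networks one at a time from a script, the commands above can be assembled programmatically. This is a minimal sketch: the script names, flags, and target string come from the commands above, while the helper `benchmark_command` and the constants are hypothetical conveniences that are not part of TLCBench.

```python
import shlex

# Script names and flags are taken from the commands above; the helper itself
# is a hypothetical convenience, not part of TLCBench.
SCRIPTS = {
    "autotvm": "benchmark_autotvm.py",
    "autoscheduler": "benchmark_autoscheduler.py",
}
CPU_TARGET = "llvm -mcpu=skylake-avx512 -model=platinum-8124m"


def benchmark_command(tuner, network, target=CPU_TARGET, logdir="saved_logs/latest"):
    """Return the argv list for benchmarking one network with one tuner."""
    return ["python3", SCRIPTS[tuner], "--network", network,
            "--target", target, "--logdir", logdir]


for network in ("resnet_50", "mobilenet_v2", "bert"):
    # shlex.join quotes the target so the line can be pasted into a shell.
    print(shlex.join(benchmark_command("autotvm", network)))
```

Each returned list can be passed to `subprocess.run(argv, check=True)` to actually run the benchmark; the sketch only prints the commands.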

Tuning

The following commands auto-tune one or all networks and save the tuning logs to the tmp_logs directory. After tuning, you can benchmark with these logs by running the benchmark commands above with the last argument replaced by --logdir tmp_logs.

  • Commands for AutoTVM
# Tune one network
python3 tune_autotvm.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"
# Tune all networks
python3 tune_autotvm.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"
  • Commands for AutoScheduler
# Tune one network
python3 tune_autoscheduler.py --network resnet_50 --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"
# Tune all networks
python3 tune_autoscheduler.py --network all --target "llvm -mcpu=skylake-avx512 -model=platinum-8124m"
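The tune-then-benchmark workflow described above can be sketched as a pair of commands in which the benchmark step reads from tmp_logs instead of saved_logs/latest. The script names and flags are the ones used in this README; the helper `tune_then_benchmark` is hypothetical and only builds the argument lists.

```python
# A hypothetical helper that pairs each tuning command with the benchmark
# command that consumes its logs; script names and flags come from this README.
TARGET = "llvm -mcpu=skylake-avx512 -model=platinum-8124m"


def tune_then_benchmark(tuner, network, target=TARGET):
    """Return [tune_argv, benchmark_argv]; tuner is "autotvm" or "autoscheduler"."""
    tune = ["python3", f"tune_{tuner}.py", "--network", network, "--target", target]
    # Tuning logs are saved to tmp_logs, so the benchmark step reads from there.
    bench = ["python3", f"benchmark_{tuner}.py", "--network", network,
             "--target", target, "--logdir", "tmp_logs"]
    return [tune, bench]


steps = tune_then_benchmark("autoscheduler", "resnet_50")
for argv in steps:
    print(argv)  # pass each list to subprocess.run(argv, check=True) to execute it
```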

Nvidia GPU

Results on AWS g4dn.4xlarge (NVIDIA T4)

  • AutoTVM
-------------------------------------------------------------
Network Name       Batch size   Mean Inference Time (std dev)
-------------------------------------------------------------
resnet_50          1            3.54 ms             (0.02 ms)
mobilenet_v2       1            0.74 ms             (0.00 ms)
bert               1            89.06 ms            (1.22 ms)
-------------------------------------------------------------
  • AutoScheduler
-------------------------------------------------------------
Network Name       Batch size   Mean Inference Time (std dev)
-------------------------------------------------------------
resnet_50          1            2.90 ms             (0.01 ms)
mobilenet_v2       1            0.57 ms             (0.00 ms)
bert               1            9.95 ms             (0.01 ms)
-------------------------------------------------------------
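To post-process results in the table layout shown above, the rows can be parsed into numbers. This is a minimal sketch that assumes the exact text layout of the tables in this README; whether the benchmark scripts print precisely this layout is an assumption, and `parse_results` is a hypothetical helper.

```python
import re

# One data row: name, batch size, mean in ms, std dev in ms (layout as shown above).
ROW = re.compile(r"(\S+)\s+(\d+)\s+([\d.]+) ms\s+\(([\d.]+) ms\)")


def parse_results(text):
    """Return {network: (batch_size, mean_ms, std_ms)} from a results table."""
    results = {}
    for line in text.splitlines():
        match = ROW.fullmatch(line.strip())
        if match:  # header and separator lines do not match and are skipped
            name, batch, mean, std = match.groups()
            results[name] = (int(batch), float(mean), float(std))
    return results


# The AutoScheduler table for the T4, copied from above.
table = """\
-------------------------------------------------------------
Network Name       Batch size   Mean Inference Time (std dev)
-------------------------------------------------------------
resnet_50          1            2.90 ms             (0.01 ms)
mobilenet_v2       1            0.57 ms             (0.00 ms)
bert               1            9.95 ms             (0.01 ms)
-------------------------------------------------------------
"""
parsed = parse_results(table)
print(parsed["bert"])
```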

Benchmark All Networks

The following commands read pre-tuned logs from the saved_logs/latest directory and benchmark the latency of all networks.

  • Commands for AutoTVM
python3 benchmark_autotvm.py --network all --target "cuda -model=t4" --logdir saved_logs/latest
  • Commands for AutoScheduler
python3 benchmark_autoscheduler.py --network all --target "cuda -model=t4" --logdir saved_logs/latest

Benchmark One Network

The following commands read pre-tuned logs from the saved_logs/latest directory and benchmark the latency of a single network. You can replace "resnet_50" below with "mobilenet_v2" or "bert".

  • Commands for AutoTVM
python3 benchmark_autotvm.py --network resnet_50 --target "cuda -model=t4" --logdir saved_logs/latest
  • Commands for AutoScheduler
python3 benchmark_autoscheduler.py --network resnet_50 --target "cuda -model=t4" --logdir saved_logs/latest

Tuning

The following commands auto-tune one or all networks and save the tuning logs to the tmp_logs directory. After tuning, you can benchmark with these logs by running the benchmark commands above with the last argument replaced by --logdir tmp_logs.

  • Commands for AutoTVM
# Tune one network
python3 tune_autotvm.py --network resnet_50 --target "cuda -model=t4"
# Tune all networks
python3 tune_autotvm.py --network all --target "cuda -model=t4"
  • Commands for AutoScheduler
# Tune one network
python3 tune_autoscheduler.py --network resnet_50 --target "cuda -model=t4"
# Tune all networks
python3 tune_autoscheduler.py --network all --target "cuda -model=t4"
