AccelTran: A Sparsity-Aware Monolithic 3D Accelerator for Transformer Architectures at Scale

Published in IEEE TCAD, 2023. License: BSD 3-Clause "New" or "Revised" License.

AccelTran is a tool to simulate a design space of accelerators on diverse flexible and heterogeneous transformer architectures supported by the FlexiBERT 2.0 framework at jha-lab/txf_design-space.

The figure below shows the utilization of different modules in an AccelTran architecture for the BERT-Tiny transformer model.

AccelTran GIF

Table of Contents

  • Environment setup
  • Run synthesis
  • Run pruning
  • Run simulator
  • Developer
  • Cite this work
  • License

Environment setup

Clone this repository and initialize sub-modules

git clone https://github.com/JHA-Lab/acceltran.git
cd ./acceltran/
git submodule init
git submodule update

Setup python environment

The Python environment setup is based on conda. The script below creates a new environment named txf_design-space:

source env_setup.sh

For pip installation, we are creating a requirements.txt file. Stay tuned!

Run synthesis

Synthesis scripts use Synopsys Design Compiler. All hardware modules are implemented in SystemVerilog in the directory synthesis/top.

To get area and power consumption reports for each module, use the following command:

cd ./synthesis/
dc_shell -f 14nm_sg.tcl -x "set top_module <M>"
cd ..

Here, <M> is the module to be synthesized: mac_lane, ln_forward_<T> (for layer normalization), softmax_<T>, etc., where <T> is the tile size (8, 16, or 32).
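When scripting synthesis runs over every variant, the per-tile-size module names above can be enumerated programmatically. The sketch below is only an illustration of that naming scheme; mac_lane, ln_forward_<T>, and softmax_<T> are the names confirmed by the text, and any other modules would need to be added by hand:

```python
# Enumerate synthesizable module names as described above: mac_lane has no
# tile-size suffix, while ln_forward and softmax come in 8/16/32-tile variants.
# Each resulting name would be passed to dc_shell via: -x "set top_module <M>"
tiled = [f"{m}_{t}" for m in ("ln_forward", "softmax") for t in (8, 16, 32)]
modules = ["mac_lane"] + tiled

for m in modules:
    print(m)
```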

All output reports are stored in synthesis/reports.

To run the synthesis for the DMA module, run the following command instead:

cd ./synthesis/
dc_shell -f dma.tcl 

Run pruning

To get the sparsity in activations and weights in an input transformer model and its corresponding performance on the GLUE benchmark, use the dynamic pruning model: DP-BERT.

To test the effect of different sparsity ratios on the model performance on the SST-2 benchmark, use the following script:

cd ./pruning/
python3 run_evaluation.py --task sst2 --max_pruning_threshold 0.1
cd ..

The script uses a weight-pruned model, so the weights are not pruned further. To also prune the weights with a pruning_threshold, use the flag --prune_weights.
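For intuition, dynamic magnitude pruning zeroes out values whose magnitude falls below the threshold, and the resulting sparsity is what the accelerator exploits. The minimal NumPy sketch below illustrates the idea; it is not the DP-BERT implementation, and the function name is hypothetical:

```python
import numpy as np

def prune_low_magnitude(x, threshold):
    # Keep entries with |x| >= threshold; zero out the rest.
    mask = np.abs(x) >= threshold
    # Return the pruned tensor and the resulting sparsity ratio.
    return x * mask, 1.0 - mask.mean()

acts = np.array([0.05, -0.2, 0.5, -0.01])
pruned, sparsity = prune_low_magnitude(acts, threshold=0.1)
# pruned -> [0.0, -0.2, 0.5, 0.0]; sparsity -> 0.5
```

A higher threshold prunes more values, trading model accuracy for sparsity, which is exactly the trade-off run_evaluation.py measures on SST-2.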

Run simulator

AccelTran supports a diverse range of accelerator hyperparameters. It also supports all ~10^88 models in the FlexiBERT 2.0 design space.

To specify the configuration of an accelerator's architecture, use a configuration file in the simulator/config directory. Example configuration files are provided for accelerators optimized for BERT-Nano and BERT-Tiny. Accelerator hardware configuration files should conform to the design space specified in simulator/design_space/design_space.yaml.

To specify the transformer model parameters, use a model dictionary file in simulator/model_dicts. Model dictionaries for BERT-Nano and BERT-Tiny have already been provided for convenience.
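As a rough sketch of what such a model dictionary could encode: the key names below are assumptions (the actual schema in simulator/model_dicts/bert_tiny.json may differ), while the dimensions are the standard BERT-Tiny ones (2 encoder layers, hidden size 128, 2 attention heads):

```python
import json

# Hypothetical model-dictionary contents for BERT-Tiny; the key names are
# illustrative only and may not match the repository's actual schema.
bert_tiny = {
    "model": "bert_tiny",
    "num_encoder_layers": 2,
    "hidden_size": 128,
    "num_attention_heads": 2,
    "feed_forward_dim": 512,
}

# Round-trip through JSON, since the simulator loads these files from disk.
loaded = json.loads(json.dumps(bert_tiny))
```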

To run AccelTran on the BERT-Tiny model, while plotting utilization and metric curves every 1000 cycles, use the following command:

cd ./simulator/
python3 run_simulator.py --model_dict_path ./model_dicts/bert_tiny.json --config_path ./config/config_tiny.yaml --plot_steps 1000 --debug
cd ..

This will output the accelerator state for every cycle. For more information on the possible inputs to the simulation script, use:
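The --plot_steps behaviour amounts to sampling metrics at a fixed cycle interval. A minimal sketch of that sampling (not the simulator's code; the helper name is hypothetical):

```python
def plot_cycles(total_cycles, plot_steps):
    # Cycles at which the utilization/metric curves would be refreshed when
    # plotting every `plot_steps` cycles (as with --plot_steps 1000).
    return [c for c in range(total_cycles) if c % plot_steps == 0]

cycles = plot_cycles(total_cycles=3500, plot_steps=1000)
# cycles -> [0, 1000, 2000, 3000]
```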

cd ./simulator/
python3 run_simulator.py --help
cd ..

Developer

Shikhar Tuli. For any questions, comments or suggestions, please reach me at [email protected].

Cite this work

Cite our work using the following BibTeX entry:

@article{tuli2023acceltran,
  title={{AccelTran}: A Sparsity-Aware Accelerator for Dynamic Inference with Transformers},
  author={Tuli, Shikhar and Jha, Niraj K},
  journal={arXiv preprint arXiv:2302.14705},
  year={2023}
}

If you use the AccelTran design space to implement transformer-accelerator co-design, please also cite:

@article{tuli2023transcode,
  title={{TransCODE}: Co-design of Transformers and Accelerators for Efficient Training and Inference},
  author={Tuli, Shikhar and Jha, Niraj K},
  journal={arXiv preprint arXiv:2303.14882},
  year={2023}
}

License

BSD-3-Clause. Copyright (c) 2022, Shikhar Tuli and Jha Lab. All rights reserved.

See the LICENSE file for more details.
