ATHEENA repository (including software and hardware artifacts for FCCM submission 2023)

License: GNU General Public License v3.0

DOI

Note: If above DOI is broken, this one should be correct.

ATHEENA

A Toolflow for Hardware Early-Exit Network Automation (FCCM Artifacts)

In our paper, we develop an FPGA-based accelerator toolflow targeting Deep Convolutional Early-Exit Neural Networks. Our work builds on the streaming architecture hardware in fpgaConvNet. We leverage the probabilistic nature of the input-dependent Early-Exit network to scale the resource allocation for different stages of the accelerator.

This repository contains the software and hardware to generate accelerator designs for early-exit networks and the artifacts for the FCCM 2023 paper.

There are three main artifacts:

  • Optimiser
  • Buffer Hardware Component (and generation)
  • HLS-based hardware generation using Vivado

Full Setup instructions

The optimiser (and HLS generation) has been verified using the following software:

  • conda=4.9.2, 4.10.1
  • python=3.7

Optimiser package and Environment setup

To install this package, run the following from this directory:

sudo apt install protobuf-compiler libprotoc-dev
cd ./optimiser/
conda env create -f atheena_opt_hls_p37.yml
conda activate atheena_opt_hls_p37
python atheena_setup.py install 

Scala and Chisel package setup

To install the appropriate software for buffer generation:

Note: This module has been verified on Ubuntu 20.04.6 LTS with Java version 11.0.18, sbt version 1.4.9, and Scala 2.12.13.

The following instructions are taken from Chisel's instructions on environment setup:

  1. Install Coursier and follow the instructions.
curl -fL https://github.com/coursier/coursier/releases/latest/download/cs-x86_64-pc-linux.gz | gzip -d > cs && chmod +x cs 
./cs setup

Note: This will install the most recent version of Scala. To check that it has worked, run scala -version (a restart of the terminal may be required).

  2. Install Scala version 2.12.13 and sbt version 1.4.9
cs install scala:2.12.13 && cs install scalac:2.12.13
cs install sbt:1.4.9 && cs install sbtn:1.4.9

Note: To check the scala and sbt versions, run scala -version and sbt --script-version.

  3. Regenerate the project for the buffer package.
cd ./buffer/
sbt pack
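Before running sbt pack, it can help to confirm that the installed tools match the verified versions. The following is a minimal sketch, not part of the repository: the helper names are hypothetical, and it assumes the version appears as a dotted triple (such as 2.12.13) somewhere in the text printed by scala -version and sbt --script-version.

```python
import re
from typing import Optional

# Versions this README reports as verified for buffer generation.
EXPECTED = {"scala": "2.12.13", "sbt": "1.4.9"}


def extract_version(text: str) -> Optional[str]:
    """Return the first dotted version triple (e.g. 2.12.13) found in text."""
    match = re.search(r"\d+\.\d+\.\d+", text)
    return match.group(0) if match else None


def check_tool(tool: str, output: str) -> bool:
    """True when the version found in `output` equals the verified version.

    `output` is whatever the tool printed; capturing it (for example with
    subprocess.run) is left to the caller.
    """
    return extract_version(output) == EXPECTED[tool]
```

Pass the captured text of each command to check_tool; a False result suggests repeating the cs install step for that tool.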

Vivado setup

To install Vivado 2019.1:

  1. First download from the Xilinx website.

  2. Install the y2k22 patch according to these instructions.

  3. Add the following to your ~/.bashrc file:

source /tools/Xilinx/Vivado/2019.1/settings64.sh
source /tools/Xilinx/SDK/2019.1/settings64.sh
export FPGACONVNET_ROOT=(path to repo)/ATHEENA_fccm_artifacts/hls
export FPGACONVNET_HLS=(path to repo)/ATHEENA_fccm_artifacts/hls
export FPGACONVNET_OPTIMISER=(path to repo)/ATHEENA_fccm_artifacts/optimiser
  4. Once installed, you will also need to add a license server to your .bashrc file.

  5. You will need to set up JTAG drivers to program a device. To do so, execute the following script:

/tools/Xilinx/Vivado/2019.1/data/xicom/cable_drivers/lin64/install_script/install_drivers/install_drivers

For more information, visit here.
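The three FPGACONVNET_* variables exported in ~/.bashrc are used by later steps, so a quick check that they are set can save a failed run. This is a hedged sketch rather than a repository tool: the function name is hypothetical, and it only checks that each variable is set and points at an existing directory.

```python
import os
from typing import List, Mapping

# Variable names taken from the ~/.bashrc lines in the Vivado setup steps.
REQUIRED_VARS = ("FPGACONVNET_ROOT", "FPGACONVNET_HLS", "FPGACONVNET_OPTIMISER")


def missing_or_invalid(env: Mapping[str, str]) -> List[str]:
    """List required variables that are unset, empty, or not a directory."""
    return [
        name
        for name in REQUIRED_VARS
        if not env.get(name) or not os.path.isdir(env[name])
    ]
```

Calling missing_or_invalid(os.environ) in a fresh terminal returns an empty list when all three paths resolve; any names returned point at lines in ~/.bashrc worth rechecking.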

Finally, there is a known bug related to C++ libraries. The workaround is to include the mpfr.h and gmp.h headers manually. For this project, create a header file include/system.hpp containing the following:

#ifndef SYSTEM_HPP_
#define SYSTEM_HPP_

#include "(path to Vivado 2019.1)/include/gmp.h"
#include "(path to Vivado 2019.1)/include/mpfr.h"

#endif
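If the Vivado install path differs between machines, the header can be generated instead of edited by hand. The sketch below is illustrative only: the function names are hypothetical, and the include_dir default assumes the script is run from the directory that should contain include/, which this README does not specify further.

```python
from pathlib import Path


def render_system_hpp(vivado_root: str) -> str:
    """Build the text of system.hpp for a given Vivado 2019.1 install path."""
    root = vivado_root.rstrip("/")
    return (
        "#ifndef SYSTEM_HPP_\n"
        "#define SYSTEM_HPP_\n"
        "\n"
        f'#include "{root}/include/gmp.h"\n'
        f'#include "{root}/include/mpfr.h"\n'
        "\n"
        "#endif\n"
    )


def write_system_hpp(vivado_root: str, include_dir: str = "include") -> Path:
    """Write system.hpp into include_dir (created if missing); return its path."""
    out = Path(include_dir) / "system.hpp"
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(render_system_hpp(vivado_root))
    return out
```

For the install location used in the .bashrc lines above, the call would be write_system_hpp("/tools/Xilinx/Vivado/2019.1").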

Creating Accelerator Designs with ATHEENA

To generate an optimised FPGA accelerator description for an Early-Exit network, follow the instructions in optimiser/README.md:

Optimiser instructions (after setup)

  1. Run optimiser on the branchy LeNet network description.
cd ./optimiser/
python -m fpgaconvnet_optimiser.tools.dev_script \
    --expr opt_brn \
    --save_name branchy_lenet \
    -o outputs/branchy_lenet \
    --model_path examples/models/atheena/branchy_lenet_20220902.onnx \
    --platform_path examples/platforms/zc706.json \
    --optimiser_path examples/optimiser_example.yml \
    -bs 1024
  2. Generate the Pareto graph for the optimiser results at an early-exit probability of 75% (as in the paper).
python -m fpgaconvnet_optimiser.tools.dev_script \
    --expr gen_graph \
    --save_name branchy_lenet_graph \
    -o outputs/branchy_lenet/results/ \
    -i outputs/branchy_lenet/ \
    --profiled_probability 0.75 
  3. Run the following command to perform a stage merge for all the results in the combined report.
python -m fpgaconvnet_optimiser.tools.ee_stage_merger \
    -c outputs/branchy_lenet/results/combined_rpt_eefrac75.txt \
    -j outputs/branchy_lenet/ \
    -on branchy_lenet_merged \
    --output_path outputs/branchy_lenet/merged/
  4. Copy the merged .json file into a folder under hls/test/partitions/(example)/.

For example:

mkdir -p ../hls/test/partitions/branchy_lenet_eg
cp outputs/branchy_lenet/merged/branchy_lenet_merged_rsc80_thru95000.json ../hls/test/partitions/branchy_lenet_eg/

Note: Because the optimiser is non-deterministic, the file produced by your run will report slightly different resource usage and throughput (the rsc and thru fields in the file name). For an A1-like design, use a result with rsc 30-35 and thru of about 19500. For an A2-like design, use rsc 45-50 and thru of about 45000. For an A3-like design, use rsc 80-90 and thru 95000.
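When the merge step produces several files, picking the one closest to a paper design can be scripted. The following is a sketch under stated assumptions: the helper names are hypothetical, the file-name pattern (_rsc<N>_thru<N>.json) is inferred from the single example above, and the match uses only the rsc ranges from the note, leaving the thru value for manual comparison.

```python
import re
from typing import Optional, Tuple

# rsc ranges quoted in the note for the A1-, A2- and A3-like designs.
RSC_RANGES = {"A1": (30, 35), "A2": (45, 50), "A3": (80, 90)}

# Assumed naming scheme, inferred from
# branchy_lenet_merged_rsc80_thru95000.json.
_PATTERN = re.compile(r"_rsc(\d+)_thru(\d+)\.json$")


def parse_merged_name(filename: str) -> Optional[Tuple[int, int]]:
    """Return (rsc, thru) parsed from a merged file name, or None."""
    match = _PATTERN.search(filename)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def closest_design(filename: str) -> Optional[str]:
    """Return "A1", "A2" or "A3" when rsc falls in a quoted range, else None."""
    parsed = parse_merged_name(filename)
    if parsed is None:
        return None
    rsc, _thru = parsed  # thru is parsed but not used for matching here
    for label, (low, high) in RSC_RANGES.items():
        if low <= rsc <= high:
            return label
    return None
```

Applying closest_design to each file name in outputs/branchy_lenet/merged/ gives a shortlist; the thru value returned by parse_merged_name can then be compared against the figures in the note.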

Buffer instructions (after setup)

  1. Run the following instructions to generate available hardware IP for the buffer layer at different resource allocations.
cd ../buffer/
./gen_buff.sh
  2. Respond to the prompt with a to generate all the configurations.

HLS Instructions (after setup)

  1. Run the following instructions to start the HLS generation process for the layers based on the hardware description provided.
cd ../hls/test/partitions/
../../scripts/split_run.sh -a \
    -n branchy_lenet_eg \
    -m $FPGACONVNET_OPTIMISER/examples/models/atheena/branchy_lenet_20220902.onnx \
    -p branchy_lenet_merged_rsc80_thru95000.json \
    -v

Note: The -a flag generates all the network layers, the top layer, and the host code. The -v flag stitches the resulting network IP layers into a full board design, then runs Vivado synthesis and implementation before generating the bitstream. The script can be run with either flag on its own if only one operation is required.

  2. The final step requires some manual integration with the Vivado SDK and assumes that the target board is the ZC706 (used in the paper).

    a. Open the resulting project_1 in test/partitions/branchy_lenet_eg/partition_0/branchy_lenet_eg_hw_prj

    b. Export the hardware + bitstream: File > Export > Export Hardware. Check include bitstream.

    c. Launch the SDK: File > Launch SDK

    d. Generate the FSBL: File > New > Application Project. Provide a project name and select the exported hw platform 0. Hit Next and select Zynq FSBL and hit Finish.

    e. Generate the host code: File > New > Application Project. Provide a project name and select the exported hw platform 0. Hit Next and select Hello world and hit Finish.

    f. In this project, open hello_world.c and replace the contents with branchy_lenet_eg_host_code.c

    g. Add the xilffs support to the host code (hello world) BSP using system.mss > modify bsp

    h. Insert an SD card loaded with the file i0.bin, copied from ./hls/test/data/test/partitions/branchy_lenet_eg/partition_0/data/input0.bin

    i. Run the FSBL project on the board, program with the bitstream, and then run the host code (hello world) project!

For further details on running the hardware, see the hls README.
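Since split_run.sh can run with or without -a and -v, a small wrapper makes the two stages easy to invoke separately. This is only a sketch: the function name and its parameters are hypothetical, it uses just the flags shown in the HLS step above, and it assumes the command is launched from hls/test/partitions/ as in that step.

```python
from typing import List


def split_run_args(
    name: str,
    model: str,
    partition: str,
    generate_all: bool = True,
    run_vivado: bool = True,
) -> List[str]:
    """Build the argument list for scripts/split_run.sh.

    generate_all adds -a (layers, top layer, host code);
    run_vivado adds -v (board design, synthesis, implementation, bitstream).
    """
    args = ["../../scripts/split_run.sh"]
    if generate_all:
        args.append("-a")
    args += ["-n", name, "-m", model, "-p", partition]
    if run_vivado:
        args.append("-v")
    return args
```

The returned list can be passed to subprocess.run(args, cwd="hls/test/partitions", check=True). Note that $FPGACONVNET_OPTIMISER in the model path is not expanded automatically in that case, so expand it first, for example with os.path.expandvars.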

Reproducing the ATHEENA Paper Results

As HLS generation and Vivado synthesis take a significant amount of time to run, three ATHEENA hardware projects and designs from the paper are included in this repository.

  • A1: ./hls/test/partitions/design_A1/
  • A2: ./hls/test/partitions/design_A2/
  • A3: ./hls/test/partitions/design_A3/

These folders contain:

  • .json file with the hardware description (generated by the optimiser).

  • .c host code that runs the project on the ZC706 board.

  • split_run.sh script that regenerates the HLS files and Vivado project. Run it from inside the folder using ./split_run.sh -a -v.

A copy of the hardware project with host code for each of these examples can be found and downloaded here

Note: To unpack an archive, use tar -xzvf a1_hw_artifact.tar.gz, then open the project using the Vivado Design Suite and SDK.

Citation

@inproceedings{bbiggs_ATHEENA_2023,
    title = {{ATHEENA: A Toolflow for Hardware Early-Exit Network Automation}},
    booktitle = {2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)},
    author = {Benjamin Biggs and Christos-Savvas Bouganis and George A. Constantinides},
    year = {2023},
}

Acknowledgements

A huge thank you to our Artifact reviewer Yizhao Gao for their advice and patience throughout the artifact review process!
