Note: If the above DOI is broken, this one should be correct.

ATHEENA

A Toolflow for Hardware Early-Exit Network Automation (FCCM Artifacts)

In our paper, we develop an FPGA-based accelerator toolflow targeting Deep Convolutional Early-Exit Neural Networks. Our work builds on the streaming architecture hardware in fpgaConvNet. We leverage the probabilistic nature of the input-dependent Early-Exit network to scale the resource allocation for different stages of the accelerator.
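As a rough illustration of why exit probability matters for resource allocation, the sketch below models a two-stage early-exit pipeline. It is a simplified model written for this README, not the formulation used in the paper or in the optimiser: it assumes a fixed fraction of samples leaves after the first stage, so the second stage only has to keep up with the remainder. The function name and parameters are hypothetical.

```python
def combined_throughput(stage1_thru: float, stage2_thru: float, exit_prob: float) -> float:
    """Estimate end-to-end samples/s for a two-stage early-exit pipeline.

    Assumption (illustrative only): a fraction `exit_prob` of samples exits
    after stage 1, so stage 2 only processes the remaining (1 - exit_prob).
    """
    if not 0.0 <= exit_prob <= 1.0:
        raise ValueError("exit_prob must be between 0 and 1")
    hard_fraction = 1.0 - exit_prob
    if hard_fraction == 0.0:
        # Every sample exits early, so stage 2 never limits the pipeline.
        return stage1_thru
    # Stage 2 sees only the hard samples, so its effective rate is scaled up.
    return min(stage1_thru, stage2_thru / hard_fraction)


if __name__ == "__main__":
    # Made-up rates: with 75% of samples exiting early, a second stage running
    # at a quarter of the first stage's rate does not become the bottleneck.
    print(combined_throughput(1000.0, 250.0, 0.75))
```

Under this model, hardware saved on the second stage can be spent on the first, which is the intuition behind scaling resources per stage.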

This repository contains the software and hardware to generate accelerator designs for early-exit networks and the artifacts for the FCCM 2023 paper.

There are three main artifacts:

Optimiser
Buffer Hardware Component (and generation)
HLS-based hardware generation using Vivado

Full Setup instructions

The optimiser (and HLS generation) has been verified using the following software:

conda=4.9.2, 4.10.1
python=3.7

Optimiser package and Environment setup

To install this package, run the following from this directory:

sudo apt install protobuf-compiler libprotoc-dev
cd ./optimiser/
conda env create -f atheena_opt_hls_p37.yml
conda activate atheena_opt_hls_p37
python atheena_setup.py install
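A quick way to confirm the install worked is to check that the package can be found from inside the activated environment. The snippet below is a minimal sketch using only the standard library. The module name fpgaconvnet_optimiser is taken from the optimiser commands later in this README; the helper name is_installed is hypothetical.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    name = "fpgaconvnet_optimiser"  # module name used by the optimiser commands
    status = "found" if is_installed(name) else "NOT found"
    print(f"{name}: {status}")
```

Run it with the atheena_opt_hls_p37 environment active; if the module is reported as not found, repeat the install step above.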

Scala and Chisel package setup

To install the appropriate software for buffer generation:

Note: This module has been verified to work on Ubuntu 20.04.6 LTS with Java 11.0.18, sbt 1.4.9, and Scala 2.12.13.

The following instructions are taken from Chisel's instructions on environment setup:

Install Coursier and follow the instructions.

curl -fL https://github.com/coursier/coursier/releases/latest/download/cs-x86_64-pc-linux.gz | gzip -d > cs && chmod +x cs

./cs setup

Note: This will install the most recent version of Scala. To check that it has worked, run scala -version (a restart of the terminal may be required).

Install Scala version 2.12.13 and sbt version 1.4.9

cs install scala:2.12.13 && cs install scalac:2.12.13

cs install sbt:1.4.9 && cs install sbtn:1.4.9

Note: To check the scala and sbt versions, run scala -version and sbt --script-version.

Regenerate the project for the buffer package.

cd ./buffer/

sbt pack
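The version checks mentioned in the notes above can be scripted. The sketch below runs scala -version and sbt --script-version (the two commands named above) and compares the first version number in their output against the verified versions. It assumes the version appears as a three-part number somewhere in the output; the helper names are hypothetical.

```python
import re
import shutil
import subprocess
from typing import List, Optional

# Versions this README reports as verified.
EXPECTED = {"scala": "2.12.13", "sbt": "1.4.9"}
COMMANDS = {"scala": ["scala", "-version"], "sbt": ["sbt", "--script-version"]}


def extract_version(text: str) -> Optional[str]:
    """Return the first x.y.z version number found in text, or None."""
    match = re.search(r"\d+\.\d+\.\d+", text)
    return match.group(0) if match else None


def tool_version(command: List[str]) -> Optional[str]:
    """Run a version command and parse its output; None if the tool is missing."""
    if shutil.which(command[0]) is None:
        return None
    # Merge stderr into stdout because some tools print their version there.
    result = subprocess.run(command, stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return extract_version(result.stdout)


if __name__ == "__main__":
    for tool, expected in EXPECTED.items():
        found = tool_version(COMMANDS[tool])
        flag = "ok" if found == expected else "MISMATCH"
        print(f"{tool}: expected {expected}, found {found} [{flag}]")
```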

Vivado setup

To install Vivado 2019.1:

First download from the Xilinx website.
Install the y2k22 patch according to these instructions.
Add the following to your ~/.bashrc file:

source /tools/Xilinx/Vivado/2019.1/settings64.sh
source /tools/Xilinx/SDK/2019.1/settings64.sh
export FPGACONVNET_ROOT=(path to repo)/ATHEENA_fccm_artifacts/hls
export FPGACONVNET_HLS=(path to repo)/ATHEENA_fccm_artifacts/hls
export FPGACONVNET_OPTIMISER=(path to repo)/ATHEENA_fccm_artifacts/optimiser

Once installed, you will also need to add a license server to your .bashrc file.
You will need to set up JTAG drivers to program a device. To do so, execute the following script:

/tools/Xilinx/Vivado/2019.1/data/xicom/cable_drivers/lin64/install_script/install_drivers/install_drivers

For more information, visit here.
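Since later steps rely on the FPGACONVNET_* variables from the ~/.bashrc snippet above, it can help to confirm they are visible in the current shell before continuing. This is a small sketch, not part of the repository; it only checks that each variable is set and non-empty, and the helper name is hypothetical.

```python
import os
from typing import List, Mapping

# Variable names taken from the ~/.bashrc lines above.
REQUIRED_VARS = ("FPGACONVNET_ROOT", "FPGACONVNET_HLS", "FPGACONVNET_OPTIMISER")


def missing_env_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variables that are unset or empty in env."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(os.environ)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All FPGACONVNET_* variables are set.")
```

If any variable is reported missing, open a new terminal or run source ~/.bashrc and try again.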

Finally, there is a known bug to do with C++ libraries. A workaround for this is adding the mpfr.h and gmp.h headers manually. For this project, you need to create a header file include/system.hpp which includes the following:

#ifndef SYSTEM_HPP_
#define SYSTEM_HPP_

#include "(path to Vivado 2019.1)/include/gmp.h"
#include "(path to Vivado 2019.1)/include/mpfr.h"

#endif
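If you prefer to generate this header rather than type it by hand, the sketch below renders the same content from a Vivado install path. The path /tools/Xilinx/Vivado/2019.1 in the example matches the settings64.sh location used earlier; adjust it to your install. The function names and the project_dir argument are hypothetical, and which directory should hold include/system.hpp is left to the reader, as it is in the text above.

```python
from pathlib import Path


def render_system_hpp(vivado_root: str) -> str:
    """Return the text of system.hpp for a given Vivado install directory."""
    include_dir = vivado_root.rstrip("/") + "/include"
    lines = [
        "#ifndef SYSTEM_HPP_",
        "#define SYSTEM_HPP_",
        "",
        f'#include "{include_dir}/gmp.h"',
        f'#include "{include_dir}/mpfr.h"',
        "",
        "#endif",
    ]
    return "\n".join(lines) + "\n"


def write_system_hpp(project_dir: str, vivado_root: str) -> Path:
    """Write include/system.hpp under project_dir (assumed location) and return its path."""
    target = Path(project_dir) / "include" / "system.hpp"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(render_system_hpp(vivado_root))
    return target


if __name__ == "__main__":
    # Print only; call write_system_hpp(...) once you know the target directory.
    print(render_system_hpp("/tools/Xilinx/Vivado/2019.1"))
```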

Creating Accelerator Designs with ATHEENA

To generate an optimised FPGA accelerator description for an Early-Exit network, follow the instructions in optimiser/README.md:

Optimiser instructions (after setup)

Run optimiser on the branchy LeNet network description.

cd ./optimiser/

python -m fpgaconvnet_optimiser.tools.dev_script \
    --expr opt_brn \
    --save_name branchy_lenet \
    -o outputs/branchy_lenet \
    --model_path examples/models/atheena/branchy_lenet_20220902.onnx \
    --platform_path examples/platforms/zc706.json \
    --optimiser_path examples/optimiser_example.yml \
    -bs 1024

Generate the Pareto graph for the optimiser results at an early-exit probability of 75% (as in the paper).

python -m fpgaconvnet_optimiser.tools.dev_script \
    --expr gen_graph \
    --save_name branchy_lenet_graph \
    -o outputs/branchy_lenet/results/ \
    -i outputs/branchy_lenet/ \
    --profiled_probability 0.75
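For readers unfamiliar with the term, a Pareto graph here shows the designs for which no other design offers more throughput at equal or lower resource usage. The sketch below computes such a front from (resource, throughput) pairs. It is a generic illustration, not the gen_graph implementation, and the sample numbers are made up for the example rather than taken from any report.

```python
from typing import List, Tuple

Design = Tuple[float, float]  # (resource usage, throughput)


def pareto_front(points: List[Design]) -> List[Design]:
    """Keep designs not dominated by a cheaper-or-equal, faster design."""
    front: List[Design] = []
    best_thru = float("-inf")
    # Sort by resource ascending; for ties, highest throughput first.
    for rsc, thru in sorted(points, key=lambda p: (p[0], -p[1])):
        # A point is kept only if it beats every cheaper point on throughput.
        if thru > best_thru:
            front.append((rsc, thru))
            best_thru = thru
    return front


if __name__ == "__main__":
    # Illustrative values only.
    designs = [(30, 19500), (35, 18000), (48, 45000), (50, 40000), (80, 95000)]
    print(pareto_front(designs))
```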

Run the following command to perform a stage merge for all the results in the combined report.

python -m fpgaconvnet_optimiser.tools.ee_stage_merger \
    -c outputs/branchy_lenet/results/combined_rpt_eefrac75.txt \
    -j outputs/branchy_lenet/ \
    -on branchy_lenet_merged \
    --output_path outputs/branchy_lenet/merged/

Copy this .json file into a folder in hls/test/partitions/(example)/.

For example:

mkdir -p ../hls/test/partitions/branchy_lenet_eg

cp outputs/branchy_lenet/merged/branchy_lenet_merged_rsc80_thru95000.json ../hls/test/partitions/branchy_lenet_eg/

Note: Because the optimiser is non-deterministic, the file you generate will have slightly different resource usage and throughput from the one named above. For an A1-like design, use rsc30-35 and thru~19500. For an A2-like design, use rsc45-50 and thru~45000. For an A3-like design, use rsc80-90 and thru95000.
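Picking the right merged file by eye can be tedious when many are generated. The sketch below selects the file whose name best matches a resource range and a target throughput. It assumes every merged file follows the _rsc<N>_thru<N>.json naming seen in the example above, which is inferred from that single file name; the function names are hypothetical and the file names in the demo are invented.

```python
import re
from typing import Iterable, Optional, Tuple

# Naming pattern inferred from branchy_lenet_merged_rsc80_thru95000.json.
NAME_PATTERN = re.compile(r"_rsc(\d+)_thru(\d+)\.json$")


def parse_design(filename: str) -> Optional[Tuple[int, int]]:
    """Return (rsc, thru) parsed from a merged design file name, or None."""
    match = NAME_PATTERN.search(filename)
    return (int(match.group(1)), int(match.group(2))) if match else None


def pick_design(filenames: Iterable[str], rsc_min: int, rsc_max: int,
                target_thru: int) -> Optional[str]:
    """Pick the file within the rsc range whose thru is closest to the target."""
    candidates = []
    for name in filenames:
        parsed = parse_design(name)
        if parsed and rsc_min <= parsed[0] <= rsc_max:
            candidates.append((abs(parsed[1] - target_thru), name))
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    # Invented file names for demonstration.
    names = [
        "branchy_lenet_merged_rsc32_thru19400.json",
        "branchy_lenet_merged_rsc47_thru44800.json",
        "branchy_lenet_merged_rsc83_thru95000.json",
    ]
    print(pick_design(names, 80, 90, 95000))  # A3-like range from the note above
```

In practice, pass the names from outputs/branchy_lenet/merged/ (for example via os.listdir) instead of the demo list.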

Buffer instructions (after setup)

Run the following instructions to generate available hardware IP for the buffer layer at different resource allocations.

cd ../buffer/

./gen_buff.sh

Respond to the prompt with a, to generate all the configurations.

HLS Instructions (after setup)

Run the following instructions to start the HLS generation process for the layers based on the hardware description provided.

cd ../hls/test/partitions/

../../scripts/split_run.sh -a \
    -n branchy_lenet_eg \
    -m $FPGACONVNET_OPTIMISER/examples/models/atheena/branchy_lenet_20220902.onnx \
    -p branchy_lenet_merged_rsc80_thru95000.json \
    -v

Note: The -a is used to generate all the network layers, the top layer, and the host code. The -v flag is used to stitch the resulting network IP layers into a full board design and then run Vivado synthesis and implementation before finally generating the bitstream. The script can be run with or without these flags if only one operation is required.

The final step requires some manual integration with the Vivado SDK and assumes that the target board is the ZC706 (used in the paper).

a. Open the resulting project_1 in test/partitions/branchy_lenet_eg/partition_0/branchy_lenet_eg_hw_prj

b. Export the hardware + bitstream: File > Export > Export Hardware. Check include bitstream.

c. Launch the SDK: File > Launch SDK

d. Generate the FSBL: File > New > Application Project. Provide a project name and select the exported hw platform 0. Hit Next and select Zynq FSBL and hit Finish.

e. Generate the host code: File > New > Application Project. Provide a project name and select the exported hw platform 0. Hit Next and select Hello world and hit Finish.

f. In this project, open hello_world.c and replace the contents with branchy_lenet_eg_host_code.c

g. Add the xilffs support to the host code (hello world) BSP using system.mss > modify bsp

h. Insert an SD card loaded with the i0.bin file, copied from ./hls/test/data/test/partitions/branchy_lenet_eg/partition_0/data/input0.bin

i. Run the FSBL project on the board, program with the bitstream, and then run the host code (hello world) project!

For further details on running the hardware, see the hls README.

Reproducing the ATHEENA Paper Results

As HLS generation and Vivado synthesis take a significant amount of time to run, three ATHEENA hardware projects and designs from the paper have been included in this repository.

A1: ./hls/test/partitions/design_A1/
A2: ./hls/test/partitions/design_A2/
A3: ./hls/test/partitions/design_A3/

These folders contain:

.json file with the hardware description (generated by the optimiser).
.c host code that runs the project on the ZC706 board.
split_run.sh script that regenerates the HLS files and Vivado project.
Run the generation from inside the folder, using ./split_run.sh -a -v.

A copy of the hardware project with host code for each of these examples can be found and downloaded here

Note: To unzip, use tar -xzvf a1_hw_artifact.tar.gz. The project can then be opened using the Vivado Design Suite and SDK.

Citation

@inproceedings{bbiggs_ATHEENA_2023,
    title = {{ATHEENA: A Toolflow for Hardware Early-Exit Network Automation}},
    booktitle = {2023 IEEE 31st Annual International Symposium on Field-Programmable Custom Computing Machines (FCCM)},
    author = {Benjamin Biggs and Christos-Savvas Bouganis and George A. Constantinides},
    year = {2023},
}

Acknowledgements

A huge thank you to our Artifact reviewer Yizhao Gao for their advice and patience throughout the artifact review process!
