This is the implementation of the paper [Optimus: Towards Optimal Layer-Fusion on Deep Learning Processors], which was accepted at LCTES 2021.
- Create a virtual environment
conda create --name optimusEnv python=3.6
conda activate optimusEnv
- Install the requirements
pip install -r requirements.txt
- Run a test
./test.sh
- List all available options
python ./fusion/tools/optimal_schedule_search.py --help
- Run the overall experiment to measure memory access and energy across multiple models. The results are stored in result/overall_experiment/. This experiment takes more than ten minutes to complete.
python ./fusion/experiment/overall_experiment.py
- Run the memory access analysis across multiple models. The results are stored in result/analysis/. This experiment takes more than ten minutes to complete.
python ./fusion/experiment/analysis.py
- Evaluate the impact of batch size. The results are stored in result/batch_size/.
python ./fusion/experiment/batch_size.py
- Evaluate the impact of on-chip memory space. The results are stored in result/buffer_size/.
python ./fusion/experiment/buffer_size.py
- Evaluate the impact of dataflow. This experiment supports the results in Section 4.2.5 of our paper. The results are stored in result/dataflow/.
python ./fusion/experiment/dataflow.py
- Evaluate the impact of the PE array and buffer. The results are stored in result/pe_array/.
python ./fusion/experiment/pe_array.py
- Evaluate the performance on different processors. The results are stored in result/processor/.
python ./fusion/experiment/processor.py
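
The experiment steps above can also be run one after another from a small driver. The following is a minimal sketch, not part of the repository: the helper names `build_command` and `run_all` and the `--run` flag are assumptions made for illustration. It uses only the script paths and result directories listed above, and it assumes it is started from the repository root inside the activated environment. By default it only prints the commands without running them.

```python
import subprocess
import sys

# Experiment script name -> result directory, both copied from the steps above.
EXPERIMENTS = {
    "overall_experiment": "result/overall_experiment/",
    "analysis": "result/analysis/",
    "batch_size": "result/batch_size/",
    "buffer_size": "result/buffer_size/",
    "dataflow": "result/dataflow/",
    "pe_array": "result/pe_array/",
    "processor": "result/processor/",
}


def build_command(name):
    """Return the command line for one experiment script (hypothetical helper)."""
    # sys.executable is the interpreter of the currently active environment.
    return [sys.executable, f"./fusion/experiment/{name}.py"]


def run_all(dry_run=True):
    """Print each command; when dry_run is False, also run it.

    Returns the names of the experiments that exited with status 0.
    In dry-run mode nothing is executed, so the returned list is empty.
    """
    succeeded = []
    for name, result_dir in EXPERIMENTS.items():
        cmd = build_command(name)
        print(" ".join(cmd), "->", result_dir)
        if dry_run:
            continue
        if subprocess.run(cmd).returncode == 0:
            succeeded.append(name)
    return succeeded


if __name__ == "__main__":
    # "--run" is a flag of this sketch only, not of the Optimus tools.
    run_all(dry_run="--run" not in sys.argv[1:])
```

Saved under a name of your choice in the repository root, the sketch lists the seven commands when started without arguments and executes them in order when started with `--run`. Since two of the experiments each take more than ten minutes, a full run takes correspondingly longer.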