CNNergy: An Analytical CNN Energy Model

If you use any part of this project please cite:

S. D. Manasi, F. S. Snigdha and S. S. Sapatnekar, "NeuPart: Using Analytical Models to Drive Energy-Efficient Partitioning of CNN Computations on Cloud-Connected Mobile Clients," IEEE Transactions on Very Large Scale Integration Systems (TVLSI), vol. 28, no. 8, pp. 1844-1857, Aug. 2020, doi: 10.1109/TVLSI.2020.2995135.

The directories named "src" and "data" contain all the required files to use the tool.

Guideline to use CNNergy

List of the files in the "data" directory:

GoogleNet_v1_Input_File.txt -- This file contains the layer shape parameters for all the layers of GoogleNet_v1 and is used as input to the "GoogleNet_Main.m" file.
SqueezeNet_v11_Input_File.txt -- This file contains the layer shape parameters for all the layers of SqueezeNet_v11 and is used as input to the "SqueezeNet_Main.m" file.
GoogleNet_v1_Input_File_with_description.txt -- This is same as the "GoogleNet_v1_Input_File.txt" with a description of the format of each row.
SqueezeNet_v11_Input_File_with_description.txt -- This is same as the "SqueezeNet_v11_Input_File.txt" with a description of the format of each row.

List of the files in the "src" directory:

AlexNet_Main.m -- The main file to run all the layers of AlexNet.
VGG16_Main.m -- The main file to run all the layers of VGG-16.
GoogleNet_Main.m -- The main file to run all the layers of GoogleNet_v1.
SqueezeNet_Main.m -- The main file to run all the layers of SqueezeNet_v11.
Clock_Tree_Power.m -- This file implements the model to estimate the power consumption by the clock network.
Analytical_Model.m -- This is the function which implements the analytical CNN energy model. Details about the function are provided below.

Notes: (In order to run the "GoogleNet_Main.m" file, keep "GoogleNet_Main.m" and "GoogleNet_v1_Input_File.txt" in the same directory.

In order to run the "SqueezeNet_Main.m" file, keep "SqueezeNet_Main.m" and "SqueezeNet_v11_Input_File.txt" in the same directory.

In order to run each of the main.m files listed above in 5-8, keep the main.m files and the "Analytical_Model.m" file in the same directory.

After running each of the main.m files listed above in 5-8, the energy values required to process each layer of that CNN will be plotted in a figure as well as the energy results will be written in a .txt file)

The function: Analytical_Model.m

This function computes the energy required to process a given CNN layer on a deep learning accelerator

INPUTS:

The inputs to the function are as follows:

Layer = index of a CNN layer
filter_size = Height/Width of a filter
Nos_of_Filter = #of total 3D filters in a layer
Ifmap_size = Height/Width of an input feature map (ifmap)
Nos_of_Channel = #of total channels in the ifmap/filter
Stride = convolution stride
Ofmap_size = Height/Width of an output feature map (ofmap)
Sparsity_if = Percent sparsity (i.e., #of zeros) in the padded ifmap volume
Sparsity_Of = Percent sparsity (i.e., #of zeros) in the ofmap volume
n2 = #of ifmap to be processed together
n_flag = Binary flag to indicate whether this is the first simulation or second simulation
C_prcnt = Percent control energy from components other than the clock network
Clock_Energy = Energy consumption (in joule) by the clock network
bit_flag = Binary flag to indicate whether it is an 8-bit or 16-bit implementation

in 8-bit implementation, all data types (i.e., ifmap, filter, psum, ofmap) are 8-bit (set bit_flag = 0)

in 16-bit implementation, all data types (i.e., ifmap, filter, psum, ofmap) are 16-bit (set bit_flag = 1)

OUTPUTS:

For a given input layer the function provides the following outputs:

Final_Energy_per_Ifmap = Energy (in joule) to process the given CNN layer per image (The energy value is computed in 65nm technology node with Vdd = 1V)
n = maximum #of ifmap which can be processed together depending on the hardware constraints
psum_kb = On-chip Global SRAM Buffer(GLB) storage for the intermediate partial sums (psum) in kilobyte
ifmap_kb = GLB storage for ifmap in kilobyte
Total_kb = Total GLB storage used in kilobyte
Nos_of_NZMAC = Number of Nonzero-MAC operations per image
RF_Acc_MB_1N = Total amount of access into register file (RF) in MegaByte per image (both Inter-PE RF access and RF access from the same PE are included here)
GLB_Acc_MB_1N = Total amount of access into GLB in MegaByte per image
DARM_Acc_MB_1N = Total amount of access into off-chip DRAM in MegaByte per image
Filter_GLB_MB = Amount of GLB access in Megabyte from filter data per image
Ifmap_GLB_MB = Amount of GLB access in Megabyte from ifmap data per image
psum_GLB_MB = Amount of GLB access in Megabyte from psum data per image
Filter_DRAM_MB = Amount of DRAM access in Megabyte from filter data per image
Ifmap_DRAM_MB = Amount of DRAM access in Megabyte from ifmap data per image
Ofmap_DRAM_MB = Amount of DRAM access in Megabyte from ofmap data per image

For a detail description of our analytical CNN energy model please read our paper: S. D. Manasi, F. S. Snigdha and S. S. Sapatnekar, "NeuPart: Using Analytical Models to Drive Energy-Efficient Partitioning of CNN Computations on Cloud-Connected Mobile Clients," IEEE Transactions on Very Large Scale Integration Systems (TVLSI), vol. 28, no. 8, pp. 1844-1857, Aug. 2020. (Link: https://ieeexplore.ieee.org/abstract/document/9113336).

wxbbuaa2011 / cnnergy Goto Github PK

cnnergy's Introduction

CNNergy: An Analytical CNN Energy Model

Guideline to use CNNergy

The function: Analytical_Model.m

cnnergy's People

Contributors

Watchers

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent