TY - GEN
T1 - Improving Energy Efficiency of Convolutional Neural Networks on Multi-core Architectures through Run-time Reconfiguration
AU - Xiong, Y.
AU - Li, J.
AU - Blaauw, D.
AU - Kim, H. S.
AU - Mudge, T.
AU - Dreslinski, R.
AU - Chakrabarti, C.
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Convolutional neural networks (CNNs) are built with convolution layers, which account for most of their computation time. Differences in convolution kernel type (2D, point-wise, depth-wise) and input size lead to significant differences in computation and memory demands. In this work, we exploit run-time reconfiguration to adapt to the characteristics of different convolution kernels on a low-power reconfigurable architecture, Transmuter. The architecture consists of light-weight cores interconnected by caches and crossbars that support run-time reconfiguration between different cache modes (shared or private), different dataflow modes (systolic or parallel), and different computation mapping schemes. To achieve run-time reconfiguration, we propose a decision-tree-based engine that selects the optimal Transmuter configuration at low cost. The proposed method is evaluated on commonly used CNN models such as ResNet18, VGG11, AlexNet, and MobileNetV3. Simulation results show that run-time reconfiguration improves the energy efficiency of Transmuter by 3.1× to 13.7× across all networks.
AB - Convolutional neural networks (CNNs) are built with convolution layers, which account for most of their computation time. Differences in convolution kernel type (2D, point-wise, depth-wise) and input size lead to significant differences in computation and memory demands. In this work, we exploit run-time reconfiguration to adapt to the characteristics of different convolution kernels on a low-power reconfigurable architecture, Transmuter. The architecture consists of light-weight cores interconnected by caches and crossbars that support run-time reconfiguration between different cache modes (shared or private), different dataflow modes (systolic or parallel), and different computation mapping schemes. To achieve run-time reconfiguration, we propose a decision-tree-based engine that selects the optimal Transmuter configuration at low cost. The proposed method is evaluated on commonly used CNN models such as ResNet18, VGG11, AlexNet, and MobileNetV3. Simulation results show that run-time reconfiguration improves the energy efficiency of Transmuter by 3.1× to 13.7× across all networks.
KW - CNN
KW - Energy-efficiency
KW - multicore architecture
KW - runtime reconfiguration
UR - http://www.scopus.com/inward/record.url?scp=85142521747&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85142521747&partnerID=8YFLogxK
U2 - 10.1109/ISCAS48785.2022.9937275
DO - 10.1109/ISCAS48785.2022.9937275
M3 - Conference contribution
AN - SCOPUS:85142521747
T3 - Proceedings - IEEE International Symposium on Circuits and Systems
SP - 375
EP - 379
BT - IEEE International Symposium on Circuits and Systems, ISCAS 2022
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 2022 IEEE International Symposium on Circuits and Systems, ISCAS 2022
Y2 - 27 May 2022 through 1 June 2022
ER -