TY - GEN
T1 - NeuroFabric
T2 - 40th IEEE International Conference on Computer Design, ICCD 2022
AU - Isakov, Mihailo
AU - Kinsy, Michel A.
N1 - Publisher Copyright:
© 2022 IEEE.
PY - 2022
Y1 - 2022
N2 - Sparse Deep Neural Networks (DNNs) offer large improvements in model storage requirements, execution latency, and execution throughput. DNN pruning is contingent on knowing model weights, so networks can be pruned only after training. A priori sparse neural networks have been proposed as a way to extend sparsity benefits to the training process as well. Selecting a topology a priori is also beneficial for hardware accelerator specialization, lowering power, chip area, and latency. We present NeuroFabric, a hardware-ML model co-design approach that jointly optimizes a sparse neural network topology and a hardware accelerator configuration. NeuroFabric replaces dense DNN layers with cascades of sparse layers with a specific topology. We present an efficient and data-agnostic method for sparse network topology optimization, and show that parallel butterfly networks with skip connections achieve the best accuracy independent of sparsity or depth. We also present a multi-objective optimization framework that finds a Pareto frontier of hardware-ML model configurations over six objectives: accuracy, parameter count, throughput, latency, power, and hardware area.
AB - Sparse Deep Neural Networks (DNNs) offer large improvements in model storage requirements, execution latency, and execution throughput. DNN pruning is contingent on knowing model weights, so networks can be pruned only after training. A priori sparse neural networks have been proposed as a way to extend sparsity benefits to the training process as well. Selecting a topology a priori is also beneficial for hardware accelerator specialization, lowering power, chip area, and latency. We present NeuroFabric, a hardware-ML model co-design approach that jointly optimizes a sparse neural network topology and a hardware accelerator configuration. NeuroFabric replaces dense DNN layers with cascades of sparse layers with a specific topology. We present an efficient and data-agnostic method for sparse network topology optimization, and show that parallel butterfly networks with skip connections achieve the best accuracy independent of sparsity or depth. We also present a multi-objective optimization framework that finds a Pareto frontier of hardware-ML model configurations over six objectives: accuracy, parameter count, throughput, latency, power, and hardware area.
KW - acceleration
KW - neural network
KW - sparsity
KW - topology
UR - http://www.scopus.com/inward/record.url?scp=85145876251&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85145876251&partnerID=8YFLogxK
U2 - 10.1109/ICCD56317.2022.00088
DO - 10.1109/ICCD56317.2022.00088
M3 - Conference contribution
AN - SCOPUS:85145876251
T3 - Proceedings - IEEE International Conference on Computer Design: VLSI in Computers and Processors
SP - 561
EP - 564
BT - Proceedings - 2022 IEEE 40th International Conference on Computer Design, ICCD 2022
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 23 October 2022 through 26 October 2022
ER -