TY - JOUR
T1 - Exploring Model Stability of Deep Neural Networks for Reliable RRAM-Based In-Memory Acceleration
AU - Krishnan, Gokul
AU - Yang, Li
AU - Sun, Jingbo
AU - Hazra, Jubin
AU - Du, Xiaocong
AU - Liehr, Maximilian
AU - Li, Zheng
AU - Beckmann, Karsten
AU - Joshi, Rajiv V.
AU - Cady, Nathaniel C.
AU - Fan, Deliang
AU - Cao, Yu
N1 - Publisher Copyright: © 2022 IEEE.
PY - 2022/11/1
Y1 - 2022/11/1
AB - RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs). Furthermore, model compression techniques, such as quantization and pruning, are necessary to improve algorithm mapping and hardware performance. However, in the presence of RRAM device variations, low-precision and sparse DNNs suffer from severe post-mapping accuracy loss. To address this, we investigate a new metric, model stability, derived from the loss landscape, to help shed light on accuracy loss under variations and model compression; this metric guides an algorithmic solution to maximize model stability and mitigate accuracy loss. Based on statistical data from a CMOS/RRAM 1T1R test chip at 65 nm, we characterize wafer-level RRAM variations and develop a cross-layer benchmark tool that incorporates quantization, pruning, device variations, model stability, and IMC architecture parameters to assess post-mapping accuracy and hardware performance. Leveraging this tool, we show that a loss-landscape-based DNN model selection for stability effectively tolerates device variations and achieves a post-mapping accuracy higher than that with 50% lower RRAM variations. Moreover, we quantitatively interpret why model pruning increases the sensitivity to variations, while a lower-precision model has better tolerance to variations. Finally, we propose a novel variation-aware training method to improve model stability, in which there exists the most stable model for the best post-mapping accuracy of compressed DNNs. Experimental evaluation of the method shows up to 19%, 21%, and 11% post-mapping accuracy improvement for our 65 nm RRAM device, across various precision and sparsity, on CIFAR-10, CIFAR-100, and SVHN datasets, respectively.
KW - In-memory computing
KW - RRAM
KW - deep neural networks
KW - model stability
KW - pruning
KW - quantization
KW - reliability
UR - http://www.scopus.com/inward/record.url?scp=85132515239&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132515239&partnerID=8YFLogxK
U2 - 10.1109/TC.2022.3174585
DO - 10.1109/TC.2022.3174585
M3 - Article
AN - SCOPUS:85132515239
SN - 0018-9340
VL - 71
SP - 2740
EP - 2752
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 11
ER -