Exploring Model Stability of Deep Neural Networks for Reliable RRAM-Based In-Memory Acceleration

Gokul Krishnan; Li Yang; Jingbo Sun; Jubin Hazra; Xiaocong Du; Maximilian Liehr; Zheng Li; Karsten Beckmann; Rajiv V. Joshi; Nathaniel C. Cady; Deliang Fan; Yu Cao

doi:10.1109/TC.2022.3174585

Research output: Contribution to journal › Article › peer-review

4 Scopus citations

Abstract

RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs). Furthermore, model compression techniques, such as quantization and pruning, are necessary to improve algorithm mapping and hardware performance. However, in the presence of RRAM device variations, low-precision and sparse DNNs suffer from severe post-mapping accuracy loss. To address this, in this work, we investigate a new metric, model stability, from the loss landscape to help shed light on accuracy loss under variations and model compression, which guides an algorithmic solution to maximize model stability and mitigate accuracy loss. Based on statistical data from a CMOS/RRAM 1T1R test chip at 65nm, we characterize wafer-level RRAM variations and develop a cross-layer benchmark tool that incorporates quantization, pruning, device variations, model stability, and IMC architecture parameters to assess post-mapping accuracy and hardware performance. Leveraging this tool, we show that a loss-landscape-based DNN model selection for stability effectively tolerates device variations and achieves a post-mapping accuracy higher than that with 50% lower RRAM variations. Moreover, we quantitatively interpret why model pruning increases the sensitivity to variations, while a lower-precision model has better tolerance to variations. Finally, we propose a novel variation-aware training method to improve model stability, in which there exists the most stable model for the best post-mapping accuracy of compressed DNNs. Experimental evaluation of the method shows up to 19%, 21%, and 11% post-mapping accuracy improvement for our 65nm RRAM device, across various precision and sparsity, on CIFAR-10, CIFAR-100, and SVHN datasets, respectively.
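The link the abstract draws between loss-landscape flatness and tolerance to device variations can be illustrated with a toy experiment. The sketch below is not the authors' benchmark tool or their stability metric. It assumes a multiplicative Gaussian model for RRAM variation, a simple symmetric uniform quantizer, and a quadratic toy loss, and all function names (`quantize`, `perturb`, `mean_loss_increase`, `quadratic_loss`) are hypothetical. It measures how much the loss rises on average when the weights at a minimum are perturbed; a flatter minimum should show a smaller rise.

```python
import random


def quantize(weights, bits):
    """Symmetric uniform quantization (assumed scheme, not taken from the paper)."""
    levels = 2 ** (bits - 1) - 1
    scale = max(abs(w) for w in weights) / levels
    if scale == 0.0:
        return list(weights)
    return [round(w / scale) * scale for w in weights]


def perturb(weights, sigma, rng):
    """Assumed variation model: each weight becomes w * (1 + sigma * z), z ~ N(0, 1)."""
    return [w * (1.0 + sigma * rng.gauss(0.0, 1.0)) for w in weights]


def mean_loss_increase(loss_fn, weights, sigma, trials=200, seed=0):
    """Average loss increase under random perturbation: a crude flatness proxy."""
    rng = random.Random(seed)
    base = loss_fn(weights)
    total = 0.0
    for _ in range(trials):
        total += loss_fn(perturb(weights, sigma, rng)) - base
    return total / trials


def quadratic_loss(curvature, optimum):
    """Toy loss with a single minimum at `optimum`; larger curvature = sharper minimum."""
    def loss(weights):
        return curvature * sum((w - o) ** 2 for w, o in zip(weights, optimum))
    return loss


if __name__ == "__main__":
    target = [0.5, -1.0, 0.25]
    sharp = quadratic_loss(10.0, target)
    flat = quadratic_loss(1.0, target)
    print("sharp minimum:", mean_loss_increase(sharp, target, sigma=0.1))
    print("flat minimum: ", mean_loss_increase(flat, target, sigma=0.1))
    print("2-bit weights:", quantize(target, bits=2))
```

Because both runs share the same seed, they see identical noise samples, so the difference in loss increase comes only from the curvature of the minimum. A real study would replace the toy loss with a trained DNN and the Gaussian model with measured device statistics.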

Original language	English (US)
Pages (from-to)	2740-2752
Number of pages	13
Journal	IEEE Transactions on Computers
Volume	71
Issue number	11
DOIs	https://doi.org/10.1109/TC.2022.3174585
State	Published - Nov 1 2022

Keywords

In-memory computing
RRAM
deep neural networks
model stability
pruning
quantization
reliability

ASJC Scopus subject areas

Software
Theoretical Computer Science
Hardware and Architecture
Computational Theory and Mathematics

Cite this

@article{30c2005e45334dc38f0901dd48ffd48a,
title = "Exploring Model Stability of Deep Neural Networks for Reliable RRAM-Based In-Memory Acceleration",
abstract = "RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs). Furthermore, model compression techniques, such as quantization and pruning, are necessary to improve algorithm mapping and hardware performance. However, in the presence of RRAM device variations, low-precision and sparse DNNs suffer from severe post-mapping accuracy loss. To address this, in this work, we investigate a new metric, model stability, from the loss landscape to help shed light on accuracy loss under variations and model compression, which guides an algorithmic solution to maximize model stability and mitigate accuracy loss. Based on statistical data from a CMOS/RRAM 1T1R test chip at 65nm, we characterize wafer-level RRAM variations and develop a cross-layer benchmark tool that incorporates quantization, pruning, device variations, model stability, and IMC architecture parameters to assess post-mapping accuracy and hardware performance. Leveraging this tool, we show that a loss-landscape-based DNN model selection for stability effectively tolerates device variations and achieves a post-mapping accuracy higher than that with 50% lower RRAM variations. Moreover, we quantitatively interpret why model pruning increases the sensitivity to variations, while a lower-precision model has better tolerance to variations. Finally, we propose a novel variation-aware training method to improve model stability, in which there exists the most stable model for the best post-mapping accuracy of compressed DNNs. Experimental evaluation of the method shows up to 19%, 21%, and 11% post-mapping accuracy improvement for our 65nm RRAM device, across various precision and sparsity, on CIFAR-10, CIFAR-100, and SVHN datasets, respectively.",

keywords = "In-memory computing, RRAM, deep neural networks, model stability, pruning, quantization, reliability",
author = "Gokul Krishnan and Li Yang and Jingbo Sun and Jubin Hazra and Xiaocong Du and Maximilian Liehr and Zheng Li and Karsten Beckmann and Joshi, {Rajiv V.} and Cady, {Nathaniel C.} and Deliang Fan and Yu Cao",
note = "Publisher Copyright: {\textcopyright} 2022 IEEE.",
year = "2022",
month = nov,
day = "1",
doi = "10.1109/TC.2022.3174585",
language = "English (US)",
volume = "71",
pages = "2740--2752",
journal = "IEEE Transactions on Computers",
issn = "0018-9340",
publisher = "IEEE Computer Society",
number = "11",
}

TY - JOUR
T1 - Exploring Model Stability of Deep Neural Networks for Reliable RRAM-Based In-Memory Acceleration
AU - Krishnan, Gokul
AU - Yang, Li
AU - Sun, Jingbo
AU - Hazra, Jubin
AU - Du, Xiaocong
AU - Liehr, Maximilian
AU - Li, Zheng
AU - Beckmann, Karsten
AU - Joshi, Rajiv V.
AU - Cady, Nathaniel C.
AU - Fan, Deliang
AU - Cao, Yu
PY - 2022/11/1
Y1 - 2022/11/1

N2 - RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs). Furthermore, model compression techniques, such as quantization and pruning, are necessary to improve algorithm mapping and hardware performance. However, in the presence of RRAM device variations, low-precision and sparse DNNs suffer from severe post-mapping accuracy loss. To address this, in this work, we investigate a new metric, model stability, from the loss landscape to help shed light on accuracy loss under variations and model compression, which guides an algorithmic solution to maximize model stability and mitigate accuracy loss. Based on statistical data from a CMOS/RRAM 1T1R test chip at 65nm, we characterize wafer-level RRAM variations and develop a cross-layer benchmark tool that incorporates quantization, pruning, device variations, model stability, and IMC architecture parameters to assess post-mapping accuracy and hardware performance. Leveraging this tool, we show that a loss-landscape-based DNN model selection for stability effectively tolerates device variations and achieves a post-mapping accuracy higher than that with 50% lower RRAM variations. Moreover, we quantitatively interpret why model pruning increases the sensitivity to variations, while a lower-precision model has better tolerance to variations. Finally, we propose a novel variation-aware training method to improve model stability, in which there exists the most stable model for the best post-mapping accuracy of compressed DNNs. Experimental evaluation of the method shows up to 19%, 21%, and 11% post-mapping accuracy improvement for our 65nm RRAM device, across various precision and sparsity, on CIFAR-10, CIFAR-100, and SVHN datasets, respectively.

AB - RRAM-based in-memory computing (IMC) effectively accelerates deep neural networks (DNNs). Furthermore, model compression techniques, such as quantization and pruning, are necessary to improve algorithm mapping and hardware performance. However, in the presence of RRAM device variations, low-precision and sparse DNNs suffer from severe post-mapping accuracy loss. To address this, in this work, we investigate a new metric, model stability, from the loss landscape to help shed light on accuracy loss under variations and model compression, which guides an algorithmic solution to maximize model stability and mitigate accuracy loss. Based on statistical data from a CMOS/RRAM 1T1R test chip at 65nm, we characterize wafer-level RRAM variations and develop a cross-layer benchmark tool that incorporates quantization, pruning, device variations, model stability, and IMC architecture parameters to assess post-mapping accuracy and hardware performance. Leveraging this tool, we show that a loss-landscape-based DNN model selection for stability effectively tolerates device variations and achieves a post-mapping accuracy higher than that with 50% lower RRAM variations. Moreover, we quantitatively interpret why model pruning increases the sensitivity to variations, while a lower-precision model has better tolerance to variations. Finally, we propose a novel variation-aware training method to improve model stability, in which there exists the most stable model for the best post-mapping accuracy of compressed DNNs. Experimental evaluation of the method shows up to 19%, 21%, and 11% post-mapping accuracy improvement for our 65nm RRAM device, across various precision and sparsity, on CIFAR-10, CIFAR-100, and SVHN datasets, respectively.

KW - In-memory computing
KW - RRAM
KW - deep neural networks
KW - model stability
KW - pruning
KW - quantization
KW - reliability
UR - http://www.scopus.com/inward/record.url?scp=85132515239&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85132515239&partnerID=8YFLogxK
U2 - 10.1109/TC.2022.3174585
DO - 10.1109/TC.2022.3174585
M3 - Article
AN - SCOPUS:85132515239
SN - 0018-9340
VL - 71
SP - 2740
EP - 2752
JO - IEEE Transactions on Computers
JF - IEEE Transactions on Computers
IS - 11
ER -
