TY - GEN
T1 - Accurate inference with inaccurate RRAM devices
T2 - 57th ACM/IEEE Design Automation Conference, DAC 2020
AU - Charan, Gouranga
AU - Hazra, Jubin
AU - Beckmann, Karsten
AU - Du, Xiaocong
AU - Krishnan, Gokul
AU - Joshi, Rajiv V.
AU - Cady, Nathaniel C.
AU - Cao, Yu
N1 - Funding Information:
This work was supported in part by C-BRIC, one of six centers in JUMP, a Semiconductor Research Corporation (SRC) program, and in part by the National Science Foundation (NSF) under CCF 1715443.
Publisher Copyright:
© 2020 IEEE.
PY - 2020/7
Y1 - 2020/7
AB - Resistive random-access memory (RRAM) is a promising technology for in-memory computing, offering high storage density, fast inference, and good compatibility with CMOS. However, mapping a pre-trained deep neural network (DNN) model onto RRAM suffers from realistic device issues, especially variation and quantization error, resulting in a significant loss of inference accuracy. In this work, we first extract these statistical properties from 65 nm RRAM data measured on 300 mm wafers. The RRAM data exhibit 10 quantization levels and 50% variance, causing accuracy to drop to 31.76% and 10.49% on the MNIST and CIFAR-10 datasets, respectively. Based on the experimental data, we propose a combination of machine learning algorithms and on-line adaptation to recover the accuracy with minimal overhead. The recipe first applies Knowledge Distillation (KD) to transfer the ideal model to a student model that incorporates the statistical variations and 10 quantization levels. An on-line sparse adaptation (OSA) method is then applied to the DNN model mapped onto the RRAM array. Using importance sampling, OSA adds a small SRAM array that is sparsely connected to the main RRAM array; only this SRAM array is updated to recover the accuracy. As demonstrated on the MNIST and CIFAR-10 datasets, a 7.86% area cost is sufficient to achieve baseline accuracy with the 65 nm RRAM devices.
KW - In-memory computing
KW - Knowledge Distillation
KW - On-line adaptation
KW - Resistive random access memory (RRAM)
KW - Robustness
UR - http://www.scopus.com/inward/record.url?scp=85093927059&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85093927059&partnerID=8YFLogxK
U2 - 10.1109/DAC18072.2020.9218605
DO - 10.1109/DAC18072.2020.9218605
M3 - Conference contribution
AN - SCOPUS:85093927059
T3 - Proceedings - Design Automation Conference
BT - 2020 57th ACM/IEEE Design Automation Conference, DAC 2020
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 20 July 2020 through 24 July 2020
ER -