Sparse and Robust RRAM-based Efficient In-memory Computing for DNN Inference

Jian Meng; Injune Yeo; Wonbo Shim; Li Yang; Deliang Fan; Shimeng Yu; Jae Sun Seo

doi:10.1109/IRPS48227.2022.9764480


Ira A. Fulton Schools of Engineering (IAFSE)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Resistive random-access memory (RRAM)-based in-memory computing (IMC) has recently become a promising paradigm for efficient deep neural network (DNN) acceleration. Multi-bit RRAM arrays provide dense storage and high throughput, but the physical non-ideality of RRAM devices impairs the retention characteristics of the resistive cells, leading to accuracy degradation. On the algorithm side, various hardware-aware compression algorithms have been proposed to accelerate DNN computation. However, most recent works consider "model compression" and "hardware robustness" separately, and the impact of RRAM non-ideality on sparse models remains under-explored. In this work, we present a novel temperature-resilient RRAM-based IMC scheme for reliable DNN inference hardware. Based on measurements from a 90nm RRAM prototype chip, we first explore the robustness of the sparse model under different operating temperatures (25°C to 85°C). On top of that, we propose a novel robustness-aware pruning algorithm, then further enhance the model robustness with a novel sparsity-aware noise-injected fine-tuning. The proposed scheme achieves >92% CIFAR-10 inference accuracy after one-day operation, which is >37% higher than the state-of-the-art method.
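The abstract names two algorithmic ingredients, structured pruning and noise injection restricted to the weights that survive pruning, without giving their details. The following is a minimal generic sketch of those two ideas only, not the authors' algorithm. The L2-norm pruning criterion, the Gaussian noise model, and the names `prune_channels`, `inject_noise`, `keep_ratio`, and `sigma` are all assumptions made for illustration; the paper derives its non-ideality from chip measurements, which this sketch does not model.

```python
import math
import random


def prune_channels(weights, keep_ratio):
    """Structured pruning: zero out whole output channels (rows).

    Assumption: channels are ranked by L2 norm, a common generic criterion.
    The paper's robustness-aware criterion is not given in the abstract.
    """
    norms = [math.sqrt(sum(w * w for w in row)) for row in weights]
    n_keep = max(1, round(keep_ratio * len(weights)))
    ranked = sorted(range(len(weights)), key=lambda i: norms[i], reverse=True)
    keep = set(ranked[:n_keep])
    mask = [1 if i in keep else 0 for i in range(len(weights))]
    pruned = [[w * mask[i] for w in row] for i, row in enumerate(weights)]
    return pruned, mask


def inject_noise(weights, mask, sigma, rng):
    """Perturb only the surviving channels; pruned channels stay exactly zero.

    Assumption: zero-mean Gaussian noise stands in for device non-ideality.
    """
    return [
        [w + rng.gauss(0.0, sigma) if mask[i] else 0.0 for w in row]
        for i, row in enumerate(weights)
    ]


# Hypothetical 4-channel layer with 2 weights per channel.
rng = random.Random(0)
weights = [[0.9, -0.8], [0.01, 0.02], [0.5, 0.4], [-0.03, 0.01]]
pruned, mask = prune_channels(weights, keep_ratio=0.5)
noisy = inject_noise(pruned, mask, sigma=0.05, rng=rng)
```

In a fine-tuning loop, a perturbation like `inject_noise` would be applied to the weights in each forward pass so that the surviving weights learn to tolerate drift, while the mask keeps the pruned channels at zero.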

Original language	English (US)
Title of host publication	2022 IEEE International Reliability Physics Symposium, IRPS 2022 - Proceedings
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	3C11-3C16
ISBN (Electronic)	9781665479509
DOIs	https://doi.org/10.1109/IRPS48227.2022.9764480
State	Published - 2022
Event	2022 IEEE International Reliability Physics Symposium, IRPS 2022 - Dallas, United States Duration: Mar 27 2022 → Mar 31 2022

Publication series

Name	IEEE International Reliability Physics Symposium Proceedings
Volume	2022-March
ISSN (Print)	1541-7026

Conference

Conference	2022 IEEE International Reliability Physics Symposium, IRPS 2022
Country/Territory	United States
City	Dallas
Period	3/27/22 → 3/31/22

Keywords

Convolutional neural network
data retention
in-memory computing
multilevel RRAM
structured pruning

ASJC Scopus subject areas

General Engineering


Cite this

Meng, J., Yeo, I., Shim, W., Yang, L., Fan, D., Yu, S., & Seo, J. S. (2022). Sparse and Robust RRAM-based Efficient In-memory Computing for DNN Inference. In 2022 IEEE International Reliability Physics Symposium, IRPS 2022 - Proceedings (pp. 3C11-3C16). (IEEE International Reliability Physics Symposium Proceedings; Vol. 2022-March). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IRPS48227.2022.9764480



