TY - JOUR
T1 - Array-Level Programming of 3-Bit per Cell Resistive Memory and Its Application for Deep Neural Network Inference
AU - Luo, Yandong
AU - Han, Xu
AU - Ye, Zhilu
AU - Barnaby, Hugh
AU - Seo, Jae Sun
AU - Yu, Shimeng
N1 - Funding Information:
Manuscript received July 10, 2020; accepted August 5, 2020. Date of publication August 24, 2020; date of current version October 22, 2020. This work was supported in part by the Applications and Systems-Driven Center for Energy-Efficient Integrated NanoTechnologies (ASCENT) and Center for Brain-inspired Computing Enabling Autonomous Intelligence (C-BRIC), two of Semiconductor Research Corporation (SRC)/Defense Advanced Research Projects Agency (DARPA) JUMP Centers, in part by the National Science Foundation (NSF)/SRC Energy-Efficient Computing: from Devices to Architectures (E2CDA) Program, and in part by the NSF under Grant NSF-CCF-1903951. The review of this article was arranged by Editor C. Monzio Compagnoni. (Yandong Luo and Xu Han contributed equally to this work.) (Corresponding author: Yandong Luo.) Yandong Luo and Shimeng Yu are with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: yandongluo@gatech.edu; shimeng.yu@ece.gatech.edu).
Publisher Copyright:
© 1963-2012 IEEE.
PY - 2020/11
Y1 - 2020/11
N2 - The requirements for multilevel cell (MLC) resistive random access memory (RRAM) used for computing differ from those for MLC storage. Computing generally requires linearly spaced conductance medians and an ultratight conductance distribution, because the column currents are summed for analog computation. In this article, a 3-bit per cell RRAM suitable for accurate deep neural network (DNN) inference is demonstrated, with an ultratight conductance distribution (<1.5% sigma). First, a two-loop write-verify protocol is proposed. Then, statistical experiments are conducted on an RRAM array fabricated in Winbond's 90-nm process. By incorporating the measured conductance distribution into a DNN simulation that accounts for the real weight mapping, an inference accuracy with only 0.5% degradation relative to the software baseline is achieved on the CIFAR-10 data set, even when 128 rows are read out in parallel. By enabling parallel read-out, the system-level energy efficiency and throughput could be improved by 5.3× and 4.4×, respectively, compared to the 3-bit per cell RRAM used as MLC storage.
KW - Compute-in-memory (CIM)
KW - deep neural network (DNN)
KW - multilevel cell (MLC)
KW - resistive random access memory (RRAM)
UR - http://www.scopus.com/inward/record.url?scp=85094919873&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85094919873&partnerID=8YFLogxK
U2 - 10.1109/TED.2020.3015940
DO - 10.1109/TED.2020.3015940
M3 - Article
AN - SCOPUS:85094919873
SN - 0018-9383
VL - 67
SP - 4621
EP - 4625
JO - IEEE Transactions on Electron Devices
JF - IEEE Transactions on Electron Devices
IS - 11
M1 - 9174666
ER -