Inference engine benchmarking across technological platforms from CMOS to RRAM

Xiaochen Peng, Minkyu Kim, Xiaoyu Sun, Shihui Yin, Titash Rakshit, Ryan M. Hatcher, Jorge A. Kittl, Jae-sun Seo, Shimeng Yu

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

8 Scopus citations


State-of-the-art deep convolutional neural networks (CNNs) are widely used in current AI systems and achieve remarkable success in image/speech recognition and classification. A number of recent efforts have attempted to design custom inference engines based on various approaches, including the systolic architecture, near-memory processing, and the processing-in-memory (PIM) approach with emerging technologies such as resistive random access memory (RRAM). However, a comprehensive comparison of these approaches in a unified framework is missing, and the claimed benefits of new designs or emerging technologies are mostly based on qualitative projections. In this paper, we evaluate the energy efficiency and frame rate of a VGG-like CNN inference accelerator on the CIFAR-10 dataset across technological platforms from CMOS to post-CMOS, under a hardware resource constraint, i.e., comparable on-chip area. We also investigate the effects of off-chip DRAM access and of interconnect during data movement, which are the bottlenecks of CMOS platforms. Our quantitative analysis shows that the peripheral circuits (ADCs), rather than the memory array, dominate energy consumption and area in the digital RRAM-based parallel-readout PIM architecture. Despite the presence of ADCs, this architecture shows a >2.5× improvement in energy efficiency (TOPS/W) over systolic arrays or near-memory processing, with a comparable frame rate, owing to reduced DRAM access, high throughput, and optimized parallel readout. Further >10× improvements can be achieved by implementing a bit-count reduced XNOR network and pipelining.
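The two figures of merit named in the abstract can be related to per-inference quantities with a short calculation. The sketch below is illustrative only: it assumes an engine that processes one frame at a time, the function names are hypothetical, and the input values are placeholders rather than results from the paper. TOPS/W is taken as tera-operations per second divided by watts, which equals operations per joule divided by 10^12.

```python
def tops_per_watt(total_ops: float, energy_joules: float) -> float:
    """Energy efficiency in TOPS/W, i.e. tera-operations per joule."""
    return total_ops / energy_joules / 1e12


def frames_per_second(latency_seconds_per_frame: float) -> float:
    """Frame rate for an engine that processes one frame at a time."""
    return 1.0 / latency_seconds_per_frame


# Placeholder inputs for illustration only; NOT results from the paper.
ops_per_frame = 1.0e9        # hypothetical operations per inference
energy_per_frame = 1.0e-4    # hypothetical joules per inference
latency_per_frame = 1.0e-3   # hypothetical seconds per inference

efficiency = tops_per_watt(ops_per_frame, energy_per_frame)
fps = frames_per_second(latency_per_frame)
print(f"{efficiency:.1f} TOPS/W at {fps:.0f} frames/s")
```

With pipelining, throughput is no longer the reciprocal of single-frame latency, so the frame-rate helper above would need to be replaced by a throughput measurement.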

Original language: English (US)
Title of host publication: MEMSYS 2019 - Proceedings of the International Symposium on Memory Systems
Publisher: Association for Computing Machinery
Number of pages: 9
ISBN (Electronic): 9781450372060
State: Published - Sep 30 2019
Event: 2019 International Symposium on Memory Systems, MEMSYS 2019 - Washington, United States
Duration: Sep 30 2019 to Oct 3 2019

Publication series

Name: ACM International Conference Proceeding Series


Conference: 2019 International Symposium on Memory Systems, MEMSYS 2019
Country/Territory: United States


Keywords

  • Deep convolutional neural network
  • Hardware accelerator
  • Near memory processing
  • Processing in memory
  • Resistive random access memory
  • Systolic architecture

ASJC Scopus subject areas

  • Human-Computer Interaction
  • Computer Networks and Communications
  • Computer Vision and Pattern Recognition
  • Software

