Although recent resistive random access memory (ReRAM)-based accelerator designs for convolutional neural networks (CNNs) achieve better timing performance and area efficiency than CMOS-based accelerators, they have high energy consumption due to low inter-layer data reuse. In this work, we propose a multi-tile ReRAM accelerator that supports multiple CNN topologies, where each tile processes one or more layers in a pipelined fashion. Building upon the fact that a tile with a large receptive field can be built with a stack of smaller (3×3) filters, we design every tile with 9 processing elements that operate in a systolic fashion. The systolic data flow design maximizes input feature map reuse and minimizes interconnection cost. We show that 1-bit weights and 4-bit activations achieve good accuracy for both AlexNet and VGGNet, and we design our ReRAM-based accelerator to support this configuration. System-level simulation results at the 32 nm node show that the proposed architecture for AlexNet with stacked small filters can achieve a computation efficiency of 8.42 TOPs/s/mm², an energy efficiency of 4.08 TOPs/s/W, and a storage efficiency of 0.18 MB/mm² for inference on one image from the CIFAR-100 dataset.
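
The stacking argument in the abstract can be checked with a little arithmetic. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes stride-1, undilated convolutions with the same channel count at every layer of the stack. Under those assumptions, n stacked 3×3 filters cover a (2n+1)×(2n+1) receptive field, so an 11×11 filter (as in AlexNet's first layer) corresponds to a stack of five 3×3 filters.

```python
def stacked_receptive_field(num_layers: int, kernel: int = 3) -> int:
    """Receptive field (one side) of num_layers stacked stride-1 kernel x kernel convolutions."""
    if num_layers < 1:
        raise ValueError("num_layers must be at least 1")
    return num_layers * (kernel - 1) + 1


def layers_needed(target: int, kernel: int = 3) -> int:
    """Smallest stack depth whose receptive field covers a target x target filter."""
    if target < 1:
        raise ValueError("target must be at least 1")
    # Ceiling division of (target - 1) by (kernel - 1); at least one layer is always used.
    return max(1, -(-(target - 1) // (kernel - 1)))


def weights_per_channel_pair(num_layers: int, kernel: int = 3) -> int:
    """Weights per input/output channel pair for the whole stack.

    Assumption: every layer in the stack has the same number of channels.
    """
    return num_layers * kernel * kernel


if __name__ == "__main__":
    for size in (11, 7, 5, 3):
        depth = layers_needed(size)
        print(
            f"{size}x{size} filter: {depth} stacked 3x3 layers, "
            f"receptive field {stacked_receptive_field(depth)}, "
            f"{weights_per_channel_pair(depth)} weights vs {size * size}"
        )
```

Under these assumptions the stack also needs fewer weights per channel pair than the single large filter (45 versus 121 for the 11×11 case), which is one reason a fixed 3×3 tile design can remain general across topologies.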

Original language: English (US)
Title of host publication: Proceedings of the IEEE Workshop on Signal Processing Systems, SiPS 2018
Publisher: Institute of Electrical and Electronics Engineers Inc.
Number of pages: 6
ISBN (Electronic): 9781538663189
State: Published - Dec 31 2018
Event: 2018 IEEE Workshop on Signal Processing Systems, SiPS 2018 - Cape Town, South Africa
Duration: Oct 21 2018 → Oct 24 2018

Publication series

Name: IEEE Workshop on Signal Processing Systems, SiPS: Design and Implementation
ISSN (Print): 1520-6130


Conference: 2018 IEEE Workshop on Signal Processing Systems, SiPS 2018
Country/Territory: South Africa
City: Cape Town


Keywords

  • CNN
  • ReRAM
  • accelerator
  • systolic

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Signal Processing
  • Applied Mathematics
  • Hardware and Architecture


Title: A Versatile ReRAM-based Accelerator for Convolutional Neural Networks
