OuterSPACE: An Outer Product Based Sparse Matrix Multiplication Accelerator

Subhankar Pal, Jonathan Beaumont, Dong Hyeon Park, Aporva Amarnath, Siying Feng, Chaitali Chakrabarti, Hun Seok Kim, David Blaauw, Trevor Mudge, Ronald Dreslinski

Research output: Chapter in Book/Report/Conference proceedingConference contribution

135 Scopus citations


Sparse matrices are widely used in graph and data analytics, machine learning, engineering and scientific applications. This paper describes and analyzes OuterSPACE, an accelerator targeted at applications that involve large sparse matrices. OuterSPACE is a highly-scalable, energy-efficient, reconfigurable design, consisting of massively parallel Single Program, Multiple Data (SPMD)-style processing units, distributed memories, high-speed crossbars and High Bandwidth Memory (HBM). We identify redundant memory accesses to non-zeros as a key bottleneck in traditional sparse matrix-matrix multiplication algorithms. To ameliorate this, we implement an outer product based matrix multiplication technique that eliminates redundant accesses by decoupling multiplication from accumulation. We demonstrate that traditional architectures, due to limitations in their memory hierarchies and ability to harness parallelism in the algorithm, are unable to take advantage of this reduction without incurring significant overheads. OuterSPACE is designed to specifically overcome these challenges. We simulate the key components of our architecture using gem5 on a diverse set of matrices from the University of Florida's SuiteSparse collection and the Stanford Network Analysis Project and show a mean speedup of 7.9× over Intel Math Kernel Library on a Xeon CPU, 13.0× against cuSPARSE and 14.0× against CUSP when run on an NVIDIA K40 GPU, while achieving an average throughput of 2.9 GFLOPS within a 24 W power budget in an area of 87 mm2.

Original languageEnglish (US)
Title of host publicationProceedings - 24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018
PublisherIEEE Computer Society
Number of pages13
ISBN (Electronic)9781538636596
StatePublished - Mar 27 2018
Event24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018 - Vienna, Austria
Duration: Feb 24 2018Feb 28 2018

Publication series

NameProceedings - International Symposium on High-Performance Computer Architecture
ISSN (Print)1530-0897


Other24th IEEE International Symposium on High Performance Computer Architecture, HPCA 2018


  • Application specific hardware
  • Hardware accelerators
  • Hardware software co design
  • Parallel computer architecture
  • Sparse matrix processing

ASJC Scopus subject areas

  • Hardware and Architecture


Dive into the research topics of 'OuterSPACE: An Outer Product Based Sparse Matrix Multiplication Accelerator'. Together they form a unique fingerprint.

Cite this