Unrolling and retiming of stream applications onto embedded multicore processors

Weijia Che, Karam S. Chatha

Research output: Chapter in Book/Report/Conference proceedingConference contribution

7 Scopus citations


In recent years, we have observed the prevalence of stream applications in many embedded domains. Stream applications distinguish themselves from traditional sequential programming languages through well defined independent actors, explicit data communication, and stable code/data access patterns. In order to achieve high performance and low power, scratch pad memory (SPM) has been introduced in today's embedded multicore processors. Programing on SPM based architecture is both challenging and time consuming. In this paper we address the problem of automatic compilation of stream applications onto SPM based embedded multicore processors through unrolling and retiming. In our technique, code overlay and data overlay are implemented to overcome the limited SPM capacity. Smart double buffering and code prefetching are introduced to amortize memory access delays. We evaluated the efficiency of our technique through compiling several stream applications onto the IBM Cell processor and compared their performance with existing approaches.

Original languageEnglish (US)
Title of host publicationProceedings of the 49th Annual Design Automation Conference, DAC '12
Number of pages6
StatePublished - 2012
Event49th Annual Design Automation Conference, DAC '12 - San Francisco, CA, United States
Duration: Jun 3 2012Jun 7 2012

Publication series

NameProceedings - Design Automation Conference
ISSN (Print)0738-100X


Other49th Annual Design Automation Conference, DAC '12
Country/TerritoryUnited States
CitySan Francisco, CA


  • SPM
  • multicore
  • overlay
  • retiming
  • stream
  • unrolling

ASJC Scopus subject areas

  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering
  • Modeling and Simulation


Dive into the research topics of 'Unrolling and retiming of stream applications onto embedded multicore processors'. Together they form a unique fingerprint.

Cite this