TY - GEN
T1 - LASER
T2 - 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018
AU - Balasubramanian, Mahesh
AU - Dave, Shail
AU - Shrivastava, Aviral
AU - Jeyapaul, Reiley
N1 - Funding Information:
This work was partially supported by funding from National Science Foundation grants CCF 1055094 (CAREER), CNS 1525855 and CCF 1723476.
Publisher Copyright:
© 2018 EDAA.
PY - 2018/4/19
Y1 - 2018/4/19
N2 - Coarse-Grained Reconfigurable Arrays (CGRAs) are popular accelerators predominantly used in streaming, filtering, and decoding applications. Due to their high performance and high power efficiency, CGRAs are also a promising solution for accelerating the loops of general-purpose applications. However, the loops in general-purpose applications are often complicated, such as loops with perfect and imperfect nests and loops with nested if-then-else conditionals. We argue that the existing hardware-software solutions to execute branches and conditions are inefficient. In order to efficiently execute complicated loops on CGRAs, we present a hardware-software hybrid solution: LASER, a comprehensive technique to accelerate compute-intensive loops of applications. In LASER, the compiler transforms complex loops, maps them to the CGRA, and lays them out in memory in a specific manner, such that the hardware can fetch and execute the instructions from the right path at runtime. LASER achieves a geomean performance improvement of 40.91% and utilization of 43.43% with 46% lower energy consumption.
AB - Coarse-Grained Reconfigurable Arrays (CGRAs) are popular accelerators predominantly used in streaming, filtering, and decoding applications. Due to their high performance and high power efficiency, CGRAs are also a promising solution for accelerating the loops of general-purpose applications. However, the loops in general-purpose applications are often complicated, such as loops with perfect and imperfect nests and loops with nested if-then-else conditionals. We argue that the existing hardware-software solutions to execute branches and conditions are inefficient. In order to efficiently execute complicated loops on CGRAs, we present a hardware-software hybrid solution: LASER, a comprehensive technique to accelerate compute-intensive loops of applications. In LASER, the compiler transforms complex loops, maps them to the CGRA, and lays them out in memory in a specific manner, such that the hardware can fetch and execute the instructions from the right path at runtime. LASER achieves a geomean performance improvement of 40.91% and utilization of 43.43% with 46% lower energy consumption.
UR - http://www.scopus.com/inward/record.url?scp=85048793392&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85048793392&partnerID=8YFLogxK
U2 - 10.23919/DATE.2018.8342170
DO - 10.23919/DATE.2018.8342170
M3 - Conference contribution
AN - SCOPUS:85048793392
T3 - Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018
SP - 1069
EP - 1074
BT - Proceedings of the 2018 Design, Automation and Test in Europe Conference and Exhibition, DATE 2018
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 March 2018 through 23 March 2018
ER -