TY - GEN
T1 - Characterizing Loop Acceleration in Heterogeneous Computing
AU - Biookaghazadeh, Saman
AU - Ren, Fengbo
AU - Zhao, Ming
N1 - Funding Information:
We thank the anonymous reviewers for their helpful comments. This work is supported by National Science Foundation awards CNS-1955593, CNS-1562837, and CNS-1629888 and Intel’s donation of the Fog Reference Design units.
Publisher Copyright:
© 2021 IEEE.
PY - 2021/9
Y1 - 2021/9
N2 - Computation-intensive applications usually consist of multiple nested or flattened loops. These loops are the main building blocks of the applications and embody a specific type of execution pattern. To reduce the running time of the loops, developers need to analyze the loops in the code and try to parallelize them on hardware accelerators, such as GPUs, TPUs, and FPGAs, which are increasingly available in the cloud. Unfortunately, a lack of understanding of loop characteristics and of the ability of hardware accelerators to handle these types of loops prevents developers from choosing the right platform to develop their applications in the cloud. Also, developing and optimizing code for a specific accelerator is a time-consuming effort. To address these issues, this paper studies the effectiveness of different processors in accelerating common patterns of loops. It identifies five important types of loops that commonly exist in real-world applications, and presents Loopy, the implementations of these loops optimized for different architectures. Using Loopy, the paper also evaluates how well different hardware accelerates the loop patterns. The results reveal the architectural differences among accelerators with regard to different loop patterns. They also provide insights that help developers choose the right accelerators for their applications. The current version of Loopy supports both FPGAs and GPUs, which are the most versatile and widely available accelerators.
KW - FPGA
KW - GPU
KW - Heterogeneous computing
KW - Loop characterization
UR - http://www.scopus.com/inward/record.url?scp=85119319749&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85119319749&partnerID=8YFLogxK
U2 - 10.1109/CLOUD53861.2021.00059
DO - 10.1109/CLOUD53861.2021.00059
M3 - Conference contribution
AN - SCOPUS:85119319749
T3 - IEEE International Conference on Cloud Computing, CLOUD
SP - 445
EP - 455
BT - Proceedings - 2021 IEEE 14th International Conference on Cloud Computing, CLOUD 2021
A2 - Ardagna, Claudio Agostino
A2 - Chang, Carl K.
A2 - Damiani, Ernesto
A2 - Ranjan, Rajiv
A2 - Wang, Zhongjie
A2 - Ward, Robert
A2 - Zhang, Jia
A2 - Zhang, Wensheng
PB - IEEE Computer Society
T2 - 14th IEEE International Conference on Cloud Computing, CLOUD 2021
Y2 - 5 September 2021 through 11 September 2021
ER -