TY - GEN
T1 - Genetic Improvement of GPU Code
AU - Liou, Jhe Yu
AU - Forrest, Stephanie
AU - Wu, Carole Jean
N1 - Funding Information:
part by the National Science Foundation under CCF-1618039 and SHF-1652132; and AFRL FA8750-17-S-7007.
Publisher Copyright:
© 2019 IEEE.
PY - 2019/5
Y1 - 2019/5
N2 - As the programming stack and tool support for GPUs have matured, GPUs have become accessible to programmers who often lack domain-specific knowledge of the underlying architecture and fail to fully leverage the GPU's computation power. This paper presents GEVO (Gpu EVOlution), a tool for automatically tuning the performance of GPU kernels in the LLVM representation to meet desired criteria. GEVO uses population-based search to find edits to programs compiled to LLVM-IR that improve performance on desired criteria and retain required functionality. GEVO extends earlier Genetic Improvement (GI) work by operating directly on the LLVM-IR without custom representations or other manual interventions. We demonstrate that GEVO improves runtime on an NVIDIA Tesla P100 for many programs in the Rodinia benchmark suite and for a supervised machine learning code, ThunderSVM. For the Rodinia benchmark, GEVO improves GPU kernel runtime performance by an average of 13.87% and as much as 43% over the fully compiler-optimized baseline. If the kernel output accuracy is relaxed to tolerate 1% error, GEVO can find kernel variants that outperform the baseline version by an average of 15.47%. For ThunderSVM, GEVO reduces entire model training time by 50% and 24.8% for the MNIST handwriting recognition dataset and the a9a income prediction dataset, respectively, while the accuracy of the trained models improves by 0.17% and 0.04%, respectively.
AB - As the programming stack and tool support for GPUs have matured, GPUs have become accessible to programmers who often lack domain-specific knowledge of the underlying architecture and fail to fully leverage the GPU's computation power. This paper presents GEVO (Gpu EVOlution), a tool for automatically tuning the performance of GPU kernels in the LLVM representation to meet desired criteria. GEVO uses population-based search to find edits to programs compiled to LLVM-IR that improve performance on desired criteria and retain required functionality. GEVO extends earlier Genetic Improvement (GI) work by operating directly on the LLVM-IR without custom representations or other manual interventions. We demonstrate that GEVO improves runtime on an NVIDIA Tesla P100 for many programs in the Rodinia benchmark suite and for a supervised machine learning code, ThunderSVM. For the Rodinia benchmark, GEVO improves GPU kernel runtime performance by an average of 13.87% and as much as 43% over the fully compiler-optimized baseline. If the kernel output accuracy is relaxed to tolerate 1% error, GEVO can find kernel variants that outperform the baseline version by an average of 15.47%. For ThunderSVM, GEVO reduces entire model training time by 50% and 24.8% for the MNIST handwriting recognition dataset and the a9a income prediction dataset, respectively, while the accuracy of the trained models improves by 0.17% and 0.04%, respectively.
KW - GPU code optimization
KW - Genetic Improvement
KW - LLVM Intermediate Representation
KW - Multi-objective Evolutionary Computation
UR - http://www.scopus.com/inward/record.url?scp=85072964860&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85072964860&partnerID=8YFLogxK
U2 - 10.1109/GI.2019.00014
DO - 10.1109/GI.2019.00014
M3 - Conference contribution
AN - SCOPUS:85072964860
T3 - Proceedings - 2019 IEEE/ACM 6th International Workshop on Genetic Improvement, GI 2019
SP - 20
EP - 27
BT - Proceedings - 2019 IEEE/ACM 6th International Workshop on Genetic Improvement, GI 2019
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 6th IEEE/ACM International Workshop on Genetic Improvement, GI 2019
Y2 - 28 May 2019
ER -