TY - GEN
T1 - Ahoy SAILR! There is No Need to DREAM of C
T2 - 33rd USENIX Security Symposium, USENIX Security 2024
AU - Basque, Zion Leonahenahe
AU - Bajaj, Ati Priya
AU - Gibbs, Wil
AU - O'Kain, Jude
AU - Miao, Derron
AU - Bao, Tiffany
AU - Doupé, Adam
AU - Shoshitaishvili, Yan
AU - Wang, Ruoyu
N1 - Publisher Copyright:
© USENIX Security Symposium 2024.All rights reserved.
PY - 2024
Y1 - 2024
N2 - Contrary to prevailing wisdom, we argue that the measure of binary decompiler success is not to eliminate all gotos or reduce the complexity of the decompiled code but to get as close as possible to the original source code. Many gotos exist in the original source code (the Linux kernel version 6.1 contains 3, 754) and, therefore, should be preserved during decompilation, and only spurious gotos should be removed. Fundamentally, decompilers insert spurious gotos in decompilation because structuring algorithms fail to recover C-style structures from binary code. Through a quantitative study, we find that the root cause of spurious gotos is compiler-induced optimizations that occur at all optimization levels (17% in non-optimized compilation). Therefore, we believe that to achieve high-quality decompilation, decompilers must be compiler-aware to mirror (and remove) the goto-inducing optimizations. In this paper, we present a novel structuring algorithm called SAILR that mirrors the compilation pipeline of GCC and precisely inverts goto-inducing transformations. We build an open-source decompiler on angr (the ANGR DECOMPILER) and implement SAILR as well as otherwise-unavailable prior work (Phoenix, DREAM, and rev.ng's Combing) and evaluate them, using a new metric of how close the decompiled code structure is to the original source code, showing that SAILR markedly improves on prior work. In addition, we find that SAILR performs well on binaries compiled with non-GCC compilers, which suggests that compilers similarly implement goto-inducing transformations.
AB - Contrary to prevailing wisdom, we argue that the measure of binary decompiler success is not to eliminate all gotos or reduce the complexity of the decompiled code but to get as close as possible to the original source code. Many gotos exist in the original source code (the Linux kernel version 6.1 contains 3, 754) and, therefore, should be preserved during decompilation, and only spurious gotos should be removed. Fundamentally, decompilers insert spurious gotos in decompilation because structuring algorithms fail to recover C-style structures from binary code. Through a quantitative study, we find that the root cause of spurious gotos is compiler-induced optimizations that occur at all optimization levels (17% in non-optimized compilation). Therefore, we believe that to achieve high-quality decompilation, decompilers must be compiler-aware to mirror (and remove) the goto-inducing optimizations. In this paper, we present a novel structuring algorithm called SAILR that mirrors the compilation pipeline of GCC and precisely inverts goto-inducing transformations. We build an open-source decompiler on angr (the ANGR DECOMPILER) and implement SAILR as well as otherwise-unavailable prior work (Phoenix, DREAM, and rev.ng's Combing) and evaluate them, using a new metric of how close the decompiled code structure is to the original source code, showing that SAILR markedly improves on prior work. In addition, we find that SAILR performs well on binaries compiled with non-GCC compilers, which suggests that compilers similarly implement goto-inducing transformations.
UR - http://www.scopus.com/inward/record.url?scp=85204057069&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85204057069&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85204057069
T3 - Proceedings of the 33rd USENIX Security Symposium
SP - 361
EP - 378
BT - Proceedings of the 33rd USENIX Security Symposium
PB - USENIX Association
Y2 - 14 August 2024 through 16 August 2024
ER -