Embedded devices are becoming ubiquitous, and ARM is becoming the dominant architecture for them. Meanwhile, there is a pressing need to perform security assessments of these devices. Because these devices use many different types of peripherals, emulating their software, i.e., firmware, at scale is challenging. Therefore, static analysis is still widely used. Existing works usually leverage off-the-shelf tools to disassemble stripped ARM binaries and (implicitly) assume that reliably disassembling binaries is a solved problem. However, whether this assumption really holds is unknown. In this paper, we conduct the first comprehensive study of ARM disassembly tools. Specifically, we build 1,896 ARM binaries (including 248 obfuscated ones) with different compilers, compiler options, and obfuscation methods. We then use these binaries to evaluate eight state-of-the-art ARM disassembly tools (including both commercial and noncommercial ones) in three different versions on their ability to locate instruction boundaries, function boundaries, and function signatures. Instruction and function boundaries are two fundamental primitives that other primitives are built upon, while function signatures are important for control flow integrity (CFI) techniques. Our work reveals observations that have not previously been systematically summarized or confirmed. For instance, we find that the coexistence of the ARM and Thumb instruction sets, and the reuse of the BL instruction for both function calls and branches, pose serious challenges to disassembly tools. Our evaluation sheds light on the limitations of state-of-the-art disassembly tools and points out potential directions for improvement.
- reverse engineering