TY - CONF
T1 - REPT: Reverse debugging of failures in deployed software
T2 - 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
AU - Cui, Weidong
AU - Ge, Xinyang
AU - Kasikci, Baris
AU - Niu, Ben
AU - Sharma, Upamanyu
AU - Wang, Ruoyu
AU - Yun, Insu
N1 - Funding Information:
We thank our shepherd, Xi Wang, and other reviewers for their insightful feedback. We are very grateful for all the help from our colleagues on the Microsoft Windows team. In particular, Alan Auerbach, Peter Gilson, Khom Kaowthumrong, Graham McIntyre, Timothy Misiak, Jordi Mola, Prashant Ratanchandani, and Pedro Teixeira provided tremendous help and valuable perspectives throughout the project. We also thank Bee-man Strong from Intel for answering numerous questions about Intel Processor Trace.
Publisher Copyright:
© Proceedings of NSDI 2010: 7th USENIX Symposium on Networked Systems Design and Implementation. All rights reserved.
PY - 2018
Y1 - 2018
AB - Debugging software failures in deployed systems is important because they impact real users and customers. However, debugging such failures is notoriously hard in practice because developers have to rely on limited information such as memory dumps. The execution history is usually unavailable because high-fidelity program tracing is not affordable in deployed systems. In this paper, we present REPT, a practical system that enables reverse debugging of software failures in deployed systems. REPT reconstructs the execution history with high fidelity by combining online lightweight hardware tracing of a program's control flow with offline binary analysis that recovers its data flow. It is seemingly impossible to recover data values thousands of instructions before the failure due to information loss and concurrent execution. REPT tackles these challenges by constructing a partial execution order based on timestamps logged by hardware and iteratively performing forward and backward execution with error correction. We design and implement REPT, deploy it on Microsoft Windows, and integrate it into WinDbg. We evaluate REPT on 16 real-world bugs and show that it can recover data values accurately (92% on average) and efficiently (in less than 20 seconds) for these bugs. We also show that it enables effective reverse debugging for 14 bugs.
UR - http://www.scopus.com/inward/record.url?scp=85066881748&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85066881748&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:85066881748
T3 - Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
SP - 17
EP - 32
BT - Proceedings of the 13th USENIX Symposium on Operating Systems Design and Implementation, OSDI 2018
PB - USENIX Association
Y2 - 8 October 2018 through 10 October 2018
ER -