Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization

Yanhao Wang; Xiangkun Jia; Yuwei Liu; Kyle Zeng; Tiffany Bao; Dinghao Wu; Purui Su

doi:10.14722/ndss.2020.24422

Engineering, Ira A. Fulton Schools of (IAFSE)

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

70 Scopus citations

Abstract

Coverage-based fuzzing has been actively studied and widely adopted for finding vulnerabilities in real-world software applications. With coverage information, such as statement coverage and transition coverage, guiding input mutation, coverage-based fuzzing can generate inputs that cover more code and thus find more vulnerabilities without prerequisite information such as input format. Current coverage-based fuzzing tools treat covered code equally. All inputs that contribute to new statements or transitions are kept for future mutation, no matter what the statements or transitions are and how much they impact security. Although this design is reasonable from the perspective of software testing that aims at full code coverage, it is inefficient for vulnerability discovery because 1) current techniques are still inadequate to reach full coverage within a reasonable amount of time, and 2) we always want to discover vulnerabilities early so that they can be fixed promptly. Even worse, due to the non-discriminative treatment of code coverage, current fuzzing tools suffer from recent anti-fuzzing techniques and become much less effective at finding vulnerabilities in programs protected by anti-fuzzing schemes. To address the limitation caused by equal coverage, we propose coverage accounting, a novel approach that evaluates coverage by security impact. Coverage accounting assesses edges with three metrics at three different levels: function, loop, and basic block. Based on the proposed metrics, we design a new scheme to prioritize fuzzing inputs and develop TortoiseFuzz, a greybox fuzzer for finding memory corruption vulnerabilities. We evaluated TortoiseFuzz on 30 real-world applications and compared it with 6 state-of-the-art greybox and hybrid fuzzers: AFL, AFLFast, FairFuzz, MOPT, QSYM, and Angora. Statistically, TortoiseFuzz found more vulnerabilities than 5 out of 6 fuzzers (AFL, AFLFast, FairFuzz, MOPT, and Angora), and it had a comparable result to QSYM while consuming only around 2% of QSYM's memory on average. We also compared the coverage accounting metrics with two other metrics, AFL-Sensitive and LEOPARD, and TortoiseFuzz performed significantly better than both in finding vulnerabilities. Furthermore, we applied the coverage accounting metrics to QSYM and observed that coverage accounting increases the number of discovered vulnerabilities by 28.6% on average. TortoiseFuzz found 20 zero-day vulnerabilities, 15 of which were confirmed with CVE identifiers.
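
The prioritization idea in the abstract can be illustrated with a small sketch. This is not TortoiseFuzz's implementation: the abstract does not say how each level is measured or how scores are combined, so the `EdgeScore` fields, the example criteria in the comments, the "best edge wins" rule in `input_score`, and the names `prioritize`, `edge_scores`, and `queue` are all assumptions made for illustration. It only shows how per-edge scores at the function, loop, and basic-block levels could reorder a fuzzing queue instead of treating all covered edges equally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EdgeScore:
    """Hypothetical per-edge security-impact scores, one per accounting level."""
    function: float = 0.0     # assumed example: edge leads to a memory-sensitive call
    loop: float = 0.0         # assumed example: edge is a loop back edge
    basic_block: float = 0.0  # assumed example: target block has many memory accesses


def input_score(covered_edges, edge_scores, level):
    """Score an input by the highest-scoring edge it covers (an assumed rule)."""
    return max(
        (getattr(edge_scores[e], level) for e in covered_edges if e in edge_scores),
        default=0.0,
    )


def prioritize(queue, edge_scores, level="function"):
    """Order (name, covered_edges) pairs by descending score; ties keep queue order."""
    return sorted(queue, key=lambda item: -input_score(item[1], edge_scores, level))


# Toy data: edges are (source block, destination block) pairs.
edge_scores = {
    ("A", "B"): EdgeScore(function=1.0),
    ("B", "C"): EdgeScore(loop=1.0),
    ("C", "D"): EdgeScore(basic_block=3.0),
}
queue = [
    ("seed0", {("A", "X")}),
    ("seed1", {("A", "B")}),
    ("seed2", {("B", "C"), ("C", "D")}),
]

for level in ("function", "loop", "basic_block"):
    print(level, [name for name, _ in prioritize(queue, edge_scores, level)])
```

With the function-level metric, `seed1` moves to the front because it is the only input covering an edge with a nonzero function score. With the loop or basic-block metric, `seed2` moves to the front. An input that covers only unscored edges, such as `seed0`, is never preferred, which is the contrast with equal treatment of coverage that the abstract draws.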

Original language	English (US)
Title of host publication	27th Annual Network and Distributed System Security Symposium, NDSS 2020
Publisher	The Internet Society
ISBN (Electronic)	1891562614, 9781891562617
DOIs	https://doi.org/10.14722/ndss.2020.24422
State	Published - 2020
Event	27th Annual Network and Distributed System Security Symposium, NDSS 2020 - San Diego, United States
Duration	Feb 23 2020 → Feb 26 2020

Publication series

Name	27th Annual Network and Distributed System Security Symposium, NDSS 2020

Conference

Conference	27th Annual Network and Distributed System Security Symposium, NDSS 2020
Country/Territory	United States
City	San Diego
Period	2/23/20 → 2/26/20

ASJC Scopus subject areas

Computer Networks and Communications
Control and Systems Engineering
Safety, Risk, Reliability and Quality

Access to Document

10.14722/ndss.2020.24422

Cite this

Wang, Y., Jia, X., Liu, Y., Zeng, K., Bao, T., Wu, D., & Su, P. (2020). Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization. In 27th Annual Network and Distributed System Security Symposium, NDSS 2020 (27th Annual Network and Distributed System Security Symposium, NDSS 2020). The Internet Society. https://doi.org/10.14722/ndss.2020.24422

Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization. / Wang, Yanhao; Jia, Xiangkun; Liu, Yuwei et al.
27th Annual Network and Distributed System Security Symposium, NDSS 2020. The Internet Society, 2020. (27th Annual Network and Distributed System Security Symposium, NDSS 2020).

Wang, Y, Jia, X, Liu, Y, Zeng, K, Bao, T, Wu, D & Su, P 2020, Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization. in 27th Annual Network and Distributed System Security Symposium, NDSS 2020. 27th Annual Network and Distributed System Security Symposium, NDSS 2020, The Internet Society, 27th Annual Network and Distributed System Security Symposium, NDSS 2020, San Diego, United States, 2/23/20. https://doi.org/10.14722/ndss.2020.24422

@inproceedings{ddcd5a0e64df4ac49200742b346d22e8,

title = "Not All Coverage Measurements Are Equal: Fuzzing by Coverage Accounting for Input Prioritization",

abstract = "Coverage-based fuzzing has been actively studied and widely adopted for finding vulnerabilities in real-world software applications. With coverage information, such as statement coverage and transition coverage, as the guidance of input mutation, coverage-based fuzzing can generate inputs that cover more code and thus find more vulnerabilities without prerequisite information such as input format. Current coverage-based fuzzing tools treat covered code equally. All inputs that contribute to new statements or transitions are kept for future mutation no matter what the statements or transitions are and how much they impact security. Although this design is reasonable from the perspective of software testing that aims at full code coverage, it is inefficient for vulnerability discovery since that 1) current techniques are still inadequate to reach full coverage within a reasonable amount of time, and that 2) we always want to discover vulnerabilities early so that it can be fixed promptly. Even worse, due to the non-discriminative code coverage treatment, current fuzzing tools suffer from recent anti-fuzzing techniques and become much less effective in finding vulnerabilities from programs enabled with anti-fuzzing schemes. To address the limitation caused by equal coverage, we propose coverage accounting, a novel approach that evaluates coverage by security impacts. Coverage accounting attributes edges by three metrics based on three different levels: function, loop and basic block. Based on the proposed metrics, we design a new scheme to prioritize fuzzing inputs and develop TortoiseFuzz, a greybox fuzzer for finding memory corruption vulnerabilities. We evaluated TortoiseFuzz on 30 real-world applications and compared it with 6 state-of-the-art greybox and hybrid fuzzers: AFL, AFLFast, FairFuzz, MOPT, QSYM, and Angora. 
Statistically, TortoiseFuzz found more vulnerabilities than 5 out of 6 fuzzers (AFL, AFLFast, FairFuzz, MOPT, and Angora), and it had a comparable result to QSYM yet only consumed around 2% of QSYM's memory usage on average. We also compared coverage accounting metrics with two other metrics, AFL-Sensitive and LEOPARD, and TortoiseFuzz performed significantly better than both metrics in finding vulnerabilities. Furthermore, we applied the coverage accounting metrics to QSYM and noticed that coverage accounting helps increase the number of discovered vulnerabilities by 28.6% on average. TortoiseFuzz found 20 zero-day vulnerabilities with 15 confirmed with CVE identifications.",

author = "Yanhao Wang and Xiangkun Jia and Yuwei Liu and Kyle Zeng and Tiffany Bao and Dinghao Wu and Purui Su",

note = "Publisher Copyright: {\textcopyright} 2020 27th Annual Network and Distributed System Security Symposium, NDSS 2020. All Rights Reserved.; 27th Annual Network and Distributed System Security Symposium, NDSS 2020 ; Conference date: 23-02-2020 Through 26-02-2020",

year = "2020",

doi = "10.14722/ndss.2020.24422",

language = "English (US)",

series = "27th Annual Network and Distributed System Security Symposium, NDSS 2020",

publisher = "The Internet Society",

booktitle = "27th Annual Network and Distributed System Security Symposium, NDSS 2020",

}

TY - GEN

T1 - Not All Coverage Measurements Are Equal

T2 - 27th Annual Network and Distributed System Security Symposium, NDSS 2020

AU - Wang, Yanhao

AU - Jia, Xiangkun

AU - Liu, Yuwei

AU - Zeng, Kyle

AU - Bao, Tiffany

AU - Wu, Dinghao

AU - Su, Purui

PY - 2020

Y1 - 2020

N2 - Coverage-based fuzzing has been actively studied and widely adopted for finding vulnerabilities in real-world software applications. With coverage information, such as statement coverage and transition coverage, as the guidance of input mutation, coverage-based fuzzing can generate inputs that cover more code and thus find more vulnerabilities without prerequisite information such as input format. Current coverage-based fuzzing tools treat covered code equally. All inputs that contribute to new statements or transitions are kept for future mutation no matter what the statements or transitions are and how much they impact security. Although this design is reasonable from the perspective of software testing that aims at full code coverage, it is inefficient for vulnerability discovery since that 1) current techniques are still inadequate to reach full coverage within a reasonable amount of time, and that 2) we always want to discover vulnerabilities early so that it can be fixed promptly. Even worse, due to the non-discriminative code coverage treatment, current fuzzing tools suffer from recent anti-fuzzing techniques and become much less effective in finding vulnerabilities from programs enabled with anti-fuzzing schemes. To address the limitation caused by equal coverage, we propose coverage accounting, a novel approach that evaluates coverage by security impacts. Coverage accounting attributes edges by three metrics based on three different levels: function, loop and basic block. Based on the proposed metrics, we design a new scheme to prioritize fuzzing inputs and develop TortoiseFuzz, a greybox fuzzer for finding memory corruption vulnerabilities. We evaluated TortoiseFuzz on 30 real-world applications and compared it with 6 state-of-the-art greybox and hybrid fuzzers: AFL, AFLFast, FairFuzz, MOPT, QSYM, and Angora. 
Statistically, TortoiseFuzz found more vulnerabilities than 5 out of 6 fuzzers (AFL, AFLFast, FairFuzz, MOPT, and Angora), and it had a comparable result to QSYM yet only consumed around 2% of QSYM's memory usage on average. We also compared coverage accounting metrics with two other metrics, AFL-Sensitive and LEOPARD, and TortoiseFuzz performed significantly better than both metrics in finding vulnerabilities. Furthermore, we applied the coverage accounting metrics to QSYM and noticed that coverage accounting helps increase the number of discovered vulnerabilities by 28.6% on average. TortoiseFuzz found 20 zero-day vulnerabilities with 15 confirmed with CVE identifications.

UR - http://www.scopus.com/inward/record.url?scp=85176103921&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85176103921&partnerID=8YFLogxK

U2 - 10.14722/ndss.2020.24422

DO - 10.14722/ndss.2020.24422

M3 - Conference contribution

AN - SCOPUS:85176103921

T3 - 27th Annual Network and Distributed System Security Symposium, NDSS 2020

BT - 27th Annual Network and Distributed System Security Symposium, NDSS 2020

PB - The Internet Society

Y2 - 23 February 2020 through 26 February 2020

ER -
