TY - GEN
T1 - Malcertain
T2 - 44th ACM/IEEE International Conference on Software Engineering, ICSE 2024
AU - Li, Haodong
AU - Xu, Guosheng
AU - Wang, Liu
AU - Xiao, Xusheng
AU - Luo, Xiapu
AU - Xu, Guoai
AU - Wang, Haoyu
N1 - Publisher Copyright:
© 2024 ACM.
PY - 2024
Y1 - 2024
N2 - The long-lasting Android malware threat has attracted significant research efforts in malware detection. In particular, by modeling malware detection as a classification problem, machine learning based approaches, especially deep neural network (DNN) based approaches, are increasingly being used for Android malware detection and have achieved significant improvements over other detection approaches such as signature-based approaches. However, as Android malware evolve rapidly and the presence of adversarial samples, DNN models trained on early constructed samples often yield poor decisions when used to detect newly emerging samples. Fundamentally, this phenomenon can be summarized as the uncertainly in the data (noise or randomness) and the weakness in the training process (insufficient training data). Overlooking these uncertainties poses risks in the model predictions. In this paper, we take the first step to estimate the prediction uncertainty of DNN models in malware detection and leverage these estimates to enhance Android malware detection techniques. Specifically, be-sides training a DNN model to predict malware, we employ several uncertainty estimation methods to train a Correction Model that de-termines whether a sample is correctly or incorrectly predicted by the DNN model. We then leverage the estimated uncertainty output by the Correction Model to correct the prediction results, improving the accuracy of the DNN model. Experimental results show that our proposed Malcertain effectively improves the accuracy of the underlying DNN models for Android malware detection by around 21% and significantly improves the detection effectiveness of adversarial Android malware samples by up to 94.38%. Our research sheds light on the promising direction that leverages prediction uncertainty to improve prediction-based software engineering tasks.
AB - The long-lasting Android malware threat has attracted significant research efforts in malware detection. In particular, by modeling malware detection as a classification problem, machine learning based approaches, especially deep neural network (DNN) based approaches, are increasingly being used for Android malware detection and have achieved significant improvements over other detection approaches such as signature-based approaches. However, as Android malware evolve rapidly and the presence of adversarial samples, DNN models trained on early constructed samples often yield poor decisions when used to detect newly emerging samples. Fundamentally, this phenomenon can be summarized as the uncertainly in the data (noise or randomness) and the weakness in the training process (insufficient training data). Overlooking these uncertainties poses risks in the model predictions. In this paper, we take the first step to estimate the prediction uncertainty of DNN models in malware detection and leverage these estimates to enhance Android malware detection techniques. Specifically, be-sides training a DNN model to predict malware, we employ several uncertainty estimation methods to train a Correction Model that de-termines whether a sample is correctly or incorrectly predicted by the DNN model. We then leverage the estimated uncertainty output by the Correction Model to correct the prediction results, improving the accuracy of the DNN model. Experimental results show that our proposed Malcertain effectively improves the accuracy of the underlying DNN models for Android malware detection by around 21% and significantly improves the detection effectiveness of adversarial Android malware samples by up to 94.38%. Our research sheds light on the promising direction that leverages prediction uncertainty to improve prediction-based software engineering tasks.
KW - Android Malware Detection
KW - DNN
KW - Uncertainty
UR - http://www.scopus.com/inward/record.url?scp=85182833765&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85182833765&partnerID=8YFLogxK
U2 - 10.1145/3597503.3639122
DO - 10.1145/3597503.3639122
M3 - Conference contribution
AN - SCOPUS:85182833765
T3 - Proceedings - International Conference on Software Engineering
SP - 1850
EP - 1862
BT - Proceedings - 2024 ACM/IEEE 44th International Conference on Software Engineering, ICSE 2024
PB - IEEE Computer Society
Y2 - 14 April 2024 through 20 April 2024
ER -