Android Malware Detection by XGBoost Algorithm
Keywords:
Malware detection, Artificial Intelligence, Machine Learning, Anti-malware, Android, XGBoost, Ensemble Classifier
Abstract
Smartphones are now ubiquitous in personal and corporate use and have effectively become the new personal computer thanks to their portability, ease of use, and functionality (video conferencing, Internet browsing, e-mail, continuous wireless and data connectivity, worldwide map and location services, and countless mobile applications, including banking apps). At the same time, users store a great deal of sensitive and private information on these devices every day. This information attracts malware writers, who develop malicious software to steal it from mobile devices. The open-source nature and widespread adoption of the Android operating system have made it the most targeted of the four popular mobile platforms. Many researchers have tried to identify malware using program signatures, with some success. However, signature-based detection cannot effectively identify new and unknown malware. In this article, we therefore propose a machine-learning model for Android malware detection based on the permission and intent properties of APKs. We evaluated the method on more than 25,000 Android samples comprising both malware and trusted (benign) applications. Experimental results show the effectiveness of the proposed method, which achieves 96.27% accuracy.
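To make the idea of permission and intent features concrete, the following is a minimal sketch of how such properties might be turned into a binary feature vector. It is not the authors' pipeline. It assumes the manifest has already been decoded to plain XML (manifests inside an APK are stored in a binary format), and the names `extract_features`, `to_vector`, `SAMPLE_MANIFEST`, and `VOCABULARY` are hypothetical, introduced only for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace Android uses for manifest attributes such as android:name.
ANDROID_NS = "{http://schemas.android.com/apk/res/android}"


def extract_features(manifest_xml):
    """Collect requested permissions and intent-filter actions
    from a decoded (plain-text) AndroidManifest.xml string."""
    root = ET.fromstring(manifest_xml)
    features = set()
    for node in root.iter("uses-permission"):
        name = node.get(ANDROID_NS + "name")
        if name:
            features.add("permission::" + name)
    for node in root.iter("action"):
        name = node.get(ANDROID_NS + "name")
        if name:
            features.add("intent::" + name)
    return features


def to_vector(features, vocabulary):
    """Encode a feature set as a 0/1 vector over a fixed vocabulary."""
    return [1 if f in features else 0 for f in vocabulary]


# Hypothetical manifest and vocabulary, for illustration only.
SAMPLE_MANIFEST = """<manifest
    xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.app">
  <uses-permission android:name="android.permission.SEND_SMS"/>
  <uses-permission android:name="android.permission.INTERNET"/>
  <application>
    <receiver android:name=".BootReceiver">
      <intent-filter>
        <action android:name="android.intent.action.BOOT_COMPLETED"/>
      </intent-filter>
    </receiver>
  </application>
</manifest>"""

VOCABULARY = [
    "permission::android.permission.INTERNET",
    "permission::android.permission.READ_CONTACTS",
    "permission::android.permission.SEND_SMS",
    "intent::android.intent.action.BOOT_COMPLETED",
]

vector = to_vector(extract_features(SAMPLE_MANIFEST), VOCABULARY)
print(vector)  # [1, 0, 1, 1]
```

Vectors of this kind, one per application and labeled as malware or trusted, are the sort of input a gradient-boosted tree classifier such as XGBoost could be trained on. The vocabulary used in the study itself is not given in the abstract.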
License
Copyright (c) 2025 Sana Nazarinezhad, Nafise Khosrojerdi, Ahmad Reza Shafieesabet (Authors)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.