Role of Feature Selection in Cross Project Software Defect Prediction- A Review

Authors

  • Muhammad Salman Saeed, Virtual University of Pakistan

Keywords:

Software defect prediction, cross project defect prediction, feature selection, machine learning

Abstract

Software Defect Prediction (SDP) is crucial for enhancing software quality and minimizing issues after release. Machine learning approaches to SDP, particularly Cross-Project Defect Prediction (CPDP), have garnered significant attention because they can improve defect prediction in one project by leveraging data from another. A critical factor in CPDP effectiveness is feature selection, the process of identifying the most relevant features from an available set. This review examines the role of feature selection in CPDP. Existing feature selection methods, both traditional and state-of-the-art, are systematically analyzed and classified within the CPDP context. The review discusses the challenges and opportunities presented by diverse project characteristics, data heterogeneity, and the curse of dimensionality. It also shows how feature selection affects model performance, generalization, and adaptability across software projects. By synthesizing findings from multiple studies, the review identifies trends, best practices, and potential research directions in the field. Overall, it provides insights into the significance of feature selection for enhancing the reliability and efficiency of CPDP models.
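
To make the idea of feature selection in CPDP concrete, the following is a minimal sketch and not a method taken from any of the reviewed studies. It assumes a simple filter-style selector (ranking features by absolute Pearson correlation with the defect label on the source project) and a nearest-centroid classifier; the function names, the synthetic "projects", and the distribution shift between them are all hypothetical choices made for illustration.

```python
import random
from statistics import mean, pstdev


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    sx, sy = pstdev(xs), pstdev(ys)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(xs), mean(ys)
    cov = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / (sx * sy)


def select_top_k(rows, labels, k):
    """Filter-style selection: keep the k features most correlated with the label."""
    n_features = len(rows[0])
    scores = [abs(pearson([r[j] for r in rows], labels)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def project(rows, cols):
    """Keep only the selected feature columns."""
    return [[r[j] for j in cols] for r in rows]


def fit_centroids(rows, labels):
    """Nearest-centroid model: one mean vector per class."""
    centroids = {}
    for cls in set(labels):
        members = [r for r, y in zip(rows, labels) if y == cls]
        centroids[cls] = [mean(col) for col in zip(*members)]
    return centroids


def predict(centroids, row):
    """Assign the class whose centroid is closest (squared Euclidean distance)."""
    def dist(c):
        return sum((a - b) ** 2 for a, b in zip(row, c))
    return min(centroids, key=lambda cls: dist(centroids[cls]))


def make_project(n, shift, rng):
    """Synthetic stand-in for a project's metric table (hypothetical data).

    Features 0 and 1 carry defect signal; features 2-4 are pure noise.
    `shift` mimics a distribution difference between projects.
    """
    rows, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        signal = [rng.gauss(3.0 * y + shift, 1.0), rng.gauss(2.0 * y + shift, 1.0)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(3)]
        rows.append(signal + noise)
        labels.append(y)
    return rows, labels


rng = random.Random(0)
src_X, src_y = make_project(200, 0.0, rng)   # labeled source project
tgt_X, tgt_y = make_project(200, 0.3, rng)   # target project, slightly shifted

# Select features on the source project only, then reuse them on the target.
selected = select_top_k(src_X, src_y, k=2)
model = fit_centroids(project(src_X, selected), src_y)
preds = [predict(model, r) for r in project(tgt_X, selected)]
accuracy = mean(int(p == y) for p, y in zip(preds, tgt_y))

print("selected feature indices:", selected)
print("target-project accuracy:", round(accuracy, 3))
```

The point of the sketch is the workflow: features are chosen using source-project labels alone, and the same column subset is then applied to the target project, whose labels are used here only to score the predictions. Real CPDP studies typically use richer selectors and classifiers and must also handle class imbalance and heterogeneous metric sets, which this toy example ignores.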

Author Biography

Muhammad Salman Saeed, Department of Computer Science, Virtual University of Pakistan

Published

2023-12-22

How to Cite

Muhammad Salman Saeed. (2023). Role of Feature Selection in Cross Project Software Defect Prediction- A Review. International Journal of Computations, Information and Manufacturing (IJCIM), 3(2), 37–56. Retrieved from https://journals.gaftim.com/index.php/ijcim/article/view/277