Software Defect Prediction Using Clustering: A Comprehensive Literature Review
Keywords:
Software Defect Prediction, Clustering, Software Engineering
Abstract
Anticipating software defects before the testing phase helps organizations allocate resources efficiently and deliver high-quality software. Machine learning (ML) methods play a pivotal role here and have led to numerous predictive models that classify software modules as defective or non-defective. Several obstacles hinder the analysis of software defect data, including redundancy, correlated features, irrelevant features, missing values, and an imbalanced distribution between faulty and non-faulty classes. Both supervised and unsupervised ML techniques have attracted attention from practitioners and researchers as viable ways to tackle these challenges, and they have yielded noticeable improvements in defect prediction accuracy. This review examines clustering, an unsupervised ML technique, as applied to software defect prediction in 15 studies published between 2017 and 2023.