ABSTRACT
Accurate estimation of software development costs is paramount to software project success. Inaccuracies in software cost estimation models have drawn global attention and prompted numerous approaches. Although machine learning techniques have improved, the challenge of varying bias and variance in weak learners persists, which has led to the adoption of ensemble learning techniques for increased accuracy. This research extends previous work on ensemble learning Software Cost Estimation (SCE) models by developing a Stacking Ensemble Learning Model for SCE (SELM-SCE). The model aims to improve the accuracy of software cost estimation by combining the unsupervised K-Means method with the supervised K-Nearest Neighbor and LASSO machine learning methods, applied to software projects from two datasets obtained from the Promise Software Engineering Repository. The proposed SELM-SCE model was trained in two configurations involving K-Means and the SMOTE technique. Each training run examined and reported performance with respect to several essential questions related to the machine learning method. The performance of the proposed SELM-SCE model was first evaluated using three regression performance metrics: the coefficient of determination (R²), root mean squared error (RMSE), and mean absolute error (MAE). The regression experiments show that the SMOTE SELM-SCE without unsupervised machine learning improves the accuracy of the model, with the following respective R², RMSE, and MAE values: 93.15, 1687.0780, and 1188.0015 for the Desharnais dataset, and 94.67, 4532.7856, and 255.9115 for the Maxwell dataset. The performance of SELM-SCE was further evaluated using four classification performance metrics to compare it with the existing boosting ensemble model of Hidmi and Sakar (2017).
Experimental analysis of the results shows that the SELM-SCE models achieved prediction accuracies of 97.14% and 96.96%, respectively, higher than the 74.28% and 81.81% reported for the existing work. The other metrics obtained for the proposed model were 96.29% and 92.85% for precision, 100% and 100% for recall, and 96.29% and 96.29% for F1-score.
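As an illustration of the three regression metrics named above, the following minimal sketch computes R², RMSE, and MAE from paired actual and predicted values using their standard textbook definitions. It is not the authors' evaluation code. The function name `regression_metrics` and the sample values are hypothetical and are not drawn from the Desharnais or Maxwell datasets. The function returns R² as a fraction, whereas the abstract appears to report it as a percentage.

```python
import math


def regression_metrics(actual, predicted):
    """Return (R^2, RMSE, MAE) for paired sequences of actual and predicted values.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(actual)
    mean_actual = sum(actual) / n

    # Residual and total sums of squares.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are identical")

    r2 = 1.0 - ss_res / ss_tot          # coefficient of determination
    rmse = math.sqrt(ss_res / n)        # root mean squared error
    mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n  # mean absolute error
    return r2, rmse, mae


# Made-up effort values, for illustration only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
r2, rmse, mae = regression_metrics(actual, predicted)
print(f"R^2 = {r2:.4f}, RMSE = {rmse:.4f}, MAE = {mae:.4f}")
```

RMSE penalizes large errors more heavily than MAE, so RMSE is always at least as large as MAE on the same predictions.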