After the parameter tuning process, each dataset must be split into a training set and a test set. For every dataset, a fitting strategy was applied to determine a suitable split ratio, and approximately 80% of the data was chosen for the training set and the remainder for the test set.
Table 4 shows information about the number of effort values (columns 2 and 7), the minimum effort value (columns 3 and 8), the maximum effort value (columns 4 and 9), the average effort value (columns 5 and 10) and the standard deviation (columns 6 and 11) used both in the training stage of the intelligent methods and in their testing stage.
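As a minimal illustration of the splitting step described above, the sketch below performs an approximately 80%/20% train/test split with scikit-learn; the placeholder arrays X and y are assumptions standing in for the six input features and the real effort values of one dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical placeholder data: 60 projects, six input features and one effort value each.
X = np.random.rand(60, 6)
y = np.random.rand(60) * 1000

# Approximately 80% of the records are kept for training and the remainder for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(len(X_train), len(X_test))  # e.g. 48 training records and 12 test records
```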
4.1. K-Nearest Neighbours
The K-nearest neighbours method, having the six inputs presented in
Table 3 and an output (the estimated effort), was set with the following features: the Minkowski metric for distance computation, a leaf size equal to 30, and uniform weights. To implement this method, the KNeighborsRegressor function from the SKlearn 1.0.1 library was used. For the K-nearest neighbours method, the following two parameters were tuned in the parameter tuning process to determine its performance:
k—represents the number of neighbours considered when obtaining a prediction. In this research study, eight values were used for this parameter.
p—represents the power parameter of the Minkowski distance used for distance computation.
Within the KNN method, the Minkowski distance is used to determine which neighbours will be analysed, by comparing their characteristics with those of the new instance for which a prediction is made. Thus, the distance metric is used to calculate which neighbours have the most similar features and to select the first k of them to obtain the new prediction. For this second parameter, three values between 1 and 3 were used by the KNN method to predict the effort. If the parameter p is equal to 1, the Minkowski distance reduces to the Manhattan distance [
38], given by the following formula:
$$ d(x, y) = \sum_{i=1}^{n} \left| x_i - y_i \right| $$
When the parameter p is equal to 2, the Minkowski distance becomes the Euclidean distance [
39], represented by the following formula:
$$ d(x, y) = \sqrt{\sum_{i=1}^{n} \left( x_i - y_i \right)^{2}} $$
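Under the settings listed above, the KNN regressor could be configured as in the following sketch; the concrete values chosen for k and p are illustrative assumptions, while the remaining arguments follow the description in the text.

```python
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical values for the two tuned parameters.
k = 5   # number of neighbours
p = 2   # p = 1 -> Manhattan distance, p = 2 -> Euclidean distance

knn = KNeighborsRegressor(
    n_neighbors=k,
    weights="uniform",    # uniform weights
    leaf_size=30,         # leaf size equal to 30
    metric="minkowski",   # Minkowski metric for distance computation
    p=p,
)
```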
Based on the values used in the parameter tuning process (eight values for parameter k and three values for parameter p), 24 variants of the KNN method were trained and tested. In
Table 5, columns 3 to 5 show the values obtained for the four metrics used by these 24 variants of the KNN method: the third column shows the minimum value, the fourth column the maximum value, and the fifth column the average value of each metric. Columns 6 and 7 present the values of the parameters for which the minimum values of the four metrics were obtained.
The last three columns show information about the effort estimated by the KNN model for which the optimum values of the metrics were obtained. For all five datasets, by comparing the real effort values from
Table 4 with the estimated values from
Table 5, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.
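The way in which the 24 KNN variants could be trained and evaluated is sketched below; the placeholder data, the assumed grid of k values, and the use of r2_score for the CD metric are illustrative assumptions, not details taken from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import (mean_absolute_error, median_absolute_error,
                             mean_squared_error, r2_score)

# Hypothetical placeholder data standing in for one of the five datasets.
X = np.random.rand(60, 6)
y = np.random.rand(60) * 1000
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

results = []
for k in range(1, 9):       # assumed grid of eight k values
    for p in (1, 2, 3):     # the three values of the Minkowski parameter p
        model = KNeighborsRegressor(n_neighbors=k, weights="uniform",
                                    leaf_size=30, metric="minkowski", p=p)
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        results.append({
            "k": k, "p": p,
            "MAE": mean_absolute_error(y_test, pred),
            "MdAE": median_absolute_error(y_test, pred),
            "RMSE": np.sqrt(mean_squared_error(y_test, pred)),
            "CD": r2_score(y_test, pred),  # coefficient of determination
        })

# Variant with the lowest mean absolute error among the 24 trained models.
best = min(results, key=lambda r: r["MAE"])
print(best)
```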
4.3. Random Forest
Random forest [
41] is an estimator that fits a number of decision trees on various sub-samples of the dataset and uses averaging to improve the predictive accuracy. For the implementation of the RF method, the RandomForestRegressor function from the SKlearn 1.0.1 library was used. The RF method, having the six inputs presented in
Table 3 and an output (the estimated effort), was set with the following features: the squared error as the function to measure the quality of a split, and a value of 2 both for the minimum number of samples required to split an internal node and for the minimum number of samples required to be at a leaf node.
To determine the performance of the random forest method applied to the five datasets, in the parameter tuning process, the following two parameters were tuned:
d—represents the maximum depth of the tree. In this research study, six values were used for this parameter varying between 5 and 10.
t—represents the number of trees in the forest. In this research study, six values were used for this parameter belonging to the following set: {50, 100, 150, 200, 250, 300}.
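Under the fixed settings and the two tuned parameters listed above, one RF variant could be built as in the following sketch; the concrete values of d and t are illustrative assumptions.

```python
from sklearn.ensemble import RandomForestRegressor

# Hypothetical values for the two tuned parameters.
d = 7    # maximum depth of the trees
t = 100  # number of trees in the forest

rf = RandomForestRegressor(
    n_estimators=t,
    max_depth=d,
    criterion="squared_error",   # squared error to measure the quality of a split
    min_samples_split=2,         # minimum samples required to split an internal node
    min_samples_leaf=2,          # minimum samples required to be at a leaf node
    random_state=42,
)
```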
Based on the values used in the parameter tuning process (six values for parameter d and six values for parameter t), 36 variants of the RF method were trained and tested.
Table 7 shows the values provided by the 36 variants of the RF method, the meanings of the columns from
Table 7 being the same as those from
Table 5.
For all five datasets, by comparing the real effort values from
Table 4 with the estimated values from
Table 7, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.
4.4. Gradient Boosted Tree
The gradient boosted tree [
42] estimator builds an additive model in a forward stage-wise manner, allowing the optimization of arbitrary differentiable loss functions. In each stage, a regression tree is fitted on the negative gradient of the given loss function. For the implementation of the GBT method, the GradientBoostingRegressor function from the SKlearn 1.0.1 library was used. The GBT method, characterized by the six inputs presented in
Table 3 and an output (the estimated effort), was designed with the following features: the squared error as the loss function to be optimized, the friedman_mse function to measure the quality of a split, a value of 3 for the minimum number of samples required to split an internal node, and a value of 200 for the number of boosting stages to perform.
To determine the performance of the gradient boosted tree method applied to the five datasets, in the parameter tuning process, the following two parameters were tuned:
d—represents the maximum depth of the individual regression estimators. In this research study, five values were used for this parameter varying between 1 and 5.
l—represents the learning rate which shrinks the contribution of each tree. In this research study, five values were used for this parameter belonging to the following set: {0.05, 0.1, 0.15, 0.2, 0.25}.
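Similarly, one GBT variant could be built as in the sketch below; the concrete values of d and l are illustrative assumptions, while the remaining arguments follow the description in the text.

```python
from sklearn.ensemble import GradientBoostingRegressor

# Hypothetical values for the two tuned parameters.
d = 3    # maximum depth of the individual regression estimators
l = 0.1  # learning rate

gbt = GradientBoostingRegressor(
    loss="squared_error",       # loss function to be optimized
    criterion="friedman_mse",   # function to measure the quality of a split
    min_samples_split=3,        # minimum samples required to split an internal node
    n_estimators=200,           # number of boosting stages to perform
    max_depth=d,
    learning_rate=l,
    random_state=42,
)
```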
Based on the values used in the parameter tuning process (five values for parameter d and five values for parameter l), 25 variants of the GBT method were trained and tested.
Table 8 shows the values provided by the 25 variants of the GBT method, the meanings of the columns from
Table 8 being the same as those from
Table 5. For all five datasets, by comparing the real effort values from
Table 4 with the estimated values from
Table 8, it is observed that all estimated intervals are included in the real intervals.
4.6. Long Short-Term Memory
Long short-term memory [44], which belongs to the category of recurrent neural networks, adds a gate structure to their architecture. Compared to traditional neural networks, which are characterized by a single input, an LSTM unit has two input sets: the current information and the output vector provided by the previous unit, while the complex processes associated with the LSTM cell are performed by the unit state. The essence of LSTM lies in the hidden layer, which, instead of simple nodes, contains memory blocks called memory cells; these contain components that make them smarter than nodes, namely three separate gates that regulate the flow and modification of information. The LSTM unit state consists of a forget gate, an input gate, and an output gate. The purpose of the forget gate is to decide which information is retained and which is discarded. The input gate determines which information is retained internally and ensures that critical information can be saved. The output gate's role is to determine the output value and to control which part of the current LSTM state is passed to the activation function.
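One standard formulation of the gate computations summarized above (given here for reference, not taken from the paper) is, for the current input x_t, the previous output h_{t-1}, and the previous cell state c_{t-1}:
$$
\begin{aligned}
f_t &= \sigma\left(W_f x_t + U_f h_{t-1} + b_f\right) &&\text{(forget gate)}\\
i_t &= \sigma\left(W_i x_t + U_i h_{t-1} + b_i\right) &&\text{(input gate)}\\
o_t &= \sigma\left(W_o x_t + U_o h_{t-1} + b_o\right) &&\text{(output gate)}\\
\tilde{c}_t &= \tanh\left(W_c x_t + U_c h_{t-1} + b_c\right) &&\text{(candidate state)}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t &&\text{(cell state)}\\
h_t &= o_t \odot \tanh\left(c_t\right) &&\text{(output)}
\end{aligned}
$$
where W and U are weight matrices, b are bias vectors, σ is the sigmoid function, and ⊙ denotes element-wise multiplication.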
In order to implement the LSTM method, an instance of the LSTM class defined in keras.layers was used. The LSTM method, characterized by the six inputs presented in
Table 3 and an output (the estimated effort), was designed with the following features: the hyperbolic tangent as the activation function, the sigmoid as the recurrent activation function, and a value of 0.5 for the dropout probability. To determine the performance of the LSTM method, trained with the Adam optimizer and applied to the five datasets, the following two parameters were tuned in the parameter tuning process:
e—represents the number of training epochs. The values of this parameter are represented by the elements of the set {100, 200, 300, 400, 500, 600, 700, 800, 900, 1000}.
n—represents the number of neurons in the LSTM hidden layer. The values of this parameter belong to the set {25, 50, 75, 100}.
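A minimal Keras sketch of such an LSTM model is shown below; the placeholder data, the treatment of the six inputs as a single time step, and the concrete values of e and n are assumptions for illustration only.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Hypothetical values for the two tuned parameters.
e = 500  # number of training epochs
n = 50   # number of neurons in the LSTM hidden layer

# Placeholder data: 40 projects, each with the six inputs treated as one time step.
X_train = np.random.rand(40, 1, 6)
y_train = np.random.rand(40) * 1000

model = Sequential([
    LSTM(n, activation="tanh",              # hyperbolic tangent activation function
         recurrent_activation="sigmoid",    # sigmoid recurrent activation function
         dropout=0.5,                       # dropout probability of 0.5
         input_shape=(1, 6)),
    Dense(1),                               # single output: the estimated effort
])
model.compile(optimizer="adam", loss="mse")  # Adam optimizer
model.fit(X_train, y_train, epochs=e, verbose=0)
```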
Based on the values used in the parameter tuning process (10 values for parameter e and four values for parameter n), 40 variants of the LSTM method were trained and tested.
Table 10 shows the values provided by the 40 variants of the LSTM method, the meanings of the columns from
Table 10 being the same as those from
Table 5. For all five datasets, by comparing the real effort values from
Table 4 with the estimated values from
Table 10, it is observed that all estimated intervals are included in the intervals corresponding to the real values of the effort.
4.7. Comparative Analysis with Previous Works
In the case of the Albrecht dataset, the minimum value obtained for the MAE metric is 4.870, the minimum value obtained for the MdAE metric is 2.911, and the minimum value obtained for the RMSE metric is 6.560, all these three values being obtained using the LSTM method.
Thus, this method provides the most efficient estimate for the Albrecht dataset among the six analysed methods. Comparing the results obtained for the Albrecht dataset with the value of 7.742 (
Table 11) for the MAE metric presented in the paper [
6], it is observed that the optimal variants of the DT, RF, GBT, MLP, and LSTM methods provide lower values, and therefore a better estimate of the effort. Only in the case of the KNN method was a value greater than 7.742 obtained.
For the Kemerer dataset, the minimum value obtained for the MAE metric is 42.094, the minimum value obtained for the MdAE metric is 26.206, the minimum value obtained for the RMSE metric is 57.560, and the maximum value for the CD metric is 0.493, all these values also being obtained using the LSTM method. Therefore, the LSTM method provides the most efficient estimate for the Kemerer dataset among the six analysed methods. Comparing the results obtained for the Kemerer dataset with the value 138.911 (
Table 11) for the MAE metric presented in paper [
6], it is observed that the optimal variants of all six analysed methods provide lower values for the MAE metric, and thus a better estimation of the effort.
In the case of the Cocomo81 dataset, the minimum value obtained for the MAE metric is 178.051, the minimum value obtained for the MdAE metric is 30.642, the minimum value obtained for the RMSE metric is 245.153, and the maximum value for the CD metric is 0.897, all these values being obtained using the LSTM method. Thus, this method provides the most efficient estimate for the Cocomo81 dataset among the six analysed methods. Comparing the results obtained for the Cocomo81 dataset with the value 928.3318 (
Table 11) for the MAE metric and with the value 2278.87 for the RMSE metric presented in the paper [
9], it is observed that the optimal variants of all six analysed methods provide lower values, and thus a better estimate of the effort. Comparing the results obtained for the Cocomo81 dataset with the value 255.2615 (
Table 11) for the MAE metric presented in the paper [
5], it is observed that the optimal variants of the KNN, RF, GBT, MLP, and LSTM methods provide lower values, and thus a better estimate of the effort. Only in the case of the DT method was a value greater than 255.2615 obtained. Comparing the results obtained for the Cocomo81 dataset with the value of 533.4206 for the RMSE metric presented in the paper [
5], it is observed that the optimal variants of all six analysed methods provide lower values, and thus a better estimation of the effort. Comparing the minimum value 178.051 obtained for the MAE metric through the LSTM method with the value 153 (
Table 11) presented in the paper [
7], the minimum value 245.153 obtained for the RMSE metric through the LSTM method with the value 228.7 presented in the paper [
7], and the maximum value 0.897 obtained for the CD metric through the LSTM method with the value 0.98 presented in the same paper, it can be concluded that the model presented in [
7] is more efficient than the LSTM method used in this research work.
For the China dataset, the minimum value obtained for the MAE metric is 865.122, the minimum value obtained for the MdAE metric is 272.353, the minimum value obtained for the RMSE metric is 2034.574, and the maximum value for the CD metric is 0.902, all these values being obtained using the LSTM method. Thus, this method provides the most efficient estimate for the China dataset among the six analysed methods. Comparing the results obtained for the China dataset with the value 926.182 (
Table 11) for the MAE metric presented in paper [
6], it is observed that only the LSTM method provides a lower value, and thus a better estimate of the effort. In the case of the KNN, DT, RF, GBT, and MLP methods, values higher than 926.182 were obtained, so these methods are less effective. Comparing the minimum value 865.122 obtained for the MAE metric by means of the LSTM method with the value 676.6 (
Table 11) presented in the paper [
7], the minimum value 2034.574 obtained for the RMSE metric by means of the LSTM method with the value 1803.3 presented in the paper [
7], and the maximum value 0.902 obtained for the CD metric through the LSTM method with the value 0.93 presented in the same paper, it can be concluded that the model presented in [
7] is more efficient than the LSTM method used in this research paper for the China dataset.
In the case of the Desharnais dataset, the minimum value obtained for the MAE metric is 1404.571, the minimum value obtained for the MdAE metric is 880.121, and the minimum value obtained for the RMSE metric is 1847.561, the first two values being obtained using the LSTM method and the last one using the KNN method. For the CD metric, the maximum value of 0.662 was obtained using the LSTM method. Comparing the results obtained for the Desharnais dataset with the value 2244.675 (
Table 11) for the MAE metric presented in the paper [
6], it is observed that the optimal variants of all six analysed methods provide lower values, and therefore a better estimate of the effort. Moreover, comparing the results obtained for the Desharnais dataset with the value 2013.79 for the MAE metric presented in the work [
8] and with the value 2824.57 for the RMSE metric presented in the same work, it is observed that the optimal variants of all six analysed methods provide lower values, and thus a better estimation of the effort.
As can be seen from the second paragraph, the research methodologies of the five works [5,6,7,8,9] used in the comparison of the results are similar to the methodological process followed in this study, but with different percentages used when dividing the data into training and testing sets. After comparing the results obtained with the values presented in the research works selected for comparison, it can be observed that, for the Albrecht, Kemerer, and Desharnais datasets, the LSTM method provides better estimates of the effort. Because, for the Cocomo81 and China datasets, the architecture of the LSTM method presented in this paper does not provide satisfactory results compared with those obtained by the model presented in the paper [
7], further research should be carried out to improve the LSTM method.