1. Introduction
Internet transactions are increasing, and the logistics market is growing with them. Many logistics centers have been built, and the parcel forwarding service has expanded. As of 2020, the volume of general freight has been continually increasing [1]. The importance of domestic road freight transportation has been emphasized, even with the outbreak of COVID-19. The traffic volume of small- and medium-sized vehicles used for freight transportation increased between January and August 2020, after the outbreak of COVID-19, compared with 2019 [2]. Therefore, domestic road freight transportation is quite important for stimulating the logistics market. However, there is no accurate standard for the shipping costs in the domestic freight industry. Currently, the criteria for setting the shipping costs consider only distance and vehicle tonnage. These criteria serve only as a guideline for new market entrants because they cannot account for the various characteristics of freight and are difficult to use in practice. Instead, shipping costs are set based on the shipper’s know-how: shippers refer to the shipping costs of similar past freight and the current market price. Shipping costs are undervalued relative to labor and are unreasonable from the perspective of vehicle owners, and some transportation agents use their strong market position to charge excessive commissions [3]. This situation causes serious disputes between shippers and vehicle owners and leaves vehicle owners strongly dissatisfied.
This paper proposes a machine learning-based shipping cost prediction method for a domestic freight transportation environment using data from a freight brokerage platform. It also shows that predictive models can set the shipping costs appropriately, and it compares the predictive power to present the best predictive model.
We used 6 months of transportation-related data from the freight brokerage platform. To identify the major factors, new factors were added, and various preprocessing methods were applied. Correlational analysis and the stepwise selection method were used to derive the major factors. We then developed a fare prediction model by applying machine learning algorithms to the derived factors. The algorithms we used were multiple linear regression (MLR), deep neural networks (DNNs), extreme gradient boosting (XGBoost) regression, and light gradient boosting machine (LightGBM) regression. LightGBM is a model that reduces the learning time compared to the XGBoost model.
We also present a method for setting a range of predicted fares that reflects realistic usage: the fare should be presented to the user as a range rather than as a single value. A total of 30 training sets were generated using k-fold cross-validation. We trained a model on each training set and used it to predict the test set. Assuming that the 30 resulting predicted values follow a normal distribution, a confidence interval was calculated, and an appropriate fare range was presented. R-squared was used as the performance evaluation index for the predictive model.
The structure of this paper is as follows. Section 2 explains the theoretical background and previous research. Section 3 describes the results of the data collection and preprocessing, and Section 4 describes the derivation of the major factors. In Section 5, the model construction and results are explained, and, finally, in Section 6, conclusions and future research directions are presented.
2. Literature Review
2.1. Prior Studies
Kovács [4] calculated road freight shipping costs, which previously had only been estimated. Transport-related factors such as “distance,” “fuel,” “price,” and “highway toll” were selected to calculate the cost. A predictive model based on multiple regression analysis was built using the selected factors, and it demonstrated an excellent predictive performance.
Sternad [5] attempted to extract the major factors that affect road freight shipping costs. Fixed costs related to vehicles and drivers, and variable costs such as “fuel cost,” “toll fee,” and “mileage” were derived as characteristic factors. Next, the coefficient values for each characteristic factor were derived through multiple regression analysis. As a result, “fuel cost,” “travel cost,” and “working cost” were found to be major factors that affect the shipping costs.
Li et al. [6] used characteristic factors, such as “vehicle capacity,” “delivery location,” and “cargo volume,” with the mixed integer programming method to optimize the matching and pricing of multidelivery services for the cargo O2O (online to offline) platform. When the developed optimization technique was applied to data from the Chinese cargo O2O platform, the pickup distance was improved by 75–81%, and the shipping costs were reduced by 60–93%.
Lindsey et al. [7] investigated factors that affect truck fare rates in North America and found that factors such as “distance” and “truck type” were the most important for determining the shipping costs.
Price prediction has also been studied in other fields. Jo et al. [8] selected factors related to housing prices, such as “total lump-sum housing lease price index,” “increase in KOSPI (Korea Composite Stock Price Index),” and “consumer price index,” to predict changes in housing sale prices. The collected factors were used for logistic regression and random forest algorithms, and an appropriate prediction accuracy was achieved in the dataset. Jang and Park [9] predicted art prices based on eight factors whose correlation with art prices was verified. The algorithms used for the prediction were linear regression and k-nearest neighbor (KNN). The KNN algorithm, a nonparametric model capable of flexible fitting to the data, performed better because few variables were relevant to the artworks and the distribution of the data was difficult to assume due to insufficient information.
Previous studies indicate that the factors affecting the shipping cost setting include freight information factors such as “distance,” “vehicle type,” and “cargo volume,” and additional cost factors such as “delivery location,” “fuel cost,” and “highway fee.” Various algorithms have been used for price prediction. In the case of freight shipping cost forecasting, most studies have used traditional analysis models such as multiple regression analysis and mixed integer programming. In fields other than shipping costs, research on price prediction methods has been conducted using traditional analysis models and machine learning algorithms such as MLR, random forest, and KNN.
Many studies have applied advanced optimization algorithms as solution approaches to problems such as online learning, scheduling, multiobjective optimization, and data classification. These studies have addressed the effectiveness of such algorithms in various domains, including transportation and logistics, and their potential applications to decision problems [10,11,12,13].
Predictive research on freight shipping costs still needs to consider more factors and algorithms. To set freight shipping costs, not only freight characteristics but also environmental factors must be considered. Therefore, in this study, factors considered in previous studies, such as “distance,” “vehicle type,” and “cargo volume,” are combined with environmental factors such as precipitation to derive the factors that affect how the shipping costs are set. Furthermore, most studies on freight shipping cost prediction currently use traditional regression models, whereas many cost prediction studies in other fields use artificial intelligence (AI) algorithms, which often have a higher accuracy than traditional models. It is therefore necessary to advance research by applying AI algorithms to freight shipping cost prediction. Accordingly, we build a shipping cost prediction model using the derived factors and AI regression algorithms. In this process, we use k-fold cross-validation and the confidence interval to predict the range of the shipping costs and to increase applicability in the field.
2.2. Theoretical Background
Machine learning is a field of artificial intelligence that analyzes and learns from data using algorithms, and it determines or predicts the dependent variables based on what has been learned [14]. According to the learning method, machine learning is categorized as supervised learning and unsupervised learning. Supervised learning learns from data with input and output values and predicts output values for unseen or future data. It is used for classification or regression analysis [15].
In this study, four supervised learning algorithms were used: MLR, DNN, XGBoost regression, and LightGBM regression. The definition and characteristics of each algorithm are shown in Table 1.
3. Data Collection and Preprocessing
3.1. Data Collection
For this study, we collected freight brokerage data that were registered on the freight brokerage platform within the 6 months from April to September 2020. The dataset consists of 1,885,033 data observations and 78 variables used by freight brokerages, such as cargo information, vehicle type, vehicle tonnage, loading date and time, and unloading date and time.
3.2. Data Preprocessing
3.2.1. Creating and Removing Variables
In the dataset, variables that were related to the personal information of the cargo owners and vehicle owners, such as name, vehicle number, and phone number, were deleted, as they were judged to be irrelevant to the shipping cost prediction.
To derive factors that affect the shipping cost and to increase the predictive power of the shipping cost prediction model, variables that were expected to affect the shipping cost were added. The straight-line distance between the loading and unloading locations was calculated from their latitude and longitude using the haversine distance formula and was added as a “linear distance.” It was judged that detailed date and time information could affect the cost setting, so the arrival and departure dates were subdivided into the month, day, day of the week, and time, and new variables were created for each. In addition, we added the precipitation amount as a new factor, considering that the weather conditions at the time of the cargo transport would affect the shipping cost. The precipitation data of the Korea Meteorological Administration were used, and the precipitation value was assigned according to the loading and unloading locations and dates.
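The haversine calculation behind the “linear distance” variable can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name, the Earth radius constant, and the example coordinates (approximate city centres, not taken from the dataset) are all assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative loading and unloading points (approximate, hypothetical)
loading = (37.5665, 126.9780)    # around Seoul
unloading = (35.1796, 129.0756)  # around Busan
linear_distance = haversine_km(*loading, *unloading)
```

The result is a straight-line distance, so it is normally shorter than the road distance that the paper records separately as “actual distance.”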
3.2.2. Removing Data Outliers
The interquartile range (IQR) was used to remove outliers in the data. Outlier removal was applied only to continuous variables.
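A minimal sketch of IQR-based outlier removal on one continuous variable is given below. The paper does not state the fence multiplier, so the conventional value of 1.5 is an assumption here, as are the column name and the toy values.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (lower, upper) fences at Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def remove_outliers(rows, column, k=1.5):
    """Keep only the rows whose value in `column` lies inside the IQR fences."""
    low, high = iqr_bounds([r[column] for r in rows], k)
    return [r for r in rows if low <= r[column] <= high]


# Toy data: one implausibly long trip among ordinary ones (hypothetical values)
rows = [{"distance_km": d} for d in [10, 12, 14, 15, 16, 18, 20, 22, 400]]
cleaned = remove_outliers(rows, "distance_km")
```

In this toy sample only the 400 km row falls outside the fences and is dropped.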
3.2.3. Handling Missing Data
To predict accurate shipping costs, it was important to manage missing values in the input data. After applying two methods for managing missing values, we compared which method was more useful. Before the processing of missing values, factors for which more than 50% of the data were missing were determined to be factors that did not have a great influence on the prediction and were, thus, removed. We removed 20 factors, including “load/unload name address,” “summary,” and “order number.” For the missing value treatment, listwise deletion and the mean imputation were used, and a dataset was created to which each treatment for missing values was applied. The listwise deletion removed all data with missing values, and the mean imputation replaced the missing values with the average value of each factor. After we processed the missing values, the listwise deletion dataset consisted of 73 factors and 1,353,543 data observations, and the mean imputation dataset consisted of 73 factors and 1,442,036 data observations.
Figure 1 shows the dataset before and after preprocessing. Figure 1a shows the data before preprocessing, while Figure 1b,c show the data after preprocessing. The white part in the figure indicates the missing values, and the black part indicates the data.
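The three steps described in this subsection (dropping factors with more than 50% missing values, listwise deletion, and mean imputation) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: missing values are represented as None, and the column names and values are hypothetical.

```python
from statistics import mean


def drop_sparse_columns(rows, max_missing=0.5):
    """Drop columns whose share of missing (None) values exceeds max_missing."""
    keep = [c for c in rows[0]
            if sum(r[c] is None for r in rows) / len(rows) <= max_missing]
    return [{c: r[c] for c in keep} for r in rows]


def listwise_deletion(rows):
    """Keep only the rows that have no missing values."""
    return [r for r in rows if all(v is not None for v in r.values())]


def mean_imputation(rows, columns):
    """Replace missing values in the given numeric columns with the column mean."""
    means = {c: mean(r[c] for r in rows if r[c] is not None) for c in columns}
    return [{k: (means[k] if v is None and k in means else v) for k, v in r.items()}
            for r in rows]


rows = [
    {"freight_weight": 2.0, "distance_km": 100.0, "summary": None},
    {"freight_weight": None, "distance_km": 250.0, "summary": None},
    {"freight_weight": 4.0, "distance_km": None, "summary": None},
    {"freight_weight": 6.0, "distance_km": 400.0, "summary": "fragile"},
]
dense = drop_sparse_columns(rows)          # "summary" is 75% missing, so it is dropped
deleted = listwise_deletion(dense)         # rows with any remaining gap are removed
imputed = mean_imputation(dense, ["freight_weight", "distance_km"])
```

Listwise deletion shrinks the sample, whereas mean imputation keeps every row but pulls the imputed values toward the column average, which is why the two resulting datasets are compared separately in the paper.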
4. Derivation of Key Factors
From the 73 factors obtained through the data collection and preprocessing, we derived the factors that affect the shipping costs. Correlational analysis and the stepwise selection method were applied to derive the major factors.
4.1. Correlational Analysis
Correlational analysis is a method for analyzing a linear relationship between two variables. When a dependent variable is predicted through several independent variables, a meaningful variable can be selected by considering the correlation between the independent variable and the dependent variable, and the correlations between the independent variables [21]. In this study, independent variables with a correlation coefficient of 0.1 or higher, which was taken to indicate a linear relationship with the dependent variable, were judged to be the main factors.
Table 2 shows the variables that had a linear relationship with the dependent variable “shipping cost,” as well as the values of each Pearson correlation coefficient. The correlational analysis identified seven significant factors in the dataset to which listwise deletion was applied. In the dataset to which mean imputation was applied, there were eight significant factors: the seven factors found for listwise deletion plus “phase difference.” Additionally, for both datasets, the “linear distance” and “actual distance” factors have a strong linear relationship with the “shipping cost.” Therefore, a shipping cost prediction model was constructed using the significant variables from each dataset.
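The correlation-based screening described above can be sketched as follows. The 0.1 threshold comes from the text; applying it to the absolute value of the coefficient is an assumption, and the feature names and values are toy data rather than results from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def select_by_correlation(features, target, threshold=0.1):
    """Keep features whose |r| with the target reaches the threshold."""
    scores = {name: pearson(col, target) for name, col in features.items()}
    return {name: r for name, r in scores.items() if abs(r) >= threshold}


shipping_cost = [100, 150, 210, 260, 300, 360]
features = {
    "linear_distance": [10, 20, 30, 40, 50, 60],  # strongly related to cost
    "day_of_week": [3, 1, 3, 1, 1, 3],            # nearly unrelated to cost
}
selected = select_by_correlation(features, shipping_cost)
```

Only “linear_distance” survives the filter in this toy example; the second feature has a coefficient below 0.1 in absolute value.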
4.2. Stepwise Method
The stepwise method is one of the methods used for selecting the independent variables to be included in a regression model. It searches for the set of variables that constitutes the optimal regression model by repeatedly adding and removing variables. The selected variables are judged to be strong predictors in the prediction model [22].
Table 3 shows the variables selected through the stepwise method. When the stepwise selection method was applied to the listwise deletion dataset, 35 variables were selected. For the mean imputation dataset, a total of 33 predictors were selected: the same predictors as for the listwise deletion dataset, except for the “total cost” and “arranging fee” factors. Thus, a shipping cost prediction model was constructed using the significant variables of each dataset.
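The paper does not state the entry or removal criterion of its stepwise procedure, so the sketch below shows only a simplified forward-only variant that adds the variable with the largest R-squared gain until the gain falls below a threshold. The `min_gain` value, the helper names, and the toy data are all assumptions.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(columns, y):
    """R-squared of an ordinary least squares fit of y on columns plus an intercept."""
    X = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def forward_stepwise(features, y, min_gain=0.01):
    """Greedily add the feature that raises R-squared most; stop when the gain is small."""
    selected, best = [], 0.0
    remaining = set(features)
    while remaining:
        scores = {f: r_squared([features[s] for s in selected + [f]], y)
                  for f in remaining}
        cand = max(scores, key=scores.get)
        if scores[cand] - best < min_gain:
            break
        selected.append(cand)
        best = scores[cand]
        remaining.remove(cand)
    return selected, best


features = {
    "distance_km": [10, 20, 30, 40, 50, 60, 70, 80],
    "freight_weight": [5, 1, 6, 2, 8, 3, 7, 4],
    "day_of_week": [3, 1, 3, 1, 1, 3, 1, 3],  # irrelevant by construction
}
# Toy target that depends only on distance and weight
cost = [2 * d + 5 * w for d, w in zip(features["distance_km"], features["freight_weight"])]
selected, best = forward_stepwise(features, cost)
```

A full stepwise procedure would also re-test the already selected variables for removal after each addition; that backward step is omitted here for brevity.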
5. Model Construction and Analysis Results
5.1. Data Preparation
To ensure the accuracy of the model, all variables were normalized to the same scale. Min–max normalization, which converts all continuous variable data to values between 0 and 1, was used for normalization. We then derived the shipping cost as a range using the confidence interval. In the case of freight, some characteristics are not fully expressed in the data. Therefore, a recommended cost presented as a single value cannot adequately reflect real-world volatility. To derive the shipping cost as a range, a 95% confidence interval was calculated for the cost value predicted by the shipping cost prediction model. So that the predicted values could be assumed to approximately follow a normal distribution, we increased the number of predicted values (samples) using K-fold cross-validation.
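The min–max normalization mentioned above can be sketched as follows; the function name and the sample tonnage values are hypothetical.

```python
def min_max_scale(values):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to 0 to avoid dividing by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


vehicle_tonnage = [1.0, 2.5, 5.0, 11.0]  # illustrative values
scaled = min_max_scale(vehicle_tonnage)
```

In a train/test setting, the minimum and maximum would usually be taken from the training set only and reused for the test set; the paper does not say which convention was followed.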
To assess the suitability of the model, 80% of the collected data were allocated to a training set and 20% to a test set, and the datasets were then used for model construction and verification. To derive the cost range, the training set was divided into 30 folds through K-fold cross-validation. In each iteration, 29 folds were used for training and 1 fold for validation; the model was trained on the training folds and then used to predict the test set, and this process was repeated until 30 predicted values were derived. Afterward, the upper and lower bounds of the confidence interval were taken as the upper and lower limits of the prediction interval to estimate the predicted cost range. The overall configuration of the dataset for range prediction is shown in Figure 2.
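The range-prediction procedure can be sketched end to end with a deliberately simple model. Everything here is illustrative: the data are synthetic, a one-variable least squares line stands in for the actual models, and the interval is computed as a normal-based 95% confidence interval for the mean of the 30 predictions, which is one reading of the description above.

```python
import random
from statistics import NormalDist, mean, stdev


def fit_line(xs, ys):
    """Least squares intercept and slope for a single predictor."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope


def fare_range(train, x_new, k=30, level=0.95):
    """Fit k models, each leaving one fold out, and return a (low, high) fare range."""
    folds = [train[i::k] for i in range(k)]
    preds = []
    for i in range(k):
        subset = [row for j, fold in enumerate(folds) if j != i for row in fold]
        a, b = fit_line([x for x, _ in subset], [y for _, y in subset])
        preds.append(a + b * x_new)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half = z * stdev(preds) / len(preds) ** 0.5
    return mean(preds) - half, mean(preds) + half


random.seed(0)
# Synthetic (distance, cost) pairs: cost = 50 + 1.2 * distance + noise
data = [(d, 50 + 1.2 * d + random.gauss(0, 20)) for d in range(10, 310)]
random.shuffle(data)
split = int(0.8 * len(data))          # 80% training, 20% test
train, test = data[:split], data[split:]
low, high = fare_range(train, x_new=test[0][0])
```

Because the 30 models share most of their training data, their predictions are close to one another and the resulting range is narrow; it reflects the variability of the fitted models, not the spread of individual fares.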
This study was conducted using Python version 3.9. The Python libraries TensorFlow and scikit-learn were used to implement the machine learning algorithms.
5.2. MLR
The absence of multicollinearity, in which two or more independent variables are highly correlated, is among the assumptions of the regression analysis model. If the independent variables are correlated, the standard errors increase because the variance of the estimated regression coefficients is inflated. Therefore, the process of diagnosing multicollinearity is important, and the variance inflation factor (VIF) is used for this diagnosis. The VIF quantifies how much the variance of a coefficient is inflated. In general, a VIF of 10 or more indicates a high correlation between independent variables [23]. In this study, the VIF was checked for the four cases obtained through the data preprocessing and variable selection processes. Table 4 shows the VIF of the variables judged to be highly correlated with other independent variables in each case.
Based on the multicollinearity check shown in Table 4, variables with a high multicollinearity were removed from each dataset. Two factors were removed from the dataset to which correlation analysis and listwise deletion were applied, and three factors were removed from the dataset to which correlation analysis and mean imputation were applied, so both datasets were analyzed using five factors. For the dataset to which the stepwise method and listwise deletion were applied, 19 factors were removed, and 16 factors were used for analysis. For the dataset to which the stepwise method and mean imputation were applied, 18 factors were removed, and 15 factors were used for analysis.
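The VIF of a variable is 1 / (1 - R²), where R² comes from regressing that variable on all the other independent variables. A minimal sketch with a small hand-written least squares solver is shown below; the helper names and the toy data are assumptions, chosen so that two distance variables are nearly collinear.

```python
def solve(a, b):
    """Solve a*x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [v - f * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(columns, y):
    """R-squared of an ordinary least squares fit of y on columns plus an intercept."""
    X = [[1.0] + [c[i] for c in columns] for i in range(len(y))]
    k = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    pred = [sum(b * v for b, v in zip(beta, r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1 - ss_res / ss_tot


def vif(features):
    """VIF of each feature: 1 / (1 - R²) from regressing it on all the others."""
    result = {}
    for name in features:
        others = [col for other, col in features.items() if other != name]
        r2 = r_squared(others, features[name])
        result[name] = float("inf") if r2 >= 1 else 1 / (1 - r2)
    return result


features = {
    "linear_distance": [10, 20, 30, 40, 50, 60, 70, 80],
    "actual_distance": [13, 25, 38, 52, 63, 77, 90, 103],  # roughly 1.3 x linear
    "freight_weight": [5, 1, 6, 2, 8, 3, 7, 4],
}
vifs = vif(features)
```

With these toy values both distance variables exceed the threshold of 10, so one of them would be a candidate for removal, while “freight_weight” stays well below it.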
The process of learning and predicting was repeated 30 times to obtain enough samples so that the predicted values could approximately follow a normal distribution. For the learning step, the 30 training sets obtained through K-fold cross-validation were used, and the prediction was carried out on one test set. The R-squared value of the resulting MLR model is shown in Table 5. The explanatory power of the model was calculated as the average of the R-squared values over the 30 sets of predictions. The fare prediction results for the MLR model show that the average explanatory power of the model was approximately 63.4%, and the R-squared value was the highest for the model in which the missing data were processed by listwise deletion and the factors were selected using the stepwise method.
Table 6 shows the five values with the smallest error between the actual value and the predicted value among the cost range predictions of the MLR model. The fare range was calculated using the confidence interval of the predicted values obtained by the model: the upper and lower bounds of the confidence interval were taken as the upper and lower limits of the predicted fare range.
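The averaging of R-squared over repeated predictions can be sketched as follows. Three prediction sets are used instead of 30 to keep the example short, and all numbers are made up for illustration.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


actual = [100.0, 150.0, 200.0, 250.0]  # test set costs (illustrative)
fold_predictions = [                   # one list of test predictions per trained model
    [110.0, 140.0, 210.0, 240.0],
    [90.0, 160.0, 190.0, 260.0],
    [100.0, 150.0, 200.0, 250.0],
]
scores = [r_squared(actual, p) for p in fold_predictions]
average_r2 = sum(scores) / len(scores)
```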
5.3. DNN
In this study, a DNN model with five hidden layers was constructed. The number of hidden layers and neurons was empirically determined after testing various combinations. The parameters of the DNN model are shown in Table 7.
The DNN model repeated the same learning and prediction process 30 times. The R-squared value of the DNN model after the learning process is shown in Table 8. The fare prediction by the DNN algorithm showed that the average explanatory power of the model was about 73.2%. When the variables obtained through the stepwise method were applied, the predictive power differed greatly depending on the preprocessing method. In addition, the R-squared value of the model was the highest when the mean imputation method and the stepwise method were used.
Table 9 shows the five values with the smallest error between the actual value and the predicted value among the results for the fare range prediction using the DNN model.
5.4. XGBoost Regression
The XGBoost model was built using the basic form of the XGBoost algorithm, with 400 weak learners and a maximum tree depth of three levels. The learning rate was left at its default value of 0.3. The results of the XGBoost model are shown in Table 10. The results for predicting the shipping cost using the XGBoost model show that the model has an average explanatory power of about 74.6%. A significant difference in the explanatory power according to the variable selection method was found.
Table 11 shows the five values with the smallest error between the actual value and the predicted value among the results for the predicted cost range using the XGBoost model.
5.5. LightGBM
Table 12 shows the analysis results of the model built using the basic form of the LightGBM algorithm. The learning rate was left at its default value of 0.1. The results for predicting the shipping cost using the LightGBM model show that the model has an average explanatory power of about 78.6%. The LightGBM model also shows a difference in the explanatory power depending on the variable selection method, but the explanatory power of all the models was 0.7 or greater.
Table 13 shows the five values with the smallest error between the actual value and the predicted value among the results for predicting the cost range using the LightGBM model.
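The boosting settings stated in Sections 5.4 and 5.5 can be written down as a configuration sketch. It assumes the scikit-learn style wrappers of the xgboost and lightgbm packages (`XGBRegressor` and `LGBMRegressor`), which the paper does not name explicitly; any parameter not mentioned in the text is left at the library default.

```python
# Hyperparameters stated in the text; everything else stays at the library defaults
XGB_PARAMS = {"n_estimators": 400, "max_depth": 3, "learning_rate": 0.3}
LGBM_PARAMS = {"learning_rate": 0.1}


def build_boosting_models():
    """Instantiate both regressors; requires the xgboost and lightgbm packages."""
    # Imported lazily so the parameter dictionaries can be inspected without the packages
    from xgboost import XGBRegressor
    from lightgbm import LGBMRegressor

    return XGBRegressor(**XGB_PARAMS), LGBMRegressor(**LGBM_PARAMS)
```

Both objects expose `fit(X, y)` and `predict(X)`, so they can be dropped into the same 30-fold training loop as the other models.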
5.6. Model Comparison
Table 14 shows a comparison of the explanatory power of all the analysis methods. When the predictive power was compared across the preprocessing methods, there appears to be no significant difference, except for the DNN model. In addition, the models that predicted the shipping cost using the variables selected through the stepwise method have a higher explanatory power than the models to which the correlation analysis was applied. This difference was especially large for the boosting models, which also had a short learning time and a high predictive power compared with the other models. The higher explanatory power of the models to which the stepwise method was applied is thought to arise because these models consider relatively more factors. In addition, the variables obtained through the stepwise method include all the variables obtained through the correlational analysis.
The results show that the boosting models have an excellent predictive power. The LightGBM model has the best predictive power, followed by the XGBoost and DNN models, and the MLR model has the lowest predictive power. This is thought to be due to the differences between machine learning and traditional analysis models: machine learning iteratively learns and improves the model to increase the probability of success in the prediction, whereas a traditional analysis model makes predictions with a fixed model obtained through a single analysis.
The time required for model learning is also an important factor to consider for field applications. If a model takes a long time to learn, even if it shows a high accuracy, it may be difficult to apply in a field where rapid decision-making is required. Models built on the variables from the correlation analysis, which yields fewer variables to consider, had short learning times. The model with the shortest learning time was the MLR model, followed by the LightGBM model. The DNN model had a high predictive power, but it took a long time to learn compared with the other models, so it was judged to be difficult to apply in the field.
Considering the above results, it is evident that the machine learning models show a higher predictive power than the traditional analysis method. Considering both the speed and performance among the machine learning models, the LightGBM model was judged to be the most suitable for predicting the shipping cost. If the model performance is optimized in the future, the accuracy of the model could be further increased.
5.7. Variable Importance
Another purpose of this study was to derive factors that affect the cost setting. To derive these factors, the variable with the greatest contribution to the prediction of the shipping cost was identified using SHapley Additive exPlanations (SHAP). SHAP calculates the contribution of each variable by comparing all combinations of variables in the model. In this study, the SHAP values of the LightGBM model, which had the highest predictive power, were measured, and the analysis results are shown in Figure 3.
The factors that make a high contribution to the shipping cost prediction are “linear distance,” “actual distance,” “freight weight,” and “vehicle tonnage,” which are highly related to transportation distance and freight characteristics. In particular, “linear distance” showed a high contribution of greater than 50%. It was judged that “linear distance” has a greater influence than “actual distance” in determining the shipping cost.
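The idea of comparing all combinations of variables can be illustrated with an exact Shapley value calculation on a toy model. This is not the SHAP library, which uses faster model-specific algorithms for tree ensembles; the fare function, the feature values, and the baseline below are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values of predict at x, relative to baseline feature values."""
    names = list(x)
    n = len(names)

    def value(coalition):
        # Features inside the coalition take their real value, the rest the baseline
        return predict({f: (x[f] if f in coalition else baseline[f]) for f in names})

    phi = {}
    for f in names:
        others = [g for g in names if g != f]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley average
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                total += weight * (value(set(subset) | {f}) - value(set(subset)))
        phi[f] = total
    return phi


def toy_fare(row):
    """Hypothetical fare model used only for this illustration."""
    return (20 + 1.0 * row["linear_distance"]
            + 5.0 * row["freight_weight"] + 2.0 * row["vehicle_tonnage"])


x = {"linear_distance": 120.0, "freight_weight": 4.0, "vehicle_tonnage": 5.0}
baseline = {"linear_distance": 80.0, "freight_weight": 3.0, "vehicle_tonnage": 2.5}
phi = shapley_values(toy_fare, x, baseline)
share = {f: v / sum(phi.values()) for f, v in phi.items()}
```

The contributions add up to the difference between the prediction for `x` and the prediction for the baseline, which is what allows each variable's share of the total to be expressed as a percentage.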
6. Conclusions and Future Research
To solve the problem of fare setting on a freight transportation brokerage platform, where there is no standardized shipping cost, this study derived the main factors that affect the shipping cost setting and built a price prediction model using machine learning. Factors that affect the shipping cost were selected using correlational analysis and the stepwise method from a total of 73 factors, including factors obtained from the freight brokerage process and environmental factors such as precipitation. The selected factors were cargo characteristic factors, vehicle owner characteristic factors, and environmental factors. Cargo characteristic factors included “freight weight,” “loading/unloading time,” and “loading/unloading location.” “Vehicle tonnage” and “vehicle type” were included as vehicle owner characteristic factors. Precipitation was an environmental factor. Using these factors, shipping cost prediction models were built, and the performance of each model was compared. The results of the analysis showed that the DNN, XGBoost, and LightGBM models, which are machine learning models, performed better than linear regression, which is a traditional analysis method. The model that showed the best predictive power among the models used was the LightGBM model.
In addition, this study explored factors that affect the cost setting. Factor exploration was conducted using the LightGBM model, which had the highest predictive power. The factor that contributed the most to the cost setting was “linear distance,” and “actual distance,” “freight weight,” and “vehicle tonnage” were also found to be major variables that influence the cost setting. No valid results were obtained for “precipitation,” which had been expected to affect the shipping costs: daily precipitation data were added as an environmental factor, but a significant correlation between precipitation and the shipping costs could not be confirmed. This is believed to be due to the characteristics of the freight market. In the actual freight market, cargo transport volume decreases on rainy days. Since the data used in this study cover only cargoes whose transportation was completed, these market conditions do not appear in the data. Nevertheless, the major variables obtained through this research can serve as quantitative indicators for determining the shipping costs. This is expected to help solve the problem of setting the shipping costs based on the shipper’s experience. Because factors that were not previously considered can be considered in the future, the appropriateness of the shipping costs is expected to improve. A more accurate model could be presented if research is conducted by additionally considering other factors that reflect the actual situation.
Machine learning has been in the spotlight because it has shown excellent performance in forecasting for many fields, but there have been no case studies in the field of shipping cost prediction. In this study, a model with a high predictive power was presented by introducing machine learning algorithms for fare prediction. This model is more accurate than the currently used freight rate criteria because it considers more cargo characteristics. Therefore, this model could be used in future cost prediction research and for setting the standard shipping costs on freight transport brokerage platforms. However, because the actual shipping costs are determined by the know-how of the shipper, some factors cannot be confirmed with data, which limits how accurately the transportation costs can be predicted. If more factors are considered through future research and the model is advanced, a more accurate model can be built.
In future studies, we will generate and analyze meaningful data by changing the preprocessing method for environmental factors. In addition, other factors such as “highway fee” and “fuel cost” will be added to determine which environmental factors affect the shipping costs. It is also necessary to optimize the model performance in future studies to increase its accuracy. Freight falls into different fare ranges according to its various characteristics, so a more accurate model could be built if the shipping cost is predicted after the data are filtered with these characteristics in mind. Finally, more advanced optimization algorithms or metaheuristics could be explored for this decision problem, and the proposed approach could be compared with them in future research.
Author Contributions
Conceptualization, H.-S.J.; Funding acquisition T.-W.C.; Investigation, H.-S.J.; Project administration, H.-S.J. and T.-W.C.; Validation, T.-W.C. and S.-H.K.; Writing—original draft, H.-S.J. and T.-W.C.; Writing—review and editing, S.-H.K. All authors have read and agreed to the published version of the manuscript.
Funding
This work was supported by a Korea Institute for Advancement of Technology (KIAT) grant funded by the Korea Government (MOTIE) (P0008691, HRD Program for Industrial Innovation) and the GRRC program of Gyeonggi province [(GRRC KGU 2020-B01), Research on Intelligent Industrial Data Analytics].
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable. Due to trade secret concerns, the raw data are kept confidential; some data may be requested from the authors or from Hwamulman Co. Ltd., Gwangju, Republic of Korea.
Conflicts of Interest
The authors declare no conflict of interest.
References
- Lee, T.-H.; Heo, J.-S. 2021 Logistics Industry Outlook; Issue Paper 2021-02; The Korea Transport Institute: Sejong, Korea, 2021.
- Ko, Y.-S. Post COVID-19, the Change of Road Transport System and Logistics. Transp. Technol. Policy 2021, 18, 12–16.
- Do, K.-H. A Study on the Problems and Improvement of the THC, CAF & BAF in Korea Ocean Freight; Konkuk University Graduate School: Seoul, Republic of Korea, 2009.
- Kovács, G. First cost calculation methods for road freight transport activity. Transp. Telecommun. J. 2017, 18, 107–117.
- Sternad, M. Cost Calculation in road freight transport. Bus. Logist. Mod. Manag. 2019, 19, 215–225.
- Li, J.; Zheng, Y.; Dai, B.; Yu, J. Implications of matching and pricing strategies for multiple-delivery-points service in a freight O2O platform. Transp. Res. Part E Logist. Transp. Rev. 2020, 136, 101871.
- Lindsey, C.; Frei, A.; Babai, H.; Mahmassani, H.; Park, Y.; Klabjan, D.; Reed, M.; Langheim, G.; Keating, T. Modeling carrier truckload shipping costs in spot markets. In Proceedings of the 24 Annual Meeting of the Transportation Research Board, Washington, DC, USA, 13–17 January 2013.
- Jo, S.-H.; Kang, M.-G.; Kim, G.-E.; Ban, J.-H.; Lee, J.-H.; Kang, T.-W. Predicting changes in housing prices nationwide through machine learning. In Proceedings of the KIIT Conference, Bhubaneswar, India, 17–18 December 2021.
- Jang, D.-R.; Park, M.-J. Price Determinant Factors of Artworks and Prediction Model Based on Machine Learning. J. Korean Soc. Qual. Manag. 2019, 47, 687–700.
- Zhao, H.; Zhang, C. An online-learning-based evolutionary many-objective algorithm. Inf. Sci. 2020, 509, 1–21.
- Dulebenets, M.A. An adaptive polyploid memetic algorithm for scheduling trucks at a cross-docking terminal. Inf. Sci. 2021, 565, 390–421.
- Pasha, J.; Nwodu, A.L.; Fathollahi-Fard, A.M.; Tian, G.; Li, Z.; Wang, H.; Dulebenets, M.A. Exact and metaheuristic algorithms for the vehicle routing problem with a factory-in-a-box in multi-objective settings. Adv. Eng. Inform. 2022, 52, 101623.
- Rabbani, M.; Oladzad-Abbasabady, N.; Akbarian-Saravi, N. Ambulance routing in disaster response considering variable patient condition: NSGA-II and MOPSO algorithms. J. Ind. Manag. Optim. 2022, 18, 1035–1062.
- Lee, Y.-S.; Moon, P.-J. A Comparison and Analysis of Deep Learning Framework. J. Korea Inst. Electron. Commun. Sci. 2017, 12, 115–122.
- Bae, S.-W.; Yu, J.-S. Predicting the Real Estate Price Index Using Machine Learning Methods and Time Series Analysis Model. Hous. Stud. Rev. 2018, 26, 107–133.
- Wilks, D.S. Statistical Methods in the Atmospheric Sciences; Academic Press: Cambridge, MA, USA, 2011; Volume 100.
- Schmidhuber, J. Deep learning in neural networks: An overview. Neural Netw. 2015, 61, 85–117.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016.
- Natekin, A.; Knoll, A. Gradient boosting machines, a tutorial. Front. Neurorobotics 2013, 7, 21.
- Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. 2017, 30, 3815895.
- Hall, M.A. Correlation-Based Feature Selection for Machine Learning. Ph.D. Dissertation, The University of Waikato, Hamilton, New Zealand, 1999.
- Wang, D.; Zhang, W.; Bakhai, A. Comparison of Bayesian model averaging and stepwise methods for model selection in logistic regression. Stat. Med. 2004, 23, 3451–3467.
- Daoud, J.I. Multicollinearity and regression analysis. J. Phys. Conf. Ser. 2017, 949, 012009.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).