Prediction of the Behaviour from Discharge Points for Solid Waste Management
Abstract
1. Introduction
2. Related Works
3. Objective and Research Questions
- RQ1: Are the size, reliability, and stability of the available data sufficient to obtain precise results for each individual DP?
- RQ2: Is the precision of the prediction sufficient to optimise the MSW management effectively?
- RQ3: Is it possible to perform the necessary calculations with the appropriate speed or within the required time frame to be used in real-time systems?
4. Materials and Methods
4.1. Data Preparation
4.2. Forecast Algorithms Selection
4.3. Forecast Scenario Selection
- Day of the week and hour
- Day of the month and hour
- Weekend and hour
- Holiday and hour
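Each of the four scenarios pairs one calendar feature with the hour of day. A minimal sketch of how such feature pairs could be derived with pandas (the column names and the holiday calendar are illustrative assumptions, not the paper's implementation):

```python
import pandas as pd

# Hypothetical hourly dataset: one row per hour with the DP filling increment.
df = pd.DataFrame({
    "datetime": pd.date_range("2019-01-08", periods=6, freq="h"),
    "real_increment": [0, 3, 5, 2, 0, 4],
})

# Derive the calendar features, each paired with the hour of day.
df["hour"] = df["datetime"].dt.hour
df["day_of_week"] = df["datetime"].dt.dayofweek + 1   # 1 (Monday) .. 7 (Sunday)
df["day_of_month"] = df["datetime"].dt.day            # 1 .. 31
df["weekend"] = df["day_of_week"].isin([6, 7])        # Saturday or Sunday
holidays = {pd.Timestamp("2019-01-01")}               # assumed local-holiday calendar
df["holiday"] = df["datetime"].dt.normalize().isin(holidays)

# Each scenario trains on exactly one feature pair:
scenarios = {
    "Day of the week and hour":  ["day_of_week", "hour"],
    "Day of the month and hour": ["day_of_month", "hour"],
    "Weekend and hour":          ["weekend", "hour"],
    "Holiday and hour":          ["holiday", "hour"],
}
X = df[scenarios["Weekend and hour"]]
y = df["real_increment"]
```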
5. Discussion
5.1. Predictions vs. Real Values
5.2. Results for the Selected Metrics
5.3. Algorithms Comparison
5.4. Scenarios Analysis
- Low Variability in Data: Scenarios with relatively stable and low variability in waste generation patterns led to more accurate predictions. For instance, DP locations that had consistent usage patterns and fewer fluctuations showed better prediction accuracy.
- Higher Data Quality: DPs with more complete and higher-quality data tended to produce better results. This included fewer missing values and more consistent data recording practices.
- Effective Feature Relevance: Scenarios where the selected features (like day of the week or weekend) had a strong correlation with the waste generation patterns yielded more accurate predictions.
- Stability in Patterns: DPs with predictable and stable waste generation patterns allowed the models to learn and generalise more effectively. Consistency in data helps in identifying clear trends and reduces the noise that can obscure the underlying patterns.
- Data Completeness and Quality: High-quality data with minimal missing values and consistent recording practices provided a solid foundation for model training. Accurate and complete datasets ensure that the models are trained on reliable information, enhancing their predictive capabilities.
- Relevance of Features: The selection of relevant features that strongly correlate with waste generation patterns proved crucial.
- High Variability in Data: High variability in the waste generation data, possibly due to erratic usage patterns or inconsistent data collection, led to poor prediction accuracy. For instance, DPs in city areas with fluctuating population density or irregular events had significant prediction errors.
- Incomplete Data: Scenarios with many missing values or inconsistencies in data recording practices negatively impacted the model performance. The lack of a complete and clean dataset made it challenging for the algorithm to learn accurate patterns.
- Weak Feature Relevance: When the selected feature did not strongly correlate with the waste generation patterns, the models struggled to make accurate predictions. This was evident in DPs where external factors like special events or holidays significantly influenced waste generation but were not included in the feature set.
- Small Dataset: DPs where only a limited amount of data was available posed significant challenges for the algorithms. A small dataset can lead to overfitting, where the model performs well on the training data but fails to generalise to new data. Additionally, small datasets often lack the diversity needed to capture all possible variations in waste generation patterns. For example, DPs in newly established or low-traffic areas may not have accumulated enough data to train robust models, resulting in less reliable predictions.
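The favourable and unfavourable conditions above suggest a simple screening step before training. A sketch of such a screen (the thresholds, column names, and example values are assumptions for illustration) that flags DPs with high variability or too little history:

```python
import pandas as pd

# Hypothetical per-DP hourly increments; DP 2 is erratic, DP 1 is stable.
data = pd.DataFrame({
    "dp": [1] * 6 + [2] * 6,
    "increment": [10, 11, 9, 10, 12, 10, 0, 90, 1, 0, 75, 2],
})

# Screening heuristics echoing the observations above: a high coefficient of
# variation and a short history both suggest the DP will predict poorly.
profile = data.groupby("dp")["increment"].agg(["mean", "std", "count"])
profile["cv"] = profile["std"] / profile["mean"]
profile["flag_high_variability"] = profile["cv"] > 1.0      # threshold is an assumption
profile["flag_small_dataset"] = profile["count"] < 24 * 30  # fewer than ~30 days of hours
```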
5.5. Exploring Alternative Approaches
- Pattern Similarity: The predictions from the two algorithms displayed a pattern similar to the actual DP values, suggesting that the models capture some underlying trends despite the overall prediction errors.
- Significant Upturns: The upturns differ substantially from the other values, do not exhibit regular behaviour, and show noticeable fluctuations, which might explain why the forecasting algorithms struggled to match the DP values accurately.
- Recurring Daily Patterns: The recurring daily pattern in the data could be a clue as to why the algorithms fail to predict the next hour accurately. A broader approach, predicting the next 24-hour pattern and then adjusting it with real-time data, might be required.
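The 24-hour idea above can be illustrated with a toy sketch (an assumed design, not the paper's method): build an average daily profile from history, then rescale the not-yet-observed hours as real-time readings arrive.

```python
import numpy as np

def daily_profile(history: np.ndarray) -> np.ndarray:
    """history: (n_days, 24) past hourly increments -> mean 24-hour profile."""
    return history.mean(axis=0)

def adjust_with_realtime(profile: np.ndarray, observed: np.ndarray) -> np.ndarray:
    """Rescale the remaining hours so the day tracks what was observed so far."""
    h = len(observed)                        # hours already observed today
    expected_so_far = profile[:h].sum()
    ratio = observed.sum() / expected_so_far if expected_so_far > 0 else 1.0
    adjusted = profile.copy()
    adjusted[h:] *= ratio                    # scale only the future hours
    return adjusted

# Two weeks of an assumed pattern: flat 5 units/h between 06:00 and 18:00.
history = np.tile(np.concatenate([np.zeros(6), np.full(12, 5.0), np.zeros(6)]), (14, 1))
profile = daily_profile(history)
observed = np.array([0, 0, 0, 0, 0, 0, 10, 10])  # today runs at double the usual rate
forecast = adjust_with_realtime(profile, observed)
```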
6. Conclusions
- For RQ1, the answer is positive, as we were able to perform calculations without major issues related to data size or quality.
- For RQ2, the answer is negative, as the forecasts were not accurate enough to optimise management effectively.
- For RQ3, we found that the time required for calculations is substantial and would need enhancements and optimisations, such as pre-calculations or other techniques, to be feasible in a real-time scenario.
- Incorporating External Factors: While this research considered aspects related to the time of day, week, month or year, and local holidays, future research could investigate the impact of other external factors such as weather conditions and local events on the waste generation patterns at DPs. This could enhance the predictive accuracy of the models.
- Improving Algorithm Performance: Explore the additional forecast algorithms analysed but not used in this study, and investigate others that may offer better performance for time series predictions.
- Real-Time Data Integration: Develop methods to integrate real-time data from the IoT sensors at the DPs into the predictive models to improve the timeliness and accuracy of forecasts.
- Cost-Benefit Analysis: Conduct a detailed cost–benefit analysis of implementing advanced predictive models in MSW management to quantify the economic and ecological benefits.
- Scalability Studies: Assess the scalability of the predictive models for larger datasets and more extensive municipal areas to ensure that the proposed solutions can be effectively applied in diverse urban settings.
- User Behaviour Analysis: Study the behaviour of residents and businesses contributing to the waste stream to identify patterns and trends that could inform more targeted waste management strategies.
- Optimisation of Collection Routes: Investigate the integration of predictive models with dynamic route optimisation algorithms to enhance the efficiency of waste collection operations.
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Appendix A
Attribute | Description |
---|---|
Number of DP | The numerical ID of the DP. |
Date and time | Timestamp truncated to the hour (minutes and seconds set to 0). |
Real increment | Integer number with the DP filling increment for the hour. |
Day of the month | Obtained from date. Value from 1 to 31 depending on the month. |
Day of the week | Obtained from date. Value from 1 to 7. |
No. month | Obtained from date. Value from 1 to 12. |
No. week of month | Obtained from date. Value from 1 to 5. |
No. week of year | Obtained from date. Value from 1 to 53. |
Year | Date decomposition. |
Month | Date decomposition. |
Day | Date decomposition. |
Holiday/Not holiday | Calculated with date and the use of a calendar with local holidays. Values are “HOLIDAY” or “NOT_HOLIDAY”. |
Season | Obtained from date. Values are “SPRING”, “SUMMER”, “AUTUMN” or “WINTER”. |
Time of the day | Obtained from date. Values are “MORNING”, “AFTERNOON”, “EVENING” or “NIGHT”, each corresponding to a fixed range of hours. |
Weekday/Weekend | Obtained from date. Values “WEEKDAY” (Monday to Friday) or “WEEKEND”. |
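A sketch of how the derived attributes in this table could be computed from the timestamp. The time-of-day hour boundaries are not reproduced in the table above, so the ranges used here are assumptions for illustration only:

```python
import pandas as pd

def decompose(ts: pd.Timestamp) -> dict:
    """Derive the Appendix A attributes from a single timestamp."""
    seasons = {12: "WINTER", 1: "WINTER", 2: "WINTER",
               3: "SPRING", 4: "SPRING", 5: "SPRING",
               6: "SUMMER", 7: "SUMMER", 8: "SUMMER",
               9: "AUTUMN", 10: "AUTUMN", 11: "AUTUMN"}
    tod = ("NIGHT" if ts.hour < 6 else "MORNING" if ts.hour < 12
           else "AFTERNOON" if ts.hour < 18 else "EVENING")  # assumed boundaries
    return {
        "day_of_month": ts.day,                   # 1..31
        "day_of_week": ts.dayofweek + 1,          # 1..7, Monday = 1
        "no_month": ts.month,                     # 1..12
        "no_week_of_year": ts.isocalendar()[1],   # 1..53
        "season": seasons[ts.month],              # meteorological, northern hemisphere
        "time_of_day": tod,
        "weekday_weekend": "WEEKEND" if ts.dayofweek >= 5 else "WEEKDAY",
    }

row = decompose(pd.Timestamp("2019-01-08 13:00"))
```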
Appendix B
Forecast Algorithm | Description |
---|---|
Gradient Boosting | Based on an ensemble meta-algorithm to reduce errors in forecast analysis. It creates a prediction model from a set of weak prediction models, each of which makes a few assumptions about the data. |
Extreme Gradient Boosting (XGBoost) Regression | Based on the Gradient Boosting and Decision Tree algorithms, used for supervised learning tasks and supporting parallel processing to capture complex relationships between input features and target variables and to have a better selection and understanding of model behaviour. |
Light Gradient Boosting Machine (LightGBM) | Based on the Gradient Boosting algorithm, designed for efficient training on large-scale datasets with low memory cost using parallel and distributed computing. |
CatBoost | Based on Decision Tree algorithms using Gradient Boosting, with native handling of categorical features. |
Stepwise Regression | Iteratively selects significant explanatory variables for the model, discarding less important ones by statistical significance after each iteration. |
Linear Regression | Supervised learning algorithm predicting the relationship between two variables, assuming a linear connection between them. |
Adaptive Boosting (AdaBoost) | Boosting algorithm that classifies data by combining multiple weak learners into a strong one. |
Autoregressive Integrated Moving Average (ARIMA) | Time series forecasting algorithm combining autoregressive and moving average components over differenced historical values. |
Seasonal-ARIMA (SARIMA) | Based on the ARIMA algorithm including seasonality in the forecast. |
Neural Networks Regression | Uses artificial neural networks where each node has an activation function that defines the output based on a set of inputs, building a complex relationship between inputs and outputs. |
Multiple Linear Regression | Extension of Linear Regression algorithm allowing predictions with multiple independent variables. |
Ordinal Regression | Predicts variables on an arbitrary scale, considering the relative order of variables. |
Fast Forest Quantile Regression | Based on the Decision Tree algorithm, predicting not only the mean but also quantiles of the target variable. |
Boosted Decision Tree Regression | Ensemble algorithm combining predictions from multiple weak learners to create a strong predictive model by the correction of errors with the iterative trees created. |
Robust Regression | Provides an alternative to least squares regression, reducing the influence of outliers to fit better to a greater part of the data. |
Stochastic Gradient Descent | Efficient algorithm fitting linear regressors under convex loss regression algorithms, suitable for large-scale datasets. |
Decision Tree | Non-linear regression algorithm splitting the dataset into smaller parts, creating a tree-like structure. |
Elastic Net | Based on Linear Regression, using penalisations to shrink predictor coefficients, combining absolute (L1) and squared (L2) penalties. |
Gaussian Regression | Flexible supervised learning algorithm with inherent uncertainty measures over predictions. |
K-Nearest Neighbours (KNN) | Non-linear regression algorithm predicting the target variable by averaging values of its k-nearest neighbours. |
LASSO Regression | Based on Linear Regression, estimates sparse coefficients by selecting variables and regulating them to improve accuracy. |
Logistic Regression | Models the probability of a discrete outcome given an input variable. |
Naïve Bayes (Bayesian Regression) | Incorporates Bayesian principles into another regression algorithm to estimate the probability distribution of the model. |
Polynomial Regression | Extension from Linear Regression, predicting based on complex relationships using an nth degree polynomial over the independent variable. |
Poisson Regression | Models count data (events occurring within a fixed interval), treating each occurrence as a rare and independent event. |
Random Forest | Non-linear regression algorithm using multiple Decision Trees to predict the output. |
Ridge Regression | Based on Linear Regression, provides regularisation to prevent overfitting. |
Support Vector Regression (SVR) | Supervised machine learning algorithm that fits a function in a multidimensional space, tolerating prediction errors within a margin. |
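As an illustration of applying one of the listed algorithms, the following sketch fits scikit-learn's `GradientBoostingRegressor` to synthetic hourly increments driven by day of week and hour (the data generator and hyperparameters are assumptions, not the study's configuration):

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(0)

# Synthetic stand-in for one DP: 8 weeks of hourly increments with a daily
# cycle plus a weekend effect and a little noise.
n = 24 * 7 * 8
hour = np.arange(n) % 24
dow = (np.arange(n) // 24) % 7 + 1                      # 1..7
y = 5 * np.sin(hour * np.pi / 12) ** 2 + (dow >= 6) * 3 + rng.normal(0, 0.5, n)
X = np.column_stack([dow, hour])                        # "Day of Week + Hour" scenario

model = GradientBoostingRegressor(n_estimators=100, max_depth=3, random_state=0)
model.fit(X[:-24], y[:-24])                             # hold out the last day
pred = model.predict(X[-24:])                           # forecast the held-out day
```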
Appendix C
Metric | Description |
---|---|
Explained Variance Regression Score (VRS) | Based on the variance metric, representing the dispersion of a continuous dataset. Closer to 1 is better. |
Mean Squared Error (MSE) | Average of the squared differences between real and predicted values. Lower values are better. |
Mean Absolute Error (MAE) | Represents the average error between the real and predicted values. Lower values are better. |
Root Mean Squared Error (RMSE) | Measures the average difference between predicted and actual values. Closer to 0 is better. |
Mean Squared Logarithmic Error (MSLE) | Measures the relative difference between the log-transformed actual and predicted values. Closer to 0 is better. |
Median Absolute Error (Median AE) | Median of the differences between observed and predicted values. Closer to 0 is better. |
Coefficient of determination (R2) | Indicates how well one variable explains the variance of another. Closer to 1 is better. |
Mean Absolute Percentage Error (MAPE) | Shows the average absolute percentage difference between real and predicted values. |
Mean Tweedie Deviance (MTD) | Calculates the mean Tweedie deviance error, indicating the prediction type (Mean Squared Error, Mean Poisson Deviance, or Gamma Deviance). |
D2 score | Generalisation of R2, replacing squared error by a deviance like Tweedie (D2 TS), Pinball (D2 PS), or Mean Absolute Error (D2 AES). |
Maximum Error (ME) | Captures the worst-case error between predicted and real values. Closer to 0 is better. |
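Most of these metrics are available in scikit-learn; a short sketch computing them for a toy prediction (the example values are illustrative):

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             median_absolute_error, r2_score,
                             explained_variance_score, max_error,
                             mean_squared_log_error)

y_true = np.array([10.0, 12.0, 9.0, 11.0, 30.0])
y_pred = np.array([11.0, 10.0, 9.5, 12.0, 25.0])

metrics = {
    "VRS":       explained_variance_score(y_true, y_pred),
    "MSE":       mean_squared_error(y_true, y_pred),
    "MAE":       mean_absolute_error(y_true, y_pred),
    "RMSE":      mean_squared_error(y_true, y_pred) ** 0.5,
    "MSLE":      mean_squared_log_error(y_true, y_pred),  # requires non-negative values
    "Median AE": median_absolute_error(y_true, y_pred),
    "R2":        r2_score(y_true, y_pred),
    "ME":        max_error(y_true, y_pred),
}
```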
Appendix D
Algorithm | Case Studied | DP | MAE | MSE | RMSE | R2 | ME |
---|---|---|---|---|---|---|---|
Decision Tree | Day of Month + Hour | 1 | 7.93 | 144.11 | 12 | 0.37 | 139.67 |
Decision Tree | Day of Month + Hour | 2 | 7.55 | 135.69 | 11.65 | 0.33 | 115.26 |
Decision Tree | Day of Month + Hour | 3 | 5.87 | 81.61 | 9.03 | 0.4 | 112.71 |
Decision Tree | Day of Month + Hour | 4 | 9.53 | 191.13 | 13.83 | 0.27 | 135.22 |
Decision Tree | Day of Month + Hour | 5 | 9.91 | 249.62 | 15.8 | 0.3 | 231.91 |
Decision Tree | Day of Month + Hour | 6 | 7.45 | 114.49 | 10.7 | 0.35 | 113.71 |
Decision Tree | Day of Week + Hour | 1 | 7.56 | 133.46 | 11.55 | 0.41 | 138.76 |
Decision Tree | Day of Week + Hour | 2 | 7.27 | 127.98 | 11.31 | 0.37 | 117.14 |
Decision Tree | Day of Week + Hour | 3 | 5.6 | 76.58 | 8.75 | 0.44 | 111.24 |
Decision Tree | Day of Week + Hour | 4 | 9.02 | 176.85 | 13.3 | 0.33 | 131.3 |
Decision Tree | Day of Week + Hour | 5 | 9.45 | 238.4 | 15.44 | 0.33 | 229.06 |
Decision Tree | Day of Week + Hour | 6 | 7.19 | 108.41 | 10.41 | 0.39 | 119.44 |
Decision Tree | Holiday + Hour | 1 | 7.83 | 140.39 | 11.85 | 0.38 | 138.91 |
Decision Tree | Holiday + Hour | 2 | 7.49 | 132.5 | 11.51 | 0.35 | 117.79 |
Decision Tree | Holiday + Hour | 3 | 5.78 | 79.79 | 8.93 | 0.42 | 114.22 |
Decision Tree | Holiday + Hour | 4 | 9.35 | 186.14 | 13.64 | 0.29 | 132.4 |
Decision Tree | Holiday + Hour | 5 | 9.76 | 243.78 | 15.61 | 0.32 | 230.56 |
Decision Tree | Holiday + Hour | 6 | 7.35 | 110.86 | 10.53 | 0.37 | 109.3 |
Decision Tree | Weekend + Hour | 1 | 7.65 | 135.29 | 11.63 | 0.4 | 138.98 |
Decision Tree | Weekend + Hour | 2 | 7.3 | 129.12 | 11.36 | 0.36 | 117.13 |
Decision Tree | Weekend + Hour | 3 | 5.62 | 77.24 | 8.79 | 0.43 | 111.97 |
Decision Tree | Weekend + Hour | 4 | 9.06 | 178.45 | 13.36 | 0.32 | 132.66 |
Decision Tree | Weekend + Hour | 5 | 9.5 | 239.62 | 15.48 | 0.33 | 228.87 |
Decision Tree | Weekend + Hour | 6 | 7.25 | 109.57 | 10.47 | 0.38 | 114.44 |
Random Forest | Day of Month + Hour | 1 | 7.9 | 142.58 | 11.94 | 0.37 | 138.45 |
Random Forest | Day of Month + Hour | 2 | 7.55 | 133.59 | 11.56 | 0.34 | 115.16 |
Random Forest | Day of Month + Hour | 3 | 5.88 | 81.35 | 9.02 | 0.4 | 115.56 |
Random Forest | Day of Month + Hour | 4 | 9.46 | 188.06 | 13.71 | 0.28 | 132.38 |
Random Forest | Day of Month + Hour | 5 | 9.86 | 245.61 | 15.67 | 0.31 | 232.86 |
Random Forest | Day of Month + Hour | 6 | 7.4 | 111.6 | 10.56 | 0.37 | 111.23 |
Random Forest | Day of Week + Hour | 1 | 7.83 | 140.54 | 11.86 | 0.38 | 138.45 |
Random Forest | Day of Week + Hour | 2 | 7.41 | 130.69 | 11.43 | 0.35 | 116.1 |
Random Forest | Day of Week + Hour | 3 | 5.8 | 80.36 | 8.96 | 0.41 | 115.57 |
Random Forest | Day of Week + Hour | 4 | 9.15 | 180.01 | 13.42 | 0.31 | 131.62 |
Random Forest | Day of Week + Hour | 5 | 9.61 | 241.94 | 15.55 | 0.32 | 231.05 |
Random Forest | Day of Week + Hour | 6 | 7.34 | 110.52 | 10.51 | 0.38 | 111.55 |
Random Forest | Holiday + Hour | 1 | 7.9 | 142.53 | 11.94 | 0.37 | 138.45 |
Random Forest | Holiday + Hour | 2 | 7.54 | 133.43 | 11.55 | 0.34 | 117.65 |
Random Forest | Holiday + Hour | 3 | 5.88 | 81.33 | 9.02 | 0.4 | 115.56 |
Random Forest | Holiday + Hour | 4 | 9.45 | 187.72 | 13.7 | 0.28 | 132.38 |
Random Forest | Holiday + Hour | 5 | 9.85 | 245.58 | 15.67 | 0.31 | 232.86 |
Random Forest | Holiday + Hour | 6 | 7.39 | 111.49 | 10.56 | 0.37 | 111.23 |
Random Forest | Weekend + Hour | 1 | 7.83 | 140.55 | 11.86 | 0.38 | 138.45 |
Random Forest | Weekend + Hour | 2 | 7.43 | 131.42 | 11.46 | 0.35 | 117.08 |
Random Forest | Weekend + Hour | 3 | 5.8 | 80.41 | 8.97 | 0.41 | 115.57 |
Random Forest | Weekend + Hour | 4 | 9.16 | 180.15 | 13.42 | 0.31 | 131.62 |
Random Forest | Weekend + Hour | 5 | 9.62 | 242.32 | 15.57 | 0.32 | 230.94 |
Random Forest | Weekend + Hour | 6 | 7.35 | 110.6 | 10.52 | 0.38 | 111.23 |
| Equip ID | Date | Real Increment |
|---|---|---|
| 63 | 2019-01-08 00:00:00 | 0 |
| 63 | 2019-01-08 13:20:00 | 1400 |
| 63 | 2019-01-08 13:30:00 | 1500 |
| 63 | 2019-01-08 14:00:00 | 1400 |
| 63 | 2019-01-08 14:35:00 | 0 |
| 63 | 2019-01-08 16:25:00 | 1400 |
| 63 | 2019-01-08 17:20:00 | 1500 |
| 63 | 2019-01-08 17:40:00 | 0 |
| 63 | 2019-01-08 18:30:00 | 1400 |
| 63 | 2019-01-08 18:35:00 | 0 |
| Date | Hour | Equip ID | Real Increment | Datetime | Year | Month | Day |
|---|---|---|---|---|---|---|---|
| 2019-01-08 | 0 | 63 | 0 | 2019-01-08T00:00 | 2019 | 1 | 8 |
| 2019-01-08 | 1 | 63 | 0 | 2019-01-08T01:00 | 2019 | 1 | 8 |
| 2019-01-08 | 2 | 63 | 0 | 2019-01-08T02:00 | 2019 | 1 | 8 |
| 2019-01-08 | 3 | 63 | 0 | 2019-01-08T03:00 | 2019 | 1 | 8 |
| 2019-01-08 | 4 | 63 | 0 | 2019-01-08T04:00 | 2019 | 1 | 8 |
| 2019-01-08 | 5 | 63 | 0 | 2019-01-08T05:00 | 2019 | 1 | 8 |
| 2019-01-08 | 6 | 63 | 0 | 2019-01-08T06:00 | 2019 | 1 | 8 |
| 2019-01-08 | 7 | 63 | 0 | 2019-01-08T07:00 | 2019 | 1 | 8 |
| 2019-01-08 | 8 | 63 | 0 | 2019-01-08T08:00 | 2019 | 1 | 8 |
| 2019-01-08 | 9 | 63 | 0 | 2019-01-08T09:00 | 2019 | 1 | 8 |
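The hourly table above can be derived from the raw discharge records by summing increments into hourly bins and reindexing onto a complete hourly grid, so hours with no discharge appear as 0. A minimal pandas sketch (column names and values are illustrative, not the authors' actual pipeline):

```python
import pandas as pd

# Hypothetical raw discharge records for one DP (Equip ID 63),
# mirroring the structure of the first table above.
raw = pd.DataFrame({
    "equip_id": [63, 63, 63],
    "date": pd.to_datetime([
        "2019-01-08 13:20:00",
        "2019-01-08 13:30:00",
        "2019-01-08 16:25:00",
    ]),
    "real_increment": [1400, 1500, 1400],
})

# Sum increments per hour, then reindex onto a full-day hourly grid
# so hours without any discharge event become 0 rather than missing.
hourly = (
    raw.set_index("date")["real_increment"]
       .resample("1h").sum()
       .reindex(pd.date_range("2019-01-08 00:00", "2019-01-08 23:00",
                              freq="h"), fill_value=0)
)

print(hourly.loc["2019-01-08 13:00:00"])  # 2900 (1400 + 1500)
```

Calendar features (year, month, day, hour) can then be read directly off the hourly index.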
| | MAE | MSE | RMSE | R2 | ME |
|---|---|---|---|---|---|
| Average | 11.24 | 388.89 | 16.46 | −0.8 | 152.43 |
| Min value | 5.6 | 76.57 | 8.75 | −61.21 | 109.3 |
| Max value | 70.43 | 11,013.97 | 104.95 | 0.44 | 490 |
| Deviation | 6.919 | 962.841 | 10.881 | 5.149 | 57.405 |
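The five metrics in the table can be computed with scikit-learn; here ME is taken to be scikit-learn's `max_error` (the largest single prediction error), which is an assumption consistent with the scale of the reported values. A sketch on illustrative data:

```python
import numpy as np
from sklearn.metrics import (mean_absolute_error, mean_squared_error,
                             r2_score, max_error)

# Hypothetical hourly observations vs. predictions for one DP.
y_true = np.array([0.0, 0.0, 29.0, 14.0, 0.0, 15.0])
y_pred = np.array([1.0, 0.0, 25.0, 16.0, 2.0, 12.0])

mae  = mean_absolute_error(y_true, y_pred)   # mean of |error|
mse  = mean_squared_error(y_true, y_pred)    # mean of error^2
rmse = np.sqrt(mse)                          # same units as the target
r2   = r2_score(y_true, y_pred)              # 1 = perfect, <0 = worse than mean
me   = max_error(y_true, y_pred)             # worst single-hour error
```

RMSE penalises large errors more than MAE, which is why the two diverge on DPs with occasional large discharges.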
| Algorithm | MAE | MSE | RMSE | R2 | ME |
|---|---|---|---|---|---|
| Decision Tree | 7.84 | 147.54 | 11.96 | 0.36 | 140.94 |
| Elastic Net | 11.2 | 221.17 | 14.7 | 0.03 | 147.75 |
| Gaussian Process | 7.85 | 147.56 | 11.96 | 0.36 | 140.94 |
| KNN | 8.8 | 183.98 | 13.32 | 0.2 | 142.53 |
| Lasso | 11.18 | 220.99 | 14.69 | 0.03 | 147.85 |
| Linear Regression | 11.18 | 220.99 | 14.69 | 0.03 | 147.86 |
| Logistic Regression | 12.28 | 372.23 | 19.07 | −0.64 | 161.33 |
| Naïve Bayes | 27.94 | 2443.97 | 44.73 | −10.81 | 222.83 |
| Polynomial Regression | 9.49 | 178.78 | 13.18 | 0.22 | 143.24 |
| Random Forest | 7.93 | 148.93 | 12.02 | 0.35 | 140.96 |
| Ridge | 11.18 | 220.99 | 14.69 | 0.03 | 147.86 |
| SVR | 8.02 | 159.55 | 12.48 | 0.3 | 145.11 |
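A comparison like the one above can be reproduced in outline by fitting each candidate scikit-learn regressor on the same split and scoring it. The data below is synthetic (a weekly pattern plus noise standing in for the DP datasets), so the numbers will not match the table:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, Ridge, Lasso, ElasticNet
from sklearn.neighbors import KNeighborsRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)
# Synthetic stand-in: features are (day of week, hour).
X = np.column_stack([rng.integers(0, 7, 500), rng.integers(0, 24, 500)])
y = 10 * np.sin(2 * np.pi * X[:, 1] / 24) + X[:, 0] + rng.normal(0, 1, 500)

X_train, X_test = X[:400], X[400:]
y_train, y_test = y[:400], y[400:]

models = {
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(), "Lasso": Lasso(), "Elastic Net": ElasticNet(),
    "KNN": KNeighborsRegressor(), "SVR": SVR(),
}
# Fit every model on the same split and collect one score per model.
scores = {name: mean_absolute_error(y_test,
                                    m.fit(X_train, y_train).predict(X_test))
          for name, m in models.items()}
```

Keeping the split and the metric fixed across models is what makes the per-algorithm numbers directly comparable.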
| Algorithm | DP | MAE | MSE | RMSE | R2 | ME |
|---|---|---|---|---|---|---|
| Decision Tree | 1 | 7.74 | 138.31 | 11.76 | 0.39 | 139.08 |
| | 2 | 7.4 | 131.32 | 11.46 | 0.35 | 116.83 |
| | 3 | 5.72 | 78.81 | 8.88 | 0.42 | 112.54 |
| | 4 | 9.24 | 183.14 | 13.53 | 0.3 | 132.89 |
| | 5 | 9.65 | 242.86 | 15.58 | 0.32 | 230.1 |
| | 6 | 7.31 | 110.83 | 10.53 | 0.37 | 114.22 |
| Random Forest | 1 | 7.86 | 141.55 | 11.9 | 0.38 | 138.45 |
| | 2 | 7.48 | 132.28 | 11.5 | 0.34 | 116.5 |
| | 3 | 5.84 | 80.86 | 8.99 | 0.4 | 115.56 |
| | 4 | 9.3 | 183.98 | 13.56 | 0.3 | 132 |
| | 5 | 9.74 | 243.86 | 15.62 | 0.32 | 231.93 |
| | 6 | 7.37 | 111.05 | 10.54 | 0.38 | 111.31 |
| Algorithm | Case Studied | MAE | MSE | RMSE | R2 | ME |
|---|---|---|---|---|---|---|
| Decision Tree | Day of Month + Hour | 8.04 | 152.78 | 12.17 | 0.34 | 141.41 |
| | Day of Week + Hour | 7.68 | 143.61 | 11.79 | 0.38 | 141.16 |
| | Holiday + Hour | 7.93 | 148.91 | 12.01 | 0.36 | 140.53 |
| | Weekend + Hour | 7.73 | 144.88 | 11.85 | 0.37 | 140.67 |
| Random Forest | Day of Month + Hour | 8.01 | 150.46 | 12.08 | 0.34 | 140.94 |
| | Day of Week + Hour | 7.86 | 147.34 | 11.96 | 0.36 | 140.72 |
| | Holiday + Hour | 8 | 150.35 | 12.07 | 0.34 | 141.36 |
| | Weekend + Hour | 7.86 | 147.57 | 11.97 | 0.36 | 140.82 |
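The four scenarios pair the hour with one calendar feature each, all of which can be derived from the timestamp alone. A sketch assuming a pandas datetime index and a hypothetical local holiday list:

```python
import pandas as pd

# Hypothetical two-day hourly index for one DP.
idx = pd.date_range("2019-01-01", periods=48, freq="h")
df = pd.DataFrame(index=idx)

df["hour"] = idx.hour
df["day_of_week"] = idx.dayofweek                      # Day of Week + Hour
df["day_of_month"] = idx.day                           # Day of Month + Hour
df["is_weekend"] = (idx.dayofweek >= 5).astype(int)    # Weekend + Hour

# Holiday + Hour: membership in a (hypothetical) local holiday calendar.
holidays = {pd.Timestamp("2019-01-01").date()}
df["is_holiday"] = pd.Series(idx.date, index=idx).isin(holidays).astype(int)
```

Since each scenario uses only two columns, switching scenarios amounts to selecting a different feature pair before fitting the same model.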
| Algorithm | Case Studied | DP | MAE | MSE | RMSE | R2 | ME |
|---|---|---|---|---|---|---|---|
| Decision Tree | Day of Week + Hour | 3 | 5.6 | 76.58 | 8.75 | 0.44 | 111.24 |
| Decision Tree | Weekend + Hour | 3 | 5.62 | 77.24 | 8.79 | 0.43 | 111.97 |
| Decision Tree | Holiday + Hour | 3 | 5.78 | 79.79 | 8.93 | 0.42 | 114.22 |
| Decision Tree | Day of Week + Hour | 1 | 7.56 | 133.46 | 11.55 | 0.41 | 138.76 |
| Random Forest | Day of Week + Hour | 3 | 5.8 | 80.36 | 8.96 | 0.41 | 115.57 |
| Random Forest | Weekend + Hour | 3 | 5.8 | 80.41 | 8.97 | 0.41 | 115.57 |
| Decision Tree | Weekend + Hour | 1 | 7.65 | 135.29 | 11.63 | 0.4 | 138.98 |
| Decision Tree | Day of Month + Hour | 3 | 5.87 | 81.61 | 9.03 | 0.4 | 112.71 |
| Random Forest | Day of Month + Hour | 3 | 5.88 | 81.35 | 9.02 | 0.4 | 115.56 |
| Random Forest | Holiday + Hour | 3 | 5.88 | 81.33 | 9.02 | 0.4 | 115.56 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
De-la-Mata-Moratilla, S.; Gutierrez-Martinez, J.-M.; Castillo-Martinez, A.; Caro-Alvaro, S. Prediction of the Behaviour from Discharge Points for Solid Waste Management. Mach. Learn. Knowl. Extr. 2024, 6, 1389-1412. https://doi.org/10.3390/make6030066