Improving Road Traffic Forecasting Using Air Pollution and Atmospheric Data: Experiments Based on LSTM Recurrent Neural Networks
Abstract
1. Introduction
1.1. Motivation
- The concentration of pollutants is greater close to big roads [17] (this is also why we tried to consider traffic intensity sensors close to the pollution sensors).
- The composition of the vehicle fleet on the road may vary across days (more diesel, more electric, and so on).
- The pollution level generated can be affected by meteorological conditions.
1.2. Contribution
- We provide a detailed statistical analysis based on the relationship between air pollutants, atmospheric variables, and road traffic;
- To the best of our knowledge, this is the first attempt to use air pollutants in combination with atmospheric variables to improve traffic forecasting in a smart city;
- Our approach uses a well-known LSTM RNN for time-series traffic data forecasting; and
- We provide some proof of the validity of our approach and avenues for future work.
1.3. Organization
2. Related Work
3. Methodology
3.1. Statistical Analysis
3.2. Linear Interpolation
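As a minimal sketch of what linear interpolation does to gaps in an hourly sensor series, missing readings encoded as NaN can be filled from their nearest valid neighbours. This is an illustration only: the helper name `fill_gaps_linear` and the sample values are hypothetical and are not taken from the dataset.

```python
import numpy as np

def fill_gaps_linear(values):
    """Fill NaN entries by linear interpolation between valid neighbours."""
    series = np.asarray(values, dtype=float)
    idx = np.arange(series.size)
    valid = ~np.isnan(series)
    if not valid.any():
        raise ValueError("series has no valid readings to interpolate from")
    # np.interp holds the nearest valid value constant at the series edges.
    return np.interp(idx, idx[valid], series[valid])

# Made-up hourly traffic flow with a two-hour gap.
hourly_flow = [100.0, np.nan, np.nan, 160.0, 150.0]
filled = fill_gaps_linear(hourly_flow)
```

The two missing hours are placed on the straight line between the readings on either side of the gap, which is reasonable for short gaps but smooths away any peak that occurred inside a long one.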
3.3. Traffic Forecasting Using LSTM Recurrent Neural Network
3.4. Data Normalization
Min-Max Normalization
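Min-max normalization rescales each feature to the range [0, 1] via x' = (x - min) / (max - min). The sketch below shows one way to apply it per feature column. The function names are hypothetical, and computing the minimum and maximum on the training split only is an assumption of this example, not something the text states.

```python
import numpy as np

def min_max_fit(train):
    """Return the per-feature minimum and range of the training data."""
    train = np.asarray(train, dtype=float)
    lo = train.min(axis=0)
    span = train.max(axis=0) - lo
    # A constant feature has zero range; use 1.0 to avoid division by zero.
    span = np.where(span == 0, 1.0, span)
    return lo, span

def min_max_apply(data, lo, span):
    """Scale data with previously fitted statistics."""
    return (np.asarray(data, dtype=float) - lo) / span

def min_max_invert(scaled, lo, span):
    """Map scaled values back to the original units."""
    return np.asarray(scaled, dtype=float) * span + lo

# Made-up two-feature example (e.g., a temperature and a flow column).
train = np.array([[0.0, 10.0], [5.0, 20.0], [10.0, 30.0]])
lo, span = min_max_fit(train)
scaled = min_max_apply(train, lo, span)
restored = min_max_invert(scaled, lo, span)
```

Keeping `lo` and `span` around makes it possible to convert predictions back to vehicles per hour after forecasting.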
3.5. Hyperparameter
- 3 LSTM layers;
- Dropout: To keep the model from overfitting, we applied dropout [46] at each LSTM layer with a value of ;
- Early Stopping: To stop the training before the model approaches overfitting, we used early stopping [47] with a patience value of 5;
- Look Back Steps: To predict at time t, the “look back” value sets how many previous time steps the model considers. We set it to 168, the number of hours in one week. The aim is to capture the evolution of the air pollutants over a period in which different but recurring patterns occur, e.g., working-day traffic vs. weekend traffic, since traffic intensity differs between weekdays and weekends. Such a period also gives the pollutants time to consolidate (some pollutants can float for hours or more) and may therefore yield better forecasts. Pollution “signatures” reflect longer and more complex situations: a week within a particular month (e.g., December before Christmas) can be characterized by a higher volume of traffic and hence pollution, and different months can have very different levels of traffic and pollution. One week makes it possible to grasp these variations while keeping the observation and data-capture period short. A longer period (e.g., a month) would allow a more specific characterization of the traffic in that month, and the related pollution signature could help the prediction; a shorter period (one or two days) cannot capture these variations in traffic intensity and pollution measurements. The choice of one week is a starting point, and further work could tune this period.
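The look-back setting above can be sketched as a sliding window over the hourly series: each training sample holds the previous 168 hours of all features, and its label is the traffic flow of the following hour. This is a minimal illustration under stated assumptions; the function name `make_windows`, the three-feature random data, and the one-hour-ahead target are choices of this example rather than details given in the text.

```python
import numpy as np

LOOK_BACK = 168  # hours in one week, as stated in the text

def make_windows(features, target, look_back=LOOK_BACK):
    """Build (samples, look_back, n_features) inputs and next-hour targets."""
    features = np.asarray(features, dtype=float)
    target = np.asarray(target, dtype=float)
    if features.ndim != 2 or len(features) != len(target):
        raise ValueError("features must be 2-D and aligned with target")
    n_samples = len(features) - look_back
    if n_samples <= 0:
        raise ValueError("series is shorter than the look-back window")
    # Sample i covers hours i .. i + look_back - 1.
    X = np.stack([features[i:i + look_back] for i in range(n_samples)])
    # Its label is the target at hour i + look_back.
    y = target[look_back:]
    return X, y

# Two weeks of made-up hourly data with three hypothetical features.
rng = np.random.default_rng(0)
features = rng.random((24 * 14, 3))
target = rng.random(24 * 14)
X, y = make_windows(features, target)
```

With two weeks of data this gives 168 samples of shape (168, 3). The three-dimensional layout (samples, time steps, features) is the input shape that recurrent layers in common deep learning libraries expect.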
4. Dataset and Performance Evaluation
4.1. Evaluation Metrics
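The results are reported as MAE and MSE per traffic flow sensor. As a reminder of the standard definitions, MAE is the mean of the absolute errors and MSE is the mean of the squared errors. The sketch below uses hypothetical function names and made-up values.

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean absolute error: average of |y_true - y_pred|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    """Mean squared error: average of (y_true - y_pred)^2."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up observed and predicted values.
observed = [0.2, 0.5, 0.9]
predicted = [0.1, 0.5, 0.7]
example_mae = mae(observed, predicted)
example_mse = mse(observed, predicted)
```

Because MSE squares each error, it penalizes a few large misses more heavily than MAE does, so the two metrics can rank sensors differently.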
4.2. Results
4.3. Further Evaluation
- Overfitting occurs when the model fits the training data too closely, so that it performs poorly when unseen data are provided for prediction. It can be diagnosed by plotting learning curves: if the training loss keeps decreasing while the validation loss starts increasing after a specific point, the model is overfitting [51].
- Underfitting represents the inability of the model to learn from the training data. A model is underfitting if its learning curve shows either of the following two behaviors:
  - Validation loss is very high and training loss is flat regardless of training time.
  - Training loss is still decreasing, without stabilizing, when training ends.
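The overfitting check described above, together with a patience of 5, can be sketched on plain lists of per-epoch losses. This is a simplified heuristic for illustration and not the authors' implementation; the function names and the loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience=5):
    """Return the epoch index at which training would stop, or None."""
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best = loss
            waited = 0  # improvement: reset the patience counter
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return None

def looks_overfit(train_losses, val_losses, patience=5):
    """True if validation loss stalled while training loss kept falling."""
    stop = early_stopping_epoch(val_losses, patience)
    if stop is None:
        return False
    best_epoch = stop - patience  # last epoch with a validation improvement
    return train_losses[stop] < train_losses[best_epoch]

# Made-up curves: validation loss bottoms out at epoch 3, then rises.
train_curve = [1.0, 0.8, 0.6, 0.5, 0.4, 0.35, 0.3, 0.25, 0.2, 0.15]
val_curve = [1.1, 0.9, 0.7, 0.65, 0.7, 0.75, 0.8, 0.85, 0.9, 0.95]
stop_epoch = early_stopping_epoch(val_curve)
overfit = looks_overfit(train_curve, val_curve)
```

In this example the validation loss last improves at epoch 3, so with a patience of 5 training stops at epoch 8, while the training loss has continued to fall in the meantime.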
4.4. Threat to Validity
5. Conclusions
Future Work
Author Contributions
Funding
Conflicts of Interest
References
- Schmidt, J.M.; Tendwa, O.; Bruwer, M.M. Traffic impact of the its time event. In Proceedings of the 37th Annual Southern African Transport Conference, Pretoria, South Africa, 9–12 July 2018; pp. 704–716.
- Kuang, Y.; Yen, B.T.; Suprun, E.; Sahin, O. A soft traffic management approach for achieving environmentally sustainable and economically viable outcomes: An Australian case study. J. Environ. Manag. 2019, 237, 379–386.
- Bogaerts, T.; Masegosa, A.D.; Angarita-Zapata, J.S.; Onieva, E.; Hellinckx, P. A graph CNN-LSTM neural network for short and long-term traffic forecasting based on trajectory data. Transp. Res. Part C Emerg. Technol. 2020, 112, 62–77.
- Lazić, L.; Urošević, M.A.; Mijić, Z.; Vuković, G.; Ilić, L. Traffic contribution to air pollution in urban street canyons: Integrated application of the OSPM, moss biomonitoring and spectral analysis. Atmos. Environ. 2016, 141, 347–360.
- World Health Organization. Air Pollution. Available online: https://www.euro.who.int/en/health-topics/environment-and-health/Transport-and-health/data-and-statistics/air-pollution-and-climate-change2 (accessed on 27 March 2020).
- Analyzing Traffic Flows in Madrid City. Available online: https://ec.europa.eu/eurostat/cros/system/files/s06p2-analizing-traffic-flows-in-madrid-city.pdf (accessed on 23 June 2020).
- Maciag, P.S.; Kasabov, N.; Kryszkiewicz, M.; Bembenik, R. Air pollution prediction with clustering-based ensemble of evolving spiking neural networks and a case study for London area. Environ. Mod. Soft. 2019, 118, 262–280.
- Rosenlund, M.; Forastiere, F.; Stafoggia, M.; Porta, D.; Perucci, M.; Ranzi, A.; Nussio, F.; Perucci, C.A. Comparison of regression models with land-use and emissions data to predict the spatial distribution of traffic-related air pollution in Rome. J. Expo. Sci. Environ. Epidem. 2008, 18, 192–199.
- Crouse, D.L.; Goldberg, M.S.; Ross, N.A. A prediction-based approach to modelling temporal and spatial variability of traffic-related air pollution in Montreal, Canada. Atmos. Environ. 2009, 43, 5075–5084.
- Batterman, S.; Ganguly, R.; Harbin, P. High resolution spatial and temporal mapping of traffic-related air pollutants. Int. J. Environ. Res. Public Health 2015, 12, 3646–3666.
- Ly, H.B.; Le, L.M.; Phi, L.V.; Phan, V.H.; Tran, V.Q.; Pham, B.T.; Le, T.T.; Derrible, S. Development of an AI model to measure traffic air pollution from multisensor and weather data. Sensors 2019, 19, 4941.
- Laña, I.; Del Ser, J.; Padró, A.; Vélez, M.; Casanova-Mateo, C. The role of local urban traffic and meteorological conditions in air pollution: A data-based case study in Madrid, Spain. Atmos. Environ. 2016, 145, 424–438.
- Russo, A.; Lind, P.G.; Raischel, F.; Trigo, R.; Mendes, M. Neural network forecast of daily pollution concentration using optimal meteorological data at synoptic and local scales. Atmos. Pollut. Res. 2015, 6, 540–549.
- Brunello, A.; Kamińska, J.; Marzano, E.; Montanari, A.; Sciavicco, G.; Turek, T. Assessing the Role of Temporal Information in Modelling Short-Term Air Pollution Effects Based on Traffic and Meteorological Conditions: A Case Study in Wrocław. In Proceedings of the European Conference on Advances in Databases and Information Systems, Bled, Slovenia, 8–11 September 2019; pp. 463–474.
- World Economic Forum, This Is Why People Live, Work, and Stay in a Growing City. Available online: https://www.weforum.org/agenda/2018/10/this-is-why-people-live-work-stay-leave-in-growing-city/ (accessed on 27 March 2020).
- Pant, P.; Shi, Z.; Pope, F.D.; Harrison, R.M. Characterization of traffic-related particulate matter emissions in a road tunnel in Birmingham, UK: Trace metals and organic molecular markers. Aerosol. Air. Qual. Res. 2016, 17, 117–130.
- Zhang, X.; Craft, E.; Zhang, K. Characterizing spatial variability of air pollution from vehicle traffic around the Houston Ship Channel area. Atmos. Environ. 2017, 161, 167–175.
- Laput, G.; Zhang, Y.; Harrison, C. Synthetic sensors: Towards general-purpose sensing. In Proceedings of the 1st CHI Conference on Human Factors in Computing Systems, Colorado, CO, USA, 6–11 May 2017; pp. 3986–3999.
- Guo, T.; Xu, Z.; Yao, X.; Chen, H.; Aberer, K.; Funaya, K. Robust online time-series prediction with recurrent neural networks. In Proceedings of the IEEE International Conference on Data Science and Advanced Analytics, Montreal, QC, Canada, 17–19 October 2016; pp. 816–825.
- Ji, B.; Hong, E.J. Deep-learning-based real-time road traffic prediction using long-term evolution access data. Sensors 2019, 19, 5327.
- Wei, W.; Wu, H.; Ma, H. An autoencoder and LSTM-based traffic flow prediction method. Sensors 2019, 19, 2946.
- Li, Y.; Shahabi, C. A brief overview of machine learning methods for short-term traffic forecasting and future directions. Sigspatial Spec. 2018, 10, 3–9.
- Ketabi, R.; Al-Qathrady, M.; Alipour, B.; Helmy, A. Vehicular Traffic Density Forecasting through the Eyes of Traffic Cameras; a Spatio-Temporal Machine Learning Study. In Proceedings of the 9th ACM Symposium on Design and Analysis of Intelligent Vehicular Networks and Applications, Miami, FL, USA, 25–29 November 2019; pp. 81–88.
- Zhu, D.; Du, H.; Sun, Y.; Cao, N. Research on path planning model based on short-term traffic flow prediction in intelligent transportation system. Sensors 2018, 18, 4275.
- Hou, Q.; Leng, J.; Ma, G.; Liu, W.; Cheng, Y. An adaptive hybrid model for short-term urban traffic flow prediction. Phys. A Stat. Mech. Appl. 2019, 527, 121065.
- Tang, J.; Chen, X.; Hu, Z.; Zong, F.; Han, C.; Li, L. Traffic flow prediction based on combination of support vector machine and data denoising schemes. Phys. A Stat. Mech. Appl. 2019, 534, 120642.
- Wang, W.; Zhang, H.; Li, T.; Guo, J.; Huang, W.; Wei, Y.; Cao, J. An interpretable model for short term traffic flow prediction. Math. Comp. Simul. 2019, 171, 264–278.
- Rajabzadeh, Y.; Rezaie, A.H.; Amindavar, H. Short-term traffic flow prediction using time-varying Vasicek model. Transp. Res. Part C Emerg. Technol. 2017, 74, 168–181.
- Goudarzi, S.; Kama, M.N.; Anisi, M.H.; Soleymani, S.A.; Doctor, F. Self-organizing traffic flow prediction with an optimized deep belief network for internet of vehicles. Sensors 2018, 18, 3459.
- Abadi, A.; Rajabioun, T.; Ioannou, P.A. Traffic flow prediction for road transportation networks with limited traffic data. IEEE Trans. Intell. Transp. Syst. 2014, 16, 653–662.
- Zhang, D.; Kabuka, M.R. Combining weather condition data to predict traffic flow: A GRU-based deep learning approach. IET Intell. Transp. Syst. 2018, 12, 578–585.
- Analyzing Traffic Flows in Madrid City. Available online: https://eprints.ucm.es/49461/1/TFM-201809-4.0%20-%20Pina%20Lagunas%20-%20Sergio.pdf (accessed on 23 June 2020).
- Tsirigotis, L.; Vlahogianni, E.I.; Karlaftis, M.G. Does information on weather affect the performance of short-term traffic forecasting models? Int. J. Intell. Transp. Syst. Res. 2012, 10, 1–10.
- Xu, X.; Su, B.; Zhao, X.; Xu, Z.; Sheng, Q.Z. Effective traffic flow forecasting using taxi and weather data. In Proceedings of the International Conference on Advanced Data Mining and Applications, Gold Coast, Australia, 12–15 December 2016; pp. 507–519.
- European Commission Directorate-General for the Environment. Available online: https://ec.europa.eu/environment/pubs/pdf/streets-people.pdf (accessed on 7 May 2020).
- Badii, C.; Nesi, P.; Paoli, I. Predicting available parking slots on critical and regular services by exploiting a range of open data. IEEE Access 2018, 6, 44059–44071.
- Open data portal of the Madrid City Council. Available online: https://datos.madrid.es/portal/site/egob (accessed on 2 February 2020).
- Baldauf, R.; Watkins, N.; Heist, D.; Bailey, C.; Rowley, P.; Shores, R. Near-road air quality monitoring: Factors affecting network design and interpretation of data. Air Qual. Atmos. Health 2009, 2, 1–9.
- Che, Z.; Purushotham, S.; Cho, K.; Sontag, D.; Liu, Y. Recurrent neural networks for multivariate time-series with missing values. Sci. Rep. 2018, 8, 6085.
- Li, L.; Zhang, J.; Wang, Y.; Ran, B. Missing value imputation for traffic-related time-series data based on a multi-view learning method. IEEE Trans. Intell. Transp. Syst. 2018, 20, 2933–2943.
- Usman, K.; Ramdhani, M. Comparison of Classical Interpolation Methods and Compressive Sensing for Missing Data Reconstruction. In Proceedings of the IEEE International Conference on Signals and Systems, Bandung, Indonesia, 16–18 July 2019; pp. 29–33.
- Zhao, Z.; Chen, W.; Wu, X.; Chen, P.C.; Liu, J. LSTM network: A deep learning approach for short-term traffic forecast. IET Intell. Transp. Syst. 2017, 11, 68–75.
- Ma, X.; Tao, Z.; Wang, Y.; Yu, H.; Wang, Y. Long short-term memory neural network for traffic speed prediction using remote microwave sensor data. Transp. Res. Part C Emerg. Technol. 2015, 54, 187–197.
- Nayak, S.C.; Misra, B.B.; Behera, H.S. Impact of data normalization on stock index forecasting. Int. J. Comput. Inf. Syst. Ind. Manag. Appl. 2014, 6, 357–369.
- Gajera, V.; Gupta, R.; Jana, P.K. An effective multi-objective task scheduling algorithm using min-max normalization in cloud computing. In Proceedings of the 2nd International Conference on Applied and Theoretical Computing and Communication Technology, Bengaluru, India, 21–23 July 2016; pp. 812–816.
- Srivastava, N.; Hinton, G.; Krizhevsky, A.; Sutskever, I.; Salakhutdinov, R. Dropout: A simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 2014, 15, 1929–1958.
- Prechelt, L. Neural Networks: Tricks of the Trade; Springer: Heidelberg, Germany, 1998.
- Zhang, L.; Liu, Q.; Yang, W.; Wei, N.; Dong, D. An improved k-nearest neighbor model for short-term traffic flow prediction. Procedia-Soc. Behav. Sci. 2013, 96, 653–662.
- Li, L.; Su, X.; Wang, Y.; Lin, Y.; Li, Z.; Li, Y. Robust causal dependence mining in big data network and its application to traffic flow predictions. Transp. Res. Part C Emerg. Technol. 2015, 58, 292–307.
- Perlich, C.; Provost, F.; Simonoff, J.S. Tree induction vs. logistic regression: A learning-curve analysis. J. Mach. Learn. Res. 2003, 4, 211–255.
- Perlich, C. Encyclopedia of Machine Learning; Springer: Boston, MA, USA, 2011.
- Nimesh, V.; Sharma, D.; Reddy, V.M.; Goswami, A.K. Implication viability assessment of shift to electric vehicles for present power generation scenario of India. Energy 2020, 195, 116976.
- Awan, F.M.; Saleem, Y.; Minerva, R.; Crespi, N. A Comparative Analysis of Machine/Deep Learning Models for Parking Space Availability Prediction. Sensors 2020, 20, 322.
- Makridakis, S.; Spiliotis, E.; Assimakopoulos, V. Statistical and Machine Learning forecasting methods: Concerns and ways forward. PLoS ONE 2018, 13, e0194889.
- Open Data Portal of the Barcelona City. Available online: https://opendata-ajuntament.barcelona.cat/data/es/dataset (accessed on 25 March 2020).
- Open data portal of the Turin City. Available online: https://www.torinocitylab.it/en/assetto/open-data (accessed on 25 March 2020).

Feature | Value/Unit |
---|---|
Month | 1–12 |
Day | 1–28/29/30/31 |
Weekday | 1–7 |
Hour | 0–23 |
 | mg/m³ |
 | µg/m³ |
 | µg/m³ |
 | µg/m³ |
 | µg/m³ |
Pressure | mb |
Temperature | |
Wind Direction | Angle |
Wind Speed | m/s |
Traffic Flow | Vehicles/h |

Air Pollution Sensor Station | Traffic Flow Sensor | Distance from Air Pollution Sensor Station | Minimum Flow (Annual) | Maximum Flow (Annual) | Average Flow (Annual) |
---|---|---|---|---|---|
28079016 | 6037 | 240 m | 0 | 384 | 112.344 |
28079016 | 3791 | 79 m | 4 | 1601 | 493.693 |
28079016 | 3775 | 294 m | 17 | 1166 | 468.615 |
28079016 | 5938 | 205 m | 0 | 220 | 32.011 |
28079016 | 5939 | 125 m | 5 | 1980 | 522.943 |
28079016 | 10124 | 242 m | NA | NA | NA |
28079016 | 6058 | 214 m | NA | NA | NA |
28079016 | 3594 | 296 m | NA | NA | NA |
28079016 | 5922 | 366 m | NA | NA | NA |
28079016 | 10128 | 500 m | 4 | 1413 | 437.701 |
28079016 | 10125 | 455 m | NA | NA | NA |
28079016 | 5941 | 303 m | 0 | 1324 | 135.017 |
28079016 | 5923 | 426 m | 5 | 1334 | 437.864 |
28079016 | 5994 | 483 m | 0 | 480 | 135.389 |
28079016 | 5940 | 369 m | NA | NA | NA |
28079016 | 5942 | 336 m | 0 | 1523 | 534.091 |
28079016 | 5944 | 349 m | 0 | 182 | 72.176 |
28079016 | 5921 | 374 m | 23 | 1214 | 481.669 |
28079016 | 3776 | 425 m | 17 | 1208 | 476.911 |
28079016 | 5937 | 484 m | 0 | 313 | 86.216 |
28079035 | 3731 | 26 m | NA | NA | NA |
28079035 | 4303 | 39 m | 0 | 181 | 52.188 |
28079035 | 3730 | 133 m | NA | NA | NA |
28079035 | 4301 | 137 m | NA | NA | NA |
28079035 | 10387 | 196 m | 40 | 1260 | 608.482 |

Air Pollution Sensor Station | Traffic Flow Sensor | MAE | MSE |
---|---|---|---|
28079016 | 6037 | 0.183 | 0.045 |
28079016 | 3791 | 0.206 | 0.056 |
28079016 | 3775 | 0.206 | 0.054 |
28079016 | 5938 | 0.073 | 0.009 |
28079016 | 5939 | 0.166 | 0.035 |
28079016 | 10128 | 0.203 | 0.053 |
28079016 | 5941 | 0.061 | 0.005 |
28079016 | 5923 | 0.188 | 0.046 |
28079016 | 5994 | 0.173 | 0.047 |
28079016 | 5942 | 0.214 | 0.060 |
28079016 | 5944 | 0.208 | 0.056 |
28079016 | 5921 | 0.200 | 0.051 |
28079016 | 3776 | 0.193 | 0.051 |
28079016 | 5937 | 0.160 | 0.030 |
28079035 | 4303 | 0.105 | 0.017 |
28079035 | 10387 | 0.136 | 0.029 |

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
Share and Cite
Awan, F.M.; Minerva, R.; Crespi, N. Improving Road Traffic Forecasting Using Air Pollution and Atmospheric Data: Experiments Based on LSTM Recurrent Neural Networks. Sensors 2020, 20, 3749. https://doi.org/10.3390/s20133749