Systematic Review of Machine Learning and Deep Learning Techniques for Spatiotemporal Air Quality Prediction

Agbehadji, Israel Edem; Obagbuwa, Ibidun Christiana

doi:10.3390/atmos15111352

by Israel Edem Agbehadji * and Ibidun Christiana Obagbuwa

Department of Computer Science and Information Technology, Faculty of Natural and Applied Sciences, Sol Plaatje University, Kimberly 8300, South Africa

* Author to whom correspondence should be addressed.

Atmosphere 2024, 15(11), 1352; https://doi.org/10.3390/atmos15111352

Submission received: 21 September 2024 / Revised: 5 November 2024 / Accepted: 8 November 2024 / Published: 10 November 2024

(This article belongs to the Section Atmospheric Techniques, Instruments, and Modeling)

Abstract

Background: Although computational models are advancing air quality prediction, achieving the desired prediction performance or accuracy remains a gap, which impacts the implementation of machine learning (ML) air quality prediction models. Several models have been employed, and some hybridized, to enhance air quality and air quality index predictions. The objective of this paper is to systematically review machine learning and deep learning techniques for spatiotemporal air quality prediction challenges. Methods: In this review, a methodological framework based on the PRISMA flow was utilized, in which the initial search terms were defined to guide the literature search strategy in online data sources (Scopus and Google Scholar). The inclusion criteria are articles published in the English language, document type (articles and conference papers), and source type (journal and conference proceedings). The exclusion criteria are book series and books. The authors’ search strategy was complemented with ChatGPT-generated keywords to reduce the risk of bias. Report synthesis was achieved by keyword grouping using Microsoft Excel, with keywords sorted in ascending order for easy identification of similar and dissimilar keywords. Three independent researchers took part in this research to avoid bias in data collection and synthesis. Articles were retrieved on 27 July 2024. Results: Out of 374 articles, 80 were selected as they were in line with the scope of the study. The review identified the combination of machine learning and deep learning techniques as a way to address data limitations and to process the nonlinear characteristics of air pollutants. ML models such as random forest and the decision tree classifier were among the commonly used models for air quality index and air quality predictions, with promising performance results. Deep learning models are promising due to the hyper-parameter components, which consist of activation functions suitable for nonlinear spatiotemporal data. The emergence of low-cost devices to address data limitations is highlighted, in addition to the use of transfer learning and federated learning models. It is also highlighted that military activities and fires impact the O₃ concentration, and the best-performing models highlighted in this review could be helpful in developing predictive models for air quality prediction in areas with heavy military activities. Limitation: This review acknowledges methodological challenges in terms of data collection sources, as there are equally relevant materials on other online data sources. In addition, the choice and use of keywords for the initial search and the creation of subsequent filter keywords limit the collection of other relevant research articles.

Keywords: machine learning; deep learning; spatiotemporal air quality prediction

1. Introduction

Air pollution is a health concern in many countries and the current pace of urbanization and industrialization has exacerbated the problem of air pollution globally [1,2,3]. Sources of air pollution include the burning of coal, oil, and natural gas in industrial processes, power generation, and vehicles, indoor activities such as the burning of coal, wood, and cigarette smoke, and the use of insecticides, among others. Air pollution impacts human lives, agriculture, and many facets of society [4,5]. Common air pollutants include sulphur dioxide, nitrogen oxides, particulate matter, lead, ozone, carbon monoxide, and many more [6]. Meanwhile, identifying locations with high exposure to air pollutants can help in designing the needed interventions in terms of policy, operational, and legislative instruments. Additionally, meteorological conditions, such as temperature, relative humidity, and wind speed, influence the risk of exposure to air pollution in geo-locations [7]. Thus, air quality prediction is a critical research area that requires much attention because of the health impact [8].

Considering the health implications of air pollutants, it is imperative to monitor concentration levels constantly so that they do not exceed the specified thresholds [9]; one measure for checking this is the air quality index (AQI). The AQI is a quantitative air quality assessment tool that provides a standard measurement framework and indicates the quality of air that is safe for human beings [10]. Although the AQI is helpful, it is not required for the development of all air quality prediction applications.
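As an illustration of how an AQI maps a measured concentration onto an index scale, one widely used convention (the piecewise-linear sub-index used, for example, by the US EPA) can be written as below. This formula is not taken from the studies reviewed here, and the symbols C_p, BP_lo, BP_hi, I_lo, and I_hi are introduced only for this sketch; national AQI schemes differ in their breakpoints and pollutants.

```latex
I_p = \frac{I_{hi} - I_{lo}}{BP_{hi} - BP_{lo}} \left( C_p - BP_{lo} \right) + I_{lo},
\qquad
\mathrm{AQI} = \max_{p} I_p
```

Here C_p is the measured concentration of pollutant p, BP_lo and BP_hi are the concentration breakpoints that bracket C_p, and I_lo and I_hi are the index values assigned to those breakpoints. Under this convention the reported AQI is the largest sub-index across the pollutants considered.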

Air pollution is a global issue and while some developed countries have devised methods to check the pollutants’ concentration, developing countries struggle to monitor air pollutants, let alone establish ground monitoring stations in multiple locations. Fortunately, some developing countries such as South Africa have established 133 monitoring stations to monitor the quality of air across the country and also in some industrial hubs of the country [6]. In view of this, industries are regulated by the National Environmental Management Act, which ensures the required level of emission [11]. As per global standards, countries are required to ensure that their industries strictly adhere to international environmental standards on air pollution [12]. Meanwhile, when dedicated environmental monitoring stations are established to collect data on air pollutant concentration, it ensures compliance with best industry practices. Wang and Song [13] indicated that although monitoring stations can provide real-time air quality information, they are sometimes challenged with the level of complications in weather patterns and spatial-temporal dependency on air pollutants.

Research has shown that environmental monitoring stations collect large volumes and varieties of data, which need to be analyzed to understand the nonlinear nature of air pollutants using robust computational approaches [14]. Approaches that have been used to analyze air pollutants include numerical models, statistical approaches, and machine learning techniques [15,16,17,18].

Several machine learning approaches, including random forest (RF), decision tree (DT), and support vector machine (SVM), have been used for air quality predictions, and these approaches have their limitations. Deep learning approaches include long short-term memory (LSTM) and the convolutional neural network (CNN). The relationship between deep learning and machine learning can be considered in two aspects. Firstly, deep learning is a subset of machine learning. Secondly, deep learning models are algorithms that use neural networks with multiple layers. Both families of approaches may have limitations that affect how well they predict air pollutant concentrations when large volumes of data with inherent nonlinear spatiotemporal features are used. As research evolves, it is important to explore how computational models have also evolved in addressing air pollution issues, including the nonlinearity in air quality data. Spatiotemporal models must contend with very complex pollutants and their environmental challenges. In this regard, the categories of air pollutants with associated levels of concentration, seasonality, time, geographical area determination, air quality index check, and meteorological parameters pose a challenge, which makes it more complex to model the nonlinear nature of air pollution phenomena [19]. In this study, the objective is to conduct a systematic review of the extant literature on existing machine learning and deep learning approaches to identify the approaches used to address nonlinearity in spatiotemporal air quality predictions. To this end, the underlying research question is formulated as follows: what machine learning and deep learning models have been developed to address nonlinearity, among other issues, in spatiotemporal air quality prediction? 
This systematic review contributes to bridging research gaps to help in developing models for future air quality prediction systems or applications. In this research, the application of Chat Generative Pre-Trained Transformer (ChatGPT) complements the generation of search items used as “Filters”, thereby guiding the selection of keywords to scope the selection of published research articles. The remaining sections are Section 2 (Materials and Methods), Section 3 (Discussions), and Section 4 (Conclusions and Future Directions).

2. Materials and Methods

In this section, the methods and materials used to synthesize the extant literature are presented to examine the study’s initial proposition on the machine learning and deep learning models that have been used in addressing the nonlinearity in spatiotemporal air quality predictions. To examine this proposition, data were retrieved from Scopus, in addition to other equally relevant sources such as Google Scholar, using a search term strategy. Some criteria were set to help collect the relevant literature from these sources. The inclusion criteria were articles published in the English language, document type (articles and conference papers), and source type (journal and conference proceedings), while book series and books were excluded. Due to the possibility of missing articles/documents in data extracts from Scopus, Google Scholar was utilized to retrieve those missing articles with their DOI, as they contain relevant information. The articles were subjected to the PRISMA flow methodological phases, as shown in Figure 1, together with the corresponding inclusion and exclusion criteria.
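The screening criteria above can be sketched as a simple record filter. This is a minimal illustration only: the `Record` fields, the label strings such as "Conference Paper", and the sample records are hypothetical and do not reproduce the Scopus export format or the authors' actual workflow.

```python
from dataclasses import dataclass


@dataclass
class Record:
    """One retrieved document (hypothetical fields, for illustration)."""
    title: str
    language: str
    document_type: str
    source_type: str


# Criteria as stated in the text; the exact label strings are assumptions.
INCLUDED_LANGUAGE = "English"
INCLUDED_DOCUMENT_TYPES = {"Article", "Conference Paper"}
INCLUDED_SOURCE_TYPES = {"Journal", "Conference Proceeding"}
EXCLUDED_SOURCE_TYPES = {"Book", "Book Series"}


def is_eligible(record):
    """Return True when a record passes the inclusion and exclusion criteria."""
    if record.source_type in EXCLUDED_SOURCE_TYPES:
        return False
    return (
        record.language == INCLUDED_LANGUAGE
        and record.document_type in INCLUDED_DOCUMENT_TYPES
        and record.source_type in INCLUDED_SOURCE_TYPES
    )


def screen(records):
    """Keep only the eligible records, preserving their order."""
    return [r for r in records if is_eligible(r)]


# Made-up records used only to exercise the filter.
sample = [
    Record("Paper A", "English", "Article", "Journal"),
    Record("Paper B", "English", "Book Chapter", "Book Series"),
    Record("Paper C", "French", "Article", "Journal"),
    Record("Paper D", "English", "Conference Paper", "Conference Proceeding"),
]
kept_titles = [r.title for r in screen(sample)]
print(kept_titles)
```

Keeping the criteria as named constants makes it easy to see which rule removed a record when the screening is repeated or audited.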

Scopus was used as an online database due to its wider coverage of research articles. The initial search keywords used by the authors are (“machine learning techniques”) AND (“air quality prediction”) OR “spatio-temporal” or “spatio temporal”, which resulted in the identification of 374 documents on 27 July 2024. Boolean operators such as “AND”, “OR”, and “AND NOT” were used in the search string to identify documents. A limitation of the search keywords is that “Spatiotemporal air quality prediction” did not return any results.

Scopus has predefined search criteria such as “Article title, abstract, keywords”, which were used in conjunction with the initial keywords defined by the author, like (“machine learning techniques”) AND (“air quality prediction”) OR “spatio-temporal” or “spatiotemporal”. Additionally, the article inclusion criteria include articles written in the English Language, written in the year ranging from 2019 to 2024, the document type (articles and conference papers), and the source type (journal and conference proceedings). Using the search engine in Scopus, the authors’ initial search terms were explored to retrieve articles that met the inclusion and exclusion criteria. The ChatGPT-generated search terms were used to complement the article selection. Thus, both authors’ expert knowledge and ChatGPT-generated search terms were utilized. The authors designed queries, which were posed on Chat Generative Pre-Trained Transformer (ChatGPT), and these queries were presented as follows:

Query 1: “Write the keywords in review of machine learning techniques for spatiotemporal air quality prediction”.

Query 2: “What are the keywords on review of machine learning techniques for spatiotemporal air quality prediction”.

Query 3: “Write all the keywords on review of machine learning techniques for spatiotemporal air quality prediction”.

Query 4: “Write all the keywords on review of machine learning models for spatiotemporal air quality prediction”.

Query 5: “What are all the keywords on review of machine learning models for spatiotemporal air quality prediction”.

Query 6: “Write keywords on review of machine learning techniques for spatiotemporal air quality prediction”.

Generative Pre-trained Transformer (GPT) versions are helping create models with human-like text that are highly coherent and complex [20]. The use of ChatGPT in a systematic literature review reduces error and subjectivity, thereby helping create an enhanced methodological approach for literature reviews. ChatGPT as a topic modeling technique leverages ML techniques to generate summarized human-like text and help identify keywords on a specified topic. Additionally, three independent researchers were used in the data collection process: The first independent researcher was responsible for data retrieval from online repositories. The second independent researcher was responsible for the ChatGPT-generated query, and the third independent researcher was responsible for entry in a Microsoft Excel template. Finally, group discussions ensured consistency and agreement among the researchers to avoid any risk of bias.

3. Discussions

3.1. ChatGPT-Generated Keywords

The ChatGPT-generated keywords were similar in all the query responses. There were some initial search keywords among the ChatGPT-generated keywords. Instead of using the “Filter by keyword” feature on Scopus, the ChatGPT-generated keywords were used. Table A1 presents the results of the ChatGPT query responses.

Having identified these keywords, similar keywords in each query were grouped as well as dissimilar ones. During the keyword grouping, Microsoft Excel was used to sort the keywords in an ascending order for easy identification of similar keywords. The same keywords appearing in each query are grouped as “category 2”, and dissimilar keywords appearing once in each query response are grouped as “category 1”. During the comparison of the “Filter by keyword” feature on Scopus with the groupings, many ChatGPT-generated keywords were not among the “Filter by keyword” on Scopus, and those on Scopus are presented in Table 1. As mentioned earlier, independent expert/researchers’ opinions were leveraged to help reduce any possible errors in keyword groupings during researchers’ discussions. Again, the group discussions among the researchers were key in addressing any risk of bias from keyword categorization.
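The grouping step described above can be sketched in a few lines. This is an illustrative reading of the procedure, not the authors' Excel workflow: it assumes "category 2" means a keyword that appears in every query response and "category 1" means a keyword that appears in exactly one response; the function name and sample keywords are hypothetical.

```python
from collections import Counter


def group_keywords(responses):
    """Group keywords from several query responses.

    responses: list of keyword lists, one list per query response.
    Returns (category_1, category_2), each sorted in ascending order.
    """
    # Normalize case and whitespace so that "Machine Learning" and
    # "machine learning " count as the same keyword; use a set per
    # response so a keyword is counted once per response.
    normalized = [{kw.strip().lower() for kw in resp} for resp in responses]
    counts = Counter(kw for resp in normalized for kw in resp)
    n_responses = len(normalized)
    category_2 = sorted(kw for kw, c in counts.items() if c == n_responses)
    category_1 = sorted(kw for kw, c in counts.items() if c == 1)
    return category_1, category_2


# Made-up responses used only to exercise the function.
responses = [
    ["Machine learning", "Air quality prediction", "Deep learning"],
    ["machine learning", "air quality prediction", "Random forest"],
    ["Machine Learning", "Air quality prediction ", "Time series"],
]
category_1, category_2 = group_keywords(responses)
print(category_1)
print(category_2)
```

Keywords that appear in some but not all responses fall into neither group under this reading; how such cases were handled is not stated in the text.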

Upon applying the keywords in Table 1 as filters, the subject matter experts’ opinion was critical because there were several keywords (see Table 2) on Scopus that aligned strongly with the research question but were not among the ChatGPT-generated keywords. Subsequently, these keywords were selected, leaving 309 documents that needed further processing. Subject matter experts discussed these keywords to ensure alignment with the scope of this study.

Some studies met the inclusion criteria but were excluded because they fell outside the scope of this review: Arunarani, Selvanayaki [21] focused on crop prediction, Brayshaw, Ward-Cherrier [22] on neuromorphic tactile classification, and Broumand, Asgarian [23] on the fan liquid sheet.

3.2. Machine Learning Techniques for Spatiotemporal Air Quality Prediction

Spatiotemporal refers to both space (that is, geographical location) and time (that is, future time). Designing a predictive model to forecast the level of air pollution in geographical areas and in different seasons requires a thorough analysis of the features that constitute air pollution. Such predictive models have their mathematical underpinning and computing techniques [24]; that is, spatiotemporal models draw on different mathematical and computing models that are capable of predicting air quality in multiple geographical locations (spatial coverage) and of predicting the level of pollutants at different time scales (temporal dynamics). Mathematical techniques involve statistical modeling tools, which can model the relationship between pollutant concentration and factors that influence pollution, including meteorology and emissions. For instance, Kelly, Jang [25] applied statistical methods to fit a polynomial function that was used to examine the nonlinear response of air quality to emission changes for PM2.5 and ozone. Furthermore, computational models such as machine learning (ML) models are crucial in providing automated systems or applications to support decision-making in the health and environment sectors. ML techniques have been used for spatial and temporal analysis and the estimation of nitrogen dioxide concentration, among others [26]. In some instances, several perceptron-style linear classifiers can be stacked in layers to approximate nonlinear functions; this arrangement is often known as the “multilayer perceptron”.
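To make the multilayer perceptron idea concrete, the sketch below stacks three threshold perceptrons to compute XOR, a function that no single linear classifier can represent. It is a toy illustration rather than a model from the reviewed studies: the weights are chosen by hand instead of being learned, and the function names are hypothetical.

```python
def step(x):
    """Threshold activation: 1 if the weighted sum is positive, else 0."""
    return 1 if x > 0 else 0


def perceptron(inputs, weights, bias):
    """A single linear classifier followed by a threshold."""
    return step(sum(w * x for w, x in zip(weights, inputs)) + bias)


def xor_mlp(x1, x2):
    """Two hidden perceptrons feeding one output perceptron."""
    h_or = perceptron((x1, x2), (1.0, 1.0), -0.5)   # fires if at least one input is 1
    h_and = perceptron((x1, x2), (1.0, 1.0), -1.5)  # fires only if both inputs are 1
    # Output fires when h_or is on and h_and is off.
    return perceptron((h_or, h_and), (1.0, -1.0), -0.5)


truth_table = {(a, b): xor_mlp(a, b) for a in (0, 1) for b in (0, 1)}
print(truth_table)
```

Each hidden unit draws one straight decision boundary; combining the two boundaries in the output unit yields a region that a single line cannot separate, which is the sense in which stacking linear classifiers approximates nonlinear functions.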

Machine learning models have demonstrated impressive performance in predicting spatiotemporal temperature fluctuations. Awan, Batool [27] applied ML techniques such as ARIMA; “Trigonometric seasonality, Box-Cox transformation, ARIMA errors, Trend and Seasonal components” (TBATS); the extreme learning machine (ELM); and the ANN Multilayer Perceptron (MLP) to predict the value of the Generalized Probabilistic Standardized Temperature Index (GPSTI) for monitoring, forecasting, and evaluating the acceleration of temperature fluctuations. The experiment results showed that TBATS outperformed the other algorithms, with lower error rates. TBATS is a forecasting method, based on exponential smoothing, for time series data that have complex seasonal patterns.

Hashemi and Karimi [28] proposed weighted ML techniques for spatiotemporal data to determine how data samples should be captured in training and testing in ML models. They addressed the concerns of most researchers in terms of either using location data and ignoring time or considering both as input features in ML models. Furthermore, they suggested that ML models should be trained separately on data samples, in which small samples should be used for training with large spatiotemporal weights to reduce the training time and guarantee the best accuracy.

Shafi [29] indicated that spatial patterns of urbanization affect the environment and agricultural activities, thus leading to a combination of spatiotemporal satellite images and supervised machine learning techniques to classify data into different land use/land cover. The spatiotemporal pattern studied by Yang, Du [30], suggested that ML can predict methane and nitrous oxide emissions from inland water. According to these findings, significant seasonal fluctuations in methane emissions are seen in their data, and these variations are caused by the concentrations of chemical oxygen demand and total nitrogen.

Machine learning technologies are permeating every facet of society, and the use of sophisticated ML techniques has been emphasized over simple ML techniques for efficient prediction tasks [31]. Morapedi and Obagbuwa [15] explored ML techniques for air pollutant analysis to predict PM2.5 concentration levels in selected cities in South Africa. Although it was suggested that combining two ML techniques provides more prediction accuracy than a single model, the accuracy suffers when larger datasets are utilized.

Soh, Chang [32] suggested that, of all the air pollutants, particulate matter has a much greater influence on human health than other toxins due to its small size, which allows it to penetrate the human respiratory system and cause serious health challenges. Meanwhile, long-term contact with particulate matter raises the risk of lung cancer, asthma, and cardiovascular diseases [33].

Several human activities have been identified as contributing factors to air pollution. Khan, Ellermann [34] indicated that the road transportation sector contributed significantly to air pollution due to the emission of pollutants from vehicles. As a result, ML methods including the support vector machine (SVM), RF, and ANN were utilized to predict the hourly concentrations of NO₂ and PM2.5 in traffic hotspots. Their findings suggest that RF outperformed the comparative ML techniques when evaluated using RMSE and R². Park, Jeong [35] also indicated that, due to a limited understanding of the variation in CO₂ concentration on roads, predicting the hourly “on-roads” CO₂ concentration and variations using ML models can help manage the high spatiotemporal CO₂ concentration. Among the key traffic information observed to understand the variations are data on traffic speed, traffic volumes, and wind speed. Subsequently, RF was applied to estimate the “trafficCO₂” variations, which led to an R² of 0.8 and an RMSE of 22.9 ppm. The experiment utilized data on the type of road and land use (either residential, commercial, etc.) to determine the spatiotemporal variability of CO₂.
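Because RMSE and R² recur as evaluation metrics throughout this review, a minimal sketch of how they are commonly computed is given below. It assumes R² is defined as one minus the ratio of the residual to the total sum of squares (some studies instead report the squared correlation coefficient), and the concentration values are made up for illustration; they are not data from the cited studies.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error, in the same units as the observations."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical hourly concentrations (ppm) and model predictions.
observed = [400.0, 420.0, 440.0, 460.0]
predicted = [410.0, 410.0, 450.0, 450.0]
error = rmse(observed, predicted)
fit = r_squared(observed, predicted)
print(error, fit)
```

RMSE carries the units of the pollutant concentration, so it is only comparable between studies that report the same pollutant and unit, whereas R² is dimensionless.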

Tu, Hase [36] suggested that industrial sites are another hotspot for NO₂ emissions, such as cement manufacturing plants and power production plants, in addition to the transport sector. Although the TROPOspheric Monitoring Instrument (TROPOMI) was proposed to help forecast the average tropospheric NO₂ emissions, emissions that are quite complex in some locations render this monitoring instrument inefficient due to the rapid decay of the NO₂ concentration from the identified NO₂ emission source. In this regard, the TROPOMI model was proposed, which combines a wind-assigned anomaly and the Gradient Descent algorithm.

According to Meng, Hang [37], desulfurization initiatives in coal-fired power stations and industrial facilities contributed to declining PM2.5 concentrations in China. In this regard, the RF technique was proposed to help predict sulfate and resolve the difficulty in ground-based network monitoring. The proposed RF technique used varied datasets, including aerosol data from the Multi-angle Imaging SpectroRadiometer (MISR) and ground network observation systems, for sulfate concentration identification. Their experiment results showed an out-of-bag cross-validation R² value of 0.93 monthly and 0.68 daily. Also, Choi, Park [38] employed RF, the Light Gradient Boosting Machine (LGBM), and ANN to estimate the ground-level PM2.5 concentrations, and suggested the superiority of LGBM over the comparative techniques in their experiment.

Ezhilkumar, Karthikeyan [39] also attested to the complication in collecting large real-time air pollutant data and proposed an ML technique and IoT system that overcame the data collection challenges. Their approach utilized a combination of ANN and back-propagation neural network (BPNN) models to predict the concentration trend of PM2.5 and PM10. The findings suggest that local weather and emissions at a certain geographic location greatly affect the PM2.5 and PM10 pollutant concentrations, which are not constant. Mathew, Gokul [40] identified the limitations of data and also proposed three data-driven ML algorithms, namely multilinear regression, KNN, and HGBoost, to forecast PM2.5 concentration. Their experiment showed that the HGBoost3 model outperformed the other two techniques, using both pollution and meteorological data as input.

Air quality models, according to Xiong, Xie [41], can offer comprehensive spatiotemporal coverage, but they are biased by physicochemical process simplifications and uncertainties related to collecting data from multiple sources such as meteorological and emission data. They indicated that although ground-based observation networks are highly accurate, they are limited in terms of spatial coverage. Their experiment tackled the uncertainty related to multi-data sources using three ML techniques—the Light Gradient Boosting Machine (LightGBM), RF, and eXtreme Gradient Boosting—to increase the predictive accuracy of O₃ concentration in the “Community Multiscale Air Quality” (CMAQ) model. In their experiment, they compared the LGBR and LGBR-CHAP (LGBR-China High Air Pollutants) models with the original CMAQ model, and suggested after several validations that the LGBR-CHAP model performs better in terms of prediction and was used to generate high-resolution O₃ data.

Again, to predict air pollution at different locations and sources, Sharma, Khurana [42] suggested the use of the ML approach to determine the source of air pollution in urban areas. Their approach was based on DT Regression, RF Regression (RFR), XGBoost, linear regression, and a hybrid model combining RF and XGBoost models. Their hybrid model demonstrated superior accuracy compared to its counterparts, with better coefficient of determination (R²) values and a lower MAE, MSE, and RMSE. The challenge of their approach was that meteorological features were not considered in testing the effectiveness of the proposed model for air quality predictions. Meteorological data can have useful features such as temperature, humidity, and wind speed, among others, which are closely associated with location data.

Supervised machine learning methods such as RF, Logistic Regression, DT, and Naive Bayes have also been utilized to predict air quality, and DT was found to be superior [43]. This superiority was determined using precision, recall, and the F1 score. Similarly, Mahesh Babu and Rene Beulah [44] experimented with supervised machine learning techniques, such as SVM, RF, K-nearest neighbors (k-NN), DT, and LR (Logistic Regression), to predict air quality. In their approach, the algorithms’ performance was evaluated using the F1 score, precision, and recall, leading to the suggestion that the decision tree algorithm was the best-performing algorithm. Furthermore, Niveshitha, Amsaad [45] experimented with RF and decision tree regression models in an attempt to reduce the execution time of air quality predictive models on a cloud computing framework. The results of their experiment on Amazon “SageMaker’s Jupyter” showed a reduced execution time, with RF Regression achieving better accuracy. These experiments indicate the superiority of DT in air quality predictions, while in terms of prediction time, the RF Regressor performed best. The AQI was computed using pollutants such as particulate matter, O₃, nitric oxide, NOx, CO, benzene, toluene, sulfur dioxide, ammonia, xylene, and nitrogen dioxide to tackle air pollution. William, Paithankar [46] were concerned that the increasing use of devices or sensors and machine learning techniques can impact the processing time and error rate in anticipating pollution. To understand how long these algorithms take to analyze a range of public datasets, four advanced regression algorithms were run on Apache Spark and evaluated in terms of processing time and error rate (using MAE and the RMSE) to find the best-fitting model. 
The simulation results showed that RF Regression performed better in reliably forecasting pollution in different sizes of datasets and locations with a variety of features. It also had a much smaller processing time than gradient boosting, DT Regression, and Multilayer Perceptron techniques. Again, it had the lowest error rate among the four approaches.
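Several of the classification studies above rank models by precision, recall, and the F1 score. The sketch below shows how these are commonly computed for one class of interest; the class labels and predictions are made up for illustration and are not results from the cited studies.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against division by zero when a class is never predicted or never occurs.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical air quality classes for six observations.
y_true = ["poor", "poor", "good", "good", "poor", "good"]
y_pred = ["poor", "good", "good", "poor", "poor", "good"]
precision, recall, f1 = precision_recall_f1(y_true, y_pred, positive="poor")
print(precision, recall, f1)
```

Unlike plain accuracy, these metrics stay informative when one air quality class is much rarer than the others, which may be one reason the reviewed studies report them.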

Kunnathettu and Varma [47] applied Logistic Regression (LR), SVM, RF, and NN to predict the quality of air based on the level of PM2.5 concentration, where hyper-parameters of the LR and SVM were tuned to give better accuracy. Their model’s performance evaluation approach was based on recall, precision, the F1 score, and support value. Sanjeev [48] indicated that air quality in cities is deteriorating daily and suggested the use of the RF algorithm for air pollutants’ prediction because of its superior performance over SVM and ANN. The author’s experiment leveraged air pollutant concentrations and meteorological factors (temperature and relative humidity). Performance was measured using recall, precision, F1, and specificity.

A Self-Organizing Map (SOM) was used by Chang, Chang [49] to cluster high-dimensional datasets into topological maps in which the spatiotemporal characteristics of long-term regional PM2.5 concentration levels were successfully recovered and distinctly separated into a two-dimensional topological map. The BPNN was employed to help build the prediction model by generalizing responses that resemble the data throughout the learning phase. Their experiment outcome showed an increase in the accuracy of air quality predictions while also skillfully summarizing and visually presenting the clustered spatiotemporal PM2.5 concentration data. During their experiment, R² (coefficient of determination), RMSE, and NSE (Nash–Sutcliffe Efficiency coefficient) were used to assess the model’s predictability and accuracy.
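For reference, the Nash–Sutcliffe Efficiency named above is usually defined as follows. The notation (O_t for observed values, P_t for predicted values, n for the number of time steps) is introduced here for illustration and is not taken from the cited study.

```latex
\mathrm{NSE} = 1 - \frac{\sum_{t=1}^{n} \left( O_t - P_t \right)^2}{\sum_{t=1}^{n} \left( O_t - \bar{O} \right)^2},
\qquad
\bar{O} = \frac{1}{n} \sum_{t=1}^{n} O_t
```

An NSE of 1 indicates a perfect match between predictions and observations, while a value of 0 means the model predicts no better than the mean of the observations.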

Ly, Matsumi [50] studied the situation of heavy pollution in winter seasons and indicated that the spatiotemporal variations in the PM2.5 levels can provide insight into the source and transport of PM2.5. Unfortunately, the lack of published spatiotemporal data limited their research into heavy pollution situations in winter seasons. Thus, low-cost sensors were used to fill the data gap in analyzing the effect of long-range transport and meteorological factors on PM2.5. Afterward, RF and concentration weight trajectory (CWT) were proposed, which showed that mass trajectory and sources have an impact on the PM2.5 air quality indicator. Again, their experiment showed that long-range transport and climate affect PM2.5 levels.

As Zareba, Dlugosz [51] point out, spatial-temporal analysis is a crucial first step in understanding the complex aspects of air pollution. High-resolution time-stamp observations and sensor technology advancements made it more difficult to analyze spatiotemporal patterns. Unsupervised machine learning methods were used to assess the spatiotemporal patterns of air pollution. In this experiment, the k-means method with Dynamic Time Warping (DTW) swiftly grouped spatially shifting data to recognize yearly patterns more accurately than the “Spatial ‘K’luster Analysis by Tree Edge Removal” (SKATER) technique. The analytical results obtained with the k-means algorithm and SKATER clustering again showed a considerable difference between the average and maximum pollutant concentration values.
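The role of DTW in such clustering can be illustrated with the distance computation alone. The sketch below is a textbook dynamic-programming DTW with an absolute-difference local cost; it is not the cited authors' implementation, it omits the k-means step itself, and the two short series are made up.

```python
def dtw_distance(a, b):
    """Dynamic Time Warping distance between two numeric sequences."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the best cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # repeat b[j-1] against a new point of a
                cost[i][j - 1],      # repeat a[i-1] against a new point of b
                cost[i - 1][j - 1],  # advance both series
            )
    return cost[n][m]


# Two hypothetical pollution peaks with the same shape, shifted by one step.
series_a = [0, 0, 1, 2, 1, 0]
series_b = [0, 1, 2, 1, 0, 0]
pointwise = sum(abs(x - y) for x, y in zip(series_a, series_b))
warped = dtw_distance(series_a, series_b)
print(pointwise, warped)
```

A point-by-point comparison treats the shifted peaks as different, whereas DTW aligns them, which is why a DTW-based distance can group stations whose pollution episodes have the same shape but occur at slightly different times.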

Li, Zhang [52] leveraged the use of multi-spectral satellite data and machine learning models to analyze the spatiotemporal distribution of primary air pollutants. Sentinel-2, Landsat 8 OLI, MODIS AOD/SR, and other multi-spectral satellite data are used, and the machine learning models such as the multi-layer back-propagation neural network (MLBPN) and RF were leveraged for ground-surface concentrations prediction, in which the RF technique demonstrated superior performance with the MODIS AOD dataset. The spatiotemporal distribution of air pollutants, such as PM2.5, PM10, NO₂, CO, O₃, and SO₂, was also analyzed using the best estimation model, and the results show a decreasing trend.

In an attempt to address the complexities in air quality predictions, Dragomir and Oprea [19] suggested that nonlinear models such as ANN and SVM are more suitable than linear models such as linear regression and LR, which cannot capture complex patterns. Among the nonlinear models are DT and ensemble methods like RF. Table 3 presents the highlights of some ML models, challenges, and their performance metrics. It also represents studies included in this review.

Table 3 highlights the ML models and their performance metrics. It can be observed that different performance evaluation results are obtained for the models, both the best-performing models and the comparative models. The effectiveness of this literature review is considered in terms of the impact and the consistency of the performance evaluation results. In this regard, DT has a high impact on air quality prediction as it showed an accuracy of 99.88% [43]. Also, RF showed an accuracy of 99.4% in air quality prediction [48]. Furthermore, RF was applied in predicting the execution time, with an R2 score of 91.43% [45]. These performance results demonstrate the impact of different models in the problem domain. However, the results of the models are not consistent across different problem domains.

3.3. Air Quality Index (AQI) Models

Several ML techniques have been proposed to assess the air quality index. For instance, Chakradhar Reddy, Nagarjuna Reddy [53] indicated that there are several risk factors associated with the use of the AQI and attempted to reduce those risk factors to a level where the air quality is considered safe for humans. Their study went a step further to predict the air quality from pollutants in India. Their approach used a supervised machine learning technique to identify univariate, bivariate, and multivariate features in the datasets and to handle data preprocessing issues, which supported a sensitivity analysis of model parameters to determine which model guarantees the best performance. Their findings suggested the use of supervised machine learning to reduce the risk factors associated with forecasting the AQI in real time. Pant, Sharma [54] predicted the AQI with supervised machine learning techniques such as Logistic Regression and the DT classifier, and suggested that DT was more accurate for pollutants such as PM10, PM2.5, SO₂, and NO₂. However, insufficient data for model training contributed to poor model selection, which limited the air quality prediction accuracy.

Difaizi, Camille [55] indicated that air is very dynamic and volatile, so pollutant concentration levels change frequently, which makes the prediction of the AQI complex. They further indicated that proper monitoring of the AQI helps to control pollutant concentration levels. Subsequently, they employed AdaBoost, Logistic Regression, and k-NN to predict the AQI, in which k-NN yielded the best predictive accuracy, using evaluation metrics like the confusion matrix, the F1 score, precision, and recall. Alam, Hussain [56] employed Artificial Neural Networks, SVMs, LightGBM, XGBoost, CatBoost, RF, Extra Trees, Naïve Bayes, and Autoregressive Integrated Moving Average (ARIMA) to forecast the air quality index and air pollutants such as PM1, PM10, and PM2.5. The experimental results demonstrated that the LightGBM and CatBoost algorithms are excellent choices for regression and classification, respectively, using performance metrics such as MSE, MAE, and R2. Almaliki, Derdour [57] applied Fine DT (FDT), Ensemble Bagged Tree (EBAT), and Ensemble Boosted Tree (EBOT) to understand the different spatiotemporal nature of air pollutants and to project AQI levels. The results showed that the EBOT was the most accurate of the three in forecasting the AQI.
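The classification metrics named above (confusion matrix, precision, recall, and F1 score) can be sketched as follows. This is a minimal illustration for a two-class AQI label, not code from any reviewed study; the labels and predictions are hypothetical.

```python
def confusion_counts(y_true, y_pred, positive):
    """Return (TP, FP, FN, TN) for one class treated as the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F1 score derived from the confusion counts."""
    tp, fp, fn, _ = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical AQI categories for eight observations (illustrative only).
observed = ["poor", "good", "poor", "poor", "good", "good", "poor", "good"]
predicted = ["poor", "good", "good", "poor", "poor", "good", "poor", "good"]

print(confusion_counts(observed, predicted, positive="poor"))
print(precision_recall_f1(observed, predicted, positive="poor"))
```

For multi-class AQI categories, the same counts would be computed per class and then averaged. The averaging scheme is a design choice that the reviewed studies do not always report.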

Xiang, Fahad [58] indicated that most research on air quality prediction has to normalize the dataset because of its nonlinearity and its small spatial and temporal scales, which make accurate AQI prediction very challenging. To address these challenges, several ML techniques, namely simple linear regression (SLR), SVR, RF, and a probabilistic voting ensemble, were assessed to develop regression models that can forecast Beijing’s AQI in six of the city’s outer and central zones. They indicated that data normalization plays an effective role in enhancing the air quality features and avoiding data distortion. The regression models’ performance was evaluated using the coefficient of determination (R2), RMSE, and MAE. While RF outperformed the probabilistic voting ensemble in terms of MAE and RMSE scores, the experiment’s results showed that the voting ensemble performed better in terms of R2 for AQI prediction.
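For reference, the three regression metrics used throughout this section can be computed as in the minimal sketch below. The observed and predicted AQI values are hypothetical and serve only to show the arithmetic.

```python
import math


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Square Error."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical observed and predicted AQI values (illustrative only).
observed_aqi = [50, 60, 70, 80, 90]
predicted_aqi = [52, 58, 73, 79, 88]

print("MAE:", mae(observed_aqi, predicted_aqi))
print("RMSE:", rmse(observed_aqi, predicted_aqi))
print("R2:", r2_score(observed_aqi, predicted_aqi))
```

MAE and RMSE are expressed in the units of the target variable, whereas R2 is dimensionless. This is one reason why reported values are difficult to compare across studies that use different datasets.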

Research has shown that during the design of the well-known Common Air Quality Index (CAQI) model, five ML approaches were utilized, that is, SVR, RF, Multilayer Perceptron, Extreme Gradient Boosting, and Multiple Linear Regression [59]. Meteorological features and air pollutant concentrations were used to predict the CAQI. While evaluating the prediction performance using regression metrics, such as R-squared and RMSE, the ensemble technique based on the RF method emerged as the superior method as it produced the greatest performance result.

Combining multiple machine learning techniques to predict the AQI offers more promising results, as demonstrated in the use of neural networks combined with the SVMs to predict the AQI and reduce the impact of pollutants on smart cities [60]. Again, combining ML techniques, classical regression models, and more sophisticated deep learning models provides more promising results when employed to predict the AQI, than using only the conventional methods that rely on deterministic models and historical patterns [61]. Hardini, Chakim [62] used the CNN to extract features from images to help determine the AQI, and also identify the complex correlation of air quality and factors of the environment, namely temperature and humidity. Thus, our review highlights the use of deep learning models to achieve AQI prediction. Again, ML models such as Linear Regression, DT, RF, ANN, and SVM have all been used for AQI determination [63]. Table 4 presents the ML techniques, best-performing models, and their performance metrics.

Table 4 shows the ML techniques for AQI prediction. It is observed from the perspective of performance evaluation results that DT appears to have 100% accuracy in AQI prediction. Similarly, k-NN also had 100% accuracy in AQI prediction. Both RF and ensemble techniques also have an R2 value of 99% in AQI prediction. The impact of these results demonstrates the superiority of DT and k-NN in AQI prediction.

3.4. Model Hybridization Using ML and DL for Air Quality Prediction

Deep learning could be referred to as an algorithm composed of several neural networks in its architecture, in which the algorithm can automatically learn features without human intervention [64]. By far, neural networks are generally adaptive to data generated from a natural environment because they naturally exhibit nonlinear properties. For instance, data from air pollutants, and meteorological or weather data, all exist in the natural environment. As the magnitude of these factors increases, deep learning models become more computationally attractive than simple or shallow networks. In this regard, deep learning could be considered a sophisticated ML model. Again, sophisticated ML could also be created by stacking multiple linear models to solve nonlinear problems in air quality prediction. The key consideration for deep learning models’ ability to capture complex relationships is the use of nonlinear activation functions in the deep learning layers, and among the activation functions utilized are ReLU, sigmoid, and “tanh”.
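The role of the activation functions named above can be illustrated with a minimal sketch. The two-layer "network" below uses arbitrary, hypothetical weights. Without an activation function the composition of the two layers remains a linear function of the input, whereas inserting ReLU between them makes the mapping nonlinear.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    return math.tanh(x)


def linear(x, weight, bias):
    return weight * x + bias


def two_layer(x, activation=None):
    """Two stacked linear layers with an optional activation in between.

    The weights and biases are arbitrary illustrative values.
    """
    hidden = linear(x, 2.0, -1.0)
    if activation is not None:
        hidden = activation(hidden)
    return linear(hidden, -3.0, 0.5)


def is_linear_on(f, x0, x1, x2):
    """For equally spaced inputs, a linear map satisfies f(x0) + f(x2) == 2 f(x1)."""
    return math.isclose(f(x0) + f(x2), 2 * f(x1))


print(is_linear_on(lambda x: two_layer(x), 0.0, 1.0, 2.0))
print(is_linear_on(lambda x: two_layer(x, relu), 0.0, 1.0, 2.0))
print(sigmoid(0.0), tanh(0.0))
```

The first check succeeds and the second fails, which reflects the point made above: the capacity of deep models to represent nonlinear pollutant behavior comes from the activation functions rather than from depth alone.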

While many deep learning models exist, Drewil and Al-Bahadili [14] suggested that long short-term memory (LSTM) models, like many statistical and machine learning methods, are likely unable to provide adequate prediction outcomes when the data are inherently noisy and the hyper-parameters are improperly set. Because a large amount of data is usually required in deep learning models, tuning the hyper-parameters is imperative. Their approach to improper hyper-parameterization is the use of a genetic algorithm with an LSTM model to effectively predict four pollutants: PM10, PM2.5, NOX, and CO. The proposed model was evaluated using the RMSE and MAE. Aruna Kumari, Ananda Kumar [65] indicated that instead of using separate independent models for air quality monitoring, it is more suitable to use a single model for air quality prediction, interpolation, and feature analysis. Their proposed approach can gather unlabeled spatiotemporal data and perform interpolation before predicting the quality of air, and it was validated in real time. Zhang, Duan [66] utilized seven separate models and an ensemble learning algorithm to build a hybrid LSTM-SVR predictive model, where the eight ML techniques were used to predict the index of air quality in China. Their experimental outcome demonstrated that in areas with high levels of air pollution, the models dealt with more prediction complications, which caused the accuracy of predictions to decrease. That notwithstanding, the hybrid LSTM-SVR model showed the best prediction accuracy, followed by the ensemble RF model, which was very useful in highly contaminated environments. Therefore, hybrid and ensemble models perform better in air quality prediction than single-model approaches.
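The idea of tuning hyper-parameters with a genetic algorithm can be sketched as follows. This is not the procedure of [14]: the search space, the `validation_error` function (a synthetic stand-in for training an LSTM and measuring its validation RMSE), and all parameter values are assumptions made purely for illustration.

```python
import random

# Hypothetical search space for two hyper-parameters (illustrative only).
UNITS = [16, 32, 64, 128, 256]
LEARNING_RATES = [0.0001, 0.001, 0.01, 0.1]


def validation_error(units, lr):
    """Synthetic stand-in for 'train the model, return validation error'.

    By construction the minimum lies at units=64, lr=0.001.
    """
    return abs(UNITS.index(units) - 2) + abs(LEARNING_RATES.index(lr) - 1)


def random_individual(rng):
    return (rng.choice(UNITS), rng.choice(LEARNING_RATES))


def crossover(parent_a, parent_b, rng):
    # Each gene is inherited from one of the two parents at random.
    return (rng.choice([parent_a[0], parent_b[0]]),
            rng.choice([parent_a[1], parent_b[1]]))


def mutate(individual, rng, rate=0.2):
    units, lr = individual
    if rng.random() < rate:
        units = rng.choice(UNITS)
    if rng.random() < rate:
        lr = rng.choice(LEARNING_RATES)
    return (units, lr)


def genetic_search(generations=20, population_size=10, seed=0):
    rng = random.Random(seed)
    population = [random_individual(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Selection: keep the better half (lower error means fitter).
        population.sort(key=lambda ind: validation_error(*ind))
        survivors = population[: population_size // 2]
        # Reproduction: refill the population with mutated offspring.
        children = []
        while len(survivors) + len(children) < population_size:
            parent_a, parent_b = rng.sample(survivors, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = survivors + children
    return min(population, key=lambda ind: validation_error(*ind))


best = genetic_search()
print("best (units, learning rate):", best, "error:", validation_error(*best))
```

Because the fittest individuals always survive to the next generation, the best error found never worsens from one generation to the next. In practice, each fitness evaluation requires a full model training run, which is the main computational cost of this approach.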

The difficulties encountered in obtaining real-time data on air pollution have led to the use of sensors. Shrivastava and Dwivedi [67] presented a sensor device to collect data on toxic gases, and an ML technique was used to forecast future air contaminants. Sun, Li [68] proposed a hybrid deep learning model for hourly PM2.5 prediction using edge devices together with deep learning models such as the multi-factor LSTM and Deep Reinforcement Learning (DRL). The model’s performance was evaluated in terms of the latency and prediction accuracy of edge devices. Afterward, the resulting model was optimized to ensure efficient monitoring of air quality information collected from several sensor devices. The approach also included multi-task offloading based on the Optimal Stopping Theory (OST), in which each edge device can offload its data to an optimal web server in a time-optimized manner.

Lin, Jin [69] suggested that dust from multiple sources, such as desert and anthropogenic emissions, contributes to increased PM10 concentrations. Dust or heavy smog creates poor visibility that can cause air traffic operational challenges. However, real-time measurement of local PM10 emissions is challenging, a difficulty that could be overcome with RNN and LSTM techniques, because these techniques can generate local emissions data to aid in real-time observation and prediction of PM10 concentration levels.

Wang, McGibbon [70] combined CNN with the “parallelized large-eddy simulation model” (PALM) to enhance the forecast accuracy of urban air quality, in which the model was used to predict the spatial distribution of CO concentration. Performance metrics were based on R2 and RMSE. Zareba, Cogiel [71] attempted to understand the impact of energy transformation on the environment using different ML techniques such as regression models, deep neural networks, and ensemble learning. These ML techniques predicted air pollution by considering spatial factors, using data from different locations collected with low-cost neighboring sensors. Their findings suggest that transitioning from coal to more sustainable energy sources was key in improving air quality indicators such as PMx. They further stated that the best models to predict rare smog events were linear ML models.

Chang, Abimannan [72] suggested that ML techniques such as Gradient Boosted Tree Regression (GBTR), SVM-based regression (that is, SVR), and deep learning models, such as LSTM, are the most promising approaches to enhancing the prediction performance of air quality models. Their method included several forecasting models to create a framework based on the TensorFlow deep learning and Spark+Hadoop machine learning for air prediction.

Karthikeyan, Jenefa [73] introduced a multi-model machine learning framework integrating sophisticated algorithms such as LSTM, gated recurrent unit (GRU), CNN, and an ensemble model. The ensemble model demonstrated exceptional performance in mapping the intricacies of air quality fluctuations. To capture the complex hourly behavior of PM2.5 concentrations, a CNN model was used to perform cross-domain and time series analyses on the characteristics of meteorological and gaseous pollutants from multiple monitoring stations [74]. A Gaussian-weighted parameter was then used to determine the relationship between regional and neighboring monitoring stations, after which the LSTM model was used to extract the temporal features of PM2.5 concentration levels. Similarly, Cican, Buturache [75] applied LSTM and gated recurrent unit (GRU) models to predict NO₂ concentration levels and also recorded the same prediction performance. Their approach was aimed at addressing the challenges posed by increased population density and difficult meteorological conditions.
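One common way to express a Gaussian weighting between a target station and its neighbors is to let the weight decay with the squared distance. The sketch below illustrates that general idea only; the exact formulation in [74] is not reproduced here, and the distances, readings, and the bandwidth `sigma_km` are hypothetical.

```python
import math


def gaussian_weights(distances_km, sigma_km):
    """Normalized Gaussian weights: nearer stations receive larger weights."""
    raw = [math.exp(-(d ** 2) / (2 * sigma_km ** 2)) for d in distances_km]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_neighbor_feature(values, distances_km, sigma_km=10.0):
    """Distance-weighted average of neighboring station readings."""
    weights = gaussian_weights(distances_km, sigma_km)
    return sum(w * v for w, v in zip(weights, values))


# Hypothetical PM2.5 readings at three neighboring stations and their
# distances (km) to the target station.
neighbor_pm25 = [35.0, 60.0, 90.0]
neighbor_distance = [2.0, 8.0, 25.0]

print(gaussian_weights(neighbor_distance, sigma_km=10.0))
print(weighted_neighbor_feature(neighbor_pm25, neighbor_distance))
```

The bandwidth controls how quickly the influence of distant stations vanishes. A small value makes the feature depend almost entirely on the nearest station, while a large value approaches a plain average.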

Sasaki, Harada [76] noted that most research is focused on monitored locations or cities, neglecting unmonitored cities which also contribute to air pollution. Their approach leverages the use of ML techniques and expert opinions to correlate air qualities from both monitored and unmonitored locations. In the outcome, a neural network-based AIREX was proposed. Subsequently, the attention technique was employed to compute the impact of making inferences between monitored and unmonitored cities in a bid to improve the model’s performance. Gladkova and Saychenko [77] examined the performance of LSTM, Facebook Prophet (or Prophet FB), and ARIMA in predicting the most harmful pollutant, particulate matter. They suggested the possibility of predicting the average values of pollutant levels in advance; however, a lack of correct data hampered the accurate prediction of particulate matter concentration using the models they examined [77].

Neo, Hasikin [78] proposed an air quality prediction model for smart city applications using four machine learning methods (AdaBoost, SVR, RF, KNN) and two deep learning approaches (MLP Regressor and LSTM). The model considered data on air quality for various gases, such as CO, O₃, PM2.5, and PM10. Furthermore, meteorological data were considered focusing on wind direction, wind speed, and humidity. Afterward, the impact of these variables on the pollutants was measured, in which LSTM was better in forecasting PM10 and PM2.5 concentrations, with strong R2 values across all four research zones. It further concluded that PM2.5, PM10, NO₂, humidity, and wind speed are important features to monitor.

Kumbalaparambi, Menon [79] utilized a qualitative method to track air pollution in a city using Twitter conversations about air quality. Afterward, a self-attention network-based ML technique was used to extract tweets, in which an embedding layer was employed in the first layer of a multilayer classification model. Subsequently, a bi-directional long short-term memory (BiLSTM) layer was used in the second layer. The ‘spaCy’ similarity analysis of classified tweets and data taken from “Continuous Ambient Air Quality Monitoring Stations” (CAAQMS) were then used to develop a method for calculating the PM2.5 concentration from the tweets. The outcome showed that the estimation accuracy was highest under extreme circumstances (very good or severe air quality) and lowest under mild fluctuations in air quality.

Machine learning techniques have been utilized in numerous air quality prediction studies to find patterns and rules in the data. However, Chiang, Wang [80] pointed out that these studies typically used prediction intervals, frequently measured in hours, which are not appropriate for minute-by-minute air quality forecasts. Consequently, two recurrent neural network (RNN) models, long short-term memory (LSTM) and gated recurrent unit (GRU), were the underlying models of their proposed deep learning-based multi-timestamp multi-location system for predicting the PM2.5 concentration levels. The OVMS is one of the models that uses the Internet of Things (IoT) to allow real-time data sensing, collection, and wireless transmission. Compared to the GRU-based prediction model, the LSTM-based prediction model has higher accuracy and fewer errors.

Wang, Yuan [81] attempted to train estimation models to comprehend the intricate interactions between surface CO and multi-source data. While deep neural networks, Light Gradient Boosting Machines, and ensemble learning techniques such as Deep Forest (DF) were used, the outcome suggested that DF was superior. Again, when compared to the “Goddard Earth Observing System Composition Forecasting” (GEOS-CF), DF performed noticeably better. The proposed model was evaluated with R and RMSE.

The sparse distribution of air monitoring stations away from observation locations in large cities makes it challenging to make any long-term accurate prediction because of the dynamic nature of air pollutants [82]. While low-cost sensors were introduced to collect data on pollutants, advanced deep CNN was used to simultaneously extract the temporal and spatial features from the observed dataset, leading to a significant improvement in the air quality prediction outcome. Table 5 shows the best-performing models, research focus, and performance metrics.

Deep learning models have been used for complexities in spatiotemporal data through hyper-parameter settings, as indicated by Drewil and Al-Bahadili [14]. Also, machine learning methods like LR and SVM have been used to address the hyper-parameter setting, as indicated by Kunnathettu and Varma [47]. Highlighting the role of hyper-parameterization in spatiotemporal data analyses for air pollutant concentration prediction is crucial in the attempt to address the nonlinearity in spatiotemporal data. It is important to ensure proper and efficient hyper-parameter tuning of SVM, which can be achieved with the use of methods based on randomization, grid search, and nature-inspired approaches, among others [83]. Although this study highlights deep learning as the most attractive model for spatiotemporal data analysis, the computational demand is a critical issue. In the context of computational demand in using machine learning models, William, Paithankar [46] suggested that RF Regression performs better in processing different datasets and their respective location information. The scalability of these computational models to data continues to be open research due to the use of low-cost sensor devices in spatiotemporal data capturing. It also has been highlighted that different scenarios or research focuses revealed the use of different models, and while those that require real-time data capturing from multiple sources have also been highlighted, model hybridization continues to be an open research issue. This review also highlighted the use of ML techniques in both determining the AQI and predicting air pollutant concentration. It also demonstrates that the performance metrics largely used are the RMSE, R2, MAE, F1 score, precision, and recall. Furthermore, the review also highlighted the different performance evaluation results for the models. 
For instance, in predicting the PM10 and PM2.5 for the four cities in Malaysia, the LSTM model had different performance evaluation results, in which the R2 values were all above 91% [78]. When the LSTM model was hybridized [68], the MAPE value of the DRL-LSTM was 32.45, which was the minimum among the comparative models. Again, the hybrid model (LSTM+GA) in [14] showed the best RMSE value of 9.58. These differences in performance results show the impact of models in different research. While these highlights are necessary, assessing the effectiveness of action plans to reduce the concentration of air pollutants provides clear policy interventions [84]. While acknowledging the use of devices in data collection to address the significant data challenges associated with air quality prediction, the use of transfer learning, federated learning, or the use of synthetic data for model training is also a key consideration. As highlighted by Pant, Sharma [54], insufficient data in training models can limit the accuracy of air quality prediction. Hashemi and Karimi [28] highlighted the use of small data samples for model training and the use of large weight parameters to reduce the model’s training time for better accuracy.

This research highlights a very significant issue regarding research data collection from online repositories. While some research has identified reliance on ChatGPT [20], our research highlights the need for expert knowledge in filter selection on online repositories to reduce the bias in the data collection of the extant literature.

This review also acknowledges methodological challenges in terms of data collection sources, as there are equally relevant materials on other online data sources. Again, the choice and use of keywords for the initial search and the creation of subsequent filter keywords limit the collection of other relevant research articles. While recognizing these methodological limitations, the findings of the review address both policy and practice. There should be a clear-cut policy on the use of models such as transfer learning, federated models, and many others to support the data limitations in air quality predictions. Practically, these reviewed models with their respective performance accuracy can provide accurate predictions on future air pollutants. This calls for the implementation of more such models in real-time data analysis and prediction of air pollutants. Again, researchers should also look into model hybridization to address the limitations of spatiotemporal data analysis.

Table 6 provides a summary of observation data, study periods, and the computational domain. It is observed that different computational domains were utilized, in which the study periods were within the short and long term.

From Table 6, it can be observed that different authors used different observation data for their study. Again, different computational models were also used. For example, the computational model includes the use of statistical models and TROPOMI instruments. Some of the major conclusions include the impact of local pollution on regional pollution data [92]. Furthermore, military activities and fires also increase O₃ concentration pollution [91].

4. Conclusions and Future Directions

This review shed light on prospects in combining machine learning techniques to solve the problem of the predictive accuracy of air quality prediction models. Again, machine learning techniques have been leveraged to understand spatiotemporal data and estimate pollutant concentration.

Existing methods or models such as the BPNN [49], HGBoost [40], RF, and CWT [50] have been applied to address the data limitation in PM2.5 concentration prediction. The RF model has also been applied to address the execution time of models with an accuracy of 91.43% [45] and processing time in predicting PM2.5 [46]. Furthermore, the RF model has been applied in predicting air pollutants (such as NO₂, PM2.5) [34], CO₂ [35], sulfate [37], and many more. The DT model demonstrated a high-performance value of 100% [53] and 91.78% [54] in predicting the air quality of pollutants. The k-NN model also showed 100% accuracy in predicting the AQI [55]. The RF had an accuracy of 99% and ensemble models had an accuracy of 99% in predicting air quality [59]. While combining several machine learning techniques, the processing time should also be accounted for. Currently, this review identified random forest regression as having less processing time, which was experimented on Apache Spark, to forecast air pollution in different datasets from different locations. Deep learning techniques (e.g., CNN, LSTM, etc.) and machine learning techniques (e.g., SVR) have also been hybridized to create a new LSTM-SVR model to predict pollution in areas of high levels of air pollution.

Future directions should explore the scalability of best-performing models and the use of transfer learning or federated learning to address data limitation issues in model training for spatiotemporal data analysis and prediction of air quality. Again, a future research direction could focus on the use of hyper-parameter tuning approaches, such as nature-inspired methods including genetic algorithms. A future research direction could apply the best-performing models (such as RF, DT, LSTM hybrid model (LSTM-SVR)) highlighted in the review to analyze the impact of military activities and fire on air quality in war zones.

This review highlighted research gaps in developing models for air quality prediction. It further contributes to knowing some best-performing models in deep learning and machine learning for air quality prediction. Also, some challenges of these models are highlighted. The implication of the findings serves as a guide to practitioners in developing predictive air models. Again, it implies that legacy-based air prediction applications that were developed using machine learning models could use these findings to enhance their air quality prediction systems.

Author Contributions

Conceptualization, I.E.A. and I.C.O.; methodology, I.E.A.; data curation, I.E.A. and I.C.O.; writing—original draft preparation, I.E.A. and I.C.O.; writing—review and editing, I.C.O.; supervision, I.C.O. All authors have read and agreed to the published version of the manuscript.

Funding

This work is funded by the Center for Global Change, Sol Plaatje University, with National Research Foundation (NRF) (Number: 136097).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

AdaBoost	Adaptive Boosting
AQI	Air quality index
AQP	Air Quality Prediction
BPNN	Back Propagation Neural Network
MLP	Artificial Neural Network—Multilayer Perceptron
ANN	Artificial neural networks
ARIMA	Autoregressive Integrated Moving Average
BiLSTM	Bi-directional long-short-term memory layer
CR	Catboost Regression
R2	Coefficient of determination
CAQI	Common Air Quality Index
CMAQ	Community Multiscale Air Quality model
CWT	Concentration weight trajectory
CAAQMS	Continuous Ambient Air Quality Monitoring Stations
CNN	Convolutional neural network
DT	Decision Tree
DTR	Decision Tree Regression
DF	Deep Forest
DRL	Deep Reinforcement Learning
DTW	Dynamic Time Warping
EBAT	Ensemble bagged tree
EBOT	Ensemble boosted tree
ETR	Extra Trees Regression
XGBoost	eXtreme Gradient Boosting
ERT	Extremely Randomized Tree
ELM	Extreme learning machine
FDT	Fine Decision Tree
GEE	Generalized Estimating Equation
GPSTI	Generalized Probabilistic Standardized Temperature Index
GA	Genetic algorithm
GEOS-CF	Goddard Earth Observing System Composition Forecasting
GBTR	Gradient Boosted Tree Regression
GAST	Graph Attention-based Spatial-Temporal model
HGBoost	Histogram-based Gradient Boost
k-NN	k-Nearest Neighbor
LUR	Land Use Regression
LGBM or LightGBM	Light Gradient Boosting Machine
LR	Logistic Regression
LSTM	Long short-term memory
RNN	Recurrent neural network
ML	Machine Learning
MAE	Mean Absolute Error
MSE	Mean Squared Error
ME	Microenvironment Exposure
MODIS	Moderate Resolution Imaging Spectroradiometer
MP	Multilayer Perceptron
ML	Multilinear Regression
MISR	Multi-angle Imaging SpectroRadiometer
MLR	Multiple Linear Regression
NSE	Nash-Sutcliffe Efficiency coefficient
NN	Neural Network
OST	Optimal Stopping Theory
PALM	Parallelized large-eddy simulation model
PVE	Probabilistic Voting Ensemble
RF	Random Forest
RMSE	Root Mean Square Error
SOM	Self-Organizing Map
SKATER	Spatial K’luster Analysis by Tree Edge Removal
SMLT	Supervised machine learning technique
SVR	Support Vector Regression
SLR	Simple Linear Regression
GCNs	Transformers and Graph Convolutional Networks
WLSTME	Weighted Long Short-Term Memory Extended model
TBATS	Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend and Seasonal components
TROPOMI	TROPOspheric Monitoring Instrument

Appendix A

Table A1. Queries and ChatGPT-generated keyword responses.

Queries	ChatGPT-Generated Keywords
Query 1	“Machine Learning”, “Spatiotemporal”, “Air Quality Prediction”, “Atmospheric Modeling”, “Time Series Analysis”, “Spatial Interpolation”, “Feature Engineering”, “Ensemble Methods”, “Deep Learning”, “Convolutional Neural Networks (CNNs)”, “Recurrent Neural Networks (RNNs)”, “Long Short-Term Memory (LSTM)”, “Gated Recurrent Units (GRUs)”, “Support Vector Machines (SVMs)”, “Random Forests”, “Gradient Boosting Machines (GBMs)”, “Gaussian Processes”, “Kriging”, “Data Fusion”, “Model Evaluation”
Query 2	“Air quality prediction”, “Spatiotemporal modeling”, “Machine learning algorithms”, “Atmospheric data analysis”, “Time series forecasting”, “Spatial interpolation methods”, “Feature extraction and selection”, “Sensor networks”, “Data fusion techniques”, “Deep learning architectures”, “Convolutional neural networks (CNN)”, “Recurrent neural networks (RNN)”, “Long short-term memory (LSTM)”, “Support vector machines (SVM)”, “Random forests”, “Gradient boosting methods”, “Ensemble learning approaches”, “Model evaluation metrics”, “Uncertainty quantification”, “Urban air pollution”
Query 3	“Air quality prediction”, “Spatiotemporal modeling”, “Machine learning algorithms”, “Atmospheric data analysis”, “Time series forecasting”, “Spatial interpolation methods”, “Feature extraction”, “Feature selection”, “Sensor networks”, “Data fusion techniques”, “Deep learning architectures”, “Convolutional neural networks (CNN)”, “Recurrent neural networks (RNN)”, “Long short-term memory (LSTM)”, “Support vector machines (SVM)”, “Random forests”, “Gradient boosting methods”, “Ensemble learning approaches”, “Model evaluation metrics”, “Uncertainty quantification”, “Urban air pollution”, “Environmental monitoring”, “Satellite data integration”, “Hybrid models”, “Transfer learning”
Query 4	“Air quality prediction”, “Spatiotemporal modeling”, “Machine learning algorithms”, “Atmospheric data analysis”, “Time series forecasting”, “Spatial interpolation techniques”, “Feature engineering”, “Feature selection”, “Sensor networks”, “Data fusion methods”, “Deep learning architectures”, “Convolutional neural networks (CNN)”, “Recurrent neural networks (RNN)”, “Long short-term memory (LSTM)”, “Support vector machines (SVM)”, “Random forests”, “Gradient boosting methods”, “Ensemble learning techniques”, “Model evaluation metrics”, “Uncertainty estimation”, “Urban air pollution monitoring”, “Satellite data integration”, “Hybrid models”, “Transfer learning applications”, “Environmental health impacts”
Query 5	“Air quality prediction”, “Spatiotemporal modeling”, “Machine learning algorithms”, “Atmospheric data analysis”, “Time series forecasting”, “Spatial interpolation”, “Feature engineering”, “Feature selection”, “Sensor networks”, “Data fusion”, “Deep learning”, “Convolutional neural networks (CNN)”, “Recurrent neural networks (RNN)”, “Long short-term memory (LSTM)”, “Support vector machines (SVM)”, “Random forests”, “Gradient boosting”, “Ensemble learning”, “Model evaluation”, “Uncertainty quantification”, “Urban air pollution”, “Environmental monitoring”, “Satellite data integration”, “Hybrid models”, “Transfer learning”
Query 6	Spatiotemporal Data: “Spatial interpolation”, “Temporal interpolation”, “Data fusion” Machine Learning Techniques: “Supervised learning”, “Unsupervised learning”, “Semi-supervised learning”, “Deep learning”, “Ensemble methods” Air Quality Parameters: “PM2.5”, “PM10”, “Ozone (O₃)”, “Nitrogen dioxide (NO₂)”, “Sulfur dioxide (SO₂)”, “Carbon monoxide (CO)” Feature Engineering: “Meteorological data”, “Geographic data”, “Land use data”, “Satellite imagery” Evaluation Metrics: “Mean Absolute Error (MAE)”, “Root Mean Squared Error (RMSE)”, “Mean Absolute Percentage Error (MAPE)”, “R-squared (R^2)”, “F1-score (for classification tasks)”

References

1. Zhu, S.; Xu, J.; Fan, M.; Yu, C.; Letu, H.; Zeng, Q.; Zhu, H.; Wang, H.; Wang, Y.; Shi, J. Estimating Near-Surface Concentrations of Major Air Pollutants From Space: A Universal Estimation Framework LAPSO. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4101011.
2. Mumtaz, R.; Amin, A.; Khan, M.A.; Asif, M.D.A.; Anwar, Z.; Bashir, M.J. Impact of Green Energy Transportation Systems on Urban Air Quality: A Predictive Analysis Using Spatiotemporal Deep Learning Techniques. Energies 2023, 16, 6087.
3. Wang, J.; Lei, Y.; Chen, Y.; Wu, Y.; Ge, X.; Shen, F.; Zhang, J.; Ye, J.; Nie, D.; Zhao, X. Comparison of active and passive sampling methods for air pollutants in urban environments. Environ. Sci. Pollut. Res. Int. 2020, 27, 33173–33183.
4. Khokhar, M.; Abid, M.; Zahid, M. Air pollution and its impact on agriculture: A review. Environ. Sci. Pollut. Res. 2016, 23, 1703–1715.
5. Ali, M.; Athar, M.A.; Ashfaq, M.; Tariq, M.A. Air Pollution and Human Health: A Review. Environ. Pollut. 2015, 207, 427–438.
6. Matooane, M.; John, J.; Oosthuizen, R.; Binedell, M. Vulnerability of South African communities to air pollution. In Proceedings of the 8th World Congress on Environmental Health; Document Transformation Technologies Organised by SB Conferences, Durban, South Africa, 22–27 February 2004.
7. Zhan, Y.; Li, X.; Li, G.; Li, Z. Deep learning for source identification of ambient air pollutants using air quality and meteorological data. Environ. Sci. Pollut. Res. 2021, 28, 14380–14391.
8. Kaur, M.; Singh, D.; Jabarulla, M.Y.; Kumar, V.; Kang, J.; Lee, H.N. Computational deep air quality prediction techniques: A systematic review. Artif. Intell. Rev. 2023, 56, S2053–S2098.
9. Gugnani, V.; Singh, R.K. Analysis of deep learning approaches for air pollution prediction. Multimed. Tools Appl. 2022, 81, 6031–6049.
10. Doush, I.A.; Sultan, K.; Alsaber, A.; Alkandari, D.; Abdullah, A. Improving Neural Network Using Jaya Algorithm with Opposite Learning for Air Quality Prediction. In Lecture Notes in Networks and Systems; Springer: Cham, Switzerland, 2024.
11. South Africa. Government Gazette: National Environment Management: Air Quality Act, 2004; Parliament of the Republic of South Africa: Cape Town, South Africa, 2005; Volume 476, pp. 1–29.
12. Environment, A. Environmental Monitoring. 2024. Available online: https://apexenviro.co.za/services/environmental-monitoring/ (accessed on 20 August 2024).
13. Wang, J.; Song, G. A Deep Spatial-Temporal Ensemble Model for Air Quality Prediction. Neurocomputing 2018, 314, 198–206.
14. Drewil, G.I.; Al-Bahadili, R.J. Air pollution prediction using LSTM deep learning and metaheuristics algorithms. Sensors 2022, 24, 100546.
15. Morapedi, T.D.; Obagbuwa, I.C. Air pollution particulate matter (PM2.5) prediction in South African cities using machine learning techniques. Front. Artif. Intell. 2023, 6, 1230087.
16. Cosemans, G.; Janssen, S.; Panis, L.I.; Mishra, V. A Comparison of Linear Regression, Regularization, and Machine Learning Algorithms to Develop Europe-wide Spatial Models of Fine Particles and Nitrogen Dioxide. Atmosphere 2021, 12, 798.
17. Lin, G.Y.; Chen, H.W.; Chen, B.J.; Chen, S.C. A machine learning model for predicting PM2.5 and nitrate concentrations based on long-term water-soluble inorganic salts datasets at a road site station. Chemosphere 2022, 289, 133123.
18. Choi, S.; Lee, Y.; Kim, E.; Jeong, I.; Kim, H.; Kim, K.H. Mapping urban air quality using mobile sampling with low-cost sensors and machine learning in Seoul, South Korea. Environ. Pollut. 2021, 275, 116586.
19. Dragomir, E.G.; Oprea, M. Air Quality Forecasting by Using Nonlinear Modeling Methods. In International Conference on Nonlinear Dynamics of Electronic Systems; Springer International Publishing: Cham, Switzerland, 2014.
20. Masinde, M. Enhancing Systematic Literature Reviews using LDA and ChatGPT: Case of Framework for Smart City Planning. In Proceedings of the 2024 IST-Africa Conference (IST-Africa), Dublin, Ireland, 20–24 May 2024; IEEE Xplore: New York, NY, USA, 2024.
21. Arunarani, A.R.; Selvanayaki, S.; Saleh Al Ansari, M.; Ala Walid, M.A.; Devireddy, N.; Keerthi, M.M. Crop Yield Prediction Using Spatio Temporal CNN and Multimodal Remote Sensing. In Proceedings of the 2nd International Conference on Edge Computing and Applications, ICECAA 2023, Namakkal, India, 19–21 July 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023.
22. Brayshaw, G.; Ward-Cherrier, B.; Pearson, M. Temporal and Spatio-temporal domains for Neuromorphic Tactile Texture Classification. In Proceedings of the ACM International Conference Proceeding Series, Online, 28 March–1 April 2022; Association for Computing Machinery: New York, NY, USA, 2022.
23. Broumand, M.; Asgarian, A.; Bussmann, M.; Chattopadhyay, K.; Thomson, M.J. Spatio-temporal dynamics and disintegration of a fan liquid sheet. Phys. Fluids 2021, 33, 112109.
24. Mohammadi, F.; Teiri, H.; Hajizadeh, Y.; Abdolahnejad, A.; Ebrahimi, A. Prediction of atmospheric PM2.5 level by machine learning techniques in Isfahan, Iran. Sci. Rep. 2024, 14, 2109.
25. Kelly, J.T.; Jang, C.; Zhu, Y.; Long, S.; Xing, J.; Wang, S.; Murphy, B.N.; Pye, H.O.T. Predicting the nonlinear response of PM2.5 and ozone to precursor emission changes with a response surface model. Atmosphere 2021, 12, 1044.
26. Alsaedi, A.S.; Liyakathunisa, L. Spatial and temporal data analysis with deep learning for air quality prediction. In Proceedings of the International Conference on Developments in eSystems Engineering, DeSE, Kazan, Russia, 7–10 October 2019; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2019.
27. Awan, W.B.; Batool, A.; Ali, Z.; Xu, Z.; Niaz, R.; Sammen, S.S. A Unified procedure for the probabilistic assessment and forecasting temperature characteristics under global climate change. Environ. Dev. Sustain. 2024, 1–25.
28. Hashemi, M.; Karimi, H.A. Weighted Machine Learning for Spatial-Temporal Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 3066–3082.
29. Shafi, M. Utilizing Spatio-temporal Satellite Images and Machine Learning to Examine the Impacts of Changing Land Use and Land Cover on Sohar City. In Proceedings of the 2023 24th International Arab Conference on Information Technology, ACIT, Ajman, United Arab Emirates, 6–8 December 2023.
30. Yang, C.; Du, W.J.; He, R.L.; Hu, Y.R.; Liu, H.; Huang, T.; Li, W.W. Spatiotemporal Patterns of Methane and Nitrous Oxide Emissions in China’s Inland Waters Identified by Machine Learning Technique. ACS ES T Water 2024, 4, 936–947.
31. Iskandaryan, D.; Ramos, F.; Trilles, S. Air quality prediction in smart cities using machine learning technologies based on sensor data: A review. Appl. Sci. 2020, 10, 2401.
32. Soh, P.-W.; Chang, J.-W.; Huang, J.-W. Adaptive Deep Learning-Based Air Quality Prediction Model Using the Most Relevant Spatial-Temporal Relations. IEEE Access 2018, 6, 38186–38199.
33. Power, M.; Cascio, K.; Adgate, A. Air pollution and cardiovascular disease: A window of opportunity. Curr. Opin. Cardiol. 2018, 33, 578–584.
34. Khan, J.; Ellermann, T.; Hertel, O. Predicting Hourly Street-Scale NO₂ and PM2.5 Concentrations Using Machine Learning at One of the Danish Traffic Hotspots. In Springer Proceedings in Complexity; Springer International Publishing: Cham, Switzerland, 2022.
35. Park, C.; Jeong, S.; Kim, C.; Shin, J.; Joo, J. Machine learning based estimation of urban on-road CO₂ concentration in Seoul. Environ. Res. 2023, 231, 116256.
36. Tu, Q.; Hase, F.; Chen, Z.; Schneider, M.; García, O.; Khosrawi, F.; Chen, S.; Blumenstock, T.; Liu, F.; Qin, K.; et al. Estimation of NO₂ emission strengths over Riyadh and Madrid from space from a combination of wind-Assigned anomalies and a machine learning technique. Atmos. Meas. Tech. 2023, 16, 2237–2262.
37. Meng, X.; Hang, Y.; Lin, X.; Li, T.; Wang, T.; Cao, J.; Fu, Q.; Dey, S.; Huang, K.; Liang, F.; et al. A satellite-driven model to estimate long-term particulate sulfate levels and attributable mortality burden in China. Environ. Int. 2023, 171, 107740.
38. Choi, H.; Park, S.; Kang, Y.; Im, J.; Song, S. Retrieval of hourly PM2.5 using top-of-atmosphere reflectance from geostationary ocean color imagers I and II. Environ. Pollut. 2023, 323, 121169.
39. Ezhilkumar, M.R.; Karthikeyan, S.; Chalishajar, D.N.; Ramesh, R. Air quality prediction using machine learning techniques for intelligent monitoring systems. In Industry Automation: The Technologies, Platforms and Use Cases; River Publishers: Gistrup, Denmark, 2024; pp. 223–235.
40. Mathew, A.; Gokul, P.R.; Raja Shekar, P.; Arunab, K.S.; Ghassan Abdo, H.; Almohamad, H.; Abdullah Al Dughairi, A. Air quality analysis and PM2.5 modelling using machine learning techniques: A study of Hyderabad city in India. Cogent Eng. 2023, 10, 2243743.
41. Xiong, K.; Xie, X.; Huang, L.; Hu, J. Improved O₃ predictions in China by combining chemical transport model and multi-source data with machining learning techniques. Atmos. Environ. 2024, 318, 120269.
42. Sharma, G.; Khurana, S.; Saina, N.; Gupta, G. Comparative Analysis of Machine Learning Techniques in Air Quality Index (AQI) prediction in smart cities. Int. J. Syst. Assur. Eng. Manag. 2024, 15, 3060–3075.
43. Pandithurai, O.; Bharathiraja, N.; Pradeepa, K.; Meenakshi, D.; Kathiravan, M.; Vinoth Kumar, M. Air Pollution Prediction using Supervised Machine Learning Technique. In Proceedings of the 3rd International Conference on Artificial Intelligence and Smart Energy, ICAIS, Coimbatore, India, 2–4 February 2023.
44. Mahesh Babu, K.; Rene Beulah, J. Air quality prediction based on supervised machine learning methods. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 206–212.
45. Niveshitha, N.; Amsaad, F.; Jhanjhi, N.Z. Air Quality Prediction in Smart Cities Using Cloud Machine Learning. In Proceedings of the 2023 2nd International Conference on Smart Technologies for Smart Nation, SmartTechCon, Singapore, 18–19 August 2023.
46. William, P.; Paithankar, D.N.; Yawalkar, P.M.; Korde, S.K.; Pabale, A.R.; Rakshe, D.S. Divination of Air Quality Assessment using Ensembling Machine Learning Approach. In Proceedings of the International Conference on Artificial Intelligence and Knowledge Discovery in Concurrent Engineering, ICECONF, Chennai, India, 5–7 January 2023.
47. Kunnathettu, A.J.; Varma, S.L. Comparative Analysis of Neural Network and Machine Learning Techniques for Air Quality Prediction. In Proceedings of the IEEE 2020 2nd International Conference on Advances in Computing, Communication Control and Networking, ICACCCN, Greater Noida, India, 18–19 December 2020.
48. Sanjeev, D. Implementation of machine learning algorithms for analysis and prediction of air quality. Int. J. Eng. Res. Technol. 2021, 10, 533–538.
49. Chang, F.J.; Chang, L.C.; Kang, C.C.; Wang, Y.S.; Huang, A. Explore spatio-temporal PM2.5 features in northern Taiwan using machine learning techniques. Sci. Total Environ. 2020, 736, 139656.
50. Ly, B.T.; Matsumi, Y.; Vu, T.V.; Sekiguchi, K.; Nguyen, T.T.; Pham, C.T.; Nghiem, T.D.; Ngo, I.H.; Kurotsuchi, Y.; Nguyen, T.H.; et al. The effects of meteorological conditions and long-range transport on PM2.5 levels in Hanoi revealed from multi-site measurement using compact sensors and machine learning approach. J. Aerosol Sci. 2021, 152, 105716.
51. Zareba, M.; Dlugosz, H.; Danek, T.; Weglinska, E. Big-Data-Driven Machine Learning for Enhancing Spatiotemporal Air Pollution Pattern Analysis. Atmosphere 2023, 14, 760.
52. Li, Y.; Zhang, M.; Ma, G.; Ren, H.; Yu, E. Analysis of Primary Air Pollutants’ Spatiotemporal Distributions Based on Satellite Imagery and Machine-Learning Techniques. Atmosphere 2024, 15, 287.
53. Chakradhar Reddy, K.; Nagarjuna Reddy, K.; Brahmaji Prasad, K.; Rajendran, P.S. The Prediction of Quality of the Air Using Supervised Learning. In Proceedings of the 6th International Conference on Communication and Electronics Systems, ICCES, Coimbatore, India, 8–10 July 2021; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2021.
54. Pant, A.; Sharma, S.; Bansal, M.; Narang, M. Comparative Analysis of Supervised Machine Learning Techniques for AQI Prediction. In Proceedings of the 2022 International Conference on Advanced Computing Technologies and Applications, ICACTA, Coimbatore, India, 4–5 March 2022.
55. Difaizi, T.Z.; Camille, O.P.L.; Nadine, A.; Kalita, S.; Kumar, A.; Rakesh, N. Comparative analysis of machine learning techniques for air quality index prediction for Indian cities. AIP Conf. Proc. 2023, 2917, 050014.
56. Alam, B.; Hussain, A.; Fayaz, M. An Effective Approach for Air Quality Prediction in Bishkek Based on Machine Learning techniques. In Proceedings of the ACM International Conference Proceeding Series, Istanbul, Turkey, 13–15 October 2023; Association for Computing Machinery: New York, NY, USA, 2023.
57. Almaliki, A.H.; Derdour, A.; Ali, E. Air Quality Index (AQI) Prediction in Holy Makkah Based on Machine Learning Methods. Sustainability 2023, 15, 3168.
58. Xiang, X.; Fahad, S.; Han, M.S.; Naeem, M.R.; Room, S. Air quality index prediction via multi-task machine learning technique: Spatial analysis for human capital and intensive air quality monitoring stations. Air Qual. Atmos. Health 2023, 16, 85–97.
59. Džaferović, E.; Karaduzović-Hadžiabdić, K. Air Quality Prediction Using Machine Learning Methods: A Case Study of Bjelave Neighborhood, Sarajevo, BiH. In Lecture Notes in Networks and Systems; Springer International Publishing: Cham, Switzerland, 2021.
60. Mahalingam, U.; Elangovan, K.; Dobhal, H.; Valliappa, C.; Shrestha, S.; Kedam, G. A machine learning model for air quality prediction for smart cities. In Proceedings of the 2019 International Conference on Wireless Communications, Signal Processing and Networking, WiSPNET, Chennai, India, 21–23 March 2019.
61. Raut, A.R.; Kharade, H.B.; Nashikkar, R.N.; Padhye, Y.N. Harnessing Machine Learning for Predictive Analysis of Air Quality in Pune City: A Comparative Study. In Proceedings of the 2023 IEEE Pune Section International Conference, PuneCon, Pune, India, 14–16 December 2023.
62. Hardini, M.; Chakim, M.H.R.; Magdalena, L.; Kenta, H.; Rafika, A.S.; Julianingsih, D. Image-based Air Quality Prediction using Convolutional Neural Networks and Machine Learning. APTISI Trans. Technopreneurship 2023, 5, 109–123.
63. Madan, T.; Sagar, S.; Virmani, D. Air Quality Prediction using Machine Learning Algorithms-A Review. In Proceedings of the IEEE 2020 2nd International Conference on Advances in Computing, Communication Control and Networking, ICACCCN, Greater Noida, India, 18–19 December 2020.
64. Le, Q.V. A Tutorial on Deep Learning Part 1: Nonlinear Classifiers and the Backpropagation Algorithm. Stanford University: Mountain View, CA, USA, 2015; p. 7.
65. Aruna Kumari, N.S.; Ananda Kumar, K.S.; Hitesh Vardhan Raju, S.; Vasuki, H.R.; Nikesh, M.P. Prediction of Air Quality in Industrial Area. In Proceedings of the 5th IEEE International Conference on Recent Trends in Electronics, Information and Communication Technology, RTEICT, Bangalore, India, 12–13 November 2020.
66. Zhang, B.; Duan, M.; Sun, Y.; Lyu, Y.; Hou, Y.; Tan, T. Air Quality Index Prediction in Six Major Chinese Urban Agglomerations: A Comparative Study of Single Machine Learning Model, Ensemble Model, and Hybrid Model. Atmosphere 2023, 14, 1478.
67. Shrivastava, A.L.; Dwivedi, R.K. Air Quality Prediction Using Supervised Machine Learning Techniques. In Smart Innovation, Systems and Technologies; Springer Nature Singapore: Singapore, 2023.
68. Sun, C.; Li, J.; Sulaiman, R.; Alotaibi, B.S.; Elattar, S.; Abuhussain, M. Air Quality Prediction and Multi-Task Offloading based on Deep Learning Methods in Edge Computing. J. Grid Comput. 2023, 21, 32.
69. Lin, H.X.; Jin, J.; Van Den Herik, J. Air quality forecast through integrated data assimilation and machine learning. In Proceedings of the ICAART 2019—Proceedings of the 11th International Conference on Agents and Artificial Intelligence, Prague, Czech Republic, 19–21 February 2019.
70. Wang, S.; McGibbon, J.; Zhang, Y. Predicting high-resolution air quality using machine learning: Integration of large eddy simulation and urban morphology data. Environ. Pollut. 2024, 344, 123371.
71. Zareba, M.; Cogiel, S.; Danek, T.; Weglinska, E. Machine Learning Techniques for Spatio-Temporal Air Pollution Prediction to Drive Sustainable Urban Development in the Era of Energy and Data Transformation. Energies 2024, 17, 2738.
72. Chang, Y.S.; Abimannan, S.; Chiao, H.T.; Lin, C.Y.; Huang, Y.P. An ensemble learning based hybrid model and framework for air pollution forecasting. Environ. Sci. Pollut. Res. 2020, 27, 38155–38168.
73. Karthikeyan, I.; Jenefa, A.; Edward Naveen, V.; Santhiya, P.; Sangeetha, R.; Lincy, A. EcoAI Forecast: Multi-Model Air Quality Prediction in Patancheru. In Proceedings of the 2024 International Conference on Recent Advances in Electrical, Electronics, Ubiquitous Communication, and Computational Intelligence, RAEEUCCI, Chennai, India, 17–18 April 2024.
74. Zhang, Z.; Ren, J.; Chang, Y. Improving Intra-Urban Prediction of Atmospheric Fine Particles Using a Hybrid Deep Learning Approach. Atmosphere 2023, 14, 599.
75. Cican, G.; Buturache, A.N.; Mirea, R. Applying Machine Learning Techniques in Air Quality Prediction—A Bucharest City Case Study. Sustainability 2023, 15, 8445.
76. Sasaki, Y.; Harada, K.; Yamasaki, S.; Onizuka, M. AIREX: Neural Network-based Approach for Air Quality Inference in Unmonitored Cities. In Proceedings of the IEEE International Conference on Mobile Data Management, Paphos, Cyprus, 6–9 June 2022.
77. Gladkova, E.; Saychenko, L. Applying machine learning techniques in air quality prediction. Transp. Res. Procedia 2022, 63, 1999–2006.
78. Neo, E.X.; Hasikin, K.; Lai, K.W.; Mokhtar, M.I.; Azizan, M.M.; Hizaddin, H.F.; Razak, S.A. Artificial intelligence-assisted air quality monitoring for smart city management. PeerJ Comput. Sci. 2023, 9, e1306.
79. Kumbalaparambi, T.S.; Menon, R.; Radhakrishnan, V.P.; Nair, V.P. Assessment of urban air quality from Twitter communication using self-attention network and a multilayer classification model. Environ. Sci. Pollut. Res. 2023, 30, 10414–10425.
80. Chiang, Y.L.; Wang, J.C.; Lee, M.H.; Liu, A.C.; Jiang, J.A. Deep-Learning-Based Multi-Timestamp Multi-Location PM2.5 Prediction: Verification by Using a Mobile Monitoring System with an IoT Framework Deployed in the Urban Zone of a Metropolitan Area. IEEE Internet Things J. 2024, 11, 8815–8837.
81. Wang, Y.; Yuan, Q.; Li, T.; Zhu, L. Global spatiotemporal estimation of daily high-resolution surface carbon monoxide concentrations using Deep Forest. J. Clean. Prod. 2022, 350, 131500.
82. Huang, G.; Ge, C.; Xiong, T.; Song, S.; Yang, L.; Liu, B.; Yin, W.; Wu, C. Large scale air pollution prediction with deep convolutional networks. Sci. China Inf. Sci. 2021, 64, 192107.
83. Rojas-Domínguez, A.; Padierna, L.C.; Valadez, J.M.C.; Puga-Soberanes, H.J.; Fraire, H.J. Optimal Hyper-Parameter Tuning of SVM Classifiers With Application to Medical Diagnosis. IEEE Access 2017, 6, 7164–7176.
84. Li, R.; Liu, M.; Wang, C.; Dong, L.; Song, Y.; Zhang, J. Assessing the impact of clean air action on air quality trends in Beijing using a machine learning technique. Atmos. Environ. 2020, 231, 117463.
85. Wu, L.; An, J. Quantitative impacts of meteorology and emissions on the long-term trend of O₃ in the YRD, China from 2015 to 2022. J. Environ. Sci. 2025, 149, 314–329.
86. Wu, J.; Xu, L.; Zheng, H.; Cao, X.; Lu, H. Spatiotemporal assessment of evapotranspiration of desert steppe in northern China: A case of Otog Front Banner. In Proceedings of the International Geoscience and Remote Sensing Symposium (IGARSS), Brussels, Belgium, 11–16 July 2021.
87. Ahamad, F.; Griffiths, P.T.; Latif, M.T.; Juneng, L.; Xiang, C.J. Ozone trends from two decades of ground level observation in Malaysia. Atmosphere 2020, 11, 755.
88. Cui, Y.; Zha, H.; Dang, Y.; Qiu, L.; He, Q.; Jiang, L. Spatio-Temporal Heterogeneous Impacts of the Drivers of NO₂ Pollution in Chinese Cities: Based on Satellite Observation Data. Remote Sens. 2022, 14, 3487.
89. Ivanova, N.S.; Kruchenitskii, G.M.; Kuznetsova, I.N.; Demin, V.I.; Lapchenko, V.A. Ozone Content over the Russian Federation in the First Quarter of 2020. Russ. Meteorol. Hydrol. 2020, 45, 447–454.
90. Liu, H.; Song, D.; Huang, F.; Lu, C.; Zhang, X. A Study on the Characteristics of Spatial and Temporal Evolution of Ozone Pollution in Chengdu (2014–2016). In Proceedings of the IOP Conference Series: Earth and Environmental Science, Shanghai, China, 14–16 March 2019.
91. Maidanovych, N.; Khlobystov, D. Assessment of atmospheric CO vertical column density over the Crimean Peninsula (Ukraine) by the TROPOMI instrument on the Sentinel-5 Precursor satellite. In Proceedings of the 17th International Conference Monitoring of Geological Processes and Ecological Condition of the Environment, Monitoring 2023, Kyiv, Ukraine, 7–10 November 2023.
92. Li, H.; Shi, R.; Jin, S.; Wang, W.; Fan, R.; Zhang, Y.; Liu, B.; Zhao, P.; Gong, W.; Zhao, Y. Study of persistent haze pollution in winter over Jinan (China) based on ground-based and satellite observations. Remote Sens. 2021, 13, 4862.

Figure 1. PRISMA flowchart.

Table 1. Similar and dissimilar keywords.

Category 1: Dissimilar Keywords	Category 2: Similar Keywords
“Deep learning”, “Gradient boosting”, “Machine Learning”, “Time Series Analysis”, “Uncertainty estimation”, “PM2.5”, “Nitrogen dioxide (NO₂)”, “Supervised learning”	“Air Quality Prediction”, “Convolutional Neural Networks (CNNs)”, “Environmental monitoring”, “Feature extraction”, “Feature selection”, “Long Short-Term Memory (LSTM)”, “Machine learning algorithms”, “Random forests”, “Support vector machines (SVM)”

Table 2. Subject matter keywords’ outcome.

Subject Matter Expert Keywords
“Air quality index”, “Air quality indices”, “Air pollution”, “Air pollutants”, “Machine learning approaches”, “Machine learning models”, “Machine learning techniques”, “Machine-learning”, “Particulate Matter”, “Supervised Machine learning”, “Spatiotemporal analysis”, “Time series”, “Air monitoring”

Table 3. Highlights of ML models, challenges, and performance metrics.

Author	Challenges/Limitations	Pollutants	Best Model	Performance Metrics	Performance Evaluation Results
Mathew, Gokul [40]	Data limitations	PM2.5	HGBoost	R2, MAE, RMSE	HGBoost: R2 (85.90%), MAE (5.717 μg/m³), and RMSE (7.647 μg/m³). Comparative models: MR and k-NN.
Xiong, Xie [41]	Uncertainties in data collection from multiple sources	O₃	LGBR-CHAP model	Coefficient (R), RMSE	R value of LGBM was 0.84. Comparative models: RF and eXtreme Gradient Boosting.
Pandithurai, Bharathiraja [43]	Air quality	-	DT	Precision, recall, F1 score	Accuracy of DT: 99.88%
Chang, Chang [49]	Data limitations	PM2.5	BPNN	R2, RMSE, NSE	-
Ly, Matsumi [50]	Data limitations	PM2.5	RF and CWT	Averaging hourly concentration	-
Sharma, Khurana [42]	Source of air pollution determination	-	Hybrid (RF and XGBoost) model	Coefficient of determination (R2), MAE, MSE, RMSE	Hybrid model had an R2 value of 95.43%. Comparative models’ R2 values: RF (87.18%), DT (76.39%), LR (83.84%), XGBoost (86.86%).
Niveshitha, Amsaad [45]	Execution time; AQI forecast; air quality prediction	Particulate matter, O₃, NO, NOx, CO, benzene, toluene, SO₂, xylene, NH₃, NO₂	RF regression	MAE, MSE, RMSE, R2 score, and execution time	RF model had an R2 score of 91.43%. Comparative model was DT (R2 score of 83.89%).
Sanjeev [48]	Air pollutants prediction	-	RF algorithm	Recall, precision, F1, specificity	Accuracy of the RF was 99.4%. Comparative models’ accuracy: SVM (93.5%) and ANN (90.4%).
Khan, Ellermann [34]	Emission from vehicles	NO₂, PM2.5	RF	RMSE, R2	RMSE of RF was between 11.2 and 13.5. Comparative models’ RMSE values were ANN (between 13.5 and 14.1) and SVM (between 14.7 and 15.1).
Park, Jeong [35]	Traffic CO₂	CO₂	RF	R2, RMSE	RF model had R2 value of 0.8 and RMSE of 22.9 ppm
Meng, Hang [37]	Monitoring ground-based sulfate	Sulfate	RF	R2	RF model had R2 value of 0.68 at daily level and 0.93 at the monthly level
Kunnathettu and Varma [47]	Air quality prediction with hyper-parameter tuning	PM2.5	LR and SVM	Recall, precision, F1 score, support value	Accuracy of LR was 88.12% and SVM with hyper-parameter tuning was 87.56%. Comparative models’ accuracies were NN (83.70%), RF (83.42%), and SVM (53.31%).
Mahesh Babu and Rene Beulah [44]	Air quality	-	DT	F1 score, precision, and recall	-
William, Paithankar [46]	Processing time and error rate	PM2.5	RF	MAE, RMSE	RMSE of RF varied between 0.05 and 0.18. MAE of RF varied between 6% and 18%. Comparative model was DT (MAE between 8% and 21%; RMSE between 0.06 and 0.24).
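
Several rows of Table 3 list precision, recall, and F1 score as the metrics while the results column reports accuracy. The following sketch, based on the standard confusion-matrix definitions, shows how these quantities relate for a binary labelling such as "unhealthy" versus "good" air. The function name and the counts are hypothetical and are not drawn from any of the reviewed studies.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, specificity and F1 from binary confusion counts."""
    total = tp + fp + fn + tn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    accuracy = (tp + tn) / total
    # Ratios with an empty denominator are reported as 0.0 by convention here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }


# Hypothetical case: 1000 hours, 10 of them unhealthy, classifier predicts "good" every hour.
always_good = classification_metrics(tp=0, fp=0, fn=10, tn=990)

# Hypothetical case: the classifier detects 8 of the 10 unhealthy hours with 2 false alarms.
detector = classification_metrics(tp=8, fp=2, fn=2, tn=988)
```

In the first hypothetical case the accuracy is 99% although no unhealthy hour is detected (recall and F1 are zero), which illustrates why accuracy values are easier to interpret when reported together with precision, recall, and F1.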

Table 4. Machine learning techniques for air quality prediction.

Author	Focus	Best Model	Performance Metrics	Performance Evaluation Results
Chakradhar Reddy, Nagarjuna Reddy [53]	Predict air quality from pollutants; predict AQI	DT	Precision, recall, sensitivity, specificity, F1 score, and accuracy	Accuracy: DT (100%). Comparative models’ accuracy: LR (98%), RF (99%), SVM (70%), k-NN (97%), Naïve Bayes (95%).
Pant, Sharma [54]	Predicted AQI of pollutants (PM10, PM2.5, SO₂, NO₂)	DT classifier	F1 score, precision, and recall	DT (91.78%), comparative model was Logistic Regression (91.78%)
Difaizi, Camille [55]	Prediction of AQI	k-NN	Confusion matrix, F1 score, precision, and recall	Accuracy: k-NN (100%). Comparative models: LR (98.8%), AdaBoost (83%).
Alam, Hussain [56]	Air quality index prediction for pollutants (PM1, PM10, PM2.5)	LightGBM and CatBoost	Classification metrics: accuracy, precision, recall, and F1 score; regression metrics: MSE, MAE, and R2	Regression: CatBoost (R2 score of 85% on PM1 and 95% on PM2.5). Classification accuracy: LightGBM (99.75%), CatBoost (99.5%), k-NN (97.2%), NN (96%).
Xiang, Fahad [58]	AQI	RF and probabilistic voting ensemble	R2, RMSE, and MAE	Comparative models: SLR, SVR, RF, and probabilistic voting ensemble
Hardini, Chakim [62]	AQI; identify the complex correlation between air quality and environmental factors	CNN	Cronbach’s Alpha and composite reliability score	The average variance extracted (AVE) showed a 0.5 threshold, indicating strong validity.
Džaferović and Karaduzović-Hadžiabdić [59]	Prediction of air quality using meteorological features and air pollutant concentration; AQI prediction	Ensemble technique, RF	Regression metrics, such as R-squared and RMSE	Ensemble technique with R2 value of 99% and RMSE of 2.30. RF with R2 score of 99% and RMSE value of 2.58. Comparative models include SVR, MLR, and Multilayer Perceptron.

Table 5. Some best-performing models with performance evaluation approach.

Author	Best-Performing ML Technique	Research Focus	Pollutant	Performance Metrics	Performance Evaluation Results
Wang, Yuan [81]	Deep Forest (DF)	Complex correlation between surface CO and multi-source data	CO	R and RMSE	DF had R/RMSE as (0.73/0.273 ppm and 0.77/0.215 ppm) at daily and monthly scales. Comparative models: Light Gradient Boosting Machine and deep neural network.
Drewil and Al-Bahadili [14]	LSTM + GA	Improper hyper-parameter settings	PM10, PM2.5, CO, NOX	RMSE and MAE	RMSE value of LSTM+GA was 9.58. Comparative models’ RMSE: Bi-LSTM (22.58), C-LSTM (13.97), WLSTEM (40.67).
Lin, Jin [69]	RNN and LSTM	Data-driven for non-dust PM10. Real-time measurement of local emissions is challenging.	PM10	Real observations	-
Sun, Li [68]	Hybrid deep learning model: multi-factor LSTM + DRL	Monitoring air quality from multiple sensor data emanating from multiple dimensions and locations	Hourly PM2.5	Latency and use of accuracy (that is, RMSE, MAE, MAPE, R2)	MAPE value of DRL-LSTM was 32.45. Comparative models’ MAPE values were CNN-LSTM (90.43), NLSTM (72.67), XGBoost (79.09), ANN (42.39).
Wang, McGibbon [70]	CNN + PALM	Spatial distribution of CO concentration	CO concentration	R2 and RMSE	High accuracy (R2 > 0.8)
Zhang, Duan [66]	Hybrid model (LSTM-SVR)	AQI prediction; air quality prediction	-	R2, RMSE, MAE	Hybrid LSTM-SVR model achieved the best R2 and RMSE values. Comparative models were the ensemble models (RF, XGBT, LGBM) and single models (k-NN, LR, SVR, LSTM).
Neo, Hasikin [78]	LSTM	Air quality prediction in four urban cities in Malaysia (that is, Petaling Jaya, Banting, Klang, Shah Alam)	CO, O₃, PM2.5, PM10. Wind direction and speed, humidity.	RMSE and R2	In predicting PM10 and PM2.5, the R2 values for LSTM in the four cities were Banting (0.998), Petaling Jaya (0.995), Klang (0.918), Shah Alam (0.993). Comparative models were AdaBoost, SVR, RF, k-NN, MLP Regressor.
Wang, Yuan [81]	DF	Surface CO and multi-source data	Surface CO	R and RMSE	R and RMSE values for DF are 0.73 and 0.273 ppm for daily. While for the monthly scale, R and RMSE are 0.77 and 0.215 ppm, respectively. Comparative models are XGBoost, Light-GBM, RF, ERT, DNN.
Chiang, Wang [80]	Deep-learning-based multi-timestamp multi-location (based on LSTM+GRU)	Minute-by-minute air quality forecasts	PM2.5 concentration levels	RMSE and accuracy	The RMSE and accuracy are 0.922 µg/m³ and 100% for the LSTM-based prediction model, and GRU-based predictive model had RMSE of 0.940 µg/m³ and accuracy of 95.7%.
Gladkova and Saychenko [77]	LSTM	Forecasting the time series of PM2.5 concentration	PM2.5 concentrations	MSE and RMSE	LSTM had RMSE of 7.86. Comparative models were Prophet (RMSE of 12.25) and ARIMA (RMSE of 12.46).
Chang, Abimannan [72]	Hybrid model that exploits stacking ensemble learning model	1 to 8 h air pollution forecasting developed on a cloud-based big data platform (comprising Spark+Hadoop machine learning and TensorFlow-based deep learning)	PM2.5 and PM10	MAE and RMSE. Pearson correlation coefficient to find the correlation between the four models (GBT, SVR, LSTM, LSTM2).	-

Table 6. Observation data, study period, and computational domains.

Authors	Observation Data	Study Periods	Computation Domains	Study Area	Major Conclusions
Wu and An [85]	Surface ozone of coastal cities	Short term, seasonal, and long term	Statistical analysis based on Kolmogorov–Zurbenko (KZ) filter	Yangtze River Delta	A decreasing spatial pattern was observed from the coastal cities towards the northwest, which was influenced by synoptic and monsoon conditions. In addition, cities located at the same latitudes were significantly impacted by atmospheric transmission.
Wu, Xu [86]	MOD16 products with ground observation data from eight weather stations	2011–2014	Regression analysis	Otog Front Banner	The spatial distribution of evapotranspiration showed a decreasing trend from the southeast to the northwest.
Ahamad, Griffiths [87]	Change in surface ozone	20-year period (1997–2016) at four locations in western Peninsular Malaysia.	Trend and correlation analyses	Western Peninsular Malaysia	The oxides of nitrogen ratios (NO/NO₂) had a significant inverse relationship with O₃ at all stations.
Cui, Zha [88]	Atmospheric nitrogen dioxide (NO₂) pollution	Chinese cities from 2005 to 2020	Geographically and temporally weighted regression model	Chinese cities	The population density and the ambient air pressure positively correlate with NO₂ pollution.
Ivanova, Kruchenitskii [89]	Ozone Monitoring Instrument (OMI) satellite equipment to observe surface ozone	First quarter of 2020	-	Commonwealth of Independent States (CIS) and Baltic countries	Generalization of total O₃ observation for each month of the first quarter of 2020
Liu, Song [90]	Spatial and temporal evolution of ozone pollution	8-h average O₃ concentrations from 2014 to 2016 observation data	Unimodal distribution	Chengdu	Ozone pollution in the west of Chengdu is more serious than in the east of Chengdu.
Maidanovych and Khlobystov [91]	Atmospheric carbon monoxide (CO)	Sentinel-5 Precursor satellite for the period from January 2019 to July 2023	TROPOMI instrument	Crimean Peninsula (Ukraine)	There was an increase in O₃ concentration in the atmosphere due to heavy enemy military equipment and fires.
Li, Shi [92]	Ground-based and satellite observation data, PM2.5 concentrations	1 December 2020 to 12 January 2021	Photochemical processes	Jinan area of China	The local pollution is often accompanied by the regional pollution during haze pollution events.
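
Table 6 lists the Kolmogorov–Zurbenko (KZ) filter as the computational domain for Wu and An [85]. The KZ filter is an iterated centered moving average: a window of odd length m is applied k times in succession. The sketch below illustrates that idea only. The edge handling (averaging over whatever part of the window is available), the function names, and the sample series are assumptions made for illustration, not the configuration used in the cited study.

```python
def moving_average(series, window):
    """Centered moving average; windows are truncated at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        lo = max(0, i - half)
        hi = min(len(series), i + half + 1)
        chunk = series[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

def kz_filter(series, window, iterations):
    """Apply a centered moving average of odd length `window`, `iterations` times."""
    if window % 2 == 0:
        raise ValueError("window must be odd so that it can be centered")
    result = list(series)
    for _ in range(iterations):
        result = moving_average(result, window)
    return result

# Hypothetical series with a single spike (illustrative only, not study data).
spike = [0.0, 0.0, 9.0, 0.0, 0.0]
print(kz_filter(spike, window=3, iterations=1))
print(kz_filter(spike, window=3, iterations=2))
```

Each additional iteration spreads and flattens the spike further, which is why the filter is commonly used to separate slowly varying components of a pollutant time series from short-term fluctuations.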

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

