Modeling and Optimization of NO₂ Stations in the Smart City of Barcelona

Soriano-Gonzalez, Raquel; Martin, Xabier A.; Perez-Bernabeu, Elena; Carracedo, Patricia

doi:10.3390/app142210355


Research Center on Production Management and Engineering, Universitat Politècnica de València, Ferrandiz-Carbonell, 03802 Alcoy, Spain


Author to whom correspondence should be addressed.

Appl. Sci. 2024, 14(22), 10355; https://doi.org/10.3390/app142210355

Submission received: 11 October 2024 / Revised: 1 November 2024 / Accepted: 8 November 2024 / Published: 11 November 2024

(This article belongs to the Special Issue Sustainable Urban Mobility)


Abstract


The growing problem of nitrogen dioxide (NO₂) pollution in urban environments is driving cities to adopt smart and sustainable approaches to address this challenge. To quantify and compare the effect of environmental policies, cities must be able to make informed decisions with real-time data that reflect the actual situation. Therefore, the objective of this work is threefold: The first is to study the behavior of the key performance indicator (KPI) of NO₂ concentrations per station in Barcelona through exploratory analysis and clustering. The second is to predict NO₂ concentration behavior, considering meteorological data. Lastly, a new distribution of current and new stations will be proposed using an optimization algorithm that maximizes the distance between them and covers the largest area of the city. As a result of this study, the importance of the location of measurement points and the need for better distribution in the city are highlighted. These new spatial distributions predict an 8% increase in NO₂ concentrations. In conclusion, this study is a comprehensive tool for obtaining an accurate representation of NO₂ concentrations in the city, contributing to informed decision-making, helping to improve air quality, and promoting a more sustainable urban environment.

Keywords:

NO₂ concentration; smart cities; KPI environmental; machine learning; intelligent algorithms

1. Introduction

In recent years, there has been a growing interest among municipal authorities worldwide in the study and research of smart and sustainable cities. This concept combines human and social capital with the city’s infrastructure capital to achieve a sustainable, livable, and efficient city [1]. These cities strive to contribute to the Sustainable Development Goals [2] and attain a carbon-neutral future, as well as a better quality of life for their residents. To achieve these goals, cities are assessed in various areas to comply with national and international strategic plans for decarbonization and environmental sustainability. In this context, key performance indicators (KPIs) have been developed to measure the sustainability of cities, some of which are included in the list elaborated by the International Telecommunication Union in 2017 [3], as well as in various European projects conducted to evaluate smart cities [4]. Within the study of sustainable mobility in smart cities, particularly in terms of passenger and freight transport, aspects related to energy, environment, safety, and security are considered, taking into account the real-time socio-economic dimensions [5]. These aspects are part of the pillars included in the studies of KPIs for smart and sustainable cities, which should be used to evaluate defined objectives with high viability (i.e., data availability and accessibility) and a perceived high importance value by the indicator [6].

Specifically, Barcelona is one of the most populous cities in the European Union and was ranked among the top five smart cities in 2022, according to a study conducted by Juniper Research: “Smart Cities: Key Technologies, Environmental Impact & Market Forecasts 2022–2026” [7]. Despite being considered a smart city, analyses initiated in the city on pollutant concentrations indicate that it is well above the targets set by the European Parliament [8], both in terms of peak nitrogen dioxide (NO₂) concentration values and the citywide average in µg/m³ [9]. Another issue with this indicator is whether the average values obtained in the city capture the true behavior of NO₂ across its entire extent. Considering that Barcelona is the second most populous city in Spain, with a population density of 16,339 inhabitants per km² and an area of 101.37 km² [10], using only eight air quality stations may not provide sufficient information about the behavior of pollutants in the city. This study stems from the need to create a KPI that accurately represents NO₂ concentrations throughout the city in a truthful and instantaneous manner [11].

Therefore, the study has three main objectives: (i) to study the behavior of NO₂ concentrations, (ii) to predict NO₂ concentrations considering the influence of meteorological variables, and (iii) to propose new distributions of air quality stations in the city to obtain information on real pollution levels. To achieve this, we employ a variety of Artificial Intelligence (AI) and Machine Learning (ML) models for analysis and prediction, alongside an optimization algorithm to relocate the air quality stations and provide an optimal distribution. This algorithm includes an original concept for estimating potential new stations based on the identification of sensitive points in the city and the effective area of action of each station. One of the limitations of our study lies in the limited number of air quality stations available in the city. This results in restricted coverage and can provide a biased view of the actual pollution situation in Barcelona, as the measurements are concentrated in a few locations and do not adequately reflect the spatial variability of pollution levels across the entire city. Our proposed optimization algorithm allows for the identification and prioritization of sensitive points in the city, where the presence of emission sources or the vulnerability of the population makes the information obtained more valuable. By using this algorithm, we aim to improve the representativeness of the data and provide a more accurate and comprehensive assessment of air quality in densely populated urban areas. This comprehensive and detailed approach will allow us to identify specific areas with higher concentration levels, elucidate the concentration behavior, and assess the effectiveness of current policies and initiatives to reduce pollution levels. Ultimately, this approach will provide valuable insights for designing specific and practical strategies to improve air quality and promote sustainable urban development in the city.

The remainder of this paper is structured as follows. Section 2 reviews the recent literature on the use of NO₂ indicators in sustainable mobility and its role as an indicator of pollution in smart cities. Section 3 introduces the case study on Barcelona and the data sources used. Section 4 presents the methodology used to achieve the aforementioned objectives, while Section 5 presents the results obtained. Lastly, Section 6 outlines the main conclusions and suggests potential avenues for future research.

2. Overview of NO₂ Concentrations

The growing concentration of urban populations has led to environmental challenges, such as increased air pollution, which affects health in urban settings. A primary contributor to this pollution is NO₂, primarily emitted by diesel engines, which poses serious environmental and human health risks. Subramaniam et al. [12] highlighted that NO₂ significantly contributes to global warming, the greenhouse effect, and climate change. This pollutant is also a primary cause of acid rain, which harms aquatic and terrestrial ecosystems. Regarding human health, Zhu et al. [13] showed that exposure to elevated NO₂ is linked to respiratory diseases and lung cancer, with significant mortality rates from these diseases in a studied population in Hefei, China. Women with respiratory diseases appeared more susceptible to air pollution than men. Additionally, Gurjar et al. [14] warned about health risks in megacities, noting that some cities face higher risks due to high levels of pollutants like NO₂, especially in South Asia.

To address these issues, traffic reduction measures, such as pedestrian zones and Low-Emission Zones (LEZs), have been implemented [15]. The effectiveness of these measures is often questioned, and this is where smart city tools play a crucial role by providing data to scientifically evaluate their impact. Various approaches have been proposed to evaluate smart city interventions. For instance, Ntafalias [16] proposed a seven-step methodology for assessing the impact of these interventions, emphasizing the importance of a comprehensive analysis of the city’s long-term vision and cooperation among all stakeholders. Additionally, Lebrusán and Toutouh [17] analyzed the effectiveness of an LEZ in Madrid, demonstrating its capability to significantly reduce air pollution and noise in the city. Analyzing Shared Mobility Systems (SMSs) is another critical approach to addressing urban transportation challenges. Golpayegani et al. [18] emphasized the need to address traffic-related NO₂ concentrations and how SMS solutions can contribute to reducing this pollution in urban areas. To evaluate cities, Angelakoglou et al. [19] developed a repository of 75 KPIs in six dimensions, including environmental aspects and concentrations of air pollutants. This repository can serve as a basis for assessing the impact of solutions aimed at improving air quality and reducing pollution in urban environments. The mentioned articles highlight the importance of addressing gas pollutant concentrations, such as NO₂, in the context of smart and sustainable cities. A holistic approach involving all stakeholders and interconnected entities is essential to evaluate the impact of smart city interventions and achieve greater efficiency in the urban mobility system.

The application of AI and ML techniques, as shown by Subramaniam et al. [12], has been instrumental in developing effective pollution control strategies by predicting NO₂ concentrations more accurately. In particular, several works on atmospheric pollution and NO₂ concentrations in Barcelona aimed to understand the dynamics of key pollutants, such as NO, NO₂, and O₃. Malik and Tauler [20] utilized the Multivariate Curve Resolution–Alternating Least Squares technique to analyze temporal variations and diurnal profiles of these pollutants. Basagaña et al. [21] examined the impact of public transportation strikes on air quality, revealing increased NOx and black carbon levels during strikes. Gignac et al. [22] investigated the short-term effects of NO₂ exposure on cognitive and mental health, while Pierangeli et al. [23] estimated childhood asthma cases attributable to air pollution. Benavides et al. [24] developed accurate urban air quality models using operational prediction systems and specific dispersion models. Rodriguez-Rey et al. [25] evaluated traffic restriction measures in Barcelona, focusing on reducing NO₂ levels. Recently, Cican et al. [26] applied two ML techniques to predict the air quality in the city of Bucharest. In particular, the authors used advanced recurrent neural networks, specifically Long Short-Term Memory (LSTM) and Gated Recurrent Unit models, which achieved improved performance over traditional methods. Wu et al. [27] introduced a novel deep learning model that combines Residual Neural Network, Graph Convolutional Network, and bidirectional LSTM architectures to improve the short-term regional predictions of NO₂ and O₃ concentrations in Shanghai (China). Similarly, Tao et al. [28] developed an ensemble ML model that incorporated deep learning to forecast NO₂ levels using data from 1609 air quality monitors in China. Lastly, El Mghouchi et al. [29] explored multivariable air quality predictions using five hybrid ML models to analyze the relationships between meteorological factors and particulate matter concentrations in Craiova (Romania).

The growing body of research highlights the necessity of addressing air pollution, particularly NO₂, in urban areas to safeguard public health and improve air quality. The combination of traffic reduction measures, the promotion of SMS, and the use of smart city tools are critical steps toward creating healthier urban environments. Active collaboration with local communities and policymakers is essential to ensure the successful implementation of these strategies, ultimately contributing to a more sustainable future for cities.

3. Case Study: Barcelona City

This study extracts data from Barcelona’s Open Data Air Quality dataset, which stores the hourly measurements of various pollutants from the city’s eight air quality stations, along with daily meteorological data measured at four meteorological stations.

3.1. Air Quality Stations

To obtain the pollutant concentration data from each air quality station, the corresponding data for each year in the study period (2020–2023) must be downloaded from Open Data Barcelona (https://opendata-ajuntament.barcelona.cat/, accessed on 10 October 2024). These files include the station name, population, province, pollutant code, year, month, day, and hourly variables reflecting the concentration measurements for each pollutant. In addition to these files, it is necessary to download the location of each station and the pollutant codes. For our study, we focused on measurements of NO₂, which has code 8 in the extracted datasets. The station locations related to NO₂ measurements are detailed in Table 1.

It is important to note that NO₂ concentrations are measured in µg/m³, and historical data in this format have been available since 2018. The average concentrations per year were close to the recommended limit set by the World Health Organization (WHO) at the time (25 µg/m³). In December 2022, the WHO updated its global air quality guidelines, setting a new target of 10 µg/m³ for the annual average and a maximum value of 25 µg/m³ for a 24 h period [30].

3.2. Meteorological Stations

To obtain data on the meteorological variables measured in Barcelona during the study period, the corresponding files for each year are downloaded from Open Data Barcelona. Each file contains the measurement date, hour (if applicable), station code, meteorological variable code, and the corresponding value. A separate file from Open Data provides the station identifiers and coordinates, while another file in the same repository lists the codes for meteorological variables and the units in which they are measured. Table 2 displays the stations’ identifiers and coordinates.

The selected variables for the analysis of meteorological data are Daily Mean Temperature (TM), measured in °C; Daily Mean Relative Humidity (HRM), measured in %; Daily Mean Atmospheric Pressure (PM), measured in hPa; Accumulated Daily Precipitation (PPT), measured in mm; Daily Solar Radiation (RS24h), measured in MJ/m²; Daily Mean Wind Speed at 10 m (VVM10), measured in m/s; and Daily Mean Wind Direction at 10 m (DVM10), measured in degrees. Each variable’s maximum, minimum, and mean values are measured at each meteorological station. For the study, we work with the mean value of each variable. Station X2 only has records for the temperature and relative humidity variables throughout the entire period, and the reasons for this data gap are unknown. Due to the lack of data, this station is not considered in the study. Figure 1 shows the location of all the air quality and meteorological stations in Barcelona that are considered in the study.

4. Methodology

This section describes the methodology employed to achieve the proposed objectives. Various tools are used to conduct the descriptive analysis, predict NO₂ behavior, and optimize the distribution of stations. In particular, the descriptive analysis is performed using Python 3.10. The libraries NumPy [31], Pandas [32], Seaborn [33], and Matplotlib [34] are used for data analysis and visualization. The Folium library [35] is used to work with the map of Barcelona and generate heat maps. The Geocoder library is used to obtain coordinates for points of interest in Barcelona. The Scikit-learn library [36] is used for the clustering analysis of concentrations at the stations and for the ML models that predict concentrations at each station. The optimization algorithm is also implemented in Python 3.10, with the support of the NumPy and Pandas libraries to perform array manipulations and manage the input datasets, respectively.

4.1. Descriptive Analysis

A detailed analysis of the measured NO₂ concentrations across the city is conducted to identify and observe patterns over different time frames, including daily and weekly behavior. Various graphical representations are created to explore the temporal dynamics of NO₂ concentrations [37].

First, daily averages are analyzed to examine general trends in NO₂ behavior and its daily and weekly variability, which reflect the influence of traffic emissions and weather conditions [38]. Additionally, weekly patterns are explored to identify potential cyclical behaviors influenced by factors such as urban mobility and industrial activity. A spatial analysis is then conducted to distinguish the behavior of NO₂ concentrations at individual monitoring stations. This approach examines the spatial and temporal representativeness of NO₂ monitoring stations in urban settings, highlighting the importance of capturing local variations to obtain a complete picture of air quality [39]. The analysis is further complemented by meteorological variables obtained from stations distributed throughout the city. A wind rose is generated to visualize and understand the predominant wind directions and speeds in the city. This tool is essential for identifying local climate patterns and variations in different parts of the city [40]. This analysis helps in understanding how meteorological conditions contribute to the dispersion of atmospheric pollutants, highlighting the influence of urban climates on air quality [41]. Additionally, integrating meteorological data with air quality models is crucial to obtain a more accurate view of pollutant dispersion in urban areas [42].
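As a minimal sketch of the tabulation behind a wind rose, the snippet below bins wind directions into 16 compass sectors and computes the relative frequency of each sector. It is an illustration under stated assumptions, not the procedure used in the study: the function names, the choice of 16 sectors, and the sample directions are all hypothetical, and wind speed classes are omitted for brevity.

```python
from collections import Counter

# 16 compass sectors of 22.5 degrees each, centred on N, NNE, NE, ...
SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector_of(direction_deg: float) -> str:
    """Map a wind direction in degrees (0 = north, clockwise) to a sector label."""
    width = 360.0 / len(SECTORS)
    # Shift by half a sector so that, e.g., 350 and 10 degrees both fall in "N".
    index = int(((direction_deg % 360.0) + width / 2) // width) % len(SECTORS)
    return SECTORS[index]


def wind_rose_table(directions_deg):
    """Return the relative frequency of each sector for a list of directions."""
    counts = Counter(sector_of(d) for d in directions_deg)
    total = len(directions_deg)
    return {s: counts.get(s, 0) / total for s in SECTORS}


# Hypothetical daily mean wind directions in degrees, invented for illustration.
sample_directions = [350.0, 5.0, 10.0, 95.0, 180.0, 185.0, 270.0, 359.0]
frequencies = wind_rose_table(sample_directions)
```

A plotting library can then draw one polar bar per sector with a length proportional to its frequency.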

As the final step of the descriptive analysis, a cluster analysis is performed to study the behavior of NO₂ concentrations at the monitoring stations. The hourly average values of NO₂ concentrations at each station during the study period are used to identify common patterns among the stations [43]. For this analysis, the K-means algorithm is selected, which requires specifying the number of clusters in advance [44]. To determine the optimal number of clusters, the elbow method is employed [45]. This method involves plotting the number of clusters against inertia (the sum of squared distances within each cluster), and the point where a significant change in the inertia decrease occurs indicates the optimal number of clusters. In addition, agglomerative hierarchical clustering is applied, a method that does not require specifying the number of clusters initially [46]. This algorithm treats each point as an individual cluster and successively merges the closest clusters, creating a hierarchical structure. The result is represented by a dendrogram, where the branches indicate the clusters, and the height of each union reflects the Euclidean distance [47]. In our study, the Euclidean distance is used as the metric that captures the separation between points in an n-dimensional space, providing a clear representation of how NO₂ concentrations vary according to the location of the monitoring stations.
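The elbow method described above can be sketched with a small, dependency-free K-means that returns the inertia for each candidate number of clusters. This is an illustrative sketch only: the station profiles are invented three-value vectors (not the hourly averages of the study), and `kmeans_inertia` is a hypothetical helper written for this example. Scikit-learn's `KMeans` exposes the same quantity through its `inertia_` attribute.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_inertia(points, k, n_iter=50, seed=0):
    """Run Lloyd's K-means and return the inertia (within-cluster sum of squares)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct points as initial centroids
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = []
        for j, members in enumerate(clusters):
            if members:
                new_centroids.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Hypothetical station profiles (e.g., morning, midday, evening averages in ug/m3).
profiles = [(60, 40, 55), (58, 42, 53), (62, 41, 56), (35, 25, 33),
            (36, 24, 32), (34, 26, 31), (33, 25, 34), (15, 10, 14)]

# Inertia for k = 1..5; the elbow is where the decrease flattens out.
inertias = {k: kmeans_inertia(profiles, k) for k in range(1, 6)}
```

Plotting `inertias` against `k` gives the elbow curve; because K-means depends on its initialization, several seeds are usually tried for each `k`.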

4.2. Behavior Prediction of NO₂

Afterward, the NO₂ concentration predictions for each station are analyzed using different methodologies. The objective is to determine which methodology best approximates the data for each particular station. Drawing on several studies [12,48,49], we decide to utilize the following methods:

K-Nearest Neighbors (KNN): This algorithm is based on the idea that data points with similar characteristics tend to have similar output values. It works by finding the K closest points in the training dataset and predicting the output from those neighbors: the majority class in classification or, for a continuous target such as an NO₂ concentration, typically the average of their values [50].
Decision Tree: This model uses a decision tree to make predictions. Each internal node represents a feature, each branch represents a decision rule, and each leaf represents the prediction result [51].
Support Vector Regression (SVR): SVR is a regression technique based on support vectors that seeks to find an optimal regression function within a feature space. It uses a supervised learning approach to predict output values [52].
Random Forest: Random Forest is an ensemble of decision trees, where each tree contributes a predicted output. In classification, the final prediction is the output with the most votes; in regression, the outputs of the trees are typically averaged. By combining multiple trees, the risk of overfitting is reduced, and the prediction accuracy is improved [53].
Artificial Neural Network (ANN): ANN is a model inspired by the structure and functioning of the human brain. It consists of a network of interconnected artificial neurons used for making predictions. The model learns from training data by adjusting the synaptic weights of neurons [54].

All the prediction models proposed in this study were validated with the Holdout method, in which 70% of the data was used for training and the remaining 30% for testing. This division made it possible to evaluate the predictive capacity of the models [55]. By applying these methodologies to predict NO₂ concentrations at various stations, we aim to identify the best-suited approach for each station based on its specific characteristics and patterns. This analysis will contribute to a better understanding of the performance of different prediction techniques in capturing the complexities and variations of NO₂ concentrations across different locations. To compare the effectiveness of each method at the stations, we use evaluation metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R²). These metrics help determine which models offer the best approximation to the observed data at each station. For more details about these statistics, readers are referred to Hrust et al. [56].
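To make the validation step concrete, the following sketch shows a 70/30 holdout split and the three evaluation metrics written out in plain Python. It assumes a random shuffle before splitting, which the text does not specify (a chronological split is another option for time series), and the observed and predicted values are invented for illustration. In practice, Scikit-learn's `train_test_split` and the functions in `sklearn.metrics` provide equivalent functionality.

```python
import math
import random


def holdout_split(rows, train_fraction=0.7, seed=42):
    """Shuffle the rows and split them into training and test subsets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def mae(y_true, y_pred):
    """Mean Absolute Error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root Mean Squared Error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# 100 hypothetical observation indices split 70/30.
train_rows, test_rows = holdout_split(list(range(100)))

# Hypothetical observed and predicted NO2 values (ug/m3) on a test set.
observed = [30.0, 40.0, 50.0, 60.0]
predicted = [32.0, 38.0, 53.0, 57.0]
scores = {"MAE": mae(observed, predicted),
          "RMSE": rmse(observed, predicted),
          "R2": r2(observed, predicted)}
```

Lower MAE and RMSE and an R² closer to 1 indicate a better fit, so the same three numbers can be compared across models for each station.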

4.3. Optimal Location of Stations

Lastly, a new placement of stations at strategic locations is proposed to improve the NO₂ pollution index in Barcelona and obtain a more representative average of NO₂ concentrations across the city. This is supported by the analysis since it has been determined that the current eight monitoring stations are insufficient, as they do not cover a large or representative area of the city. To determine the potential new locations, we rely on the report from the European Commission’s Mobility Observatory, using the Eltis method [57], which recommends selecting 40 sensitive locations in the city distributed as follows: (i) 5 locations near highways, (ii) 5 locations near ring roads, (iii) 10 locations near access roads to the city center, (iv) 10 locations near sensitive facilities (schools, hospitals, residences, etc.), (v) 5 locations in low-income neighborhoods, and (vi) 5 locations in recreational areas (sports facilities, parks, museums, etc.). The points chosen for this study are shown in Appendix A, Table A1. These strategically selected points cover the entire metropolitan area and are added to the existing stations to form an expanded set of air quality monitoring stations (Figure 2). This approach allows for a more detailed understanding of NO₂ concentrations in the city, facilitating the implementation of effective measures to improve air quality in critical areas and throughout Barcelona. These results will support future research and environmental management actions, contributing to more effective policies to reduce NO₂ pollution and its adverse impacts.

To optimize the placement of air quality monitoring stations in Barcelona, we consider the effective radius of each station, which varies according to traffic levels in the area [58]. We choose an average radius of 500 m due to the city’s high traffic intensity. This approach also aims to maximize city area coverage, minimizing the overlap between stations to avoid redundant measurements and ensure no areas are left uncovered. However, the budget constrains the number of stations that can be installed, necessitating a balance between maximizing coverage and managing the costs of implementation and maintenance of the stations. This problem can be modeled as a Capacitated Dispersion Problem (CDP) [59], which aims to maximize the minimum distance between elements. Various approaches have been proposed in the literature to solve the CDP, but it is common to employ heuristics and metaheuristics to solve large-scale instances in short computing times [60]. This study uses an adapted version of the approach proposed in [61], which has been proven to generate high-quality solutions within short computational times. Algorithm 1 outlines the main steps of our algorithm.
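A short worked calculation helps to put the 500 m radius in perspective. It uses the city area of 101.37 km² quoted in the Introduction and assumes, purely as an idealization, that the disks of different stations do not overlap and lie entirely within the city boundary, so the percentages below are upper bounds rather than results of the study.

```latex
A_{\text{station}} = \pi r^{2} = \pi \,(0.5\ \text{km})^{2} \approx 0.785\ \text{km}^{2}

\frac{8 \, A_{\text{station}}}{101.37\ \text{km}^{2}} \approx \frac{6.28}{101.37} \approx 6.2\%
\qquad
\frac{48 \, A_{\text{station}}}{101.37\ \text{km}^{2}} \approx \frac{37.70}{101.37} \approx 37.2\%
```

Under these assumptions, the eight existing stations cover at most about 6.2% of the city, while the expanded set of 48 locations (8 existing plus 40 candidates) could cover at most about 37.2% if all of them were open.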

Algorithm 1 Biased-randomized algorithm.

1: function BR-CDP($I, \beta, t_{\max}, it_{\max}$)
2:   $S \leftarrow$ DestructiveHeuristic($I$)
3:   $S \leftarrow$ LocalSearch($S, it_{\max}$)
4:   while $time \leq t_{\max}$ do
5:     $S_{new} \leftarrow$ DestructiveHeuristic($I, \beta$)
6:     $S_{new} \leftarrow$ LocalSearch($S_{new}, it_{\max}$)
7:     if $f(S_{new}) > f(S)$ then
8:       $S \leftarrow S_{new}$
9:     end if
10:   end while
11:   return $S$
12: end function

The algorithm receives as input an instance $I$, comprising the stations and the distances between them, the parameter $\beta$ of the biased-randomized heuristic, the maximum execution time $t_{\max}$, and the maximum number of iterations without improvement $it_{\max}$. The algorithm generates a feasible initial solution $S$ by applying a destructive heuristic followed by a local search operator. The destructive heuristic and local search procedures are presented next. At this point, this initial solution becomes the best-found solution so far. Next, the algorithm performs a multistart procedure to generate new solutions until the maximum execution time is reached. In each iteration, the algorithm generates a new solution $S_{new}$ using a biased-randomized version of the destructive heuristic combined with the local search operator. The biased-randomized heuristic introduces a slight modification in the greedy behavior, which provides a certain degree of randomness while maintaining the logic behind the original heuristic. The biased-randomized version considers each element in the edges list with a probability that follows a geometric distribution with a single parameter $\beta \in (0, 1)$, which controls the relative level of greediness present in the randomized behavior of the algorithm [62]. In this way, multiple alternative solutions can be generated without losing the logic behind the original heuristic. Next, the algorithm compares the newly generated solution to the best-known solution. Because the CDP maximizes the minimum distance between open stations, the best-known solution is updated whenever the new solution has a higher objective function value. Once the stopping criterion is met, the algorithm returns the best-found solution.
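The biased-randomized selection can be illustrated with a short sketch. It draws a position in the sorted edge list from a geometric distribution with parameter β, so that the first (greedy) element is the most likely choice and later elements become progressively less likely. The function name, the modulo wrap used to keep the index inside the list, and the sample sizes are assumptions made for this example; the source does not detail its exact implementation.

```python
import math
import random


def biased_random_index(n, beta, rng):
    """Pick a position in a sorted list of n items, favouring the first ones.

    The position follows a geometric distribution with parameter beta
    (0 < beta < 1) and is wrapped into [0, n) with a modulo (an assumption).
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], which avoids log(0)
    return int(math.log(u) / math.log(1.0 - beta)) % n


rng = random.Random(0)

# With beta = 0.3, roughly 30% of the draws select the greedy (first) element.
draws = [biased_random_index(10, 0.3, rng) for _ in range(10000)]
share_first = draws.count(0) / len(draws)

# With beta close to 1, the behaviour is almost purely greedy.
greedy_draws = [biased_random_index(10, 0.99, rng) for _ in range(1000)]
share_first_greedy = greedy_draws.count(0) / len(greedy_draws)
```

Values of β close to 1 reproduce the greedy heuristic, whereas values close to 0 approach a uniform random choice, which is why β is described as controlling the level of greediness.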

Algorithm 2 shows the destructive heuristic procedure. Initially, the heuristic assumes all stations are open. Then, the edges connecting the stations are sorted in ascending order of the distance between the stations they connect. Next, an iterative process begins, in which stations are removed from the solution one at a time. At each iteration, an edge is selected from the list of edges in a greedy or biased-randomized manner. The station to be removed is chosen randomly from the two stations connected by the selected edge, and the edges connected to the removed station are also deleted from the list of edges. This procedure is repeated until the percentage of open stations falls below the required threshold. Then, the last station that was removed is reintroduced into the solution to preserve its feasibility, and the procedure returns the resulting initial solution.

The local search procedure is depicted in Algorithm 3. This procedure involves removing the oldest station from the solution and reconstructing the solution with a station not currently included. It is important to note that a removed station will not be considered for generating a new solution until all older stations (i.e., those added earlier) have been eliminated from the solution. This approach facilitates efficient space exploration while avoiding redundancy in the search process. The procedure continues until a maximum number of iterations without improvement is reached.

Algorithm 2 Destructive heuristic procedure.

1: function destructiveHeuristic($I, \beta$)
2:   $S \leftarrow V$
3:   $edges \leftarrow$ getEdges($I$)
4:   $edges \leftarrow$ sort($edges$)
5:   while isFeasible($S$) do
6:     $e^{*} \leftarrow$ selectEdge($edges, \beta$)
7:     $i^{*} \leftarrow$ selectNode($e^{*}$)
8:     $S \leftarrow$ drop($S, i^{*}$)
9:   end while
10:   $S \leftarrow$ add($S, i^{*}$)
11:   return $S$
12: end function
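The greedy variant of the destructive heuristic can be sketched in a few lines of Python. This is a simplified illustration rather than the authors' implementation: feasibility is reduced to a minimum number of open stations (`min_open`, a hypothetical parameter), so the loop stops exactly at that count instead of overshooting and re-adding the last station, and the coordinates are invented points on a line.

```python
import itertools
import math
import random


def min_pairwise_distance(points, selected):
    """Objective f(S): the smallest distance between any two open stations."""
    return min(math.dist(points[i], points[j])
               for i, j in itertools.combinations(sorted(selected), 2))


def destructive_heuristic(points, min_open, rng):
    """Greedy variant: repeatedly close one endpoint of the shortest remaining edge."""
    open_stations = set(range(len(points)))
    # All station pairs, sorted by ascending distance.
    edges = sorted(itertools.combinations(range(len(points)), 2),
                   key=lambda e: math.dist(points[e[0]], points[e[1]]))
    while len(open_stations) > min_open and edges:
        i, j = edges[0]                # greedy choice: shortest remaining edge
        removed = rng.choice((i, j))   # close one of its two endpoints at random
        open_stations.discard(removed)
        # Edges touching the closed station are no longer relevant.
        edges = [e for e in edges if removed not in e]
    return open_stations


# Hypothetical station coordinates in km, invented for illustration.
stations_xy = [(0.0, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 0.0), (5.2, 0.0), (10.0, 0.0)]
solution = destructive_heuristic(stations_xy, min_open=3, rng=random.Random(1))
dispersion = min_pairwise_distance(stations_xy, solution)
```

Closing one endpoint of the shortest edge removes the pair that currently limits the objective, which is why the minimum distance between open stations grows as stations are closed.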

Algorithm 3 Local search procedure.

1: function localSearch($S, it_{\max}$)
2:   $S' \leftarrow S$
3:   $noImprov \leftarrow 0$
4:   while $noImprov < it_{\max}$ do
5:     $noImprov \leftarrow noImprov + 1$
6:     $u \leftarrow$ oldestSelectedNode($S$)
7:     $S \leftarrow$ drop($S, u$)
8:     $u^{*} \leftarrow$ selectBestNode($V \setminus S$)
9:     $S \leftarrow$ add($S, u^{*}$)
10:     if $f(S) > f(S')$ then
11:       $S' \leftarrow S$
12:       $noImprov \leftarrow 0$
13:     end if
14:   end while
15:   return $S'$
16: end function

Given the budget constraints on the number of stations that can be installed, we also analyze the impact of the percentage of open stations, which is controlled by a parameter m. When m is set to 0.1, it indicates that only 10% of the stations are open; conversely, setting m to 0.9 means that 90% of the air quality stations are operational. Our objective is to identify the optimal combination of area coverage percentage and overlap among these stations, resulting in the most efficient air quality monitoring network. To achieve this, we utilize a Pareto frontier to evaluate the effects of the parameter m on both the percentage of area coverage and the percentage of overlap between stations. By generating the Pareto frontier, we can discern the trade-offs between the percentage of covered area and the percentage of overlap, enabling us to identify the best configurations that balance maximizing coverage with minimizing redundancy.
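The Pareto filtering step can be sketched as follows. Each configuration is a triple (m, coverage %, overlap %), and a configuration is kept only if no other configuration offers at least as much coverage with at most as much overlap while being strictly better in one of the two. The numbers are invented for illustration and are not results from the study; `pareto_front` is a hypothetical helper name.

```python
def pareto_front(configs):
    """Keep configurations not dominated in (coverage up, overlap down)."""
    front = []
    for m, coverage, overlap in configs:
        dominated = any(
            (c >= coverage and o <= overlap) and (c > coverage or o < overlap)
            for _, c, o in configs
        )
        if not dominated:
            front.append((m, coverage, overlap))
    return front


# Hypothetical (m, coverage %, overlap %) triples, invented for illustration.
candidates = [
    (0.1, 4.0, 0.0),
    (0.3, 11.0, 0.5),
    (0.5, 17.0, 2.0),
    (0.6, 16.5, 3.0),  # dominated by m = 0.5: less coverage and more overlap
    (0.9, 28.0, 9.0),
]
front = pareto_front(candidates)
```

The configurations that remain in `front` represent the available trade-offs; choosing among them depends on how much overlap is acceptable for a given budget.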

5. Computational Results

In this section, we present the computational results derived from the methodologies outlined in the previous section. The analysis encompasses the evaluation of NO₂ concentration predictions, and the optimization of the location of stations.

5.1. Behavior of Measured NO₂ Concentrations

The obtained results highlight the importance of temporal and spatial variability in air quality analysis, demonstrating how local conditions and weekly traffic patterns can have a significant impact on NO₂ concentrations.

Figure 3 shows the average value per year for each time slot at each station. Although most stations exhibit a similar pattern, with peak concentrations observed around 9:00 a.m. and another local peak in the late hours, the concentration ranges differ across stations. Similarly, Figure 4 shows the average value per year for each time slot, considering the day of the week. This breakdown is included because traffic patterns depend on the day of the week, which directly affects NO₂ concentrations [63]. The values from Monday to Friday are similar, whereas weekends behave differently, with significantly lower concentrations.

Except for station 58, which is located in a residential area on the city’s outskirts, the annual mean values in the remaining stations are much higher than the 10 µg/m³ threshold proposed by the WHO. Moreover, outliers with values significantly exceeding the 24 h limit of 25 µg/m³ are present, especially in stations 57 and 54. The behavior is consistent across different years, although higher annual mean concentration values are obtained for the year 2022 in all stations (Figure 5).

Considering the adaptation periods proposed by the WHO to reduce NO₂ concentration levels (40 µg/m³ for adaptation level 1, 30 µg/m³ for adaptation level 2, and 20 µg/m³ for level 3), it can be concluded that the city of Barcelona currently falls within adaptation level 2 regarding the NO₂ concentration limits established by the WHO. The WHO recommends a daily average NO₂ concentration of 25 µg/m³ (level 3 in the air quality guidelines), which should not be exceeded on more than 3 to 4 days per year. Figure 6 illustrates the daily average values during the study period alongside the recommended daily average limit (dashed red line).

As observed, the number of days surpassing the threshold per year exceeds the limit of 3–4 days. Taking a more flexible approach and considering the adaptation period to reduce NO₂ concentrations, the maximum daily concentration could be taken as 120 µg/m³ (adaptation level 1) or 50 µg/m³ (adaptation level 2). If we consider the adaptation periods rather than the strict guidelines, it can be concluded that the city of Barcelona is at adaptation level 2 but still far from achieving compliance with the guidelines.
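Counting exceedance days against a daily limit is a simple tally, sketched below. The thresholds of 25 µg/m³ and 50 µg/m³ are those quoted in the text; the function name `exceedance_days` and the daily means are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter
from datetime import date

WHO_DAILY_LIMIT = 25.0  # µg/m³, recommended daily average quoted in the text


def exceedance_days(daily_means, limit=WHO_DAILY_LIMIT):
    """Count, per year, the days whose mean concentration exceeds `limit`."""
    counts = Counter()
    for day, value in daily_means.items():
        if value > limit:
            counts[day.year] += 1
    return dict(counts)


# Hypothetical daily means in µg/m³ (not from the Barcelona dataset).
daily = {
    date(2021, 1, 4): 31.2,
    date(2021, 1, 5): 24.9,
    date(2021, 1, 6): 27.5,
    date(2022, 2, 1): 55.0,
}
days_over_guideline = exceedance_days(daily)
days_over_level_2 = exceedance_days(daily, limit=50.0)
```

Comparing the yearly counts with the 3 to 4 days allowed by the recommendation gives the compliance check discussed above.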

Figure 7 presents a wind rose diagram depicting the average wind direction trends for each station throughout the study period. The differences in altitude among the stations, combined with their proximity to the sea, result in varying wind rose diagrams, despite the stations being relatively close to one another. Station X4, located nearest to the sea, experiences winds from multiple directions. In conjunction with the other variables studied, a similar behavioral pattern is observed across the stations, with stations X4 and X8 exhibiting more comparable behavior than station D5, which is situated further inland.

The dendrogram illustrating the average concentrations at the air quality stations reveals interesting clustering patterns among the stations (Figure 8a). Station 58 emerges as a distinct cluster, with a different behavior from any other station due to its location in the city’s peripheral areas. On the other hand, stations 44, 50, and 43 exhibit similar behavior, indicating their central location in the city’s high-traffic zone with increased concentrations. Likewise, stations 4 and 42 demonstrate identical patterns, while stations 57 and 54 share similar behavior. These two groups of stations are clustered together, forming a distinct cluster defined by their proximity to the city’s peripheral highways. For the analysis of the meteorological stations, the representation of wind speed along the x-axis and y-axis is utilized (Figure 8b). This method demonstrates the existence of two distinct groups with different behaviors. Stations X4 and X8 form one cluster in the southern part of the city at an altitude of 47 m within a high-traffic area. The other cluster consists of station D5, located in the northern region at an altitude of 415 m and away from heavy traffic.
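The agglomerative procedure behind a dendrogram can be illustrated with the annual means listed in Appendix A, Table A2. The sketch below is an assumption-laden example: it uses average linkage with Euclidean distance on three annual values per station, whereas the features and linkage used for Figure 8a are not specified in this section, so the intermediate groupings may differ from the figure. The function name `average_linkage` is hypothetical.

```python
import math

# Annual mean NO2 (µg/m³) for 2020, 2021, 2022, copied from Table A2.
STATIONS = {
    50: (22.9552, 24.5184, 27.4114),
    43: (34.1354, 36.8185, 25.7095),
    44: (31.2408, 30.8547, 41.4748),
    57: (17.5895, 18.2762, 34.1493),
    4: (27.9477, 25.4330, 31.0970),
    42: (23.1318, 21.6456, 21.7994),
    54: (21.3454, 20.2746, 20.8797),
    58: (8.3513, 8.7778, 9.0200),
}


def average_linkage(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[key] for key in points]

    def gap(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        pairs = [
            (gap(a, b), i, j)
            for i, a in enumerate(clusters)
            for j, b in enumerate(clusters)
            if i < j
        ]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


groups = average_linkage(STATIONS, n_clusters=2)
```

Even with these simplified features, cutting the tree into two clusters leaves station 58 on its own, in line with its role as a distinct cluster in the text.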

In the case study of Barcelona, there are fewer meteorological stations than air quality stations. Therefore, the KNN algorithm is employed to classify and predict meteorological variables at each station. The goal of applying this algorithm is to obtain meteorological variables for all air quality stations. The results are presented in Table 3.

The descriptive study highlights the importance of the location of air quality stations, as NO₂ concentrations largely depend on traffic in the area. It is crucial to distinguish whether the station is situated in a city center or a residential neighborhood, as these characteristics influence both the average and maximum concentrations recorded at each station. Additionally, traffic patterns not only affect concentration levels but also the dynamics of variations in NO₂ concentrations. Furthermore, it has been shown that weather conditions vary depending on the location within the city, which can also influence the distribution and effectiveness of air quality stations. These factors underscore the importance of correctly locating each station to accurately represent the real pollution situation in the city.

5.2. Prediction of NO₂ Concentration per Station

Next, we present the results obtained with the different prediction models for our dataset. Table 4 demonstrates that Random Forest outperforms other methodologies regarding prediction accuracy for NO₂ concentrations at each station. The lower MAE and RMSE values indicate a closer approximation to the observed data, while the higher R² suggests a better fit to the data variability. These findings underscore the efficacy of Random Forest for predicting NO₂ concentrations in diverse locations, and they explain the specific characteristics influencing the predictive performance of each method.
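For reference, the three evaluation metrics can be computed as in the following sketch. The formulas are the standard definitions of MAE, RMSE, and R²; the function name `regression_metrics` and the observed and predicted values are hypothetical and unrelated to Table 4.

```python
import math


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R²) for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical NO2 values in µg/m³ (not from the study).
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 41.0, 47.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
```

Because RMSE squares the errors before averaging, it is never smaller than MAE and reacts more strongly to occasional large misses.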

When the Random Forest algorithm is applied to predict NO₂ concentrations at the different stations, the evaluation metrics take similar values across all stations. However, the question arises as to whether the model’s predictive performance varies with the NO₂ concentration level at each station. To address this question, a correlation analysis is conducted between the normalized evaluation metrics and the annual average concentration of NO₂ at each station. The results indicate that the MAE is lower at stations with lower NO₂ concentrations than at stations with higher NO₂ concentrations (Figure 9a). This relationship between NO₂ concentration and MAE yields a correlation coefficient of 0.82. To ensure robustness, station 50 is excluded from the analysis, as it exhibits poor fit and appears as an outlier (Figure 9b).

The study finds a strong positive correlation between NO₂ concentration and prediction error (RMSE), indicating that stations with higher NO₂ concentrations also experience higher prediction errors. Additionally, a moderate negative correlation is observed between the NO₂ concentration and the coefficient of determination (R²), suggesting that the model is less effective in stations with high NO₂ concentrations, where it fails to adequately explain the variability of the observed data. These findings reveal that the performance of the Random Forest model in predicting NO₂ concentrations varies according to the concentration level at each station; it is more effective in stations with low NO₂ concentrations and less accurate in stations with high concentrations.
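A correlation analysis of this kind can be sketched with the Pearson coefficient. The snippet assumes Pearson correlation, which the text does not state explicitly; the function name `pearson` and the per-station values are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-station values: annual mean NO2 (µg/m³) and model MAE.
mean_no2 = [10.0, 20.0, 30.0, 40.0]
station_mae = [2.0, 3.0, 3.5, 5.0]
r = pearson(mean_no2, station_mae)

# Linear rescaling (for example min-max normalization) leaves r unchanged.
scaled_mae = [(v - 2.0) / 3.0 for v in station_mae]
r_scaled = pearson(mean_no2, scaled_mae)
```

Since the coefficient is invariant to positive linear rescaling, normalizing the metrics changes their scale but not the strength of the relationship.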

5.3. Optimization of Location of Stations

Lastly, the obtained results of the proposed optimization algorithm for the optimal location of air quality stations are presented. First, the results of the Pareto frontier are shown to obtain the optimal number of stations to maximize the percentage of area coverage and minimize the percentage of overlap between stations. In addition, a scenario with minimal changes is considered, in which a total of 10 stations are placed throughout the city, ensuring that they are as far apart from each other as possible.
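The idea of placing stations as far apart as possible can be illustrated with a greedy farthest-point heuristic. This is explicitly not the optimization algorithm proposed in the study; it is a simpler stand-in that conveys the dispersion objective. The function name `greedy_dispersion` and the planar coordinates are hypothetical.

```python
import math


def greedy_dispersion(fixed, candidates, n_new):
    """Pick n_new candidates, each time the one farthest from all chosen sites."""
    chosen = list(fixed)
    remaining = list(candidates)
    picked = []
    for _ in range(n_new):
        # Score each candidate by its distance to the nearest chosen site.
        best = max(remaining, key=lambda c: min(math.dist(c, s) for s in chosen))
        picked.append(best)
        chosen.append(best)
        remaining.remove(best)
    return picked


# Hypothetical planar coordinates in km (not real station locations).
current = [(0.0, 0.0), (1.0, 0.0)]
candidate_sites = [(0.5, 0.1), (4.0, 0.0), (2.0, 3.0), (4.2, 0.2)]
new_sites = greedy_dispersion(current, candidate_sites, n_new=2)
```

A greedy rule of this kind is fast but can miss the best overall configuration, which is one reason to prefer a dedicated optimization algorithm for the full problem.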

Figure 10 illustrates the results of the Pareto frontier, depicting two key relationships concerning the number of open air quality stations. The left sub-plot displays the number of open stations alongside the percentage coverage achieved by each configuration. As expected, increasing the number of open stations generally leads to improved coverage across the air quality monitoring network. The right sub-plot plots the number of open stations against the percentage of overlap between these stations. As the number of open stations increases, so does the likelihood of overlap between their coverage areas. Initially, as the number of stations grows, the overlap percentage remains at 0%. However, beyond a certain threshold, the percentage of overlap escalates rapidly.
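The two quantities plotted in such a figure can be estimated numerically, as in the sketch below. It rests on assumptions that are not taken from the study: each station covers a disc of fixed radius, the city is a square of side `size`, coverage is the share of grid cells within at least one disc, and overlap is the share within at least two. The function name `coverage_and_overlap` and all coordinates are hypothetical.

```python
import math


def coverage_and_overlap(stations, radius, size=10.0, step=0.5):
    """Estimate the % of a size x size square covered by >=1 and >=2 discs."""
    n = int(size / step)
    covered = overlapped = 0
    for ix in range(n):
        for iy in range(n):
            cell = ((ix + 0.5) * step, (iy + 0.5) * step)  # cell centre
            hits = sum(math.dist(cell, s) <= radius for s in stations)
            covered += hits >= 1
            overlapped += hits >= 2
    total = n * n
    return 100.0 * covered / total, 100.0 * overlapped / total


# Hypothetical station layouts in km.
few = [(2.5, 2.5), (7.5, 7.5)]
many = few + [(3.5, 2.5), (7.5, 6.5)]
cov_few, ovl_few = coverage_and_overlap(few, radius=2.0)
cov_many, ovl_many = coverage_and_overlap(many, radius=2.0)
```

With the two distant stations the overlap is zero, while adding stations close to the existing ones raises overlap faster than coverage, mirroring the behavior described for the right sub-plot.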

Observe that the Pareto frontier demonstrates the trade-off between increasing the number of open stations to enhance coverage and the consequent rise in overlap, which may lead to redundant monitoring and inefficient resource allocation. The optimal configuration is identified as m = 0.7, since an increase in the number of stations with the chosen locations does not lead to an increase in coverage due to overlaps and the distribution of stations. At this point, the plot displaying coverage indicates that 35 open stations offer the highest coverage without significantly increasing overlap. This suggests that the air quality monitoring network achieves a well-balanced configuration, maximizing coverage while minimizing redundancy.

Figure 11 presents the optimal scenario of open air quality stations obtained through the Pareto frontier analysis. It highlights the selected stations that maximize coverage while minimizing overlap. This optimal set includes both the initial air quality stations (denoted by red circles) and newly opened potential stations (blue circles). However, budget constraints limit the number of stations that can be installed, requiring a balance between maximizing coverage and managing the costs associated with implementation and maintenance. Thus, the goal is to demonstrate the optimal placement of the current stations, with the addition of two new stations for improved distribution, without incurring significant investment. This minimum scenario is shown in Figure 12. The stations that remain in the exact location as the initial arrangement are marked with red circles, while the new stations, selected at points of interest, are marked with blue circles.

To compare both scenarios with the current situation, we use the annual mean NO₂ concentration measured across all stations as the KPI. Each new station is assigned an average concentration value using the KNN algorithm, which was already used above for the meteorological variables. In this case, the value for each new station is a mean of the values at the three nearest current stations, weighted according to their distance. The values for the nearest current stations for the studied period are shown in Appendix A, Table A2. After applying the algorithm, the values obtained for each new station are included in Appendix A, Table A3.
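One possible reading of this assignment is an inverse-distance-weighted mean over the three nearest current stations, sketched below with the coordinates and 2020 values from Table A2. The weighting scheme, the use of planar distances in degrees, and the function name `idw_estimate` are assumptions made for illustration; the study does not spell out these details here.

```python
import math

# Current stations from Table A2: id -> (latitude, longitude, 2020 mean in µg/m³).
CURRENT = {
    50: (41.3864, 2.1874, 22.9552),
    43: (41.3853, 2.1538, 34.1354),
    44: (41.3987, 2.1534, 31.2408),
    57: (41.3875, 2.1151, 17.5895),
    4: (41.4039, 2.2045, 27.9477),
    42: (41.3788, 2.1331, 23.1318),
    54: (41.4261, 2.1480, 21.3454),
    58: (41.4184, 2.1239, 8.3513),
}


def idw_estimate(lat, lon, stations=CURRENT, k=3):
    """Inverse-distance-weighted mean of the k nearest stations.

    Distances are planar distances in degrees (a simplifying assumption).
    """
    nearest = sorted(
        (math.hypot(s_lat - lat, s_lon - lon), value)
        for s_lat, s_lon, value in stations.values()
    )[:k]
    if nearest[0][0] == 0.0:  # the query point coincides with a station
        return nearest[0][1]
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)


# Candidate location 1 from Table A3 (latitude 41.3618, longitude 2.1381).
estimate = idw_estimate(41.3618, 2.1381)
```

Under these assumptions, the estimate for candidate location 1 in 2020 is about 25.02 µg/m³, close to the value listed for that location in Table A3. This is consistent with, but does not confirm, a weighting of this kind.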

Figure 13 shows heat maps created to facilitate visual comparison, considering the annual mean concentration value for each station for 2023. The comparison of these maps reveals that the optimal scenario covers a larger portion of the city and provides a more representative depiction of the current situation. Although the algorithm’s assigned values do not account for traffic conditions or other urban characteristics, they are considered a reasonable approximation for comparing the scenarios.

Using the values assigned to the new stations, the NO₂ concentration KPI for the study period is calculated for the different scenarios. The values obtained for each case are shown in Table 5. Throughout the study period, both the minimum and optimal scenarios yield higher KPI values. For the minimum scenario, the KPI increases by 4% to 6%, while for the optimal scenario, the KPI increases by 6% to 9% compared to the current situation. This confirms that the current scenario may be underestimating this indicator for the city and may not capture all the necessary information to accurately represent reality.
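The arithmetic behind such a comparison is a mean across stations followed by a relative change, as sketched below. The current-station values are the 2020 column of Table A2. The extended scenario is purely illustrative: it appends the 2020 values of candidate locations 5 and 18 from Table A3, a selection chosen arbitrarily for the example that does not correspond to the minimum or optimal scenario of the study. The function names `kpi` and `percent_change` are hypothetical.

```python
# 2020 annual means (µg/m³) of the eight current stations, from Table A2.
current_2020 = [22.9552, 34.1354, 31.2408, 17.5895,
                27.9477, 23.1318, 21.3454, 8.3513]

# Illustrative scenario only: current stations plus candidate locations 5 and
# 18 from Table A3. This selection is an assumption, not the paper's scenario.
scenario_2020 = current_2020 + [30.0775, 31.1350]


def kpi(values):
    """KPI as the mean of the annual station means."""
    return sum(values) / len(values)


def percent_change(new, old):
    """Relative change of `new` with respect to `old`, in percent."""
    return 100.0 * (new - old) / old


kpi_current = kpi(current_2020)
change = percent_change(kpi(scenario_2020), kpi_current)
```

Because the KPI is a plain mean over stations, adding stations in areas with above-average concentrations raises it, which is the mechanism behind the increases reported in Table 5.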

6. Conclusions

This study evaluates the suitability of using the daily average concentration of NO₂ as a KPI to assess air quality in a smart city like Barcelona. This evaluation builds on the availability of high-quality initial data that accurately reflect the actual concentration levels across the city as measured by monitoring stations. After analyzing the behavior of NO₂ using the available data, important relationships are evident between the concentration measured at each station and its location. This is not only linked to traffic in the area but also to the city’s meteorological conditions.

Moreover, the study highlights the importance of station placement, as an inadequate distribution could result in a distorted KPI: overestimated if the stations are concentrated in high-traffic areas, or underestimated if they are mainly located in residential zones. For this reason, strategic points are identified in the city where measuring the NO₂ concentration would provide significant added value. Given that the optimal solution for the distribution of air quality monitoring stations could be very costly, a more conservative alternative is also proposed that minimizes investment. This solution only requires the installation of two additional stations and the relocation of some existing ones to achieve more representative results. Furthermore, it is observed that when the stations are better distributed, the NO₂ KPI value exceeds the thresholds set by the WHO. This suggests that Barcelona needs continuous and precise monitoring of NO₂ levels to quantify the effects of the policies implemented in the city, enabling informed decision-making that improves air quality.

As a future line of research, it would be highly relevant to correlate real-time traffic data in the city with NO₂ concentration data and consider the population density in each zone. This integration would allow for a better understanding of the relationship between vehicular flow, atmospheric pollutant concentrations, and the population exposed to this pollution, providing a more comprehensive perspective on the impact of traffic on air quality in Barcelona.

Author Contributions

Conceptualization, R.S.-G. and P.C.; methodology, P.C.; software, R.S.-G. and X.A.M.; validation, P.C., R.S.-G. and X.A.M.; formal analysis, R.S.-G.; investigation, P.C., R.S.-G. and X.A.M.; resources, R.S.-G. and P.C.; data curation, R.S.-G.; writing—original draft preparation, R.S.-G. and X.A.M.; writing—review and editing, P.C. and E.P.-B.; visualization, P.C., R.S.-G. and X.A.M.; supervision, P.C.; project administration, E.P.-B.; funding acquisition, E.P.-B. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been partially funded by the Spanish Ministry of Science and Innovation (PID2022-138860NB-I00 and RED2022-134703-T), the project SUN (HORIZON-CL4-2022-HUMAN-01-14-101092612), the project UP2030 (HORIZON-MISS-2021-CIT-02-01-101096405) as well as by the Barcelona City Council and Fundació “la Caixa” under the framework of the Barcelona Science Plan 2020–2023 (grant 21S09355-001). Funding for open access charge: CRUE-Universitat Politècnica de València.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in Open Data Barcelona at https://opendata-ajuntament.barcelona.cat/ (accessed on 10 October 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Tables

Appendix A.1. Location of the 40 Potential New Air Quality Stations

Table A1. Location of the 40 potential new air quality stations.

ID	Item	Address
1	Location near highway 1	Carrer número 62
2	Location near highway 2	Carrer de portbou
3	Location near highway 3	Carrer de la mare de deu del port 1
55	Location near highway 4	Gran via Cortes Catalans
5	Location near highway 5	Passeig de García Gracia 15
6	Location ring road 1	Carrer de Can Móra (Ronda de Dalt)
7	Location ring road 2	Carrer de Scala Dei (Ronda de Dalt)
8	Location ring road 3	Carrer de Cuzco (Ronda del litoral)
9	Location ring road 4	Carrer de Vicenc Montal (Ronda del litoral)
10	Location ring road 5	Carrer de Aristides Maillol (Ronda del litoral)
11	Location city access road 1	Carrer de Sancho de Ávila 1 (Av. Meridiana)
12	Location city access road 2	Carrer de Padilla 167 (Av. Cortes Catalanas)
13	Location city access road 3	Carrer de Valencia (Cruce Calle de Padilla)
14	Location city access road 4	Carrer de Bac de Roda (Gran via de las Corts Catalanas)
15	Location city access road 5	Carrer de Rogent (Av. Meridiana)
16	Location city access road 6	Carrer de Badajoz (Av. Diagonal)
17	Location city access road 7	Carrer de Bailen (Av. Diagonal)
18	Location city access road 8	Carrer de Balmes (Gran Via Corts Catalanes)
19	Location city access road 9	Passeo de Fabra i Puig (Av. Meridiana)
20	Location city access road 10	Travesia de las Cortes
21	Sensitive location 1	Hospital del mar (Passeig Maritim 25)
22	Sensitive location 2	Hospital de Nens (Carrer Consell de Cent, 437)
23	Sensitive location 3	Hospital dos de Maig (carrer dos de Maig, 301)
24	Sensitive location 4	Hospital Quironsalud Barcelona (Plaza Alfonso Comins 5)
25	Sensitive location 5	Hospital Universitari Dexeus (Calle de Sabino Arana 19)
26	Sensitive location 6	Universidad de Barcelona (Av. Del Doctor Marañón)
27	Sensitive location 7	Universidad Abierta de Cataluña (Av. Del Tibidabo 31)
28	Sensitive location 8	Centro de dia La Torre Setze (Calle la Torre 16)
29	Sensitive location 9	Residencia de mayores Felizvita (Av. Josep Tarradellas 38)
30	Sensitive location 10	Centro residencial Bonaire (Carrer Alt de Pedrell 100)
31	Low-income neighborhood location 1	Carrer del Vesuvi/Nou Barris
32	Low-income neighborhood location 2	Passeig d’Úrrutia/Nou Barris
33	Low-income neighborhood location 3	Calle Huelva/Barrio San Martí
34	Low-income neighborhood location 4	Cami Noy de la Rambla/Barrio del Raval
35	Low-income neighborhood location 5	Carrer de la Pedrosa/Barrio Trinitat Nova
36	Recreation area location 1	Complejo deportivo Municipal Mar Bella (Av. del litoral)
37	Recreation area location 2	Camp Nou. Calle de Aristides Maillol 12
38	Recreation area location 3	Parque de Montjuic. C/Lleida 35
39	Recreation area location 4	Parque de la Ciutadella. Paseo de Pujades 22
40	Recreation area location 5	Parque Güell. Carrer de Olot 5

Appendix A.2. Current Stations: Location and Annual Mean NO₂ Concentration (µg/m³)

Table A2. Current stations: location and annual mean NO₂ concentration (µg/m³).

			2020	2021	2022
ID	Longitude	Latitude	[NO₂]	[NO₂]	[NO₂]
50	2.1874	41.3864	22.9552	24.5184	27.4114
43	2.1538	41.3853	34.1354	36.8185	25.7095
44	2.1534	41.3987	31.2408	30.8547	41.4748
57	2.1151	41.3875	17.5895	18.2762	34.1493
4	2.2045	41.4039	27.9477	25.4330	31.0970
42	2.1331	41.3788	23.1318	21.6456	21.7994
54	2.1480	41.4261	21.3454	20.2746	20.8797
58	2.1239	41.4184	8.3513	8.7778	9.0200

Appendix A.3. New Scenario Stations: Location and Annual Mean NO₂ Concentration (µg/m³)

Table A3. New scenario stations: location and annual mean NO₂ concentration (µg/m³).

			2020	2021	2022
ID	Latitude	Longitude	[NO₂]	[NO₂]	[NO₂]
1	41.3618	2.1381	25.0244	25.2807	25.9081
2	41.3758	2.1269	23.3542	23.0103	25.4241
3	41.3536	2.1504	29.2182	29.5320	28.2716
55	41.3728	2.1463	29.1856	29.5732	27.5267
5	41.3896	2.1674	30.0775	31.4250	31.4502
6	41.4014	2.1140	15.7675	15.8991	23.2605
7	41.4392	2.1564	20.5333	19.9485	22.6254
8	41.4374	2.2083	24.8566	23.8632	27.4571
9	41.3829	2.1830	25.1718	26.3397	27.6735
12	41.4004	2.1838	26.3689	26.2020	31.6768
15	41.4205	2.1864	24.6427	23.7635	27.2174
16	41.3954	2.1996	26.5924	25.7325	30.9999
18	41.3975	2.1520	31.1350	31.0611	38.4371
19	41.4325	2.1631	21.0723	20.4905	23.5300
20	41.3976	2.1325	23.8908	23.4705	32.1344
21	41.3808	2.1722	28.8873	30.3454	30.3644
22	41.3973	2.1738	28.8754	30.1216	31.5276
23	41.4106	2.1770	27.3635	26.9645	33.3680
24	41.4156	2.1385	18.9812	18.6260	21.5304
25	41.3855	2.1266	22.8310	22.8241	27.0318
28	41.4044	2.1487	29.9043	30.0493	33.9573
29	41.3833	2.1427	29.1355	29.4766	27.7437
30	41.4235	2.1710	26.2113	24.9961	30.1150
31	41.4465	2.1832	26.3096	24.9517	30.0425
33	41.4177	2.1967	27.2233	26.1888	32.0277
35	41.4495	2.1919	24.2373	23.3618	26.4753
36	41.3985	2.2093	27.2300	25.7422	31.2939
38	41.3650	2.1669	29.6844	31.1761	30.4976
40	41.4135	2.1532	27.6891	27.6332	29.7088
50	41.3853	2.1538	34.1354	36.8185	25.7095
57	41.3875	2.1151	17.5895	18.2762	34.1493
4	41.4039	2.2045	27.9477	25.4330	31.0970
54	41.4261	2.1480	21.3454	20.2746	20.8797
58	41.4184	2.1239	8.3513	8.7778	9.0200

References

Toli, A.M.; Murtagh, N. The concept of sustainability in smart city definitions. Front. Built Environ. 2020, 6, 77.
Sustainable Development Goals. Available online: https://bit.ly/2R8siwl (accessed on 17 July 2024).
International Telecommunication Union. Available online: https://unece.org/fileadmin/DAM/hlm/documents/Publications/U4SSC-CollectionMethodologyforKPIfoSSC-2017.pdf (accessed on 17 July 2024).
Bosch, P.; Jongeneel, S.; Rovers, V.; Neumann, H.M.; Airaksinen, M.; Huovila, A. CITYkeys Indicators for Smart City Projects and Smart Cities; CITYkeys Report 10. 2017. Available online: https://cordis.europa.eu/project/id/646440/reporting (accessed on 7 November 2024).
Nowicka, K. Cloud computing in sustainable mobility. Transp. Res. Procedia 2016, 14, 4070–4079.
Haddad, C. Choosing suitable indicators for the assessment of urban air mobility: A case of upper Bavaria, Germany. Eur. J. Transp. Infrastruct. Res. 2020, 20, 214–232.
Smart Cities: Key Technologies, Environmental Impact and Market Forecast 2022–2026. Available online: https://www.juniperresearch.com/researchstore/sustainability-technology-iot/smart-cities-research-report (accessed on 17 July 2024).
European Parliament: Air Pollution: Deal with Council to Improve Air Quality. Available online: https://www.europarl.europa.eu/news/es/press-room/20240219IPR17816/air-pollution-deal-with-council-to-improve-air-quality (accessed on 17 July 2024).
Soriano-Gonzalez, R.; Perez-Bernabeu, E.; Ahsini, Y.; Carracedo, P.; Camacho, A.; Juan, A.A. Analyzing key performance indicators for mobility logistics in smart and sustainable cities: A case study centered on Barcelona. Logistics 2023, 7, 75.
Catalan Institute of Statistics. Available online: https://www.idescat.cat/emex/?id=080193&lang=es (accessed on 17 July 2024).
Almalki, F.A.; Alsamhi, S.H.; Sahal, R.; Hassan, J.; Hawbani, A.; Rajput, N.; Saif, A.; Morgan, J.; Breslin, J. Green IoT for eco-friendly and sustainable smart cities: Future directions and opportunities. Mob. Netw. Appl. 2021, 28, 178–202.
Subramaniam, S.; Raju, N.; Ganesan, A.; Rajavel, N.; Chenniappan, M.; Prakash, C.; Pramanik, A.; Basak, A.K.; Dixit, S. Artificial Intelligence Technologies for Forecasting Air Pollution and Human Health: A Narrative Review. Sustainability 2022, 14, 9951.
Zhu, F.; Ding, R.; Lei, R.; Cheng, H.; Liu, J.; Shen, C.; Zhang, C.; Xu, Y.; Xiao, C.; Li, X.; et al. The short-term effects of air pollution on respiratory diseases and lung cancer mortality in Hefei: A time-series analysis. Respir. Med. 2019, 146, 57–65.
Gurjar, B.R.; Jain, A.; Sharma, A.; Agarwal, A.; Gupta, P.; Nagpure, A.; Lelieveld, J. Human health risks in megacities due to air pollution. Atmos. Environ. 2010, 44, 4606–4613.
Low-Emission Zones. Available online: https://www.idae.es/movilidad-sostenible/zonas-de-bajas-emisiones (accessed on 17 July 2024).
Ntafalias, A. A comprehensive methodology for assessing the impact of smart city interventions: Evidence from Espoo transformation process. Smart Cities 2022, 5, 90–107.
Lebrusán, I.; Toutouh, J. Using smart city tools to evaluate the effectiveness of a low emissions zone in Spain: Madrid central. Smart Cities 2020, 3, 456–478.
Golpayegani, F.; Guériau, M.; Laharotte, P.A.; Ghanadbashi, S.; Guo, J.; Geraghty, J.; Wang, S. Intelligent Shared Mobility Systems: A Survey on Whole System Design Requirements, Challenges and Future Direction. IEEE Access 2022, 10, 35302–35320.
Angelakoglou, K.; Nikolopoulos, N.; Giourka, P.; Svensson, I.L.; Tsarchopoulos, P.; Tryferidis, A.; Tzovaras, D. A methodological framework for the selection of key performance indicators to assess smart city solutions. Smart Cities 2019, 2, 269–306.
Malik, A.; Tauler, R. Exploring the interaction between O3 and NOx pollution patterns in the atmosphere of Barcelona, Spain using the MCR–ALS method. Sci. Total Environ. 2015, 517, 151–161.
Basagaña, X.; Triguero-Mas, M.; Agis, D.; Pérez, N.; Reche, C.; Alastuey, A.; Querol, X. Effect of public transport strikes on air pollution levels in Barcelona (Spain). Sci. Total Environ. 2018, 610, 1076–1082.
Gignac, F.; Righi, V.; Toran, R.; Errandonea, L.P.; Ortiz, R.; Mijling, B.; Naranjo, A.; Nieuwenhuijsen, M.; Creus, J.; Basagana, X. Short-term NO₂ exposure and cognitive and mental health: A panel study based on a citizen science project in Barcelona, Spain. Environ. Int. 2022, 164, 107284.
Pierangeli, I.; Nieuwenhuijsen, M.; Cirach, M.; Rojas-Rueda, D. Health equity and burden of childhood asthma-related to air pollution in Barcelona. Environ. Res. 2020, 186, 109067.
Benavides, J.; Snyder, M.; Guevara, M.; Soret, A.; Pérez García-Pando, C.; Amato, F.; Querol, X.; Jorba, O. CALIOPE-Urban v1.0: Coupling R-LINE with a mesoscale air quality modelling system for urban air quality forecasts over Barcelona city (Spain). Geosci. Model Dev. 2019, 12, 2811–2835.
Rodriguez-Rey, D.; Guevara, M.; Linares, M.P.; Casanovas, J.; Armengol, J.M.; Benavides, J.; Soret, A.; Jorba, O.; Tena, C.; García-Pando, C.P. To what extent the traffic restriction policies applied in Barcelona city can improve its air quality? Sci. Total Environ. 2022, 807, 150743.
Cican, G.; Buturache, A.N.; Mirea, R. Applying Machine Learning Techniques in Air Quality Prediction—A Bucharest City Case Study. Sustainability 2023, 15, 8445.
Wu, C.l.; Song, R.f.; Zhu, X.h.; Peng, Z.r.; Fu, Q.y.; Pan, J. A hybrid deep learning model for regional O₃ and NO₂ concentrations prediction based on spatiotemporal dependencies in air quality monitoring network. Environ. Pollut. 2023, 320, 121075.
Tao, C.; Jia, M.; Wang, G.; Zhang, Y.; Zhang, Q.; Wang, X.; Wang, Q.; Wang, W. Time-sensitive prediction of NO₂ concentration in China using an ensemble machine learning model from multi-source data. J. Environ. Sci. 2024, 137, 30–40.
El Mghouchi, Y.; Udristioiu, M.T.; Yildizhan, H. Multivariable Air-Quality Prediction and Modelling via Hybrid Machine Learning: A Case Study for Craiova, Romania. Sensors 2024, 24, 1532.
World Health Organization. Available online: https://www.who.int/en/news-room/fact-sheets/detail/ambient-(outdoor)-air-quality-and-health (accessed on 17 July 2024).
Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362.
McKinney, W. Data structures for statistical computing in python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28–30 June 2010; Volume 445, pp. 51–56.
Waskom, M.L. Seaborn: Statistical Data Visualization. 2021. Available online: https://seaborn.pydata.org/ (accessed on 10 October 2024).
Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95.
Folium contributors. Folium: Python Data, Leaflet.js Maps. 2021. Available online: https://github.com/python-visualization/folium (accessed on 10 October 2024).
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
Munir, S.; Mayfield, M.; Coca, D. Understanding spatial variability of NO₂ in urban areas using spatial modelling and data fusion approaches. Atmosphere 2021, 12, 179.
Fan, C.; Li, Z.; Li, Y.; Dong, J.; van der A, R.; de Leeuw, G. Variability of NO₂ concentrations over China and effect on air quality derived from satellite and ground-based observations. Atmos. Chem. Phys. 2021, 21, 7723–7748.
Zhu, Y.; Chen, J.; Bi, X.; Kuhlmann, G.; Chan, K.L.; Dietrich, F.; Brunner, D.; Ye, S.; Wenig, M. Spatial and temporal representativeness of point measurements for nitrogen dioxide pollution levels in cities. Atmos. Chem. Phys. 2020, 20, 13241–13251.
Ortiz, A.F.; Jiménez Núñez, M.d.l.L.; Díaz Godoy, R.V. Study of the behavior of air parcels, using PIXE, Hysplit and wind rose in the metropolitan zone of Toluca Valley, Mexico. J. Energy Res. Rev. 2021, 9, 51–66.
Sun, S.; Tian, L.; Cao, W.; Lai, P.C.; Wong, P.P.Y.; Lee, R.S.y.; Mason, T.G.; Krämer, A.; Wong, C.M. Urban climate modified short-term association of air pollution with pneumonia mortality in Hong Kong. Sci. Total Environ. 2019, 646, 618–624.
Chang, Y.S.; Chiao, H.T.; Abimannan, S.; Huang, Y.P.; Tsai, Y.T.; Lin, K.M. An LSTM-based aggregated model for air pollution forecasting. Atmos. Pollut. Res. 2020, 11, 1451–1463.
Yadav, M.; Singh, N.K.; Sahu, S.P.; Padhiyar, H. Investigations on air quality of a critically polluted industrial city using multivariate statistical methods: Way forward for future sustainability. Chemosphere 2022, 291, 133024.
Govender, P.; Sivakumar, V. Application of k-means and hierarchical clustering techniques for analysis of air pollution: A review (1980–2019). Atmos. Pollut. Res. 2020, 11, 40–56.
Saputra, D.M.; Saputra, D.; Oswari, L.D. Effect of distance metrics in determining k-value in k-means clustering using elbow and silhouette method. In Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019), Palembang, Indonesia, 16 November 2019; Atlantis Press: Amsterdam, The Netherlands, 2020; pp. 341–346.
Xu, D.; Tian, Y. A comprehensive survey of clustering algorithms. Ann. Data Sci. 2015, 2, 165–193.
Jaeger, A.; Banks, D. Cluster analysis: A modern statistical review. Wiley Interdiscip. Rev. Comput. Stat. 2023, 15, e1597.
Masood, A.; Ahmad, K. A review on emerging artificial intelligence (AI) techniques for air pollution forecasting: Fundamentals, application and performance. J. Clean. Prod. 2021, 322, 129072.
Li, Y.; Guo, J.e.; Sun, S.; Li, J.; Wang, S.; Zhang, C. Air quality forecasting with artificial intelligence techniques: A scientometric and content analysis. Environ. Model. Softw. 2022, 149, 105329.
Kothandaraman, D.; Praveena, N.; Varadarajkumar, K.; Madhav Rao, B.; Dhabliya, D.; Satla, S.; Abera, W. Intelligent forecasting of air quality and pollution prediction using machine learning. Adsorpt. Sci. Technol. 2022, 2022, 5086622.
Parhizkar, T.; Rafieipour, E.; Parhizkar, A. Evaluation and improvement of energy consumption prediction models using principal component analysis based feature reduction. J. Clean. Prod. 2021, 279, 123866.
Dun, M.; Xu, Z.; Chen, Y.; Wu, L. Short-Term Air Quality Prediction Based on Fractional Grey Linear Regression and Support Vector Machine. Math. Probl. Eng. 2020, 2020, 8914501.
Gariazzo, C.; Carlino, G.; Silibello, C.; Renzi, M.; Finardi, S.; Pepe, N.; Radice, P.; Forastiere, F.; Michelozzi, P.; Viegi, G.; et al. A multi-city air pollution population exposure study: Combined use of chemical-transport and random-Forest models with dynamic population data. Sci. Total Environ. 2020, 724, 138102.
Das, B.; Dursun, Ö.O.; Toraman, S. Prediction of air pollutants for air quality using deep learning methods in a metropolitan city. Urban Clim. 2022, 46, 101291.
Sekeroglu, B.; Ever, Y.K.; Dimililer, K.; Al-Turjman, F. Comparative evaluation and comprehensive analysis of machine learning models for regression problems. Data Intell. 2022, 4, 620–652.
Hrust, L.; Klaić, Z.B.; Križan, J.; Antonić, O.; Hercog, P. Neural network forecasting of air pollutants hourly concentrations using optimised temporal averages of meteorological variables and pollutant concentrations. Atmos. Environ. 2009, 43, 5588–5596.
European Commission’s Mobility Observatory. Available online: https://urban-mobility-observatory.transport.ec.europa.eu/ (accessed on 17 July 2024).
Kanaroglou, P.S.; Jerrett, M.; Morrison, J.; Beckerman, B.; Arain, M.A.; Gilbert, N.L.; Brook, J.R. Establishing an air pollution monitoring network for intra-urban population exposure assessment: A location-allocation approach. Atmos. Environ. 2005, 39, 2399–2409.
Martí, R.; Martínez-Gavara, A.; Sánchez-Oro, J. The capacitated dispersion problem: An optimization model and a memetic algorithm. Memetic Comput. 2021, 13, 131–146.
Carlson, S. International transmission of information and the business firm. Ann. Am. Acad. Political Soc. Sci. 1974, 412, 55–63.
Gomez, J.F.; Panadero, J.; Tordecilla, R.D.; Castaneda, J.; Juan, A.A. A multi-start biased-randomized algorithm for the capacitated dispersion problem. Mathematics 2022, 10, 2405.
Estrada-Moreno, A.; Savelsbergh, M.; Juan, A.A.; Panadero, J. Biased-randomized iterated local search for a multiperiod vehicle routing problem with price discounts for delivery flexibility. Int. Trans. Oper. Res. 2019, 26, 1293–1314. [Google Scholar] [CrossRef]
Shoari, N.; Heydari, S.; Blangiardo, M. School neighbourhood and compliance with WHO-recommended annual NO₂ guideline: A case study of Greater London. Sci. Total Environ. 2022, 803, 150038. [Google Scholar] [CrossRef]

Figure 1. Spatial distribution of air quality stations (red) and meteorological stations (green).

Figure 2. Initial situation of air quality stations (red) and sensitive points for the possible location of new stations (blue).

Figure 3. Evolution of the mean daily NO₂ concentration at each station.

Figure 4. Average hourly distribution of NO₂ concentrations by day of the week.

Figure 5. Annual mean NO₂ concentration at each station, with the WHO limit value for the 24 h average (yellow) and the 2023 limit for the annual average (red).

Figure 6. Daily average of NO₂ concentrations in the period under study.

Figure 7. Wind rose for each station.

Figure 8. Clusters of stations. (a) Air quality stations. (b) Meteorological stations.

Figure 9. Relationship between the average NO₂ concentration of the stations and the MAE. (a) With all stations. (b) Without station 50.

Figure 10. Pareto chart.

Figure 11. Optimal scenario: final solution of the problem, with the maximum number of stations covering the maximum area without overlapping.

Figure 12. Minimum scenario: new locations of the minimum number of stations that should be considered, as given by the solution of the problem.

Figure 13. Heat map comparison with annual average NO₂ concentrations in 2023. (a) Heat map with the initial stations. (b) Heat map with the minimum scenario. (c) Heat map with the 39 stations.

Table 1. Air quality stations and their locations.

Station ID	Longitude	Latitude
50	2.1874	41.38640
43	2.1538	41.38530
44	2.1534	41.39870
57	2.1151	41.38750
4	2.2045	41.40390
42	2.1331	41.37880
54	2.1480	41.42610
58	2.1239	41.41843

Table 2. Meteorological stations and their locations.

Station ID	Longitude	Latitude
D5	2.12379	41.41864
X2	2.18847	41.38943
X4	2.16775	41.38390
X8	2.10540	41.37919

Table 3. Meteorological station assigned to each air quality station by the KNN algorithm.

Air Quality Station	Meteorological Station
50	X4
43	X4
44	X4
57	X8
4	X4
42	X8
54	D5
58	D5
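
The pairing in Table 3 can be explored with a one-nearest-neighbour search over the coordinates in Tables 1 and 2. The sketch below is illustrative only: the haversine distance, the Earth radius of 6371 km, and the names `haversine_km` and `nearest_meteo` are assumptions, not the authors' implementation. With all four meteorological stations as candidates, stations 50 and 4 fall closest to X2, whereas Table 3 assigns no station to X2; leaving X2 out of the candidate set (an assumption made here, since this section does not state why X2 is absent) yields the same pairing as Table 3.

```python
import math

# Coordinates (longitude, latitude) copied from Tables 1 and 2.
AIR_QUALITY = {
    "50": (2.1874, 41.38640), "43": (2.1538, 41.38530),
    "44": (2.1534, 41.39870), "57": (2.1151, 41.38750),
    "4": (2.2045, 41.40390), "42": (2.1331, 41.37880),
    "54": (2.1480, 41.42610), "58": (2.1239, 41.41843),
}
METEO = {
    "D5": (2.12379, 41.41864), "X2": (2.18847, 41.38943),
    "X4": (2.16775, 41.38390), "X8": (2.10540, 41.37919),
}


def haversine_km(a, b):
    """Great-circle distance in km between two (lon, lat) points in degrees."""
    lon1, lat1, lon2, lat2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_meteo(candidates):
    """Map each air quality station to its closest candidate station (k = 1)."""
    return {
        aq: min(candidates, key=lambda m: haversine_km(pos, candidates[m]))
        for aq, pos in AIR_QUALITY.items()
    }


# Assumption: X2 is excluded to mirror its absence from Table 3.
all_four = nearest_meteo(METEO)
without_x2 = nearest_meteo({k: v for k, v in METEO.items() if k != "X2"})

for station in AIR_QUALITY:
    print(station, all_four[station], without_x2[station])
```

Printing both mappings side by side makes it easy to see which pairings depend on whether X2 is available.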

Table 4. Errors of ML models used to predict NO₂ concentrations by station.

Station	50			43			44
Model	MAE	RMSE	R²	MAE	RMSE	R²	MAE	RMSE	R²
KNN	0.1221	0.1575	0.2723	0.1090	0.1431	0.3197	0.1112	0.1470	0.3056
Decision Tree	0.0023	0.0065	0.9987	0.0016	0.0035	0.9996	0.0017	0.0033	0.9996
SVR	0.1442	0.1874	−0.0295	0.1368	0.1740	−0.0057	0.1399	0.1781	−0.0201
Random Forest	0.0016	0.0057	0.9991	0.0009	0.0027	0.9998	0.0011	0.0030	0.9997
ANN	0.1524	0.2031	−0.2094	0.1304	0.1668	0.0767	0.1641	0.1957	−0.2306
Station	4			42			54
Model	MAE	RMSE	R²	MAE	RMSE	R²	MAE	RMSE	R²
KNN	0.1175	0.1544	0.2932	0.1001	0.1332	0.3130	0.0989	0.1293	0.2864
Decision Tree	0.0015	0.0025	0.9998	0.0018	0.0050	0.9990	0.0013	0.0030	0.9996
SVR	0.1467	0.1851	−0.0161	0.1250	0.1619	−0.0151	0.1165	0.1541	−0.0139
Random Forest	0.0011	0.0028	0.9998	0.0009	0.0024	0.9998	0.0008	0.0021	0.9998
ANN	0.1506	0.1952	−0.1296	0.1267	0.1697	−0.1153	0.1320	0.1741	−0.2942
Station	57			58
Model	MAE	RMSE	R²	MAE	RMSE	R²
KNN	0.0891	0.1206	0.3603	0.0768	0.0980	0.2597
Decision Tree	0.0016	0.0042	0.9992	0.0009	0.0020	0.9997
SVR	0.1152	0.1511	−0.0048	0.0878	0.1139	0.0015
Random Forest	0.0008	0.0017	0.9999	0.0006	0.0016	0.9998
ANN	0.1013	0.1391	0.1482	0.1370	0.1630	−1.046

Note: Values in bold represent the best results for each metric across models.
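
For readers interpreting Table 4, the following minimal sketch computes MAE, RMSE, and R² from their standard definitions. It assumes the authors used these usual formulas; the function name `regression_errors` and the toy values are hypothetical and are not taken from the study's data. The example also shows why R² can be negative, as in the SVR and ANN rows: a model whose squared error exceeds that of always predicting the observed mean scores below zero.

```python
import math


def regression_errors(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for paired observed and predicted values."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(r) for r in residuals) / n
    ss_res = sum(r * r for r in residuals)          # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    r2 = 1 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical toy values, for illustration only.
observed = [0.2, 0.4, 0.6, 0.8]
close_fit = [0.21, 0.39, 0.61, 0.79]   # small errors, R^2 near 1
mean_only = [0.5, 0.5, 0.5, 0.5]       # always predicts the mean, R^2 = 0
reversed_fit = [0.8, 0.6, 0.4, 0.2]    # worse than the mean, R^2 < 0

for name, pred in [("close", close_fit), ("mean", mean_only), ("reversed", reversed_fit)]:
    print(name, regression_errors(observed, pred))
```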

Table 5. Comparison of the current NO₂ concentration KPI values (µg/m³) with two scenarios.

	2020	2021	2022
Current Scenario [NO₂]	23.3372	23.3249	26.4427
Minimum Scenario [NO₂]	24.3717	24.0780	28.2378
Optimal Scenario [NO₂]	25.3184	25.2244	28.2392
Increase in the Minimum Scenario (%)	4.43	3.23	6.78
Increase in the Optimal Scenario (%)	8.49	8.14	6.79
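
The percentage rows in Table 5 follow from the concentration rows as the relative change with respect to the current scenario. The sketch below recomputes them; the dictionary and function names are illustrative assumptions. Recomputing from the printed concentrations gives about 6.79 for the minimum scenario in 2022, where the table prints 6.78, a difference that is consistent with rounding or truncation of the underlying values.

```python
# KPI values (µg/m³) copied from Table 5.
CURRENT = {2020: 23.3372, 2021: 23.3249, 2022: 26.4427}
MINIMUM = {2020: 24.3717, 2021: 24.0780, 2022: 28.2378}
OPTIMAL = {2020: 25.3184, 2021: 25.2244, 2022: 28.2392}


def pct_increase(scenario, baseline):
    """Relative increase of a scenario over the baseline, in percent, per year."""
    return {
        year: 100 * (scenario[year] - baseline[year]) / baseline[year]
        for year in baseline
    }


min_increase = pct_increase(MINIMUM, CURRENT)
opt_increase = pct_increase(OPTIMAL, CURRENT)

for year in CURRENT:
    print(year, round(min_increase[year], 2), round(opt_increase[year], 2))
```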

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
