Article

Modeling and Optimization of NO2 Stations in the Smart City of Barcelona

by Raquel Soriano-Gonzalez, Xabier A. Martin, Elena Perez-Bernabeu and Patricia Carracedo *
Research Center on Production Management and Engineering, Universitat Politècnica de València, Ferrandiz-Carbonell, 03802 Alcoy, Spain
* Author to whom correspondence should be addressed.
Appl. Sci. 2024, 14(22), 10355; https://doi.org/10.3390/app142210355
Submission received: 11 October 2024 / Revised: 1 November 2024 / Accepted: 8 November 2024 / Published: 11 November 2024
(This article belongs to the Special Issue Sustainable Urban Mobility)

Abstract

The growing problem of nitrogen dioxide (NO2) pollution in urban environments is driving cities to adopt smart and sustainable approaches to address this challenge. To quantify and compare the effect of environmental policies, cities must be able to make informed decisions with real-time data that reflect the actual situation. Therefore, the objective of this work is threefold: The first is to study the behavior of the key performance indicator (KPI) of NO2 concentrations per station in Barcelona through exploratory analysis and clustering. The second is to predict NO2 concentration behavior, considering meteorological data. Lastly, a new distribution of current and new stations will be proposed using an optimization algorithm that maximizes the distance between them and covers the largest area of the city. As a result of this study, the importance of the location of measurement points and the need for better distribution in the city are highlighted. These new spatial distributions predict an 8% increase in NO2 concentrations. In conclusion, this study is a comprehensive tool for obtaining an accurate representation of NO2 concentrations in the city, contributing to informed decision-making, helping to improve air quality, and promoting a more sustainable urban environment.

1. Introduction

In recent years, there has been a growing interest among municipal authorities worldwide in the study and research of smart and sustainable cities. This concept combines human and social capital with the city’s infrastructure capital to achieve a sustainable, livable, and efficient city [1]. These cities strive to contribute to the Sustainable Development Goals [2] and attain a carbon-neutral future, as well as a better quality of life for their residents. To achieve these goals, cities are assessed in various areas to comply with national and international strategic plans for decarbonization and environmental sustainability. In this context, key performance indicators (KPIs) have been developed to measure the sustainability of cities, some of which are included in the list elaborated by the International Telecommunication Union in 2017 [3], as well as in various European projects conducted to evaluate smart cities [4]. Within the study of sustainable mobility in smart cities, particularly in terms of passenger and freight transport, aspects related to energy, environment, safety, and security are considered, taking into account the real-time socio-economic dimensions [5]. These aspects are part of the pillars included in the studies of KPIs for smart and sustainable cities, which should be used to evaluate defined objectives with high viability (i.e., data availability and accessibility) and a perceived high importance value by the indicator [6].
Specifically, Barcelona is one of the most populous cities in the European Union and was ranked among the top five smart cities in 2022, according to a study conducted by Juniper Research: “Smart Cities: Key Technologies, Environmental Impact & Market Forecasts 2022–2026” [7]. Despite being considered a smart city, analyses initiated in the city on pollutant concentrations indicate that it is well above the targets set by the European Parliament [8], both in terms of peak nitrogen dioxide (NO2) concentration values and the citywide average in µg/m3 [9]. Another issue with this indicator is whether the average values obtained in the city capture the true behavior of NO2 across its entire extent. Considering that Barcelona is the second most populous city in Spain, with a population density of 16,339 inhabitants per km2 and an area of 101.37 km2 [10], using only eight air quality stations may not provide sufficient information about the behavior of pollutants in the city. This study stems from the need to create a KPI that accurately represents NO2 concentrations throughout the city in a truthful and instantaneous manner [11].
Therefore, the study has three main objectives: (i) to study the behavior of NO2 concentrations, (ii) to predict NO2 concentrations considering the influence of meteorological variables, and (iii) to propose new distributions of air quality stations in the city to obtain information on real pollution levels. To achieve this, we employ a variety of Artificial Intelligence (AI) and Machine Learning (ML) models for analysis and prediction, alongside an optimization algorithm to relocate the air quality stations and provide an optimal distribution. This algorithm includes an original concept for estimating potential new stations based on the identification of sensitive points in the city and the effective area of action of each station. One of the limitations of our study lies in the limited number of air quality stations available in the city. This results in restricted coverage and can provide a biased view of the actual pollution situation in Barcelona, as the measurements are concentrated in a few locations and do not adequately reflect the spatial variability of pollution levels across the entire city. Our proposed optimization algorithm allows for the identification and prioritization of sensitive points in the city, where the presence of emission sources or the vulnerability of the population makes the information obtained more valuable. By using this algorithm, we aim to improve the representativeness of the data and provide a more accurate and comprehensive assessment of air quality in densely populated urban areas. This comprehensive and detailed approach will allow us to identify specific areas with higher concentration levels, elucidate the concentration behavior, and assess the effectiveness of current policies and initiatives to reduce pollution levels. Ultimately, this approach will provide valuable insights for designing specific and practical strategies to improve air quality and promote sustainable urban development in the city.
The remainder of this paper is structured as follows. Section 2 reviews the recent literature on the use of NO2 indicators in sustainable mobility and its role as an indicator of pollution in smart cities. Section 3 introduces the case study on Barcelona and the data sources used. Section 4 presents the methodology used to achieve the aforementioned objectives, while Section 5 presents the results obtained. Lastly, Section 6 outlines the main conclusions and suggests potential avenues for future research.

2. Overview of NO2 Concentrations

The growing concentration of urban populations has led to environmental challenges, such as increased air pollution, which affects health in urban settings. A primary contributor to this pollution is NO2, primarily emitted by diesel engines, which poses serious environmental and human health risks. Subramaniam et al. [12] highlighted that NO2 significantly contributes to global warming, the greenhouse effect, and climate change. This pollutant is also a primary cause of acid rain, which harms aquatic and terrestrial ecosystems. Regarding human health, Zhu et al. [13] showed that exposure to elevated NO2 is linked to respiratory diseases and lung cancer, with significant mortality rates from these diseases in a studied population in Hefei, China. Women with respiratory diseases appeared more susceptible to air pollution than men. Additionally, Gurjar et al. [14] warned about health risks in megacities, noting that some cities face higher risks due to high levels of pollutants like NO2, especially in South Asia.
To address these issues, traffic reduction measures, such as pedestrian zones and Low-Emission Zones (LEZs), have been implemented [15]. The effectiveness of these measures is often questioned, and this is where smart city tools play a crucial role by providing data to scientifically evaluate their impact. Various approaches have been proposed to evaluate smart city interventions. For instance, Ntafalias [16] proposed a seven-step methodology for assessing the impact of these interventions, emphasizing the importance of a comprehensive analysis of the city’s long-term vision and cooperation among all stakeholders. Additionally, Lebrusán and Toutouh [17] analyzed the effectiveness of an LEZ in Madrid, demonstrating its capability to significantly reduce air pollution and noise in the city. Analyzing Shared Mobility Systems (SMSs) is another critical approach to addressing urban transportation challenges. Golpayegani et al. [18] emphasized the need to address traffic-related NO2 concentrations and how SMS solutions can contribute to reducing this pollution in urban areas. To evaluate cities, Angelakoglou et al. [19] developed a repository of 75 KPIs in six dimensions, including environmental aspects and concentrations of air pollutants. This repository can serve as a basis for assessing the impact of solutions aimed at improving air quality and reducing pollution in urban environments. The mentioned articles highlight the importance of addressing gas pollutant concentrations, such as NO2, in the context of smart and sustainable cities. A holistic approach involving all stakeholders and interconnected entities is essential to evaluate the impact of smart city interventions and achieve greater efficiency in the urban mobility system.
The application of AI and ML techniques, as shown by Subramaniam et al. [12], has been instrumental in developing effective pollution control strategies by predicting NO2 concentrations more accurately. In particular, several works on atmospheric pollution and NO2 concentrations in Barcelona aimed to understand the dynamics of key pollutants, such as NO, NO2, and O3. Malik and Tauler [20] utilized the Multivariate Curve Resolution–Alternating Least Squares technique to analyze temporal variations and diurnal profiles of these pollutants. Basagaña et al. [21] examined the impact of public transportation strikes on air quality, revealing increased NOx and black carbon levels during strikes. Gignac et al. [22] investigated the short-term effects of NO2 exposure on cognitive and mental health, while Pierangeli et al. [23] estimated childhood asthma cases attributable to air pollution. Benavides et al. [24] developed accurate urban air quality models using operational prediction systems and specific dispersion models. Rodriguez-Rey et al. [25] evaluated traffic restriction measures in Barcelona, focusing on reducing NO2 levels. Recently, Cican et al. [26] applied two ML techniques to predict the air quality in the city of Bucharest. In particular, the authors used advanced recurrent neural networks, specifically Long Short-Term Memory (LSTM) and Gated Recurrent Unit models, which achieved improved performance over traditional methods. Wu et al. [27] introduced a novel deep learning model that combines Residual Neural Network, Graph Convolutional Network, and bidirectional LSTM architectures to improve the short-term regional predictions of NO2 and O3 concentrations in Shanghai (China). Similarly, Tao et al. [28] developed an ensemble ML model that incorporated deep learning to forecast NO2 levels using data from 1609 air quality monitors in China. Lastly, El Mghouchi et al. [29] explored multivariable air quality predictions using five hybrid ML models to analyze the relationships between meteorological factors and particulate matter concentrations in Craiova (Romania).
The growing body of research highlights the necessity of addressing air pollution, particularly NO2, in urban areas to safeguard public health and improve air quality. The combination of traffic reduction measures, the promotion of SMS, and the use of smart city tools are critical steps toward creating healthier urban environments. Active collaboration with local communities and policymakers is essential to ensure the successful implementation of these strategies, ultimately contributing to a more sustainable future for cities.

3. Case Study: Barcelona City

This study extracts data from Barcelona’s Open Data Air Quality dataset, which stores the hourly measurements of various pollutants from the city’s eight air quality stations, along with daily meteorological data measured at four meteorological stations.

3.1. Air Quality Stations

To obtain the pollutant concentration data from each air quality station, the corresponding data for each year in the study period (2020–2023) must be downloaded from Open Data Barcelona (https://opendata-ajuntament.barcelona.cat/, accessed on 10 October 2024). These files include the station name, population, province, pollutant code, year, month, day, and hourly variables reflecting the concentration measurements for each pollutant. In addition to these files, it is necessary to download the location of each station and the pollutant codes. For our study, we focused on measurements of NO2, which has code 8 in the extracted datasets. The station locations related to NO2 measurements are detailed in Table 1.
It is important to note that NO2 concentrations are measured in µg/m3, and historical data in this format have been available since 2018. The average concentrations per year were close to the recommended limit set by the World Health Organization (WHO) at the time (25 µg/m3). In December 2022, the WHO updated its global air quality guidelines, setting a new target of 10 µg/m3 for the annual average and a maximum value of 25 µg/m3 for a 24 h period [30].

3.2. Meteorological Stations

To obtain data on the meteorological variables measured in Barcelona during the study period, the corresponding files for each year are downloaded from Open Data Barcelona. Each file contains the measurement date, hour (if applicable), station code, meteorological variable code, and the corresponding value. A separate file from Open Data provides the station identifiers and coordinates, while another file in the same repository lists the codes for meteorological variables and the units in which they are measured. Table 2 displays the stations’ identifiers and coordinates.
The selected variables for the analysis of meteorological data are Daily Mean Temperature (TM), measured in °C; Daily Mean Relative Humidity (HRM), measured in %; Daily Mean Atmospheric Pressure (PM), measured in hPa; Accumulated Daily Precipitation (PPT), measured in mm; Daily Solar Radiation (RS24h), measured in MJ/m2; Daily Mean Wind Speed at 10 m (VVM10), measured in m/s; and Daily Mean Wind Direction at 10 m (DVM10), measured in degrees. Each variable’s maximum, minimum, and mean values are measured at each meteorological station. For the study, we work with the mean value of each variable. Station X2 only has records for the temperature and relative humidity variables throughout the entire period, and the reasons for this data gap are unknown. Due to the lack of data, this station is not considered in the study. Figure 1 shows the location of all the air quality and meteorological stations in Barcelona that are considered in the study.

4. Methodology

The methodology employed in this study to achieve the proposed objectives is described in this section. Various tools are used to conduct the descriptive analysis, predict NO2 behavior, and optimize the distribution of stations using an optimization algorithm. In particular, the descriptive analysis is performed using Python 3.10. The libraries NumPy [31], Pandas [32], Seaborn [33], and Matplotlib [34] are used for data analysis and visualization. The Folium library [35] is used to work with the map of Barcelona and generate heat maps. The Geocoder library is used to obtain coordinates for points of interest in Barcelona. The Scikit-learn library [36] is used for the clustering analysis of concentrations at the stations. The same library is used to apply the ML models that predict concentrations at each station. The optimization algorithm is also implemented in Python 3.10, with the support of the NumPy and Pandas libraries to perform array manipulations and manage the input datasets, respectively.

4.1. Descriptive Analysis

A detailed analysis of the measured NO2 concentrations across the city is conducted to identify and observe patterns over different time frames, including daily and weekly behavior. Various graphical representations are created to explore the temporal dynamics of NO2 concentrations [37].
First, daily averages are analyzed to examine general trends in NO2 behavior and its daily and weekly variability, which reflect the influence of traffic emissions and weather conditions [38]. Additionally, weekly patterns are explored to identify potential cyclical behaviors influenced by factors such as urban mobility and industrial activity. Finally, a spatial analysis is conducted to distinguish the behavior of NO2 concentrations at individual monitoring stations. This approach examines the spatial and temporal representativeness of NO2 monitoring stations in urban settings, highlighting the importance of capturing local variations to obtain a complete picture of air quality [39]. The analysis is further complemented by using meteorological variables obtained from stations distributed throughout the city. A wind rose is generated to visualize and understand the predominant wind directions and speeds in the city. This tool is essential for identifying local climate patterns and variations in different parts of the city [40]. This analysis helps in understanding how meteorological conditions contribute to the dispersion of atmospheric pollutants, highlighting the influence of urban climates on air quality [41]. Additionally, integrating meteorological data with air quality models is crucial to obtain a more accurate view of pollutant dispersion in urban areas [42].
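As a rough illustration of the tabulation behind a wind rose, the sketch below bins wind directions into 16 compass sectors and reports the share of observations per sector. The function names and the sample directions are hypothetical, and only direction frequencies are handled here; a full wind rose would also split each sector by wind speed class.

```python
from collections import Counter

# 16 compass sectors, each 22.5 degrees wide, centred on N, NNE, NE, ...
SECTORS = ["N", "NNE", "NE", "ENE", "E", "ESE", "SE", "SSE",
           "S", "SSW", "SW", "WSW", "W", "WNW", "NW", "NNW"]


def sector_of(direction_deg):
    """Map a wind direction in degrees (0 = north, clockwise) to a sector."""
    width = 360 / len(SECTORS)
    # Shift by half a sector so that, e.g., 350 degrees still counts as north.
    position = int(((direction_deg % 360) + width / 2) // width)
    return SECTORS[position % len(SECTORS)]


def wind_rose_table(directions_deg):
    """Return the fraction of observations falling in each non-empty sector."""
    counts = Counter(sector_of(d) for d in directions_deg)
    total = len(directions_deg)
    return {s: counts[s] / total for s in SECTORS if counts[s]}


# Hypothetical daily mean wind directions (degrees), for illustration only.
sample_directions = [0, 350, 90, 90]
print(wind_rose_table(sample_directions))
```

The resulting fractions are what a plotting library would draw as the bar lengths of the rose.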
As the final step of the descriptive analysis, a cluster analysis is performed to study the behavior of NO2 concentrations at the monitoring stations. The hourly average values of NO2 concentrations at each station during the study period are used to identify common patterns among the stations [43]. For this analysis, the K-means algorithm is selected, which requires specifying the number of clusters in advance [44]. To determine the optimal number of clusters, the elbow method is employed [45]. This method involves plotting the number of clusters against inertia (the sum of squared distances within each cluster), and the point where a significant change in the inertia decrease occurs indicates the optimal number of clusters. In addition, agglomerative hierarchical clustering is applied, a method that does not require specifying the number of clusters initially [46]. This algorithm treats each point as an individual cluster and successively merges the closest clusters, creating a hierarchical structure. The result is represented by a dendrogram, where the branches indicate the clusters, and the height of each union reflects the Euclidean distance [47]. In our study, the Euclidean distance is used as the metric that captures the separation between points in an n-dimensional space, providing a clear representation of how NO2 concentrations vary according to the location of the monitoring stations.
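The clustering steps described above can be sketched with Scikit-learn as follows. This is a minimal example on synthetic hourly profiles, not the study's data: the station baselines, the range of k, the Ward linkage, and the cut at three clusters are all assumptions made for illustration.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering, KMeans

# Synthetic stand-in for the real data: 8 stations x 24 hourly mean NO2 values.
rng = np.random.default_rng(0)
hours = np.arange(24)
daily_shape = (10 * np.exp(-((hours - 9) ** 2) / 8)
               + 6 * np.exp(-((hours - 20) ** 2) / 10))
base_levels = np.array([15, 18, 20, 35, 38, 40, 55, 60])  # hypothetical baselines
profiles = (base_levels[:, None] + daily_shape[None, :]
            + rng.normal(0, 1, size=(8, 24)))

# Elbow method: inertia (within-cluster sum of squared distances) for k = 1..6.
inertias = []
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0).fit(profiles)
    inertias.append(model.inertia_)

# Agglomerative clustering on Euclidean distances (Ward linkage is an assumption),
# with the hierarchy cut at three clusters for this example.
hier_labels = AgglomerativeClustering(n_clusters=3, linkage="ward").fit_predict(profiles)

for k, inertia in enumerate(inertias, start=1):
    print(f"k={k}: inertia={inertia:.1f}")
print("hierarchical labels:", hier_labels)
```

Plotting `inertias` against k gives the elbow curve; the dendrogram itself would require the full merge tree, which this sketch does not draw.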

4.2. Behavior Prediction of NO2

Afterward, the NO2 concentration predictions for each station are analyzed using different methodologies. The objective is to determine which methodology best approximates the data for each particular station. Drawing on several studies [12,48,49], we decide to utilize the following methods:
  • K-Nearest Neighbors (KNN): This algorithm is based on the idea that data points with similar characteristics tend to have similar output values. It works by finding the K closest points in the training dataset and predicting the output value from them: the majority class in classification or, for a continuous target such as a concentration, the average of the neighbors' values [50].
  • Decision Tree: This model uses a decision tree to make predictions. Each internal node represents a feature, each branch represents a decision rule, and each leaf represents the prediction result [51].
  • Support Vector Regression (SVR): SVR is a regression technique based on support vectors that seeks to find an optimal regression function within a feature space. It uses a supervised learning approach to predict output values [52].
  • Random Forest: Random Forest is an ensemble of decision trees, where each tree contributes its own prediction. The final prediction is obtained by majority vote in classification or by averaging the tree outputs in regression. By combining multiple trees, the risk of overfitting is reduced, and the prediction accuracy is improved [53].
  • Artificial Neural Network (ANN): ANN is a model inspired by the structure and functioning of the human brain. It consists of a network of interconnected artificial neurons used for making predictions. The model learns from training data by adjusting the synaptic weights of neurons [54].
All the prediction models proposed in this study were validated with the Holdout method, where 70% of the data was used for training and the remaining 30% for testing. This division made it possible to evaluate the predictive capacity of the models [55]. By applying these methodologies to predict NO2 concentrations at various stations, we aim to identify the best-suited approach for each station based on its specific characteristics and patterns. This analysis will contribute to a better understanding of the performance of different prediction techniques in capturing the complexities and variations of NO2 concentrations across different locations. To compare the effectiveness of each method at the stations, we use evaluation metrics such as Mean Absolute Error (MAE), Root Mean Squared Error (RMSE), and the Coefficient of Determination (R2). These metrics help determine which models offer the best approximation to the observed data at each station. For more details about these statistics, readers are referred to Hrust et al. [56].
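To make the validation scheme concrete, the following sketch applies a 70/30 holdout split and computes MAE, RMSE, and R2 for a simple KNN regressor written in plain Python. The data are synthetic, the two features (temperature and wind speed), the linear relation used to generate them, and k = 3 are assumptions for illustration, and the study itself relies on Scikit-learn rather than this hand-written regressor.

```python
import math
import random


def knn_predict(train_x, train_y, query, k=3):
    """Predict by averaging the targets of the k nearest training points."""
    order = sorted(range(len(train_x)), key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred):
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Synthetic, hypothetical data: (temperature in deg C, wind speed in m/s) -> NO2.
random.seed(42)
features = [(random.uniform(5, 30), random.uniform(0, 8)) for _ in range(200)]
targets = [60 - 0.5 * t - 4 * w + random.gauss(0, 2) for t, w in features]

# Holdout validation: shuffle once, then 70% for training and 30% for testing.
indices = list(range(len(features)))
random.shuffle(indices)
cut = (len(indices) * 70) // 100
train_idx, test_idx = indices[:cut], indices[cut:]
train_x = [features[i] for i in train_idx]
train_y = [targets[i] for i in train_idx]
test_x = [features[i] for i in test_idx]
test_y = [targets[i] for i in test_idx]

predictions = [knn_predict(train_x, train_y, x, k=3) for x in test_x]
print(f"MAE={mae(test_y, predictions):.2f}  "
      f"RMSE={rmse(test_y, predictions):.2f}  "
      f"R2={r2(test_y, predictions):.2f}")
```

The same three metric functions can be reused to compare any of the listed models on the held-out 30%.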

4.3. Optimal Location of Stations

Lastly, a new placement of stations at strategic locations is proposed to improve the NO2 pollution index in Barcelona and obtain a more representative average of NO2 concentrations across the city. This is supported by the analysis, since it has been determined that the current eight monitoring stations are insufficient, as they do not cover a large or representative area of the city. To determine the potential new locations, we rely on the report from the European Commission’s Mobility Observatory, using the Eltis method [57], which recommends selecting 40 sensitive locations in the city distributed as follows: (i) 5 locations near highways, (ii) 5 locations near ring roads, (iii) 10 locations near access roads to the city center, (iv) 10 locations near sensitive facilities (schools, hospitals, residences, etc.), (v) 5 locations in low-income neighborhoods, and (vi) 5 locations in recreational areas (sports facilities, parks, museums, etc.). The points chosen for this study are shown in Appendix A, Table A1. These strategically selected points cover the entire metropolitan area and are added to the existing stations to form an expanded set of air quality monitoring stations (Figure 2). This approach allows for a more detailed understanding of NO2 concentrations in the city, facilitating the implementation of effective measures to improve air quality in critical areas and throughout Barcelona. These results will support future research and environmental management actions, contributing to more effective policies to reduce NO2 pollution and its adverse impacts.
To optimize the placement of air quality monitoring stations in Barcelona, we consider the effective radius of each station, which varies according to traffic levels in the area [58]. We choose an average radius of 500 m due to the city’s high traffic intensity. This approach also aims to maximize city area coverage, minimizing the overlap between stations to avoid redundant measurements and ensure no areas are left uncovered. However, the budget constrains the number of stations that can be installed, necessitating a balance between maximizing coverage and managing the costs of implementation and maintenance of the stations. This problem can be modeled as a Capacitated Dispersion Problem (CDP) [59], which aims to maximize the minimum distance between elements. Various approaches have been proposed in the literature to solve the CDP, but it is common to employ heuristics and metaheuristics to solve large-scale instances in short computing times [60]. This study uses an adapted version of the approach proposed by [61], which has been proven to generate high-quality solutions within short computational times. Algorithm 1 outlines the main steps of our algorithm.
Algorithm 1 Biased-randomized algorithm.
function BR-CDP(I, β, t_max, it_max)
    S ← DestructiveHeuristic(I)
    S ← LocalSearch(S, it_max)
    while time ≤ t_max do
        S_new ← DestructiveHeuristic(I, β)
        S_new ← LocalSearch(S_new, it_max)
        if f(S_new) > f(S) then
            S ← S_new
        end if
    end while
    return S
end function
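The objective f(S) compared in Algorithm 1 follows the dispersion idea of maximizing the minimum distance between open stations. A minimal sketch of that quantity, together with a check of which station pairs overlap under the 500 m effective radius, is given below. The haversine formula, the function names, and the three coordinates are assumptions chosen for illustration; they are not the stations or the distance computation of the study.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def min_pairwise_distance(points):
    """Dispersion objective: the smallest distance between any two stations."""
    return min(
        haversine_m(*p, *q)
        for i, p in enumerate(points)
        for q in points[i + 1:]
    )


def overlapping_pairs(points, radius_m=500):
    """Index pairs whose coverage circles intersect (distance below 2 * radius)."""
    return [
        (i, j)
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if haversine_m(*points[i], *points[j]) < 2 * radius_m
    ]


# Hypothetical coordinates inside Barcelona, for illustration only.
candidate_points = [(41.3851, 2.1734), (41.4036, 2.1744), (41.3870, 2.1700)]
print(f"f(S) = {min_pairwise_distance(candidate_points):.0f} m")
print("overlapping pairs:", overlapping_pairs(candidate_points))
```

A solution with a larger minimum distance has fewer overlapping coverage circles, which is why the dispersion objective also serves the goal of reducing redundant measurements.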
The algorithm receives as input an instance comprising the stations and the distances between stations, denoted by I, as well as the maximum execution time t_max and the maximum number of iterations without improvement it_max. The algorithm generates a feasible initial solution S by applying a destructive heuristic followed by a local search operator. The destructive heuristic and local search procedures are presented next. At this point, this initial solution becomes the best-found solution so far. Next, the algorithm performs a multistart procedure to generate new solutions until a maximum execution time is reached. In each iteration, the algorithm generates a new solution S_new using a biased-randomized version of the destructive heuristic combined with the local search operator. The biased-randomized heuristic introduces a slight modification in the greedy behavior of the heuristic, which provides a certain degree of randomness while maintaining its underlying logic. The biased-randomized version considers each element in the edges list with a probability that follows a geometric distribution with a single parameter β ∈ (0, 1), which controls the relative level of greediness present in the randomized behavior of the algorithm [62]. In this way, multiple alternative solutions can be generated without losing the logic behind the original heuristic. Next, the algorithm compares the newly generated solution to the best-known solution. Since the objective is to be maximized, the best-known solution is updated if the new solution has a higher objective function value. Once the stopping criterion is met, the algorithm returns the best-found solution.
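One common way to draw a position from a sorted candidate list with a geometric bias is sketched below. The text only states that selection probabilities follow a geometric distribution with parameter β, so the inverse-transform formula and the modulo wrap-around used here are assumptions, and `biased_index` is a hypothetical helper name.

```python
import math
import random


def biased_index(n, beta, rng):
    """Return a position in [0, n) that favours the first (best-ranked) items.

    Larger beta values make the choice greedier; smaller values make it
    closer to uniform. The modulo keeps the draw inside the list.
    """
    u = 1.0 - rng.random()  # uniform in (0, 1], which avoids log(0)
    return int(math.log(u) / math.log(1.0 - beta)) % n


# Example: 10,000 draws over a list of 10 sorted candidates with beta = 0.3.
rng = random.Random(1)
draws = [biased_index(10, 0.3, rng) for _ in range(10_000)]
for position in range(3):
    print(f"position {position}: chosen {draws.count(position)} times")
```

Because the first positions are chosen most often but not always, repeated runs of the heuristic yield different solutions that still respect the greedy ordering.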
Algorithm 2 shows the destructive heuristic procedure. Initially, the heuristic assumes all stations are opened. Then, the edges connecting the stations are sorted in ascending order according to the distance between the stations they connect. Next, an iterative process begins, where certain stations are removed from the solution. At each iteration, an edge is selected from the list of edges in a greedy or biased-randomized manner. The station to be removed is chosen randomly from the two stations connected by the selected edge, and the edges connected to the deleted station are also removed from the list of edges. This procedure is repeated until the percentage of open stations falls below the required threshold. Then, the last station that was removed is reintroduced in the solution to preserve its feasibility, and the initial solution is returned by the procedure.
The local search procedure is depicted in Algorithm 3. This procedure involves removing the oldest station from the solution and reconstructing the solution with a station not currently included. It is important to note that a removed station will not be considered for generating a new solution until all older stations (i.e., those added earlier) have been eliminated from the solution. This approach facilitates efficient space exploration while avoiding redundancy in the search process. The procedure continues until a maximum number of iterations without improvement is reached.
Algorithm 2 Destructive heuristic procedure.
function destructiveHeuristic(I, β)
    S ← V
    edges ← getEdges(I)
    edges ← sort(edges)
    while isFeasible(S) do
        e* ← selectEdge(edges, β)
        i* ← selectNode(e*)
        S ← drop(S, i*)
    end while
    S ← add(S, i*)
    return S
end function
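A compact Python rendering of the destructive heuristic may help to follow Algorithm 2. It is a sketch under stated assumptions: stations are given as planar coordinates, feasibility is expressed as a minimum number of open stations (`min_open`, a hypothetical parameter standing in for the required percentage), and the biased selection uses the geometric draw with modulo wrap-around, which is one possible implementation.

```python
import itertools
import math
import random


def destructive_heuristic(coords, min_open, beta=None, rng=None):
    """Start with all stations open and close stations on the shortest edges.

    beta=None gives the greedy version; a value in (0, 1) gives the
    biased-randomized version.
    """
    rng = rng or random.Random(0)
    open_nodes = set(range(len(coords)))
    # Edges sorted by ascending distance between their two endpoints.
    edges = sorted(
        itertools.combinations(range(len(coords)), 2),
        key=lambda e: math.dist(coords[e[0]], coords[e[1]]),
    )
    last_removed = None
    while len(open_nodes) >= min_open and edges:
        if beta is None:
            position = 0  # greedy: always take the shortest remaining edge
        else:
            u = 1.0 - rng.random()
            position = int(math.log(u) / math.log(1.0 - beta)) % len(edges)
        last_removed = rng.choice(edges[position])  # close one endpoint at random
        open_nodes.discard(last_removed)
        edges = [e for e in edges if last_removed not in e]
    # The last removal broke feasibility, so that station is reopened.
    if last_removed is not None and len(open_nodes) < min_open:
        open_nodes.add(last_removed)
    return open_nodes


# Hypothetical planar coordinates (arbitrary units), for illustration only.
coords = [(0, 0), (1, 0), (10, 0), (10, 1), (20, 0)]
stations = destructive_heuristic(coords, min_open=3)
print("open stations:", sorted(stations))
```

In this example the two closest pairs each lose one station, so the returned set keeps one station from each pair plus the isolated one.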
Algorithm 3 Local search procedure.
function localSearch(S, it_max)
    S′ ← S
    noImprov ← 0
    while noImprov < it_max do
        noImprov ← noImprov + 1
        u ← oldestSelectedNode(S)
        S ← drop(S, u)
        u* ← selectBestNode(V \ S)
        S ← add(S, u*)
        if f(S) > f(S′) then
            S′ ← S
            noImprov ← 0
        end if
    end while
    return S′
end function
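The local search can be sketched in the same spirit. Several details below are assumptions rather than the authors' implementation: stations are planar coordinates, the objective is the minimum pairwise distance, the "best" incoming station is taken to be the one farthest from the stations kept in the solution, and the station just removed is simply excluded from the candidates of that iteration.

```python
import itertools
import math


def min_distance(solution, coords):
    """Objective f: smallest distance between any two stations in the solution."""
    return min(
        math.dist(coords[a], coords[b])
        for a, b in itertools.combinations(solution, 2)
    )


def local_search(solution, coords, it_max):
    """Swap out the oldest station until it_max iterations bring no improvement."""
    current = list(solution)  # list order encodes age, oldest first
    best = list(current)
    no_improv = 0
    while no_improv < it_max:
        no_improv += 1
        oldest = current.pop(0)
        candidates = [
            v for v in range(len(coords))
            if v not in current and v != oldest
        ]
        if not candidates:  # nothing to swap in, so restore and stop
            current.insert(0, oldest)
            break
        # Assumed selection rule: the candidate farthest from the kept stations.
        newcomer = max(
            candidates,
            key=lambda v: min(math.dist(coords[v], coords[s]) for s in current),
        )
        current.append(newcomer)
        if min_distance(current, coords) > min_distance(best, coords):
            best = list(current)
            no_improv = 0
    return best


# Hypothetical planar coordinates (arbitrary units), for illustration only.
line_coords = [(0, 0), (1, 0), (10, 0), (20, 0)]
improved = local_search([0, 1], line_coords, it_max=5)
print("improved solution:", sorted(improved))
```

Starting from the two closest stations, the search ends with the two most distant ones, which is the best dispersion available in this small example.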
Given the budget constraints on the number of stations that can be installed, we also analyze the impact of the percentage of open stations, which is controlled by a parameter m. When m is set to 0.1, it indicates that only 10% of the stations are open; conversely, setting m to 0.9 means that 90% of the air quality stations are operational. Our objective is to identify the optimal combination of area coverage percentage and overlap among these stations, resulting in the most efficient air quality monitoring network. To achieve this, we utilize a Pareto frontier to evaluate the effects of the parameter m on both the percentage of area coverage and the percentage of overlap between stations. By generating the Pareto frontier, we can discern the trade-offs between the percentage of covered area and the percentage of overlap, enabling us to identify the best configurations that balance maximizing coverage with minimizing redundancy.
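The Pareto filtering step can be illustrated with a few lines of Python. The triples of (m, coverage %, overlap %) below are invented for the example and do not come from the study; a configuration is kept when no other one offers at least as much coverage with at most as much overlap while being strictly better in one of the two.

```python
def pareto_front(configs):
    """Keep configurations not dominated in (higher coverage, lower overlap)."""
    front = []
    for m, coverage, overlap in configs:
        dominated = any(
            (c >= coverage and o <= overlap) and (c > coverage or o < overlap)
            for _, c, o in configs
        )
        if not dominated:
            front.append((m, coverage, overlap))
    return front


# Hypothetical (m, coverage %, overlap %) values, for illustration only.
configs = [
    (0.1, 10.0, 0.0),
    (0.3, 28.0, 2.0),
    (0.5, 40.0, 6.0),
    (0.6, 39.0, 9.0),
    (0.9, 55.0, 20.0),
]
for m, coverage, overlap in pareto_front(configs):
    print(f"m={m}: coverage={coverage}%, overlap={overlap}%")
```

Here the configuration with m = 0.6 is discarded because m = 0.5 covers more area with less overlap; the remaining points form the frontier along which a decision-maker trades coverage against redundancy.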

5. Computational Results

In this section, we present the computational results derived from the methodologies outlined in the previous section. The analysis encompasses the evaluation of NO2 concentration predictions and the optimization of station locations.

5.1. Behavior of Measured NO2 Concentrations

The obtained results highlight the importance of temporal and spatial variability in air quality analysis, demonstrating how local conditions and weekly traffic patterns can have a significant impact on NO2 concentrations.
Figure 3 shows the average value per year for each time slot at each station. Although most stations exhibit a similar pattern, with peak concentrations observed around 9:00 a.m. and another local peak in the late hours, the concentration ranges differ across stations. Similarly, Figure 4 shows the average value per year for each time slot, considering the day of the week. The day of the week is taken into account because traffic patterns depend on it, which directly affects NO2 concentrations [63]. In this case, the values from Monday to Friday are similar, while weekends show a different behavior, with significantly lower concentrations.
Except for station 58, which is located in a residential area on the city’s outskirts, the annual mean values in the remaining stations are much higher than the 10 µg/m3 threshold proposed by the WHO. Moreover, outliers with values significantly exceeding the 24 h limit of 25 µg/m3 are present, especially in stations 57 and 54. The behavior is consistent across different years, although higher annual mean concentration values are obtained for the year 2022 in all stations (Figure 5).
Considering the adaptation periods proposed by the WHO to reduce NO2 concentration levels (40 µg/m3 for adaptation level 1, 30 µg/m3 for adaptation level 2, and 20 µg/m3 for level 3), it can be concluded that the city of Barcelona currently falls within adaptation level 2 regarding the NO2 concentration limits established by the WHO. The WHO recommends a daily average NO2 concentration of 25 µg/m3 (level 3 in the air quality guidelines), which should not be exceeded on more than 3 to 4 days per year. Figure 6 illustrates the daily average values during the study period alongside the recommended daily average limit (dashed red line).
As observed, the number of days per year surpassing the threshold exceeds the limit of 3 to 4 days. Taking a more flexible approach and considering the adaptation period to reduce NO2 concentrations, the maximum daily concentration could be taken as 120 µg/m3 (adaptation level 1) or 50 µg/m3 (adaptation level 2). If we consider the adaptation periods rather than the strict guidelines, it can be concluded that the city of Barcelona is at adaptation level 2 but still far from achieving compliance with the guidelines.
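The exceedance count discussed above can be sketched as follows. This is a minimal example that assumes daily mean concentrations are available as a mapping from date to value; the function name and the sample values are hypothetical, and only the 25 µg/m3 threshold comes from the text.

```python
from collections import defaultdict
from datetime import date

DAILY_GUIDELINE = 25.0  # µg/m3, daily average limit quoted in the text


def exceedance_days_per_year(daily_means, limit=DAILY_GUIDELINE):
    """Count, for each year, the days whose daily mean is strictly above the limit."""
    counts = defaultdict(int)
    for day, value in daily_means.items():
        # Adding a boolean keeps years with zero exceedances in the result.
        counts[day.year] += value > limit
    return dict(counts)


# Hypothetical daily means, for illustration only.
daily = {
    date(2021, 1, 4): 31.2,
    date(2021, 1, 5): 24.9,
    date(2021, 1, 6): 25.0,
    date(2022, 3, 1): 40.1,
    date(2022, 3, 2): 27.5,
}
result = exceedance_days_per_year(daily)
print(result)
```

The same function can be called with `limit=50.0` or `limit=120.0` to count exceedances against the adaptation levels mentioned above.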
Figure 7 presents a wind rose diagram depicting the average wind direction trends for each station throughout the study period. The differences in altitude among the stations, combined with their proximity to the sea, result in varying wind rose diagrams, despite the stations being relatively close to one another. Station X4, located nearest to the sea, experiences winds from multiple directions. In conjunction with the other variables studied, a similar behavioral pattern is observed across the stations, with stations X4 and X8 exhibiting more comparable behavior than station D5, which is situated further inland.
The dendrogram illustrating the average concentrations at the air quality stations reveals interesting clustering patterns among the stations (Figure 8a). Station 58 emerges as a distinct cluster, with a different behavior from any other station due to its location in the city’s peripheral areas. On the other hand, stations 44, 50, and 43 exhibit similar behavior, indicating their central location in the city’s high-traffic zone with increased concentrations. Likewise, stations 4 and 42 demonstrate identical patterns, while stations 57 and 54 share similar behavior. These two groups of stations are clustered together, forming a distinct cluster defined by their proximity to the city’s peripheral highways. For the analysis of the meteorological stations, wind speed is represented along the x-axis and y-axis (Figure 8b). This representation reveals two distinct groups with different behaviors. Stations X4 and X8 form one cluster in the southern part of the city at an altitude of 47 m within a high-traffic area. The other cluster consists of station D5, located in the northern region at an altitude of 415 m and away from heavy traffic.
In the case study of Barcelona, there are fewer meteorological stations than air quality stations. Therefore, the KNN algorithm is employed to classify and predict meteorological variables at each station. The goal of applying this algorithm is to obtain meteorological variables for all air quality stations. The results are presented in Table 3.
The descriptive study highlights the importance of the location of air quality stations, as NO2 concentrations largely depend on traffic in the area. It is crucial to distinguish whether the station is situated in a city center or a residential neighborhood, as these characteristics influence both the average and maximum concentrations recorded at each station. Additionally, traffic patterns not only affect concentration levels but also the dynamics of variations in NO2 concentrations. Furthermore, it has been shown that weather conditions vary depending on the location within the city, which can also influence the distribution and effectiveness of air quality stations. These factors underscore the importance of correctly locating each station to accurately represent the real pollution situation in the city.

5.2. Prediction of NO2 Concentration per Station

Next, we present the results obtained with the different prediction models for our dataset. Table 4 shows that Random Forest outperforms the other methodologies in prediction accuracy for NO2 concentrations at each station. The lower MAE and RMSE values indicate a closer approximation to the observed data, while the higher R2 suggests a better fit to the data variability. These findings underscore the efficacy of Random Forest for predicting NO2 concentrations in diverse locations, and they point to the specific characteristics that influence the predictive performance of each method.
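For reference, the three evaluation metrics named above can be computed as in the following minimal sketch, which uses the standard definitions of MAE, RMSE, and the coefficient of determination. The function name and the sample values are hypothetical and are not taken from Table 4.

```python
import math


def regression_metrics(observed, predicted):
    """Return (MAE, RMSE, R2) for paired observed and predicted values."""
    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    rmse = math.sqrt(ss_res / n)
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return mae, rmse, r2


# Hypothetical hourly NO2 values in µg/m3, for illustration only.
observed = [20.0, 30.0, 40.0, 50.0]
predicted = [22.0, 28.0, 43.0, 49.0]
mae, rmse, r2 = regression_metrics(observed, predicted)
print(f"MAE={mae:.3f}  RMSE={rmse:.3f}  R2={r2:.3f}")
```

Because RMSE squares the errors before averaging, it penalizes large deviations more strongly than MAE, which is why the two are usually reported together.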
When the Random Forest algorithm is applied to predict NO2 concentration at the different stations, the evaluation metrics take similar values across all stations. However, the question arises as to whether the model’s predictive effectiveness varies with the NO2 concentration level at each station. To address this question, a correlation analysis is conducted between the normalized evaluation metrics and the annual average concentration of NO2 at each station. The results indicate that the MAE is lower in stations with lower NO2 concentrations than in those with higher NO2 concentrations (Figure 9a). This relationship between NO2 concentration and MAE yields a correlation coefficient of 0.82. To ensure robustness, station 50 is excluded from the analysis, as it exhibits poor fit and appears as an outlier (Figure 9b).
The study finds a strong positive correlation between NO2 concentration and prediction error (RMSE), indicating that stations with higher NO2 concentrations also experience higher prediction errors. Additionally, a moderate negative correlation is observed between the NO2 concentration and the coefficient of determination (R2), suggesting that the model is less effective in stations with high NO2 concentrations, where it fails to adequately explain the variability of the observed data. These findings reveal that the performance of the Random Forest model in predicting NO2 concentrations varies according to the concentration level at each station; it is more effective in stations with low NO2 concentrations and less accurate in stations with high concentrations.
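The correlation analysis described above can be illustrated with a minimal sketch of the Pearson correlation coefficient. The choice of Pearson correlation is an assumption, since the text does not name the coefficient used, and the per-station values below are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical per-station values, for illustration only:
# annual mean NO2 (µg/m3) and the MAE of the model at the same station.
annual_mean = [9.0, 21.0, 25.0, 31.0, 37.0]
station_mae = [2.1, 4.0, 4.4, 5.9, 6.3]
r = pearson(annual_mean, station_mae)
print(f"r = {r:.3f}")
```

A value of r close to 1 indicates that the error grows with the concentration level, whereas a negative r for R2 would indicate that the fit degrades as concentrations rise.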

5.3. Optimization of Location of Stations

Lastly, the results of the proposed optimization algorithm for the optimal location of air quality stations are presented. First, the results of the Pareto frontier are shown to obtain the optimal number of stations to maximize the percentage of area coverage and minimize the percentage of overlap between stations. In addition, a scenario with minimal changes is considered, in which a total of 10 stations are placed throughout the city, ensuring that they are as far apart from each other as possible.
Figure 10 illustrates the results of the Pareto frontier, depicting two key relationships concerning the number of open air quality stations. The left sub-plot displays the number of open stations alongside the percentage coverage achieved by each configuration. As expected, increasing the number of open stations generally leads to improved coverage across the air quality monitoring network. The right sub-plot shows the number of open stations against the percentage of overlap between these stations. As the number of open stations increases, so does the likelihood of overlap between their coverage areas. Initially, as the number of stations grows, the overlap percentage remains at 0%. However, beyond a certain threshold, the percentage of overlap escalates rapidly.
Observe that the Pareto frontier demonstrates the trade-off between increasing the number of open stations to enhance coverage and the consequent rise in overlap, which may lead to redundant monitoring and inefficient resource allocation. The optimal configuration is identified as m = 0.7 since, beyond this point, opening more stations at the chosen locations does not increase coverage because of overlaps and the distribution of stations. At this point, the plot displaying coverage indicates that 35 open stations offer the highest coverage without significantly increasing overlap. This suggests that the air quality monitoring network achieves a well-balanced configuration, maximizing coverage while minimizing redundancy.
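To make the two quantities in this trade-off concrete, the sketch below estimates coverage and overlap percentages on a regular grid. It assumes that each station covers a circle of fixed radius over a rectangular study area and that overlap means the share of the area covered by two or more stations. These definitions, the function name, and all coordinates are assumptions made for illustration and may differ from the formulation used in this study.

```python
import math


def coverage_and_overlap(stations, radius, width, height, step=0.25):
    """Estimate (% area covered by >= 1 station, % area covered by >= 2 stations).

    The area is a width x height rectangle sampled at the centers of
    square cells of side `step`; stations are (x, y) points.
    """
    covered = overlapped = total = 0
    for i in range(int(width / step)):
        for j in range(int(height / step)):
            x, y = (i + 0.5) * step, (j + 0.5) * step
            hits = sum(
                1 for sx, sy in stations if math.hypot(x - sx, y - sy) <= radius
            )
            total += 1
            covered += hits >= 1
            overlapped += hits >= 2
    return 100.0 * covered / total, 100.0 * overlapped / total


# Hypothetical layout in kilometres, for illustration only.
far_apart = [(2.5, 2.5), (7.5, 7.5)]
with_neighbor = far_apart + [(3.5, 2.5)]
cov_a, ov_a = coverage_and_overlap(far_apart, radius=2.0, width=10.0, height=10.0)
cov_b, ov_b = coverage_and_overlap(with_neighbor, radius=2.0, width=10.0, height=10.0)
print(f"two stations:   coverage={cov_a:.1f}%  overlap={ov_a:.1f}%")
print(f"three stations: coverage={cov_b:.1f}%  overlap={ov_b:.1f}%")
```

Placing the third station close to an existing one raises coverage only slightly while introducing overlap, which is the effect that makes additional stations less useful beyond a certain number.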
Figure 11 presents the optimal scenario of open air quality stations obtained through the Pareto frontier analysis. It highlights the selected stations that maximize coverage while minimizing overlap. This optimal set includes both the initial air quality stations (denoted by red circles) and newly opened potential stations (blue circles). However, budget constraints limit the number of stations that can be installed, requiring a balance between maximizing coverage and managing the costs associated with implementation and maintenance. Thus, the goal is to demonstrate the optimal placement of the current stations, with the addition of two new stations for improved distribution, without incurring significant investment. This minimum scenario is shown in Figure 12. The stations that remain in the exact location as the initial arrangement are marked with red circles, while the new stations, selected at points of interest, are marked with blue circles.
To compare the NO2 concentration KPI calculated in both scenarios with the current situation, we use the annual mean NO2 concentration measured across all stations as the KPI. Each new station is assigned an average concentration value using the KNN algorithm applied earlier. In this case, the average value for each new station is determined by the proportional mean distance to the three nearest current stations. The values for the nearest current stations for the studied period are shown in Appendix A, Table A2. After applying the algorithm, the values obtained for each new station are included in Appendix A, Table A3.
Figure 13 shows heat maps created to facilitate visual comparison, considering the annual mean concentration value for each station for 2023. The comparison of these maps reveals that the optimal scenario covers a larger portion of the city and provides a more representative depiction of the current situation. Although the algorithm’s assigned values do not account for traffic conditions or other urban characteristics, they are considered a reasonable approximation for comparing the scenarios.
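One way to sketch this assignment is an inverse-distance-weighted mean over the three nearest current stations. The inverse-distance weighting and the haversine distance are assumptions, since the text does not specify the exact weighting scheme, and the function names are hypothetical. The station coordinates and 2020 values are copied from Table A2, and the query point is the location listed for new station 1 in Table A3.

```python
import math

# (ID, longitude, latitude, annual mean NO2 in µg/m3 for 2020), from Table A2.
CURRENT_STATIONS = [
    (50, 2.1874, 41.3864, 22.9552),
    (43, 2.1538, 41.3853, 34.1354),
    (44, 2.1534, 41.3987, 31.2408),
    (57, 2.1151, 41.3875, 17.5895),
    (4, 2.2045, 41.4039, 27.9477),
    (42, 2.1331, 41.3788, 23.1318),
    (54, 2.1480, 41.4261, 21.3454),
    (58, 2.1239, 41.4184, 8.3513),
]


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def knn_idw(lon, lat, stations=CURRENT_STATIONS, k=3):
    """Inverse-distance-weighted mean of the k nearest stations (assumed weighting)."""
    ranked = sorted(
        (haversine_km(lon, lat, s_lon, s_lat), value)
        for _, s_lon, s_lat, value in stations
    )
    nearest = ranked[:k]
    if nearest[0][0] == 0.0:
        # The query point coincides with a station: return its value directly.
        return nearest[0][1]
    weights = [1.0 / distance for distance, _ in nearest]
    return sum(w * value for w, (_, value) in zip(weights, nearest)) / sum(weights)


# New station 1 in Table A3: latitude 41.3618, longitude 2.1381.
estimate = knn_idw(lon=2.1381, lat=41.3618)
print(f"estimated 2020 annual mean: {estimate:.4f} µg/m3")
```

Because the estimate is a weighted mean, it always lies between the lowest and highest values of the three neighbors, so it cannot reproduce local peaks caused by traffic at the new location.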
Using the values assigned to the new stations, the NO2 concentration KPI for the study period is calculated for the different scenarios. The values obtained for each case are shown in Table 5. Throughout the study period, both the minimum and optimal scenarios yield higher KPI values. For the minimum scenario, the KPI increases by 4% to 6%, while for the optimal scenario, the KPI increases by 6% to 9% compared to the current situation. This confirms that the current scenario may be underestimating this indicator for the city and may not capture all the necessary information to accurately represent reality.

6. Conclusions

This study evaluates the suitability of using the daily average concentration of NO2 as a KPI to assess air quality in a smart city like Barcelona. This evaluation builds on the availability of high-quality initial data that accurately reflect the actual concentration levels across the city as measured by monitoring stations. After analyzing the behavior of NO2 using the available data, important relationships are evident between the concentration measured at each station and its location. This is not only linked to traffic in the area but also to the city’s meteorological conditions.
Moreover, the study highlights the importance of station placement, as an inadequate distribution could result in a distorted KPI: overestimated if the stations are concentrated in high-traffic areas, or underestimated if they are mainly located in residential zones. For this reason, strategic points are identified in the city where measuring the NO2 concentration would provide significant added value. Given that the optimal solution for the distribution of air quality monitoring stations could be very costly, a more conservative alternative that minimizes investment is also proposed. This solution only requires the installation of two additional stations and the relocation of some existing ones to achieve more representative results. Furthermore, it is observed that when the stations are better distributed, the NO2 KPI value exceeds the thresholds set by the WHO. This suggests that Barcelona needs continuous and precise monitoring of NO2 levels to quantify the effects of the policies implemented in the city, enabling informed decision-making that improves air quality.
As a future line of research, it would be highly relevant to correlate real-time traffic data in the city with NO2 concentration data and consider the population density in each zone. This integration would allow for a better understanding of the relationship between vehicular flow, atmospheric pollutant concentrations, and the population exposed to this pollution, providing a more comprehensive perspective on the impact of traffic on air quality in Barcelona.

Author Contributions

Conceptualization, R.S.-G. and P.C.; methodology, P.C.; software, R.S.-G. and X.A.M.; validation, P.C., R.S.-G. and X.A.M.; formal analysis, R.S.-G.; investigation, P.C., R.S.-G. and X.A.M.; resources, R.S.-G. and P.C.; data curation, R.S.-G.; writing—original draft preparation, R.S.-G. and X.A.M.; writing—review and editing, P.C. and E.P.-B.; visualization, P.C., R.S.-G. and X.A.M.; supervision, P.C.; project administration, E.P.-B.; funding acquisition, E.P.-B. All authors have read and agreed to the published version of the manuscript.

Funding

This work has been partially funded by the Spanish Ministry of Science and Innovation (PID2022-138860NB-I00 and RED2022-134703-T), the project SUN (HORIZON-CL4-2022-HUMAN-01-14-101092612), the project UP2030 (HORIZON-MISS-2021-CIT-02-01-101096405) as well as by the Barcelona City Council and Fundació “la Caixa” under the framework of the Barcelona Science Plan 2020–2023 (grant 21S09355-001). Funding for open access charge: CRUE-Universitat Politècnica de València.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available in Open Data Barcelona at https://opendata-ajuntament.barcelona.cat/ (accessed on 10 October 2024).

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Tables

Appendix A.1. Location of the 40 Potential New Air Quality Stations

Table A1. Location of the 40 potential new air quality stations.

| ID | Item | Address |
|---|---|---|
| 1 | Location near highway 1 | Carrer número 62 |
| 2 | Location near highway 2 | Carrer de portbou |
| 3 | Location near highway 3 | Carrer de la mare de deu del port 1 |
| 55 | Location near highway 4 | Gran via Cortes Catalans |
| 5 | Location near highway 5 | Passeig de García Gracia 15 |
| 6 | Location ring road 1 | Carrer de Can Móra (Ronda de Dalt) |
| 7 | Location ring road 2 | Carrer de Scala Dei (Ronda de Dalt) |
| 8 | Location ring road 3 | Carrer de Cuzco (Ronda del litoral) |
| 9 | Location ring road 4 | Carrer de Vicenc Montal (Ronda del litoral) |
| 10 | Location ring road 5 | Carrer de Aristides Maillol (Ronda del litoral) |
| 11 | Location city access road 1 | Carrer de Sancho de Ávila 1 (Av. Meridiana) |
| 12 | Location city access road 2 | Carrer de Padilla 167 (Av. Cortes Catalanas) |
| 13 | Location city access road 3 | Carrer de Valencia (Cruce Calle de Padilla) |
| 14 | Location city access road 4 | Carrer de Bac de Roda (Gran via de las Corts Catalanas) |
| 15 | Location city access road 5 | Carrer de Rogent (Av. Meridiana) |
| 16 | Location city access road 6 | Carrer de Badajoz (Av. Diagonal) |
| 17 | Location city access road 7 | Carrer de Bailen (Av. Diagonal) |
| 18 | Location city access road 8 | Carrer de Balmes (Gran Via Corts Catalanes) |
| 19 | Location city access road 9 | Passeo de Fabra i Puig (Av. Meridiana) |
| 20 | Location city access road 10 | Travesia de las Cortes |
| 21 | Sensitive location 1 | Hospital del mar (Passeig Maritim 25) |
| 22 | Sensitive location 2 | Hospital de Nens (Carrer Consell de Cent, 437) |
| 23 | Sensitive location 3 | Hospital dos de Maig (carrer dos de Maig, 301) |
| 24 | Sensitive location 4 | Hospital Quironsalud Barcelona (Plaza Alfonso Comins 5) |
| 25 | Sensitive location 5 | Hospital Universitari Dexeus (Calle de Sabino Arana 19) |
| 26 | Sensitive location 6 | Universidad de Barcelona (Av. Del Doctor Marañón) |
| 27 | Sensitive location 7 | Universidad Abierta de Cataluña (Av. Del Tibidabo 31) |
| 28 | Sensitive location 8 | Centro de dia La Torre Setze (Calle la Torre 16) |
| 29 | Sensitive location 9 | Residencia de mayores Felizvita (Av. Josep Tarradellas 38) |
| 30 | Sensitive location 10 | Centro residencial Bonaire (Carrer Alt de Pedrell 100) |
| 31 | Low-income neighborhood location 1 | Carrer del Vesuvi/Nou Barris |
| 32 | Low-income neighborhood location 2 | Passeig d’Úrrutia/Nou Barris |
| 33 | Low-income neighborhood location 3 | Calle Huelva/Barrio San Martí |
| 34 | Low-income neighborhood location 4 | Cami Noy de la Rambla/Barrio del Raval |
| 35 | Low-income neighborhood location 5 | Carrer de la Pedrosa/Barrio Trinitat Nova |
| 36 | Recreation area location 1 | Complejo deportivo Municipal Mar Bella (Av. del litoral) |
| 37 | Recreation area location 2 | Camp Nou. Calle de Aristides Maillol 12 |
| 38 | Recreation area location 3 | Parque de Montjuic. C/Lleida 35 |
| 39 | Recreation area location 4 | Parque de la Ciutadella. Paseo de Pujades 22 |
| 40 | Recreation area location 5 | Parque Güell. Carrer de Olot 5 |

Appendix A.2. Current Stations: Location and Annual Mean NO2 Concentration in (µg/m3)

Table A2. Current stations: location and annual mean NO2 concentration in (µg/m3).

| ID | Longitude | Latitude | [NO2] 2020 | [NO2] 2021 | [NO2] 2022 |
|---|---|---|---|---|---|
| 50 | 2.1874 | 41.3864 | 22.9552 | 24.5184 | 27.4114 |
| 43 | 2.1538 | 41.3853 | 34.1354 | 36.8185 | 25.7095 |
| 44 | 2.1534 | 41.3987 | 31.2408 | 30.8547 | 41.4748 |
| 57 | 2.1151 | 41.3875 | 17.5895 | 18.2762 | 34.1493 |
| 4 | 2.2045 | 41.4039 | 27.9477 | 25.4330 | 31.0970 |
| 42 | 2.1331 | 41.3788 | 23.1318 | 21.6456 | 21.7994 |
| 54 | 2.1480 | 41.4261 | 21.3454 | 20.2746 | 20.8797 |
| 58 | 2.1239 | 41.4184 | 8.3513 | 8.7778 | 9.0200 |

Appendix A.3. New Scenario Stations: Location and Annual Mean NO2 Concentration in (µg/m3)

Table A3. New scenario stations: location and annual mean NO2 concentration in (µg/m3).

| ID | Latitude | Longitude | [NO2] 2020 | [NO2] 2021 | [NO2] 2022 |
|---|---|---|---|---|---|
| 1 | 41.3618 | 2.1381 | 25.0244 | 25.2807 | 25.9081 |
| 2 | 41.3758 | 2.1269 | 23.3542 | 23.0103 | 25.4241 |
| 3 | 41.3536 | 2.1504 | 29.2182 | 29.5320 | 28.2716 |
| 55 | 41.3728 | 2.1463 | 29.1856 | 29.5732 | 27.5267 |
| 5 | 41.3896 | 2.1674 | 30.0775 | 31.4250 | 31.4502 |
| 6 | 41.4014 | 2.1140 | 15.7675 | 15.8991 | 23.2605 |
| 7 | 41.4392 | 2.1564 | 20.5333 | 19.9485 | 22.6254 |
| 8 | 41.4374 | 2.2083 | 24.8566 | 23.8632 | 27.4571 |
| 9 | 41.3829 | 2.1830 | 25.1718 | 26.3397 | 27.6735 |
| 12 | 41.4004 | 2.1838 | 26.3689 | 26.2020 | 31.6768 |
| 15 | 41.4205 | 2.1864 | 24.6427 | 23.7635 | 27.2174 |
| 16 | 41.3954 | 2.1996 | 26.5924 | 25.7325 | 30.9999 |
| 18 | 41.3975 | 2.1520 | 31.1350 | 31.0611 | 38.4371 |
| 19 | 41.4325 | 2.1631 | 21.0723 | 20.4905 | 23.5300 |
| 20 | 41.3976 | 2.1325 | 23.8908 | 23.4705 | 32.1344 |
| 21 | 41.3808 | 2.1722 | 28.8873 | 30.3454 | 30.3644 |
| 22 | 41.3973 | 2.1738 | 28.8754 | 30.1216 | 31.5276 |
| 23 | 41.4106 | 2.1770 | 27.3635 | 26.9645 | 33.3680 |
| 24 | 41.4156 | 2.1385 | 18.9812 | 18.6260 | 21.5304 |
| 25 | 41.3855 | 2.1266 | 22.8310 | 22.8241 | 27.0318 |
| 28 | 41.4044 | 2.1487 | 29.9043 | 30.0493 | 33.9573 |
| 29 | 41.3833 | 2.1427 | 29.1355 | 29.4766 | 27.7437 |
| 30 | 41.4235 | 2.1710 | 26.2113 | 24.9961 | 30.1150 |
| 31 | 41.4465 | 2.1832 | 26.3096 | 24.9517 | 30.0425 |
| 33 | 41.4177 | 2.1967 | 27.2233 | 26.1888 | 32.0277 |
| 35 | 41.4495 | 2.1919 | 24.2373 | 23.3618 | 26.4753 |
| 36 | 41.3985 | 2.2093 | 27.2300 | 25.7422 | 31.2939 |
| 38 | 41.3650 | 2.1669 | 29.6844 | 31.1761 | 30.4976 |
| 40 | 41.4135 | 2.1532 | 27.6891 | 27.6332 | 29.7088 |
| 50 | 41.3853 | 2.1538 | 34.1354 | 36.8185 | 25.7095 |
| 57 | 41.3875 | 2.1151 | 17.5895 | 18.2762 | 34.1493 |
| 4 | 41.4039 | 2.2045 | 27.9477 | 25.4330 | 31.0970 |
| 54 | 41.4261 | 2.1480 | 21.3454 | 20.2746 | 20.8797 |
| 58 | 41.4184 | 2.1239 | 8.3513 | 8.7778 | 9.0200 |

References

  1. Toli, A.M.; Murtagh, N. The concept of sustainability in smart city definitions. Front. Built Environ. 2020, 6, 77. [Google Scholar] [CrossRef]
  2. Sustainable Development Goals. Available online: https://bit.ly/2R8siwl (accessed on 17 July 2024).
  3. International Telecommunication Union. Available online: https://unece.org/fileadmin/DAM/hlm/documents/Publications/U4SSC-CollectionMethodologyforKPIfoSSC-2017.pdf (accessed on 17 July 2024).
  4. Bosch, P.; Jongeneel, S.; Rovers, V.; Neumann, H.M.; Airaksinen, M.; Huovila, A. CITYkeys Indicators for Smart City Projects and smart Cities; CITYkeys Report 10. 2017. Available online: https://cordis.europa.eu/project/id/646440/reporting (accessed on 7 November 2024).
  5. Nowicka, K. Cloud computing in sustainable mobility. Transp. Res. Procedia 2016, 14, 4070–4079. [Google Scholar] [CrossRef]
  6. Haddad, C. Choosing suitable indicators for the assessment of urban air mobility: A case of upper Bavaria, Germany. Eur. J. Transp. Infrastruct. Res. 2020, 20, 214–232. [Google Scholar] [CrossRef]
  7. Smart Cities: Key Technologies, Environmental Impact and Market Forecast 2022–2026. Available online: https://www.juniperresearch.com/researchstore/sustainability-technology-iot/smart-cities-research-report (accessed on 17 July 2024).
  8. European Parlament: Air Pollution: Deal with Council to Improve Air Quality. Available online: https://www.europarl.europa.eu/news/es/press-room/20240219IPR17816/air-pollution-deal-with-council-to-improve-air-quality (accessed on 17 July 2024).
  9. Soriano-Gonzalez, R.; Perez-Bernabeu, E.; Ahsini, Y.; Carracedo, P.; Camacho, A.; Juan, A.A. Analyzing key performance indicators for mobility logistics in smart and sustainable cities: A case study centered on Barcelona. Logistics 2023, 7, 75. [Google Scholar] [CrossRef]
  10. Catalan Institute of Statistics. Available online: https://www.idescat.cat/emex/?id=080193&lang=es (accessed on 17 July 2024).
  11. Almalki, F.A.; Alsamhi, S.H.; Sahal, R.; Hassan, J.; Hawbani, A.; Rajput, N.; Saif, A.; Morgan, J.; Breslin, J. Green IoT for eco-friendly and sustainable smart cities: Future directions and opportunities. Mob. Netw. Appl. 2021, 28, 178–202. [Google Scholar] [CrossRef]
  12. Subramaniam, S.; Raju, N.; Ganesan, A.; Rajavel, N.; Chenniappan, M.; Prakash, C.; Pramanik, A.; Basak, A.K.; Dixit, S. Artificial Intelligence Technologies for Forecasting Air Pollution and Human Health: A Narrative Review. Sustainability 2022, 14, 9951. [Google Scholar] [CrossRef]
  13. Zhu, F.; Ding, R.; Lei, R.; Cheng, H.; Liu, J.; Shen, C.; Zhang, C.; Xu, Y.; Xiao, C.; Li, X.; et al. The short-term effects of air pollution on respiratory diseases and lung cancer mortality in Hefei: A time-series analysis. Respir. Med. 2019, 146, 57–65. [Google Scholar] [CrossRef]
  14. Gurjar, B.R.; Jain, A.; Sharma, A.; Agarwal, A.; Gupta, P.; Nagpure, A.; Lelieveld, J. Human health risks in megacities due to air pollution. Atmos. Environ. 2010, 44, 4606–4613. [Google Scholar] [CrossRef]
  15. Low-Emission Zones. Available online: https://www.idae.es/movilidad-sostenible/zonas-de-bajas-emisiones (accessed on 17 July 2024).
  16. Ntafalias, A. A comprehensive methodology for assessing the impact of smart city interventions: Evidence from Espoo transformation process. Smart Cities 2022, 5, 90–107. [Google Scholar] [CrossRef]
  17. Lebrusán, I.; Toutouh, J. Using smart city tools to evaluate the effectiveness of a low emissions zone in Spain: Madrid central. Smart Cities 2020, 3, 456–478. [Google Scholar] [CrossRef]
  18. Golpayegani, F.; Guériau, M.; Laharotte, P.A.; Ghanadbashi, S.; Guo, J.; Geraghty, J.; Wang, S. Intelligent Shared Mobility Systems: A Survey on Whole System Design Requirements, Challenges and Future Direction. IEEE Access 2022, 10, 35302–35320. [Google Scholar] [CrossRef]
  19. Angelakoglou, K.; Nikolopoulos, N.; Giourka, P.; Svensson, I.L.; Tsarchopoulos, P.; Tryferidis, A.; Tzovaras, D. A methodological framework for the selection of key performance indicators to assess smart city solutions. Smart Cities 2019, 2, 269–306. [Google Scholar] [CrossRef]
  20. Malik, A.; Tauler, R. Exploring the interaction between O3 and NOx pollution patterns in the atmosphere of Barcelona, Spain using the MCR–ALS method. Sci. Total Environ. 2015, 517, 151–161. [Google Scholar] [CrossRef] [PubMed]
  21. Basagaña, X.; Triguero-Mas, M.; Agis, D.; Pérez, N.; Reche, C.; Alastuey, A.; Querol, X. Effect of public transport strikes on air pollution levels in Barcelona (Spain). Sci. Total Environ. 2018, 610, 1076–1082. [Google Scholar] [CrossRef] [PubMed]
  22. Gignac, F.; Righi, V.; Toran, R.; Errandonea, L.P.; Ortiz, R.; Mijling, B.; Naranjo, A.; Nieuwenhuijsen, M.; Creus, J.; Basagana, X. Short-term NO2 exposure and cognitive and mental health: A panel study based on a citizen science project in Barcelona, Spain. Environ. Int. 2022, 164, 107284. [Google Scholar] [CrossRef]
  23. Pierangeli, I.; Nieuwenhuijsen, M.; Cirach, M.; Rojas-Rueda, D. Health equity and burden of childhood asthma-related to air pollution in Barcelona. Environ. Res. 2020, 186, 109067. [Google Scholar] [CrossRef]
  24. Benavides, J.; Snyder, M.; Guevara, M.; Soret, A.; Pérez García-Pando, C.; Amato, F.; Querol, X.; Jorba, O. CALIOPE-Urban v1.0: Coupling R-LINE with a mesoscale air quality modelling system for urban air quality forecasts over Barcelona city (Spain). Geosci. Model Dev. 2019, 12, 2811–2835. [Google Scholar] [CrossRef]
  25. Rodriguez-Rey, D.; Guevara, M.; Linares, M.P.; Casanovas, J.; Armengol, J.M.; Benavides, J.; Soret, A.; Jorba, O.; Tena, C.; García-Pando, C.P. To what extent the traffic restriction policies applied in Barcelona city can improve its air quality? Sci. Total Environ. 2022, 807, 150743. [Google Scholar] [CrossRef]
  26. Cican, G.; Buturache, A.N.; Mirea, R. Applying Machine Learning Techniques in Air Quality Prediction—A Bucharest City Case Study. Sustainability 2023, 15, 8445. [Google Scholar] [CrossRef]
  27. Wu, C.l.; Song, R.f.; Zhu, X.h.; Peng, Z.r.; Fu, Q.y.; Pan, J. A hybrid deep learning model for regional O3 and NO2 concentrations prediction based on spatiotemporal dependencies in air quality monitoring network. Environ. Pollut. 2023, 320, 121075. [Google Scholar] [CrossRef]
  28. Tao, C.; Jia, M.; Wang, G.; Zhang, Y.; Zhang, Q.; Wang, X.; Wang, Q.; Wang, W. Time-sensitive prediction of NO2 concentration in China using an ensemble machine learning model from multi-source data. J. Environ. Sci. 2024, 137, 30–40. [Google Scholar] [CrossRef] [PubMed]
  29. El Mghouchi, Y.; Udristioiu, M.T.; Yildizhan, H. Multivariable Air-Quality Prediction and Modelling via Hybrid Machine Learning: A Case Study for Craiova, Romania. Sensors 2024, 24, 1532. [Google Scholar] [CrossRef] [PubMed]
  30. World Health Organization. Available online: https://www.who.int/en/news-room/fact-sheets/detail/ambient-(outdoor)-air-quality-and-health (accessed on 17 July 2024).
  31. Harris, C.R.; Millman, K.J.; van der Walt, S.J.; Gommers, R.; Virtanen, P.; Cournapeau, D.; Wieser, E.; Taylor, J.; Berg, S.; Smith, N.J.; et al. Array programming with NumPy. Nature 2020, 585, 357–362. [Google Scholar] [CrossRef] [PubMed]
  32. McKinney, W. Data structures for statistical computing in python. In Proceedings of the 9th Python in Science Conference, Austin, TX, USA, 28–30 June 2010; Volume 445, pp. 51–56. [Google Scholar]
  33. Waskom, M.L. Seaborn: Statistical Data Visualization. 2021. Available online: https://seaborn.pydata.org/ (accessed on 10 October 2024).
  34. Hunter, J.D. Matplotlib: A 2D graphics environment. Comput. Sci. Eng. 2007, 9, 90–95. [Google Scholar] [CrossRef]
  35. contributors, F. Folium: Python Data, Leaflet.js Maps. 2021. Available online: https://github.com/python-visualization/folium (accessed on 10 October 2024).
  36. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  37. Munir, S.; Mayfield, M.; Coca, D. Understanding spatial variability of NO2 in urban areas using spatial modelling and data fusion approaches. Atmosphere 2021, 12, 179. [Google Scholar] [CrossRef]
  38. Fan, C.; Li, Z.; Li, Y.; Dong, J.; van der A, R.; de Leeuw, G. Variability of NO2 concentrations over China and effect on air quality derived from satellite and ground-based observations. Atmos. Chem. Phys. 2021, 21, 7723–7748. [Google Scholar] [CrossRef]
  39. Zhu, Y.; Chen, J.; Bi, X.; Kuhlmann, G.; Chan, K.L.; Dietrich, F.; Brunner, D.; Ye, S.; Wenig, M. Spatial and temporal representativeness of point measurements for nitrogen dioxide pollution levels in cities. Atmos. Chem. Phys. 2020, 20, 13241–13251. [Google Scholar] [CrossRef]
  40. Ortiz, A.F.; Jiménez Núñez, M.d.l.L.; Díaz Godoy, R.V. Study of the behavior of air parcels, using PIXE, Hysplit and wind rose in the metropolitan zone of Toluca Valley, Mexico. J. Energy Res. Rev. 2021, 9, 51–66. [Google Scholar] [CrossRef]
  41. Sun, S.; Tian, L.; Cao, W.; Lai, P.C.; Wong, P.P.Y.; Lee, R.S.y.; Mason, T.G.; Krämer, A.; Wong, C.M. Urban climate modified short-term association of air pollution with pneumonia mortality in Hong Kong. Sci. Total Environ. 2019, 646, 618–624. [Google Scholar] [CrossRef]
  42. Chang, Y.S.; Chiao, H.T.; Abimannan, S.; Huang, Y.P.; Tsai, Y.T.; Lin, K.M. An LSTM-based aggregated model for air pollution forecasting. Atmos. Pollut. Res. 2020, 11, 1451–1463. [Google Scholar] [CrossRef]
  43. Yadav, M.; Singh, N.K.; Sahu, S.P.; Padhiyar, H. Investigations on air quality of a critically polluted industrial city using multivariate statistical methods: Way forward for future sustainability. Chemosphere 2022, 291, 133024.
  44. Govender, P.; Sivakumar, V. Application of k-means and hierarchical clustering techniques for analysis of air pollution: A review (1980–2019). Atmos. Pollut. Res. 2020, 11, 40–56.
  45. Saputra, D.M.; Saputra, D.; Oswari, L.D. Effect of distance metrics in determining k-value in k-means clustering using elbow and silhouette method. In Proceedings of the Sriwijaya International Conference on Information Technology and Its Applications (SICONIAN 2019), Palembang, Indonesia, 16 November 2019; Atlantis Press: Amsterdam, The Netherlands, 2020; pp. 341–346.
  46. Xu, D.; Tian, Y. A comprehensive survey of clustering algorithms. Ann. Data Sci. 2015, 2, 165–193.
  47. Jaeger, A.; Banks, D. Cluster analysis: A modern statistical review. Wiley Interdiscip. Rev. Comput. Stat. 2023, 15, e1597.
  48. Masood, A.; Ahmad, K. A review on emerging artificial intelligence (AI) techniques for air pollution forecasting: Fundamentals, application and performance. J. Clean. Prod. 2021, 322, 129072.
  49. Li, Y.; Guo, J.e.; Sun, S.; Li, J.; Wang, S.; Zhang, C. Air quality forecasting with artificial intelligence techniques: A scientometric and content analysis. Environ. Model. Softw. 2022, 149, 105329.
  50. Kothandaraman, D.; Praveena, N.; Varadarajkumar, K.; Madhav Rao, B.; Dhabliya, D.; Satla, S.; Abera, W. Intelligent forecasting of air quality and pollution prediction using machine learning. Adsorpt. Sci. Technol. 2022, 2022, 5086622.
  51. Parhizkar, T.; Rafieipour, E.; Parhizkar, A. Evaluation and improvement of energy consumption prediction models using principal component analysis based feature reduction. J. Clean. Prod. 2021, 279, 123866.
  52. Dun, M.; Xu, Z.; Chen, Y.; Wu, L. Short-Term Air Quality Prediction Based on Fractional Grey Linear Regression and Support Vector Machine. Math. Probl. Eng. 2020, 2020, 8914501.
  53. Gariazzo, C.; Carlino, G.; Silibello, C.; Renzi, M.; Finardi, S.; Pepe, N.; Radice, P.; Forastiere, F.; Michelozzi, P.; Viegi, G.; et al. A multi-city air pollution population exposure study: Combined use of chemical-transport and random-Forest models with dynamic population data. Sci. Total Environ. 2020, 724, 138102.
  54. Das, B.; Dursun, Ö.O.; Toraman, S. Prediction of air pollutants for air quality using deep learning methods in a metropolitan city. Urban Clim. 2022, 46, 101291.
  55. Sekeroglu, B.; Ever, Y.K.; Dimililer, K.; Al-Turjman, F. Comparative evaluation and comprehensive analysis of machine learning models for regression problems. Data Intell. 2022, 4, 620–652.
  56. Hrust, L.; Klaić, Z.B.; Križan, J.; Antonić, O.; Hercog, P. Neural network forecasting of air pollutants hourly concentrations using optimised temporal averages of meteorological variables and pollutant concentrations. Atmos. Environ. 2009, 43, 5588–5596.
  57. European Commission’s Mobility Observatory. Available online: https://urban-mobility-observatory.transport.ec.europa.eu/ (accessed on 17 July 2024).
  58. Kanaroglou, P.S.; Jerrett, M.; Morrison, J.; Beckerman, B.; Arain, M.A.; Gilbert, N.L.; Brook, J.R. Establishing an air pollution monitoring network for intra-urban population exposure assessment: A location-allocation approach. Atmos. Environ. 2005, 39, 2399–2409.
  59. Martí, R.; Martínez-Gavara, A.; Sánchez-Oro, J. The capacitated dispersion problem: An optimization model and a memetic algorithm. Memetic Comput. 2021, 13, 131–146.
  60. Carlson, S. International transmission of information and the business firm. Ann. Am. Acad. Political Soc. Sci. 1974, 412, 55–63.
  61. Gomez, J.F.; Panadero, J.; Tordecilla, R.D.; Castaneda, J.; Juan, A.A. A multi-start biased-randomized algorithm for the capacitated dispersion problem. Mathematics 2022, 10, 2405.
  62. Estrada-Moreno, A.; Savelsbergh, M.; Juan, A.A.; Panadero, J. Biased-randomized iterated local search for a multiperiod vehicle routing problem with price discounts for delivery flexibility. Int. Trans. Oper. Res. 2019, 26, 1293–1314.
  63. Shoari, N.; Heydari, S.; Blangiardo, M. School neighbourhood and compliance with WHO-recommended annual NO2 guideline: A case study of Greater London. Sci. Total Environ. 2022, 803, 150038.
Figure 1. Spatial distribution of air quality stations (red) and meteorological stations (green).
Figure 2. Initial situation of air quality stations (red) and sensitive points for the possible location of new stations (blue).
Figure 3. Mean value of the daily concentration evolution at each station.
Figure 4. Average hourly distribution of NO2 concentrations by day of the week.
Figure 5. Annual mean NO2 concentration at each station, with the WHO 24 h average limit value (yellow) and the annual average limit for 2023 (red).
Figure 6. Daily average of NO2 concentrations in the period under study.
Figure 7. Wind rose for each station.
Figure 8. Clusters across stations. (a) Air quality stations. (b) Meteorological stations.
Figure 9. Relationship between the average NO2 concentration at each station and the MAE. (a) With all stations. (b) Without station 50.
Figure 10. Pareto chart.
Figure 11. Optimal scenario: final solution of the problem, with the maximum number of stations covering the maximum area without overlapping.
Figure 12. Minimum scenario: new locations for the minimum number of stations that should be considered according to the solution of the problem.
Figure 13. Heat map comparison with annual average NO2 concentrations in 2023. (a) Heat map with the initial stations. (b) Heat map with minimum scenario. (c) Heat map with the 39 stations.
Table 1. Air quality stations and their locations.

| Station ID | Longitude | Latitude |
|---|---|---|
| 50 | 2.1874 | 41.38640 |
| 43 | 2.1538 | 41.38530 |
| 44 | 2.1534 | 41.39870 |
| 57 | 2.1151 | 41.38750 |
| 4 | 2.2045 | 41.40390 |
| 42 | 2.1331 | 41.37880 |
| 54 | 2.1480 | 41.42610 |
| 58 | 2.1239 | 41.41843 |
Table 2. Meteorological stations and their locations.

| Station ID | Longitude | Latitude |
|---|---|---|
| D5 | 2.12379 | 41.41864 |
| X2 | 2.18847 | 41.38943 |
| X4 | 2.16775 | 41.38390 |
| X8 | 2.10540 | 41.37919 |
Table 3. Application of the KNN algorithm to the stations.

| Air Quality Station | Meteorological Station |
|---|---|
| 50 | X4 |
| 43 | X4 |
| 44 | X4 |
| 57 | X8 |
| 4 | X4 |
| 42 | X8 |
| 54 | D5 |
| 58 | D5 |
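Table 4. Errors of ML models used to predict NO2 concentrations by station.

| Model | Station 50: MAE | Station 50: RMSE | Station 50: R2 | Station 43: MAE | Station 43: RMSE | Station 43: R2 | Station 44: MAE | Station 44: RMSE | Station 44: R2 |
|---|---|---|---|---|---|---|---|---|---|
| KNN | 0.1221 | 0.1575 | 0.2723 | 0.1090 | 0.1431 | 0.3197 | 0.1112 | 0.1470 | 0.3056 |
| Decision Tree | 0.0023 | 0.0065 | 0.9987 | 0.0016 | 0.0035 | 0.9996 | 0.0017 | 0.0033 | 0.9996 |
| SVR | 0.1442 | 0.1874 | −0.0295 | 0.1368 | 0.1740 | −0.0057 | 0.1399 | 0.1781 | −0.0201 |
| Random Forest | 0.0016 | 0.0057 | 0.9991 | 0.0009 | 0.0027 | 0.9998 | 0.0011 | 0.0030 | 0.9997 |
| ANN | 0.1524 | 0.2031 | −0.2094 | 0.1304 | 0.1668 | 0.0767 | 0.1641 | 0.1957 | −0.2306 |

| Model | Station 4: MAE | Station 4: RMSE | Station 4: R2 | Station 42: MAE | Station 42: RMSE | Station 42: R2 | Station 54: MAE | Station 54: RMSE | Station 54: R2 |
|---|---|---|---|---|---|---|---|---|---|
| KNN | 0.1175 | 0.1544 | 0.2932 | 0.1001 | 0.1332 | 0.3130 | 0.0989 | 0.1293 | 0.2864 |
| Decision Tree | 0.0015 | 0.0025 | 0.9998 | 0.0018 | 0.0050 | 0.9990 | 0.0013 | 0.0030 | 0.9996 |
| SVR | 0.1467 | 0.1851 | −0.0161 | 0.1250 | 0.1619 | −0.0151 | 0.1165 | 0.1541 | −0.0139 |
| Random Forest | 0.0011 | 0.0028 | 0.9998 | 0.0009 | 0.0024 | 0.9998 | 0.0008 | 0.0021 | 0.9998 |
| ANN | 0.1506 | 0.1952 | −0.1296 | 0.1267 | 0.1697 | −0.1153 | 0.1320 | 0.1741 | −0.2942 |

| Model | Station 57: MAE | Station 57: RMSE | Station 57: R2 | Station 58: MAE | Station 58: RMSE | Station 58: R2 |
|---|---|---|---|---|---|---|
| KNN | 0.0891 | 0.1206 | 0.3603 | 0.0768 | 0.0980 | 0.2597 |
| Decision Tree | 0.0016 | 0.0042 | 0.9992 | 0.0009 | 0.0020 | 0.9997 |
| SVR | 0.1152 | 0.1511 | −0.0048 | 0.0878 | 0.1139 | 0.0015 |
| Random Forest | 0.0008 | 0.0017 | 0.9999 | 0.0006 | 0.0016 | 0.9998 |
| ANN | 0.1013 | 0.1391 | 0.1482 | 0.1370 | 0.1630 | −1.046 |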
Note: Values in bold represent the best results for each metric across models.
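For readers less familiar with the three metrics in Table 4, the following is a minimal sketch of their standard textbook definitions, not the authors' evaluation code. The function names and the toy values are hypothetical and chosen only for illustration. It also shows why R2 can be negative, as for several SVR and ANN entries: this happens whenever a model's squared error exceeds that of always predicting the mean of the observations.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: average of |observed - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: square root of the average squared residual."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical toy values (not data from the paper).
y_true = [0.2, 0.4, 0.6, 0.8]
good_pred = [0.25, 0.35, 0.65, 0.75]  # close to the observations
bad_pred = [0.8, 0.6, 0.4, 0.2]       # worse than predicting the mean

for name, pred in [("good", good_pred), ("bad", bad_pred)]:
    print(f"{name}: MAE={mae(y_true, pred):.4f} "
          f"RMSE={rmse(y_true, pred):.4f} R2={r2(y_true, pred):.4f}")
```

Lower MAE and RMSE and an R2 closer to 1 indicate a better fit, which is the sense in which the table's note refers to the best results per metric.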
Table 5. Comparison of the current NO2 concentration KPI values (µg/m3) with two scenarios.

| | 2020 | 2021 | 2022 |
|---|---|---|---|
| Current Scenario [NO2] | 23.3372 | 23.3249 | 26.4427 |
| Minimum Scenario [NO2] | 24.3717 | 24.0780 | 28.2378 |
| Optimal Scenario [NO2] | 25.3184 | 25.2244 | 28.2392 |
| Increase in the Minimum Scenario (%) | 4.43 | 3.23 | 6.78 |
| Increase in the Optimal Scenario (%) | 8.49 | 8.14 | 6.79 |
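The percentage rows of Table 5 can be checked with a short calculation. The sketch below assumes that each percentage is the relative increase of a scenario over the current scenario, 100 × (scenario − current) / current; this formula is an assumption, although it agrees with the tabulated values to within 0.01 percentage points. The variable and function names are illustrative.

```python
# KPI values (µg/m3) copied from Table 5.
years = [2020, 2021, 2022]
current = [23.3372, 23.3249, 26.4427]
minimum = [24.3717, 24.0780, 28.2378]
optimal = [25.3184, 25.2244, 28.2392]

def percent_increase(baseline, scenario):
    """Relative increase of each scenario value over its baseline, in percent."""
    return [100.0 * (s - b) / b for b, s in zip(baseline, scenario)]

inc_min = percent_increase(current, minimum)
inc_opt = percent_increase(current, optimal)

for year, pm, po in zip(years, inc_min, inc_opt):
    print(f"{year}: minimum scenario +{pm:.2f}%, optimal scenario +{po:.2f}%")
```

Small last-digit differences from the table are possible because the tabulated KPI values are themselves rounded to four decimals.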
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
