Article

Hourly Seamless Surface O3 Estimates by Integrating the Chemical Transport and Machine Learning Models in the Beijing-Tianjin-Hebei Region

1 School of Economics, Qingdao University, Qingdao 266071, China
2 College of Global Change and Earth System Science, Beijing Normal University, Beijing 100875, China
3 Department of Atmospheric and Oceanic Science, Earth System Science Interdisciplinary Center, University of Maryland, College Park, MD 20742, USA
* Authors to whom correspondence should be addressed.
Authors to whom correspondence should be addressed.
Int. J. Environ. Res. Public Health 2022, 19(14), 8511; https://doi.org/10.3390/ijerph19148511
Submission received: 15 June 2022 / Revised: 9 July 2022 / Accepted: 10 July 2022 / Published: 12 July 2022
(This article belongs to the Special Issue Air Pollution and Cardiorespiratory Health: Population-Based Insights)

Abstract

Surface ozone (O3) is an important atmospheric trace gas that poses an enormous threat to ecological security and human health. Currently, the core objective of air pollution control in China is the joint treatment of fine particulate matter (PM2.5) and O3. However, high-accuracy near-surface O3 maps remain lacking. Therefore, we established a new model to determine the full-coverage hourly O3 concentration by combining the WRF-Chem and random forest (RF) models with anthropogenic emission data and meteorological datasets. Based on this method, and choosing the Beijing-Tianjin-Hebei (BTH) region in 2018 as an example, full-coverage hourly O3 maps were generated at a horizontal resolution of 9 km. The performance evaluation indicated that the new model is reliable, with a sample (station)-based 10-fold cross-validation (10-CV) R2 value of 0.94 (0.90) and a root mean square error (RMSE) of 14.58 (19.18) µg m−3. In addition, the O3 concentration is accurately estimated at varying temporal scales, with sample-based 10-CV R2 values of 0.96, 0.98 and 0.98 at the daily, monthly, and seasonal scales, respectively, which is clearly superior to traditional derivation algorithms and other techniques in previous studies. The diurnal variation in the O3 concentration, an initial increase followed by a decrease associated with temperature and solar radiation variations, was captured. The highest concentration reached approximately 112.73 ± 9.65 μg m−3 at 15:00 local time (1500 LT) in the BTH region. Summertime O3 posed a high pollution risk across the whole BTH region, especially in southern cities, and the pollution duration accounted for more than 50% of the summer season. Additionally, 43 and 2 days exhibited light and moderate O3 pollution, respectively, across the BTH region in 2018.
Overall, the new method can be beneficial for near-surface O3 estimation with a high spatiotemporal resolution, which can be valuable for research in related fields.

1. Introduction

Since the beginning of the 21st century, China has experienced a period of rapid economic growth, urbanization, and industrialization. By 2020, the annual gross domestic product (GDP) had reached approximately 101,598.62 billion yuan, ranking second globally (China Statistical Yearbook). However, the industrial structure in the early 21st century was extensive and depended mainly on energy, raw materials, and labor, resulting in the release of large quantities of air pollutants into the atmosphere [1,2,3]. These air pollutants exert a notable impact on human health, vegetation growth, and environmental change [4,5,6]. It was estimated that the number of fine particulate matter (PM2.5)-related deaths in China increased by approximately 390,000 from 2002 to 2017 [2]. Therefore, air pollution control has become one of the primary tasks of government environmental management at local and national scales. Since 2012, China has conducted extensive real-time monitoring of six conventional pollutants [7]: PM2.5, coarse particulate matter (PM10), nitrogen dioxide (NO2), sulfur dioxide (SO2), carbon monoxide (CO), and ozone (O3). Following the action plan for air pollution prevention and control proposed by the Chinese government on 10 September 2013, air quality has generally improved, especially in terms of the PM10 and PM2.5 concentrations, for which notable decreasing trends have been observed throughout mainland China [8,9,10]. Despite these achievements, the near-surface O3 concentration has exhibited the opposite trend to that of the particulate matter concentration [11,12].
O3 is one of the main secondary air pollutants in China and is formed by volatile organic compounds (VOCs) and nitrogen oxides (NOx) under solar radiation-driven reaction conditions [13]. According to epidemiological studies, the morbidity and mortality of respiratory diseases, heart diseases and even cancer are closely related to O3 loading levels [14,15,16]. In addition, high O3 loadings can destroy the vegetation physiological structure and growth environment, resulting in crop reduction and ultimately affecting food prices. Recently, several O3 pollution episodes have occurred in China, especially in urban agglomeration areas, and the annual mean daily maximum 8-h (MDA8) O3 concentration reached approximately 193, 170 and 165 μg m−3 across the Beijing-Tianjin-Hebei (BTH), Yangtze River Delta (YRD) and Pearl River Delta (PRD) regions, respectively, in 2017 [17]. Therefore, many studies have been carried out on ozone pollution at local and national scales, including field measurements of the ozone concentration [18,19], determination of the relationship among ozone precursors [20,21], analysis of the influences of meteorology on ozone pollution episodes [22,23], and attribution of emission sources [24]. However, previous studies have mostly considered surface observation records, which exhibit a discontinuous spatial distribution. In addition, due to the extremely high cost of manpower and material resources associated with ground-based monitoring, ozone concentration records are often discontinuous over time. Therefore, full-coverage high-quality near-surface O3 datasets are urgently needed for future studies on environmental economics, epidemiology, and climate change.
Two methods are generally employed to estimate the O3 concentration: model simulation and algorithm inversion. In regard to model simulation, regional air quality prediction models have been extensively adopted, e.g., the Community Multiscale Air Quality Modeling System (CMAQ), the global chemical transport model GEOS-Chem, the Nested Air Quality Prediction Modeling System (NAQPMS) and the Weather Research and Forecasting-Chemistry (WRF-Chem) model. Lu et al. (2019) applied the GEOS-Chem model to map the spatial distribution of the MDA8 O3 concentration from May to August in 2016–2017 across China, and the spatial correlation coefficient (R2) value reached approximately 0.67 [25]. Relying on the WRF-Chem model, Li et al. (2020) estimated the hourly O3 concentration in summer across the Lanzhou region [26]. Compared to surface measurements, the R2 values at each station were all less than 0.4. Based on the above models, the formation, transport, and dissipation of O3 could be explained in detail via mechanism analysis [27,28,29]. However, deviations in emission inventories introduce many uncertainties into O3 estimation. With the improvement of traditional statistics and the development of mathematical algorithms, this problem has been increasingly resolved. Among the various techniques, traditional statistical models, e.g., multiple linear regression (MLR), the linear mixed effect model (LME), geographically weighted regression (GWR), land use regression (LUR) and the generalized additive model (GAM), have been widely implemented to estimate the concentration of air pollutants [30,31,32]. Zhang et al. (2020) adopted a GWR model to estimate the monthly O3 concentration in eastern China, and the validation-based R2 value reached approximately 0.77 [33]. Thereafter, due to their strong data mining and information capture abilities, machine/deep learning-based methods have increasingly replaced traditional statistical methods. Zhan et al. (2018) selected the random forest (RF) model to estimate the MDA8 O3 concentration in 2015 across China with a cross-validation R2 value of 0.69 [34]. Then, based on the extreme gradient boosting (XGBoost) algorithm, the daily O3 concentration was simulated at the national scale, and the cross-validation R2 value reached 0.78 [22]. In addition, other machine learning methods have been widely applied to derive O3 and other air pollutant concentrations at national and local scales [35,36]. Despite all these developments, the estimation accuracy remains low, and high uncertainties persist in previous algorithms. Furthermore, most previous studies simulated the O3 concentration at a coarse temporal resolution, i.e., monthly and daily resolutions, which cannot meet the requirements of detailed research on short-term ozone pollution episodes. Therefore, accurate inversion of the hourly O3 concentration is urgently needed for environmental governance and policy implementation purposes.
Here, our objective is to establish an advanced approach to determine the full-coverage hourly accurate near-surface ozone concentration. Throughout all of mainland China, as one of the leading political, economic, and cultural centers, the BTH region is typically exposed to the highest O3 pollution burden. Therefore, this region was selected as an example in this study. For this purpose, the WRF-Chem model was combined with the RF method, meteorological factors, and other ancillary datasets to simulate the hourly O3 concentration across the whole BTH region from 1 January 2018, 00:00 local time (0000 LT) to 31 December 2018, 23:00 local time (2300 LT) at a horizontal resolution of 9 km × 9 km. In addition, we compared our algorithm to other similar algorithms and studies. Based on this approach, an hourly ozone map was established covering the BTH region, and we further performed a comprehensive investigation of the spatial distribution of the O3 concentration and ozone pollution level in the BTH region.

2. Materials and Methods

2.1. Study Area

In this study, the hourly O3 concentration in the BTH region is estimated. This region is located in northern China and spans 36.0° N–42.6° N and 113.5° E–119.8° E. It covers an area of approximately 218,000 km2 and includes Beijing (BJ), Tianjin (TJ) and 11 cities in Hebei Province (Baoding (BD), Cangzhou (CZ), Chengde (CD), Handan (HD), Hengshui (HS), Langfang (LF), Qinhuangdao (QHD), Shijiazhuang (SJZ), Tangshan (TS), Xingtai (XT) and Zhangjiakou (ZJK)). In addition, this area hosts more than 8% of the population of China. In this large urban agglomeration, regional industrialization, urbanization, and motorization are closely related to changes in the atmospheric environment, with emissions from coal combustion, motor vehicles, and industrial exhaust occurring together. In particular, the emissions of O3 precursors, i.e., NOx and VOCs, have shown a notable increasing trend in recent years, resulting in heavy ozone pollution in this region [16]. High ozone loading also seriously affects the health of residents. When the MDA8 O3 concentration met the Chinese Ambient Air Quality Standards (CAAQS) Grade II standard, an increase of 10 μg m−3 in the O3 concentration could lead to an increase of about 0.31% in daily emergency room visits in Beijing [37]. In addition, a nonlinear association existed between ozone and ischemic stroke, and younger adults are more susceptible to extremely high ozone levels than the elderly population in Beijing [38].

2.2. Datasets

2.2.1. Measured Near-Surface Ozone

Hourly near-surface O3 records were collected from 87 state-managed environmental real-time monitoring stations across the BTH region. Figure S1 shows the spatial distribution of the O3 monitoring stations. In general, all cities included more than three sites. Beijing and Tianjin contained the most stations, with 12 and 20 monitoring stations, respectively. In this study, observation records from 0000 LT (GMT+8) on 1 January 2018, to 2300 LT on 31 December 2018, were collected as training samples and validation datasets. Moreover, to prevent systematic errors caused by the monitoring processes, observation records exceeding three times the standard deviation were eliminated. In addition, to avoid instrument failure, any values remaining constant for three consecutive hours were removed [39]. In regard to the state-managed environmental real-time monitoring stations, the O3 concentration was measured via the ultraviolet spectrophotometry method. However, the air quality monitoring protocol was amended on 1 September 2018 [40]. Therefore, we transformed the observed concentrations before this date via multiplication with a fixed coefficient, which was approximately 0.92 [41]. An uneven spatial distribution of the measurement stations occurs in the BTH region, resulting in multiple stations existing in the same grid. Therefore, we calculated the average concentration if one grid contained multiple records. Eventually, 289,553 effective hourly O3 records were collected for modeling.
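The screening steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: all function names are hypothetical, the three-standard-deviation rule is assumed to mean deviation from the series mean, and a "constant" run is assumed to mean at least three identical consecutive hourly readings. The 0.92 coefficient and the 1 September 2018 cut-off are taken from the text.

```python
from datetime import datetime
from statistics import mean, pstdev

PROTOCOL_CHANGE = datetime(2018, 9, 1)  # date the monitoring protocol was amended
SCALE_BEFORE_CHANGE = 0.92              # approximate coefficient quoted in the text


def within_three_sigma(values, k=3.0):
    """Mask of values within k standard deviations of the mean (assumed rule)."""
    mu, sigma = mean(values), pstdev(values)
    return [abs(v - mu) <= k * sigma for v in values]


def stuck_sensor_mask(values, run_length=3):
    """Mask marking readings that sit in a run of >= run_length identical values."""
    n = len(values)
    stuck = [False] * n
    start = 0
    for i in range(1, n + 1):
        # A run ends when the value changes or the series ends.
        if i == n or values[i] != values[start]:
            if i - start >= run_length:
                for j in range(start, i):
                    stuck[j] = True
            start = i
    return stuck


def harmonize(timestamp, value):
    """Rescale observations made before the protocol change."""
    return value * SCALE_BEFORE_CHANGE if timestamp < PROTOCOL_CHANGE else value


def grid_mean(records):
    """Average records that share a grid cell and hour.

    records: iterable of (cell_id, timestamp, value) tuples.
    """
    groups = {}
    for cell, ts, value in records:
        groups.setdefault((cell, ts), []).append(value)
    return {key: mean(vals) for key, vals in groups.items()}
```

In practice the two masks would be combined per station before harmonization and grid averaging; the order of these steps is not specified in the text and is a choice left to the reader.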

2.2.2. WRF-Chem-Simulated Ozone

In the first stage of our two-stage model, the WRF-Chem model version 3.9.1 (WRF-Chem 3.9.1) was applied to simulate the O3 concentration at temporal and spatial resolutions of 1 h and 9 km, respectively [42]. For the WRF-Chem model, meteorological and emission data are the essential driving factors of the initial field and boundary conditions. Meteorological data were collected from the National Centers for Environmental Prediction (NCEP) Final Operational Global Analysis dataset with temporal and spatial resolutions of 6 h and 1° × 1°, respectively. Anthropogenic and biogenic emission inventory data were obtained from the Multi-resolution Emission Inventory for China (MEIC) and the Model of Emissions of Gases and Aerosols from Nature (MEGAN), respectively, at horizontal and temporal resolutions of 0.25° × 0.25° and 1 month [43]. To ensure temporal consistency, all emission driving datasets were interpolated to the hourly scale.

2.2.3. Meteorological Factors

Ozone formation is greatly limited by meteorological conditions [44]. Based on this consideration, eight meteorological factors, i.e., the 2-m temperature (TEM, unit: K), 10-m wind speed (WS, unit: m s−1) and wind direction (WD, unit: degree), solar radiation (RAD, unit: W m−2), boundary layer height (BLH, unit: m), surface pressure (SP, unit: kPa), relative humidity (RH, unit: %) and total evaporation (EVA, unit: mm) were selected to reflect the generation, transport and dissipation processes of O3. However, most currently available meteorological datasets do not reach the hourly temporal resolution. Fortunately, fifth-generation European Centre for Medium-Range Weather Forecasts (ECMWF) reanalysis products (ERA5) have been released since 2018 (www.ecmwf.int), and the temporal resolutions have been increased to hourly intervals. Moreover, we interpolated the spatial resolutions of the above meteorological factors from 0.25° × 0.25° to 9 km with the bilinear interpolation method to ensure data consistency.
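The bilinear resampling mentioned above can be illustrated with a short Python sketch. It assumes a regular latitude/longitude source grid stored as a nested list and a target point that lies inside the grid; the function name, argument names and grid layout are hypothetical, and a real workflow would first convert the 9 km Lambert grid cell centres to longitude and latitude.

```python
def bilinear(field, lon0, lat0, step, lon, lat):
    """Bilinearly interpolate a regular lat/lon grid at (lon, lat).

    field[r][c] holds the value at latitude lat0 + r*step and
    longitude lon0 + c*step (an assumed layout for this sketch).
    """
    # Fractional grid coordinates of the target point.
    x = (lon - lon0) / step
    y = (lat - lat0) / step
    # Index of the south-west corner of the enclosing cell, clamped to the grid.
    c = min(int(x), len(field[0]) - 2)
    r = min(int(y), len(field) - 2)
    tx, ty = x - c, y - r
    # Interpolate along longitude on both bracketing rows, then along latitude.
    south = field[r][c] * (1 - tx) + field[r][c + 1] * tx
    north = field[r + 1][c] * (1 - tx) + field[r + 1][c + 1] * tx
    return south * (1 - ty) + north * ty
```

For example, with a single 0.25° cell whose corner values are 0, 10, 20 and 30, the value at the cell centre is the plain average of the four corners, 15.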

2.2.4. Other Ancillary Data

Vegetation can release biogenic volatile organic compounds (BVOCs), which are also an important precursor of ozone generation [45]. Here, we collected annual cover vegetation data (CVL, the sum of low and high cover vegetation levels) sourced from the ERA5 land version dataset on a single level at a horizontal resolution of 0.25° × 0.25°. In addition, the hourly vertical integral of the divergence in the ozone flux (VIDO) was retrieved from the ERA5 product to reflect the ozone loading. Similar to meteorological factors, the above two parameters are both resampled to the same spatial resolution as that of the WRF-Chem-simulated O3 concentration.

2.3. Methodology

2.3.1. Two-Stage Model

To estimate the full-coverage near-surface O3 concentration in the BTH region, a two-stage model was established in this study. Figure 1 shows a flowchart of our two-stage model. At the first stage, the WRF-Chem model was employed to explain the generation processes of O3. In addition, the impacts of anthropogenic and natural emissions on O3 concentrations were revealed. The WRF model is a mesoscale numerical simulation and data assimilation system that can simulate physical processes at cloud and weather scales [46]. Moreover, to improve the simulation accuracy, three- and four-dimensional variational assimilation algorithms and multilayer nesting grids were adopted in the latest version of the WRF model. To reveal the chemical processes of ozone, we adopted the online coupled chemical transport module of the WRF model, i.e., the WRF-Chem model [42]. For this model, the meteorological conditions and chemical composition can be synchronously and completely simulated, which exhibit the same time step, simulation areas, spatial resolution, and vertical coordinates, in addition to realizing two-way feedback simulation of atmospheric and chemical substances in real-time. In addition, this model can be employed for the simulation of the emission and transportation of atmospheric chemical components, and the interaction between gaseous pollutants (O3 and NO2) and particulate matter (PM2.5 and PM10) can be exactly captured. In this study, a double nesting grid was built with the Lambert projection at the first stage of our 2-stage model. In regard to the first-level domain, a grid with a size of 64 × 56 was established at a horizontal resolution of 27 km × 27 km, which could cover most of North China. Then, to further improve the simulation accuracy and resolution, the second domain (D02) was established based on the results of the first domain at a horizontal resolution of 9 km × 9 km (size: 81 × 87), which could cover the whole BTH region. 
In addition, for the purpose of physical mechanism unification, both grids were established under the same configurations as the boundary layer scheme of Yonsei University [47], Noah land surface scheme [48], Grell three-dimensional cumulus parameterization scheme [49] and Morrison double-moment microphysics scheme. Moreover, the radiation transport scheme was unified between the two domains, and the Goddard [50] and rapid radiative transfer models [51] were selected for the shortwave and longwave radiation schemes, respectively, of the WRF-Chem model. In addition, the chemical mechanism was consistent between the two domains, which entailed the Carbon-Bond Mechanism version Z [52]. In terms of the simulation of the near-surface O3 concentration, due to the inherent limitations of the WRF-Chem model, a monthly simulation cycle was set. Before simulation in each month, 48-h spin-up processes were first conducted for model preheating. Through the WRF-Chem model, the hourly ozone concentration was preliminarily simulated from 1 January 2018 to 31 December 2018, across the BTH region.
However, because notable spatiotemporal heterogeneities exist in O3 concentration data, deviations are still found when only employing the WRF-Chem model in the first stage. Therefore, we selected the RF model [53] to further determine the relationship between O3 and various independent variables at the second stage to improve the O3 inversion accuracy with Equation (1).
$$\mathrm{O_3\_Pre}_{i,j,h} = f_{\mathrm{RF}}\left({\mathrm{WRFO_3}}_{i,j,h},\ \mathrm{TEMI}_{i,j,h},\ \mathrm{TEM}_{i,j,h},\ \mathrm{RAD}_{i,j,h},\ \ldots,\ \mathrm{RH}_{i,j,h},\ \mathrm{CVL}_{i,j,h}\right) \qquad (1)$$
where $\mathrm{O_3\_Pre}_{i,j,h}$ denotes the simulated near-surface O3 concentration in grid i on day j at hour h, ${\mathrm{WRFO_3}}_{i,j,h}$ denotes the WRF-Chem-simulated near-surface O3 concentration (WRFO3) in grid i on day j at hour h, and $\mathrm{TEMI}_{i,j,h}$ denotes the temporal information in grid i on day j at hour h. In this study, a temporal weighted matrix was established according to the method described by Xue et al., which includes the day of the year (DOY), the time distance of one day to spring, summer, autumn and winter, and the local time (LT) [54]. In addition to temporal information, VIDO, CVL and meteorological parameters, including TEM, RAD, WS, WD, BLH, SP, EVA, and RH, in grid i on day j at hour h were selected as explanatory variables for model construction. To ensure that all input factors attained statistical significance and to avoid multicollinearity, correlation analysis and collinearity diagnosis methods were adopted here. Table S1 lists the correlation coefficient (R) and variance inflation factor (VIF) between the surface-measured O3 concentration and all independent variables used for modeling in 2018 across the BTH region. Overall, WRFO3, TEM, RAD, WD, BLH and CVL were significantly positively correlated with O3 (p < 0.01). Among these variables, except for WRFO3, the highest R value of 0.66 was attained by TEM. The O3 concentration was significantly negatively correlated with TEMI, WS, SP, EVA, RH and VIDO, with R ranging from −0.08 to −0.56 (p < 0.01). Moreover, according to the threshold proposed by Ziegel et al., if the VIF value is higher than 10, significant collinearity exists among the variables [55]. In our model, the VIF values of all variables were lower than 4 (ranging from 1.04 to 3.91), which suggests that no multicollinearity occurred in the input data of the RF model.
Thus, significant interrelation and no multicollinearity indicated that all independent variables selected in this paper could be considered in O3 concentration estimation in 2018 across the BTH region.
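For reference, the variance inflation factor used in the collinearity diagnosis above is conventionally defined as follows. This is the standard textbook form rather than a formula quoted from the paper; the index k and the symbol $R_k^2$ are introduced here for illustration only.

```latex
\mathrm{VIF}_k = \frac{1}{1 - R_k^{2}}
```

Here $R_k^2$ is the coefficient of determination obtained by regressing the k-th explanatory variable on all remaining explanatory variables. Under this definition, the threshold VIF = 10 corresponds to $R_k^2 = 0.90$, while the largest reported value, VIF = 3.91, corresponds to $R_k^2 = 1 - 1/3.91 \approx 0.74$.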
Before hourly O3 simulation, we also evaluated the contribution of all independent variables. Table 1 also lists the feature importance (FI) of all input datasets of our two-stage model. The FI values sum to 100% and reflect the contribution of each independent variable to model training; a higher FI value indicates a larger contribution of that variable to O3 estimation by the RF model. The highest contribution was yielded by WRFO3, with an FI value of 59.2%. The second major contributing factor was TEM, with an FI value of 14.1%. Generally, a high temperature facilitates the volatilization of VOCs, and heavy O3 pollution episodes usually occur at high temperatures [56]. In addition, high temperatures can affect atmospheric turbulence and accelerate photochemical reactions [57]. Figure S1 also shows the mean temperature in the BTH region. In general, the mean temperature reached approximately 283.75 K, and high temperatures were captured in the southern BTH region. Another vital reaction condition is radiation, which contributed ~6.6% to the hourly O3 concentration estimation. Radiation is a necessary condition for photochemical reactions and can limit the release of biogenic VOCs and the photodissociation reaction rate, resulting in O3 loading changes. The next important contributing factor was RH (FI value: ~5.4%), which can affect radiation transfer, reduce the air temperature, and accelerate O3 dissipation. Moreover, the total contribution of the temporal information and the other meteorological factors reached approximately 14.7%, and these factors could describe the temporal variation, generation, transport, and dissipation processes of O3 across the BTH region.
In addition, many traditional models used for O3 concentration estimation were selected for training based on the same input datasets as those employed for our two-stage model for comparison purposes, including MLR, LME, GWR, GAM and traditional two-stage models.

2.3.2. Evaluation Approach

Two tenfold cross-validation (10-CV) approaches, i.e., sample- and station-based 10-CV, were selected to evaluate the results of our two-stage model. In the sample-based 10-CV approach, all samples were randomly divided into ten groups. Here, the explained variable is the measured O3 concentration, the explanatory variables include the WRF-Chem-simulated O3, meteorological factors and other ancillary data, and each measured O3 record corresponds to 12 explanatory variables. Nine of the ten groups were used as training samples for the two-stage model, while the remaining group served as the testing dataset. This process was repeated ten times so that every sample was used once for testing and nine times for training. In the station-based 10-CV approach, the samples were divided into ten subsets according to the O3 monitoring stations. Similarly, all samples were used nine times for training and once for testing. Moreover, various statistical indexes, including regression equation parameters (slope and intercept), goodness of fit (R2), root mean square error (RMSE) and mean absolute error (MAE), were employed to evaluate the consistency between the simulated and observed O3 concentrations.
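The difference between the two splitting schemes, and the evaluation indexes listed above, can be sketched in plain Python. This is an illustrative sketch only: the function names are hypothetical, the fold assignment is one of several reasonable choices, and R2 is assumed to be the squared Pearson correlation between observations and estimates, which the text does not state explicitly.

```python
import random
from math import sqrt


def sample_folds(n_samples, k=10, seed=0):
    """Sample-based split: shuffle record indices and deal them into k folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def station_folds(station_ids, k=10, seed=0):
    """Station-based split: all records of one station land in the same fold."""
    stations = sorted(set(station_ids))
    random.Random(seed).shuffle(stations)
    fold_of = {s: i % k for i, s in enumerate(stations)}
    folds = [[] for _ in range(k)]
    for i, s in enumerate(station_ids):
        folds[fold_of[s]].append(i)
    return folds


def evaluate(obs, est):
    """Slope, intercept, R2, RMSE and MAE of estimates against observations."""
    n = len(obs)
    mo, me = sum(obs) / n, sum(est) / n
    sxx = sum((o - mo) ** 2 for o in obs)
    syy = sum((e - me) ** 2 for e in est)
    sxy = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    slope = sxy / sxx  # least-squares fit of estimates on observations
    return {
        "slope": slope,
        "intercept": me - slope * mo,
        "r2": sxy ** 2 / (sxx * syy),  # squared Pearson correlation (assumed)
        "rmse": sqrt(sum((e - o) ** 2 for o, e in zip(obs, est)) / n),
        "mae": sum(abs(e - o) for o, e in zip(obs, est)) / n,
    }
```

In either scheme, each fold in turn serves as the testing set while the other nine are used for training. Because the station-based split withholds entire stations, it tests prediction at unmonitored locations, whereas the sample-based split can place records from the same station in both the training and testing sets.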

3. Results

3.1. Overall Accuracy Evaluation

Generally, the two-stage model achieved a strong data-mining ability. Figure 2 shows the sample-based 10-CV results of our model in terms of O3 estimation on an hourly basis across the BTH region. Because ozone formation occurs under solar radiation, we only selected the results from 08:00 local time (0800 LT) to 1800 LT for illustration. Overall, our model could establish relationships between the hourly measured O3 concentration and the independent variables with overall R2, RMSE and MAE values of 0.94, 14.58 μg m−3 and 9.96 μg m−3, respectively. In addition, our model largely avoided overfitting, and the slope of the best-fit linear regression line reached 0.92. The evaluation indexes at each hour were also calculated. The two-stage model was highly accurate in hourly O3 concentration simulation, with sample-based 10-CV R2 values and linear regression slopes ranging from 0.82 to 0.95 and from 0.77 to 0.93, respectively. In addition, the uncertainties in the two-stage model were low, with the linear regression intercept, RMSE and MAE ranging from 7.72–9.91 μg m−3, 12.70–17.89 μg m−3 and 9.12–12.54 μg m−3, respectively. However, the reaction conditions of O3, such as radiation and temperature, differ throughout the day, and human activity and precursor concentration levels also vary, so the evaluation indexes exhibit slight differences throughout the day. Better estimation performance was captured from 1200 to 1800 LT, when the R2 values were all greater than 0.93 and the slopes were all greater than 0.90. This difference in precision originates in the WRF-Chem simulation and is mainly caused by deviations in the simulated meteorological fields. Nevertheless, the R2 values in the morning hours all exceeded 0.80, which further indicates that our model achieved a stable and robust simulation ability.
Moreover, the station-based 10-CV results were evaluated. Figure S2 shows the station-based 10-CV results from 0800 to 1800 LT in 2018 across the BTH region. Overall, our two-stage model achieved a stable and robust spatial prediction ability. The ensemble station-based 10-CV R2 and slope values were 0.90 and 0.89, respectively. Additionally, the RMSE and MAE values were 19.18 and 11.32 μg m−3, respectively. This illustrates that our two-stage model can predict the O3 concentration accurately in areas with no surface measurement coverage. Furthermore, the station-based 10-CV R2 value was slightly lower than that obtained with the sample-based 10-CV approach, which can further indicate the robustness of our model. Similar to the sample-based 10-CV approach, significant diurnal differences in accuracy were also captured with the station-based 10-CV approach. In contrast, more accurate simulation results were obtained from 1200 to 1700 LT, with R2 values ranging from 0.89 to 0.90 and slopes ranging from 0.87–0.88. In addition, concentrations with a higher density were distributed close to the 1:1 line. However, a slight underestimation occurred with our model, which could be explained by the simulation uncertainty at the first stage. Despite these limitations, based on our two-stage model, the relationship between the O3 concentration and natural and human activities was precisely established.

3.2. Station-Scale Accuracy Evaluation

The performance of our two-stage model in regard to hourly O3 concentration estimation at each individual station was also evaluated (Figure 3). In general, our two-stage model attained a high adaptability at each station across the BTH region with a mean sample-based (station-based) 10-CV R2 value of 0.95 (0.91) and RMSE and MAE values of 14.25 μg m−3 (18.38 μg m−3) and 10.20 μg m−3 (13.32 μg m−3), respectively. Approximately 90% of stations achieved a high accuracy with a sample-based 10-CV R2 value higher than 0.90. In contrast, the stations with better O3 estimation results were located in the southwestern areas of the BTH region, and the highest sample-based 10-CV R2 value could reach 0.98. Furthermore, this model yielded a low uncertainty, and approximately 74% and 68% of all stations attained RMSE and MAE values less than 16 and 10 μg m−3, respectively, with the sample-based 10-CV approach. However, the uncertainty in the station-based 10-CV results was slightly higher than that in the sample-based 10-CV results, and at approximately 42% and 31% of all stations, RMSE < 16 μg m−3 and MAE < 10 μg m−3 were reached in regard to hourly O3 concentration estimation. This was mainly attributed to the scattered site distribution and discontinuity in the spatial information, resulting in incorrect assessment of the relationship between the hourly O3 concentration and other ancillary data. In addition, we calculated the hour of occurrence of the highest 10-CV R2 value of the O3 estimates to reflect the hourly adaptive model performance at the station scale. In general, the estimated O3 concentrations at each station from 1400 to 1600 LT were the most consistent with the ground measurements, and approximately 79% and 54% of all stations attained the highest sample- and station-based 10-CV R2 values, respectively, during this period. 
This can be attributed to the stable relationships among temperature, radiation and the O3 concentration from 1400–1600 LT, which are suitable for model training. In addition, we selected eight stations located in different parts of the Beijing-Tianjin-Hebei region (central, northern, western, eastern, northwestern, northeastern, southwestern, and southeastern) to compare the mean hourly simulation results with the observed O3 concentration (Figure S3). Overall, the simulation results are highly consistent with the monitoring results, and the correlation coefficients all exceed 0.99.

3.3. Temporal-Scale Accuracy Evaluation

First, we evaluated the estimation bias in a time series of the hourly O3 concentration from 0800 to 1800 LT across the BTH region (Figure S4). Here, the bias was calculated as the difference between the surface measured and estimated O3 concentrations. Overall, the estimation bias indicated a notable diurnal variation involving an initial increase and subsequent decrease. The maximum hourly O3 bias was captured at 1600 LT (~0.97 μg m−3), while the minimum bias occurred at 1700 LT (~0.06 μg m−3). Notably, the biases from 1000–1400 LT were all negative, which indicates that there occurred a slight overestimation with our model during this period. In contrast, the estimation bias from 1500–1800 LT suggested slight underestimation. In addition, the standard deviation of the bias was calculated for each hour, as shown in Figure S4. The hourly standard deviation of the bias remained at a lower level, ranging from 12.70 μg m−3 (0900 LT) to 17.89 μg m−3 (1800 LT), which is consistent with the O3 loading results. This further confirmed the stability of our model for hourly O3 concentration estimation.
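The bias statistics described above can be reproduced with a few lines of Python. The sketch follows the sign convention stated in the text (bias = measured minus estimated, so a negative mean bias indicates overestimation); the function name and the record layout are hypothetical, and the population standard deviation is an assumed choice.

```python
from collections import defaultdict
from statistics import mean, pstdev


def hourly_bias(records):
    """Mean and standard deviation of (observed - estimated) for each hour.

    records: iterable of (hour, observed, estimated) tuples.
    Returns {hour: (mean_bias, std_bias)}; negative mean = overestimation.
    """
    by_hour = defaultdict(list)
    for hour, observed, estimated in records:
        by_hour[hour].append(observed - estimated)
    return {h: (mean(b), pstdev(b)) for h, b in sorted(by_hour.items())}
```

Applied to paired station records from 0800 to 1800 LT, this yields one mean bias and one spread per hour, which is the kind of summary shown in Figure S4.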
We also investigated the estimation performance at the daily scale by calculating the sample-based 10-CV R2, RMSE and MAE values for each day of the year (DOY). Figure 4 shows the temporal performance of our two-stage model. In 2018, the daily sample-based 10-CV R2 value ranged from 0.41 to 0.95, and the average R2 value reached approximately 0.84 across the BTH region. Overall, approximately 74.3% of all days in 2018 attained a 10-CV R2 value greater than 0.7. Only three days attained an R2 value lower than 0.50, because of a lack of training samples: the training samples on these days were at least a quarter fewer than those on the other days. Figure 4a also shows that the R2 value in spring and winter was lower than that in summer. This disparity occurred because the dominant photochemical reaction conditions (e.g., temperature and radiation) in spring and winter are weak, which hampers the simulation of the near-surface O3 concentration by the WRF-Chem model. RMSE and MAE were low across the BTH region, and approximately 85.8% and 87.1% of all days in 2018 exhibited values less than 20 and 15 μg m−3, respectively. High RMSE and MAE values were mainly captured in summer because of intense photochemical reactions, and severe O3 pollution days occurred during this season.
The MDA8 O3 concentration is an important standard to evaluate the daily ozone pollution level. Therefore, we also evaluated temporally synthesized MDA8 O3 data from hourly samples in 2018 across the BTH region (Figure S5). At the daily scale, compared to surface measurements, our model could accurately reflect daily MDA8 O3 variations with a high consistency (R2 = 0.96, and slope = 0.94) and low uncertainty (RMSE = 9.84 μg m−3, and MAE = 7.23 μg m−3). In addition, a significant consistency was captured between the monthly and seasonal mean MDA8 O3 concentrations and surface observations, and the R2 (slope) value at both scales was 0.98 (0.98). Furthermore, the scatter points were distributed close to the 1:1 line, which could also suggest that our method is reasonable and stable in terms of O3 estimation. Moreover, at these two time scales, the estimation uncertainty was reduced, with mean RMSE (MAE) values of 5.50 μg m−3 (4.22 μg m−3) and 4.69 μg m−3 (3.66 μg m−3) for the estimated monthly and seasonal MDA8 O3 concentrations, respectively. Thus, our two-stage model could accurately describe O3 pollution across the BTH region, and the derived full-coverage O3 concentration can be widely applied in research on economics, epidemiology, and other related disciplines.
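As a reminder of how a daily MDA8 value is synthesized from hourly data, the following sketch takes the maximum of all 8-hour running means within one day. It assumes a complete, gap-free list of consecutive hourly values; the function name is hypothetical, and operational definitions usually add data-completeness rules that are omitted here.

```python
def mda8(hourly_values, window=8):
    """Maximum daily 8-hour average: the largest mean over any run of
    `window` consecutive hourly concentrations (ug m-3).

    ASSUMPTION: `hourly_values` holds consecutive hours with no gaps.
    """
    if len(hourly_values) < window:
        raise ValueError("need at least one full averaging window")
    running_means = [
        sum(hourly_values[start:start + window]) / window
        for start in range(len(hourly_values) - window + 1)
    ]
    return max(running_means)


# Eleven hypothetical hourly values covering 0800-1800 LT give four windows.
example_day = [40, 50, 60, 70, 80, 90, 100, 110, 120, 110, 100]
example_mda8 = mda8(example_day)  # 97.5, from the 1100-1800 LT window
```

Monthly and seasonal MDA8 values, as evaluated above, would then be simple averages of these daily values.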

3.4. Spatial Distribution of Ozone Pollution in the BTH Region

3.4.1. Diurnal Variations in Ozone

Based on our two-stage model, the full-coverage hourly O3 concentration was estimated. Figure S6 shows the spatial distribution of the hourly O3 concentration in the BTH region from 0800 to 1800 LT in 2018. Overall, the average hourly O3 concentration reached 90.12 ± 5.17 μg m−3. Due to the notable limitations of photochemical reaction conditions, O3 pollution exhibited significant diurnal variation. From 0800–1800 LT, a low level was captured at sunrise with an O3 concentration of ~44.86 ± 9.65 μg m−3. Then, with increasing temperature, solar radiation and human activities, the chemical reaction conditions of O3 and precursor emissions were both enhanced, resulting in O3 pollution, and the peak concentration reached approximately 112.73 ± 9.65 μg m−3 at 1500 LT. Severe O3 pollution occurred in the BTH region from 1300 to 1700 LT, with its concentrations higher than 100 μg m−3. In general, the O3 concentrations in the morning (0800–1200 LT) were lower than those in the afternoon (1300–1800 LT), and the O3 concentration in the afternoon (~107.93 ± 8.09 μg m−3) was 1.57 times that in the morning (~68.74 ± 6.88 μg m−3).
In terms of the spatial distribution, approximately 46.5% of all areas were exposed to high O3 levels with annual mean hourly O3 concentrations higher than 90 μg m−3, and these high-O3 areas were mainly located in the southeastern and northwestern parts of the BTH region. However, the spatial distribution varied significantly over the course of the day. From 0800–1000 LT, high-O3 loading areas were mainly concentrated in the northern BTH region, at high altitudes. During this period, these areas received more radiation than low-altitude regions. Moreover, a large amount of cultivated land and forestland covers the area, enhancing the emission of natural source-derived precursors of photochemical reactions (e.g., VOCs, methane, and terpenes). Subsequently, the high-O3 pollution regions were mainly located in the southeastern and northwestern BTH areas. In these areas, human activities contributed large amounts of NOx and VOCs. In the southern BTH area in particular, many heavy industrial enterprises are located, resulting in very high anthropogenic emissions. Then, as the day progressed, due to the weakening of human activities, the O3 concentrations in high-pollution areas decreased toward sunset (1700–1800 LT). Overall, O3 pollution exhibits the spatial distribution characteristics of high levels in the south and low levels in the north throughout the BTH region, which is consistent with the spatial distribution of PM2.5 pollution [3]. These results indicated that anthropogenic emissions are one of the primary causes of O3 pollution.

3.4.2. Seasonal Variations in Ozone

Due to seasonal subsolar point movement, solar radiation and temperature exhibit notable seasonal variation, resulting in seasonal variations in the O3 concentration across the BTH region. Figure 5 shows the spatially averaged MDA8 O3 concentration across the BTH region in 2018, and the seasonal MDA8 O3 concentration was synthesized from daily MDA8 O3 maps. In general, the O3 pollution levels revealed similar spatial patterns between spring and summer. The mean MDA8 O3 concentration reached 120.33 ± 7.21 μg m−3 in spring. In addition, we calculated the proportion of O3 pollution time over 13 cities throughout the BTH region in each season, as shown in Figure 5. The black and yellow vertical lines indicate the proportion of O3 pollution time across the whole region. During approximately 70.7% of the spring season, an MDA8 O3 concentration exceeding the first-level pollution standard (100 μg m−3) was observed throughout the whole region. Among the various areas, Xingtai, Hengshui, Handan and Cangzhou exhibited a longer exposure to ozone pollution in spring. In summer, O3 pollution was severe, and the mean MDA8 O3 concentration reached approximately 148.28 ± 32.04 μg m−3, which far exceeded the first-level pollution standard of the ambient air quality standards in China. In this season, all areas of the BTH region exhibited high O3 pollution levels with mean MDA8 O3 > 100 μg m−3, and 32.0% of all areas attained a mean MDA8 O3 concentration exceeding the second-level pollution standard (160 μg m−3) of the ambient air quality standards. The high-value areas were mainly distributed in the southern and northwestern BTH areas and included most cities in Hebei Province. Across the whole BTH region, during approximately 94.6% of the time, MDA8 O3 > 100 μg m−3, while during approximately 35.9% of the time, MDA8 O3 > 160 μg m−3.
Among all cities in this region, the Zhangjiakou region attained the highest proportion of pollution time, with MDA8 O3 > 100 μg m−3 during 98.9% of the time. However, the time of exposure to extremely serious O3 pollution (>160 μg m−3) was relatively short in the Zhangjiakou region. Overall, the time of exposure to extremely serious ozone pollution in 9 cities (i.e., Xingtai, Tianjin, Tangshan, Shijiazhuang, Langfang, Hengshui, Handan, Cangzhou and Baoding) was longer than the average level in the BTH region, especially in Hengshui, where MDA8 O3 > 160 μg m−3 accounted for 58.7% of the summer season. The temperature in these cities was higher than that in the other areas of the BTH region (Figure S1), suggesting stronger photochemical reaction conditions. Moreover, the anthropogenic emissions of VOCs and NOx in these cities were higher than those in the other cities due to the intense heavy and transportation industries, resulting in the emission of more precursors, which favors O3 generation. In contrast, O3 pollution greatly decreased in autumn and winter, and the average MDA8 O3 concentration was 81.59 ± 7.81 μg m−3 and 61.84 ± 8.09 μg m−3, respectively. In winter in particular, the mean MDA8 O3 concentration in the BTH region was lower than 100 μg m−3. The main reason is that the temperature and radiation greatly decrease with the southward movement of the subsolar point. Note that Xingtai, Hengshui, Handan and Cangzhou still exhibited O3 pollution time proportions ranging from 1.1–5.6% (>100 μg m−3). This phenomenon indicated that although photochemical reaction conditions are unfavorable in winter, high precursor emissions (sourced from motor vehicles and heating) could also increase the risk of O3 pollution. In addition, significant spatial spillover effects existed in O3 pollution, indicating that these four cities are also threatened by ozone pollution from the surrounding areas.

3.4.3. O3 Pollution in the Beijing-Tianjin-Hebei Region

To further explore ozone pollution in the BTH region, we also estimated the daily MDA8 O3 concentration in 13 cities (Figure 6). Overall, the daily mean MDA8 O3 concentration reached approximately 103.01 ± 43.41 μg m−3 across the BTH region. Among the various cities, the highest O3 pollution was captured in Hengshui, with a daily MDA8 O3 concentration of 113.73 ± 54.98 μg m−3. In contrast, the lowest MDA8 O3 concentration was captured in Qinhuangdao, with an average concentration of 94.75 ± 43.66 μg m−3. In addition, with increasing DOY, the MDA8 O3 concentration first increased and then decreased in all cities, and the MDA8 O3 concentration peaked from 1 June to 1 July. Note that O3 pollution occurred almost synchronously among the considered cities because of the integrity of the atmospheric transport conditions throughout the BTH region, which illustrates the importance of overall joint governance in this region. Based on the daily MDA8 O3 concentration, individual air quality index (IAQI) values of the daily average O3 concentration were calculated according to the method proposed by the Technical Regulation on the Ambient Air Quality Index (HJ 633-2012). According to this standard, IAQI values of the MDA8 O3 concentration of 0~50, 51~100, 101~150, 151~200, 201~300 and >300 were defined as excellent, good, light pollution, moderate pollution, heavy pollution, and serious pollution, respectively. Similar to the MDA8 O3 concentration, variations involving an initial increase and subsequent decrease were also captured for the IAQI. The annual average IAQI value was approximately 58 across the whole BTH region, indicating good air quality with respect to O3. Despite these findings, there remains a long pollution period from May to October. In particular, moderate O3 pollution days were observed during the period from 1 June to 1 July.
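The IAQI conversion described above amounts to piecewise-linear interpolation between concentration breakpoints, followed by a lookup of the class label. The sketch below is illustrative only: the function names are hypothetical, and the 8-hour O3 breakpoints (0, 100, 160, 215, 265 and 800 μg m−3 for IAQI 0, 50, 100, 150, 200 and 300) are quoted from memory of HJ 633-2012 and should be checked against the standard before use. Any rounding rule in the standard is not reproduced.

```python
# (concentration in ug m-3, IAQI) pairs for the 8-hour O3 average.
# ASSUMPTION: values recalled from HJ 633-2012; verify before relying on them.
O3_8H_BREAKPOINTS = [(0, 0), (100, 50), (160, 100), (215, 150), (265, 200), (800, 300)]

# Upper IAQI bound of each class, using the labels given in the text.
IAQI_CLASSES = [
    (50, "excellent"),
    (100, "good"),
    (150, "light pollution"),
    (200, "moderate pollution"),
    (300, "heavy pollution"),
]


def o3_iaqi(mda8):
    """Linearly interpolate the IAQI for an MDA8 O3 concentration."""
    if mda8 < 0 or mda8 > O3_8H_BREAKPOINTS[-1][0]:
        raise ValueError("MDA8 O3 outside the range of the 8-hour breakpoints")
    for (c_lo, i_lo), (c_hi, i_hi) in zip(O3_8H_BREAKPOINTS, O3_8H_BREAKPOINTS[1:]):
        if mda8 <= c_hi:
            return i_lo + (i_hi - i_lo) * (mda8 - c_lo) / (c_hi - c_lo)


def iaqi_class(iaqi):
    """Map an IAQI value to the air quality class named in the text."""
    for upper, label in IAQI_CLASSES:
        if iaqi <= upper:
            return label
    return "serious pollution"
```

Under these assumed breakpoints, an MDA8 O3 concentration of 160 μg m−3 maps to an IAQI of 100, so the second-level standard coincides with the boundary between good air quality and light pollution.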
In addition, the proportion of O3 pollution days was investigated in this study (Figure 7). Throughout the whole BTH region, there were no severe ozone pollution days, and days with excellent and good O3 levels accounted for 53% and 35% of 2018, respectively. However, 12% (43 days) and <1% (2 days) of 2018 exhibited light and moderate O3 pollution, respectively. Similar to the seasonal spatial distribution of the MDA8 O3 concentration, Beijing, Chengde and Qinhuangdao maintained a low O3 pollution level, with days exhibiting an excellent air quality in terms of O3 accounting for 59% (215 days) of the year. In contrast, cities with high pollution levels were mainly located in the southern BTH area, with light and moderate pollution days accounting for 17–19% and 1–5%, respectively, of 2018. These results indicated that O3 governance is urgently required in the southern BTH area.

4. Discussion

Here, we first compared the model performance between our two-stage model and six widely applied traditional models in air pollutant concentration estimation (Table 1). For these models, the same hourly training datasets, except for WRFO3, across the BTH region in 2018 were selected for modeling. Among these models, due to its simple linear relationship, the MLR model achieved a poor estimation accuracy with low R2 values of 0.63 and 0.62 for the sample- and station-based 10-CV results, respectively. In addition, its RMSE (30.37–31.32 μg m−3) and MAE (29.85–31.01 μg m−3) values were the highest. Then, since potential nonlinear relationships were captured and spatial relationships were considered, the estimation capability of the GAM and GWR models was higher. The sample-based 10-CV R2 values of these two models increased to 0.69 and 0.72, respectively, and the RMSE and MAE values also declined. Furthermore, because fixed and random effects were considered, the estimation capacity of the LME and LME+GWR models was enhanced with sample-based 10-CV R2 values of 0.81 and 0.87, respectively. When the chemical transport model is used alone, the simulation results are relatively poor due to the fixity of its chemical scheme and the particularity of atmospheric transmission. The statistical indicators are even lower than those of most traditional statistical models, with R2 values of 0.67 and 0.66 for the sample-based and station-based 10-CV, respectively. Meanwhile, the RMSE and MAE of the WRF-Chem results alone are also high, close to twice those of the WRF+RF model. In regard to the machine learning method, we investigated the performance of the RF model. In comparison, with WRFO3 input, the coefficient of determination R2 increased by 0.03 for both the sample- and station-based 10-CV results. Moreover, the uncertainty (RMSE and MAE) decreased by nearly 10%.
These results indicated that our two-stage model is superior to the other traditional models and highlighted the importance and stability of WRFO3.
We also compared our results with those obtained with methods adopted in similar studies. Several previous publications are summarized in Table S2 for comparison. Overall, our two-stage model yielded a better estimation ability than previous methods at the various temporal scales. At the hourly scale, compared to the estimation of Liu et al. with a chemical transport model, i.e., the CMAQ model, the CV R2 value is at least doubled [58]. Meanwhile, the CV R2 value of our model is 0.29 higher than that obtained using the WRF model alone in the BTH region in 2018 [59]. In addition, our model yielded a better sample-based 10-CV R2 value than that reported in a previous regional study (R2 = 0.81) on the BTH region from 2010 to 2017 with only the RF model for hourly O3 estimation [35]. Then, we compared our model to other daily mean or MDA8 O3 concentration estimation studies based on machine learning methods, e.g., the data fusion model, XGBoost model and RF algorithm, conducted at the national and regional scales with an approximate horizontal resolution of 0.1° × 0.1° [26,34,36,44]. As indicated in Table S2, the CV R2 values are all <0.8 (0.59–0.79), and the RMSE values are all >20 μg m−3, which indicates that our model achieves an excellent performance. Furthermore, our two-stage model outperforms many statistical models and machine learning models at the monthly and seasonal scales [33,44]. Although Liu et al. employed the XGBoost model to achieve a high accuracy with R2 values of 0.90 and 0.93 at the monthly and seasonal scales, respectively [44], the R2 values are 0.98 at both scales for our model, which suggests a smaller error than that in the aforementioned studies. Compared with our previous study, our estimation of the near-surface O3 concentration in the Beijing-Tianjin-Hebei region has been slightly improved [60]. In the future, this method can be extended to a wider area.
The purpose of this study was to accurately map the full-coverage hourly O3 concentration. However, there remain certain limitations, which will be addressed in future research. First, the horizontal resolution can be further improved. Given the needs of refined research in epidemiology, economics and environmental sciences, high-quality O3 maps with a higher horizontal resolution, e.g., 0.01° × 0.01°, can provide the basic data for more accurate research. Second, the study area should be extended. Here, we only selected the BTH region as an example. The evaluation showed that our model achieved an excellent spatial prediction ability, so the model can be applied to estimate the near-surface O3 concentration at the national scale in the future. Third, the time series should also be expanded. The O3 monitoring network was established in 2013, and historical records before then remain unavailable. Therefore, based on the method proposed in this paper, the relationship between the measured O3 and other factors can be built from the data available since 2013, and high-accuracy full-coverage hourly O3 historical records can be derived over the long term. Furthermore, because the WRF-Chem model can predict future ozone concentrations, the relationship between the measured O3 and other factors could also be used to forecast the near-surface O3 concentration in our future work, which would be beneficial for research in related fields.

5. Conclusions

Currently, the joint management of O3 and PM2.5 comprises the focus of air pollution control in China. However, high-quality near-surface O3 concentration data are relatively scarce in China. This paper attempts to determine the full-coverage hourly near-surface O3 concentration, and the BTH region was selected as an example. Therefore, a fusion algorithm (WRF-Chem and RF models) was established that combined meteorological and anthropogenic emission data to estimate the hourly O3 concentration in 2018 throughout the BTH region. The assessment results indicated that our model achieved a high accuracy with a sample-based 10-CV R2 value of 0.94 and RMSE of 14.58 μg m−3. Moreover, the O3 concentration estimated with the proposed method was extremely consistent with station-based measurement at varying temporal scales. In addition, after incorporating the chemical transport mechanism and with the use of a data-mining algorithm, the performance of our two-stage model was highly superior to that of the traditional derivation algorithm and methods proposed in previous related studies. With this model, hourly and seasonal O3 concentration maps were generated across the BTH region in 2018. The obtained results indicated that the BTH region faces a considerable O3 exposure risk with an average hourly O3 concentration of 90.12 ± 5.17 μg m−3, and the peak concentration reached approximately 112.73 ± 9.65 μg m−3 at 1500 LT. Moreover, severe O3 pollution mainly occurred in summer. In addition, through calculation of the IAQI associated with the O3 concentration, we found that the vast majority of cities suffered slight pollution in 2018 throughout the BTH region, and severe O3 pollution regions were observed in the southern BTH area. In summary, the established method is beneficial for accurate O3 concentration estimation, and O3 maps can be widely applied in economics, epidemiology, and environmental science research.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijerph19148511/s1, Figure S1: Spatial distribution of the ground O3 monitoring stations in the BTH region; Figure S2: Density scatter plots of the station-based 10-CV results from 08:00 local time (0800 LT) to 18:00 local time (1800 LT) across the BTH region in 2018; Figure S3: The variation of mean hourly measured and simulated ozone concentration at eight stations around the Beijing-Tianjin-Hebei region; Figure S4: The time series of hourly O3 concentration bias from 0800 LT to 1800 LT over the BTH region; Figure S5: Density scatter plots of the sample-based 10-CV results of the daily, monthly and seasonal MDA8 O3 concentrations in 2018 across the BTH region; Figure S6: Spatial distributions of the hourly O3 concentration from 0800–1800 LT and annual mean O3 concentration across the BTH region in 2018; Table S1: VIF, FI and R values between the surface measured O3 concentration and all factors for model building; Table S2: Comparison of the model performances between our two-stage model and other models used in similar studies of O3 concentration estimation.

Author Contributions

Conceptualization, W.X. and J.W.; methodology, W.X. and J.Z.; software, W.X. and X.H.; validation, W.X., J.W. and X.H.; formal analysis, W.X.; data curation, W.X. and X.H.; writing—original draft preparation, W.X. and J.W.; writing—review and editing, W.X., Z.Y., J.Z. and J.W. All authors have read and agreed to the published version of the manuscript.

Funding

This study was supported by the National Natural Science Foundation of China (41575144), the National Key R&D Program of China (2017YFA0603603) and the Qingdao Social Science Planning Project (QDSKL2101073).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Chan, C.K.; Yao, X. Air pollution in mega cities in China. Atmos. Environ. 2008, 42, 1–42.
  2. Geng, G.; Zheng, Y.; Zhang, Q.; Xue, T.; Zhao, H.; Tong, D.; Zheng, B.; Li, M.; Liu, F.; Hong, C.; et al. Drivers of PM2.5 air pollution deaths in China 2002–2017. Nat. Geosci. 2021, 14, 645–650.
  3. Xue, W.; Zhang, J.; Zhong, C.; Li, X.; Wei, J. Spatiotemporal PM2.5 variations and its response to the industrial structure from 2000 to 2018 in the Beijing-Tianjin-Hebei region. J. Clean. Prod. 2021, 27, 123742.
  4. Wu, X.; Braun, D.; Schwartz, J.; Kioumourtzoglou, M.A.; Dominici, F.J.S.A. Evaluating the impact of long-term exposure to fine particulate matter on mortality among the elderly. Sci. Adv. 2020, 6, 5692.
  5. Rai, R.; Agrawal, M. Impact of tropospheric ozone on crop plants. Proc. Natl. Acad. Sci. India Sect. B 2012, 82, 241–257.
  6. Wei, J.; Liu, S.; Li, Z.; Liu, C.; Qin, K.; Liu, X.; Pinker, R.; Dickerson, R.; Lin, J.; Boersma, K.; et al. Ground-level NO2 surveillance from space across China for high resolution using interpretable spatiotemporally weighted artificial intelligence. Environ. Sci. Technol. 2022.
  7. GB 3095-2012; Revision of the Ambient Air Quality Standards. Ministry of Ecology and Environment (MEE): Beijing, China, 2018.
  8. Wei, J.; Li, Z.; Lyapustin, A.; Sun, L.; Peng, Y.; Xue, W.; Su, T.; Cribb, M. Reconstructing 1-km-resolution high-quality PM2.5 data records from 2000 to 2018 in China: Spatiotemporal variations and policy implications. Remote Sens. Environ. 2021, 252, 112136.
  9. Xue, W.; Li, X.; Yang, Z.; Wei, J. Are House Prices Affected by PM2.5 Pollution? Evidence from Beijing, China. Int. J. Environ. Res. Public Health 2022, 19, 8461.
  10. Xue, W.; Zhang, J.; Zhong, C.; Ji, D.; Huang, W. Satellite-derived spatiotemporal PM2.5 concentrations and variations from 2006 to 2017 in China. Sci. Total Environ. 2020, 712, 134577.
  11. Liu, R.; Ma, Z.; Liu, Y.; Shao, Y.; Zhao, W.; Bi, J. Spatiotemporal distributions of surface ozone levels in China from 2005 to 2017: A machine learning approach. Environ. Int. 2020, 142, 105823.
  12. Li, K.; Jacob, D.J.; Liao, H.; Shen, L.; Zhang, Q.; Bates, K.H. Anthropogenic drivers of 2013–2017 trends in summer surface ozone in China. Proc. Natl. Acad. Sci. USA 2019, 116, 422–427.
  13. Wang, T.; Xue, L.; Brimblecombe, P.; Lam, Y.F.; Li, L.; Zhang, L. Ozone pollution in China: A review of concentrations, meteorological influences, chemical precursors, and effects. Sci. Total Environ. 2017, 575, 1582–1596.
  14. Amann, M.; Derwent, D.; Forsberg, B.; Hänninen, O.; Hurley, F.; Krzyzanowski, M.; de Leeuw, F.; Liu, S.J.; Mandin, C.; Schneider, J.; et al. Health Risks of Ozone from Long Range Transboundary Air Pollution; World Health Organization: Geneva, Switzerland, 2008.
  15. Turner, M.C.; Jerrett, M.; Pope, C.A., III; Krewski, D.; Gapstur, S.M.; Diver, W.R.; Beckerman, B.S.; Marshall, J.D.; Su, J.; Crouse, L.D.; et al. Long-term ozone exposure and mortality in a large prospective study. Am. Rev. Respir. Dis. 2016, 193, 1134–1142.
  16. Wang, Y.; Wild, O.; Chen, X.; Wu, Q.; Gao, M.; Chen, H.; Qi, Y.; Wang, Z. Health impacts of long-term ozone exposure in China over 2013–2017. Environ. Int. 2020, 144, 106030.
  17. Liang, S.; Li, X.; Teng, Y.; Fu, H.; Chen, L.; Mao, J.; Zhang, H.; Gao, S.; Sun, Y.; Ma, Z.; et al. Estimation of health and economic benefits based on ozone exposure level with high spatial-temporal resolution by fusing satellite and station observations. Environ. Pollut. 2019, 255, 113267.
  18. Geng, F.; Zhao, C.; Tang, X.; Lu, G.; Tie, X. Analysis of ozone and VOCs measured in Shanghai: A case study. Atmos. Environ. 2007, 41, 989–1001.
  19. Xu, W.; Xu, X.; Lin, M.; Lin, W.; Tarasick, D.; Tang, J.; Ma, J.; Zheng, X. Long-term trends of surface ozone and its influencing factors at the Mt Waliguan GAW station, China—Part 2: The roles of anthropogenic emissions and climate variability. Atmos. Chem. Phys. 2018, 18, 773–798.
  20. Lu, X.; Ye, X.; Zhou, M.; Zhao, Y.; Weng, H.; Kong, H.; Li, K.; Gao, M.; Zheng, B.; Lin, J.; et al. The underappreciated role of agricultural soil nitrogen oxide emissions in ozone pollution regulation in North China. Nat. Commun. 2021, 12, 5021.
  21. Zhang, J.; Wang, T.; Chameides, W.L.; Cardelino, C.; Kwok, J.; Blake, D.R.; Ding, A.; So, K.L. Ozone production and hydrocarbon reactivity in Hong Kong, southern China. Atmos. Chem. Phys. 2007, 7, 557–573.
  22. Liu, Y.; Wang, T. Worsening urban ozone pollution in China from 2013 to 2017—Part 1: The complex and varying roles of meteorology. Atmos. Chem. Phys. 2020, 20, 6305–6321.
  23. Tang, G.; Liu, Y.; Huang, X.; Wang, Y.; Hu, B.; Zhang, Y.; Song, T.; Li, X.; Wu, S.; Li, Q.; et al. Aggravated ozone pollution in the strong free convection boundary layer. Sci. Total Environ. 2021, 788, 147740.
  24. Zhang, J.; Wang, T.; Chameides, W.L.; Cardelino, C.; Blake, D.R.; Streets, D.G. Source characteristics of volatile organic compounds during high ozone episodes in Hong Kong, southern China. Atmos. Chem. Phys. 2008, 8, 4983–4996.
  25. Lu, X.; Zhang, L.; Chen, Y.; Zhou, M.; Zheng, B.; Li, K.; Liu, Y.; Lin, J.; Fu, T.M.; Zhang, Q. Exploring 2016–2017 surface ozone pollution over China: Source contributions and meteorological influences. Atmos. Chem. Phys. 2019, 19, 8339–8361.
  26. Li, J.; Wang, Z.; Chen, L.; Lian, L.; Li, Y.; Zhao, L.; Zhou, S.; Mao, X.; Huang, T.; Gao, H.; et al. WRF-chem simulations of ozone pollution and control strategy in petrochemical industrialized and heavily polluted Lanzhou city, northwestern China. Sci. Total Environ. 2020, 737, 139835.
  27. Visser, A.J.; Boersma, K.F.; Ganzeveld, L.N.; Krol, M.C. European NOx emissions in WRF-chem derived from OMI: Impacts on summertime surface ozone. Atmos. Chem. Phys. 2019, 19, 11821–11841.
  28. Wei, W.; Li, Y.; Ren, Y.; Cheng, S.; Han, L. Sensitivity of summer ozone to precursor emission change over Beijing during 2010–2015: A WRF-chem modeling study. Atmos. Environ. 2019, 218, 116984.
  29. Zhang, Q.; Tong, P.; Liu, M.; Lin, H.; Yun, X.; Zhang, H.; Tao, W.; Liu, J.; Wang, S.; Tao, S.; et al. A WRF-chem model-based future vehicle emission control policy simulation and assessment for the Beijing-Tianjin-Hebei region, China. J. Environ. Manag. 2019, 253, 109751.
  30. Miri, M.; Ghassoun, Y.; Dovlatabadi, A.; Ebrahimnejad, A.; Löwner, M.O. Estimate annual and seasonal PM1, PM2.5 and PM10 concentrations using land use regression model. Ecotoxicol. Environ. Saf. 2019, 174, 137–145.
  31. She, Q.; Choi, M.; Belle, J.H.; Xiao, Q.; Bi, J.; Huang, K.; Meng, X.; Geng, G.; Kim, J.; He, K.; et al. Satellite-based estimation of hourly PM2.5 levels during heavy winter pollution episodes in the Yangtze River Delta, China. Chemosphere 2020, 239, 124678.
  32. Xie, Y.; Wang, Y.; Bilal, M.; Dong, W. Mapping daily PM2.5 at 500 m resolution over Beijing with improved hazy day performance. Sci. Total Environ. 2019, 659, 410–418.
  33. Zhang, X.Y.; Zhao, L.M.; Cheng, M.M.; Chen, D.M. Estimating Ground-Level Ozone Concentrations in Eastern China Using Satellite-Based Precursors. IEEE Trans. Geosci. Remote Sens. 2020, 58, 4754–4763.
  34. Zhan, Y.; Luo, Y.; Deng, X.; Grieneisen, M.L.; Zhang, M.; Di, B. Spatiotemporal prediction of daily ambient ozone levels across China using random forest for human exposure assessment. Environ. Pollut. 2018, 233, 464–473.
  35. Ma, R.; Ban, J.; Wang, Q.; Zhang, Y.; Yang, Y.; He, M.Z.; Li, S.; Shi, W.; Li, T. Random forest model based fine scale spatiotemporal O3 trends in the Beijing-Tianjin-Hebei region in China, 2010 to 2017. Environ. Pollut. 2021, 276, 116635.
  36. Xue, T.; Zheng, Y.; Geng, G.; Xiao, Q.; Meng, X.; Wang, M.; Li, X.; Wu, N.; Zhang, Q.; Zhu, T. Estimating spatiotemporal variation in ambient ozone exposure during 2013–2017 using a data-fusion model. Environ. Sci. Technol. 2020, 54, 14877–14888.
  37. Tian, Y.; Xiang, X.; Juan, J.; Song, J.; Cao, Y.; Huang, C.; Li, M.; Hu, Y. Short-term Effect of Ambient Ozone on Daily Emergency Room Visits in Beijing, China. Sci. Rep. 2018, 8, 2775.
  38. Liu, X.; Li, Z.; Zhang, J.; Guo, M.; Lu, F.; Xu, X.; Deginet, A.; Liu, M.; Dong, Z.; Hu, Y.; et al. The association between ozone and ischemic stroke morbidity among patients with type 2 diabetes in Beijing, China. Sci. Total Environ. 2022, 818, 151733.
  39. Wei, J.; Huang, W.; Li, Z.; Xue, W.; Peng, Y.; Sun, L.; Cribb, M. Estimating 1-km-resolution PM2.5 concentrations across China using the space-time random forest approach. Remote Sens. Environ. 2019, 231, 111221.
  40. Jin, L.; Wang, B.; Shi, G.; Seyler, B.C.; Qiao, X.; Deng, X.; Jiang, X.; Yang, F.; Zhan, Y. Impact of China’s recent amendments to air quality monitoring protocol on reported trends. Atmosphere 2020, 11, 1199.
  41. Wu, Y.; Di, B.; Luo, Y.; Grieneisen, M.L.; Zeng, W.; Zhang, S.; Deng, X.; Tang, Y.; Shi, G.; Yang, F.; et al. A robust approach to deriving long-term daily surface NO2 levels across China: Correction to substantial estimation bias in back-extrapolation. Environ. Int. 2021, 154, 106576.
  42. Grell, G.A.; Peckham, S.E.; Schmitz, R.; McKeen, S.A.; Frost, G.; Skamarock, W.C.; Eder, B. Fully coupled “online” chemistry within the WRF model. Atmos. Environ. 2005, 39, 6957–6975.
  43. Guenther, A.; Karl, T.; Harley, P.; Wiedinmyer, C.; Palmer, P.I.; Geron, C. Estimates of global terrestrial isoprene emission using MEGAN. Atmos. Chem. Phys. 2006, 6, 3181–3210.
  44. Liu, P.; Song, H.; Wang, T.; Wang, F.; Li, X.; Miao, C.; Zhao, H. Effects of meteorological conditions and anthropogenic precursors on ground-level ozone concentrations in Chinese cities. Environ. Pollut. 2020, 262, 114366.
  45. Calfapietra, C.; Fares, S.; Manes, F.; Morani, A.; Sgrigna, G.; Loreto, F. Role of biogenic volatile organic compounds (BVOC) emitted by urban trees on ozone concentration in cities: A review. Environ. Pollut. 2013, 183, 71–80.
  46. Skamarock, W.C.; Klemp, J.B.; Dudhia, J.; Gill, D.O.; Barker, D.M.; Wang, W.; Powers, J.G. A Description of the Advanced Research WRF Version 2; Mesoscale and Microscale Meteorology Division, National Center for Atmospheric Research: Boulder, CO, USA, 2005.
  47. Noh, Y.; Hong, S.-Y.; Dudhia, J. A New Vertical Diffusion Package with an Explicit Treatment of Entrainment Processes. Mon. Weather Rev. 2006, 134, 2318–2341.
  48. Chen, F.; Dudhia, J. Coupling an Advanced Land Surface–Hydrology Model with the Penn State–NCAR MM5 Modeling System. Part I: Model Implementation and Sensitivity. Mon. Weather Rev. 2001, 129, 569–585.
  49. Grell, G.A.; Dévényi, D. A generalized approach to parameterizing convection combining ensemble and data assimilation techniques. Geophys. Res. Lett. 2002, 29, 38-1–38-4.
  50. Chou, M.D.; Suarez, M.J. A Solar Radiation Parameterization for Atmospheric Studies; NASA: Washington, DC, USA, 1999.
  51. Mlawer, E.J.; Taubman, S.J.; Brown, P.D.; Iacono, M.J.; Clough, S.A. Radiative transfer for inhomogeneous atmospheres: RRTM, a validated correlated-k model for the longwave. J. Geophys. Res. Atmos. 1997, 102, 16663–16682.
  52. Zaveri, R.A.; Peters, L.K. A new lumped structure photochemical mechanism for large-scale applications. J. Geophys. Res. Atmos. 1999, 104, 30387–30415. [Google Scholar] [CrossRef]
  53. Breiman, L. Random forests. Mach. Learn. 2001, 45, 5–32. [Google Scholar] [CrossRef] [Green Version]
  54. Xue, W.; Wei, J.; Zhang, J.; Sun, L.; Che, Y.; Yuan, M.; Hu, X. Inferring Near-Surface PM2.5 Concentrations from the VIIRS Deep Blue Aerosol Product in China: A Spatiotemporally Weighted Random Forest Model. Remote Sens. 2021, 13, 505. [Google Scholar] [CrossRef]
  55. Neter, J.; Kutner, M.H.; Nachtsheim, C.J.; Wasserman, W. Applied Linear Statistical Models. Technometrics 1997, 39, 342.
  56. Im, U.; Markakis, K.; Poupkou, A.; Melas, D.; Unal, A.; Gerasopoulos, E.; Daskalakis, N.; Kindap, T.; Kanakidou, M. The impact of temperature changes on summer time ozone and its precursors in the Eastern Mediterranean. Atmos. Chem. Phys. 2011, 11, 3847–3864.
  57. He, J.; Gong, S.; Yu, Y.; Yu, L.; Wu, L.; Mao, H.; Song, C.; Zhao, S.; Liu, H.; Li, X.; et al. Air pollution characteristics and their relation to meteorological conditions during 2014–2015 in major Chinese cities. Environ. Pollut. 2017, 223, 484–496.
  58. Liu, X.H.; Zhang, Y.; Cheng, S.H.; Xing, J.; Zhang, Q.; Streets, D.G.; Jang, C.; Wang, W.X.; Hao, J.M. Understanding of regional air pollution over China using CMAQ, part I performance evaluation and seasonal variation. Atmos. Environ. 2010, 44, 2415–2426.
  59. Hu, X.; Zhang, J.; Xue, W.; Zhou, L.; Che, Y.; Han, T. Estimation of the Near-Surface Ozone Concentration with Full Spatiotemporal Coverage across the Beijing-Tianjin-Hebei Region Based on Extreme Gradient Boosting Combined with a WRF-Chem Model. Atmosphere 2022, 13, 632.
  60. Wei, J.; Li, Z.; Li, K.; Dickerson, R.R.; Pinker, R.T.; Wang, J.; Liu, X.; Sun, L.; Xue, W.; Cribb, M. Full-coverage mapping and spatiotemporal variations of ground-level ozone (O3) pollution from 2013 to 2020 across China. Remote Sens. Environ. 2022, 270, 112775.
Figure 1. Flowchart of the two-stage model.
Figure 2. Density scatter plots of the sample-based 10-CV results from 08:00 local time (0800 LT) to 18:00 local time (1800 LT) across the BTH region in 2018: (a) all hourly records from 0800 to 1800 LT; and (b–l) sample-based 10-CV values for each hour from 0800 to 1800 LT. The black lines denote 1:1 lines and the red lines denote linear regression fitting lines.
Figure 3. Site-scale evaluation of the estimated hourly O3 concentration in 2018 across the BTH region. The upper and lower rows indicate the sample- and station-based 10-CV results, respectively. The columns from left to right indicate R2, RMSE, MAE and hour of occurrence of the highest 10-CV R2 value.
Figure 4. Time series of the consistency between the two-stage model-derived concentrations and surface measurements in 2018 across China: (a) R2; (b) RMSE; and (c) MAE.
Figure 5. Spatial distributions of the seasonal MDA8 O3 concentration (a–d) and proportion of O3 pollution time (e–h) in 13 cities across the BTH region in 2018: (a,e) spring; (b,f) summer; (c,g) autumn; and (d,h) winter.
Figure 6. Daily MDA8 O3 concentration (a) and IAQI of the daily average O3 concentration (b) in 13 cities and the whole BTH region.
Figure 7. O3 pollution level in each city of the BTH region in 2018.
Table 1. Comparison of model performance between our two-stage model (WRF + RF) and other widely used traditional models for air pollutant concentration estimation. The first four metric columns give the sample-based 10-CV results and the last four give the station-based 10-CV results; RMSE and MAE are in µg m−3.

| Model | Sample R2 | Sample slope | Sample RMSE | Sample MAE | Station R2 | Station slope | Station RMSE | Station MAE |
|---|---|---|---|---|---|---|---|---|
| MLR | 0.63 | 0.63 | 30.37 | 29.85 | 0.62 | 0.62 | 31.32 | 31.01 |
| GAM | 0.69 | 0.66 | 27.41 | 20.06 | 0.65 | 0.61 | 29.57 | 24.88 |
| GWR | 0.72 | 0.68 | 25.86 | 18.43 | 0.69 | 0.65 | 27.50 | 20.01 |
| LME | 0.81 | 0.79 | 20.21 | 14.78 | 0.79 | 0.77 | 22.03 | 17.27 |
| LME + GWR | 0.87 | 0.85 | 18.67 | 13.26 | 0.85 | 0.83 | 21.20 | 15.11 |
| WRF | 0.67 | 0.69 | 28.62 | 22.41 | 0.65 | 0.66 | 29.07 | 21.74 |
| RF | 0.91 | 0.88 | 15.84 | 11.72 | 0.87 | 0.85 | 20.02 | 14.53 |
| WRF + RF | 0.94 | 0.92 | 14.58 | 9.96 | 0.90 | 0.89 | 19.18 | 13.32 |
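
The four metrics in Table 1 and the difference between sample-based and station-based 10-CV can be illustrated with a minimal sketch. It assumes that R2 is the squared Pearson correlation between observed and estimated values and that the slope comes from an ordinary least-squares fit of estimates on observations; neither definition is stated in this section. The function names (`cv_metrics`, `sample_folds`, `station_folds`) and the station identifiers are hypothetical and are not the authors' code.

```python
import math
import random


def cv_metrics(observed, estimated):
    """Return R2, slope, RMSE and MAE for paired observed/estimated values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_ee = sum((e - mean_e) ** 2 for e in estimated)
    s_oe = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    r2 = s_oe ** 2 / (s_oo * s_ee)  # squared Pearson correlation (assumed definition)
    slope = s_oe / s_oo             # OLS slope of estimates regressed on observations
    rmse = math.sqrt(sum((e - o) ** 2 for o, e in zip(observed, estimated)) / n)
    mae = sum(abs(e - o) for o, e in zip(observed, estimated)) / n
    return {"R2": r2, "slope": slope, "RMSE": rmse, "MAE": mae}


def sample_folds(station_ids, k=10, seed=0):
    """Sample-based CV: shuffle individual records and deal them into k folds."""
    order = list(range(len(station_ids)))
    random.Random(seed).shuffle(order)
    folds = [0] * len(station_ids)
    for position, record in enumerate(order):
        folds[record] = position % k
    return folds


def station_folds(station_ids, k=10, seed=0):
    """Station-based CV: deal whole stations into k folds, so every record
    from a held-out station is excluded from training."""
    stations = sorted(set(station_ids))
    random.Random(seed).shuffle(stations)
    fold_of = {station: i % k for i, station in enumerate(stations)}
    return [fold_of[station] for station in station_ids]
```

In the sample-based split, records from one station can land in both the training and validation folds, whereas the station-based split holds out entire stations and therefore tests spatial prediction at unmonitored locations. This is consistent with the station-based scores in Table 1 being somewhat lower than the sample-based ones.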
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Share and Cite

MDPI and ACS Style

Xue, W.; Zhang, J.; Hu, X.; Yang, Z.; Wei, J. Hourly Seamless Surface O3 Estimates by Integrating the Chemical Transport and Machine Learning Models in the Beijing-Tianjin-Hebei Region. Int. J. Environ. Res. Public Health 2022, 19, 8511. https://doi.org/10.3390/ijerph19148511