1. Introduction
Numerous studies have examined the association between exposure to traffic-related air pollution and children’s respiratory health [1,2,3,4]. Long-term exposure to air pollution, indicated by concentrations of nitrogen oxides (NO, NO2, NOx), ozone (O3), particulate matter with aerodynamic diameter less than 2.5 µm (PM2.5), and particulate matter with aerodynamic diameter less than 10 µm (PM10), has been shown to lead to reduced lung development in children [5]. Fortunately, decreases in air pollution in Southern California over the past 17 years have led to significant reductions in these detrimental effects [6].
The air pollution exposures used in the aforementioned studies were estimated from concentrations measured at national- and state-operated fixed-site monitors [7]. For example, in a longitudinal assessment of air quality and lung development, Gauderman et al. [6] used concentrations of nitrogen dioxide (NO2), O3, PM2.5, and PM10 from one monitor within each community to determine exposures. With this approach, the 120 to 300 study subjects residing in each community were assigned the same exposure. Spatial statistical techniques such as kriging, smoothing, and land use regression have been used to incorporate additional information (e.g., traffic, population density, elevation, land cover, and other geographic data) to characterize the spatial relationships in fixed-site monitoring data and interpolate concentrations to the unmonitored locations where there are health data [8,9]. While these approaches are valuable for generating exposures with greater spatial coverage, it has been shown that if their prediction performance is poor, subsequent epidemiological studies can yield severe biases and underestimation of standard errors in the health effects estimates [10].
Advances in using satellite observations of aerosol optical depth (AOD) to estimate ground-level concentrations of particulate matter (PM) air pollution have been extremely valuable in improving the spatial and temporal coverage of exposure estimates [11,12,13,14,15]. Among the satellite instruments most commonly used for PM estimation are the Moderate Resolution Imaging Spectroradiometer (MODIS) on board the NASA Earth Observing System (EOS) Terra and Aqua satellites, whose AOD is retrieved with the Multi-Angle Implementation of Atmospheric Correction (MAIAC) algorithm [16], and the Multi-angle Imaging SpectroRadiometer (MISR) on board the Terra satellite [17]. Recent algorithms applied to observations from these instruments provide global, near-daily AOD at fine grid resolutions (1 km for MAIAC, 4.4 km for MISR) [16,18].
Some studies have combined AOD from MODIS and MISR to derive PM2.5 concentrations [19] and, more recently, PM2.5 speciation concentrations [20]. MISR, given its configuration of nine cameras and four spectral bands, can differentiate aerosol size and type, resulting in fractionated AOD [21]. In a recent study over Southern California, we reliably estimated PM2.5 with MISR 4.4-km resolution AOD small+medium, and PM10 with AOD large, using generalized additive models (GAMs) [22]. The MISR AOD-derived PM concentrations were well correlated with EPA monitoring site data, as confirmed by leave-one-site-out cross validation (CV) (PM2.5 CV …, PM10 CV …). In the same region, speciated PM2.5 (sulfate, SO4; nitrate, NO3; organic carbon, OC; and elemental carbon, EC) was estimated using GAMs from 8 MISR component fractions combined with meteorology and geographic characteristics [23].
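The leave-one-site-out cross validation mentioned above can be illustrated with a minimal sketch. It is not the procedure used in the cited studies: a one-predictor least-squares line stands in for the GAMs, the function names (`fit_line`, `leave_one_site_out_r2`) are hypothetical, the toy records are invented, and the CV R2 is assumed to be computed as 1 - SSres/SStot on the pooled held-out predictions.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x (stand-in for a GAM)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def leave_one_site_out_r2(records):
    """records: list of (site_id, aod, pm) tuples.

    Each site is held out in turn, the model is fit on the remaining
    sites, and predictions for the held-out site are pooled. Returns
    1 - SS_res / SS_tot over all pooled held-out predictions.
    """
    sites = sorted({site for site, _, _ in records})
    observed, predicted = [], []
    for held_out in sites:
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        intercept, slope = fit_line([r[1] for r in train],
                                    [r[2] for r in train])
        for _, aod, pm in test:
            observed.append(pm)
            predicted.append(intercept + slope * aod)
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Invented toy data: three sites, PM exactly linear in AOD.
toy_records = [
    ("A", 0.10, 5.0), ("A", 0.20, 8.0),
    ("B", 0.15, 6.5), ("B", 0.30, 11.0),
    ("C", 0.05, 3.5), ("C", 0.25, 9.5),
]
print(round(leave_one_site_out_r2(toy_records), 6))
```

Holding out whole sites, rather than random observations, tests how well the model transfers to locations it has never seen, which is the situation faced when predicting at residences without monitors.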
In a simulation study, high-resolution exposure estimates derived from satellite AOD were found to produce less biased acute and chronic health effects estimates with smaller standard errors than did exposure estimates derived from kriging PM2.5 concentrations from fixed-site monitors [10]. Satellite-derived PM2.5 concentrations have been instrumental in studies of the global burden of disease [24,25]. A few epidemiological studies of smaller cohorts have used satellite-derived PM2.5 to estimate residential exposures in longitudinal analyses of children’s health effects [26,27,28].
In this study, we derived daily PM2.5 and PM2.5 speciation (SO4, NO3, EC, dust) exposures from 2000–2018 over the state of California by applying machine learning approaches to ground-level air quality measurements linked with the high-dimensional 4.4-km MISR AOD products and mixtures. Estimated annual average concentrations were then assigned to the residences of children in 8 Southern California communities to examine the chronic effects of exposure to PM2.5 and the aforementioned PM2.5 components on lung function. This study is the first to examine the differential effects of satellite-derived PM2.5 speciation on children’s respiratory health.
4. Discussion
In this study, we used satellite observations of AOD, characterized by size, shape, and absorption properties as well as fractionated into 74 mixtures, to estimate PM2.5 and select PM2.5 chemical components. We then incorporated these estimates into an epidemiological assessment of their association with children’s lung function. In terms of exposure estimation, MISR AOD products resulted in better and more robust estimates than did AOD mixtures, except for dust. Non-linear models (GBM, RF, and SVM) performed better than linear models (Ridge and LASSO), which was consistent with previous studies where linear models were inadequate in explaining the relationship between AOD and ground-monitored PM [22,35,46]. Although MISR aerosol data have coarser temporal and spatial resolution than MAIAC (every 3–5 days vs. daily and 4.4 km vs. 1 km, respectively), our models achieved high prediction performance (Table 1) using MISR-specific data products on size, shape, and absorption, which proved vital in the prediction models. At least two MISR AOD products were among the five most important features for predicting PM2.5, SO4, and NO3, and the ten most important features for predicting dust were all AOD mixtures (Figure 2 and Figure 3).
Our PM2.5 prediction performance was similar to that of Sorek-Hamer et al. [46], who modeled PM2.5 using AOD data from MODIS (Collection 5 Level 2) and the Ozone Monitoring Instrument (OMI) at 10-km resolution over the Central Valley in California. In Southern California, we improved upon previous work by Franklin et al. [22], which predicted PM2.5, SO4, and NO3 using meteorological data from NOAA weather stations that did not provide the spatial coverage of gridMET data. Surface shortwave radiation provided by gridMET was also among the most important predictors for PM2.5, NO3, and EC (Figure 2 and Figure 3). Our PM2.5, SO4, NO3, and EC models performed comparably to those of Meng et al. [23], who reconstructed fractional AOD using the AOD mixtures (V22), whereas we relied on MISR AOD products (V23).
One limitation of our PM2.5 speciation prediction models is the scarcity of data. The number of CSN sites in California increased from 3 in 2000 to 19 in 2013 and then decreased to 16 in 2018 (by comparison, California PM2.5 mass sites increased from 95 in 2000 to 157 in 2018), so spatial coverage was restricted (Figure A1). Furthermore, the locations of these sparse monitors are not necessarily representative of the population density of Southern California. To help mitigate this problem, we used the coordinates of MISR pixels, which lacked a fixed grid, instead of those of monitoring sites as geospatial predictors, thereby introducing additional spatial variability. We also attempted prediction models for PM10 and SO2 (detailed results not reported here) to compare with our previous work over Mongolia [35]. While PM10 models performed about the same, i.e., average test …, SO2 models over California performed much worse, with average test … (vs. test … over Ulaanbaatar). Poor prediction performance for SO2 was likely due to much lower concentrations of SO2 in California (mean SO2 … ppb in 2000–2018) compared to Ulaanbaatar (9.7 ppb in 2008–2017), where SO2 is a more dominant source of PM. For the epidemiological purposes of the current study, we focused on models predicting PM2.5 and its chemical components.
This study is unique in estimating air pollution exposure specifically to the residence and follow-up period of each subject. Previously, exposures were assigned using annual means from one central air pollution monitoring site for each study community [5,6,42], even though the children might have lived far from these sites. Furthermore, the follow-up period spanned about 211 days, yet the annual means of central-site air pollutants for each community were calculated using a fixed time window. Leveraging MISR aerosol and gridMET meteorological data, we addressed these limitations by assigning exposures that were spatially within 4.4 km of where each child lived and temporally specific to the 12 months prior to each child’s assessment visit. Nevertheless, our exposure prediction models are not without unexplained residual variance; our best models had CV R2 from 0.53 (dust) to 0.71 (SO4, NO3). As noted by Alexeeff et al. [10], there can be 1–5% upward bias in subsequent health effects estimates when exposure predictions have performance statistics in the range we observed, and their standard errors may be underestimated. These issues stem from imperfect exposure models and are difficult to mitigate, but they are worth keeping in mind when interpreting our epidemiological results.
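The residence- and visit-specific assignment described above can be sketched as follows. This is an illustrative outline under stated assumptions, not the study's code: the function names (`haversine_km`, `assign_exposure`) and the layout of the prediction records are hypothetical, great-circle distance to the pixel centre is assumed as the spatial criterion, the 12-month window is approximated as 365 days, and all values are invented.

```python
import math
from datetime import date, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def assign_exposure(home, visit, predictions, max_km=4.4, window_days=365):
    """Mean predicted concentration near a residence before a visit.

    home: (lat, lon) of the residence.
    visit: date of the assessment visit.
    predictions: list of (lat, lon, date, concentration) tuples
        (hypothetical layout for daily pixel-level predictions).
    Returns None when no prediction falls inside the window and radius.
    """
    start = visit - timedelta(days=window_days)
    values = [
        conc
        for lat, lon, day, conc in predictions
        if start <= day < visit
        and haversine_km(home[0], home[1], lat, lon) <= max_km
    ]
    return sum(values) / len(values) if values else None


# Invented example: two usable nearby predictions, one too old, one too far.
home = (34.05, -118.25)
visit = date(2010, 6, 1)
predictions = [
    (34.06, -118.25, date(2010, 1, 15), 10.0),  # near, inside window
    (34.06, -118.25, date(2009, 12, 1), 14.0),  # near, inside window
    (34.06, -118.25, date(2008, 1, 1), 100.0),  # near, outside window
    (34.50, -118.25, date(2010, 2, 1), 50.0),   # inside window, too far
]
print(assign_exposure(home, visit, predictions))
```

Anchoring the averaging window to each child's visit date, rather than to a calendar year, is what makes the exposure temporally specific in the sense described in the text.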
While this is not the first study using AOD-derived PM2.5 concentration in an epidemiological context, it is the first examining satellite-derived PM2.5 speciation. Previous studies of AOD-derived PM2.5 include Rice et al. [27], who found that each 2 µg/m3 increase in AOD-derived PM2.5 was associated with a 28 mL (… to 0.2 mL) decrease in forced vital capacity (FVC) and higher odds of forced expiratory volume in 1 second (FEV1) being less than 80% predicted (OR …, 1.03 to 1.93). In another study, AOD-derived PM2.5 concentrations were associated with an increased rate of asthma onset (HR …, 1.28 to 1.33) in Quebec [28].
Similar to previous evaluations of the CHS [3,5,6], biological characteristics (age, gender, race/ethnicity, height, height squared, BMI, and BMI squared) were significantly associated with both measurements of lung function (Table A3). With these adjustments, several MISR-derived estimates of air pollutants were able to explain the residual differences in lung function measurements among the children. Another strength of this study is identifying the effect of specific PM2.5 chemical components on lung function. While MISR-derived PM2.5 was significantly associated with decreases in FEV1, its effect, measured as the difference in FEV1 between the highest and lowest exposure level for each pollutant, is smaller than those of SO4 and dust (Table 3). Similarly, although FVC was only marginally statistically significantly associated with MISR-derived PM2.5, its associations with SO4 and NO3 were clinically significant. In California, secondary aerosols including nitrate and sulfate have been shown to be the most abundant contributors to ambient PM2.5, with nitrate accounting for as much as 55% of the total mass [47]. Geologic dust can also contribute up to 20% of the mass in summer in more arid regions of Southern California. Importantly, we were able to distinguish that these PM2.5 species had differentially stronger associations with children’s FEV1 and FVC.
Urman et al. [3], who examined the cross-sectional effect of central-site air pollution on lung function in the same cohort but at an earlier visit when the children were 11–12 years old, found central-site PM2.5 to be significantly associated with both log-transformed FEV1 and FVC. In our study, we did not find central-site PM2.5 to be significantly associated with either outcome. We did find MISR-derived PM2.5 to be significantly associated with log-transformed FEV1, with a similar effect size, but not with log-transformed FVC. Children exposed to the highest level of MISR-derived PM2.5 were on average … (95% CI: …) lower in FEV1 than those exposed to the lowest level. In the same CHS cohort and during the same follow-up period, Franklin and Fruin [4] found a significant association between NOx and FVC when adjusted for traffic-related noise exposure. We found a similarly significant relationship between NO3 and FVC, where an IQR increase in NO3 (0.64 µg/m3) is associated with a 53 mL decrease in FVC (95% CI: …).
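The two effect scales used above (a change per IQR increase and a percent difference from a log-transformed outcome) can be related with a short sketch. It assumes a linear exposure term in the regression, so that a per-IQR effect is the per-unit coefficient times the IQR, and that a coefficient on a log-transformed outcome converts to a percent difference as 100 * (exp(beta * delta) - 1). The function names are hypothetical, and the log-scale coefficient in the second call is an invented value, not a result from this study.

```python
import math


def effect_per_unit(effect_per_iqr, iqr):
    """Back out the per-unit coefficient from a per-IQR effect.

    Assumes the exposure enters the model linearly.
    """
    return effect_per_iqr / iqr


def percent_difference(log_beta, delta):
    """Percent difference in the outcome for an exposure contrast.

    log_beta: coefficient per unit exposure on the log-outcome scale.
    delta: exposure contrast (e.g., highest minus lowest level).
    """
    return 100.0 * (math.exp(log_beta * delta) - 1.0)


# Reported above: a 53 mL decrease in FVC per IQR (0.64 ug/m3) of NO3.
per_unit = effect_per_unit(-53.0, 0.64)
print(round(per_unit, 2), "mL per ug/m3")

# Hypothetical log-scale coefficient and contrast, for illustration only.
print(round(percent_difference(-0.01, 5.0), 2), "% difference")
```

Scaling by the IQR puts pollutants with very different concentration ranges on a comparable footing, which is why per-IQR effects are convenient when comparing PM2.5 species against each other.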