1. Introduction
Since its outbreak in January, the COVID-19 pandemic has severely impacted socioeconomic activities throughout the world. As of 3 July, there have been 10,719,946 confirmed cases globally, including 517,337 deaths. The United States is the leading country with 2,671,220 confirmed cases and 127,858 deaths [
1]. Since the beginning of April, the U.S. has been the center of the COVID-19 pandemic, and the number of cases is still increasing. Social distancing is one of the most effective ways to reduce COVID-19 infection, but due to residential segregation (the separation of people based on income and/or race), some individuals from specific ethnic minority groups cannot practice social distancing. They often live in overcrowded urban housing areas, which makes physical distancing and self-isolation difficult. This leads to an increased risk for the spread of COVID-19 [
2]. Moreover, socioeconomic inequities frequently impact health and healthcare access, resulting in a higher burden of disease and mortality in vulnerable social groups. Therefore, it is necessary to integrate socioeconomic information and disease statistics to help analyze and understand the spread of COVID-19.
Many research findings have highlighted racial disparities in the transmission of COVID-19. Across the country, deaths caused by COVID-19 are disproportionately high among African Americans [
3], while Chicago and New York City reported greater COVID-19 mortality among Latinos [
4]. To utilize more detailed information on the racial and socioeconomic disparities, Raifman [
5] used the 2018 Behavioral Risk Factor Surveillance System (BRFSS) to estimate the proportion of adults that meets at least one of the many CDC (Centers for Disease Control and Prevention) criteria for risk of severe illness from COVID-19. The analysis is categorized by age group, race, and household income. Results show that people who are Black, American Indian, or live in low-income households are more likely to have conditions associated with increased risk of illness from COVID-19, compared to those who are White or have a higher income, respectively. Anyane-Yeboa, et al. [
6] investigated the racial disparity of infection and deaths caused by COVID-19 in the U.S. and assessed the rates of COVID-19 infection and death by race and ethnicity with information obtained from the Department of Health websites of sixteen states. They stated that Black patients had higher rates of infection and death from COVID-19, which is consistent with the findings from some other individual states. Laurencin and McClinton [
7] presented an overview of the racial and ethnic distribution of COVID-19 confirmed cases and fatalities in the state of Connecticut to demonstrate the unique challenges among Black and Brown communities. At the metropolitan level, Yu et al. [
8] examined the growth rate of both COVID-19 confirmed cases and deaths in the first 30-day period of the outbreak within 100 of the largest metropolitan cities. They observed that the growth curve was particularly steep in counties that are located in cities with high economic disparity and residential segregation of Blacks and Hispanics.
One fundamental cause of racial disparities in health is racial residential segregation, which refers to the physical isolation of one racial group from others [
9]. This segregation can affect health through concentrated poverty, the quality of the neighborhood environment, and the individual socioeconomic attainment of minorities [
10]. Health experts believe that person-to-person and community transmission are the most common ways that COVID-19 spreads. Thus, neighborhoods with concentrated poverty and over-occupancy of housing units are at a higher risk of COVID-19 infection. Furthermore, minorities under the poverty line are more likely to work in industries that have remained open during non-essential business closures [
5]. Thus, they have greater exposure to COVID-19 and also contribute to the transmission of COVID-19. Given the influence of racial residential segregation on the socioeconomics of neighborhoods, it is necessary to conduct a systematic investigation of whether segregation has a direct impact on COVID-19 transmission.
In addition, adequate access to affordable testing sites and hospitals is critically important to identify potential carriers of COVID-19. Because testing is the best way to support evidence-based decisions to slow down the disease, the World Health Organization has strongly endorsed the “test, test, test” approach for countries around the world. Appropriate public health measures, such as self-isolation and hospitalization, must also be taken to contain the pandemic. Many states, with the support of the federal government, have expanded access to COVID-19-related health care services by setting up more mobile testing sites, increasing the number of testing providers, or providing telehealth to cover as many people as possible. However, these testing sites are more likely to be distributed in the well-off suburbs of White-dominant neighborhoods rather than in low-income minority neighborhoods [
11]. Such disparities in health care access are becoming worse among socioeconomically disadvantaged groups due to the lack of health insurance, access to transportation, and individual awareness of the disease severity. This is demonstrated by the fact that the Black population, an estimated 25% of Boston’s residents, accounts for 41% of COVID-19 cases in the city [
12]. Additionally, the Hispanic communities are reported to be at higher risk of infection and death from COVID-19. Hence, a reliable and accurate measure of access to testing sites could help us understand which areas and what demographic groups suffer from inadequate access and the best testing strategies to adopt in efforts to mitigate the COVID-19 pandemic.
According to previous studies, sociodemographic and economic, as well as environmental features, are also important factors in affecting the spread of COVID-19 disease. For example, in the analysis conducted by Mollalo et al. [
13], they considered income inequality, median household income, the percentage of nurse practitioners, and the percentage of the Black female population when modeling the COVID-19 incidence at the U.S. county level. The results demonstrated that areas with a high incidence of COVID-19 usually have high income inequality and high median household income [
13]. Liu [
14] collected the number of laboratory-confirmed COVID-19 cases in 312 cities in China, along with a series of sociodemographic variables such as distance to the epicenter, the total length of built urban metro lines, urban area, population density, the annual quantity of wastewater discharged, the amount of residential garbage collected and transported, per capita public recreational green space, the daily highest temperature, and capital city status. Based on these data, the factors affecting COVID-19 transmission were studied from the urban perspective. The statistically significant results revealed that the amount of residential garbage collected and transported, in conjunction with the annual quantity of wastewater discharged, could increase the number of confirmed COVID-19 infections [
14].
In the midst of COVID-19 mitigation efforts, making inferences at finer scales, such as the sub-county, census tract, or block group, may produce more accurate results than at coarse levels, such as the county or above, when exploring regional patterns and informing local policymakers about the efficient allocation of resources and personnel. In Chicago, more than 50% of COVID-19 cases and nearly 70% of COVID-19 deaths involve Black individuals. These deaths are concentrated in just five neighborhoods in the city’s South Side. Thus, it is critical to quantify the interaction between the various factors and disease statistics at a finer scale. Analyzing regional patterns and raising awareness of COVID-19 dangers and preventative measures will benefit the most affected communities [
15]. Quantifying disparities in risk is important for allocating resources to prevent, identify, and treat COVID-19-related severe illness and limit diverging outcomes for vulnerable subgroups.
The spatial autocorrelation effect is an important issue when modeling geographic distributions of various events, such as crime, housing, and other human activities. Recent studies have demonstrated that health care events are also affected by spatial autocorrelation. Spatial autocorrelation is an instantiation of the first law of geography addressed by Waldo Tobler [
16]. That is, all things are related, but near things are more related than distant things. It suggests the observations at different locations are not independent or that the spatial pattern is not random. In other words, it captures the association of the observed variables in one location and the neighboring areas. The closer the geographic phenomena are across space, the more similar they are. For example, high cancer mortality in one area is more likely to be bordered by neighborhoods with high cancer mortality. Because of the spatial autocorrelation effect, some traditional statistical models assuming the independence and randomness of variables are not suitable for studying public health events that are attached to geography, such as the COVID-19 incidence in this study, as they may bias the estimation results [
17]. Following the line of the studies examining the spatial autocorrelation effect, two commonly used spatial regression models, the spatial lag model (SLM) and the spatial error model (SEM) [
18] are adopted in this study.
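For reference, the two specifications are commonly written as shown below. This is a sketch in standard textbook notation rather than the authors’ own formulation: the symbols $y$ (vector of incidence rates), $X$ (matrix of explanatory variables), $W$ (spatial weights matrix), $\beta$, $\rho$, $\lambda$, $u$, and $\varepsilon$ are assumptions, since this section does not define them.

```latex
\begin{align}
\text{SLM:}\quad & y = \rho W y + X\beta + \varepsilon, \\
\text{SEM:}\quad & y = X\beta + u, \qquad u = \lambda W u + \varepsilon,
\end{align}
% rho: spatial lag coefficient; lambda: spatial error coefficient;
% epsilon: independent, identically distributed error term.
```

In the SLM, spatial dependence enters through the outcomes of neighboring units ($Wy$); in the SEM, it enters through the error term ($Wu$). Setting $\rho = 0$ or $\lambda = 0$ reduces either form to the ordinary least squares model.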
This study aims to identify the impact of racial segregation and testing site accessibility on the COVID-19 incidence rate in cities/towns of Massachusetts. To the authors’ knowledge, this is the first study to apply spatial analysis methods to neighborhood-level COVID-19 data in the U.S. The objectives are to: (1) evaluate the racial segregation of minorities, such as Hispanics, Non-Hispanic Black Americans, and Non-Hispanic Asians, and the socioeconomic characteristics in Massachusetts; (2) assess the spatial accessibility of testing sites across different sociodemographic groups in the study area; (3) investigate whether a higher COVID-19 incidence rate is positively associated with highly segregated areas for ethnic minorities; (4) explore whether testing sites are well distributed for COVID-19 testing; and (5) examine the association between socioeconomic characteristics and the COVID-19 incidence rate.
4. Results
4.1. Exploratory Analysis
The COVID-19 incidence rate varies across different sociodemographic groups at the city/town level. As shown in
Table 2, the Non-Hispanic Black population had the lowest mean value of 2.75% ranging from 0 to 43.95%, followed by Asian and Hispanic populations with mean values of 3.6% and 4.83%. The White population had the highest mean value of 91.06% ranging from 39.27% to 100%, reflecting the dominance of White people in Massachusetts. These four groups experienced very similar COVID-19 incidence rates across the first three quantiles, but the rate differences enlarged in the fourth quantile. Using the first quantile as a reference,
Figure 2 shows the ratio of the COVID-19 incidence rate in each quantile to that in the first quantile. All four groups initially exhibited a smooth increase with minor differences, but soon after, the ratio increased dramatically to a range of 9 to 16. Due to the large number of confirmed COVID-19 cases in Boston city, we recalculated the COVID-19 incidence rate in the fourth quantile outside of Boston city in
Table 2 and
Figure 2. The rate ratios of the Asian and White groups were consistent with those in the entire study area, while the Non-Hispanic Black and Hispanic groups exhibited a small increase in incidence rate, which corroborates the concentrated segregation indices of the Non-Hispanic Black and Hispanic groups in Boston city in
Figure 3.
Racial residential segregation was estimated by the isolation index described in
Section 2.
Figure 3 illustrates the segregations of the minorities in Massachusetts, including Hispanic, Non-Hispanic Black, and Asian. Higher values indicate a greater possibility that people of the same race are living as neighbors. As shown in
Figure 3, Hispanic and Non-Hispanic Blacks in Boston had higher segregation indexes than Asians, as Asians were more likely to live together in Middlesex and Norfolk. Middlesex County is amongst the top 25 most populated counties with the highest household income.
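The isolation index itself is defined in Section 2, which is not part of this excerpt. The sketch below assumes the commonly used form of the index, in which each sub-unit contributes its share of the group’s total population multiplied by the group’s share of that sub-unit’s population. The function name `isolation_index` and the sample counts are hypothetical and serve only as an illustration.

```python
def isolation_index(group_counts, total_counts):
    """Isolation index, assumed form: sum_i (x_i / X) * (x_i / t_i).

    group_counts: minority population x_i in each sub-unit
    total_counts: total population t_i in each sub-unit
    """
    group_total = sum(group_counts)  # X, the group's population in the whole area
    if group_total == 0:
        raise ValueError("the group has no members in the study area")
    index = 0.0
    for x_i, t_i in zip(group_counts, total_counts):
        if t_i > 0:  # skip unpopulated sub-units
            index += (x_i / group_total) * (x_i / t_i)
    return index


# Hypothetical counts: the same 100 minority residents, clustered vs. spread out.
clustered = isolation_index([80, 15, 5], [100, 100, 100])
even = isolation_index([10, 10], [100, 100])
print(clustered, even)
```

Under this assumed definition, the clustered layout yields a much higher value than the even one, which matches the interpretation that higher values indicate a greater likelihood of living next to members of the same group.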
To quantify the health disparities across different racial groups, the weighted travel times to testing sites and hospitals are illustrated in the last two columns of
Table 2. The Non-Hispanic Black group had the lowest weighted travel time of 5.69 min to the testing sites, followed by the Hispanic, Asian, and White groups. Access to hospitals showed similar results, with a minor difference: the Hispanic group, rather than the Non-Hispanic Black group, had the lowest travel time of 8.88 min. Compared with the other three groups, the White group had the longest travel times of 7.82 and 11.21 min to testing sites and hospitals, respectively, which is consistent with the lowest COVID-19 incidence rate in more concentrated quantiles. In addition, the travel times to testing sites and hospitals were both within an acceptable range of less than 15 min for every group. However, the travel times to testing sites were almost 3 min less than the travel times to hospitals, which reflects how accessible COVID-19-related health care services are in Massachusetts.
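The weighting scheme behind these travel times is described in the methods section, which is not included here. As a minimal sketch, the code below assumes a population-weighted mean, in which each area’s travel time is weighted by the number of residents of the group living there. The function name `weighted_travel_time` and the sample values are hypothetical.

```python
def weighted_travel_time(populations, travel_minutes):
    """Population-weighted mean travel time (assumed weighting scheme).

    populations: number of group members living in each area
    travel_minutes: travel time from each area to its nearest site, in minutes
    """
    total_population = sum(populations)
    if total_population == 0:
        raise ValueError("the group has no members in the study area")
    weighted_sum = sum(p * t for p, t in zip(populations, travel_minutes))
    return weighted_sum / total_population


# Hypothetical example: 100 residents live 4 min away and 300 live 8 min away.
print(weighted_travel_time([100, 300], [4.0, 8.0]))
```

With this weighting, a group concentrated near testing sites receives a low value even if a few members live far away, which is one way the group-level differences in Table 2 could arise.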
4.2. Results of Spatial Regression Analysis
We conducted a Moran’s I test to examine the spatial effects. The result offered strong evidence of spatial autocorrelation in the city/town-level COVID-19 incidence rate, with a Moran’s I of 0.436, which was positive and significantly different from the expected Moran’s I of −0.003 (p-value < 0.001, Z-score = 13.196). The positive sign implied the existence of the spatial adjacency effect: the COVID-19 incidence rate of a city/town affects that of other nearby cities/towns, partially because adjacent cities/towns share similar attributes. The p-value was considerably lower than 1%, indicating that the spatial autocorrelation in the COVID-19 incidence rate was statistically significant.
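The statistic can be illustrated with a short sketch of the global Moran’s I in its standard form. The weights matrix used in the study is not given in this section, so the binary adjacency matrix below is a hypothetical toy example, and the function names are illustrative. The expected value under no spatial autocorrelation is −1/(n − 1); assuming all 351 Massachusetts cities/towns were included, this rounds to the −0.003 reported above.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    weight_sum = sum(sum(row) for row in weights)  # S0, the sum of all weights
    cross_products = sum(
        weights[i][j] * deviations[i] * deviations[j]
        for i in range(n)
        for j in range(n)
    )
    sum_of_squares = sum(d * d for d in deviations)
    return (n / weight_sum) * cross_products / sum_of_squares


def expected_morans_i(n):
    """Expected Moran's I under the null hypothesis of no spatial autocorrelation."""
    return -1.0 / (n - 1)


# Hypothetical toy data: four units in a row, each adjacent to its neighbors.
toy_values = [1.0, 2.0, 3.0, 4.0]
toy_weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
print(morans_i(toy_values, toy_weights))
print(expected_morans_i(351))  # assumes n = 351 cities/towns
```

The toy values rise steadily along the row, so neighboring units resemble each other and the statistic is positive, the same direction as the result reported for the incidence rate.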
After correlation analysis (correlation coefficients < 0.6), among the 12 candidate explanatory variables, 9 variables were selected to be included in the final models. These variables were the segregation indices of the Hispanic, Non-Hispanic Black, and Asian populations, the percentage of elderly people (65 years and over), the percentage of housing units with more than 1 occupant per room, poverty rate, income inequality, road density, and test site accessibility.
Table 3 shows the results of the two spatial regression models. For comparison, the classic OLS model was also calibrated, and results were listed.
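The text does not state how the correlation screening was carried out, only that retained variables had pairwise correlation coefficients below 0.6. One possible procedure, offered here purely as an assumption, keeps a candidate only if its absolute Pearson correlation with every variable already kept is below the threshold. The variable names and values are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    covariance = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return covariance / math.sqrt(var_a * var_b)


def screen_variables(columns, threshold=0.6):
    """Keep variables, in order, whose |r| with all kept variables is below threshold."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical candidates: the second nearly duplicates the first.
candidates = {
    "poverty_rate": [5.0, 9.0, 14.0, 20.0, 7.0],
    "low_income_share": [6.0, 10.0, 15.0, 22.0, 8.0],
    "road_density": [3.0, 1.0, 4.0, 2.0, 5.0],
}
print(screen_variables(candidates))
```

A greedy rule like this depends on the order in which candidates are examined, so in practice the order would need a substantive justification, such as placing the variables of primary interest first.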
Three metrics were used to compare the performances of the OLS, SLM, and SEM: log-likelihood at convergence (log-likelihood), Akaike information criterion (AIC), and R-squared (R2). For log-likelihood and R2, a higher value means better performance; for AIC, a lower value represents better performance. The results of the model performances suggested the following. First, the two spatial models fit the observations better than OLS did: the AIC of the OLS (5263.18) was much larger than that of the SLM (5213.96) and SEM (5210.48), while the log-likelihood and R2 of the OLS (−2621.59, 0.579) were smaller than those of the SLM (−2596.979, 0.654) and SEM (−2594.24, 0.657). Second, the performance of the SEM was slightly better than that of the SLM: the log-likelihood and R2 of the SEM were larger than those of the SLM, while the AIC of the SEM was smaller than that of the SLM. The highest R2 was achieved by SEM (0.656657), which explains 65.67% of the total variation in COVID-19 incidence rates.
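The relationship between the reported log-likelihoods and AIC values can be checked with the standard definition AIC = 2k − 2 ln L, where k is the number of estimated parameters. The sketch below uses the figures quoted above; the helper names are hypothetical, and how the software counted the spatial parameter in each model is not stated in the text, so the implied k values are only indicative.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def implied_params(aic_value, log_likelihood):
    """Number of parameters k implied by a reported AIC and log-likelihood."""
    return (aic_value + 2 * log_likelihood) / 2


# (log-likelihood, AIC) pairs as reported in the text.
reported = {
    "OLS": (-2621.59, 5263.18),
    "SLM": (-2596.979, 5213.96),
    "SEM": (-2594.24, 5210.48),
}

for model, (log_lik, aic_value) in reported.items():
    print(model, round(implied_params(aic_value, log_lik), 2))

# The preferred model is the one with the lowest AIC.
best_model = min(reported, key=lambda m: reported[m][1])
print(best_model)
```

The implied parameter counts come out at about 10 for OLS and SLM and about 11 for SEM, and the lowest AIC belongs to the SEM, in line with the comparison in the text.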
A tenet of regression is that residuals should be independent of each other, that is, they should be randomly distributed in space. Furthermore, the degree to which residuals are autocorrelated in space is another important indicator to judge the performance of regression models. The results showed that the residuals of the OLS revealed a strong positive autocorrelation, while the residual autocorrelations of the SLM and SEM were eliminated, and the residual autocorrelation of the SEM was eliminated more thoroughly than that of SLM (the Z-score of SEM (0.140) is smaller than that of SLM (1.017)).
According to the above-mentioned results, by incorporating spatial dependence, the two spatial regression models improved on the performance of OLS in modeling the COVID-19 incidence rate in Massachusetts. Additionally, the SEM outperformed the SLM, meaning that the spatial spillover effect in the data was mainly reflected in the residuals. Therefore, to account for the spatial autocorrelation effect, the SEM specification is the more appropriate choice in this research.
The results of the three models demonstrate that the segregations of the Hispanic and Non-Hispanic Black populations were significantly positively associated with the COVID-19 incidence rate, while the segregation of Asians had a negative and nonsignificant influence on the COVID-19 incidence rate. As for the results of SEM, a one-point increase in the segregation of Hispanic and the segregation of Non-Hispanic Black was, respectively, associated with a 215.677-point and 47.236-point increase in the COVID-19 incidence rate, while a one-point increase in the segregation of Asian was associated with a 47.194-point decrease in the COVID-19 incidence rate. The accessibility of the test site had a negative impact on the COVID-19 incidence rate, and this impact was statistically significant in the results of OLS and SEM. For every one-point increase in test site accessibility, the COVID-19 incidence rate decreased by 75.207 points, as shown by the results of the SEM.
According to the OLS results, the rate of the population older than 65 was negatively associated with the COVID-19 incidence rate, and the coefficient was low, while the results of the two spatial regression models showed that this relationship was positive. However, the association was not statistically significant in any of the three models. As to the rate of households with more than 1 occupant per room, the results were all significantly positive. Specifically, the result of SEM indicated that a one-point increase in the rate of households with more than 1 occupant per room was associated with a 157.385-point increase in the COVID-19 incidence rate. With a parameter estimate of −66.217 (SEM), the rate of the population below the poverty level had a significantly negative influence on the COVID-19 incidence rate. The association between this variable and the COVID-19 incidence rate was also negative in OLS and SLM, although the relationship had no statistical significance in the result of SLM. The results of the three models demonstrated that income inequality had a nonsignificant and negative impact on the COVID-19 incidence rate. Road density had a statistically significant, strong, and positive influence on the COVID-19 incidence rate, as demonstrated by the results of the three models. A one-point increase in road density was associated with a 226.89-point increase in the COVID-19 incidence rate, according to the result of SEM.
5. Discussion
Racial/ethnic segregation is regarded as a fundamental cause of disparities in diseases [
9]. This study investigated the association between racial segregation and the COVID-19 incidence rate in Massachusetts, particularly for minority groups, such as Hispanic, Black/African American, and Asian. We found that higher Hispanic and Black/African American segregations are more likely to be associated with a higher COVID-19 incidence rate. Many Black people reside in poor areas that are characterized by high housing densities [
27]. As revealed in the regression model, a higher percentage of housing units with more than 1 occupant per room and a higher poverty rate are also significantly associated with the incidence rate. The higher observed incidence and severity in minority groups may also be associated with socioeconomic, cultural, or lifestyle factors, genetic predisposition, or pathophysiological differences in susceptibility or response to infection [
28]. Furthermore, Bhala [
2] proposed that culture, including multigenerational households, and variation in social interactions both play important roles in the increased risk of contracting COVID-19. Minorities also face heightened exposure to COVID-19 because they are more likely to work in industries that have remained open during the non-essential business closures [
5].
Racial disparities in medical care treatments and outcomes are pervasive [
29]. In Massachusetts, Black and Hispanic residents were less likely to be insured than White residents, and they were more likely to report fair or poor health than Whites [
30]. Severe disease increases the chance of being infected by COVID-19. For example, Yang et al. [
31] conducted a meta-analysis of eight studies, including 46,248 patients with laboratory-confirmed cases of COVID-19. The meta-analysis indicated that those with the most severe disease were more likely to have hypertension, respiratory disease, and cardiovascular disease. Moreover, other studies found that obesity and smoking were associated with increased risks [
32,
33]. In Italy, higher risks have also been reported in men than in women, which could be partly due to their higher smoking rates and subsequent comorbidities [
34]. This pandemic presents a window of opportunity for achieving greater equity in the health care for all vulnerable populations [
4].
The Non-Hispanic Black population has the shortest drive time of 5.69 min to testing sites, followed by the Hispanic, Asian, and Non-Hispanic White populations, which is broadly consistent with access to hospitals, with travel times of 9.22, 8.88, 9.96, and 11.21 min, respectively. Travel times to both kinds of sites are acceptable for all groups. These findings differ from those reported in previous studies. Yancy [
27] demonstrated that African Americans have higher rates of COVID-19 infection and death. This pattern was not observed in Massachusetts, where the Non-Hispanic Black population makes up 9% of residents and 9.4% of confirmed COVID-19 cases, while the predominant Non-Hispanic White population (80% of residents) accounts for 29.6% of confirmed cases. Another study stated that COVID-19 testing centers are more likely to be established in the well-off suburbs, where the White population is more predominant, than in low-income Non-Hispanic Black dominant neighborhoods [
11]. In contrast, most testing sites and hospitals in Massachusetts are located in neighborhoods dominated by minorities, who tend to live in cities for more job opportunities and accessible public transportation. The longer travel time of the Non-Hispanic White population to access testing sites and hospitals may be attributable to their preference for living far from cities in more spacious houses. Although easy access to testing sites might not be the only factor that influences the COVID-19 incidence rate, it is undeniably a crucial component in identifying potential carriers of COVID-19. Note that in this study, we assumed that driving is the only mode of transportation. Our omission of other modes of transportation, such as public transit in well-developed regions, may bias the travel times for testing.
Previous studies have demonstrated that people over 65 are most at risk for COVID-19 infection [
35]. However, our study suggested that the relationship between COVID-19 incidence rate and the proportion of older people had no statistical significance. This may be attributable to the geographical distribution of the senior citizens in Massachusetts. As shown in
Figure 4, cities/towns with high percentages of the elderly population are mainly located in the west and eastern coastal areas, which are far from the Greater Boston area.
Results of SEM suggested a strongly significant and positive relationship between the COVID-19 incidence rate and the percentage of occupied housing units with more than 1 occupant per room. Quarantining individuals is often among the first responses to prevent the spread of infectious diseases. However, it is estimated that 44% of secondary cases were infected during the index cases’ pre-symptomatic stage in settings with substantial household clustering and quarantine outside the home [
36]. In lower-income households where several individuals share the same room, the possibility of viral transmission may increase. This study supports this hypothesis by showing that a higher percentage of housing units with more than 1 occupant per room and higher poverty rates are significantly associated with the incidence rate. The results of SEM also suggested a strongly significant and negative impact of the poverty rate on the COVID-19 incidence rate within Massachusetts. At first glance, this result is counterintuitive, since those in poverty are generally at risk of losing their health insurance coverage, which makes them vulnerable in the face of emergent epidemics such as COVID-19. However, this phenomenon can be explained by the fact that most cities/towns with high poverty rates are located in peripheral areas, which are remote from high COVID-19 incidence regions, as can be seen from
Figure 5. Interestingly, the road density is significantly positively associated with the COVID-19 incidence rate. This may be because cities/towns with higher road densities typically have more frequent human activities, which could provide conditions for disease transmission.
6. Conclusions
The geographic disparity of the COVID-19 incidence rate has been recognized in previous studies [
37,
38,
39]. However, most research was conducted at or above the county level, which prevents us from observing how the COVID-19 incidence rate interacts with a multitude of factors at a finer spatial scale. This study analyzed how the COVID-19 incidence rate was associated with racial and health accessibility disparities at a finer scale, after controlling for the possible influences of a set of demographic, economic, and transportation factors. The study was conducted at the city/town level in the State of Massachusetts, which is one of the states hit hardest by the pandemic.
The classic OLS regression model is unable to deal with the spatial autocorrelation effect. Therefore, we conducted a further spatial regression analysis based on two models, i.e., SLM and SEM, to obtain a more robust estimation of the significances and directions of the influences of racial and health accessibility disparities, demographic, economic, and transportation characteristics on COVID-19 incidence rate. Results suggest that residential segregations of Hispanic and Non-Hispanic Black/African Americans are associated with an increased risk of COVID-19 infection. Similarly, road density and percent of more than 1 occupant per room have statistically significant and positive impacts on the COVID-19 incidence rate. However, test site accessibility and poverty rate are related to a decreased risk of COVID-19 infection.
The empirical findings shown in this paper can provide helpful insights and guidance for policymakers to develop public health strategies to contain COVID-19 transmission within Massachusetts. Most importantly, political action is needed to resolve long-standing societal inequalities, address the injustices of public health, and tackle the COVID-19 pandemic [
2,
40]. Public health is complicated and social reengineering is complex, but a change of this magnitude does not happen without a new resolve [
27]. Due to the strong influence that road density and overcrowding have on COVID-19 transmission, social distancing or stay-at-home policies must be vigorously enforced. As the COVID-19 pandemic is still ongoing and evolving quickly in the United States, it is vital to explore the spatial patterns in small units, such as neighborhoods. The finer the geographic scale, the more detailed the findings will be. The research framework and implications proposed in this study can further serve as a reference for other states. This will allow individual states to better understand COVID-19 transmission in the context of their unique geospatial factors, such as racial and socioeconomic distributions as well as travel infrastructure.
There are several limitations and directions for future work in this study. Neighborhood socioeconomic deprivation may affect potential COVID-19 detection bias, especially in minority communities. Although this research found that Hispanic and Non-Hispanic Black people had the best accessibility to testing sites, low-income families without private vehicles still face challenges in accessing testing sites [
41]. Thus, assuming private vehicles as the only transportation mode might be problematic for cities where the public transit service coverage is high. Meanwhile, Fareed’s research found that Non-Hispanic Black and Hispanic people have much lower rates of health information seeking and confidence in accessing health information than Whites [
42], which may lead to their lack of COVID-19 knowledge and information regarding the distribution of testing sites. Moreover, Hispanic and Non-Hispanic Black people tend to suspect that they are currently infected, which may raise the potential for asymptomatic infection [
43]. Therefore, differential self-assessments of the likelihood of current infection may result in differential testing behavior, biasing the incidence rates in the area. This research focuses on the city-level variables in the absence of individual-level data. If individual socioeconomic and health records data were readily available, they would provide further insight into the assessment of the relationship between socioeconomic status, health condition (e.g., chronic disease), and disease incidence. This in turn would enable the effective distribution of limited vaccine supplies by providing policymakers with insight into which populations to prioritize (e.g., people who are 65 years and older).