1. Introduction
Since late 2019, the novel coronavirus SARS-CoV-2, which causes COVID-19, has spread swiftly around the world and had affected over 21 million people (with ~776,000 fatalities) as of 15 August 2020 [1]. The virus initially emerged in Wuhan, China, and the World Health Organization (WHO) was alerted on 31 December 2019 by the Chinese authorities about the pneumonia cases related to COVID-19. By the end of March 2020, the virus had spread across Europe (e.g., Italy, Spain, France) and Asia (e.g., Turkey, Iran, India) (
Figure 1a), leading to national lockdowns, quarantines, and global curbs. As of 15 August 2020, the COVID epicenter is the United States (USA), with over 5.3 million confirmed cases, followed by Brazil, India, and Russia with 3.3, 2.6, and 0.9 million cases, respectively (
Figure 1b).
A growing body of scientific literature has focused on the environmental sensitivity of COVID-19 and how it modulates the risk of transmission. However, caveats in these studies have likely obscured plausible confounders and led to misconceptions about how weather drives COVID-19 exposure and transmission [2]. As the virus spread has evolved, more information and datasets have become available, providing an opportunity to (re)assess the possible contributing factors to pandemic growth in populous regions such as cities. The inconsistent conclusions related to the role of weather (‘climate’), and the access to newer data, motivate the re-evaluation of the findings from the early-stage studies that assessed the link between the COVID-19 transmission and weather.
Several attempts have been undertaken to evaluate the impact of meteorological parameters on the spread of infectious diseases (e.g., influenza, SARS). For example, in northern Europe, the influenza virus’ highest activity was found during periods with low ultraviolet radiation and colder periods [
2]. A higher risk of SARS transmissibility was also reported in a colder environment (16 °C to 28 °C) [
3,
4]. Since the beginning of the COVID-19 pandemic, researchers have linked mean air temperature [
5,
6] and absolute/relative humidity [
7,
8,
9] with the pandemic growth. These studies have reported that the novel coronavirus transmission can be suppressed in warm and humid conditions analogous to a seasonal respiratory flu virus behavior [
10].
The findings of these studies, however, have been based on assessments over a relatively short period (e.g., the study by Sajadi et al. [
11]) or a specific geographical location (e.g., the analysis by Rosario et al. [
12]). As reported in Baker et al. [
13], the studies during the early-stage of the COVID-19 pandemic could be inevitably inconclusive, given the limited available data and information about the virus. Baker et al. [
13] also simulated a pandemic event using a climate-dependent epidemic model and showed that climate could drive only a modest change to the pandemic size. Similarly, recent studies have reported that COVID-19 does not behave as other conventional coronaviruses based on weather and seasonality patterns. For example, Iqbal et al. [
14] used wavelet transform coherence to analyze the impact of air temperature on COVID-19 spread and reported no link between temperature and transmission rate in Wuhan, China. Similarly, in other studies in Spain [
15] and Iran [
16], researchers reported no evidence of a relationship between COVID-19 cases and climatic parameters.
The temporal and spatial scale considered in a study can also impact the outcome and interpretation of the findings. For example, in the study by Tosepu et al. [
5] in Jakarta, Indonesia, the range of vulnerable air temperature for the virus spread was reported to be between 26 °C and 28 °C. In contrast, Sajadi, Habibzadeh, Vintzileos, Shokouhi, Miralles-Wilhelm and Amoroso [
11] reported the range between 5 °C and 11 °C over the USA (considering the data until March 2020), and Gupta et al. [
17] reported the vulnerable air temperature between 3 °C and 17 °C over the USA (considering the data until mid-April 2020). In China, several studies [
18,
19] have reported the efficacy of COVID-19 transmission in cold and dry environments. In contrast, Poirier et al. [
20] reported that the weather-driven parameters are not necessarily correlated with the COVID-19 outbreak when different spatial scales are considered. These inconsistencies substantiate the role of “spatiotemporal scale” and “statistical method” to interpret the impact of meteorological parameters on the COVID-19 pandemic, as noted in a review study by Briz-Redón and Serrano-Aroca [
21].
Evaluating the impact of meteorological parameters on the epidemic spread of COVID-19 is challenging because of the unknown implications of several intermediate parameters on the behavior and pattern of the transmission [
22]. The clinical studies have consistently reported that the most frequent and plausible coronavirus transmission routes are “droplet transmission”, “direct contact transmission” (not involving contaminated surfaces), and “indirect contact transmission” (involving contaminated surfaces) [
23,
24,
25]. However, tracking the transmission routes is feasible in clinical studies but not in a population-level scenario. In a real-world setting, the modes of the virus transmission are indirectly mirrored or triggered by contact-based dynamics that are, we postulate, better exemplified within an urban exposure environment. For example, a higher number of daily trips in a city with higher population density translates into a greater chance of contact among people, which increases the likelihood of the virus spread.
Specific environmental and socioeconomic components such as individual health and hygiene factors (e.g., washing hands or sanitizing, living conditions, and working environments) are difficult to track; yet, several datasets for critical factors affecting the virus transmissibility are available. Examples of critical elements and available datasets include gridded demographic data [
26], mobility and foot-traffic data from Google community reports [
27] or SafeGraph [
28] as indicators of social and physical distancing, and mask-wearing data [
29]. By accounting for the interplay between these factors and the virus transmission dynamics, we could reduce the uncertainty regarding the relation between weather/climate and the COVID-19 pandemic.
In this study, we evaluated the impact of a weather-driven parameter (represented by the equivalent air temperature) on COVID-19 transmissibility by considering the role of population and density (number of people per area of land), mobility and homestay metrics, and mask usage at various spatial scales from global to county levels. The findings will be of interest to policymakers, medical centers, and future risk assessment studies as communities try to develop mitigation strategies against COVID-19 and similar infectious diseases.
3. Results
To provide an overview of the COVID-19 infected cases, we first presented the data at a global scale (
Figure 2a) and the USA (
Figure 2b).
Figure 2a is categorized by region as per the WHO’s criterion. The highest number of cases was in the Americas, followed by Southeast Asia and Europe. During the first wave of the virus spread (until mid-May 2020), Europe and the Americas had the most significant share of the COVID-19 cases. During the second wave (from mid-May to August), the number of cases in Europe declined, while this figure increased in Southeast Asia and the Americas. The concentration of infections in more populous areas (e.g., the USA and Brazil in the Americas region, and India in the Southeast Asia region) highlights the importance of considering the population as a factor affecting the COVID-19 transmission rate.
The strong correlation between the COVID-19 infected cases and population (r² = 0.41 to 0.85) is shown at a global scale (
Figure 2c), and over the USA at the state level (
Figure 2d) and county level (
Figure 2e). The resulting correlation highlighted the need to normalize the data relative to the population for evaluating the impacts of weather parameters. We divided the number of infected cases in a region by the regional population and considered it as the COVID-19 infected proportion.
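The normalization described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names and the three sample regions are hypothetical, and the numbers are not taken from the study.

```python
def infected_proportion(cases, population):
    """Divide the number of infected cases in each region by its population."""
    return {region: cases[region] / population[region] for region in cases}


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical regions, for illustration only (not data from the study).
cases = {"A": 1000, "B": 5000, "C": 200}
population = {"A": 100_000, "B": 1_000_000, "C": 10_000}
proportion = infected_proportion(cases, population)
```

In this toy example, region B has the most cases but the lowest infected proportion, which is the kind of reversal that motivates normalizing by population before comparing against weather.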
3.1. Analysis at the Global Scale (Regional and National Level)
The global distribution of COVID-19 cases (aggregated over April and July 2020) and the equivalent temperature (averaged over April and July 2020) are shown in
Figure 3. Comparing different regions and countries in
Figure 3, we noted diverse patterns between the spreading rate and spatiotemporal distribution of equivalent temperature. In the northern hemisphere, considering the USA, the number of infected cases has risen with the increase in the equivalent temperature from April to July. Concurrently, the number of infections in Europe and Asia (e.g., in Spain, Turkey) has decreased despite higher equivalent temperatures. Similarly, in the southern hemisphere taking the examples of Australia and Brazil, we noted disparate patterns between the COVID-19 infected cases and equivalent temperature.
To better illustrate the spatiotemporal patterns mapped in
Figure 3, we converted the weekly cases of COVID-19 in each country to the infected proportion. We assessed its correlation with the equivalent temperature (averaged spatially over each country). The scatter plots of
Figure 3 show the resulting correlation for different geographical regions. Splitting the data into the regional segments helped reduce the bias imposed by the contrasting weather across the regions (compared to a global analysis). The resulting correlations were not statistically significant except for the Eastern Mediterranean, for which a significant but low correlation (r² = 0.26, p < 0.05) was found. Note that in Eastern-Mediterranean countries, the onset of COVID-19 testing and collecting data coincided with warmer months of the year. In Iran, as an example of a country in the Eastern-Mediterranean region, the COVID-19 screening started in early April [
39]. From April, the air temperature typically increases in the region; therefore, a biased correlation between the number of infected cases and warm weather was expected.
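A region-by-region correlation of this kind can be sketched as follows. The significance test used in the study is not stated in this section, so the sketch substitutes a simple permutation test; the function names, the record layout, and the number of permutations are all assumptions made for illustration.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient (assumes neither input is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        # Count shuffles whose correlation is at least as strong as observed.
        if abs(pearson_r(x, shuffled)) >= observed - 1e-12:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def regional_correlations(records):
    """Group (region, equivalent_temperature, infected_proportion) records
    by region and return {region: (r_squared, p_value)}."""
    by_region = {}
    for region, temperature, proportion in records:
        xs, ys = by_region.setdefault(region, ([], []))
        xs.append(temperature)
        ys.append(proportion)
    return {
        region: (pearson_r(xs, ys) ** 2, permutation_p_value(xs, ys))
        for region, (xs, ys) in by_region.items()
    }
```

Splitting by region before correlating mirrors the reasoning in the text: pooling all countries would mix contrasting climates into a single fit.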
At the country level, the equivalent temperature variations were not consistent with the transmission rate. As shown in
Figure 4, from January to July 2020, countries like the USA, Italy, and India have experienced a positive trend, while in China, Brazil, and Australia, the trend was negative. These assessments emphasized the importance of scale, both temporally and spatially, when the virus transmissibility is evaluated against meteorological parameters. Thus, considering the relationship between COVID-19 cases and weather-driven variables at a country level may not necessarily represent the actual behavior of COVID-19 transmission. For example, Rosario et al. [
12] reported that high solar radiation and temperature could effectively suppress the spread of COVID-19 in Brazil, while Xie and Zhu [
19] noted an opposite trend in China.
Some studies have linked the impact of latitude on the number of coronavirus cases, postulating that the mid-latitude countries experience more COVID-19 issues [
11,
40,
41]. Contrary to these findings, we highlight that the latitudinal effect is invalid when we consider the country’s population. As demonstrated in
Figure 4, while a larger portion of the COVID-19 cases was found in mid-latitude, we noticed that these coincide with the higher number of populations living in these areas (
Figure 4h). Normalizing the data based on the population (
Figure 4i), we obtained a nearly equal distribution of the COVID-19 in high, low, and mid-latitude. Therefore, the latitude was not correlated with the COVID-19 cases; instead, the population led to a higher number of infected cases in mid-latitudes.
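The latitude argument can be illustrated with a short Python sketch that aggregates cases and population by latitude band before dividing. The band thresholds (30° and 60°) and the sample countries are hypothetical choices made for this example; the section does not state the band definitions used in the study.

```python
def latitude_band(latitude_deg):
    """Classify a latitude into a band (thresholds are assumed, not from the study)."""
    absolute = abs(latitude_deg)
    if absolute < 30:
        return "low"
    if absolute < 60:
        return "mid"
    return "high"


def cases_per_capita_by_band(countries):
    """countries: iterable of (latitude_deg, cases, population) tuples.

    Returns {band: total cases / total population} for each band present.
    """
    totals = {}
    for latitude, cases, population in countries:
        band = latitude_band(latitude)
        band_cases, band_population = totals.get(band, (0, 0))
        totals[band] = (band_cases + cases, band_population + population)
    return {band: c / p for band, (c, p) in totals.items()}


# Hypothetical countries, for illustration only.
sample = [
    (10.0, 100, 10_000),      # low latitude
    (45.0, 1000, 100_000),    # mid latitude
    (-40.0, 500, 50_000),     # mid latitude (southern hemisphere)
    (65.0, 10, 1_000),        # high latitude
]
per_capita = cases_per_capita_by_band(sample)
```

In this constructed example the mid-latitude band holds most of the raw cases, yet all three bands have the same per-capita rate, which is the pattern the normalization is meant to expose.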
The overall pattern achieved at the global scale is shown in
Supplementary Figure S3. While the infected proportion was marginally reduced in warmer weather, the changing rate did not exhibit any consistent pattern. The absence of a trend in the changing rate of COVID-19 cases for the change in equivalent temperature showed the insensitivity of COVID-19 transmission with the increasing or decreasing temperature (
Supplementary Figure S3b,c). The infected proportion was not statistically different when the equivalent temperature was between 0–10, 10–20, and 30–40 °C. Although the statistical analysis showed the highest infections in the range of −10–0 °C and the lowest in the 20–30 °C range, these findings were biased due to the significantly different frequencies of cases in the countries in these ranges (
Supplementary Figure S3d).
The COVID-19 pandemic has forced the countries to adapt rapidly to new policies and procedures to stop or slow down the virus spread. Given the impact of centralized or decentralized actions on the novel coronavirus, on the one hand, and the impact of the geographical location of a region on its weather, on the other hand, it will be unlikely or too convoluted to extract the true role of weather on the pandemic growth over a large spatial scale. The analysis considering a large spatial scale can be biased due to several caveats: (i) The weather type and regime can vary distinctly across a country or a region. Assessing the averaged equivalent temperature over the whole area for correlation analysis could be subject to bias, although it provides a broad understanding. (ii) Each country and region has been subjected to different strategies and policies which were not considered in the initial assessments reported in the literature thus far. (iii) The local policies and adaptation/mitigation strategies may dynamically change during the COVID-19 pandemic depending on the magnitude of the virus spread in the area. (iv) The timing of COVID-19 testing could directly bias the result as it was exemplified in the Eastern Mediterranean region.
Overall, the inconsistent relation across different spatial extents highlighted the limited role of weather in the COVID-19 spread. Results indicated an increase, a decrease, or no change in the number of infected cases and infected proportion for a constant range of equivalent temperature. Thus, the COVID-19 spread is likely controlled by other environmental factors rather than the regional climate. Given the caveats at the global and regional analysis, we performed a more detailed analysis at a finer spatial scale (county-level over the USA) to better understand the role of climate in the pandemic growth relative to other possible contributing factors. Typically, the weather does not vary distinctly at the county level (compared to a regional scale) and provides a more controlled environment to assess the interplay between the contributing factors. The following sub-sections provide the analysis for the equivalent temperature, urban density, mobility, homestay, and mask-wearing for the USA counties.
3.2. Impact of the Urban Area and Density on COVID-19
To evaluate the impact of urban area and density, we narrowed our focus over the US. The results for the urban areas and density are illustrated in
Figure 5a. The mean of urban density per county was calculated, and the results are shown in
Figure 5b. A visual comparison between the urban density (
Figure 5b) and the total number of cases at the county level (
Figure 5c) illustrated a probable correlation. The correlation between urban density and COVID-19 cases was notable in the southwest and northeast of the country. When the number of COVID-19 cases was normalized by the county-level population (
Figure 5d), a different pattern emerged. The overall infected proportion was higher in southeastern and southern USA counties, with some instances in the eastern and northeastern sides.
It should be noted that the roles of urban population and urban density differ in their functionality and impact on COVID-19 cases. A larger population relates to a higher number of potential hosts for the virus, directly linked with the spread of the COVID-19 cases (also shown in
Figure 2). A higher urban density, however, hypothetically relates to a higher chance of contact, exposure, and interactions between people, and therefore, indirectly can cause an increase in COVID-19 cases. To illustrate this difference, an example of two counties (the Suffolk and New York counties) with contrasting urban density but similar weather, imposed policies, and a relatively similar population was considered (shown in
Supplementary Figure S4).
The Suffolk and New York counties are among the most populated in the USA, with 1.48 million and 1.62 million people, respectively. However, the urban density in these two counties differs significantly, with a mean value of 32,000 people/km² for New York county and 5800 people/km² for Suffolk county. The substantially higher urban density in New York county compared to Suffolk county (while the population is similar) resulted in three times as many infected cases in New York county during the early stages of the pandemic (March 2020). However, considering the data until July, a higher number of infections was found in Suffolk county, highlighting the impact of factors beyond population. A similar pattern was also observed when all counties were analyzed; that is, other socioeconomic factors affect infection rates. The comparison here highlights that it is not just the population but also the urban density that prominently contributed to the infection rates.
The number of coronavirus cases grew exponentially as urban density increased, suggesting a higher chance of COVID-19 transmission when the urban density was higher than 1400 people/km² (shown in
Supplementary Figure S5). After controlling for the effect of the population (i.e., normalizing the number of infected cases by population), the resulting correlation was less pronounced (though statistically significant). Thus, it was not consistent with the initial hypothesis (that urban density leads to higher transmission). The temporal scale was likely the reason for the low correlation between COVID-19 infected proportion and urban density. During the early pandemic in the USA, the first cities and counties significantly affected by the virus had a dense population (e.g., the Bronx, Kings, Queens, and New York counties, as shown in
Supplementary Figure S5c). However, as the pandemic grew, new rules and policies helped curtail the spread of the virus, making the role of urban density less apparent. The exponential increase of virus transmissibility during the early pandemic was transformed into a linear growth and showed less difference between the ranges of urban densities when all data (until July 2020) were considered (shown in
Supplementary Figure S5d,e).
3.3. Impact of Mobility, Homestay, Mask Usage, and Weather on COVID-19
We evaluated the impact of mobility and homestay on the number and changing rate of COVID-19 at the USA county level. The percentage change of mobility and homestay at the county level was determined (by comparing the data in 2020 paired with a similar date in 2019), and the results are shown in
Figure 6. There was no significant difference during March (early pandemic) compared to the previous year (2019). Yet, the eastern part of the country experienced more mobility (the reduction in mobility was marginally negative, i.e., red colors) compared to the west (the reduction in mobility is shown in green colors). In April, mobility dropped dramatically nationwide; in May and July, with the imposed policies, mobility increased again. A similar pattern was observed with the change in the number of people who stayed home during the pandemic (Figure 6, homestay column). In April, the homestay was the highest, and it decreased after April. Comparing the change in mobility and homestay with the number of COVID-19 cases, we note a feedback between these variables. From March to April, the increasing number of cases due to community spread [
42] caused people to reduce mobility and increase homestay, which, in turn, slowed the rate of community spread. In May, a notable change was observed in mobility and homestay metrics, as depicted in
Figure 6 (i.e., green colors shift to red from April to May). With the reopening policies, homestay percentage reduced, and mobility increased, which likely accelerated the community spread, and thus, the number of COVID-19 cases significantly increased during May in the United States. This increase coincided with the warmer season with higher temperatures and humidity (
Figure 6, last column).
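The year-over-year percentage change used for mobility and homestay can be sketched as below. The sketch assumes that "a similar date" means the same (month, day) key in both years; the study may instead have paired by weekday or week number, and the function names and sample values are hypothetical.

```python
def percent_change(value_2020, value_2019):
    """Percentage change of a 2020 value relative to its 2019 baseline."""
    return 100.0 * (value_2020 - value_2019) / value_2019


def paired_percent_change(series_2020, series_2019):
    """Pair two {(month, day): value} series by date and return the
    percentage change for every date present in both years.

    Dates with a zero 2019 baseline are skipped to avoid division by zero.
    """
    return {
        date: percent_change(series_2020[date], series_2019[date])
        for date in series_2020
        if date in series_2019 and series_2019[date] != 0
    }


# Hypothetical daily trip counts for one county, for illustration only.
trips_2020 = {(4, 1): 60.0, (4, 2): 90.0}
trips_2019 = {(4, 1): 100.0, (4, 2): 100.0, (4, 3): 80.0}
change = paired_percent_change(trips_2020, trips_2019)
```

A negative value indicates reduced mobility relative to 2019; the same function applies to homestay counts, where a positive value indicates more people staying home.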
Figure 6 provides an overview of the spatial correlation, and the changing rate of these variables was analyzed to interpret the association.
Figure 7 and
Figure 8 show the time series of the data (for all counties) and display how the weekly percentage change of COVID-19 cases was correlated with mobility and homestay changing rates. The probability density function (PDF) and cumulative distribution function (CDF) associated with this analysis are shown in
Supplementary Figure S6. The virus transmissibility was better correlated with mobility change and homestay than with weather: only weak correlations were noted when the COVID-19 growth rate was compared against the equivalent temperature. Considering the period from March to July, the transmissibility rate change became increasingly correlated with mobility and homestay (r² increasing from 0.004 to 0.26, as shown in Figure 7). This trend was compatible with the general understanding of the COVID-19 spreading rate, in which more daily trips (mobility) and less homestay imply a higher transmission rate. Even so, a low correlation was expected because, in a population-level setting, not all trips lead to virus spread. Several other factors are involved in this process, but as shown in Figure 7, more mobility and less homestay favored the virus transmissibility. At the time of conducting this study, the imposed policies (e.g., stay-at-home directives) over the USA were mostly decentralized, issued by local authorities at the state level. Given these variable responses across different states and the complex dynamics of human mobility, it continues to be a challenge to delineate a strong association between meteorology, mobility, and homestay.
Thus, the increasing or decreasing equivalent temperature did not yield a significant correlation (r² < 0.002) with the changing rate of COVID-19 cases (
Figure 7). For example, during June 2020, when a 2.5 °C increase in the equivalent temperature occurred, both positive (+120 weekly infected cases) and negative (−125 weekly infected cases) changes were observed. The nearly equal distribution of these changes resulted in a neutral trend (the horizontal fit lines in
Figure 7). Therefore, we could not establish any impacts of the weather (i.e., equivalent temperature) on changing the COVID-19 transmission rates.
Consistent with our results, we highlight the analysis by Kraemer et al. [
43] that used near-real-time mobility data and explained the COVID-19 spread across China. Although the impact of homestay, urban density, and the weather was not included in their analysis, most of the virus spread could be explained by the travel data alone. Over the USA, Badr et al. [
44] analyzed the correlation between mobility and COVID-19 cases and reported a strong correlation (r² > 0.7). However, they only considered 25 counties over the USA and did not remove the population impact. The population would affect both the magnitude of the cases (as we showed) and mobility, since a higher population translates into more daily trips. From a societal impact perspective, the positive role of reduced mobility and increased homestay in curbing the COVID-19 transmission has also been highlighted in Sen-Crowe et al. [
45].
Figure 9 shows the percent of the population in each USA county that has used a mask in daily interactions during July 2020. Comparing the mask-wearing percentage with the number of COVID-19 cases during the same period showed an “effect-and-cause” scenario. That is, in the areas with a higher infected population (i.e., southwest, coastal east), the mask-wearing percentage was also higher, suggesting that mask usage was likely dictated by the COVID-19 spread and the ensuing rules that followed. The shift in the COVID-19 cases for the percentage of people in each mask usage category highlighted the “effect-and-cause” postulation (
Figure 9, lower row). For example, the counties where 70–80% of the population always used masks (considering category: ALWAYS) also corresponded to higher COVID-19 cases, while the part of the community that never used a face cover (considering category: NEVER) was in counties with low infected cases. Thus, mask use by itself was a tricky confounder that should be used with considerable caution when developing future analysis.
The limited mask-wearing data availability (only available during July) hinders our study from reflecting on the importance and explicit role of mask-wearing in the containment of the COVID-19 infection rate. However, many clinical studies confirmed the positive impact of mask-wearing on reducing the air dispersion and droplets during a human cough [
30,
46]. For example, Eikenberry et al. [
47] have provided a conceptual model to quantify the impact of mask-wearing on the transmission rate. They reported that mask-wearing by 80% of the population could reduce the projected mortality rate by 17–45% in New York and by 24–65% in Washington state in the United States.
3.4. Relative Importance of the Factors Affecting COVID-19
After analyzing the impact of weather, mobility, urban density, population, homestay, and mask-wearing, separately on the COVID-19 outbreak, we conducted a multiple regression analysis [
38] to evaluate the relative importance of these variables with respect to each other. The changing rate of COVID-19 cases was compared against other variables at a monthly scale, and the results are shown in
Figure 10. We did not include the mask usage as the data was only available during July, and the correlation was, as discussed above, an “effect-and-cause” scenario. The multiple regression weighted coefficient was rescaled and considered as the relative importance of the variables. The resulting r² of the multiple regressions varied from 0.52 to 0.57, with mobility, homestay, and population as the most important contributing factors to the COVID-19 transmissions. The importance of urban density reduced from May to July during the pandemic, which was compatible with the results previously discussed in the section related to urbanization (
Section 3.2). The impact of weather (i.e., equivalent temperature) remained the least contributing factor with less than 3% of relative importance. Therefore, when evaluating the role of weather on COVID-19 transmission, it is necessary to consider factors such as population, urban density, mobility, and homestay and ensure that these effects are explicitly considered or removed from the data.
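One way to turn multiple regression coefficients into relative importance is sketched below. It is an assumption about the rescaling step, not the study's documented procedure: predictors and response are standardized, the regression is solved through the normal equations, and the absolute standardized coefficients are rescaled to sum to 100%. All function and variable names are hypothetical.

```python
def standardize(values):
    """Return z-scores (zero mean, unit population standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5
    return [(v - mean) / sd for v in values]


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def relative_importance(predictors, response):
    """predictors: {name: list of values}; response: list of values.

    Returns {name: percent}, the share of each absolute standardized
    regression coefficient (shares sum to 100).
    """
    names = list(predictors)
    z = [standardize(predictors[name]) for name in names]
    y = standardize(response)
    n = len(y)
    # Normal equations Z'Z beta = Z'y on standardized data (no intercept needed).
    ztz = [[sum(zi[k] * zj[k] for k in range(n)) for zj in z] for zi in z]
    zty = [sum(zi[k] * y[k] for k in range(n)) for zi in z]
    beta = solve_linear_system(ztz, zty)
    total = sum(abs(b) for b in beta)
    return {name: 100.0 * abs(b) / total for name, b in zip(names, beta)}
```

Standardizing first makes coefficients comparable across variables with different units (e.g., people/km² versus °C). Note that this simple rescaling can be unstable when predictors are strongly correlated with each other.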
Overall, our analysis showed that the relative importance of weather did not exceed 3%. This highlights that, irrespective of the weather type, newer waves of COVID-19 can occur without safety considerations in daily mobility. We note that the analysis could be subject to bias in the input data, as the data originated from a dynamic source, namely human activity. Additional factors, including personal health and hygiene factors, age, and re-openings, could confound the results.
4. Conclusions
Evaluating the impact of individual factors on the outbreak of infectious diseases, such as COVID-19, is challenging in an unsupervised environment. For most parts, this complexity stems from the dynamics of individual behavior in a complex environment (e.g., urban setting) and limitations in the data (e.g., spatial resolution, local influences). The analysis is compounded further by the spatiotemporal lags between the causal processes and effects (in this case, the exposure and the reporting). Since the start of the COVID-19 pandemic, studies have attempted to address the role of weather in regulating the COVID-19 transmission.
Our assessment indicated possible misconceptions and biases in the analysis, primarily when a study focused on the effects of meteorological variables alone. No compelling evidence was found to include weather as a significant contributor by itself to the spread of COVID-19. The evidence from controlled small-scale laboratory experiments on the factors affecting the transmissibility mechanism of infectious respiratory diseases (e.g., transmission through contact) highlights how these factors are manifested in real life. We found the characteristics exemplified in urban metabolism or footprint to be essential considerations when COVID-19 transmission is studied. Our findings highlighted the critical role of spatial and temporal meteorological scales in interpreting the impact of environmental factors on COVID-19 spread. The function of urban density, for example, was found to be determinative considering the colossal outbreak in New York within a short time during the early pandemic (March 2020). As the COVID-19 pandemic evolved, counties with lower density showed more infections due to other dynamical factors.
At the finer scale (i.e., the USA county level), the role of urban density, mobility, homestay, and the population was evident in affecting the infection rate. These factors resulted in a low (r² < 0.3) correlation at the county level in the USA, highlighting the strongly nonlinear behavior of the incorporated elements. Based on our analysis, the weather by itself was identified as a noninfluential factor (relative importance < 3%); therefore, when the weather is considered in a study, finer-scale data that account for urban form, function, and density are recommended. Our findings can inform decisions and policies on the COVID-19 outbreak and help reframe the role of different factors in models of virus transmission and spread, so that urban processes, density, mobility, and population are considered in developing an improved understanding.