4.1. Identification of Groups with Different Activity Structures
In this subsection, we generate clusters of similar individual activity structures by applying the K-means algorithm [35] (the algorithm is provided in Appendix A). One problem that needs to be taken into account in the clustering process is how to determine the most suitable number of clusters. The Dunn index [36] and the Silhouette index [37] are utilized in this paper to evaluate the clustering results [38]. The Dunn index mainly reflects the compactness and separation of clusters, and the Silhouette index reflects how well each element fits its assigned cluster relative to the nearest other cluster. The higher the values of the two indices, the better the clustering result.
Figure 5 shows how the Dunn index and the Silhouette index change with the number of clusters. Note that both indices indicate that a relatively stable clustering effect can be achieved when the number of clusters is five. In addition, the RMSD [39], which is defined as the sum over clusters of the root mean square deviations of cluster elements from the corresponding cluster center, is utilized to characterize the homogeneity within clusters. It can be seen that the RMSD value decreases as the number of clusters increases, reaches its lowest value when the number of clusters is five, and then increases again. This also shows that five clusters are suitable and produce the largest improvement in cluster performance. Therefore, the sample is clustered into five groups.
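The two validity indices can be illustrated with a minimal sketch. It assumes Euclidean distance and the standard textbook definitions of the Dunn and Silhouette indices; the function names and the toy points are hypothetical and are not taken from the paper's data or from Appendix A. At least two clusters are required.

```python
from math import dist


def _clusters(points, labels):
    """Group points by their cluster label."""
    groups = {}
    for point, label in zip(points, labels):
        groups.setdefault(label, []).append(point)
    return list(groups.values())


def dunn_index(points, labels):
    """Smallest between-cluster distance over largest within-cluster diameter."""
    groups = _clusters(points, labels)
    max_diameter = max(dist(a, b) for g in groups for a in g for b in g)
    min_separation = min(
        dist(a, b)
        for i, g in enumerate(groups)
        for h in groups[i + 1:]
        for a in g
        for b in h
    )
    return float("inf") if max_diameter == 0 else min_separation / max_diameter


def silhouette_index(points, labels):
    """Mean silhouette width, s = (b - a) / max(a, b), over all points."""
    scores = []
    for i, (p, lab) in enumerate(zip(points, labels)):
        by_label = {}
        for j, (q, other) in enumerate(zip(points, labels)):
            if j != i:
                by_label.setdefault(other, []).append(dist(p, q))
        own = by_label.pop(lab, [])
        if not own or not by_label:
            scores.append(0.0)  # common convention for singleton clusters
            continue
        a = sum(own) / len(own)                              # mean distance inside own cluster
        b = min(sum(d) / len(d) for d in by_label.values())  # nearest other cluster
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical toy data: two tight, well-separated pairs of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
good = [0, 0, 1, 1]  # labels that follow the natural grouping
bad = [0, 1, 0, 1]   # labels that cut across it

print(dunn_index(points, good), silhouette_index(points, good))
print(dunn_index(points, bad), silhouette_index(points, bad))
```

In practice both functions would be evaluated on the K-means labels obtained for each candidate number of clusters, and the resulting curves compared as in Figure 5.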
The activity structure is organized through a set of hierarchically ordered places that have a particular meaning for an individual [40]. Excluding home, the major place is work or school or a place where a major regular activity occurs. The places where people spend their leisure time and socialize with others are called the secondary places (e.g., shops, cafes, bars, restaurants, and parks) [41]. In this paper, we mainly identify the major place and the secondary places to compose and analyze the activity structure. For the convenience of discussion, the main characteristic of the activity structure of each group is portrayed. The major place is utilized to identify the social character of each cluster.
Therefore, for each activity structure group, the ratio of the number of visits to each POI-type to the total number of visits to all POI-types at each time period is calculated, and the POI-type with a percentage larger than 70% is defined as the major type, i.e., the type of place where the major activity takes place. The correspondence between the social character of each activity structure group and the POI category is defined in Table 1.
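The 70% rule can be sketched as follows. This is a simplified illustration for a single time period; the function name and the sample visit lists are hypothetical, and only the 70% threshold comes from the text.

```python
from collections import Counter

MAJOR_SHARE = 0.70  # threshold stated in the text


def major_poi_type(visits):
    """Return the POI-type whose share of visits exceeds 70%, or None."""
    if not visits:
        return None
    poi, count = Counter(visits).most_common(1)[0]
    return poi if count / len(visits) > MAJOR_SHARE else None


# Hypothetical visit records for one group in one time period.
office_like = ["company"] * 8 + ["restaurant", "shopping"]
mixed = ["company"] * 5 + ["home"] * 5

print(major_poi_type(office_like))  # "company" (share 0.8)
print(major_poi_type(mixed))        # None (no type exceeds 70%)
```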
The POI-types of “industrial park”, “home”, “education”, “company” and “tourist place” listed in Table 1 for groups 3–5 are consistent with the POI category used by AutoNavi Map (AutoNavi Software Co., Ltd., Beijing, China). Considering that “company” defined by AutoNavi Map is relatively broad, this paper defines “industrial park” as the enterprises that have a serious effect on the surrounding environment, such as metallurgy and chemical industries, minerals, and construction, while other companies such as network technology, advertising and decoration, commercial trade, high-tech enterprises, and so on are defined as “company”. In this way, it is possible to distinguish between factory workers and office workers.
Five distinctive activity structure clusters are identified and defined as “factory-centered activity structure”, “office-centered activity structure”, “home-centered activity structure”, “outgoing-centered activity structure” and “education-centered activity structure”, which represent the groups of “factory workers”, “office workers”, “stay-at-home”, “adventurers” and “education-related”, respectively. People in the group with a much higher proportion of factories than of other POIs spend most of their time at a factory, and thus the group is defined as a “factory-centered activity structure”. Similarly, the groups with much higher proportions of enterprises, home, and education (i.e., schools, museums, art galleries and exhibitions) are defined as “office-centered activity structure”, “home-centered activity structure” and “education-centered activity structure”, respectively. Lastly, the group of “outgoing-centered activity structure” is defined when the total proportion of “shopping”, “leisure entertainment”, and “green space and parks” is dominant. For simplicity, “factory workers”, “office workers”, “stay-at-home”, “adventurers” and “education-related” will be used in the later sections to represent each activity structure group.
It is noteworthy that the average time proportion of staying at home for the “stay-at-home” group reaches 90%, which means this group of people mainly stay at home and rarely perform out-of-home activities. Since staying at home accounts for such a high percentage, the residential location (instead of the activity structure) may play a decisive role in the individual exposure level of people in this group. Therefore, the “stay-at-home” group will be excluded from the later discussion. In other words, only four activity structure groups (“factory workers”, “office workers”, “adventurers” and “education-related”) will be taken as the whole sample and analyzed.
4.2. Correlation Analysis of Activity Structure and PM2.5 Exposure Level
The Pearson correlation coefficient between the distance of activity structure and the distance of PM2.5 exposure is 0.78 (p < 0.01), so the correlation is statistically significant at the 0.01 level. The results indicate that there is a strong correlation between individual activity structure and individual PM2.5 exposure.
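A correlation between two sets of pairwise distances can be computed as in the following sketch. It assumes Euclidean distance for both the activity structure vectors and the exposure values, which may differ from the distance measure defined earlier in the paper; all vectors and names are hypothetical.

```python
from itertools import combinations
from math import dist, sqrt


def pairwise_distances(vectors):
    """Euclidean distance for every unordered pair of individuals."""
    return [dist(u, v) for u, v in combinations(vectors, 2)]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: three individuals, two-component activity vectors
# and a single exposure value each (in ug/m3).
activity = [(0.0, 1.0), (0.2, 0.8), (0.9, 0.1)]
exposure = [(60.0,), (61.0,), (70.0,)]

r = pearson_r(pairwise_distances(activity), pairwise_distances(exposure))
print(round(r, 3))
```

Because `combinations` enumerates pairs in the same order for both lists, each activity distance is matched with the exposure distance of the same pair of individuals.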
Then, a linear regression model is estimated and the results show a positive relationship between the distance of PM2.5 exposure and the distance of the activity structure vector (with a slope of 7.34), which is shown in Figure 6. The residual distribution is illustrated in Figure 7, which indicates that 78.7% of the absolute residuals are less than 0.5, and 98.2% of the absolute residuals are less than 1. This indicates that in most cases, the estimated distance of PM2.5 exposure is close to the actual distance of PM2.5 exposure.
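The regression and the residual summary can be reproduced in outline as follows. This is a sketch of ordinary least squares with one predictor, assuming that is the model form used; the data points are hypothetical and were chosen only so that the fitted slope is of a similar magnitude to the one reported.

```python
def fit_line(x, y):
    """Ordinary least squares fit y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    slope = (
        sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
        / sum((a - mean_x) ** 2 for a in x)
    )
    return slope, mean_y - slope * mean_x


def share_within(residuals, bound):
    """Fraction of residuals whose absolute value is below the bound."""
    return sum(abs(r) < bound for r in residuals) / len(residuals)


# Hypothetical distance pairs (activity structure distance, exposure distance).
x = [0.0, 0.1, 0.2, 0.3, 0.4]
y = [0.1, 0.6, 1.7, 2.1, 3.0]

slope, intercept = fit_line(x, y)
residuals = [b - (intercept + slope * a) for a, b in zip(x, y)]
print(slope, intercept)
print(share_within(residuals, 0.5), share_within(residuals, 1.0))
```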
The above results also indicate that the shorter the distance of activity structure between two individuals, the shorter the distance of PM2.5 exposure between them. Conversely, the longer the distance of activity structure between two individuals, the longer the distance of PM2.5 exposure between them. Based on the properties of clustering, people with similar activity structures are more likely to form a similar activity pattern and may have similar exposure to PM2.5 pollution, and vice versa. From this perspective, the structure of human activities influences the level of individual exposure to a certain extent.
We further examined Pearson’s correlation coefficient for each activity structure group; the results are shown in Table 2. The value of Pearson’s r for every group is larger than 0.70. The values of Pearson’s r for the factory workers and the education-related group are 0.85 and 0.81, respectively, which shows that for these two activity structure groups, there is a very strong correlation between activity structure and exposure level. For the adventurers and office workers, the correlation between activity structure and exposure level is a little lower than for the former two groups, with the value of r still higher than 0.73. In Table 3, the value of R² on the entire sample is 0.60, which means the activity structure distance could explain 60% of the variance in the exposure level distance. Similarly, the values of R² for the factory workers and education-related group are higher than those for the office workers and adventurers groups.
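As a consistency check, and assuming the regression has a single predictor (the activity structure distance), the coefficient of determination equals the square of Pearson's r. For the whole sample this gives:

```latex
R^2 = r^2, \qquad r = 0.78 \;\Rightarrow\; R^2 = 0.78^2 = 0.6084 \approx 0.61
```

This agrees with the reported value of 0.60 up to the rounding of r. The same relation implies that groups with a larger r also have a larger R², which matches the ordering described above.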
The above results also cogently answer the question we posed at the beginning of this paper: does individuals’ activity structure influence their PM2.5 exposure? The connection between people’s activity structure and their PM2.5 exposure has thus been established quantitatively, which provides the basis to further analyze how different activity structures impact people’s exposure levels at the inter- and intra-group levels.
4.3. Inter-Group Relationships between Activity Structure and Exposure Effects
This subsection examines whether groups with different activity structures have different exposure levels and whether the relationship between the distance of activity structure and the distance of PM2.5 exposure also applies to these groups.
First, Figure 8 shows the activity structure of the four distinctive groups. The horizontal axis represents time (hour) and the vertical axis represents the percentage of the number of users appearing at a specific POI-type among the total number of users. Each line represents the percentage of a specified POI-type. Thus, the top line represents the main activity and the other lines represent the secondary activities of a specific group. Note that the four groups have distinctive activity patterns. Figure 9 represents a radar graph that shows the activity structures of the secondary activities of the four groups.
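The curves described above (percentage of users at each POI-type per hour) can be derived from check-in style records as in this sketch. The record layout `(user_id, hour, poi_type)` and the sample records are assumptions made for illustration, not the paper's data format.

```python
from collections import defaultdict


def hourly_poi_share(records):
    """Percentage of all users seen at each (hour, POI-type) pair.

    records is a list of (user_id, hour, poi_type) tuples.
    """
    users = {user for user, _, _ in records}
    seen = defaultdict(set)  # (hour, poi_type) -> users observed there
    for user, hour, poi in records:
        seen[(hour, poi)].add(user)
    return {key: 100 * len(found) / len(users) for key, found in seen.items()}


# Hypothetical records for three users.
records = [
    ("u1", 9, "factory"),
    ("u2", 9, "factory"),
    ("u3", 9, "restaurant"),
    ("u1", 12, "restaurant"),
]

shares = hourly_poi_share(records)
print(shares[(9, "factory")])      # about 66.7: two of three users
print(shares[(12, "restaurant")])  # about 33.3: one of three users
```

Plotting `shares` by hour for each POI-type would give one line per type, with the highest line corresponding to the main activity.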
Factory workers refer to people who spend most of their time at a factory. The proportion of factory as the primary activity place starts to rise around 6:00 a.m., which is the earliest among the four groups, and declines at 4:00 p.m. They undertake secondary activities actively at noon and in the evening, but less than office workers and adventurers do. Figure 9a indicates that the main secondary activity places of factory workers consist of restaurants, shopping and entertainment-related places. Among these activity places, restaurants have the highest proportion (9%). This means that the structure of factory workers’ activities is relatively simple and monotonous. In addition to working in a factory, eating out is the most important activity for this group. It is also noteworthy that this group rarely visits green spaces and parks (1%) or education-related places (2%), visits to which would reduce their PM2.5 exposure and benefit their health.
Office workers mainly refer to people whose workplaces are enterprises such as high-tech companies, financial institutions, advertising and decoration, commercial trade, and so on. Figure 8b shows the activity structure of this group. Most of them leave home during 7:00–8:00 a.m. and return home from work at 5:00–8:00 p.m. Their secondary activities after work are vibrant and plentiful. Office workers are usually white-collar workers with a mid-to-high income level, and hence they are more willing to go out for social, recreational and leisure activities after work. The main secondary activity places of this group consist of entertainment venues, shopping centers, restaurants, bus stations, parking places and industrial parks. The proportion of shopping and leisure activities for office workers is surpassed only by that of the adventurer group and is much higher than that of factory workers and the education-related group. These two kinds of activities are characterized by two obvious peaks. The first peak appears around 12:00 p.m., which is associated with the noon break. The second peak appears around 8:00 p.m. The proportion of dining-out activities is a little lower than that of entertainment activities; its first peak appears near 1:00 p.m. and its second peak appears around 5:00 p.m. It is noteworthy that this group rarely visits education-related places (less than 2%) or green spaces and parks (less than 1%), which means that the use of green spaces is very low in this group.
Adventurers mainly refer to people who visit public places for relaxation and recreation (e.g., tourist places, green spaces, and parks) as their main activities. Figure 8c shows the activity structure of this group, which mainly includes tourists and local residents who like to undertake outing activities. Judging by the places from which they depart in the morning, about 65% of this group are local residents and approximately 15% are tourists. The home ratio (proportion of staying at home) in the morning reaches a trough around 11:00 a.m., which shows that compared with ordinary workers, the travel time of people who like to go out is relatively flexible. They usually avoid the morning rush hour and undertake outing activities when traffic is smooth. The home ratio in the evening usually starts to rise at 6:00 p.m., which is one hour later than that of the ordinary workers. The upward trend is relatively gentle, especially during 6:00–8:00 p.m., which means that the outing activities of adventurers usually last until the late evening. The main activity structure of adventurers consists of green spaces and parks (15%), entertainment venues (9%), bus stations (10%), shopping centers (10%) and restaurants (8%). This group rarely visits industrial parks and companies.
The education-related group mainly includes education practitioners and students. They start education-related work during 7:00–8:00 a.m. and finish working around 6:00 p.m. There is a small peak around 8:00–10:00 p.m., which means some people still have to do education-related work at night (e.g., teaching or attending evening classes). The peak value of the proportion of education and cultural places for this group is close to 80%, which is the highest for the main activity among the four groups. It also means that members of this group perform the fewest secondary activities. Their secondary activities occur at noon and in the evening. Figure 9d represents the activity structure of the secondary activities of this group. The main places this group visits are entertainment venues (3%), green spaces and parks (3%), restaurants (2%), shopping-related places (5%) and bus stations (3.5%). This group rarely visits industrial parks, companies and hotels.
The above results indicate that the characteristics of the activity structure for each group are distinctive. Second, we examine whether the PM2.5 exposure levels of the four activity structure groups are different. Individual exposure level was calculated according to the average PM2.5 concentration at each time interval. The mean personal PM2.5 exposure level of the sample residents is 67.67 μg/m3 (range: 51.93–83.41). The personal exposure levels of the 15,120 residents are shown in Figure 10. The box plot in Figure 11 represents three quartiles of PM2.5 exposure of the four activity structure groups. The PM2.5 exposures of the four activity structure groups show a clear sequence from high values to low: factory workers, office workers, education-related, and adventurers. Specifically, the minimum, maximum and mean values of PM2.5 exposure for the factory workers are 62.7, 81.3 and 69.0 μg/m3, respectively. All three values are the highest among the four groups. This indicates that, on the whole, factory workers tend to be exposed to the highest level of PM2.5 pollution among the four activity structure groups and the overall exposure risk of this group is the highest among them. This is in line with our expectations. According to the activity structure of this group, factory workers have long working hours and suffer from the high pollution of their work environments. They often have difficulty in accessing high-end facilities and rarely have the opportunity to reduce their daily exposure level [9].
Next to the factory workers, the minimum, maximum and mean values of PM2.5 exposure for office workers are 60.1, 74.9 and 66.2 μg/m3, respectively. This indicates that the group of office workers as a whole still tends to have a relatively high exposure risk due to their high mobility and diversified activities. Then, the minimum, maximum and mean values of PM2.5 exposure for the education-related group are 53.6, 71.7 and 61.7 μg/m3, respectively. This group tends to have relatively low exposure risks. This is mainly because the workplaces of this group are educational and cultural places, where the vegetation coverage rate and the overall greening level are relatively high. Lastly, the minimum, maximum and mean values of PM2.5 exposure for adventurers are 53.0, 72.9 and 60.7 μg/m3, respectively. On the whole, this group tends to have the lowest exposure risk because it has the highest level of green space usage among the four groups. In general, the PM2.5 exposures of factory workers and office workers are higher, while the PM2.5 exposures of the education-related group and adventurers are lower.
To further verify whether the mean PM2.5 exposure levels of the four activity structure groups differ significantly from each other, an ANOVA test [42], which compares the means of a continuous variable in two or more independent comparison groups, was performed. The results are shown in Table 4. It can be seen that the between-group and within-group mean squares are 45,614 and 9.20, respectively. The F-value (4596.97) is much larger than the critical value of F (2.6056) at the 0.05 level of significance. This large F-value is strong evidence that the null hypothesis (the four activity structure groups having equal mean PM2.5 levels) should be rejected. Meanwhile, p = 0.004 (<0.05) means the result is statistically significant and the mean PM2.5 exposure levels of the four activity structure groups differ significantly from each other.
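The quantities reported in the ANOVA table follow from the standard one-way ANOVA formulas, sketched below. The exposure samples are hypothetical and far smaller than the paper's groups; they only illustrate how the between-group mean square, the within-group mean square and the F-value are obtained.

```python
def one_way_anova(groups):
    """Return (between-group mean square, within-group mean square, F)."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ms_between = ss_between / (k - 1)  # k - 1 degrees of freedom
    ms_within = ss_within / (n - k)    # n - k degrees of freedom
    return ms_between, ms_within, ms_between / ms_within


# Hypothetical exposure samples (ug/m3), three individuals per group.
samples = {
    "factory workers": [68.0, 69.0, 70.0],
    "office workers": [65.0, 66.0, 67.0],
    "education-related": [61.0, 62.0, 63.0],
    "adventurers": [60.0, 61.0, 62.0],
}

ms_between, ms_within, f_value = one_way_anova(list(samples.values()))
print(ms_between, ms_within, f_value)
```

The F-value is then compared with the critical value of the F distribution with (k - 1, n - k) degrees of freedom at the chosen significance level.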
Then, the post-hoc procedure of the Scheffé test [43] was conducted to further examine which pairs of groups differ significantly in PM2.5 exposure. The result is shown in Table 5, which indicates that the F-value between the adventurers and education-related groups is less than F-Critical (7.8168), and the F-values between all other pairs of activity structure groups are greater than F-Critical. This indicates that no significant difference exists between the mean PM2.5 exposures of the adventurers and education-related groups, while a significant difference exists between the mean PM2.5 exposures of the remaining pairs of activity structure groups.
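The pairwise comparison can be sketched with the usual form of the Scheffé statistic. Note that the reported critical value, 7.8168, equals (k - 1) times the ANOVA critical value 2.6056 with k = 4 groups, which is the standard Scheffé threshold. The group means, group sizes and within-group mean square below are hypothetical.

```python
def scheffe_f(mean_i, mean_j, n_i, n_j, ms_within):
    """Scheffe statistic for the difference between two group means."""
    return (mean_i - mean_j) ** 2 / (ms_within * (1 / n_i + 1 / n_j))


def scheffe_critical(k, f_critical):
    """Scheffe threshold: (k - 1) times the ANOVA critical F-value."""
    return (k - 1) * f_critical


threshold = scheffe_critical(4, 2.6056)  # four groups, critical F from the text

# Hypothetical pairs: group means, ten members each, within-group mean square 4.0.
far_apart = scheffe_f(66.0, 62.0, 10, 10, 4.0)
close = scheffe_f(62.0, 61.0, 10, 10, 4.0)

print(threshold)
print(far_apart, far_apart > threshold)  # significant difference
print(close, close > threshold)          # no significant difference
```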
These results indicate that different activity structure groups do experience different levels of PM2.5 exposure. In other words, at the inter-group level, different activity structures affect people’s exposure level and daily activity structure does have a certain influence on people’s exposure.
Finally, we examine the activity structure distance and PM2.5 exposure distance among the four groups. Figure 12 represents the activity structure distance matrix and PM2.5 exposure distance matrix among the four activity structure groups. The activity structure distance between two groups is calculated using the average distance between them, and the PM2.5 exposure distance uses the F-value of the Scheffé test. As shown in Figure 12a, for factory workers, the distance of activity structure between this group and office workers is the shortest, followed by the education-related group and finally the adventurer group. The distance of PM2.5 exposure between factory workers and the other three groups also follows this order. Specifically, the activity structures of factory workers and office workers are the most similar (0.83), and the distance of PM2.5 exposure between these two groups is the shortest (7.97). In contrast, the activity structures of the factory workers and the adventurers are the most dissimilar (0.89), and the distance of PM2.5 exposure between these two groups is the longest (12.34). In terms of activity structure, the adventurers are most similar to the education-related group, followed by office workers and finally factory workers. The distance of PM2.5 exposure among them also follows the same order. In general, the distance sequence of PM2.5 exposure corresponds with that of the activity structure among the four activity structure groups. This indicates that the relationship between the activity structure and PM2.5 exposure also holds at the level of activity structure groups. That is to say, the smaller the distance of activity structure between two activity structure groups, the shorter the distance of PM2.5 exposure between them. Conversely, the longer the distance of activity structure between two activity structure groups, the longer the distance of PM2.5 exposure between them.
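The order comparison described above can be expressed as a small check, sketched here for the factory workers' row of the two matrices. The values for office workers and adventurers are those quoted in the text; the values for the education-related group are hypothetical placeholders, since the text gives only its rank.

```python
def same_ordering(activity_dist, exposure_dist):
    """True if both distance dictionaries rank the groups in the same order."""
    by_activity = sorted(activity_dist, key=activity_dist.get)
    by_exposure = sorted(exposure_dist, key=exposure_dist.get)
    return by_activity == by_exposure


# Distances from factory workers to the other three groups.
activity_dist = {
    "office workers": 0.83,     # from the text
    "education-related": 0.86,  # hypothetical placeholder
    "adventurers": 0.89,        # from the text
}
exposure_dist = {
    "office workers": 7.97,     # from the text
    "education-related": 10.0,  # hypothetical placeholder
    "adventurers": 12.34,       # from the text
}

print(same_ordering(activity_dist, exposure_dist))
```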
4.4. Intra-Group Relationship between Activity Structure and Exposure Effects
This subsection further observes the relationship between activity structure and exposure level inside each activity structure group in the reverse direction. We first divide the PM2.5 exposure level into three categories, high (70.20 to 81.3 μg/m3), medium (63.54 to 70.19 μg/m3) and low (53 to 63.54 μg/m3), by clustering the PM2.5 exposure values. Then, the activity structures at different exposure levels are examined for each group to determine whether people at different exposure levels have different detailed characteristics within each group.
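Once the three categories are fixed, assigning an individual to a category reduces to a threshold lookup, as in this sketch. The cut points are the category boundaries quoted in the text; the function itself, and the choice to assign the shared boundary value 63.54 to the low category, are assumptions. The sketch does not reproduce the clustering that produced the boundaries.

```python
LOW_UPPER = 63.54   # upper bound of the low category (ug/m3), from the text
HIGH_LOWER = 70.20  # lower bound of the high category (ug/m3), from the text


def exposure_level(value):
    """Map a PM2.5 exposure value (ug/m3) to 'low', 'medium' or 'high'."""
    if value >= HIGH_LOWER:
        return "high"
    if value > LOW_UPPER:
        return "medium"
    return "low"  # assumption: the shared boundary 63.54 counts as low


for value in (53.0, 63.54, 69.0, 81.3):
    print(value, exposure_level(value))
```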
4.4.1. Factory Workers
Figure 13a–c show the activity structure at high-, medium- and low exposure levels. The three charts in the upper row describe the overall activity structure of the group and the three charts in the lower row depict the details of the curves that represent the secondary activities in the first row.
Figure 14 is a radar graph that shows the activity structure of the secondary activities of factory workers at different exposure levels.
There are differences in the working hours at different exposure levels. The proportion of factory starts to rise around 6:00 a.m. At high exposure levels, the factory curve falls to a trough of 28% during 6:00–8:00 p.m. After 8:00 p.m., the curve shows an upward trend and rises to 40%. A possible explanation is that 40% of the factory workers still have to return to the factory to work; they may be overtime workers or night-time workers. At medium exposure level, after 4:00 p.m., the curve shows a downward trend with no recovery, reaching a trough (25%) at 10:00 p.m. At low exposure level, the curve also shows a downward trend starting from 4:00 p.m., which continues until 10:00 p.m. The results indicate that the working hours of this group at low and medium exposure levels are significantly shorter than those at high exposure levels. In other words, reducing the working hours or the time spent in the factory reduces the individual exposure of factory workers to a great extent.
The proportion of home also differs at different exposure levels. At high exposure levels, the proportion of home shows an upward trend after 7:00 p.m., reaching a peak (37%) at 9:00 p.m., and then decreases slightly. At medium exposure level, it increases sharply from 6:00 p.m., reaching 43% at 10:00 p.m., with no sign of falling back. At low exposure level, the curve increases sharply from 6:00 p.m., reaching 78% at 10:00 p.m. This indicates that the proportion of returning home after work increases significantly as the exposure level decreases, and the choice of going home earlier helps reduce the personal exposure level of factory workers to a great extent.
There are also differences in recreational activities for factory workers at different exposure levels. At high exposure levels, there are no obvious recreational activities except for dining, which shows two obvious peaks during lunch and dinner time. The shopping and entertainment activities account for a very low percentage, which means the recreational activities are monotonous. This may be because long working hours leave these workers very little time for non-working activities. At low and medium exposure levels, on the one hand, the proportion of factory decreases. The peak proportions of factory at high, medium and low exposure levels during 9:00 a.m.–5:00 p.m. are 60%, 51% and 41%, respectively. On the other hand, the recreational activities of the factory workers at low and medium exposure levels start to diversify. The non-working activities are more abundant than those at high exposure levels.
Among the factory workers, the activity structures at different exposure levels differ in detail. The results show that the higher the exposure level, the longer the working hours and the more monotonous the types of recreational activities, while the lower the exposure level, the shorter the working hours and the more diverse the types of recreational activities.
4.4.2. Office Workers
Figure 15 shows the secondary activity structure of office workers.
Figure 16 shows the activity structure of the secondary activities of office workers at different exposure levels. The activity structures of this group at different exposure levels are similar on the whole. Differences in the choice of commuting modes for this group at the three exposure levels are observed. At high exposure levels, the parking curve shows three peaks in the time periods of 7:00–10:00 a.m., 12:00–2:00 p.m., and 5:00–8:00 p.m. This is consistent with the commuting hours in the morning peak, the noon break and the evening peak. The proportion of parking is the highest at high exposure level, with three peaks of 8%, 10% and 9%. The three peak values are 7%, 5% and 5% at medium exposure level, and 1%, 3% and 4% at low exposure level. This indicates that as the proportion of parking use decreases, the exposure level also shows a downward trend. Interestingly, the use of bus stops also shows three obvious peaks in the three corresponding time periods. The three peak values are 6%, 4% and 8% at high exposure level, 6%, 7% and 12% at medium exposure level, and 13%, 8% and 14% at low exposure level. This indicates that people with lower exposure levels tend to use bus stops more, while people with higher exposure levels tend to use parking lots more. In other words, office workers with high exposure levels mostly travel by private car, while office workers with low exposure levels often choose public transportation as their commuting mode. In addition, the shopping and leisure activities of office workers are varied, but no obvious differences in recreational activities are observed among this group at different levels of exposure.
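Peak values such as those quoted above can be read off an hourly curve by taking the maximum within each commuting window, as in this sketch. The window boundaries follow the time periods named in the text; the hourly curve is hypothetical and was constructed so that its peaks match the 8%, 10% and 9% reported for parking at high exposure level.

```python
# Commuting windows in 24-hour clock, inclusive of both end hours.
WINDOWS = {"morning": (7, 10), "noon": (12, 14), "evening": (17, 20)}


def window_peaks(curve, windows=WINDOWS):
    """Maximum percentage within each time window of an hourly curve."""
    return {
        name: max(curve.get(hour, 0.0) for hour in range(start, end + 1))
        for name, (start, end) in windows.items()
    }


# Hypothetical hourly parking percentages (hour -> percent of users).
parking_curve = {
    7: 5.0, 8: 8.0, 9: 6.0,
    12: 7.0, 13: 10.0,
    17: 6.0, 18: 9.0, 19: 7.0,
}

print(window_peaks(parking_curve))
```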
4.4.3. Adventurers
Figure 17 shows the structure of the secondary activities of adventurers at different exposure levels.
Figure 18 shows the activity structure of the secondary activities of adventurers at different exposure levels. From the activity structure point of view, there are obvious differences at the three exposure levels. For shopping activities, the peak value of the percentage is 20% at high exposure level, 15% at medium exposure level, and 12% at low exposure level. The higher the exposure level, the higher the proportion of shopping activities and vice versa. From the perspective of time, shopping activities fluctuate little during the day and are evenly distributed, which means adventurers go shopping at any time of the day.
The trends of recreational activities during 8:00 a.m.–6:00 p.m. at different exposure levels are similar, but there are differences during 7:00 p.m.–9:00 p.m. (the time period with the worst air quality). At high exposure level, the leisure and recreational activities account for the highest proportion, with a peak of 21%; at medium exposure level, the ratio declines and peaks at 13%; at low exposure level, the proportion falls further, peaking at 10%. Similar to shopping activities, it can be observed that the higher the exposure level, the higher the proportion of leisure and entertainment activities and vice versa.
The travel modes also differ at different exposure levels. At low exposure levels, the use of public transportation accounts for the highest proportion, reaching two peaks of 18% and 20% at 11:00 a.m. and 6:00 p.m., respectively, while the use of public transportation at high exposure levels is very low, with an average value of 3%. At the same time, it is also observed that the proportions of parking at high, medium and low exposure levels are different (the average proportions are 8%, 5%, and 2%, respectively). In other words, at high exposure levels, adventurers tend to travel by car for their outing activities, while at low exposure levels, adventurers mainly use public transportation.
Visits to tourist places (e.g., green spaces and parks) differ at different exposure levels. At low exposure level, the proportion of such activities is relatively high, gradually rising from 8:00 a.m., reaching the first peak (25%) at 3:00 p.m. and the second peak (18%) at 8:00 p.m., and then falling sharply. At the medium exposure level, the trend is similar to that at low exposure level, but the proportion is lower, reaching the first peak (14%) at 3:00 p.m. and the second peak (13%) at 8:00 p.m. However, at high exposure levels, the ratio is very low, with an average value of 5%. This shows that different ways of using green spaces and parks may lead to different exposure results. The adventurers at low exposure level visit more green spaces, while the adventurers at high exposure level rarely visit such places.
Generally, the detailed activity structure of adventurers is different at different exposure levels. Adventurers with high exposure levels visit leisure and entertainment places, shopping-related places and restaurants at a very high rate, and their main travel mode is the private car. It is noteworthy that their use of green spaces is very low. In contrast, adventurers with low exposure levels are more inclined to visit tourist places and travel by public transportation, and their proportions of entertainment and shopping activities are relatively low.
4.4.4. Education-Related Group
Figure 19 represents the structure of the secondary activities of the education-related group at different exposure levels. At high exposure levels, people leave home during 7:00–8:00 a.m., return home during 5:00–6:00 p.m. and undertake fewer outing activities. At medium exposure level, people start choosing to go out for leisure and entertainment or shopping in the evening. At low exposure levels, the outing activities are more abundant. For example, the proportion of bus stops and green spaces increases, and the overall proportion of people at home is also far lower than at the high and medium exposure levels. It can be seen from Figure 20 that, at low exposure level, the education-related group shows stronger vitality and has a higher willingness to go out than at medium and high exposure levels, and the places visited are more diverse. However, at high exposure levels, this group shows the opposite trait.
Specifically, as the exposure level decreases, the frequency of outing activities (shopping, entertainment and leisure, bus stops, etc.) gradually increases during the periods 12:00 p.m.–2:00 p.m. and after 5:00 p.m. For example, the peak proportions of shopping activities at the high, medium and low exposure levels are 2.9%, 5.8%, and 6.9% at noon, and 3.7%, 6.7%, and 9.7% in the evening hours. The peak proportions of bus stations at the high, medium and low exposure levels are 2.7%, 3.7%, and 4.7% at noon, and 5.2%, 5.6%, and 5.6% in the evening hours. The peak proportions of leisure and entertainment at the high, medium and low exposure levels are 2.8%, 4.3%, and 4.8% at noon, and 2.4%, 5.4%, and 9.4% in the evening hours. The peak ratios of visiting green spaces at night are 1.5%, 3.2%, and 8.5% at the high, medium and low exposure levels, respectively. Therefore, these results indicate that the characteristics of the detailed activity structure differ at different exposure levels. People with high exposure have the lowest proportion of secondary activities and vice versa.
In general, observing the detailed characteristics of people’s activity structures under different exposure levels in each group shows that the detailed activity structures are indeed distinct. These results also show in turn that at the intra-group level, different activity structures may affect people’s exposure levels, and the daily activity structure does have a certain impact on people’s final exposure results.