4.1. PFA Results
There are two main purposes of conducting PFA simulations. One is to obtain pedestrian exposure streamlines and the other is to obtain the locations and values of pedestrian particulate generation sources to more accurately model the spatial distribution characteristics of particulate matter.
For the first purpose, the movement track of each pedestrian is available in the software. Meanwhile, the space utilization analysis of different directions of boarding and alighting pedestrians for the station’s concourse and platform floors can also be performed, which indicates the pedestrian’s choice of path.
Figure 8 shows the space utilization analysis results for each subway trip. The closer the area to red, the higher the space utilization and the more concentrated the flow of pedestrians.
For the second purpose, the spatial density of pedestrians in the subway station needs to be analyzed to determine the crowded areas and the degree of pedestrian congestion. The spatial average density characterizes the average number of pedestrians per unit time and unit area. Higher density values indicate heavier pedestrian congregation in an area, which makes the area more prone to congestion and implies larger values for the pedestrian particulate generation sources. The average density of pedestrians can be characterized by the service level of the area, which can be found in the International Air Transport Association (IATA) waiting level of service comparison table given in Massmotion software [21], as shown in Table 4.
The simulation results of pedestrian spatial density at the station concourse floor and platform floor are shown in Figure 9a,b, respectively, while the average number of pedestrians in the area below service level B is also marked in the figure. The location of the pedestrian particulate generation source is set at the center of the circle used to calculate the density, and the value is determined by multiplying the number of people by the amount of particulate generated by a single person. The congested areas in the concourse floor are mainly at the inbound security check queue, in front of the inbound gates, and in front of the outbound gates, corresponding to service levels E, E, and C, respectively. The congested areas on the platform floor are mainly the stairway entrance and the area in front of the PSDs.
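The source-setting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's implementation: the `DensityZone` type, the function name, the coordinates, and the per-person emission value are all hypothetical, since the actual emission rate and zone data are not given in this section.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DensityZone:
    """One circular density-evaluation area taken from the pedestrian flow results."""
    center_xy: Tuple[float, float]  # center of the circle used to compute density (m)
    mean_pedestrians: float         # average number of pedestrians inside the circle


def particulate_sources(zones: List[DensityZone], emission_per_person: float):
    """Return (location, source strength) pairs, one per congested zone.

    Source strength = number of people x emission of a single person.
    """
    return [(z.center_xy, z.mean_pedestrians * emission_per_person) for z in zones]


# Hypothetical numbers for illustration only; they are not taken from the study.
zones = [DensityZone((12.0, 4.5), 8.0), DensityZone((30.0, 6.0), 3.5)]
sources = particulate_sources(zones, emission_per_person=2.0)
```

Each resulting pair gives the position at which a source would be placed in the CFD model and its strength, in whatever emission unit is used for a single person.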
Figure 10 and Figure 11 compare the observed and simulated pedestrian distributions in some of the crowded areas at the station concourse and platform to validate the PFA model. Observations of pedestrian flow were made during the same period as the field measurements.
4.2. CFD Results
After determining the locations and values of pedestrian particulate generation sources, CFD simulations can further be performed to analyze the spatial distribution characteristics of the particulate matter. The CFD model is solved in two steps. The first step is to solve the airflow field without the particulate matter source. For example, Figure 12 shows simulation results of the airflow field at the station concourse floor under different airflow organizations. In the second step, the particulate matter source is set and the DPM model is used to solve the particulate matter concentration field. The spatial concentration distribution of PM2.5 at the breathing plane (height = 1.5 m) under different airflow organizations is shown in Figure 13 and Figure 14.
For the concourse floor, the PM2.5 is mainly concentrated at the inbound security checkpoint, inbound vending machines, and outbound vending machines, and at the stairway entrance and the return air outlet where the crowd gathers. The main reasons are as follows: firstly, pedestrians are one of the main PM2.5 sources, so the concentration of PM2.5 is larger where pedestrians gather; secondly, PM2.5 from the platform floor can move to the concourse floor with the airflow, so higher concentrations of PM2.5 gather near the stairway; thirdly, PM2.5 tends to accumulate under the return air outlet due to the suction effect. The average concentration of PM2.5 on the concourse floor is 59.73 μg/m³ when the PSDs are closed and 46.67 μg/m³ when the PSDs are opened. When the PSDs are opened, outdoor air enters the station through the exits, so the concentration of PM2.5 in the four passageways is greater, reaching 57.23 μg/m³. Overall, the concentration of PM2.5 in the passageway is close to the simulated concentration with the PSDs opened, while the concentration inside the station concourse is close to the simulated concentration with the PSDs closed.
For the platform floor, the PM2.5 mainly gathers at the PSDs on both sides. The main reasons for this are, firstly, that pedestrians gather in front of the PSDs to wait for trains; secondly, due to the suction effect, the concentration of PM2.5 under the return outlet is greater. The average PM2.5 concentrations at the platform floor are 60.03 μg/m³, 53.59 μg/m³, and 79.88 μg/m³ for PSDs closed, AB side PSDs opened, and CD side PSDs opened, respectively. When CD side PSDs are opened, the return air outlets are on the opposite side from the CD side PSDs, and the particulate matter tends to settle in the slower airflow area, so the overall concentration is higher.
4.3. Particulate Matter Concentration Prediction Based on Surrogate Models
According to Equation (1), calculating the particulate matter exposure of each pedestrian requires integrating the particulate matter concentration over time along the pedestrian's movement track. Because the airflow organization state in the subway station is time-varying, the distribution and concentration of particulate matter differ depending on whether the PSDs are open or closed. The core problem in accurately calculating pedestrian exposure is therefore how to relate the PM2.5 concentration distributions obtained by solving the steady-state DPM model in the two airflow organization states to the concentration distribution in the real state, so that the time-varying characteristics of the airflow organization can be considered. Since there is no definite relationship between the simulated particulate matter concentrations of the two simulated states and the concentration in the real state, a data-driven surrogate model relating the three is trained by SVR to facilitate the exposure calculation.
Given a training set {(x_1, y_1), …, (x_N, y_N)}, where x_i = [x_i1, x_i2, …, x_id]^T and d is the dimension, the optimization objective of SVR is to find a regression hyperplane f(x) = ω^T ϕ(x) + b that minimizes the difference between f(x) and y, which corresponds to the optimization problem in Equation (6) [27].
Here, ω and b are the coefficient matrix and the bias matrix, which characterize the regression hyperplane of the SVR model; ε is the tolerance error; ξ and ξ* are slack variables corresponding to the two error bounds of SVR, respectively; and ϕ(x) is the mapping associated with the kernel function. When the radial basis function (RBF) is adopted, the kernel satisfies Equation (7).
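For reference, the textbook ε-SVR primal problem and the Gaussian RBF kernel can be written with the symbols defined here as follows. This is the standard form and may differ in notation from Equations (6) and (7); treating λ as the penalty factor on the slack variables and using the 2σ² normalization in the kernel are assumptions.

```latex
\begin{aligned}
\min_{\omega,\, b,\, \xi,\, \xi^{*}} \quad
  & \frac{1}{2}\lVert \omega \rVert^{2}
    + \lambda \sum_{i=1}^{N} \left( \xi_i + \xi_i^{*} \right) \\
\text{s.t.} \quad
  & y_i - \omega^{T}\phi(x_i) - b \le \varepsilon + \xi_i, \\
  & \omega^{T}\phi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i \ge 0, \quad \xi_i^{*} \ge 0, \quad i = 1, \ldots, N,
\end{aligned}
\qquad
K(x_i, x_j) = \phi(x_i)^{T}\phi(x_j)
            = \exp\!\left( -\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}} \right)
```

In this form, a larger λ penalizes deviations beyond the ε tube more strongly, and a smaller σ makes the kernel more local.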
In this study, the two hyper-parameters λ and σ in Equations (6) and (7) are determined using the particle swarm optimization algorithm within the libSVM tool of MATLAB [28].
The surrogate model is established in two steps. The first step is to conduct a reasonable sampling to obtain the real particulate matter distribution in the station. In this paper, the measurement points are arranged based on the optimal Latin hypercube sampling (OLHS) method [29]. This space-filling sampling method captures the true concentration distribution characteristics of particulate matter as fully as possible with a small sample set. Considering the long length of the concourse and platform floors, each floor is divided lengthwise into two parts for sampling, and 20 points are sampled in each part using the OLHS method. In addition, three sampling points are arranged in each passageway. After eliminating some sampling points that are not convenient for measurement, the sampling points at the concourse and platform floors are arranged as shown in Figure 15. Considering the limited experimental equipment, a ten-minute sampling was conducted at each point during the period of 9:00–13:00 on 11 November 2020, when the PM2.5 concentration was relatively stable, to record the PM2.5 concentration variation and calculate its mean value. The sampling results and the corresponding simulated concentration values at each sampling point are given in Appendix A.
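The idea behind space-filling Latin hypercube sampling can be sketched as follows. This is a simplified illustration rather than the OLHS algorithm of [29]: it draws many random Latin hypercube designs and keeps the one with the largest minimum pairwise distance (a maximin criterion). The 60 m by 20 m region, the candidate count, and all function names are hypothetical.

```python
import random


def latin_hypercube(n_points, bounds, rng):
    """Basic Latin hypercube: each axis is cut into n_points strata,
    and every stratum on every axis receives exactly one sample."""
    columns = []
    for lo, hi in bounds:
        strata = list(range(n_points))
        rng.shuffle(strata)
        columns.append(
            [lo + (s + rng.random()) * (hi - lo) / n_points for s in strata])
    return list(zip(*columns))


def min_pairwise_distance(points):
    """Smallest Euclidean distance between any two points of a design."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
        for i, p in enumerate(points) for q in points[i + 1:])


def maximin_lhs(n_points, bounds, n_candidates=200, seed=0):
    """Pick, among random Latin hypercube designs, the most spread-out one."""
    rng = random.Random(seed)
    candidates = [latin_hypercube(n_points, bounds, rng)
                  for _ in range(n_candidates)]
    return max(candidates, key=min_pairwise_distance)


# 20 points in one half of a floor; the 60 m x 20 m extent is hypothetical.
points = maximin_lhs(20, bounds=[(0.0, 60.0), (0.0, 20.0)])
```

Points that fall on obstacles or in places that are inconvenient for measurement would then be removed by hand, as described above.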
After the sampling is completed, the second step is to train the surrogate model using the simulated concentration values of the two states as input and the measured concentration values as output.
For the concourse floor, there are two main particulate matter sources: the PM2.5 within the supply air and the outdoor PM2.5 entering from the passageways when the PSDs are opened. The two sources cause the PM2.5 concentration data at the sampling points to show different distribution characteristics. To train surrogate models with better generalization ability, the K-means algorithm is used to classify the sampling data at the concourse floor into two categories. Figure 16 shows the classification results. Combined with the CFD simulation results, it can be seen that the A-type points are more influenced by the air supply pollution source, while the B-type points are more influenced by external pollution sources. Surrogate models are built separately for the sampling points of types A and B. Since the regression performance of a surrogate model is limited by the amount of data in the training set, the training set is kept large to ensure a high generalization ability, with the test set accounting for 10% to 20% of the total samples. Type A has 23 sampling points; 20 sets of data are randomly selected as the training set and three sets as the test set. Type B has 24 sampling points; 20 sets of data are randomly selected as the training set and four sets as the test set. Later, when calculating the exposure, data points located in the different areas use the corresponding surrogate model according to the division of the concourse floor shown in Figure 16b.
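A minimal Python sketch of two-class K-means clustering is given below to make the classification step concrete. The feature vectors are hypothetical: the section does not state which quantities are clustered, so pairs of concentration values are assumed here purely for illustration, and the function names are not from the study.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k=2, n_iter=50, seed=0):
    """Lloyd's algorithm: alternate nearest-center assignment and center update."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialize centers from the data
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[c])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical, well-separated feature vectors (not measured data).
samples = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1),
           (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centers = kmeans(samples, k=2)
```

The resulting labels play the role of the A-type and B-type classes; which label corresponds to which pollution source has to be decided afterwards by comparison with the CFD results.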
The surrogate model training results for A-type sampling points are shown in Figure 17a,b, with the optimal hyper-parameter combination of [λ, σ] = [15.2, 0.03]. The determination coefficient R² is used to characterize the regression result [29], as shown in Equation (8). The closer R² is to 1, the better the regression effect. The R² value is 0.995 for the training set and 0.988 for the test set.
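$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2} \qquad (8)$$
where y_i, ȳ, and ŷ_i denote the measured data, the mean value of the measured data, and the predicted data, respectively, and n is the number of data points.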
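The determination coefficient can be computed directly from the measured and predicted values. The short Python function below follows the usual definition of R² (one minus the ratio of the residual sum of squares to the total sum of squares); the function name and the example values are illustrative and are not data from the study.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((y - y_hat) ** 2 for y, y_hat in zip(measured, predicted))
    ss_tot = sum((y - mean_measured) ** 2 for y in measured)
    return 1.0 - ss_res / ss_tot


# A perfect prediction gives 1; predicting the mean everywhere gives 0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

With test sets of only three or four points, as used here, a single poorly predicted point can change the test R² noticeably.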
The surrogate model training results for B-type sampling points are shown in Figure 18a,b, with the optimal hyper-parameter combination of [λ, σ] = [907.0, 0.09]. The R² value is 0.989 for the training set and 0.992 for the test set.
Unlike the concourse floor, the only main particulate matter source on the platform floor is the PM2.5 within the supply air, so its surrogate model can be trained without clustering the data. Among the 29 sets of data at the platform floor, 25 sets are randomly selected as the training set and four sets are used as the test set. The simulated particulate concentration values at the sampling points under the three conditions of PSDs closed, AB side PSDs open, and CD side PSDs open are used as the input, and the measured results are used as the output. The training results of the surrogate model are shown in Figure 19, with the optimal hyper-parameter combination of [λ, σ] = [677.6, 2.90]. The R² value is 0.998 for the training set and 0.932 for the test set.
4.4. Exposure Calculation
In calculating the exposure, the pedestrian streamlines from Massmotion software are imported into Fluent to obtain the simulated values of particulate matter concentration at each point on the pedestrian streamlines. Fifty pedestrians are randomly sampled from each of the boarding and alighting groups; the streamlines of the 50 boarding pedestrians are shown in Figure 20.
Furthermore, the exposure of pedestrians during the boarding and alighting process can be calculated using the surrogate model. The process can be summarized as follows. Firstly, because the sampling interval of the instrument used for field measurements is 1 s, the position of each pedestrian is obtained once every second from the PFA results. Then, the PM2.5 concentration is calculated with the surrogate model corresponding to that location. Finally, this concentration value is regarded as the average PM2.5 exposure concentration of the pedestrian during that second, and, according to Equation (1), the exposure concentrations of the pedestrian in the subway station space are accumulated to obtain the exposure amount. The exposure calculation results for the sampled 50 pedestrians are shown in Figure 21. There are individual differences in the calculated exposures due to factors such as walking speed, route choice, and train waiting time. The average particulate matter exposure of pedestrians is 9985.74 μg·s/m³ during the boarding process and 5761.48 μg·s/m³ during the alighting process. Because pedestrians need to queue for security checks and swipe cards to enter the station, the particulate exposure at the concourse floor during the boarding process is much higher than at the platform floor. During the alighting process, the particulate matter exposure at the concourse floor is comparable to that at the platform floor. Overall, the particulate matter exposure of pedestrians during the boarding process is about 1.7 times that of the alighting process.
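The accumulation step described above amounts to a discrete sum of concentration times time step along each track. A minimal Python sketch under stated assumptions follows: `concentration_at` stands in for the surrogate model prediction at a location, and the track coordinates and the piecewise concentration field are hypothetical values chosen only to make the arithmetic easy to follow.

```python
def cumulative_exposure(track, concentration_at, dt=1.0):
    """Discrete form of a time integral of concentration along a track.

    track            -- pedestrian positions sampled every dt seconds
    concentration_at -- callable returning PM2.5 concentration (ug/m3) at a position
    dt               -- time step in seconds (1 s matches the instrument interval)
    """
    return sum(concentration_at(position) * dt for position in track)


def toy_concentration(position):
    """Hypothetical field: 60 ug/m3 for x < 10 m, 50 ug/m3 elsewhere."""
    x, _ = position
    return 60.0 if x < 10.0 else 50.0


# Four one-second positions of a hypothetical pedestrian.
track = [(0.0, 0.0), (5.0, 0.0), (10.0, 0.0), (15.0, 0.0)]
exposure = cumulative_exposure(track, toy_concentration)  # in ug*s/m3
```

Here the pedestrian spends two seconds at 60 μg/m³ and two seconds at 50 μg/m³, giving 220 μg·s/m³. Longer dwell times, such as queueing at security checks, raise the exposure in direct proportion to the time spent.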